The traditional relational database approach for implementing data driven processing would typically store metadata in a set of relational tables. Limitations with maintainability, version control, and deployment make this an unattractive approach compared to the latest XML-centric approaches. How can this be done in PL/SQL?
With Oracle’s rich support for XML in the database it is now very hard to not exploit its use in your database application designs. XML is now becoming the most popular format for holding configuration metadata for bespoke systems and even the native format for many desktop applications. This article describes how to develop a generic PL/SQL package to read the hierarchical structure of XML metadata and map it to a hierarchical structure of a PL/SQL nested table record type driven by its own XML mapping rules. It is directed at PL/SQL developers looking to use XML as a metadata format for their data driven applications.
Data Driven Applications
Traditional Approach
Before looking at how XML can be read by a PL/SQL application, let us take a look at how we may have traditionally designed a data driven application. As an experienced data modeller, over the years I have developed or come across many enterprise level data models that have had a subset of entities to control the behaviour of the application – the data driven “rules” of the system.
Typically, the physical design would have resulted in a series of screens to maintain these metadata tables, often having to navigate down several parent child relationships. Configuring a development system via these screens may have been acceptable, but configuring a controlled test system or production system using proper versioning control and deployment methods often forced developers to deliver the rules using DML scripts. This is still not a particularly elegant approach and not really properly addressing the issues with maintainability, version control, and deployment.
From a PL/SQL coding perspective, retrieving and navigating around the metadata in the database tables meant repetitive calls to the database in nested loops. A more elegant design would have been to map the relational data to a nested table PL/SQL record type which would have enabled more efficient and flexible navigation.
The problem with rules and configuration metadata is they can become enhanced and changed over time. This usually resulted in changes to the table design, new columns, etc, which are not very convenient to chop and change after a system has been deployed and been in use for a while.
So the traditional approach, though a familiar and better understood solution by veteran database designers, has plenty of scope for improvement. The drawbacks of taking a traditional approach can be summarized as follows:
- Maintainability – data maintained via screens and/or DML scripts.
- Version control – to achieve good traceability and version control of the configuration data changes would probably require this built into the tables adding a further unwieldy dimension to the data design.
- Deployment – new data and further changes would require DML scripts to deploy.
- Flexibility – Changes to the metadata design would require DDL changes and possible transformation of data to accommodate a new structure.
XML Approach
XML has become a very attractive means of representing certain types of data, in particular small “envelopes” of data such as message payloads, configuration metadata, transient data in web applications, etc.
In my view, it is definitely not ready for high capacity use such as storing transactional data, performing complex queries, etc. So there is still a very strong and valid case for having a hybrid approach, where XML provides an excellent alternative to holding configuration metadata in a series of relational tables.
The attraction of using XML comes from the way it overcomes the limitations discussed earlier:
- Maintainability – If you are content with editing XML documents with an XML editor, then this rids the need for maintenance screens. Otherwise, with a bit of XSL development it is possible to develop browser based screens to edit the XML directly.
- Version control – XML is a plain text document which can be easily stored in a version control system such as CVS or VSS, along with comment tags to stamp the version and other information against the document.
- Deployment – XML can be easily stored in the database either in an XMLTYPE column or in a CLOB as a single item of data, and subsequently retrieved by a PL/SQL application and the XML mapped to a nested table PL/SQL record type.
- Flexibility – Changing the structure of XML is very easy, in particular adding or removing elements and attributes.
Design Considerations
XML and PL/SQL
At first glance it is easy to be over-whelmed with the available XML functionality in PL/SQL. It helps if you can understand the origins of the various capabilities and see what supersedes what as there is quite a bit of “legacy” support. The main split is between XML Developers Kit (XDK) introduced in Oracle 8i and XMLDB introduced in Oracle 9i R2.
I find XMLDB, with its XMLTYPE and API packages DBMS_DOM, DBMS_PARSER, and DBMS_XMLPROCESSOR, is by far the most convenient to use and easily satisfies all of my XML programming needs. The XMLTYPE object type is also supported with a host of methods to interrogate and manipulate the XML content.
There are several ways of reading the content of an XMLTYPE document, mainly via its ExtractValue method or the DBMS_XMLDOM package. ExtractValue is convenient for pulling out specific values from the XML document identified by an XPATH string, and DBMS_XMLDOM is better for walking the through the nodes of the XML document in particular where you want to read in the full data content.
PL/SQL Control Flow
Whichever method you op for to read the content of the XMLTYPE, where it is being used to drive the flow of the program as in the case of metadata, then in needs to be converted into a more accessible format. The best way to simulate the hierarchical structure of XML is using nested tables within PL/SQL record types. In Oracle 9i there is no limitation to how many levels of nesting you can descend, so your typical XML document should be easily accommodated.
Once you have all of your XML metadata loaded into a nested table PL/SQL record type, then it easy to traverse the arrays via nested loops in the PL/SQL code, and having very little cost or overhead with any further repetitive traverses.
We are now onto the main objective of this article which is to explain how you would retrieve data from XML and load up the PL/SQL record type. In the past I would have done this by looping through the nodes in the XMLTYPE using DBMS_XMLDOM, mapping parent elements to records in an associative array, and then mapping elements and attributes holding data into columns on the record type. Apart from having to hard code all of your mappings, this has been fairly straight forward to code and has worked effectively. When you start to use this approach extensively and find that you are repeating the same structure of code again and again, and keep including the same little intricacies of using DBMS_XMLDOM, then its time to write a generic package which is driven by its own metadata to define the mappings between the XMLTYPE and the PL/SQL record type.
Strong Typing
The main issue with PL/SQL record types in a compiled program is that they are strong typed, i.e. their definition is predefined at compile time, which means that any generic package developed to perform the mapping would have to be compiled with the PL/SQL record type it is trying to populate. A very useful way to overcome this is to use dynamic PL/SQL and generate the PL/SQL on the fly to access the PL/SQL record type.
Ideally, we would want to have a PL/SQL function to return a PL/SQL record type populated with the data content of the XML. Since parameters have to be strong typed as well, then it will not be possible to develop a packaged function or procedure that returns the PL/SQL record type populated with the XML data content. An alternative means of passing data between programs is to use package global variables. Dynamic PL/SQL can reference global variables in any package loaded into memory by the session, so it is feasible to generate dynamic PL/SQL code to populate each component of the record type global variable.
Dynamic PL/SQL
Having the EXECUTE IMMEDIATE command makes using dynamic PL/SQL very straight forward. Though this command does have its limitations where code cannot exceed the 32k bytes in size of code. The alternative is to use the more traditional, but also more resilient method of using the DBMS_SQL supplied package, which has no limit to the size of SQL you use. It is not as easy to use as EXECUTE IMMEDIATE, but it does the job.
Solution
Outlined Design
So we have now discussed and overcome each potential limitation and now have a workable solution. So to summarise the overall design we have:
- The generic PL/SQL program would use its own XML metadata to define the XML to PL/SQL record type mappings.
- DBMS_XMLDOM would be used to retrieve the data from the XMLTYPE.
- The data content of the XML would be mapped into PL/SQL record type using dynamic PL/SQL.
- A global variable representing the PL/SQL record type would be used as an interface between the the calling program and the package solution.
Demonstration Example
Source XML Metadata
Rather than show an authentic example and have to explain what the metadata is representing, I have put together a hypothetical example. It shows a simple hierarchical structure that descends five levels with content data in a mixture of elements and attributes. The elements have been given obvious names eg Level1, Level2, … etc.
<?xml version="1.0"?> <level1 Id="1" Name="Name 1"> <description>Level 1 Demo</description> <level2 Id="11" Name="Name 1.1"> <description>Level 2 Demo - 1.1</description> <level3 Id="111" Name="Name 1.1.1"> <description>Level 3 Demo - 1.1.1</description> <level4 Id="1111" Name="Name 1.1.1.1"> <description>Level 4 Demo - 1.1.1.1</description> <level5 Id="11111" Name="Name 1.1.1.1.1"> <description>Level 5 Demo - 1.1.1.1.1</description> </level5> <level5 Id="11112" Name="Name 1.1.1.1.2"> <description>Level 5 Demo - 1.1.1.1.2</description> </level5> </level4> </level3> <level3 Id="112" Name="Name 1.1.2"> <description>Level 3 Demo - 1.1.2</description> <level4 Id="1121" Name="Name 1.1.2.1"> <description>Level 4 Demo - 1.1.2.1</description> <level5 Id="11211" Name="Name 1.1.2.1.1"> <description>Level 5 Demo - 1.1.2.1.1</description> </level5> <level5 Id="11212" Name="Name 1.1.2.1.2"> <description>Level 5 Demo - 1.1.2.1.2</description> </level5> </level4> </level3> </level2> <level2 Id="12" Name="Name 1.2"> <description>Level 2 Demo - 1.2</description> </level2> </level1>
Target Record Type
This XML data needs to be loaded into a PL/SQL record type that simulates the hierarchical structure of the XML metadata:
TYPE t_level5 IS RECORD ( id VARCHAR2(10), name VARCHAR2(100), descr VARCHAR2(100) ); TYPE t_level5_col IS TABLE OF t_level5; TYPE t_level4 IS RECORD ( id VARCHAR2(10), name VARCHAR2(100), descr VARCHAR2(100), level5 t_level5_col ); TYPE t_level4_col IS TABLE OF t_level4; TYPE t_level3 IS RECORD ( id VARCHAR2(10), name VARCHAR2(100), descr VARCHAR2(100), level4 t_level4_col ); TYPE t_level3_col IS TABLE OF t_level3; TYPE t_level2 IS RECORD ( id VARCHAR2(10), name VARCHAR2(100), descr VARCHAR2(100), level3 t_level3_col ); TYPE t_level2_col IS TABLE OF t_level2; TYPE t_level1 IS RECORD ( id VARCHAR2(10), name VARCHAR2(100), descr VARCHAR2(100), level2 t_level2_col ); TYPE t_level1_col IS TABLE OF t_level1; TYPE t_param IS RECORD ( level1 t_level1_col );
The record types do not necessarily have to be PL/SQL record types, but can also be object types defined in the database.
XML to Record Type Mappings
To define a mapping between the source XML metadata and the target PL/SQL record type, we use some XML metadata:
<?xml version="1.0"?> <Mappings Global="demo_metadata.g_param" DateFormat="YYYYMMDD"> <Record Id="Level1" ParentId="" Source="Level1" ParentSource="" Target="level1" Type="test_metadata2.t_level1_col"> <Mapping Type="String" Source="@Id" Target="id"/> <Mapping Type="String" Source="@Name" Target="name"/> <Mapping Type="String" Source="Description" Target="descr"/> <Mapping Type="Boolean" Source="@IsValid" Target="is_valid_ind"/> <Mapping Type="Number" Source="@Version" Target="version"/> <Mapping Type="LargeString" Source="Comments" Target="comments"/> <Mapping Type="Date" Source="CreatedOn" Target="create_dt"/> <Mapping Type="Date" Source="UpdatedOn" Target="update_dt" DateFormat="YYYYMMDDHH24MISS"/> </Record> <Record Id="Level2" ParentId="Level1" Source="Level2" ParentSource="Level1" Target="level2" Type="test_metadata2.t_level2_col"> <Mapping Type="String" Source="@Id" Target="id"/> <Mapping Type="String" Source="@Name" Target="name"/> <Mapping Type="String" Source="Description" Target="descr"/> </Record> <Record Id="Level3" ParentId="Level2" Source="Level3" ParentSource="Level2" Target="level3" Type="test_metadata2.t_level3_col"> <Mapping Type="String" Source="@Id" Target="id"/> <Mapping Type="String" Source="@Name" Target="name"/> <Mapping Type="String" Source="Description" Target="descr"/> </Record> <Record Id="Level4" ParentId="Level3" Source="Level4" ParentSource="Level3" Target="level4" Type="test_metadata2.t_level4_col"> <Mapping Type="String" Source="@Id" Target="id"/> <Mapping Type="String" Source="@Name" Target="name"/> <Mapping Type="String" Source="Description" Target="descr"/> </Record> <Record Id="Level5" ParentId="Level4" Source="Level5" ParentSource="Level4" Target="level5" Type="test_metadata2.t_level5_col"> <Mapping Type="String" Source="@Id" Target="id"/> <Mapping Type="String" Source="@Name" Target="name"/> <Mapping Type="String" Source="Description" Target="descr"/> </Record> </Mappings>
The metadata has two levels of mapping: one at record level and one at column level. There are rules for defining the mappings:
- The record mappings are driven from the record types, i.e. from the target’s perspective.
- The target record type must be a collection (TABLE TYPE)
- The hierarchical relationship between the records is defined using an arbitrarily defined unique id (“Id”) and the parent id (“ParentId”) on the child record acting as a foreign key. The full parent-child chain must be defined without any missing links or gaps.
- The Record Mapping definitions must follow in a parent child sequence, where a record mapping defined before its parent is not supported.
- The records do not have to contain any column mappings
- There can only be one and only one top most parent record, i.e. where the ParentId is NULL. It must also be the first to be defined.
- The “Source” refers to the source element that contains the attributes and/or elements holding the data content. The “ParentSource” refers to the parent element name in order to distinguish between elements and attributes used in different parent elements.
- The “Target” is the name of the record column on the target PL/SQL record type.
- The “Global” is the name of the target global variable to be populated with the XML data content.
- It is also important to define the “Type” of the PL/SQL record, including making any reference to the package header the type is defined in. Database object types also can be used.
- Column mappings are defined within the “Mapping” elements. Source attribute names must be prefixed with an “@” as in the XPATH notation.
- The column type is used to determine what data conversion is required. The allowable values are:
- String. No conversion, but must be less than 32k bytes.
- Number (performs a to_number conversion), returns null if conversion fails.
- Boolean (accepts yes, no, true, false, Y, N, which it will convert into a TRUE or FALSE value in BOOLEAN column on the record type.
- Date which assumes a format mask as specified by the “FormatMask” attribute in the mappings metadata. If this mask has not been supplied, then it uses the default date format of the database. The format mask can be defined globally against the “Metadata” root element, or against an individual column mapping.
- LongString. The maximum size the XMLDOM package can read from an XML node is 64k, so this is to handle data between 32k bytes and 64k bytes. Rather than use CLOBs in the record type ad worry about the management of temporary LOBs, the data is read into an table type of VARCHAR2(32767). With current limitations this would result in array not exceeding two entries.
Conclusion
The solution described here gives a quick and easy method for mapping an XML document to a PL/SQL record type.