OPENXML - Inside story
In the
first part
we had seen how we can use SQL Server's in built capabilities to get XML as an
output. In this article we would take a preview of how to use yet another XML
capability from SQL Server 2000. OpenXML primarily gives the ability to insert
XML data to the relational database.
Not wasting much lets look at the syntax of OPENXML:
OPENXML(iDoc,
rowpattern, flags)
[WITH (rowsetschema [colpatterns] | tablename)]
Taking a closer look into the parameters:
-
iDoc . We get this by calling a stored procedure
called sp_xml_preparedocument. We’ll talk more about this stored procedure in a
moment.
-
The RowPattern parameter specified which nodes we
want OPENXML to process using XPath.
-
The Flags parameter specifies the format of our
results. The following values can be used:
-
0 – Default value. Attribute centric mapping.
-
1 – Use Attribute centric mapping.
-
2 – Use element centric mapping.
-
8 – Only unconsumed data should be copied to the
overflow property @mp;xmltext.
Attribute centric grabs the data from specific
elements whereas element centric grabs data from specific sub elements. This
will all make sense when we do a couple of examples. Below is an simple
example:
DECLARE
@index int
DECLARE @XMLdoc varchar(8000)
SET
@XMLdoc ='<DataSet>
<Person id="1">
<Name>Bharat</Name>
<PhoneNo>2225042</PhoneNo>
</Person>
<Person id="2">
<Name>Saravana</Name>
<PhoneNo>6762945</PhoneNo>
</Person>
</DataSet>'
EXEC
sp_xml_preparedocument @index OUTPUT, @XMLdoc
SELECT
*
FROM OPENXML (@index, 'DataSet/Person')
WITH (Name varchar(100) 'Name' , PhoneNo Varchar(30) 'PhoneNo', id Varchar(10))
EXEC
sp_xml_removedocument @index
Well I think the syntax of OPENXML clear and the
usage. But there are two new stored procedures you can see. "sp_xml_preparedocument"
would be used to prepare the XML document and gets loaded into the memory. This
statement returns a pointer as an integer. And for all the reference we would
need this pointer only. And the next stored procedure is "sp_xml_removedocument"
this would freeup the memory for you.
To retrieve a more useful rowset, the rowpattern
parameter should specify the path to the level in the document from which you
require data. You define the path as an XPath expression in which the nodes in
the tree, separated by / delimiters, are identified. You can see from our
example that we have used the "DataSet/Person" as a XPath expression. The rowpattern
parameter can limit the rows being returned by including an XPath expression
that defines some selection criteria. For example, the following sample code
could be used to return a rowset that contains all items with id greater than
1.
SELECT
*
FROM OPENXML (@index, 'DataSet/Person[@id>1]')
WITH (Name varchar(100) 'Name' , PhoneNo Varchar(30) 'PhoneNo', id Varchar(10))
Now adding to this, if you were to see the column
Names are similar to the element name in the XML document. You can give an
alias to this as the first parameter. See the example below where we display id
as identifier.
SELECT
*
FROM OPENXML (@index, 'DataSet/Person')
WITH (Name varchar(100) 'Name' , PhoneNo Varchar(30) 'PhoneNo', [Identifier]
Varchar(10) '@id')
The most practical application of the OpenXML function
is to insert data from an XML document into tables in the database. Which
Transact-SQL statement you use to do this depends on whether the table you want
to store the data in already exists.
Inserting to New table
You might occasionally want to use the data in an XML
document to create and populate a new table. This strategy might be a suitable
approach were you revising a product catalog, say, for which the XML catalog
document contained an updated version of the entire catalog. The most efficient
way to update the database might be to simply drop the existing catalog table
and re-create it with the new data. Since we have got the got the data from the
XML document we can directly use the Select ..Into clause and
create a new table.
SELECT
* INTO Persons
FROM OPENXML (@index, 'DataSet/Person')
WITH (Name varchar(100) 'Name' , PhoneNo Varchar(30) 'PhoneNo',
[Identifier] Varchar(10))
And now that from the above example we have created a
new Persons table we can view this data as:
Select
* from Persons
If the Persons table already exists, the above
Transact-SQL statement will fail, so any stored procedure using this approach
would need to use the DROP TABLE statement to delete the existing table first.
Of course, if an existing table is dropped, any constraints, such as primary
keys or foreign keys, and any indexes will be dropped with it. Hence use this
approach with caution.
Inserting into existing table
From the above example you must have guessed how we
are going to insert into an existing table. Well, yes we are going to use the "Insert
into ... Select .." clause. So lets create an table to hold the data
upfront.
Create
Table Persons ( Identifier Int, Name Varchar(100), PhoneNo Varchar(30))
And now insert the data:
Insert
into Persons (Name, PhoneNo, Identifier)
SELECT *
FROM OPENXML (@index, 'DataSet/Person')
WITH (Name varchar(100) 'Name' , PhoneNo Varchar(30) 'PhoneNo',
[Identifier] Varchar(10)
'@id')
In these examples we have taken a simple table and
inserted values. These can span multiple tables also. Transaction has to be
handled by the caller (ADO transaction or MSDTC transaction) or in the stored
procedure.
Advanced OPENXML
The basic usage of OPENXML has been explained above.
In addition to being useful for inserting XML data into tables, the OpenXML
function can be used to retrieve metadata from an XML document. This
functionality allows you to write code that queries the actual XML structure of
the document, and you could use it to build your own XML-processing
application. A typical example:
SELECT *
FROM OPENXML (@index, 'DataSet/Person')
WITH ([Id] Varchar(10)
'@mp:id',
NodeName VARCHAR(20)
'@mp:localname',
ParentNode VARCHAR(20)
'@mp:parentlocalname')
Now the output from this query is interesting. It
tells that Persons node come at 2nd and 6th position. The NodeName is Persons
and the ParentNode is DataSet. There is arsenal of such metadata parameters you
can use: I've listed them below:
-
@mp:id - Unique identifier for
the specified node.
-
@mp:localname - The name of the
element.
-
@mp:namespaceuri - The namespace
for the specified node. The value of this property is NULL if no namespace is
defined.
-
@mp:parentid - The ID of the
parent node.
-
@mp:parentlocalname - The name of
the parent node.
-
@mp:prefix - The prefix used for
the specified node.
-
@mp:prev - The ID of the previous
sibling node.
-
@mp:parentnamespaceuri - The
namespace of the parent node.
-
@mp:parentprefix - The prefix of
the parent node.
-
@mp:xmltext - The textual
representation of the node and its attributes and subelements.
I am not getting to the details of each in this
article. These can be experimented with to ...
Conclusion
In the future articles we will discuss other XML
features offered by SQL Server 2000 in detail.
|