Monday, February 20, 2012

OPENXML Question

I have an text column in a table that was intended to store XML data about queries that users submitted in our database. The problem appears that the application inserting records into the table doesn't do a great job of formatting the XML and I am having a great deal of trouble getting workable results.

The XML data in the table looks like:
<request>
<name>LastName</name>
<oper>=</oper>
<value>Smith</value>
<name>FirstName</name>
<oper>=</oper>
<value>John</value>
<name>MiddleName</name>
<oper>=</oper>
<value>Q.</value>
</request>

Note the groups of three elements (name, oper, value). These always appear sequentially in the data and there are always a fixed number of elements. If an element has no data, then it contains the value 'null' (ie, <value>null</value>).

I want to get a result like:

Col1 Col2 Col3
======== ====== ========
LastName = Smith
FirstName = John
MiddleName = Q

When I execute OPENXML, I specify option 2 (element centric) and the following WITH clause:

WITH (name varchar(10),
oper varchar(10),
value varchar(50) )

I get a single row returned. I have also tried:

WITH (name varchar(10) '@.name',
oper varchar(10) '@.oper',
value varchar(50) '@.value')

I get a single row returned (with null values).

I've tried lots of other permutations, but nothing so far has worked. The only thing that has been modestly successful is to create and Edge Table (exclude the WITH clause), but that's really hard for me to work with. Another option is to insert a <line num="#">...<line> around each group of three elements (name, oper, value). That worked well, but it's a pain to implement (it means going back and updating all rows where the XML data exists and inserting the proper data.

I am by no means an XML guru, so if anyone has a suggestion, I'd love to hear it.

Regards,

hmscottI think you are struggling because you have assigned meaning to the order of elements in your source xml file. To paraphrase Brett: the order of elements in an xml file *should* have no meaning. Assigning meaning to the order of elements leads to fragile and inflexible code (dependent on element order to work correctly).

a better xml design would be something that groups each tuple, but (as you mention above) that means you have to change the xml.

For instance:

<request>
<var name="LastName" oper="=" value="Smith"/>
...
</request>

so that it's unambiguous which name/oper/value goes together.|||I think you are struggling because you have assigned meaning to the order of elements in your source xml file. To paraphrase Brett: the order of elements in an xml file *should* have no meaning. Assigning meaning to the order of elements leads to fragile and inflexible code (dependent on element order to work correctly).

I didn't assign meaning to the order of anything. This is legacy code that's been around since before my time (and which also means that the developer whose neck I wish to wring has long since departed).

I know what I would do to prevent the problem; what can I do to deal with the data as it is (about 1,000,000+ rows)?

Regards,

hmscott|||well, this works, but I'm not terribly proud of it. It's certainly fragile... :)

declare @.names table (id int identity(1,1), name varchar(10))
declare @.opers table (id int identity(1,1), oper varchar(10))
declare @.values table (id int identity(1,1), value varchar(10))

DECLARE @.idoc int
DECLARE @.doc varchar(1000)
SET @.doc ='
<request>
<name>LastName</name>
<oper>=</oper>
<value>Smith</value>
<name>FirstName</name>
<oper>=</oper>
<value>John</value>
<name>MiddleName</name>
<oper>=</oper>
<value>Q.</value>
</request>'
--Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @.idoc OUTPUT, @.doc

insert into @.names (name) SELECT * FROM OPENXML (@.idoc, '/request/name',1) WITH (name varchar(10) '.')
insert into @.opers (oper) SELECT * FROM OPENXML (@.idoc, '/request/oper',1) WITH (oper varchar(10) '.')
insert into @.values (value) SELECT * FROM OPENXML (@.idoc, '/request/value',1) WITH (value varchar(10) '.')

EXEC sp_xml_removedocument @.idoc

select n.name, o.oper, v.value from
@.names n
join @.opers o on o.id=n.id
join @.values v on v.id=n.id|||well, this works, but I'm not terribly proud of it. It's certainly fragile... :)

Hey, thanks. I hadn't thought of trying it that way.
:beer:

That's what I get for beating my head against the wall for too long.

Regards,

hmscott|||glad to help. note that my idea falls down completely if there is a missing name,oper, or value element anywhere in your file, because then it would group the wrong elements together.

that's what I meant by it being fragile. :)|||glad to help. note that my idea falls down completely if there is a missing name,oper, or value element anywhere in your file, because then it would group the wrong elements together.

that's what I meant by it being fragile. :)

That's understood. Actually, the way the data is organized, I think I can throw this inside a cursor. It still will fail in the event of a missing element, but perhaps a bit of error checking will capture that.

Thanks again.

hmscott

No comments:

Post a Comment