Monday, February 20, 2012

OPENXML performance

Hi,
we use heavily openxml to insert data into our sqlserver 2000 sp3 tables.
We have some performance issues that seems to be caused by our use of OPENXML.
Doing a trace and wait analisys (1) we found that much of the time is taken
by oledb operation.
Looking at the trace file we saw a remote scan appening any time we do an
xpath query over openxml.
We managed to do all the openxml work at the sp start, putting all the data
in table variables (in memory), freeing the xml doc and then using those
tables to do the work.
Looking at docs we didn't find great infos about some aspects of OPENXML.
OPENXML uses an OLEDB rowset to do the work, so it seems that it loads the
xml in a MSXML dom and then populates the rowset.
Looking at the sqlserver proc it seems that it uses MSXML2, so looking at
the version of it we found 8.2.7919.0.
This seems a 2.6 SP2 version.
Is it possible/recommended to upgrade the version to 2.5 sp3?
Is it possible/recommended to upgrade the version to MSXML3 or MSXML4?
It seems also that there's a buffer where those dom documents are stored.
Is it possible to trace how much documents/space are keeped by mssql?
There's some optimization to set up with OPENXML?
To pass the xml fragment to mssql we use mssql oledb, we can have some
advantage by using sqlxmloledb?
Carlo Folini
(1) http://support.microsoft.com/default...B;EN-US;271509
See below.
Best regards
Michael
"Carlo Folini" <folini@.community.nospam> wrote in message
news:A01A902E-799B-44E5-8D9B-54813547B769@.microsoft.com...
> Hi,
> we use heavily openxml to insert data into our sqlserver 2000 sp3 tables.
> We have some performance issues that seems to be caused by our use of
> OPENXML.
> Doing a trace and wait analisys (1) we found that much of the time is
> taken
> by oledb operation.
> Looking at the trace file we saw a remote scan appening any time we do an
> xpath query over openxml.
> We managed to do all the openxml work at the sp start, putting all the
> data
> in table variables (in memory), freeing the xml doc and then using those
> tables to do the work.
> Looking at docs we didn't find great infos about some aspects of OPENXML.
> OPENXML uses an OLEDB rowset to do the work, so it seems that it loads the
> xml in a MSXML dom and then populates the rowset.
> Looking at the sqlserver proc it seems that it uses MSXML2, so looking at
> the version of it we found 8.2.7919.0.
> This seems a 2.6 SP2 version.
Correct.

> Is it possible/recommended to upgrade the version to 2.5 sp3?
Do you mean 2.6 SP3? It is not supported to do that upgrade yourself. There
are some issues with running the MSI if SP2 is already installed.

> Is it possible/recommended to upgrade the version to MSXML3 or MSXML4?
No. Although the next SQL Server 2000 service pack is probably moving to use
MSXML 3.

> It seems also that there's a buffer where those dom documents are stored.
> Is it possible to trace how much documents/space are keeped by mssql?
Unfortunately not. The rule of thumb is 3 to 6 times as much as the original
textual size.

> There's some optimization to set up with OPENXML?
There are some best practices (some you are already following above). In
addition, you want to avoid parent axis (..) unless needed, and do not use
flag 3.

> To pass the xml fragment to mssql we use mssql oledb, we can have some
> advantage by using sqlxmloledb?
Not from the OpenXML point of view.
What are the symptoms that you see and what is the doc size and load
characteristics (feel free to contact me per email).
Best regards
Michael
> --
> Carlo Folini
> (1) http://support.microsoft.com/default...B;EN-US;271509
|||inline...

> Do you mean 2.6 SP3? It is not supported to do that upgrade yourself. There
> are some issues with running the MSI if SP2 is already installed.
Yes, I meant 2.6 sp3. The sp2 version is different from the 'official one'
published on the support site.
We have 8.2.7919.0 instead of 8.2.8307.0.
Do you think that we have to update our sp2 installation?
> There are some best practices (some you are already following above). In
> addition, you want to avoid parent axis (..) unless needed, and do not use
> flag 3.
What do you mean for "flag 3"?
Having the following xml structure/query, how can we avoid parent axis?
SELECT Address
FROM OPENXML(@.HDoc, 'MyData/Tab/Row', 1)
WITH (MyTableName VARCHAR(18) '../@.TableName',
Address VARCHAR(120))
WHERE MyTableName = 'Pippo'
<MyData>
<Tab MyTableName="Pippo">
<Row address="sdfdsf fds"/>
<Row address="432 432243"/>
<Row address="bcv bvccb bvc"/>
</Tab>
</MyData>

> What are the symptoms that you see and what is the doc size and load
> characteristics (feel free to contact me per email).
>
We have some critical conditions that we are investigating, in those
situation the CPU% grows to 100%.
Having the cpu to 100% the time taken by each sp grows, causing the lock to
be held for long time.
So we saw in our log that we have a number of timeouts and also some
deadlock conditions.
The timeouts are throwed by the sp that uses openxml.
The xml stream (passed as text parameter to the sp) ranges from 6k to 60k.
The data are essentially an xml representation of some tables content (the
data are retrieved from host via COMTI and serialized to xml by a vb6
component).
Thanks for you help
Carlo
|||See below.
Best regards
Michael
"Carlo Folini" <folini@.community.nospam> wrote in message
news:96468399-2219-4418-A80B-B9D30D52E2D2@.microsoft.com...
> inline...
> Yes, I meant 2.6 sp3. The sp2 version is different from the 'official one'
> published on the support site.
> We have 8.2.7919.0 instead of 8.2.8307.0.
> Do you think that we have to update our sp2 installation?
The support site is a bit newer, since it includes some additional bug
fixes. I don't think they help you with your perf issues.

> What do you mean for "flag 3"?
With the syntax OPENXML( handle, row expression, flag) flag is the third
argument. It should not be set to 3 for performance reasons with MSXML 2.6
any version (MSXML 3.0 should have fixed the problem, still 3 should be
avoided if possible).

> Having the following xml structure/query, how can we avoid parent axis?
> SELECT Address
> FROM OPENXML(@.HDoc, 'MyData/Tab/Row', 1)
> WITH (MyTableName VARCHAR(18) '../@.TableName',
> Address VARCHAR(120))
> WHERE MyTableName = 'Pippo'
> <MyData>
> <Tab MyTableName="Pippo">
> <Row address="sdfdsf fds"/>
> <Row address="432 432243"/>
> <Row address="bcv bvccb bvc"/>
> </Tab>
> </MyData>
You cannot, unless you can copy the attribute to its children when the XML
is generated or along the way (ie, mid-tier). Note that SQL Server SP4
should not have this problem.

> We have some critical conditions that we are investigating, in those
> situation the CPU% grows to 100%.
> Having the cpu to 100% the time taken by each sp grows, causing the lock
> to
> be held for long time.
> So we saw in our log that we have a number of timeouts and also some
> deadlock conditions.
> The timeouts are throwed by the sp that uses openxml.
> The xml stream (passed as text parameter to the sp) ranges from 6k to 60k.
> The data are essentially an xml representation of some tables content (the
> data are retrieved from host via COMTI and serialized to xml by a vb6
> component).
I assume that the XPath execution will take utilization up to that level.
The document size does not look too bad though. How many concurrent
transactions are using OpenXML at the same time?
Also, deadlocks normally also indicate a problem with respect to your update
logic. Can you take the OpenXML part out of the transactions that deadlock
and see what happens then?

> Thanks for you help
> Carlo
Best regards
Michael

No comments:

Post a Comment