Wednesday, March 28, 2012
Optimizing insert performance
We're using SQL Server 2000 on a Windows 2000 box with dual
hyperthreaded Intel CPUs, RAIDed hard disks and 2GB RAM. It should
certainly be able to handle our currently fairly modest demands. We
have all the latest security patches. Parallelism is disabled.
We're currently experiencing unacceptable slowness for inserts to a
particular table, Account_Transaction (see the DDL at the end of this
message). As you can see, the main features are a BIGINT auto-increment
primary key, and three foreign keys. This table currently has 7 million
rows and another million are inserted each month. We are currently
unhappy with the length of time it takes to insert into this table as
this operation is at the heart of our application.
Here is an example insert, called from within a stored procedure:
INSERT INTO dbo.Account_Transaction(
invoice_id, transaction_type_id, qty, amount, commission,
tax_percentage, currency_code, points, line_id, comment,
balance_before, external_id)
VALUES
(
@invoice_id,
@transaction_type_id,
@qty,
@amount,
@amount_commission,
@tax_percentage,
@currency_code_home,
@points,
@line_id,
@comment,
@balance,
@external_id
)
Looking at this query's SP:StmtCompleted event in Profiler, we can
see the query took 375ms to complete and performed 76 reads (and
apparently zero writes, although the data was definitely inserted). We
think this is far too long.
What is strange, though, is that if we run the exact same statement
OUTSIDE of the calling procedure in Query Analyzer, substituting
literals for the variables, then it apparently completes in just 63ms,
although it performs the same number of writes:
INSERT INTO dbo.Account_Transaction(
invoice_id, transaction_type_id, qty, amount, commission,
tax_percentage, currency_code, points, line_id, comment,
balance_before, external_id)
VALUES
(6842289, 12, 1, 1, 0, 10, 'AUD', 0, 4, 'test', 0, 0)
If we now try the same two things on our test database, running on a
bog-standard desktop machine, with only 300,000 rows in the
Account_Transaction table, then both queries execute in similar times of
around 70ms.
So my question is - what is causing the slowness, and how can we speed
it up? How do we make the insert operation from within the SP take the
same amount of time as it takes when run on its own?
Any help appreciated,
Mike C
CREATE TABLE [dbo].[Account_Transaction] (
[id] [bigint] IDENTITY (1, 1) NOT NULL ,
[invoice_id] [bigint] NOT NULL ,
[transaction_type_id] [int] NOT NULL ,
[qty] [int] NOT NULL ,
[amount] [money] NOT NULL ,
[tax_percentage] [decimal](9, 3) NULL ,
[commission] [float] NOT NULL ,
[currency_code] [char] (3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[line_id] [int] NOT NULL ,
[points] [int] NOT NULL ,
[comment] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[balance_before] [money] NOT NULL ,
[balance_after] [money] NULL ,
[external_id] [bigint] NULL ,
[msrepl_tran_version] [uniqueidentifier] NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Account_Transaction] WITH NOCHECK ADD
CONSTRAINT [PK_Account_Transaction] PRIMARY KEY CLUSTERED
(
[id]
) WITH FILLFACTOR = 90 ON [PRIMARY]
GO
ALTER TABLE [dbo].[Account_Transaction] WITH NOCHECK ADD
CONSTRAINT [DF__Account_T__amoun__3118447E] DEFAULT (0) FOR [amount],
CONSTRAINT [DF__Account_T__point__320C68B7] DEFAULT (0) FOR [points],
CONSTRAINT [DF__Account_T__msrep__31783731] DEFAULT (newid()) FOR
[msrepl_tran_version]
GO
CREATE INDEX [IX_Account_Transaction] ON
[dbo].[Account_Transaction]([invoice_id]) WITH FILLFACTOR = 90 ON [PRIMARY]
GO
ALTER TABLE [dbo].[Account_Transaction] ADD
CONSTRAINT [FK_Account_Transaction_Account_Invoice]
FOREIGN KEY
(
[invoice_id]
) REFERENCES [dbo].[Account_Invoice] (
[id]
) ON DELETE CASCADE ON UPDATE CASCADE ,
CONSTRAINT [FK_Account_Transaction_Line] FOREIGN KEY
(
[line_id]
) REFERENCES [dbo].[Line] (
[id]
),
CONSTRAINT [FK_Account_Transaction_Transaction_Type] FOREIGN KEY
(
[transaction_type_id]
) REFERENCES [dbo].[Transaction_Type] (
[id]
)
GO

Mike,
1) Make sure that your table has no triggers.
2) Try dropping the indexes just prior to inserting and re-create them afterwards.
Another thing that might help: if you do the insert through a stored
procedure that accepts parameters, try re-assigning those parameters to
local variables first:
CREATE PROC myProc
@par INT
AS
DECLARE @lpar INT
SET @lpar = @par
INSERT INTO MyTable VALUES (@lpar)
It sounds like Mike is speaking of single transactional inserts and not bulk
inserts, so dropping / recreating indexes on the table would not be
practical.
Consider if the stored procedure is re-compiling each time it is executed.
This could explain the overhead when executing the insert from the procedure
while it still performs the same number of writes as the test insert from
Query Analyzer.
INF: Conditions for Stored Procedure Recompilation
http://support.microsoft.com/defaul...kb;en-us;104445
How to identify the cause of recompilation in an SP:Recompile event
http://support.microsoft.com/defaul...kb;en-us;308737
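One common recompilation trigger those articles describe is interleaving DDL and DML inside the procedure. A minimal sketch of the pattern to avoid (the procedure and temp table names here are hypothetical, not from Mike's system):

```sql
-- Recompile-prone: DDL interleaved with DML forces the batch to
-- recompile when the DML that references the new object is reached.
CREATE PROC dbo.usp_InsertTransaction_Slow
    @invoice_id BIGINT
AS
    CREATE TABLE #work (invoice_id BIGINT)      -- DDL
    INSERT INTO #work VALUES (@invoice_id)      -- DML: recompile point
    CREATE INDEX ix_work ON #work (invoice_id)  -- more DDL: another one
    -- ... the Account_Transaction INSERT then runs after a fresh compile
GO
```

Tracing the SP:Recompile event alongside SP:StmtCompleted in Profiler would show whether the 375ms includes compile time.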
Optimizing Bulk Insert
I am trying to find the quickest way to have a bulk insert completed. What I
have found from tests is:
1. I need to have the batch size set to a minimum of 4% of the source file
(otherwise the command just quits with no message).
2. I need to have a clustered index and in the bulk insert command, use the
ORDER clause with the same order as the clustered index.
3. I'm not sure if a format file makes this quicker, but I am using one.
Please let me know if my findings are incorrect and if there are any other
things that I could be doing to make this go any faster.
Any help would be appreciated.
Ellie

Check out this link:
http://www.mssqlcity.com/Tips/bulk_...ptimization.htm
Ellie (nospam@.nospam.net) writes:
> 3. I'm not sure if a format file makes this quicker but I am using one.
I believe the fastest format is native format. But that requires of
course that the data comes from a source that can produce native format.
In essence, another SQL Server.
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx
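Pulling Ellie's three findings together, the command might look like this sketch (the table name, file paths, and ordering column are hypothetical):

```sql
BULK INSERT dbo.TargetTable
FROM '\\fileserver\share\data.txt'
WITH (
    BATCHSIZE = 50000,          -- finding 1: an explicit batch size
    ORDER (id ASC),             -- finding 2: matches the clustered index
    TABLOCK,                    -- reduces locking overhead during the load
    FORMATFILE = '\\fileserver\share\data.fmt'  -- finding 3
)
```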
Friday, March 23, 2012
optimize insert to access DB
Hi,
I'm using Access tables to store my data. To retrieve data I use the Jet ODBC engine in C++.
I want to insert a large amount of records (about 20 million records) into my database in the fastest way I can. If I use SQL syntax (INSERT INTO table ...) it takes ages (about 500 records per second).
If I write a CSV file and then use import (via Access) it's much faster, but here I have two problems:
1. I don't know how to use the Access import tool from C++.
2. I don't think I can distribute the Access import tool with my product.
So my questions are:
1. Does anyone know a tool that inserts records in an optimized way?
2. How can I use the Access import tool from C++?
3. Can I use the Jet engine to import CSV files?
Thanks, Ishay
This forum is dedicated to SSIS. You may have better luck asking this question in the appropriate forum.
You could use SSIS to do this, and it should bulk insert into Access using the OLE DB Destination component. You won't be able to distribute it to your users, though.
Could you mark this thread as answered if you have no further questions so we can close it out? Thanks!
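On the original question 3 (importing a CSV with the Jet engine directly): Jet's Text ISAM can read a delimited file in place, so the import can be a single Jet SQL statement issued over the same connection from C++. A sketch, with hypothetical paths and table names (note that Jet conventionally replaces the dot in the file name with "#" when it is used as a table name):

```sql
-- Reads records.csv (with a header row) from C:\Data and creates a new
-- Access table named Imported, entirely inside the Jet engine.
SELECT *
INTO Imported
FROM [Text;FMT=Delimited;HDR=Yes;Database=C:\Data].[records#csv]
```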
Wednesday, March 21, 2012
Optimizations
I currently have a nightly process that truncates a set of tables prior to
doing a bulk insert from an ODBC data source. If these tables have indexes
defined, will the bulk insert step make use of these? The reason I ask is
because I wasn't sure if it is worth performing optimizations (maintenance
plan) immediately after the insert. Any help is appreciated.
--
Drew

Drew,
If the data to be imported is huge (a lot of rows), then it is better to
drop the indexes and recreate them after the bulk operation. Remember,
indexes are updated for each row inserted, updated or deleted.
AMB
Depending on the amount of data being inserted, it's probably better to drop
the indexes and recreate them after the bulk insert, especially if one of the
indexes is a clustered index.
Bulk insert is not helped by indexes of any kind, but it sure can be
hindered by them. When the table is truncated, all the pointers in the
indexes are worthless. When the new data comes in, the clustered index (if
it exists) has to maneuver each row into the same order on the pages as the
clustered index is configured. This will slow down the bulk insert unless the
data is already in the same order.
If the data set is small, then just load the data and do a DBCC INDEXDEFRAG
to reindex the table.
Bulk insert can be faster if you drop the indexes first. However, there are
other things you may want to consider (for example, database recovery mode).
You can search for "optimizing bulk copy performance" in BOL for more detail.
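Combined, the advice in this thread might look like the following sketch (the database, table, and index names are hypothetical; BULK_LOGGED recovery is the "optimizing bulk copy performance" suggestion):

```sql
-- 1. Minimize logging for the load window.
ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED

-- 2. Drop nonclustered indexes so the load doesn't maintain them row by row.
DROP INDEX Target.IX_Target_Customer

TRUNCATE TABLE dbo.Target
BULK INSERT dbo.Target FROM '\\server\share\feed.txt' WITH (TABLOCK)

-- 3. Rebuild the indexes once, after the data is in.
CREATE INDEX IX_Target_Customer ON dbo.Target (customer_id)

-- 4. Restore the normal recovery model.
ALTER DATABASE MyDb SET RECOVERY FULL
```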
Thanks for your input everyone. Again, much appreciated.
Optimization Rules for XML Bulk Insert
I was wondering if there were any guidelines to optimize the XML Bulk
Insert component by using different XML formats. I read an article that
covered several cases, but can't remember where I saw it.
Thank for any help!
-Frank
I found what I was looking for, in case anyone else wants to know:
http://msdn.microsoft.com/library/de...erformance.asp
Friday, March 9, 2012
Opinions on using Delete then BULK INSERT
Hi. Just looking for your opinions on a technique I have seen for using BULK
INSERT. It reads through the flat file, first deleting rows in the target
table that it finds. It then issues the BULK INSERT command on the same flat
file. This has the effect of doing an insert or update.
The DBA reports that the tables tend to become more fragmented because of
the deletes than they would be using updates.
Are there any pitfalls to this approach? Does it sound OK?
Are there other approaches other than Transact SQL for inserting or updating
large amounts of flat file data?
Thanks!!
McG
It depends on how big the tables are and how big the flat file is.
Personally, I would probably BULK INSERT the flat file into a working table
first. Then using a chunking method, I would update/insert about 10,000
rows at a time until the task completed. This has the advantage of letting
SQL Server do what it does best using SET theory for determining updates and
inserts, but also doing it in small enough chunks that other processing can
continue with only slight interruption.
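The staging-plus-chunking approach above might be sketched like this (all object names are hypothetical; SET ROWCOUNT is the SQL Server 2000 idiom for capping the rows a statement affects):

```sql
-- Load the flat file into a working table first.
BULK INSERT dbo.Staging FROM '\\server\share\feed.txt' WITH (TABLOCK)

-- Insert missing rows in chunks of 10,000 so other processing is only
-- briefly blocked; each pass picks up rows not yet in the target.
SET ROWCOUNT 10000
WHILE 1 = 1
BEGIN
    INSERT INTO dbo.Target (id, amount)
    SELECT s.id, s.amount
    FROM dbo.Staging s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Target t WHERE t.id = s.id)
    IF @@ROWCOUNT = 0 BREAK   -- nothing left to insert
END
SET ROWCOUNT 0                -- restore normal behavior
```

The update half of the task would follow the same loop shape, with a condition that marks or skips rows already brought up to date.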
Just my .02
Rick Sawtell
MCT, MCSD, MCDBA

Thanks for the pointer Rick.
McG
Saturday, February 25, 2012
Operating system error code 3(The system cannot find the path specified.).
I use BULK INSERT to put data into myTable.
When the SQL Server is on the local machine, it works well. But when I
put the data in a SQL Server that is not local, I get an error message
like this:
Could not bulk insert because file 'C:\Data\2003
txtfiles\abif_20031130.txt' could not be opened. Operating system
error code 3(The system cannot find the path specified.).
BULK INSERT myTable
FROM 'C:\Data\2003 txtfiles\abif_20031130.txt'
with (
-- codepage = ' + char(39) + 'ACP' + char(39) + ',
fieldterminator = ';',
rowterminator = '\n',
keepnulls,
maxerrors=0)
Can someone explain to me why this error shows up?
Thanks in advance
- Loi -

The location of the file is relative to the SQL Server on which the BULK
INSERT command runs. To import a file from another machine, specify a UNC
path rather than use drive letters. For example:
BULK INSERT myTable
FROM '\\MyOtherServer\MyShare\Data\2003 txtfiles\abif_20031130.txt'
Note that the SQL Server service account needs permissions to the share.
--
Hope this helps.
Dan Guzman
SQL Server MVP
Hi Dan,
Thanks for your reply.
It does not work yet.
A follow-up question: you said the SQL Server service account needs
permissions to the share. In my case SQL Server is set up as follows:
In the security tab: Authentication - SQL Server and Windows.
Startup service account - system account.
Can somebody give me any hints?
Regards
- Loi -
> Startup service account - system account.
Try using a domain account rather than the local system account and ensure
the account has permissions to the share.
--
Hope this helps.
Dan Guzman
SQL Server MVP
Operand type clash: datetime is incompatible with text
Why am I getting "Operand type clash: datetime is incompatible with text"
for my datetime fields?
Here is the statement:
INSERT INTO tblCalendar (
adjName, DispDueDate, ClaimNumber, Juris, HearingDate, HearingType,
ClaimantName, Location, HearingTime, HearingPart, WCB#, Counsel,
DateRecd)
VALUES (
@Adjuster, @newDispositionDueDate, @ClaimNumber, @Juris, @newHearingDate,
@HearingType, @ClaimantName, @HearingLocation, @newHearingTime,
@HearingPartRoom, @WCBClaimNumber, @AssignedCounsel, @newNoticeReceived)

You might want to check the datatypes of each column in the tblCalendar table
and make sure they are compatible with the variables in the VALUES() clause.
If you have difficulty, post the exact CREATE TABLE statement and the
declared types and assigned values for each variable in the INSERT
statement.
Anith
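As a concrete illustration of Anith's point (a hedged sketch; the column and parameter names come from the post above, but the declarations here are assumed): this error typically appears when a parameter is declared as text/ntext while the target column is datetime. Declaring, or explicitly converting to, datetime avoids the clash:

```sql
-- Assumed repro: @newHearingDate passed as text to a datetime column
-- raises "Operand type clash". Declaring it as datetime (or converting
-- explicitly) makes the assignment legal:
DECLARE @newHearingDate datetime
SET @newHearingDate = CONVERT(datetime, '2005-01-15 09:30', 120)

-- INSERT INTO tblCalendar (HearingDate, ...) VALUES (@newHearingDate, ...)
```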
openxml/'for xml' image data problem
I'm having problems with image fields when trying to insert XML into a table
using 'for xml' and openxml.
The code below demonstrates my problem. I have a table, tblAttachment that
contains a field named 'file' that contains image data.
prsXML is a simple proc that takes an xml document, extracts rows using
openxml and updates the row.
--prsXML
create proc dbo.prsXML2 @XMLDoc text as begin
declare @iDoc int
EXEC sp_xml_preparedocument @iDoc OUTPUT, @XMLDoc
select * into #tblAttachment
from openxml (@iDoc, '/NewDataSet/tblAttachment', 2) with tblAttachment
update tblAttachment set [ID] = t.[ID], [FileName] = t.[FileName],
[File] = t.[File]
from #tblAttachment t inner join tblAttachment f on f.ID = t.ID
drop table #tblAttachment
exec sp_xml_removedocument @iDoc
end
If I call prsXML passing it a simple xml document as shown below it updates
the row in the database.
prsXML '<?xml version="1.0" ?>
<NewDataSet>
<tblAttachment>
<ID>49AA5490-DD38-4A7D-87E7-0525E07930AF</ID>
<FileName>test.jpg</FileName>
<File>QUJDREVGRw==</File>
</tblAttachment>
</NewDataSet>'
However, when I then select this updated row using:
select [file] from tblAttachment where ID =
'49AA5490-DD38-4A7D-87E7-0525E07930AF'
for xml auto, elements, binary base64
it returns:
<tblAttachment>
<file>UQBVAEoARABSAEUAVgBHAFIAdwA9AD0A</file>
</tblAttachment>
The element <File>QUJDREVGRw==</File>
does not match the element: <file>UQBVAEoARABSAEUAVgBHAFIAdwA9AD0A</file>
returned by the last query.
It appears that encoding/decoding is messing up the image data. What is
going on and how can I fix it?
Thanks for your help.
Jay
The issue is that FOR XML binary base64 does encode the content of your
image column using base64 encoding, but the OpenXML with clause does not
perform a decoding (let's say it is a design quirk :-)).
So you have the following workarounds:
1. Extract the encoded image and run a base64 decoder on it before doing the
insertion. The algorithm is well-described in the literature and can be
implemented using T-SQL or an extended stored proc.
2. Wait for SQL Server 2005, where the nodes() method will provide
automatic decoding in the same scenario.
Sorry and best regards
Michael
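For readers on SQL Server 2005 or later, a sketch of what Michael's second option looks like in practice (an assumption-labelled example, not part of the original thread; it relies on the xml type's value() method decoding base64 element content when the target SQL type is binary):

```sql
-- SQL Server 2005+ only: value() base64-decodes element content
-- when the target type is varbinary.
DECLARE @x xml
SET @x = N'<File>QUJDREVGRw==</File>'

SELECT @x.value('(/File)[1]', 'varbinary(max)')
-- should yield 0x41424344454647, i.e. the bytes of 'ABCDEFG'
```

This sidesteps the OPENXML quirk entirely, since no string-to-image assignment of the raw base64 text ever happens.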
OPENXML won't return rows
I am trying to insert a row from an XML document into a table. Here is
the table structure:
CREATE TABLE dbo.Customer(
CustomerGUID uniqueidentifier NOT NULL,
CustomerName varchar(25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
Version int NULL,
ActivatedDate smalldatetime NULL,
ActivatedByWS uniqueidentifier NULL,
DeactivatedDate smalldatetime NULL,
DeactivatedByWS varchar(10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
UpdatedByWS uniqueidentifier NULL,
CustomerType char(2) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
SalesOfficeName varchar(25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT
NULL,
PRIMARY KEY CLUSTERED
(
CustomerGUID ASC
)WITH (IGNORE_DUP_KEY = OFF) ON PRIMARY,
UNIQUE NONCLUSTERED
(
CustomerName ASC
)WITH (IGNORE_DUP_KEY = OFF) ON PRIMARY
) ON PRIMARY
GO
Here is the code to get the data from the document and insert it:
DECLARE @docHandle int
declare @xmlDocument xml
set @xmlDocument = N'<ROOT>
<Customer>
<CustomerGUID>05062E6D-453B-4CAE-9EA4-BAF7D00E4235</CustomerGUID>
<CustomerType>CU</CustomerType>
<SalesOfficeName>Roadway Main</SalesOfficeName>
<UpdatedByWS>D7DA4883-656D-4923-9C5C-FD83EBE6F1BE</UpdatedByWS>
<ActivatedDate>2005-11-20T00:00:00</ActivatedDate>
</Customer>
</ROOT>'
EXEC sp_xml_preparedocument @docHandle OUTPUT, @xmlDocument
-- comment out the actual INSERT
--INSERT INTO Customer (CustomerGUID, CustomerType, SalesOfficeName,
--UpdatedByWS, ActivatedDate)
SELECT CustomerGUID, CustomerType, SalesOfficeName, UpdatedByWS,
ActivatedDate
FROM OPENXML(@docHandle, N'/ROOT/Customer')
WITH Customer
EXEC sp_xml_removedocument @docHandle
GO
I get this result:
CustomerGUID CustomerType SalesOfficeName UpdatedByWS ActivatedDate
-- -- -- -- --
NULL NULL NULL NULL NULL
I think that I am not describing the XML correctly to the XML variable,
but I don't know what to do next. Any ideas? Thanks.
You need to specify the flags parameter to tell OPENXML to look for elements
instead of attributes. The default Flags value is 1, which is attributes -
use 2 for elements, or 3 for both (or better yet use colpatterns in a table
def in the WITH clause).
DECLARE @docHandle int
declare @xmlDocument xml
set @xmlDocument = N'<ROOT>
<Customer>
<CustomerGUID>05062E6D-453B-4CAE-9EA4-BAF7D00E4235</CustomerGUID>
<CustomerType>CU</CustomerType>
<SalesOfficeName>Roadway Main</SalesOfficeName>
<UpdatedByWS>D7DA4883-656D-4923-9C5C-FD83EBE6F1BE</UpdatedByWS>
<ActivatedDate>2005-11-20T00:00:00</ActivatedDate>
</Customer>
</ROOT>'
EXEC sp_xml_preparedocument @docHandle OUTPUT, @xmlDocument
-- comment out the actual INSERT
--INSERT INTO Customer (CustomerGUID, CustomerType, SalesOfficeName,
--UpdatedByWS, ActivatedDate)
SELECT CustomerGUID, CustomerType, SalesOfficeName,
UpdatedByWS, ActivatedDate
FROM OPENXML(@docHandle, N'/ROOT/Customer', 2)
WITH Customer
Cheers,
Graeme
_____________________
Graeme Malcolm
Principal Technologist
Content Master
- a member of CM Group
www.contentmaster.com
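Graeme's "better yet" option, an explicit table definition with column patterns in the WITH clause, would look something like this (an untested sketch; the column types are taken from the CREATE TABLE statement in the question, and with explicit XPath patterns no flags value is needed):

```sql
SELECT CustomerGUID, CustomerType, SalesOfficeName, UpdatedByWS, ActivatedDate
FROM OPENXML(@docHandle, N'/ROOT/Customer')
WITH (CustomerGUID    uniqueidentifier 'CustomerGUID',
      CustomerType    char(2)          'CustomerType',
      SalesOfficeName varchar(25)      'SalesOfficeName',
      UpdatedByWS     uniqueidentifier 'UpdatedByWS',
      ActivatedDate   smalldatetime    'ActivatedDate')
```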
That was it! Thanks very much, Graeme.
Monday, February 20, 2012
OPENXML performance
Hi,
we use OPENXML heavily to insert data into our SQL Server 2000 SP3 tables.
We have some performance issues that seem to be caused by our use of OPENXML.
Doing a trace and wait analysis (1) we found that much of the time is taken
by OLE DB operations.
Looking at the trace file we saw a remote scan happening any time we do an
XPath query over OPENXML.
We managed to do all the OPENXML work at the start of the stored procedure,
putting all the data in table variables (in memory), freeing the XML doc and
then using those tables to do the work.
Looking at the docs we didn't find much information about some aspects of
OPENXML. OPENXML uses an OLE DB rowset to do the work, so it seems that it
loads the XML into an MSXML DOM and then populates the rowset.
Looking at the SQL Server process it seems that it uses MSXML2; looking at
the version of it we found 8.2.7919.0.
This seems to be a 2.6 SP2 version.
Is it possible/recommended to upgrade the version to 2.5 sp3?
Is it possible/recommended to upgrade the version to MSXML3 or MSXML4?
It also seems that there's a buffer where those DOM documents are stored.
Is it possible to trace how many documents/how much space is kept by MSSQL?
Are there any optimizations to set up with OPENXML?
To pass the XML fragment to MSSQL we use the MSSQL OLE DB provider; would we
gain some advantage by using SQLXMLOLEDB?
Carlo Folini
(1) http://support.microsoft.com/default...B;EN-US;271509
See below.
Best regards
Michael
"Carlo Folini" <folini@community.nospam> wrote in message
news:A01A902E-799B-44E5-8D9B-54813547B769@microsoft.com...
> Looking at the SQL Server process it seems that it uses MSXML2; looking at
> the version of it we found 8.2.7919.0.
> This seems to be a 2.6 SP2 version.
Correct.
> Is it possible/recommended to upgrade the version to 2.5 sp3?
Do you mean 2.6 SP3? It is not supported to do that upgrade yourself. There
are some issues with running the MSI if SP2 is already installed.
> Is it possible/recommended to upgrade the version to MSXML3 or MSXML4?
No. Although the next SQL Server 2000 service pack is probably moving to use
MSXML 3.
> It seems also that there's a buffer where those dom documents are stored.
> Is it possible to trace how much documents/space are keeped by mssql?
Unfortunately not. The rule of thumb is 3 to 6 times as much as the original
textual size.
> There's some optimization to set up with OPENXML?
There are some best practices (some you are already following above). In
addition, you want to avoid parent axis (..) unless needed, and do not use
flag 3.
> To pass the xml fragment to mssql we use mssql oledb, we can have some
> advantage by using sqlxmloledb?
Not from the OpenXML point of view.
What are the symptoms that you see and what is the doc size and load
characteristics (feel free to contact me per email).
Best regards
Michael
> --
> Carlo Folini
> (1) http://support.microsoft.com/default...B;EN-US;271509
inline...
> Do you mean 2.6 SP3? It is not supported to do that upgrade yourself. There
> are some issues with running the MSI if SP2 is already installed.
Yes, I meant 2.6 sp3. The sp2 version is different from the 'official one'
published on the support site.
We have 8.2.7919.0 instead of 8.2.8307.0.
Do you think that we have to update our sp2 installation?
> There are some best practices (some you are already following above). In
> addition, you want to avoid parent axis (..) unless needed, and do not use
> flag 3.
What do you mean by "flag 3"?
Given the following XML structure/query, how can we avoid the parent axis?
SELECT Address
FROM OPENXML(@HDoc, 'MyData/Tab/Row', 1)
WITH (MyTableName VARCHAR(18) '../@TableName',
Address VARCHAR(120))
WHERE MyTableName = 'Pippo'
<MyData>
<Tab MyTableName="Pippo">
<Row address="sdfdsf fds"/>
<Row address="432 432243"/>
<Row address="bcv bvccb bvc"/>
</Tab>
</MyData>
> What are the symptoms that you see and what is the doc size and load
> characteristics (feel free to contact me per email).
>
We have some critical conditions that we are investigating; in those
situations the CPU usage grows to 100%.
With the CPU at 100% the time taken by each stored procedure grows, causing
locks to be held for a long time.
So we see in our log that we have a number of timeouts and also some
deadlock conditions.
The timeouts are thrown by the stored procedures that use OPENXML.
The XML stream (passed as a text parameter to the sp) ranges from 6K to 60K.
The data is essentially an XML representation of some table content (the
data is retrieved from the host via COMTI and serialized to XML by a VB6
component).
Thanks for your help
Carlo
See below.
Best regards
Michael
"Carlo Folini" <folini@.community.nospam> wrote in message
news:96468399-2219-4418-A80B-B9D30D52E2D2@.microsoft.com...
> inline...
> Yes, I meant 2.6 sp3. The sp2 version is different from the 'official one'
> published on the support site.
> We have 8.2.7919.0 instead of 8.2.8307.0.
> Do you think that we have to update our sp2 installation?
The support site is a bit newer, since it includes some additional bug
fixes. I don't think they help you with your perf issues.
> What do you mean for "flag 3"?
With the syntax OPENXML(handle, rowpattern, flags), flags is the third
argument. It should not be set to 3 for performance reasons with any version
of MSXML 2.6 (MSXML 3.0 should have fixed the problem, but 3 should still be
avoided if possible).
> Having the following xml structure/query, how can we avoid parent axis?
> SELECT Address
> FROM OPENXML(@HDoc, 'MyData/Tab/Row', 1)
> WITH (MyTableName VARCHAR(18) '../@TableName',
> Address VARCHAR(120))
> WHERE MyTableName = 'Pippo'
> <MyData>
> <Tab MyTableName="Pippo">
> <Row address="sdfdsf fds"/>
> <Row address="432 432243"/>
> <Row address="bcv bvccb bvc"/>
> </Tab>
> </MyData>
You cannot, unless you can copy the attribute to its children when the XML
is generated or along the way (ie, mid-tier). Note that SQL Server SP4
should not have this problem.
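For the specific query quoted above, where the parent attribute is only used as a filter, one possible way to sidestep the parent axis is to move the test into the rowpattern as an XPath predicate (an untested sketch; whether the XPath subset supported by OPENXML in this SQL 2000 build accepts this predicate form would need to be verified):

```sql
-- Filter on the parent attribute in the rowpattern instead of selecting
-- it as a column, so no parent axis ('..') is needed:
SELECT Address
FROM OPENXML(@HDoc, 'MyData/Tab[@MyTableName="Pippo"]/Row', 1)
WITH (Address VARCHAR(120) '@address')
```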
> We have some critical conditions that we are investigating, in those
> situation the CPU% grows to 100%.
> Having the cpu to 100% the time taken by each sp grows, causing the lock
> to
> be held for long time.
> So we saw in our log that we have a number of timeouts and also some
> deadlock conditions.
> The timeouts are throwed by the sp that uses openxml.
> The xml stream (passed as text parameter to the sp) ranges from 6k to 60k.
> The data are essentially an xml representation of some tables content (the
> data are retrieved from host via COMTI and serialized to xml by a vb6
> component).
I assume that the XPath execution will take utilization up to that level.
The document size does not look too bad though. How many concurrent
transactions are using OpenXML at the same time?
Also, deadlocks normally also indicate a problem with respect to your update
logic. Can you take the OpenXML part out of the transactions that deadlock
and see what happens then?
> Thanks for you help
> Carlo
Best regards
Michael
OPENXML INSERT/UPDATE and NTEXT
I'm having problems inserting/updating an NTEXT field using OPENXML.
The field always gets a blank value no matter what value it receives.
Here's the (simplified) query:
---
declare @doc int
-- Actually, it's a parameter
declare @xml nvarchar(4000);
set @xml = N'
<Article>
<Id>be60839f-cc33-4a9f-af91-e3bbcb7617ac</Id>
<Content>yada yada yada</Content>
</Article>'
EXEC sp_xml_preparedocument @doc OUTPUT, @xml
UPDATE Article
SET Content = new.Content
FROM OPENXML(@doc, 'Article', 3) WITH Article new
WHERE Article.Id = new.Id
IF @@rowcount = 0
INSERT INTO Article
SELECT *
FROM OPENXML(@doc, 'Article', 3) WITH Article
EXEC sp_xml_removedocument @doc
Oddly, "SELECT * FROM OPENXML(@doc, 'Article', 3) WITH Article" shows the
value.
Any ideas?
Diego

I think you need to revise your OPENXML syntax a bit. Try the following
instead of what you have:
UPDATE Article
SET Content = new.Content
FROM OPENXML(@doc, 'Article', 3)
WITH (
id uniqueidentifier 'Id',
content nvarchar(400) 'Content'
) new
WHERE Article.Id = new.Id
Adam Machanic
SQL Server MVP
http://www.sqljunkies.com/weblog/amachanic
--
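Since the target column is NTEXT, a further variation worth trying is mapping the Content column pattern to ntext explicitly so the length is not capped by an nvarchar declaration (an untested sketch; as the rest of the thread shows, OPENXML's handling of text/ntext values in SQL 2000 had quirks, so the temp-table workaround later in the thread may still be needed):

```sql
UPDATE Article
SET Content = new.Content
FROM OPENXML(@doc, 'Article', 3)
WITH (
    Id      uniqueidentifier 'Id',
    Content ntext            'Content'
) new
WHERE Article.Id = new.Id
```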
Adam,
Thanks for your answer, but that didn't fix it.
Besides, the beauty of the "WITH tablename" clause, combined with FOR XML
AUTO, is that I get "free" O/R mapping using XML serialization.
I'll share my solution. It still looks like a bug to me...
---
SELECT *
INTO ##tmp
FROM OPENXML(@doc, 'Article', 3) WITH Article
UPDATE Article
SET Content = new.Content
FROM ##tmp new
WHERE Article.Id = new.Id
IF @@rowcount = 0
INSERT INTO Article
SELECT *
FROM ##tmp
DROP TABLE ##tmp
---
As you can see, inserting into a temporary table does the trick.
Diego
"Diego Mijelshon" <no@thanks.com> wrote in message
news:um$u8aIGFHA.548@.TK2MSFTNGP14.phx.gbl...
> DROP TABLE ##tmp
> ---
> As you can see, inserting in a temporal table does the trick.
I'm glad you found something that works. FYI, you should probably use a
local temporary table (single #) instead of a global one (##) unless you
need access to this same temp table from other processes...
Adam Machanic
SQL Server MVP
http://www.sqljunkies.com/weblog/amachanic
--
Thanks for the correction, I had it backwards :-)
Diego
OpenXML Identity Insert
I am inserting data into a table using the OPENXML command (see query
below). The table that I am inserting into has an identity column named
SOFTWARE_UID. I need to return an XML doc with the identifier included for
each row inserted. Does anyone have a clue how I can do that efficiently? I
know I can do it with a cursor, but if that were the case, then why not just
do x inserts with an @@identity call? I am looking to do this in an
efficient way, preferably with an out-of-box function (such as @@identities).
Declare @XMLString varchar(max)
Set @XMLString = '<IT_SOFTWARE>
<SOFT name="Soft1" status="1"/>
<SOFT name="Soft2" status="1"/>
<SOFT name="Soft3" status="1"/>
</IT_SOFTWARE>'
declare @XmlHandle int
EXEC sp_xml_preparedocument @XmlHandle output, @XMLString
Insert Into INFORMATION_SYSTEMS.dbo.IT_SOFTWARE
SELECT SOFTWARE_NAME, SOFTWARE_STATUS_ID
FROM OPENXML (@XmlHandle, '/IT_SOFTWARE/SOFT', 1)
WITH ( SOFTWARE_NAME varchar(max) '@name',
SOFTWARE_STATUS_ID int '@status')
EXEC sp_xml_removedocument @XmlHandle
As you are using varchar(max) I'll assume you are
using SQL Server 2005.
This should give you your results from your
table along with the identities:
select SOFTWARE_NAME as "@name",
SOFTWARE_STATUS_ID as "@status",
SOFTWARE_UID as "@UID"
FROM INFORMATION_SYSTEMS.dbo.IT_SOFTWARE
FOR XML PATH('SOFT'), ROOT('IT_SOFTWARE'), TYPE
You can also change from using OPENXML and
sp_xml_preparedocument/sp_xml_removedocument to this:
declare @xml xml
set @xml = @XMLString
Insert Into INFORMATION_SYSTEMS.dbo.IT_SOFTWARE(SOFTWARE_NAME, SOFTWARE_STATUS_ID)
select R.n.value('@name', 'varchar(max)') as SOFTWARE_NAME,
R.n.value('@status', 'int') as SOFTWARE_STATUS_ID
from @xml.nodes('/IT_SOFTWARE/SOFT') as R(n)
If you are after only the newly inserted rows you
can use the OUTPUT...INTO clause in the insert statement:
DECLARE @IT_SOFTWARE table (
SOFTWARE_UID int,
SOFTWARE_NAME varchar(max),
SOFTWARE_STATUS_ID int);
Insert Into INFORMATION_SYSTEMS.dbo.IT_SOFTWARE(SOFTWARE_NAME, SOFTWARE_STATUS_ID)
output inserted.SOFTWARE_UID,
inserted.SOFTWARE_NAME,
inserted.SOFTWARE_STATUS_ID
into @IT_SOFTWARE(SOFTWARE_UID, SOFTWARE_NAME, SOFTWARE_STATUS_ID)
select R.n.value('@name', 'varchar(max)') as SOFTWARE_NAME,
R.n.value('@status', 'int') as SOFTWARE_STATUS_ID
from @xml.nodes('/IT_SOFTWARE/SOFT') as R(n)
select SOFTWARE_NAME as "@name",
SOFTWARE_STATUS_ID as "@status",
SOFTWARE_UID as "@UID"
FROM @IT_SOFTWARE
FOR XML PATH('SOFT'), ROOT('IT_SOFTWARE'), TYPE