row inserts. Specifically, single-row inserts through a stored
procedure versus bulk inserts.
One argument comes from people who say all inserts (and updates and
deletions, I guess) should go through stored procedures. The reasoning
is that the developers who code the client side have no reason to
understand HOW the data is stored, just that it is. Another problem is
an insert that deals with multiple tables. It would be very easy for
the developer to forget a step. That last point also applies to
business logic. In my case, adding a security to our SecurityMaster
can touch 1 to 4 tables depending on the type of security. Also,
certain fields are required while others are set to null, depending on
the type.
Because a stored procedure cannot be passed datasets but only scalar
values, when you need to deal with multiple (i.e. bulk) rows you are
stuck using cursors. This post is NOT about the pros and cons of
cursors. There are plenty of those on the boards (some of them
probably started by me, showing my understanding (or more correctly,
lack thereof) of the way to do things). Stored procedures also give
you the ability to abort and/or log inserts that cannot happen because
of constraints and/or business rule failures.
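To make that concrete, here is roughly the kind of single-row procedure
I am talking about. This is only a sketch; the proc name, the type
codes, and the OptionDetail table are made up for the example, not our
real SecurityMaster schema:

CREATE PROCEDURE dbo.AddSecurity
    @Symbol       varchar(20),
    @SecurityType char(1),            -- 'E' = equity, 'O' = option (made-up codes)
    @Underlying   varchar(20) = NULL, -- required for options, NULL for equities
    @Strike       money = NULL
AS
BEGIN
    -- Business rule: certain fields are required only for certain types.
    IF @SecurityType = 'O' AND (@Underlying IS NULL OR @Strike IS NULL)
    BEGIN
        RAISERROR('An option requires an underlying and a strike.', 16, 1)
        RETURN 1
    END

    BEGIN TRAN

    -- Every security gets a master row...
    INSERT INTO dbo.SecurityMaster (Symbol, SecurityType)
    VALUES (@Symbol, @SecurityType)

    -- ...but only some types touch the additional tables.
    IF @SecurityType = 'O'
        INSERT INTO dbo.OptionDetail (Symbol, Underlying, Strike)
        VALUES (@Symbol, @Underlying, @Strike)

    COMMIT TRAN
    RETURN 0
END

The client just calls one proc; whether that ends up being one insert
or four is the database's problem.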
Another approach is to write code (not accessible from outside the
database) that handles bulk inserts. You would need to write in rules
to "extract" or "exclude" rows that do not match constraints or
business rules; otherwise ALL the inserts would fail because of one
bad row. I guess you could put the "potential" rows into a temp table,
apply your rules to the temp table, and delete / move rows that would
fail. Any rows left can then be bulk inserted. (You could also use the
rows that were moved to another temp table for logging why they
failed.)
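Something along these lines is what I am picturing. The table and
column names are invented, and the duplicate-symbol check just stands
in for whatever the real rules would be:

-- Stage the candidate rows.
SELECT Symbol, SecurityType
INTO #Candidates
FROM dbo.IncomingSecurities

CREATE TABLE #Rejects (Symbol varchar(20), Reason varchar(100))

-- Move rows that would fail a rule into the reject table, keeping the reason.
INSERT INTO #Rejects (Symbol, Reason)
SELECT c.Symbol, 'Symbol already exists in SecurityMaster'
FROM #Candidates c
WHERE EXISTS (SELECT 1 FROM dbo.SecurityMaster m WHERE m.Symbol = c.Symbol)

DELETE FROM #Candidates
WHERE Symbol IN (SELECT Symbol FROM #Rejects)

-- Whatever is left can be inserted in one set-based statement,
-- and #Rejects can be copied to a log table for reporting.
INSERT INTO dbo.SecurityMaster (Symbol, SecurityType)
SELECT Symbol, SecurityType
FROM #Candidates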
So that leaves us with two possible ways to get data into the system:
a single-row approach for client apps and a bulk approach for internal
use. But that leaves us with another problem. You now have business
logic in TWO separate areas, and you have to remember to modify code
or fix bugs in multiple locations.
For those who are still reading my post, my question is...
How do you handle this? What is the approach you take?

|||Jay
What we have is a series of core stored procedures for single-row
inserts/updates/deletes. As you point out, these may do extra
validation/processing, etc. (examples include security checking,
auditing, optimistic locking, and business rule validation). These are
auto-generated.
In addition, we sometimes do bulk inserts from another table or data
source. To ensure we don't duplicate validation/processing AND to
ensure all data goes through the one filter before being committed, we
simply call the exact same stored procedure multiple times, typically
from within a cursor, though it can also be done from code (in our
case, C#).
Depending on the application, this may be done as a transaction block.
We typically let the stored proc do the validation of the data and log
any rows that failed to insert.
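In rough outline it looks like the sketch below. The table, proc, and
log names here are invented; the real procs are the auto-generated
ones mentioned above:

DECLARE @Symbol varchar(20), @SecurityType char(1), @rc int

DECLARE secs CURSOR LOCAL FAST_FORWARD FOR
    SELECT Symbol, SecurityType FROM dbo.StageSecurities

OPEN secs
FETCH NEXT FROM secs INTO @Symbol, @SecurityType
WHILE @@FETCH_STATUS = 0
BEGIN
    -- The exact same single-row proc the client apps use,
    -- so the validation lives in one place.
    EXEC @rc = dbo.AddSecurity @Symbol, @SecurityType
    IF @rc <> 0
        INSERT INTO dbo.LoadErrors (Symbol, ErrorCode, LoggedAt)
        VALUES (@Symbol, @rc, GETDATE())

    FETCH NEXT FROM secs INTO @Symbol, @SecurityType
END
CLOSE secs
DEALLOCATE secs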
This works like a charm for us and I have used this approach in a
number of successful projects.
However, this bulk insert approach does not scale very well, and if
you do a lot of batch-style processing you really need to be looking
at customised scripts/stored procedures to get the best performance.
JayCallas@.hotmail.com (Jason) wrote in message news:<f01a7c89.0402042129.6f9090f6@.posting.google.com>...
> Looking for some insight from the professionals about how they handle
> row inserts. Specifically single row inserts through a stored
> procedure versus bulk inserts.

|||Jason (JayCallas@.hotmail.com) writes:
> Because a stored procedure cannot be passed datasets but only scalar
> values,
Actually, this is not completely true anymore. With SQL 2000 we got
the rowset provider OPENXML. So you can send in a complex dataset with
data for umpteen tables in one single procedure call.
It is probably still not as efficient as bulk insert, but bulk insert has
its limitations in validation etc. You can use a staging table, but
that comes with a cost of course.
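A rough sketch of what that can look like (the proc, table, and
element names are only for illustration):

CREATE PROCEDURE dbo.AddSecurities
    @SecurityXml ntext  -- e.g. '<Securities><Security Symbol="MSFT" SecurityType="E"/></Securities>'
AS
BEGIN
    DECLARE @hdoc int

    -- Parse the document once, then treat it as a rowset.
    EXEC sp_xml_preparedocument @hdoc OUTPUT, @SecurityXml

    INSERT INTO dbo.SecurityMaster (Symbol, SecurityType)
    SELECT Symbol, SecurityType
    FROM OPENXML(@hdoc, '/Securities/Security', 1)   -- 1 = attribute-centric mapping
         WITH (Symbol varchar(20), SecurityType char(1))

    EXEC sp_xml_removedocument @hdoc
END

The same document can carry data for several tables, so one procedure
call can feed all of them before the handle is released.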
--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se
Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

|||PromisedOyster@.hotmail.com (Mystery Man) wrote in message news:<87c81238.0402050421.69e0e0a@.posting.google.com>...
> Jay
> What we have are a series of core stored procedures for single row
> insert/update/deletes.
I use a combination, depending on what I want to achieve. If I am
doing a routine 'bulk' load, that runs split into adequate batches (to
minimise impact on transactions and logs); but if I have a luser
wanting to enter data, I like the stored proc approach for all sorts
of reasons.
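By "adequate batches" I mean something along these lines. The staging
and target table names and the 10,000-row batch size are just
examples:

DECLARE @rows int
SET @rows = 1

SET ROWCOUNT 10000                -- cap each insert at one batch worth of rows
WHILE @rows > 0
BEGIN
    BEGIN TRAN

    INSERT INTO dbo.SecurityMaster (Symbol, SecurityType)
    SELECT s.Symbol, s.SecurityType
    FROM dbo.StageSecurities s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.SecurityMaster m WHERE m.Symbol = s.Symbol)

    SET @rows = @@ROWCOUNT        -- nothing inserted means we are done

    COMMIT TRAN
END
SET ROWCOUNT 0                    -- switch the limit back off

Each batch is its own transaction, so the log never has to hold more
than one batch's worth of work at a time.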
|||My personal experience is that business logic gets more and more
complicated as a project develops. The users want to change this,
tweak that, etc., and stored procedures are the only sure-fire way I
can accomplish all of the users' requests. I do try to "modularize"
common code. For example, a user wants to view a dataset. The web
page calls the viewdataSP. Inside the viewdataSP, it calls a
userAccessSP. Erland has a pretty good article about how to call
stored procedures from a stored procedure.
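Roughly like this sketch; the parameters, the DatasetRows table, and
the return-code convention are just for illustration:

CREATE PROCEDURE dbo.viewdataSP
    @UserId    int,
    @DatasetId int
AS
BEGIN
    DECLARE @allowed int

    -- The common security check lives in one proc and is reused everywhere.
    EXEC @allowed = dbo.userAccessSP @UserId, @DatasetId
    IF @allowed <> 0
    BEGIN
        RAISERROR('User does not have access to this dataset.', 16, 1)
        RETURN 1
    END

    SELECT *
    FROM dbo.DatasetRows
    WHERE DatasetId = @DatasetId

    RETURN 0
END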