
Friday, March 30, 2012

Optimizing Stored Procedures and Avoiding Cursors

I'm trying to optimize an ETL batch procedure which takes an extremely long
time to run. I was using cursors for most of the batch script across
different stored procedures. Now I'm trying to remove the cursors and replace
them with alternative methods such as temp tables.
I'm really puzzled as to why the cursor procedure runs much faster than
the temp table version... can anyone point out what is wrong to me?
Below is a snippet of the 2 test SPs I've compared.
TestTable holds 200000 records with 3 columns.
1. Using Cursors (This procedure took 4 - 8 secs to run)

declare @test as numeric
declare @header as nvarchar(50)
declare @body as nvarchar(50)
declare @footer as nvarchar(50)

declare test_cursor cursor for
    select header, body, footer from testTable order by header
open test_cursor
fetch next from test_cursor into @header, @body, @footer
set @test = 0
while @@fetch_status = 0
begin
    set @test = @test + 1
    fetch next from test_cursor into @header, @body, @footer
end
select @test
close test_cursor
deallocate test_cursor
2. Using Temp Tables (This procedure took > 1 min)

declare @test as numeric
declare @header as nvarchar(50)
declare @body as nvarchar(50)
declare @footer as nvarchar(50)
declare @count int, @loopcount int

select identity(int,1,1) id, header, body, footer into #temp from testtable
select @count = @@rowcount
set @loopcount = 1
while @loopcount <= @count
begin
    select @header = header, @body = body, @footer = footer
    from #temp where id = @loopcount
    set @loopcount = @loopcount + 1
end
drop table #temp

Looking at the code, I think it may be the WHERE clause against the temp
table; remember that it has to search through all 200000 records, whereas
with the cursor you're going through them from start to finish, no searching.
Peter
"Happiness is nothing more than good health and a bad memory."
Albert Schweitzer
"Nestor" wrote:

> I'm trying to optimize an ETL batch procedure which takes an extremely lon
g
> time to run. I was using cursors for most part of the batch script on
> different Store Procedures. Now I'm trying to remove the cursors and repla
ce
> them with alternative methods such as using temp tables.
> I'm really puzzled as to why the cursor procedure runs much faster than
> using temp tables... can anyone point out what is wrong to me?
> Below is a snippet of 2 test SPs I've compared against.
> TestTable holds 200000 records with 3 columns
> 1. Using Cursors (This procedure took 4 - 8 secs to run)
> declare @.test as numeric
> declare @.header as nvarchar(50)
> declare @.body as nvarchar(50)
> declare @.footer as nvarchar(50)
> declare test_cursor cursor for select header, body, footer from testTable
> order by header
> open test_cursor fetch next from test_cursor into @.header, @.body, @.footer
> set @.test = 0
> while @.@.fetch_status = 0
> begin
> set @.test = @.test + 1
> fetch next from test_cursor into @.header, @.body, @.footer
> end
> select @.test
> close test_cursor
> deallocate test_cursor
>
> 2. Using Temp Tables (This procedure took > 1 min)
> declare @.test as numeric
> declare @.header as nvarchar(50)
> declare @.body as nvarchar(50)
> declare @.footer as nvarchar(50)
> declare @.count int, @.loopcount int
> select identity(int,1,1) id, header, body, footer into #temp from testtab
le
> select @.count = @.@.rowcount
> set @.loopcount = 1
> while @.loopcount <= @.count
> begin
> select @.header = header, @.body = body, @.footer = footer from
> #temp where id = @.loopcount
> set @.loopcount = @.loopcount + 1
> end
> drop table #temp
>
>
>|||Both 1 and 2 are cursors! Looping through a temp table is just a cursor
in disguise. Unfortunately your sample code doesn't do anything useful
so it's impossible to suggest an alternative set-based solution without
using cursors. Your goal should be to avoid processing one row at a
time by ANY method.
If you need more help please post DDL, sample data INSERTs and show
your required end result.
http://www.aspfaq.com/etiquette.asp?id=5006
David Portas
SQL Server MVP
--

Hello Nestor,
Going through your code, I don't see any reason why the #temp based approach
should be any better than the cursor-based approach (if it is supposed to be).
The #temp based approach loops anyway with a WHILE loop, which is similar
to the cursor-based method.
Normally, when we say we have to get rid of the cursor, we mean to replace
the cursor-based approach with a SET BASED APPROACH, and your #temp based
approach still works record by record. And as Peter says, for each iteration
a SELECT is executed against the table.
Hope this helps.
Gopi
"Nestor" <n3570r@.yahoo.com> wrote in message
news:ORU5F$IJFHA.1172@.TK2MSFTNGP12.phx.gbl...
> I'm trying to optimize an ETL batch procedure which takes an extremely
> long time to run. I was using cursors for most part of the batch script on
> different Store Procedures. Now I'm trying to remove the cursors and
> replace them with alternative methods such as using temp tables.
> I'm really puzzled as to why the cursor procedure runs much faster than
> using temp tables... can anyone point out what is wrong to me?
> Below is a snippet of 2 test SPs I've compared against.
> TestTable holds 200000 records with 3 columns
> 1. Using Cursors (This procedure took 4 - 8 secs to run)
> declare @.test as numeric
> declare @.header as nvarchar(50)
> declare @.body as nvarchar(50)
> declare @.footer as nvarchar(50)
> declare test_cursor cursor for select header, body, footer from testTable
> order by header
> open test_cursor fetch next from test_cursor into @.header, @.body, @.footer
> set @.test = 0
> while @.@.fetch_status = 0
> begin
> set @.test = @.test + 1
> fetch next from test_cursor into @.header, @.body, @.footer
> end
> select @.test
> close test_cursor
> deallocate test_cursor
>
> 2. Using Temp Tables (This procedure took > 1 min)
> declare @.test as numeric
> declare @.header as nvarchar(50)
> declare @.body as nvarchar(50)
> declare @.footer as nvarchar(50)
> declare @.count int, @.loopcount int
> select identity(int,1,1) id, header, body, footer into #temp from
> testtable
> select @.count = @.@.rowcount
> set @.loopcount = 1
> while @.loopcount <= @.count
> begin
> select @.header = header, @.body = body, @.footer = footer from
> #temp where id = @.loopcount
> set @.loopcount = @.loopcount + 1
> end
> drop table #temp
>
>|||Although I haven't tested it, I'd expect the pseudo-cursor would perform
much better if you add a unique clustered index on the #temp id column.
However, as David pointed out, changing a cursor to another iterative
technique isn't the ideal solution and might not be any faster than the
cursor.
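The effect of indexing the temp table's id column can be seen in miniature with any engine's plan output. Below is a sketch in Python using sqlite3 as an illustrative stand-in (the table and column names echo the thread's #temp example but are otherwise made up; SQL Server's unique clustered index corresponds here to a plain unique index):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A stand-in for the thread's #temp table: an id plus three payload columns.
cur.execute("CREATE TABLE temp_rows (id INTEGER, header TEXT, body TEXT, footer TEXT)")
cur.executemany(
    "INSERT INTO temp_rows VALUES (?, ?, ?, ?)",
    [(i, f"h{i}", f"b{i}", f"f{i}") for i in range(1, 1001)],
)

# Without an index, the per-iteration lookup scans the whole table.
plan_before = cur.execute(
    "EXPLAIN QUERY PLAN SELECT header FROM temp_rows WHERE id = 500"
).fetchall()
print(plan_before[0][3])  # a SCAN over temp_rows

# With a unique index on id, each lookup becomes a seek instead.
cur.execute("CREATE UNIQUE INDEX ix_temp_id ON temp_rows(id)")
plan_after = cur.execute(
    "EXPLAIN QUERY PLAN SELECT header FROM temp_rows WHERE id = 500"
).fetchall()
print(plan_after[0][3])  # a SEARCH using ix_temp_id
```

With 200000 iterations, turning each lookup from a full scan into a seek is the difference the thread observed between the two procedures.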
We can't make any specific recommendations because the example you posted
does no productive work. If your actual process does something like execute
another proc for each row returned, the set-based solution would be to use
in-line set-based processing instead. This technique is often orders of
magnitude faster than an iterative approach.
Hope this helps.
Dan Guzman
SQL Server MVP
"Nestor" <n3570r@.yahoo.com> wrote in message
news:ORU5F$IJFHA.1172@.TK2MSFTNGP12.phx.gbl...
> I'm trying to optimize an ETL batch procedure which takes an extremely
> long time to run. I was using cursors for most part of the batch script on
> different Store Procedures. Now I'm trying to remove the cursors and
> replace them with alternative methods such as using temp tables.
> I'm really puzzled as to why the cursor procedure runs much faster than
> using temp tables... can anyone point out what is wrong to me?
> Below is a snippet of 2 test SPs I've compared against.
> TestTable holds 200000 records with 3 columns
> 1. Using Cursors (This procedure took 4 - 8 secs to run)
> declare @.test as numeric
> declare @.header as nvarchar(50)
> declare @.body as nvarchar(50)
> declare @.footer as nvarchar(50)
> declare test_cursor cursor for select header, body, footer from testTable
> order by header
> open test_cursor fetch next from test_cursor into @.header, @.body, @.footer
> set @.test = 0
> while @.@.fetch_status = 0
> begin
> set @.test = @.test + 1
> fetch next from test_cursor into @.header, @.body, @.footer
> end
> select @.test
> close test_cursor
> deallocate test_cursor
>
> 2. Using Temp Tables (This procedure took > 1 min)
> declare @.test as numeric
> declare @.header as nvarchar(50)
> declare @.body as nvarchar(50)
> declare @.footer as nvarchar(50)
> declare @.count int, @.loopcount int
> select identity(int,1,1) id, header, body, footer into #temp from
> testtable
> select @.count = @.@.rowcount
> set @.loopcount = 1
> while @.loopcount <= @.count
> begin
> select @.header = header, @.body = body, @.footer = footer from
> #temp where id = @.loopcount
> set @.loopcount = @.loopcount + 1
> end
> drop table #temp
>
>|||Dan,
Could you please share with me what "in-line set-based" means?
I was looking at a stored proc which has nested cursors (one cursor inside
another cursor), and in the while loop of the second cursor an SP is called
with two parameters (each of them coming from one of the cursors).
Here is the pseudo-code:
Declare Cursor Cursor1 Select * from table1
Open Cursor1
Fetch into @var1
While @@FETCH_STATUS <> 1
    Declare Cursor Cursor2 Select * from table2
    Open Cursor2
    Fetch into @var2
    While @@FETCH_STATUS <> 1
        Exec Spname @var1, @var2
        ........
.....
DEALLOCATE Cursor1
DEALLOCATE Cursor2
I have noticed that the two cursors can be combined into one using
appropriate Joins.
However, I was under the impression I will still have to use a cursor for
executing the SP.
Could you please point me in the right direction if there is a way out.
Thanks,
Gopi
<< If your actual process does something like execute another proc for each
row returned, the set-based
solution would be to use in-line set-based processing instead>>
"Dan Guzman" <guzmanda@.nospam-online.sbcglobal.net> wrote in message
news:OXOup8KJFHA.4092@.tk2msftngp13.phx.gbl...
> Although I haven't tested it, I'd expect the pseudo-cursor would perform
> much better if you add a unique clustered index on the #temp id column.
> However, as David pointed out, changing a cursor to another iterative
> technique isn't the ideal solution and might not be any faster than the
> cursor.
> We can't make any specific recommendations because the example you posted
> does no productive work. If your actual process does something like
> execute another proc for each row returned, the set-based solution would
> be to use in-line set-based processing instead. This technique is often
> orders of magnitude faster than an iterative approach.
> --
> Hope this helps.
> Dan Guzman
> SQL Server MVP
> "Nestor" <n3570r@.yahoo.com> wrote in message
> news:ORU5F$IJFHA.1172@.TK2MSFTNGP12.phx.gbl...
>|||> However, I was under the impression I will still have to use a Cursor
for
> executing the SP
Yes, using an SP in this way forces you to use the iterative row-by-row
approach. The set-based solution is to perform whatever logic is
contained in SPname using a query against your tables so that the
entire operation is performed against the data set rather than one row
at a time. For data manipulation operations that usually has advantages
over the cursor in terms of performance, scalability and
maintainability.
David Portas
SQL Server MVP
--

> Here is the pseudo-code:
Your pseudo-code effectively calls the proc for each row returned by the
Cartesian product of table1 and table2. Assuming that the executed proc
simply inserts into a third table, a working code example:
CREATE TABLE table1
(
    Col1 int NOT NULL PRIMARY KEY
)
INSERT INTO table1 VALUES(1)
INSERT INTO table1 VALUES(2)
INSERT INTO table1 VALUES(3)

CREATE TABLE table2
(
    Col1 int NOT NULL PRIMARY KEY
)
INSERT INTO table2 VALUES(4)
INSERT INTO table2 VALUES(5)
INSERT INTO table2 VALUES(6)

CREATE TABLE table3
(
    Col1 int NOT NULL,
    Col2 int NOT NULL,
    PRIMARY KEY (Col1, Col2)
)
GO

CREATE PROC Spname
    @var1 int,
    @var2 int
AS
SET NOCOUNT ON
INSERT INTO Table3(Col1, Col2)
VALUES(@var1, @var2)
GO

DECLARE @var1 int,
    @var2 int
DECLARE Cursor1 CURSOR
LOCAL FAST_FORWARD READ_ONLY FOR
SELECT * FROM table1
OPEN Cursor1
FETCH NEXT FROM Cursor1 INTO @var1
WHILE @@FETCH_STATUS <> -1
BEGIN
    DECLARE Cursor2 CURSOR LOCAL FOR SELECT * FROM table2
    OPEN Cursor2
    FETCH NEXT FROM Cursor2 INTO @var2
    WHILE @@FETCH_STATUS <> -1
    BEGIN
        EXEC Spname @var1, @var2
        FETCH NEXT FROM Cursor2 INTO @var2
    END
    CLOSE Cursor2
    DEALLOCATE Cursor2
    FETCH NEXT FROM Cursor1 INTO @var1
END
CLOSE Cursor1
DEALLOCATE Cursor1
GO

> I have noticed that the two cursors can be combined into one using
> appropriate Joins.
Exactly. That's the first step.

> However, I was under the impression I will still have to use a Cursor for
> executing the SP
Yes you will. This is why I suggested including the equivalent proc code
inline *instead of* executing the proc for each row. The entire cursor
script and proc code in my functional example could be replaced with a
single insert statement:
INSERT INTO Table3(Col1, Col2)
SELECT t1.Col1, t2.Col1
FROM table1 t1
CROSS JOIN table2 t2
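The equivalence between the nested-cursor proc calls and the single INSERT...SELECT can be checked outside SQL Server too. Here is a small sketch in Python with sqlite3 (an illustrative stand-in; the T-SQL above is the real thing, and the tables mirror Dan's example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE table1 (Col1 INTEGER PRIMARY KEY);
INSERT INTO table1 VALUES (1), (2), (3);
CREATE TABLE table2 (Col1 INTEGER PRIMARY KEY);
INSERT INTO table2 VALUES (4), (5), (6);
CREATE TABLE t3_loop (Col1 INTEGER, Col2 INTEGER);
CREATE TABLE t3_set  (Col1 INTEGER, Col2 INTEGER);
""")

# Cursor-style: fetch each row of table1 and, for each, each row of table2,
# inserting one row at a time -- the shape of the nested-cursor proc calls.
for (v1,) in cur.execute("SELECT Col1 FROM table1").fetchall():
    for (v2,) in cur.execute("SELECT Col1 FROM table2").fetchall():
        cur.execute("INSERT INTO t3_loop VALUES (?, ?)", (v1, v2))

# Set-based: the whole Cartesian product in one statement.
cur.execute("""
INSERT INTO t3_set (Col1, Col2)
SELECT t1.Col1, t2.Col1 FROM table1 t1 CROSS JOIN table2 t2
""")

loop_rows = sorted(cur.execute("SELECT * FROM t3_loop").fetchall())
set_rows = sorted(cur.execute("SELECT * FROM t3_set").fetchall())
print(loop_rows == set_rows, len(set_rows))  # True 9
```

Both tables end up with the same nine rows; the set-based form just gets there in one statement instead of nine.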
Of course, a lot depends on exactly what the called proc does. Also, your
actual code probably isn't a CROSS JOIN. A simple INNER JOIN example:
INSERT INTO Table3(Col1, Col2)
SELECT t1.Col1, t2.Col1
FROM table1 t1
JOIN table2 t2 ON
t1.Col1 = t2.SomeColumn
If you need more help, you need to provide details on what the proc actually
does. Actual code will help.
Hope this helps.
Dan Guzman
SQL Server MVP
"rgn" <gopinathr@.healthasyst.com> wrote in message
news:%23Sx92CLJFHA.1948@.TK2MSFTNGP14.phx.gbl...
> Dan,
> Could you please share with me what "in-line set-based " means.
> I was looking at a Stored Proc which has a nested cursors (One cursor
> inside another cursor)
> and the While loop of the second a SP is called with the two parameters
> (each of them coming
> from each of the Cursor code.
> Here is the PSeudo-Code :
> Declare Cursor Cursor1 Select * from table1
> Open Cursor1
> Fetch into @.var1
> While @.@.FETCH_STATUS <> 1
>
> Declare Cursor Cursor2 Select * from table2
> Open Cursor2
> Fetch into @.var2
> While @.@.FETCH_STATUS <> 1
> Exec Spname @.var1, @.var2
> ........
> .....
> DEALLOCATE Cursor1
> DEALLOCATE Cursor2
> I have noticed that the two cursors can be combined into one using
> appropriate Joins.
> However, I was under the impression I will still have to use a Cursor for
> executing the SP
> Could you please point me in the right direction if there is a way out.
> Thanks,
> Gopi
> << If your actual process does something like execute another proc for
> each row returned, the set-based
> solution would be to use in-line set-based processing instead>>
>
> "Dan Guzman" <guzmanda@.nospam-online.sbcglobal.net> wrote in message
> news:OXOup8KJFHA.4092@.tk2msftngp13.phx.gbl...
>|||Thanks for pointing that out. I've got the code base from an article in
http://www.extremeexperts.com/SQL/A...TSQLResult.aspx
Am I doing it wrong with instructions from that article? In any case, what
exactly is the SET BASED APPROACH? I do realise and know that SQL Server is
meant to be relational and not sequential, but I need to understand more about
how to apply the alternative first before trying to overhaul the scripts
running on the production servers.
Can anyone give me an equivalent of the cursor procedure I posted using the
SET BASED APPROACH so that I can better relate?
Thanks.
Regards,
Nestor
"rgn" <gopinathr@.healthasyst.com> wrote in message
news:e$w%23rZJJFHA.2604@.TK2MSFTNGP15.phx.gbl...
> Hello Nestor,
> Going through your code, I dont see any reason why the #temp based
> approach should be any better
> than the Cursor based approach (if it is supposed to be).
> The #temp based approach, Loops anyway with a while Loop which is simillar
> to Cursor based method.
> Normally, when we say we have to get rid of the cursor, we mean to replace
> the Cursor based approach
> with a SET BASED APPROACH and your #temp based approach still does record
> by record. And as Peter says
> for each iteration a select is executed against the table.
> Hope this helps.
> Gopi
>
> "Nestor" <n3570r@.yahoo.com> wrote in message
> news:ORU5F$IJFHA.1172@.TK2MSFTNGP12.phx.gbl...
>|||The article demonstrates some techniques for iteration but doesn't give
any recommendations as to when to use them. 99.99% of the time
row-by-row processing isn't necessary so those techniques shouldn't be
required. The Set Based approach means that you use SQL DML statements
(basically SELECT, UPDATE, INSERT, DELETE) that operate on a SET of
rows at a time rather than ONE row at a time. That should always be the
standard, preferred way to write code.
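The contrast David describes can be sketched concretely. Below is an illustrative example in Python with sqlite3 (the `orders` table and its columns are made up for the demonstration; in T-SQL the same contrast is a WHILE loop of single-row UPDATEs versus one UPDATE with a WHERE clause):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, qty INTEGER, status TEXT)")
cur.executemany("INSERT INTO orders VALUES (?, ?, 'open')",
                [(i, i % 5) for i in range(1, 101)])

# Row-at-a-time: fetch the candidate rows, then issue one UPDATE per row.
ids = [r[0] for r in cur.execute("SELECT id FROM orders WHERE qty = 0")]
for oid in ids:
    cur.execute("UPDATE orders SET status = 'empty' WHERE id = ?", (oid,))
per_row = cur.execute("SELECT COUNT(*) FROM orders WHERE status = 'empty'").fetchone()[0]

# Reset, then do it set-based: one DML statement over the whole set.
cur.execute("UPDATE orders SET status = 'open'")
cur.execute("UPDATE orders SET status = 'empty' WHERE qty = 0")
set_based = cur.execute("SELECT COUNT(*) FROM orders WHERE status = 'empty'").fetchone()[0]
print(per_row, set_based)  # both 20
```

Both forms touch the same 20 rows; the set-based statement simply lets the engine plan the whole operation at once instead of being driven one row at a time.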

> Can anyone give me an equivalent of the cursor procedure I posted using SET
> BASE APPROACH so that I can better relate?
But your cursor didn't do anything except assign a few variables. How
can we show you how to replace it if we don't know what it's supposed
to do? What we need to know is what you are actually trying to achieve
then we can suggest the right solution. It may even be that there isn't
a set-based solution to your problem - that's rarely the case but we
can't tell you until we understand the problem.
The following article explains the best way to post your problem here:
http://www.aspfaq.com/etiquette.asp?id=5006
David Portas
SQL Server MVP
--

Optimizing SQL Query

Hello,
I have to run a script which will delete records from 10 tables
based on a certain condition. Instead of having to run the condition 10
times I thought running it once would have better performance and did
something like this
CREATE PROCEDURE testScript AS
set nocount on
declare @script_table table (row_id int identity(1,1), sid int)
declare @max int, @now int, @result varchar(100)
insert into @script_table
select sid from Alpha where lname like 'DOWN' and (fname like 'Tom' or
fname like 'Seinfeld')
select @max = max(row_id) from @script_table
select @now = 2, @result = convert(varchar(6), sid) from @script_table
where row_id = 1
while @max >= @now
Begin
    select @result = @result + ',' + convert(varchar(6), sid) from
    @script_table where row_id = @now
    set @now = @now + 1;
End
select * from Beta where convert(varchar(5), sid) in ( @result )
if @@error <> 0
BEGIN
    ROLLBACK Transaction
    RAISERROR ( 'E', 2, 127)
END
...
...
...
but when I run this I don't get any values. Instead, when I run this
query I get the output rows:
select * from Beta where convert(varchar(5), sid) in (select sid from
Alpha where lname like 'DOWN' and (fname like 'Tom' or fname like
'Seinfeld'))
Since @result has the output from Alpha with a comma delimiter, I was
under the impression that this should give me the result, but instead I
don't get any rows. Is this because @result is a varchar? Am I doing
something wrong here? Any help would be great.
Thanks
Khris

> Is this because @result is a varchar?

Yes. You can, however, parse a delimited string and have values returned in
a table, which you can use in your query.
My favourite solution is this one (by Dejan Sarka):
http://solidqualitylearning.com/blo.../10/22/200.aspx
ML
http://milambda.blogspot.com/

Thanks ML, I will try that out
Khris
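ML's point can be shown in miniature: the IN (@result) form compares each sid against the whole comma-delimited string as a single value, so nothing matches, while splitting the string into individual values (which is what Dejan Sarka's linked T-SQL function does) works. An illustrative sketch in Python with sqlite3 (the table name echoes the post, the data is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Beta (sid INTEGER, name TEXT)")
cur.executemany("INSERT INTO Beta VALUES (?, ?)",
                [(1, "a"), (2, "b"), (3, "c"), (4, "d")])

result = "2,4"  # the comma-delimited string the loop built

# The single-string comparison finds nothing: no sid equals the literal '2,4'.
wrong = cur.execute("SELECT * FROM Beta WHERE sid IN (?)", (result,)).fetchall()

# Parse the string into individual values and bind one placeholder per value.
values = [int(s) for s in result.split(",")]
placeholders = ",".join("?" for _ in values)
right = cur.execute(f"SELECT * FROM Beta WHERE sid IN ({placeholders})",
                    values).fetchall()
print(wrong, right)  # [] then the two matching rows
```

In T-SQL the equivalent is to split @result into a table (with a function like the one ML links) and write `... WHERE sid IN (SELECT value FROM ...)`, or to skip building the string entirely and use the subquery against Alpha directly, as in the query that did work.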

Wednesday, March 28, 2012

Optimizing a query

Hi
First of all, the first script throws a syntax error. Is that MySQL syntax?
SELECT Col1 FROM
(
    SELECT table1.Col1 FROM table1 JOIN table2 ON table1.Col1 = table2.Col1
    WHERE exp1 AND ....
) AS Der
--Or
SELECT * FROM
(
    SELECT table1.Col1 FROM table1 WHERE exp1
    UNION ALL
    SELECT table1.Col1 FROM table2 WHERE exp2
) AS Der
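The rewrite from two temp tables to a single statement with derived tables can be checked for equivalence with a small example. A sketch in Python with sqlite3 (illustrative only; `flag = 1` stands in for the post's exp1/exp2 predicates):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE table1 (Col1 INTEGER, flag INTEGER);
CREATE TABLE table2 (Col1 INTEGER, flag INTEGER);
INSERT INTO table1 VALUES (1,1),(2,0),(3,1),(4,1);
INSERT INTO table2 VALUES (2,1),(3,1),(4,0),(5,1);
""")

# Temp-table version: materialise each filtered select, then join.
cur.executescript("""
CREATE TEMP TABLE t1 AS SELECT Col1 FROM table1 WHERE flag = 1;
CREATE TEMP TABLE t2 AS SELECT Col1 FROM table2 WHERE flag = 1;
""")
via_temp = sorted(cur.execute(
    "SELECT t1.Col1 FROM t1 JOIN t2 ON t1.Col1 = t2.Col1").fetchall())

# Single-statement version: the same filters as derived tables in one query.
direct = sorted(cur.execute("""
SELECT d1.Col1 FROM
    (SELECT Col1 FROM table1 WHERE flag = 1) AS d1
    JOIN
    (SELECT Col1 FROM table2 WHERE flag = 1) AS d2
    ON d1.Col1 = d2.Col1
""").fetchall())
print(via_temp == direct, direct)
```

The single-statement form also lets the optimizer consider the filters and the join together, rather than being forced to materialise two intermediate tables first.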
"jeeji" <jihad_dk@.yahoo.com> wrote in message
news:1151500670.143984.181140@.x69g2000cwx.googlegroups.com...
> Hi
> I have a query which I would like to optimize and thought I had found a
> solution.
> The query is a join of several selects from several different tables.
> So we have in general the following:
> SELECT Col1 FROM
> (
> SELECT table1.Col1 FROM table1 WHERE exp1
> JOIN
> SELECT table1.Col1 FROM table2 WHERE exp2
> ON table1.Col1 = table2.Col1
> )
> or something like that. Of course my query was a lot larger than this,
> but it is just to illustrate the problem.
> The above was too slow, and I made an attempt to optimize it and
> thought it did work. The following is what I did:
> Instead of the SELECT's I could create temporary tables and select into
> those so I get the following:
> SELECT table1.Col1
> INTO #temp_table1
> FROM table1 WHERE exp1
> SELECT table1.Col1
> INTO #temp_table2
> FROM table2 WHERE exp2
> And then join those two tables:
> SELECT Col1 FROM
> (
> #temp_table1
> JOIN
> #temp_table2
> ON #temp_table1.Col1 = #temp_table2.Col1
> )
> and at the end drop the temporary tables.
> DROP Table #temp_table1
> DROP Table #temp_table2
> This gives the same result as the original statement.
> I profiled a bit and saw that if I created the temp tables, and ran the
> join statement several times after each other, the first time it is run
> it take a lot of time compared to the following. Can anyone explain to
> me why this is the case? I thought that the temporary tables were
> created and loaded into the cache, and therefore all calls to the big
> JOIN statement should take equally long time.
> The thing is that I cannot reuse the temporary tables, so I get no
> benefit from all the above.
> Thank you very much in advance
> Jeeji
>Uri,
I think that is just a small fragment of the query. (At least I hope it's
just a fragment...)
Arnie Rowland, YACE*
"To be successful, your heart must accompany your knowledge."
*Yet Another certification Exam
"jeeji" <jihad_dk@.yahoo.com> wrote in message
news:1151500670.143984.181140@.x69g2000cwx.googlegroups.com...
> Hi
> I have a query which I would like to optimize and thought I had found a
> solution.
> The query is a join of several selects from several different tables.
> So we have in general the following:
> SELECT Col1 FROM
> (
> SELECT table1.Col1 FROM table1 WHERE exp1
> JOIN
> SELECT table1.Col1 FROM table2 WHERE exp2
> ON table1.Col1 = table2.Col1
> )
> or something like that. Of course my query was a lot larger than this,
> but it is just to illustrate the problem.
> The above was too slow, and I made an attempt to optimize it and
> thought it did work. The following is what I did:
> Instead of the SELECT's I could create temporary tables and select into
> those so I get the following:
> SELECT table1.Col1
> INTO #temp_table1
> FROM table1 WHERE exp1
> SELECT table1.Col1
> INTO #temp_table2
> FROM table2 WHERE exp2
> And then join those two tables:
> SELECT Col1 FROM
> (
> #temp_table1
> JOIN
> #temp_table2
> ON #temp_table1.Col1 = #temp_table2.Col1
> )
> and at the end drop the temporary tables.
> DROP Table #temp_table1
> DROP Table #temp_table2
> This gives the same result as the original statement.
> I profiled a bit and saw that if I created the temp tables, and ran the
> join statement several times after each other, the first time it is run
> it take a lot of time compared to the following. Can anyone explain to
> me why this is the case? I thought that the temporary tables were
> created and loaded into the cache, and therefore all calls to the big
> JOIN statement should take equally long time.
> The thing is that I cannot reuse the temporary tables, so I get no
> benefit from all the above.
> Thank you very much in advance
> Jeeji
>|||Hi
I have a query which I would like to optimize and thought I had found a
solution.
The query is a join of several selects from several different tables.
So we have in general the following:
SELECT Col1 FROM
(
SELECT table1.Col1 FROM table1 WHERE exp1
JOIN
SELECT table1.Col1 FROM table2 WHERE exp2
ON table1.Col1 = table2.Col1
)
or something like that. Of course my query was a lot larger than this,
but it is just to illustrate the problem.
The above was too slow, and I made an attempt to optimize it and
thought it did work. The following is what I did:
Instead of the SELECT's I could create temporary tables and select into
those so I get the following:
SELECT table1.Col1
INTO #temp_table1
FROM table1 WHERE exp1
SELECT table1.Col1
INTO #temp_table2
FROM table2 WHERE exp2
And then join those two tables:
SELECT Col1 FROM
(
#temp_table1
JOIN
#temp_table2
ON #temp_table1.Col1 = #temp_table2.Col1
)
and at the end drop the temporary tables.
DROP Table #temp_table1
DROP Table #temp_table2
This gives the same result as the original statement.
I profiled a bit and saw that if I created the temp tables and ran the
join statement several times after each other, the first time it is run
it takes a lot of time compared to the following runs. Can anyone explain to
me why this is the case? I thought that the temporary tables were
created and loaded into the cache, and therefore all calls to the big
JOIN statement should take equally long.
The thing is that I cannot reuse the temporary tables, so I get no
benefit from all the above.
Thank you very much in advance
Jeeji

Hi.
I thought I made it clear that it was just a fragment of the code. But
now something new came to my attention.
I can see that if I clear the SQL server cache and call the first SQL
statement (the one without the temporary table), the SQL profiler shows
that this call takes longer (around twice) than it takes if I clear the
server cache and run the temporary table version of the call.
However if I do the same calls from a C# application, the opposite
occurs.
To clear the SQL server cache I use:
DBCC FreeSystemCache('All')
And for the C# application I am using .Net 2.0, using the
System.Data.SQLClient calls.
Any advice would just be great.
Arnie Rowland wrote:
> Uri,
> I think that is just a small fragment of the query. (At least I hope it's
> just a fragment...)

Sorry for the disturbance guys. I figured it out myself.
The problem is that I did not clear the cache from the C# application.
If I did, the same results show.
Jeeji
jeeji skrev:
[vbcol=seagreen]
> Hi.
> I thought I made it clear that it was just a fragment of the code. But
> now something new came to my attention.
> I can see that if I clear the SQL server cache and call the first SQL
> statement (the one without the temporary table), the SQL profiler shows
> that this call takes longer (around twice) than it takes if I clear the
> server cache and run the temporary table version of the call.
> However if I do the same calls from a C# application, the opposite
> occurs.
> To clear the SQL server cache I use:
> DBCC FreeSystemCache('All')
> And for the C# application I am using .Net 2.0, using the
> System.Data.SQLClient calls.
> Any advice would just be great.
> Arnie Rowland wrote:|||Hi.
I thought I made it clear that it was just a fragment of the code. But
now something new came to my attention.
I can see that if I clear the SQL server cache and call the first SQL
statement (the one without the temporary table), the SQL profiler shows
that this call takes longer (around twice) than it takes if I clear the
server cache and run the temporary table version of the call.
However if I do the same calls from a C# application, the opposite
occurs.
To clear the SQL server cache I use:
DBCC FreeSystemCache('All')
And for the C# application I am using .Net 2.0, using the
System.Data.SQLClient calls.
Any advice would just be great.
Arnie Rowland wrote:[vbcol=seagreen]
> Uri,
> I think that is just a small fragment of the query. (At least I hope it's
> just a fragment...)
> --
> Arnie Rowland, YACE*
> "To be successful, your heart must accompany your knowledge."
> *Yet Another certification Exam
>
> "jeeji" <jihad_dk@.yahoo.com> wrote in message
> news:1151500670.143984.181140@.x69g2000cwx.googlegroups.com...|||Sorry for the distrubance guys. I figured it out myself.
The problem is that I did not clear the cache from the C# application.
If I did, the same results show.
Jeeji
jeeji skrev:
[vbcol=seagreen]
> Hi.
> I thought I made it clear that it was just a fragment of the code. But
> now something new came to my attention.
> I can see that if I clear the SQL server cache and call the first SQL
> statement (the one without the temporary table), the SQL profiler shows
> that this call takes longer (around twice) than it takes if I clear the
> server cache and run the temporary table version of the call.
> However if I do the same calls from a C# application, the opposite
> occurs.
> To clear the SQL server cache I use:
> DBCC FreeSystemCache('All')
> And for the C# application I am using .Net 2.0, using the
> System.Data.SQLClient calls.
> Any advice would just be great.
> Arnie Rowland wrote:

Optimizing a large

Hi
I'm running a script that updates the same column on every row in a huge table. The table is being updated with data from another table in another database, but they are in the same SQL instance.
The log and data files for the database being read from are on an internal drive (D:\). The log and data files for the database being updated are on an external SAN (X:\).
Would any of the following speed the process up significantly?
1) Remove the indexes from the table being updated?
2) Set the recovery model to 'simple' on the database being updated?
3) Put the log and data files of the database being updated on different physical disks (not necessary, I believe, if I perform number 2).
thanks,
Kevin|||Kevin,
1. If the column that you are updating has an index on it you might
consider dropping it until after the update.
2. This won't affect in any way the number of items logged or the amount of
data logged during an Update. If you do the update in one transaction it
won't matter either way.
3. It's always best to place the Log and Data files on separate drive
arrays. It makes no difference where the log for the one being read from
is, since the log is not used for reads.
I would recommend you attempt the Updates in smaller batches. This will
keep the log in check and usually results in a faster overall operation.
You can usually achieve this with a loop of some sort.
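A batched loop along those lines might look like the following sketch. This is not Kevin's actual script; the table, column, and batch size are hypothetical, and on SQL Server 2000 the batch size is capped with SET ROWCOUNT:

```sql
-- Sketch: update a huge table in batches so the log stays in check.
-- BigTable, TargetCol, SourceDb, and the batch size are hypothetical.
DECLARE @rows int
SET @rows = 1
SET ROWCOUNT 10000          -- cap each UPDATE at 10,000 rows (SQL 2000 style)
WHILE @rows > 0
BEGIN
    UPDATE t
    SET TargetCol = s.SourceCol
    FROM BigTable AS t
    JOIN SourceDb.dbo.SourceTable AS s ON s.ID = t.ID
    WHERE t.TargetCol <> s.SourceCol       -- skip rows already updated
       OR (t.TargetCol IS NULL AND s.SourceCol IS NOT NULL)
    SET @rows = @@ROWCOUNT
END
SET ROWCOUNT 0              -- always reset ROWCOUNT when done
```

Each batch commits as its own implicit transaction, so the log can be truncated or backed up between batches instead of one giant open transaction holding it all.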
--
Andrew J. Kelly
SQL Server MVP
"Kevin" <anonymous@.discussions.microsoft.com> wrote in message
news:715A5D36-5F4F-43D5-8B5B-854B711F3A9A@.microsoft.com...
> Hi,
> I'm running a script that updates the same column on every row in a huge
table. The table is being updated with data from another table in another
database, but they are in the same SQL instance.
> The log and data files for the database being read from are on an internal
drive (D:\). The log and datafiles for the database being updated are on an
external SAN (X:\)
> Would any of the following speed the process up significantly?
> 1)Remove the indexes from the table being updated?
> 2)Set the recovery model to 'simple' on the database being updated?
> 3)Put the log and data files on the database being updated on different
physical disks (not necessary i believe if I perform number 2).
> thanks,
> Kevin|||Thanks for your response.
Does the frequency of log checkpoints significantly slow down a long-running update statement? Should I create a large transaction log that will not need to grow dynamically, and set the recovery interval to a high value so that the frequency of checkpoints is reduced?
thanks,
Kevin|||Kevin,
First off you should always have the log file larger than it needs to be for
any given operation. Anytime it has to grow it will impact performance to
some degree. As for checkpoints, that depends. Checkpoints certainly can
add overhead to the system, especially disk IO and CPU. Whether they have a
large negative effect on your situation is hard to say from here. If you
make a large recovery interval it will most definitely adversely affect the
other users when it does happen as there will be a lot more to do at one
time. The key in your situation is to do the updates in smaller batches.
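For instance, growing the log once up front, before the big update, avoids repeated autogrow pauses. The database and logical file names below are hypothetical:

```sql
-- Sketch: pre-size the transaction log before a large update.
-- MyDatabase and MyDatabase_Log are made-up names; check
-- sp_helpfile for the real logical name of the log file.
ALTER DATABASE MyDatabase
MODIFY FILE (NAME = MyDatabase_Log, SIZE = 4000MB)
```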
Andrew J. Kelly
SQL Server MVP
"Kevin" <anonymous@.discussions.microsoft.com> wrote in message
news:4A65CF81-2DD6-4F53-AA7A-E7B00FF50CE2@.microsoft.com...
> Thanks for your response.
> Does the frequency of log checkpoints significantly slow down a long
running update statement? Should I create a large transaction log that will
not need to dynamically grow and set the recovery interval to a high value
so that the frequency of checkpoints is reduced?
> thanks,
> Kevin


Tuesday, March 20, 2012

Optimization

Hi All,
I am writing a script that runs the optimization process. I am not sure how to test it. Any help is greatly appreciated.|||The optimization process?|||Sorry, a script that uses the DBCC DBREINDEX command. I don't want to use the optimization that can be set up through a maintenance plan.|||Perhaps if you created a table with an index on a GUID column, then inserted 10,000 rows into it. The GUID index should be considerably fragmented.|||Are you trying to verify that DBCC is working?

If your script just runs DBCC commands, then pipe the output of the job to a file (a good idea anyway). If there are any errors or messages from DBCC you will find them there.|||Unless this is a serious application that takes in thousands of records a day, you are not going to have to do this too often.

here is some light reading on the subject...

http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ss2kidbp.mspx|||blindman,

do you have sample code showing how to send a job's output to a file?|||In the stored procedure sp_add_jobstep, there is a parameter @output_file_name. This is the output file you would configure on the Advanced tab of a job step in EM. Here is part of a much larger script that I have:

exec sp_add_jobstep
@job_id = @jobid,
@step_id = 1, --@step_id
@step_name = N'DBCC Checks', --@step_name
@subsystem = 'TSQL', --@subsystem
@command = @cmd, --@command
@cmdexec_success_code = 0, --@cmdexec_success_code
@on_success_action = 3, --@on_success_action
@on_success_step_id = 0, --@on_success_step_id
@on_fail_action = 2, --@on_fail_action
@on_fail_step_id = 0, --@on_fail_step_id
@database_name = master, --@database_name
@database_user_name = dbo, --@database_user_name
@retry_attempts = 0, --@retry_attempts
@retry_interval = 0, --@retry_interval
@output_file_name = @LogName, --@output_file_name
@flags = 4 --@flags (for output file)

Yeah, I know the comments aren't too helpful. I think I meant to go back and make them prettier at some point. BOL will have the full list of parameters, and their definitions.|||In the Edit Job Step dialog box of Enterprise Manager, go to the Advanced tab and you can set the location of your output file.|||Thank you very much for the help.
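As a footnote to the thread above, the suggestion of deliberately fragmenting a GUID-keyed table to exercise a DBCC DBREINDEX script can be sketched like this. The table name, filler column, and fill factor are made up for illustration:

```sql
-- Sketch: build a fragmented test table, then confirm DBREINDEX fixes it.
CREATE TABLE FragTest (
    id uniqueidentifier NOT NULL DEFAULT NEWID() PRIMARY KEY CLUSTERED,
    filler char(200) NOT NULL DEFAULT 'x'
)

DECLARE @i int
SET @i = 0
WHILE @i < 10000
BEGIN
    INSERT INTO FragTest DEFAULT VALUES   -- random GUIDs cause page splits
    SET @i = @i + 1
END

DBCC SHOWCONTIG ('FragTest')          -- note the logical scan fragmentation
DBCC DBREINDEX ('FragTest', '', 90)   -- what the maintenance script should run
DBCC SHOWCONTIG ('FragTest')          -- fragmentation should now be much lower
```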

Wednesday, March 7, 2012

operation not allowed when object is closed...

I have this stored procedure on SQL 2005:

USE [Eventlog]
GO
/****** Object: StoredProcedure [dbo].[SelectCustomerSoftwareLicenses] Script Date: 08/07/2007 16:56:32 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[SelectCustomerSoftwareLicenses]
(
    @CustomerID char(8)
)
AS
BEGIN
    DECLARE @Temp TABLE (SoftwareID int)

    INSERT INTO @Temp
    SELECT SoftwareID FROM Workstations
    JOIN WorkstationSoftware ON Workstations.WorkstationID = WorkstationSoftware.WorkstationID
    WHERE Workstations.CustomerID = @CustomerID
    UNION ALL
    SELECT SoftwareID FROM Notebooks
    JOIN NotebookSoftware ON Notebooks.NotebookID = NotebookSoftware.NotebookID
    WHERE Notebooks.CustomerID = @CustomerID
    UNION ALL
    SELECT SoftwareID FROM Machines
    JOIN MachinesSoftware ON Machines.MachineID = MachinesSoftware.MachineID
    WHERE Machines.CustomerID = @CustomerID

    DECLARE @SoftwareInstalls TABLE (rowid int identity(1,1), SoftwareID int, Installs int)

    INSERT INTO @SoftwareInstalls
    SELECT SoftwareID, COUNT(*) AS Installs FROM @Temp
    GROUP BY SoftwareID

    DECLARE @rowid int
    SET @rowid = (SELECT COUNT(*) FROM @SoftwareInstalls)

    WHILE @rowid > 0 BEGIN
        UPDATE SoftwareLicenses
        SET Installs = (SELECT Installs FROM @SoftwareInstalls WHERE rowid = @rowid)
        WHERE SoftwareID = (SELECT SoftwareID FROM @SoftwareInstalls WHERE rowid = @rowid)

        DELETE FROM @SoftwareInstalls
        WHERE rowid = @rowid

        SET @rowid = (SELECT COUNT(*) FROM @SoftwareInstalls)
    END

    SELECT SoftwareLicenses.SoftwareID, Software.Software, SoftwareLicenses.Licenses, SoftwareLicenses.Installs
    FROM SoftwareLicenses
    JOIN Software ON SoftwareLicenses.SoftwareID = Software.SoftwareID
    WHERE SoftwareLicenses.CustomerID = @CustomerID
    ORDER BY Software.Software
END

When I execute it in a query window in SQL Server Management Studio it works fine, but when I execute it from an ASP page, I get the following error:

ADODB.Recordset error '800a0e78'

Operation is not allowed when the object is closed.

/administration/licenses_edit.asp, line 56

Here is the connection code:

Set OBJdbConnection = Server.CreateObject("ADODB.Connection")
OBJdbConnection.ConnectionTimeout = Session("ConnectionTimeout")
OBJdbConnection.CommandTimeout = Session("CommandTimeout")
OBJdbConnection.Open Session("ConnectionString")
Set SQLStmt = Server.CreateObject("ADODB.Command")
Set RS = Server.CreateObject("ADODB.Recordset")

SQLStmt.CommandText = "EXECUTE SelectCustomerSoftwareLicenses '" & Request("CustomerID") & "'"
SQLStmt.CommandType = 1
Set SQLStmt.ActiveConnection = OBJdbConnection
RS.Open SQLStmt
RS.Close

Can anyone help please?

Is this because of the table variables?

If I recall correctly, an ADODB recordset is not a disconnected object.

You must do your actions between the OPEN and CLOSE.

Are you using VB v6, or Access?

(.NET allows the use of disconnected data using a dataset -NOT a recordset.)

|||

I'm using VB v6 and SQL Server 2005

I am going through my recordset between the open and close.

I think the problem lies in the scope of the table variables in the stored procedure, because if I remove that whole chunk with the table variables, there are no problems.

I have made the script work in a totally different way, so I haven't solved the problem, just worked around it. :)

But it would still be nice to know whether it is the scope of the table variables that is being exceeded, and how, if possible, to avoid this...?

|||

Just add "set nocount on" as the first statement in your sproc and your problem should go away.

Code Snippet

ALTER PROCEDURE [dbo].[SelectCustomerSoftwareLicenses]
(
    @CustomerID char(8)
)
AS
set nocount on
BEGIN
    DECLARE @Temp TABLE (SoftwareID int)

    INSERT INTO @Temp
    SELECT SoftwareID FROM Workstations
    JOIN WorkstationSoftware ON Workstations.WorkstationID = WorkstationSoftware.WorkstationID
    WHERE Workstations.CustomerID = @CustomerID
    UNION ALL
    SELECT SoftwareID FROM Notebooks
    JOIN NotebookSoftware ON Notebooks.NotebookID = NotebookSoftware.NotebookID
    WHERE Notebooks.CustomerID = @CustomerID
    UNION ALL
    SELECT SoftwareID FROM Machines
    JOIN MachinesSoftware ON Machines.MachineID = MachinesSoftware.MachineID
    WHERE Machines.CustomerID = @CustomerID

    DECLARE @SoftwareInstalls TABLE (rowid int identity(1,1), SoftwareID int, Installs int)

    INSERT INTO @SoftwareInstalls
    SELECT SoftwareID, COUNT(*) AS Installs FROM @Temp
    GROUP BY SoftwareID

    DECLARE @rowid int
    SET @rowid = (SELECT COUNT(*) FROM @SoftwareInstalls)

    WHILE @rowid > 0 BEGIN
        UPDATE SoftwareLicenses
        SET Installs = (SELECT Installs FROM @SoftwareInstalls WHERE rowid = @rowid)
        WHERE SoftwareID = (SELECT SoftwareID FROM @SoftwareInstalls WHERE rowid = @rowid)

        DELETE FROM @SoftwareInstalls
        WHERE rowid = @rowid

        SET @rowid = (SELECT COUNT(*) FROM @SoftwareInstalls)
    END

    SELECT SoftwareLicenses.SoftwareID, Software.Software, SoftwareLicenses.Licenses, SoftwareLicenses.Installs
    FROM SoftwareLicenses
    JOIN Software ON SoftwareLicenses.SoftwareID = Software.SoftwareID
    WHERE SoftwareLicenses.CustomerID = @CustomerID
    ORDER BY Software.Software
END

|||Add the following code before "RS.Open SQLStmt":

"Set RS.ActiveConnection = OBJdbConnection"|||

Yes. "SET NOCOUNT ON" will fix your issue. Without it, the recordset does not receive the result set from the procedure; instead, it receives the rows-affected messages produced by the INSERT statements first.

You can add "SET NOCOUNT ON" as the first line of your stored procedure, or you can use the command text below:

SQLStmt.CommandText = "SET NOCOUNT ON;EXECUTE SelectCustomerSoftwareLicenses '" & Request("CustomerID") & "'"
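As a side note on the procedure itself: once NOCOUNT is sorted out, the row-by-row WHILE loop over the second table variable could be replaced with one set-based UPDATE, which is usually faster and removes that table variable entirely. A sketch against the same schema, untested against the original database:

```sql
-- Sketch: set-based replacement for the WHILE loop in the procedure.
-- Reuses the @Temp table variable the procedure already fills.
UPDATE sl
SET Installs = si.Installs
FROM SoftwareLicenses AS sl
JOIN (
    SELECT SoftwareID, COUNT(*) AS Installs
    FROM @Temp
    GROUP BY SoftwareID
) AS si ON si.SoftwareID = sl.SoftwareID
WHERE sl.CustomerID = @CustomerID
```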