Friday, March 30, 2012
optimizing stored procedures
i'm working in a database, i'm not the creater of it, and was assigned to optimize the stored procedures running against this database.
can someone tell me what steps should i follow to check these SP's and to what should i look at and what can be done to optimise them.
thanks expertsIt might be a two line question but it is a massive subject. People make their careers out of this.
read stuf here:
http://www.sql-server-performance.com/
Learn to read execution plans.
Learn about SET STATISTICS IO
Learn to use profiler.
Read BoL (section on Optimising Database Performance).
Learn your database.
Learn about indexing strategies.|||First make sure all the critical joins are indexed. Then eliminate any cursors or loops.|||I usually I just ask the QA people what parts of the app sucks the will to live out of them and then I trace those screens to identify the really big pieces of suck ass code and I look at the execution plans of those queries. I like being reactive.|||exactly. the ONLY sure way to find where the perf problems are is to measure.
ask any dev where the most time is spent in their code, and 90% will give you the wrong answer. the other 10% are right only because they have profiled it. :)|||I would trac it all to a table, th query the damn thing for the longest running pieces of code...
I guess we could make you look like a hero, sinve it seems no one else knows what's going on.
How big is this company?
And how what's your experience level
And do you have SQL Server client tools installed
Do you know what books online is?
Do you know what Google is?|||I guess we could make you look like a hero
...
Do you know what Google is?Downgrading as you type.|||There are some good articles about perf tuning here:
http://www.sql-server-performance.com/articles_performance.asp
http://www.sql-server-performance.com/articles_audit.asp
in particular this one will show you how to use profiler for catching the biggest pigs in your database:
http://www.sql-server-performance.com/sql_server_performance_audit10.asp
and elsewhere on that site.
Be aware, however, sql is not the only process that can kill your perf. compiled code can be poorly written too. for compiled code you need to use a different profiling tool (I usually use perf studio in VS).|||I trace all the queries to a table, as Brett mentioned. Then query it for the total number of reads grouped by say the first 50 or 100 characters of the textdata field. I find it is usually better to tune the average running query that runs 1,000's of times per day than the one pig that runs once at midnight. With all the variations that the textdata field can have, getting that aggregate right is a challenge, though.|||...
Downgrading as you type.
Are you saying that you should assume something about someone's experience?|||Nope - but if you take a poster on a forum from not knowing what google is to being a hero dba then I will eat all my hats.
Anyway yes - the OP needs to identify the sprocs to tune and then needs to learn the skills to tune them. Or post them here of course.
optimizing resource allocation
Some of the queries we are running are resource intensive which is impacting
other users running other processes on the same SQL server. Is there a way
to limit the time allocated to a query so it pauses after x seconds in order
to allow other processes to run? Please note that the query should not be
permanently interrupted but temporarily halted.
I have checked the "query governor" option but it seems that this parameter
would disallow running the query if it exceeds the cost limit.
Regards
Hi
Perhaps it's time to start to optimize your queries.Compare an execution
plan , make sure that the optimizer is available to use indexes defined on
the tables.
"imad koussa" <imad.koussa@.aims-co.com> wrote in message
news:uu6kex0FGHA.2444@.TK2MSFTNGP11.phx.gbl...
> Hi All,
> Some of the queries we are running are resource intensive which is
> impacting other users running other processes on the same SQL server. Is
> there a way to limit the time allocated to a query so it pauses after x
> seconds in order to allow other processes to run? Please note that the
> query should not be permanently interrupted but temporarily halted.
> I have checked the "query governor" option but it seems that this
> parameter would disallow running the query if it exceeds the cost limit.
> Regards
>
|||Basically no there is no easy or practical way to do that. But if the
resources are CPU you should consider using a MAXDOP hint in those queries
to limit the number of CPU's they can use in parallel.
Andrew J. Kelly SQL MVP
"imad koussa" <imad.koussa@.aims-co.com> wrote in message
news:uu6kex0FGHA.2444@.TK2MSFTNGP11.phx.gbl...
> Hi All,
> Some of the queries we are running are resource intensive which is
> impacting other users running other processes on the same SQL server. Is
> there a way to limit the time allocated to a query so it pauses after x
> seconds in order to allow other processes to run? Please note that the
> query should not be permanently interrupted but temporarily halted.
> I have checked the "query governor" option but it seems that this
> parameter would disallow running the query if it exceeds the cost limit.
> Regards
>
optimizing resource allocation
Some of the queries we are running are resource intensive which is impacting
other users running other processes on the same SQL server. Is there a way
to limit the time allocated to a query so it pauses after x seconds in order
to allow other processes to run? Please note that the query should not be
permanently interrupted but temporarily halted.
I have checked the "query governor" option but it seems that this parameter
would disallow running the query if it exceeds the cost limit.
RegardsHi
Perhaps it's time to start to optimize your queries.Compare an execution
plan , make sure that the optimizer is available to use indexes defined on
the tables.
"imad koussa" <imad.koussa@.aims-co.com> wrote in message
news:uu6kex0FGHA.2444@.TK2MSFTNGP11.phx.gbl...
> Hi All,
> Some of the queries we are running are resource intensive which is
> impacting other users running other processes on the same SQL server. Is
> there a way to limit the time allocated to a query so it pauses after x
> seconds in order to allow other processes to run? Please note that the
> query should not be permanently interrupted but temporarily halted.
> I have checked the "query governor" option but it seems that this
> parameter would disallow running the query if it exceeds the cost limit.
> Regards
>|||Basically no there is no easy or practical way to do that. But if the
resources are CPU you should consider using a MAXDOP hint in those queries
to limit the number of CPU's they can use in parallel.
Andrew J. Kelly SQL MVP
"imad koussa" <imad.koussa@.aims-co.com> wrote in message
news:uu6kex0FGHA.2444@.TK2MSFTNGP11.phx.gbl...
> Hi All,
> Some of the queries we are running are resource intensive which is
> impacting other users running other processes on the same SQL server. Is
> there a way to limit the time allocated to a query so it pauses after x
> seconds in order to allow other processes to run? Please note that the
> query should not be permanently interrupted but temporarily halted.
> I have checked the "query governor" option but it seems that this
> parameter would disallow running the query if it exceeds the cost limit.
> Regards
>
optimizing resource allocation
Some of the queries we are running are resource intensive which is impacting
other users running other processes on the same SQL server. Is there a way
to limit the time allocated to a query so it pauses after x seconds in order
to allow other processes to run? Please note that the query should not be
permanently interrupted but temporarily halted.
I have checked the "query governor" option but it seems that this parameter
would disallow running the query if it exceeds the cost limit.
RegardsHi
Perhaps it's time to start to optimize your queries.Compare an execution
plan , make sure that the optimizer is available to use indexes defined on
the tables.
"imad koussa" <imad.koussa@.aims-co.com> wrote in message
news:uu6kex0FGHA.2444@.TK2MSFTNGP11.phx.gbl...
> Hi All,
> Some of the queries we are running are resource intensive which is
> impacting other users running other processes on the same SQL server. Is
> there a way to limit the time allocated to a query so it pauses after x
> seconds in order to allow other processes to run? Please note that the
> query should not be permanently interrupted but temporarily halted.
> I have checked the "query governor" option but it seems that this
> parameter would disallow running the query if it exceeds the cost limit.
> Regards
>|||Basically no there is no easy or practical way to do that. But if the
resources are CPU you should consider using a MAXDOP hint in those queries
to limit the number of CPU's they can use in parallel.
--
Andrew J. Kelly SQL MVP
"imad koussa" <imad.koussa@.aims-co.com> wrote in message
news:uu6kex0FGHA.2444@.TK2MSFTNGP11.phx.gbl...
> Hi All,
> Some of the queries we are running are resource intensive which is
> impacting other users running other processes on the same SQL server. Is
> there a way to limit the time allocated to a query so it pauses after x
> seconds in order to allow other processes to run? Please note that the
> query should not be permanently interrupted but temporarily halted.
> I have checked the "query governor" option but it seems that this
> parameter would disallow running the query if it exceeds the cost limit.
> Regards
>
Optimizing Query to Run
select Date =cast(cl1.cl_rundate as datetime(102)),count(cl1.cl_recno) as 'Completed Initials', cl1.cl_status as Status from dbo.vw_Completedorders cl1 where cl1.lob_lineofbusiness = 'aaa'
and cl1.cl_rundate > '20060801' and not exists (
select cl2.cl_company from dbo.vw_Paidorders cl2 where
cl2.lob_lineofbusiness = 'aa'and cl2.cl_company = cl1.cl_order and cl2.cl_rundate > '20060101' and datediff(day,cl2.cl_rundate,cl1.cl_rundate) < 180)
group by cl1.cl_status, cl1.cl_rundate
Aragon:
I started to do a mock-up of this; however, I quickly decided that without additional information this was not a good idea. Since this database is "very large" and you "can't get it complete" I think it is a good idea to try to get more information. What I see as critical to this query is:
Avoiding table scans -- especially on the view that is included in the "not exists" condition|||Unfortunately in both Views, there are no indexes available. I do have a table I could use, Orders. In the Orders table I have 5 indexes available. The only one pertainable to my search is the recNo field. Unfortunately RunDate is not an index or it may actually work.Both views references 2 tables, orders and lineofbussines. LineOfBussiness has 3 columns and no indexes.
The main table to reference, orders, has approximately 29k rows per day for each day since late 2004. CompletedOrders has about 11k per day for the same period and PaidOrders has about 10k per day.
Does this give you enough background?
|||
Try running the Database Tuning Advisor in SQL Server 2005 or Index Tuning Wizard in SQL Server 2000. You need to use a workload or script that contains this query and other common queries/DML statements. DTA/ITW can then recommend additional indexes that will help. It is hard to tell by just looking at your query where the problem might be. Both the objects are views so we don't know what tables are being referenced or how the views are defined. In addition to this, the indexes on the base tables referenced by the views also matter.
Btw, you are using invalid cast specification for datetime "cast(cl1.cl_rundate as datetime(102))". The code will fail if you upgrade your server to SQL Server 2005. The correct way is to do "cast(cl1.cl_rundate as datetime)". And it is also not clear why you need the cast assuming that the column is smalldatetime/datetime.
|||Aragon:
This is probably enough for a mock-up, but I have no more time today. The big question is that if you can prove that you need the indexes on the tables or the views will you be allowed to add them?
|||Thanks for help. Unfortunately, what I really need for indexing is the date field. We tried to index on date before and the database had issues. The rebuild of indexes took too long to load. We have to wait for mainframe jobs to run prior to rebuilding so indexing ran into the work day and caused sych issues with other db's.Dave
the date field is ANSI, so that's why I convert it to a datetime.
|||
Got it. Another piece of optimization that will help is to partition the data; sorry for neglecting to mention this in the previous pass. I will see if I can mock this up this morning.
|||I got it to run using recno to limit the results but it appears to be giving faulty data. I'm having counts of 500 on days when there were only 100 total.Dave
|||
Aragon:
I finally got some time today to look at this again. I was able to load a bunch of mock-up data and look at the results. Below is the schema that I used for my mock-up along with some questions. I targeted my mock-up for about 10,000,000 rows which is 29K records per day times 365 days.
Dave
-- -
-- Mock-up schema?
--
--
-- QUESTIONS:
--
--
-- 1. I am not sure about the "CL_RUNDATE" field. I have modeled this field as
-- CHAR (10) storing date as '2006.05.21'. Exactly what kind of data element
-- is the "CL_RUNDATE" field?
-- 2. I am not sure how the views join the Orders table and the LineOfBusiness table.
-- How are these two table joined.
-- 3. Your query filters based on the field "LOB_LINEOFBUSINESS"; does that field of
-- the view come from the ORDERS table or the LINEOFBUSINESS table?
--
-- 4. In your overview you state that you "look at completed orders that have not had
-- another paid order in 180 days." However, in your query you alias view
-- VW_paidOrders with cl2 and look for orders > '20060101' while aliasing view
-- VW_completedOrders with cl1 and looking for orders > '20060801'. This seems
-- backwards.
-- 5. When you take the date difference DATEDIFF (day, cl2.cl_runDate, cl1.cl_runDate)
-- and this also seems backwards.
--
-- 6. When you join the two views you join the PaidOrders "cl_company" column to the
-- completedOrders "cl_order column. Are these two columns simply two different
-- instances of a representation of a company?
--
-- 7. Another question that I saw has to do with this issue:
--
-- Company A Order 1 Paid 1/15/2006
-- Company A Order 2 Completed 2/2/2006
-- Company A Order 3 Completed 9/9/2006
--
-- Assuming that 4/2 is 180 days ago we have Order 2 that precedes 4/2 and
-- has no subsequent PAID order. Should this cause to the 2/2 record to be
-- included in the report or should that fact that there was a 9/9 COMPLETED
-- order after the 2/2 order suppress the 2/2 order from being included?
--
-- --
-- Comments:
-- --
--
-- At this point I have put together a "working mockup"; however, I still need these
-- issues above discussed. I tried out a mock-up of 10,000,000 rows. This requires
-- 10 GB of disk space to store and about 10 GB of transient log space to run the
-- process to generate the data.
--
-- The index that I created is intended to access target data without the use of any
-- bookmark lookups. When I benchmarked the first set of tests this index reduced
-- logical read from 200000 to 296 -- a very large I/O reduction; this is a good
-- starting point but still doesn't yet prove anything. I am for the moment somewhat
-- optimistic.
--
-- None of this addresses partitioning of data.
--
-- -
--
-- Tables:
--
-- MockOrder
-- LineOfBusiness
--
-- Views:
--
-- vw_CompletedOrders
-- vw_PaidOrders
-- -
create table dbo.mockOrder
( cl_recno integer identity
constraint pk_mockOrder primary key,
lob_code char (4) not null,
cl_status char (2) not null,
cl_rundate char(10) not null,
cl_company char (30) not null,
filler char (730) not null
)
go
create index lobCompany_ndx
on mockOrder (cl_company, lob_code, cl_status, cl_rundate)
go
create table dbo.LineOfBusiness
( lob_code char (4) not null
constraint pk_lineOfBusiness primary key,
lob_lineOfBusiness char (4) not null
)
go
create view dbo.vw_Completedorders
as
select cl_recno,
a.lob_code,
b.lob_lineOfBusiness,
a.cl_status,
cl_rundate,
cl_company as cl_order
from dbo.mockOrder a
inner join dbo.lineOfBusiness b
on a.cl_status = 'C'
and a.lob_code = b.lob_code
go
create view dbo.vw_paidOrders
as
select cl_recno,
a.lob_code,
b.lob_lineOfBusiness,
a.cl_status,
cl_rundate,
cl_company
from dbo.mockOrder a
inner join dbo.lineOfBusiness b
on a.cl_status = 'P'
and a.lob_code = b.lob_code
Sorry, I missed an index:
|||Wow, that's a lot of work you put into it. Thanks. Answers below.create index lobStatus_ndx
on mockOrder (lob_code, cl_status, cl_rundate, cl_company)
1. I am not sure about the "CL_RUNDATE" field. I have modeled this field as
-- CHAR (10) storing date as '2006.05.21'. Exactly what kind of data element is the "CL_RUNDATE" field?
A: Ansi date format. Ex, 20060930
-- 2. I am not sure how the views join the Orders table and the LineOfBusiness table.
-- How are these two table joined.
A: There is a sort code field in both tables. LineOfBusiness is a grouping of sort codes.
-- 3. Your query filters based on the field "LOB_LINEOFBUSINESS"; does that field of
-- the view come from the ORDERS table or the LINEOFBUSINESS table?
--
A: It originates in the LOB table, but is added to the records in both Views.
-- 4. In your overview you state that you "look at completed orders that have not had
-- another paid order in 180 days." However, in your query you alias view
-- VW_paidOrders with cl2 and look for orders > '20060101' while aliasing view
-- VW_completedOrders with cl1 and looking for orders > '20060801'. This seems
-- backwards.
A: I think I mispoke. I'm looking for NEW orders completed when there hasn't been an OLD order the last 180 days. I did forget, in the above example to include a cl2.cl_rundate >= cl1.cl_rundate clause.
-- 5. When you take the date difference DATEDIFF (day, cl2.cl_runDate, cl1.cl_runDate)
-- and this also seems backwards.
--
A: See above. But it shouldn't matter, should it?
-- 6. When you join the two views you join the PaidOrders "cl_company" column to the
-- completedOrders "cl_order column. Are these two columns simply two different
-- instances of a representation of a company?
--
A: I mistyped. Both are company
-- 7. Another question that I saw has to do with this issue:
--
-- Company A Order 1 Paid 1/15/2006
-- Company A Order 2 Completed 2/2/2006
-- Company A Order 3 Completed 9/9/2006
--
-- Assuming that 4/2 is 180 days ago we have Order 2 that precedes 4/2 and
-- has no subsequent PAID order. Should this cause to the 2/2 record to be
-- included in the report or should that fact that there was a 9/9 COMPLETED
-- order after the 2/2 order suppress the 2/2 order from being included?
A: In the above example I would hope to select orders 1 and 3.
|||
Aragon:
Thanks for the answers. Here are the test cases that I set up. Please look these over and let me which of these test cases is false:
set nocount on
truncate table mockOrderselect convert (varchar (10), dateadd (day, -180, getdate()), 112)
-- Results: 20060406-- --
-- CASE 1:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 01' 0011 'AAA' 'C' 20060401
--
-- This row is returned because:
-- (1) LOB = 'AAA'
-- (2) Status = 'C'
-- (3) Run Date (20060401) < 20060406
-- (4) There is no subsequent PAID record
-- --
insert into mockOrder values ( 11, 'AAA', 'C', '20060401', 'Company 01', ' ')-- --
-- CASE 2:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 02' 0021 'AAA' 'C' 20060501
--
-- No rows returned because rundate > 20060406
-- --
insert into mockOrder values ( 21, 'AAA', 'C', '20060501', 'Company 02', ' ')
-- --
-- CASE 3:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 03' 0031 'BBB' 'C' 20060401
--
-- No rows returned because LOB is 'BBB'
-- --
insert into mockOrder values ( 31, 'BBB', 'C', '20060401', 'Company 03', ' ')
-- --
-- CASE 4:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 04' 0041 'AAA' 'C' 20060402
-- 'COMPANY 04' 0042 'AA' 'P' 20060430
--
-- No rows returned; although there is a qualifying 'C' record, there is
-- a subsequent 'P' record that disqualifies the qualifying 'C' record
-- --
insert into mockOrder values ( 41, 'AAA', 'C', '20060402', 'Company 04', ' ')
insert into mockOrder values ( 42, 'AA', 'P', '20060430', 'Company 04', ' ')
-- --
-- CASE 5:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 05' 0051 'AAA' 'C' 20060102
-- 'COMPANY 05' 0052 'AA' 'P' 20060131
-- 'COMPANY 05' 0053 'AAA' 'C' 20060401
--
-- The row for Rec No 53 is returned because:
-- (1) LOB = 'AAA'
-- (2) Status = 'C'
-- (3) Run Date (20060401) < 20060406
-- (4) There is no subsequent PAID record
-- --
insert into mockOrder values ( 51, 'AAA', 'C', '20060102', 'Company 05', ' ')
insert into mockOrder values ( 52, 'AA', 'P', '20060131', 'Company 05', ' ')
insert into mockOrder values ( 53, 'AAA', 'C', '20060401', 'Company 05', ' ')
-- --
-- CASE 6:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 06' 0061 'AAA' 'C' 20060102
-- 'COMPANY 06' 0062 'AA' 'P' 20060131
-- 'COMPANY 06' 0063 'AAA' 'C' 20060501
--
-- No Rows returned; (1) Rec No 61 would otherwise qualify, but Rec NO 62
-- is a subsequent 'P' record and therefore Rec No 61 does not qualify.
-- Rec No 63 does not qualify because Run Date (20060501) > 20060406
-- --
insert into mockOrder values ( 61, 'AAA', 'C', '20060102', 'Company 06', ' ')
insert into mockOrder values ( 62, 'AA', 'P', '20060131', 'Company 06', ' ')
insert into mockOrder values ( 63, 'AAA', 'C', '20060501', 'Company 06', ' ')
-- --
-- CASE 7:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 07' 0071 'AAA' 'C' 20060102
-- 'COMPANY 07' 0072 'AAA' 'C' 20060531
--
-- Rec No 71 is returned because it has no matching 'P' record
-- --
insert into mockOrder values ( 71, 'AAA', 'C', '20060102', 'Company 07', ' ')
insert into mockOrder values ( 73, 'AAA', 'C', '20060531', 'Company 07', ' ')
-- --
-- CASE 8:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 08' 0081 'AAA' 'C' 20060101
-- 'COMPANY 08' 0062 'AA' 'P' 20060901
--
-- Row included; the qualifying 'C' record does not have a matching 'P'
-- record for 180 days.
-- --
insert into mockOrder values ( 81, 'AAA', 'C', '20060101', 'Company 08', ' ')
insert into mockOrder values ( 82, 'AA', 'P', '20060901', 'Company 08', ' ')
-- --
-- CASE 9:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 09' 0091 'AAA' 'C' 20060101
-- 'COMPANY 09' 0091 'AAA' 'C' 20060515
-- 'COMPANY 08' 0093 'AA' 'P' 20060901
--
-- Row included; Rec No 91 qualifies and does not get a 'P' record until
-- more than 180 days later
-- --
insert into mockOrder values ( 91, 'AAA', 'C', '20060102', 'Company 09', ' ')
insert into mockOrder values ( 92, 'AAA', 'C', '20060515', 'Company 09', ' ')
insert into mockOrder values ( 93, 'AA', 'P', '20060901', 'Company 09', ' ')
-- --
-- CASE 10:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 10' 0101 'AAA' 'C' 20060101
-- 'COMPANY 10' 0102 'AAA' 'C' 20060105
-- 'COMPANY 10' 0103 'AAA' 'C' 20060515
-- 'COMPANY 10' 0104 'AA' 'P' 20060901
--
-- 2 Row included; Rec No 91 qualifies and does not get a 'P' record until
-- more than 180 days later
-- --
insert into mockOrder values (101, 'AAA', 'C', '20060102', 'Company 10', ' ')
insert into mockOrder values (102, 'AAA', 'C', '20060105', 'Company 10', ' ')
insert into mockOrder values (103, 'AAA', 'C', '20060515', 'Company 10', ' ')
insert into mockOrder values (104, 'AA', 'P', '20060901', 'Company 10', ' ')
-- --select cl_company,
cl_recno,
lob_sort_code,
cl_status,
cl_rundate
from mockOrder
order by cl_recno
---- cl_company cl_recno lob_sort_code cl_status cl_rundate
-- -- - -
-- Company 01 11 AAA C 20060401
-- Company 02 21 AAA C 20060501
-- Company 03 31 BBB C 20060401
--
-- Company 04 41 AAA C 20060402
-- Company 04 42 AA P 20060430
--
-- Company 05 51 AAA C 20060102
-- Company 05 52 AA P 20060131
-- Company 05 53 AAA C 20060401
--
-- Company 06 61 AAA C 20060102
-- Company 06 62 AA P 20060131
-- Company 06 63 AAA C 20060501
--
-- Company 07 71 AAA C 20060102
-- Company 07 73 AAA C 20060531
--
-- Company 08 81 AAA C 20060101
-- Company 08 82 AA P 20060901
--
-- Company 09 91 AAA C 20060102
-- Company 09 92 AAA C 20060515
-- Company 09 93 AA P 20060901
--
-- Company 10 101 AAA C 20060102
-- Company 10 102 AAA C 20060105
-- Company 10 103 AAA C 20060515
-- Company 10 104 AA P 20060901
--
--
After inserting the test data described above, I ran the following with the indicated results:
declare @.compDate char (8)
declare @.startDate char (8)
declare @.paidDate char (8)
set @.compDate = convert (varchar (8), dateadd (day, -180, getdate()), 112)
set @.startDate = '20060101'
set @.paidDate = convert (varchar (8), dateadd (day, 180, @.startdate), 112)-- --
-- Display dates that are important to the selection process
-- --
select @.compDate as [@.compDate],
@.startDate as [@.startDate],
@.paidDate as [paidDate]-- @.compDate @.startDate paidDate
-- - --
-- 20060406 20060101 20060630
-- --
-- Display the records that make it through the filtering process
-- --
select cl_company,
cl_recno,
lob_LineOfBusiness,
cl_rundate
from dbo.vw_completedOrders cmp
where lob_lineOfBusiness = 'AAA'
and cl_rundate >= @.startDate
and cl_rundate <= '20060406'
and not exists
( select pd.cl_company
from dbo.vw_paidOrders pd
where pd.cl_company = cmp.cl_company
and pd.cl_rundate > cmp.cl_rundate
and datediff (day, cmp.cl_rundate, pd.cl_rundate) <= 180
and pd.lob_lineOfBusiness = 'AA'
)-- cl_company cl_recno lob_LineOfBusiness cl_rundate
-- -- -
-- Company 01 11 aaa 20060401
-- Company 05 53 aaa 20060401
-- Company 07 71 aaa 20060102
-- Company 08 81 aaa 20060101
-- Company 09 91 aaa 20060102
-- Company 10 101 aaa 20060102
-- Company 10 102 aaa 20060105
-- --
-- Display the summary
-- --
select Date = cast (cmp.cl_rundate as datetime (102)),
count(cmp.cl_recno) as 'Completed Initials',
cmp.cl_status as Status
from dbo.vw_completedOrders cmp
where lob_lineOfBusiness = 'AAA'
and cl_rundate >= @.startDate
and cl_rundate <= '20060406'
and not exists
( select pd.cl_company
from dbo.vw_paidOrders pd
where pd.cl_company = cmp.cl_company
and pd.cl_rundate > cmp.cl_rundate
and datediff (day, cmp.cl_rundate, pd.cl_rundate) <= 180
and pd.lob_lineOfBusiness = 'AA'
)
group by cmp.cl_status, cmp.cl_rundate-- Date Completed Initials Status
--
-- 2006-01-01 00:00:00.000 1 C
-- 2006-01-02 00:00:00.000 3 C
-- 2006-01-05 00:00:00.000 1 C
-- 2006-04-01 00:00:00.000 2 C
Let me know if this looks close.
Dave
Optimizing Query to Run
select Date =cast(cl1.cl_rundate as datetime(102)),count(cl1.cl_recno) as 'Completed Initials', cl1.cl_status as Status from dbo.vw_Completedorders cl1 where cl1.lob_lineofbusiness = 'aaa'
and cl1.cl_rundate > '20060801' and not exists (
select cl2.cl_company from dbo.vw_Paidorders cl2 where
cl2.lob_lineofbusiness = 'aa'and cl2.cl_company = cl1.cl_order and cl2.cl_rundate > '20060101' and datediff(day,cl2.cl_rundate,cl1.cl_rundate) < 180)
group by cl1.cl_status, cl1.cl_rundate
Aragon:
I started to do a mock-up of this; however, I quickly decided that without additional information this was not a good idea. Since this database is "very large" and you "can't get it complete" I think it is a good idea to try to get more information. What I see as critical to this query is:
Avoiding table scans -- especially on the view that is included in the "not exists" condition|||Unfortunately in both Views, there are no indexes available. I do have a table I could use, Orders. In the Orders table I have 5 indexes available. The only one pertainable to my search is the recNo field. Unfortunately RunDate is not an index or it may actually work.
Both views references 2 tables, orders and lineofbussines. LineOfBussiness has 3 columns and no indexes.
The main table to reference, orders, has approximately 29k rows per day for each day since late 2004. CompletedOrders has about 11k per day for the same period and PaidOrders has about 10k per day.
Does this give you enough background?|||
Try running the Database Tuning Advisor in SQL Server 2005 or Index Tuning Wizard in SQL Server 2000. You need to use a workload or script that contains this query and other common queries/DML statements. DTA/ITW can then recommend additional indexes that will help. It is hard to tell by just looking at your query where the problem might be. Both the objects are views so we don't know what tables are being referenced or how the views are defined. In addition to this, the indexes on the base tables referenced by the views also matter.
Btw, you are using invalid cast specification for datetime "cast(cl1.cl_rundate as datetime(102))". The code will fail if you upgrade your server to SQL Server 2005. The correct way is to do "cast(cl1.cl_rundate as datetime)". And it is also not clear why you need the cast assuming that the column is smalldatetime/datetime.
|||Aragon:
This is probably enough for a mock-up, but I have no more time today. The big question is that if you can prove that you need the indexes on the tables or the views will you be allowed to add them?
|||Thanks for help. Unfortunately, what I really need for indexing is the date field. We tried to index on date before and the database had issues. The rebuild of indexes took too long to load. We have to wait for mainframe jobs to run prior to rebuilding so indexing ran into the work day and caused sych issues with other db's.Dave
the date field is ANSI, so that's why I convert it to a datetime.|||
Got it. Another piece of optimization that will help is to partition the data; sorry for neglecting to mention this in the previous pass. I will see if I can mock this up this morning.
|||I got it to run using recno to limit the results but it appears to be giving faulty data. I'm having counts of 500 on days when there were only 100 total.|||Dave
Aragon:
I finally got some time today to look at this again. I was able to load a bunch of mock-up data and look at the results. Below is the schema that I used for my mock-up along with some questions. I targeted my mock-up for about 10,000,000 rows which is 29K records per day times 365 days.
Dave
-- -
-- Mock-up schema?
--
--
-- QUESTIONS:
--
--
-- 1. I am not sure about the "CL_RUNDATE" field. I have modeled this field as
-- CHAR (10) storing date as '2006.05.21'. Exactly what kind of data element
-- is the "CL_RUNDATE" field?
-- 2. I am not sure how the views join the Orders table and the LineOfBusiness table.
-- How are these two table joined.
-- 3. Your query filters based on the field "LOB_LINEOFBUSINESS"; does that field of
-- the view come from the ORDERS table or the LINEOFBUSINESS table?
--
-- 4. In your overview you state that you "look at completed orders that have not had
-- another paid order in 180 days." However, in your query you alias view
-- VW_paidOrders with cl2 and look for orders > '20060101' while aliasing view
-- VW_completedOrders with cl1 and looking for orders > '20060801'. This seems
-- backwards.
-- 5. When you take the date difference DATEDIFF (day, cl2.cl_runDate, cl1.cl_runDate)
-- and this also seems backwards.
--
-- 6. When you join the two views you join the PaidOrders "cl_company" column to the
-- completedOrders "cl_order column. Are these two columns simply two different
-- instances of a representation of a company?
--
-- 7. Another question that I saw has to do with this issue:
--
-- Company A Order 1 Paid 1/15/2006
-- Company A Order 2 Completed 2/2/2006
-- Company A Order 3 Completed 9/9/2006
--
-- Assuming that 4/2 is 180 days ago we have Order 2 that precedes 4/2 and
-- has no subsequent PAID order. Should this cause to the 2/2 record to be
-- included in the report or should that fact that there was a 9/9 COMPLETED
-- order after the 2/2 order suppress the 2/2 order from being included?
--
-- --
-- Comments:
-- --
--
-- At this point I have put together a "working mockup"; however, I still need these
-- issues above discussed. I tried out a mock-up of 10,000,000 rows. This requires
-- 10 GB of disk space to store and about 10 GB of transient log space to run the
-- process to generate the data.
--
-- The index that I created is intended to access target data without the use of any
-- bookmark lookups. When I benchmarked the first set of tests this index reduced
-- logical read from 200000 to 296 -- a very large I/O reduction; this is a good
-- starting point but still doesn't yet prove anything. I am for the moment somewhat
-- optimistic.
--
-- None of this addresses partitioning of data.
--
-- -
--
-- Tables:
--
-- MockOrder
-- LineOfBusiness
--
-- Views:
--
-- vw_CompletedOrders
-- vw_PaidOrders
-- -
create table dbo.mockOrder
( cl_recno integer identity
constraint pk_mockOrder primary key,
lob_code char (4) not null,
cl_status char (2) not null,
cl_rundate char(10) not null,
cl_company char (30) not null,
filler char (730) not null
)
go
create index lobCompany_ndx
on mockOrder (cl_company, lob_code, cl_status, cl_rundate)
go
create table dbo.LineOfBusiness
( lob_code char (4) not null
constraint pk_lineOfBusiness primary key,
lob_lineOfBusiness char (4) not null
)
go
create view dbo.vw_Completedorders
as
select cl_recno,
a.lob_code,
b.lob_lineOfBusiness,
a.cl_status,
cl_rundate,
cl_company as cl_order
from dbo.mockOrder a
inner join dbo.lineOfBusiness b
on a.cl_status = 'C'
and a.lob_code = b.lob_code
go
create view dbo.vw_paidOrders
as
select cl_recno,
a.lob_code,
b.lob_lineOfBusiness,
a.cl_status,
cl_rundate,
cl_company
from dbo.mockOrder a
inner join dbo.lineOfBusiness b
on a.cl_status = 'P'
and a.lob_code = b.lob_code
Sorry, I missed an index:
|||Wow, that's a lot of work you put into it. Thanks. Answers below.create index lobStatus_ndx
on mockOrder (lob_code, cl_status, cl_rundate, cl_company)
1. I am not sure about the "CL_RUNDATE" field. I have modeled this field as
-- CHAR (10) storing date as '2006.05.21'. Exactly what kind of data element is the "CL_RUNDATE" field?
A: Ansi date format. Ex, 20060930
-- 2. I am not sure how the views join the Orders table and the LineOfBusiness table.
-- How are these two table joined.
A: There is a sort code field in both tables. LineOfBusiness is a grouping of sort codes.
-- 3. Your query filters based on the field "LOB_LINEOFBUSINESS"; does that field of
-- the view come from the ORDERS table or the LINEOFBUSINESS table?
--
A: It originates in the LOB table, but is added to the records in both Views.
-- 4. In your overview you state that you "look at completed orders that have not had
-- another paid order in 180 days." However, in your query you alias view
-- VW_paidOrders with cl2 and look for orders > '20060101' while aliasing view
-- VW_completedOrders with cl1 and looking for orders > '20060801'. This seems
-- backwards.
A: I think I mispoke. I'm looking for NEW orders completed when there hasn't been an OLD order the last 180 days. I did forget, in the above example to include a cl2.cl_rundate >= cl1.cl_rundate clause.
-- 5. When you take the date difference DATEDIFF (day, cl2.cl_runDate, cl1.cl_runDate)
-- and this also seems backwards.
--
A: See above. But it shouldn't matter, should it?
-- 6. When you join the two views you join the PaidOrders "cl_company" column to the
-- completedOrders "cl_order column. Are these two columns simply two different
-- instances of a representation of a company?
--
A: I mistyped. Both are company
-- 7. Another question that I saw has to do with this issue:
--
-- Company A Order 1 Paid 1/15/2006
-- Company A Order 2 Completed 2/2/2006
-- Company A Order 3 Completed 9/9/2006
--
-- Assuming that 4/2 is 180 days ago we have Order 2 that precedes 4/2 and
-- has no subsequent PAID order. Should this cause to the 2/2 record to be
-- included in the report or should that fact that there was a 9/9 COMPLETED
-- order after the 2/2 order suppress the 2/2 order from being included?
A: In the above example I would hope to select orders 1 and 3.
|||
Aragon:
Thanks for the answers. Here are the test cases that I set up. Please look these over and let me which of these test cases is false:
set nocount on
truncate table mockOrderselect convert (varchar (10), dateadd (day, -180, getdate()), 112)
-- Results: 20060406-- --
-- CASE 1:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 01' 0011 'AAA' 'C' 20060401
--
-- This row is returned because:
-- (1) LOB = 'AAA'
-- (2) Status = 'C'
-- (3) Run Date (20060401) < 20060406
-- (4) There is no subsequent PAID record
-- --
insert into mockOrder values ( 11, 'AAA', 'C', '20060401', 'Company 01', ' ')-- --
-- CASE 2:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 02' 0021 'AAA' 'C' 20060501
--
-- No rows returned because rundate > 20060406
-- --
insert into mockOrder values ( 21, 'AAA', 'C', '20060501', 'Company 02', ' ')
-- --
-- CASE 3:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 03' 0031 'BBB' 'C' 20060401
--
-- No rows returned because LOB is 'BBB'
-- --
insert into mockOrder values ( 31, 'BBB', 'C', '20060401', 'Company 03', ' ')
-- --
-- CASE 4:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 04' 0041 'AAA' 'C' 20060402
-- 'COMPANY 04' 0042 'AA' 'P' 20060430
--
-- No rows returned; although there is a qualifying 'C' record, there is
-- a subsequent 'P' record that disqualifies the qualifying 'C' record
-- --
insert into mockOrder values ( 41, 'AAA', 'C', '20060402', 'Company 04', ' ')
insert into mockOrder values ( 42, 'AA', 'P', '20060430', 'Company 04', ' ')
-- --
-- CASE 5:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 05' 0051 'AAA' 'C' 20060102
-- 'COMPANY 05' 0052 'AA' 'P' 20060131
-- 'COMPANY 05' 0053 'AAA' 'C' 20060401
--
-- The row for Rec No 53 is returned because:
-- (1) LOB = 'AAA'
-- (2) Status = 'C'
-- (3) Run Date (20060401) < 20060406
-- (4) There is no subsequent PAID record
-- --
insert into mockOrder values ( 51, 'AAA', 'C', '20060102', 'Company 05', ' ')
insert into mockOrder values ( 52, 'AA', 'P', '20060131', 'Company 05', ' ')
insert into mockOrder values ( 53, 'AAA', 'C', '20060401', 'Company 05', ' ')
-- --
-- CASE 6:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 06' 0061 'AAA' 'C' 20060102
-- 'COMPANY 06' 0062 'AA' 'P' 20060131
-- 'COMPANY 06' 0063 'AAA' 'C' 20060501
--
-- No Rows returned; (1) Rec No 61 would otherwise qualify, but Rec NO 62
-- is a subsequent 'P' record and therefore Rec No 61 does not qualify.
-- Rec No 63 does not qualify because Run Date (20060501) > 20060406
-- --
insert into mockOrder values ( 61, 'AAA', 'C', '20060102', 'Company 06', ' ')
insert into mockOrder values ( 62, 'AA', 'P', '20060131', 'Company 06', ' ')
insert into mockOrder values ( 63, 'AAA', 'C', '20060501', 'Company 06', ' ')
-- --
-- CASE 7:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 07' 0071 'AAA' 'C' 20060102
-- 'COMPANY 07' 0072 'AAA' 'C' 20060531
--
-- Rec No 71 is returned because it has no matching 'P' record
-- --
insert into mockOrder values ( 71, 'AAA', 'C', '20060102', 'Company 07', ' ')
insert into mockOrder values ( 73, 'AAA', 'C', '20060531', 'Company 07', ' ')
-- --
-- CASE 8:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 08' 0081 'AAA' 'C' 20060101
-- 'COMPANY 08' 0062 'AA' 'P' 20060901
--
-- Row included; the qualifying 'C' record does not have a matching 'P'
-- record for 180 days.
-- --
insert into mockOrder values ( 81, 'AAA', 'C', '20060101', 'Company 08', ' ')
insert into mockOrder values ( 82, 'AA', 'P', '20060901', 'Company 08', ' ')
-- --
-- CASE 9:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 09' 0091 'AAA' 'C' 20060101
-- 'COMPANY 09' 0091 'AAA' 'C' 20060515
-- 'COMPANY 08' 0093 'AA' 'P' 20060901
--
-- Row included; Rec No 91 qualifies and does not get a 'P' record until
-- more than 180 days later
-- --
insert into mockOrder values ( 91, 'AAA', 'C', '20060102', 'Company 09', ' ')
insert into mockOrder values ( 92, 'AAA', 'C', '20060515', 'Company 09', ' ')
insert into mockOrder values ( 93, 'AA', 'P', '20060901', 'Company 09', ' ')
-- --
-- CASE 10:
--
-- -- -- - --
-- Company Rec No LOB Stat Run Date
-- -- -- - --
--
-- 'COMPANY 10' 0101 'AAA' 'C' 20060101
-- 'COMPANY 10' 0102 'AAA' 'C' 20060105
-- 'COMPANY 10' 0103 'AAA' 'C' 20060515
-- 'COMPANY 10' 0104 'AA' 'P' 20060901
--
-- 2 Row included; Rec No 91 qualifies and does not get a 'P' record until
-- more than 180 days later
-- --
insert into mockOrder values (101, 'AAA', 'C', '20060102', 'Company 10', ' ')
insert into mockOrder values (102, 'AAA', 'C', '20060105', 'Company 10', ' ')
insert into mockOrder values (103, 'AAA', 'C', '20060515', 'Company 10', ' ')
insert into mockOrder values (104, 'AA', 'P', '20060901', 'Company 10', ' ')
-- --select cl_company,
cl_recno,
lob_sort_code,
cl_status,
cl_rundate
from mockOrder
order by cl_recno
---- cl_company cl_recno lob_sort_code cl_status cl_rundate
-- -- - -
-- Company 01 11 AAA C 20060401
-- Company 02 21 AAA C 20060501
-- Company 03 31 BBB C 20060401
--
-- Company 04 41 AAA C 20060402
-- Company 04 42 AA P 20060430
--
-- Company 05 51 AAA C 20060102
-- Company 05 52 AA P 20060131
-- Company 05 53 AAA C 20060401
--
-- Company 06 61 AAA C 20060102
-- Company 06 62 AA P 20060131
-- Company 06 63 AAA C 20060501
--
-- Company 07 71 AAA C 20060102
-- Company 07 73 AAA C 20060531
--
-- Company 08 81 AAA C 20060101
-- Company 08 82 AA P 20060901
--
-- Company 09 91 AAA C 20060102
-- Company 09 92 AAA C 20060515
-- Company 09 93 AA P 20060901
--
-- Company 10 101 AAA C 20060102
-- Company 10 102 AAA C 20060105
-- Company 10 103 AAA C 20060515
-- Company 10 104 AA P 20060901
--
--
After inserting the test data described above, I ran the following with the indicated results:
declare @.compDate char (8)
declare @.startDate char (8)
declare @.paidDate char (8)
set @.compDate = convert (varchar (8), dateadd (day, -180, getdate()), 112)
set @.startDate = '20060101'
set @.paidDate = convert (varchar (8), dateadd (day, 180, @.startdate), 112)-- --
-- Display dates that are important to the selection process
-- --
select @.compDate as [@.compDate],
@.startDate as [@.startDate],
@.paidDate as [paidDate]-- @.compDate @.startDate paidDate
-- - --
-- 20060406 20060101 20060630
-- --
-- Display the records that make it through the filtering process
-- --
select cl_company,
cl_recno,
lob_LineOfBusiness,
cl_rundate
from dbo.vw_completedOrders cmp
where lob_lineOfBusiness = 'AAA'
and cl_rundate >= @.startDate
and cl_rundate <= '20060406'
and not exists
( select pd.cl_company
from dbo.vw_paidOrders pd
where pd.cl_company = cmp.cl_company
and pd.cl_rundate > cmp.cl_rundate
and datediff (day, cmp.cl_rundate, pd.cl_rundate) <= 180
and pd.lob_lineOfBusiness = 'AA'
)-- cl_company cl_recno lob_LineOfBusiness cl_rundate
-- -- -
-- Company 01 11 aaa 20060401
-- Company 05 53 aaa 20060401
-- Company 07 71 aaa 20060102
-- Company 08 81 aaa 20060101
-- Company 09 91 aaa 20060102
-- Company 10 101 aaa 20060102
-- Company 10 102 aaa 20060105
-- --
-- Display the summary
-- --
select Date = cast (cmp.cl_rundate as datetime (102)),
count(cmp.cl_recno) as 'Completed Initials',
cmp.cl_status as Status
from dbo.vw_completedOrders cmp
where lob_lineOfBusiness = 'AAA'
and cl_rundate >= @.startDate
and cl_rundate <= '20060406'
and not exists
( select pd.cl_company
from dbo.vw_paidOrders pd
where pd.cl_company = cmp.cl_company
and pd.cl_rundate > cmp.cl_rundate
and datediff (day, cmp.cl_rundate, pd.cl_rundate) <= 180
and pd.lob_lineOfBusiness = 'AA'
)
group by cmp.cl_status, cmp.cl_rundate-- Date Completed Initials Status
--
-- 2006-01-01 00:00:00.000 1 C
-- 2006-01-02 00:00:00.000 3 C
-- 2006-01-05 00:00:00.000 1 C
-- 2006-04-01 00:00:00.000 2 C
Let me know if this looks close.
Dave
Wednesday, March 28, 2012
Optimizing a large
I'm running a script that updates the same column on every row in a huge table. The table is being updated with data from another table in another database, but they are in the same SQL instance
The log and data files for the database being read from are on an internal drive (D:\). The log and datafiles for the database being updated are on an external SAN (X:\
Would any of the following speed the process up significantly
1)Remove the indexes from the table being updated
2)Set the recovery model to 'simple' on the database being updated
3)Put the log and data files on the database being updated on different physical disks (not necessary i believe if I perform number 2)
thanks
KevinKevin,
1. If the column that you are updating has an index on it you might
consider dropping it until after the update.
2. This won't affect in any way the number of items logged or the amount of
data logged during an Update. If you do the update in one transaction it
won't matter either way.
3. It's always best to place the Log and Data files on separate drive
arrays. It makes no difference where the log for the one being read from
are since the log is not used for reads.
I would recommend you attempt the Updates in smaller batches. This will
keep the log in check and usually results in a faster overall operation.
You can usually achieve this with a loop of some sort.
--
Andrew J. Kelly
SQL Server MVP
"Kevin" <anonymous@.discussions.microsoft.com> wrote in message
news:715A5D36-5F4F-43D5-8B5B-854B711F3A9A@.microsoft.com...
> Hi,
> I'm running a script that updates the same column on every row in a huge
table. The table is being updated with data from another table in another
database, but they are in the same SQL instance.
> The log and data files for the database being read from are on an internal
drive (D:\). The log and datafiles for the database being updated are on an
external SAN (X:\)
> Would any of the following speed the process up significantly?
> 1)Remove the indexes from the table being updated?
> 2)Set the recovery model to 'simple' on the database being updated?
> 3)Put the log and data files on the database being updated on different
physical disks (not necessary i believe if I perform number 2).
> thanks,
> Kevin|||Thanks for your response
Does the frequency of log checkpoints significantly slow down a long running update statement? Should I create a large transaction log that will not need to dynamically grow and set the recovery interval to a high value so that the frequency of checkpoints is reduced
thanks
Kevin|||Kevin,
First off you should always have the log file larger than it needs to be for
any given operation. Anytime it has to grow it will impact performance to
some degree. As for check points that depends. Checkpoints certainly can
add overhead to the system, especially disk IO and CPU. Whether they have a
large negative effect on your situation is hard to say from here. If you
make a large recovery interval it will most definitely adversely affect the
other users when it does happen as there will be a lot more to do at one
time. The key in your situation is to do the updates in smaller batches.
Andrew J. Kelly
SQL Server MVP
"Kevin" <anonymous@.discussions.microsoft.com> wrote in message
news:4A65CF81-2DD6-4F53-AA7A-E7B00FF50CE2@.microsoft.com...
> Thanks for your response.
> Does the frequency of log checkpoints significantly slow down a long
running update statement? Should I create a large transaction log that will
not need to dynamically grow and set the recovery interval to a high value
so that the frequency of checkpoints is reduced?
> thanks,
> Kevin
Optimizing a large
I'm running a script that updates the same column on every row in a huge tab
le. The table is being updated with data from another table in another datab
ase, but they are in the same SQL instance.
The log and data files for the database being read from are on an internal d
rive (D:\). The log and datafiles for the database being updated are on an e
xternal SAN (X:\)
Would any of the following speed the process up significantly?
1)Remove the indexes from the table being updated?
2)Set the recovery model to 'simple' on the database being updated?
3)Put the log and data files on the database being updated on different phys
ical disks (not necessary i believe if I perform number 2).
thanks,
KevinKevin,
1. If the column that you are updating has an index on it you might
consider dropping it until after the update.
2. This won't affect in any way the number of items logged or the amount of
data logged during an Update. If you do the update in one transaction it
won't matter either way.
3. It's always best to place the Log and Data files on separate drive
arrays. It makes no difference where the log for the one being read from
are since the log is not used for reads.
I would recommend you attempt the Updates in smaller batches. This will
keep the log in check and usually results in a faster overall operation.
You can usually achieve this with a loop of some sort.
Andrew J. Kelly
SQL Server MVP
"Kevin" <anonymous@.discussions.microsoft.com> wrote in message
news:715A5D36-5F4F-43D5-8B5B-854B711F3A9A@.microsoft.com...
quote:
> Hi,
> I'm running a script that updates the same column on every row in a huge
table. The table is being updated with data from another table in another
database, but they are in the same SQL instance.
quote:
> The log and data files for the database being read from are on an internal
drive (D:\). The log and datafiles for the database being updated are on an
external SAN (X:\)
quote:
> Would any of the following speed the process up significantly?
> 1)Remove the indexes from the table being updated?
> 2)Set the recovery model to 'simple' on the database being updated?
> 3)Put the log and data files on the database being updated on different
physical disks (not necessary i believe if I perform number 2).
quote:|||Thanks for your response.
> thanks,
> Kevin
Does the frequency of log checkpoints significantly slow down a long running
update statement? Should I create a large transaction log that will not nee
d to dynamically grow and set the recovery interval to a high value so that
the frequency of checkpoint
s is reduced?
thanks,
Kevin|||Kevin,
First off you should always have the log file larger than it needs to be for
any given operation. Anytime it has to grow it will impact performance to
some degree. As for check points that depends. Checkpoints certainly can
add overhead to the system, especially disk IO and CPU. Whether they have a
large negative effect on your situation is hard to say from here. If you
make a large recovery interval it will most definitely adversely affect the
other users when it does happen as there will be a lot more to do at one
time. The key in your situation is to do the updates in smaller batches.
Andrew J. Kelly
SQL Server MVP
"Kevin" <anonymous@.discussions.microsoft.com> wrote in message
news:4A65CF81-2DD6-4F53-AA7A-E7B00FF50CE2@.microsoft.com...
quote:
> Thanks for your response.
> Does the frequency of log checkpoints significantly slow down a long
running update statement? Should I create a large transaction log that will
not need to dynamically grow and set the recovery interval to a high value
so that the frequency of checkpoints is reduced?
quote:
> thanks,
> Kevin
Optimizing
In Enterprise Manager 2000 as part of a Maintenance Plan. There is a Optimizations Tab. Running this and choosing Reorganize data w/original amt of free space, the same as Running dbreindex on the tables?
Any help would be appreciated.
Thank You
Essentially yes.
Andrew J. Kelly
SQL Server MVP
"Marc305" <anonymous@.discussions.microsoft.com> wrote in message
news:C49CF28B-DD22-4EB4-AFFB-EA22FFE83838@.microsoft.com...
> Hi,
> In Enterprise Manager 2000 as part of a Maintenance Plan. There is a
Optimizations Tab. Running this and choosing Reorganize data w/original amt
of free space, the same as Running dbreindex on the tables?
> Any help would be appreciated.
> Thank You
Optimizing
In Enterprise Manager 2000 as part of a Maintenance Plan. There is a Optimiz
ations Tab. Running this and choosing Reorganize data w/original amt of free
space, the same as Running dbreindex on the tables?
Any help would be appreciated.
Thank YouEssentially yes.
Andrew J. Kelly
SQL Server MVP
"Marc305" <anonymous@.discussions.microsoft.com> wrote in message
news:C49CF28B-DD22-4EB4-AFFB-EA22FFE83838@.microsoft.com...
> Hi,
> In Enterprise Manager 2000 as part of a Maintenance Plan. There is a
Optimizations Tab. Running this and choosing Reorganize data w/original amt
of free space, the same as Running dbreindex on the tables?
> Any help would be appreciated.
> Thank You
Optimizing
In Enterprise Manager 2000 as part of a Maintenance Plan. There is a Optimizations Tab. Running this and choosing Reorganize data w/original amt of free space, the same as Running dbreindex on the tables
Any help would be appreciated
Thank YouEssentially yes.
Andrew J. Kelly
SQL Server MVP
"Marc305" <anonymous@.discussions.microsoft.com> wrote in message
news:C49CF28B-DD22-4EB4-AFFB-EA22FFE83838@.microsoft.com...
> Hi,
> In Enterprise Manager 2000 as part of a Maintenance Plan. There is a
Optimizations Tab. Running this and choosing Reorganize data w/original amt
of free space, the same as Running dbreindex on the tables?
> Any help would be appreciated.
> Thank You
Monday, March 26, 2012
Optimize Temp tables
My application makes heavy use of temp tables because we have a few
thousands of stored proc running every day to renerate reports.
What are the positive and negative impact to add clustered index on the
temp table?
what is the negative impact if I use derived table instead temp table?
When I have big stored proc more than 1500 lines and have at least 5
temp tables, what is the best solution to make the heavy stored proc
run faster?
Thanks so much,
Silaphet,
> When I have big stored proc more than 1500 lines and have at least 5
> temp tables, what is the best solution to make the heavy stored proc
> run faster?
With that much code running just to generate a report, you might consider
having a background process (or a data warehouse) pre-aggregate the data.
This is my signature. It is a general reminder.
Please post DDL, sample data and desired results.
See http://www.aspfaq.com/5006 for info.
|||In addition to Aaron's response , create a temp tables at the beginning of
the stored procedure (reduce recompile).
Also, it may help to improve performance if you add NC/CI indexes depending
on your requrements.
Look at an execution plan of the stored procedure to ensure that optimizer
is available to use indexes.
"Silaphet" <kmounkhaty@.cox.net> wrote in message
news:1112583294.022932.260500@.z14g2000cwz.googlegr oups.com...
> Hi All,
> My application makes heavy use of temp tables because we have a few
> thousands of stored proc running every day to renerate reports.
> What are the positive and negative impact to add clustered index on the
> temp table?
> what is the negative impact if I use derived table instead temp table?
> When I have big stored proc more than 1500 lines and have at least 5
> temp tables, what is the best solution to make the heavy stored proc
> run faster?
> Thanks so much,
> Silaphet,
>
|||Another possibility would be to consider an INDEXED VIEW.
Typically, you want to keep your OLTP and DSS/OLAP operations and data
seperated. It might be wise to reconsider the cohosting of this data in the
same database and reconsider creating something new elsewhere that better
fits your architecture.
Sincerely,
Anthony Thomas
"Silaphet" <kmounkhaty@.cox.net> wrote in message
news:1112583294.022932.260500@.z14g2000cwz.googlegr oups.com...
Hi All,
My application makes heavy use of temp tables because we have a few
thousands of stored proc running every day to renerate reports.
What are the positive and negative impact to add clustered index on the
temp table?
what is the negative impact if I use derived table instead temp table?
When I have big stored proc more than 1500 lines and have at least 5
temp tables, what is the best solution to make the heavy stored proc
run faster?
Thanks so much,
Silaphet,
Optimize Temp tables
My application makes heavy use of temp tables because we have a few
thousands of stored proc running every day to renerate reports.
What are the positive and negative impact to add clustered index on the
temp table?
what is the negative impact if I use derived table instead temp table?
When I have big stored proc more than 1500 lines and have at least 5
temp tables, what is the best solution to make the heavy stored proc
run faster?
Thanks so much,
Silaphet,> When I have big stored proc more than 1500 lines and have at least 5
> temp tables, what is the best solution to make the heavy stored proc
> run faster?
With that much code running just to generate a report, you might consider
having a background process (or a data warehouse) pre-aggregate the data.
This is my signature. It is a general reminder.
Please post DDL, sample data and desired results.
See http://www.aspfaq.com/5006 for info.|||In addition to Aaron's response , create a temp tables at the beginning of
the stored procedure (reduce recompile).
Also, it may help to improve performance if you add NC/CI indexes depending
on your requrements.
Look at an execution plan of the stored procedure to ensure that optimizer
is available to use indexes.
"Silaphet" <kmounkhaty@.cox.net> wrote in message
news:1112583294.022932.260500@.z14g2000cwz.googlegroups.com...
> Hi All,
> My application makes heavy use of temp tables because we have a few
> thousands of stored proc running every day to renerate reports.
> What are the positive and negative impact to add clustered index on the
> temp table?
> what is the negative impact if I use derived table instead temp table?
> When I have big stored proc more than 1500 lines and have at least 5
> temp tables, what is the best solution to make the heavy stored proc
> run faster?
> Thanks so much,
> Silaphet,
>|||Another possibility would be to consider an INDEXED VIEW.
Typically, you want to keep your OLTP and DSS/OLAP operations and data
seperated. It might be wise to reconsider the cohosting of this data in the
same database and reconsider creating something new elsewhere that better
fits your architecture.
Sincerely,
Anthony Thomas
"Silaphet" <kmounkhaty@.cox.net> wrote in message
news:1112583294.022932.260500@.z14g2000cwz.googlegroups.com...
Hi All,
My application makes heavy use of temp tables because we have a few
thousands of stored proc running every day to renerate reports.
What are the positive and negative impact to add clustered index on the
temp table?
what is the negative impact if I use derived table instead temp table?
When I have big stored proc more than 1500 lines and have at least 5
temp tables, what is the best solution to make the heavy stored proc
run faster?
Thanks so much,
Silaphet,
Optimize Temp tables
My application makes heavy use of temp tables because we have a few
thousands of stored proc running every day to renerate reports.
What are the positive and negative impact to add clustered index on the
temp table?
what is the negative impact if I use derived table instead temp table?
When I have big stored proc more than 1500 lines and have at least 5
temp tables, what is the best solution to make the heavy stored proc
run faster?
Thanks so much,
Silaphet,> When I have big stored proc more than 1500 lines and have at least 5
> temp tables, what is the best solution to make the heavy stored proc
> run faster?
With that much code running just to generate a report, you might consider
having a background process (or a data warehouse) pre-aggregate the data.
--
This is my signature. It is a general reminder.
Please post DDL, sample data and desired results.
See http://www.aspfaq.com/5006 for info.|||In addition to Aaron's response , create a temp tables at the beginning of
the stored procedure (reduce recompile).
Also, it may help to improve performance if you add NC/CI indexes depending
on your requrements.
Look at an execution plan of the stored procedure to ensure that optimizer
is available to use indexes.
"Silaphet" <kmounkhaty@.cox.net> wrote in message
news:1112583294.022932.260500@.z14g2000cwz.googlegroups.com...
> Hi All,
> My application makes heavy use of temp tables because we have a few
> thousands of stored proc running every day to renerate reports.
> What are the positive and negative impact to add clustered index on the
> temp table?
> what is the negative impact if I use derived table instead temp table?
> When I have big stored proc more than 1500 lines and have at least 5
> temp tables, what is the best solution to make the heavy stored proc
> run faster?
> Thanks so much,
> Silaphet,
>|||Another possibility would be to consider an INDEXED VIEW.
Typically, you want to keep your OLTP and DSS/OLAP operations and data
seperated. It might be wise to reconsider the cohosting of this data in the
same database and reconsider creating something new elsewhere that better
fits your architecture.
Sincerely,
Anthony Thomas
"Silaphet" <kmounkhaty@.cox.net> wrote in message
news:1112583294.022932.260500@.z14g2000cwz.googlegroups.com...
Hi All,
My application makes heavy use of temp tables because we have a few
thousands of stored proc running every day to renerate reports.
What are the positive and negative impact to add clustered index on the
temp table?
what is the negative impact if I use derived table instead temp table?
When I have big stored proc more than 1500 lines and have at least 5
temp tables, what is the best solution to make the heavy stored proc
run faster?
Thanks so much,
Silaphet,
Monday, March 19, 2012
Optimising a query / stored procedure
we need to rewrite it. (Need to reduce from 10sec to 1sec)
Using Query Analyser we Executed the Stored Procedure and it takes 10 Sec.
However re-Executing with same parameters it only takes 1 sec, it's like the
Query Optimiser has cached or remembered the Execution Plan or something.
(If I use a new set or parameters then it takes 10 secs again)
Unless we can get a consistent result of how long the current SP takes to
run, there's no way of determining whether any changes are beneficial - is
there some way of clearing the "cache" or "un-remembering" the Execution
Plan so that we can get a consistent result to compare with ?
(This is SQL Server 2000 standard edition)Look at
DBCC DROPCLEANBUFFER
DBCC FREEPROCCACHE
--
Allan Mitchell (Microsoft SQL Server MVP)
MCSE,MCDBA
www.SQLDTS.com
I support PASS - the definitive, global community
for SQL Server professionals - http://www.sqlpass.org
"Richard" <richa@.heidmar.co.uk> wrote in message
news:EHNkb.10$to6.7@.newsr2.u-net.net...
> I have a Stored Procedure based on a View which is running very slowly, so
> we need to rewrite it. (Need to reduce from 10sec to 1sec)
> Using Query Analyser we Executed the Stored Procedure and it takes 10 Sec.
> However re-Executing with same parameters it only takes 1 sec, it's like
the
> Query Optimiser has cached or remembered the Execution Plan or something.
> (If I use a new set or parameters then it takes 10 secs again)
> Unless we can get a consistent result of how long the current SP takes to
> run, there's no way of determining whether any changes are beneficial - is
> there some way of clearing the "cache" or "un-remembering" the Execution
> Plan so that we can get a consistent result to compare with ?
> (This is SQL Server 2000 standard edition)
>|||... and CHECKPOINT in the beginning to get rid of dirty pages.
--
Tibor Karaszi, SQL Server MVP
Archive at: http://groups.google.com/groups?oi=djq&as ugroup=microsoft.public.sqlserver
"Allan Mitchell" <allan@.no-spam.sqldts.com> wrote in message
news:ei6GoxulDHA.3316@.tk2msftngp13.phx.gbl...
> Look at
> DBCC DROPCLEANBUFFER
> DBCC FREEPROCCACHE
>
> --
> --
> Allan Mitchell (Microsoft SQL Server MVP)
> MCSE,MCDBA
> www.SQLDTS.com
> I support PASS - the definitive, global community
> for SQL Server professionals - http://www.sqlpass.org
>
> "Richard" <richa@.heidmar.co.uk> wrote in message
> news:EHNkb.10$to6.7@.newsr2.u-net.net...
> > I have a Stored Procedure based on a View which is running very slowly, so
> > we need to rewrite it. (Need to reduce from 10sec to 1sec)
> > Using Query Analyser we Executed the Stored Procedure and it takes 10 Sec.
> > However re-Executing with same parameters it only takes 1 sec, it's like
> the
> > Query Optimiser has cached or remembered the Execution Plan or something.
> > (If I use a new set or parameters then it takes 10 secs again)
> > Unless we can get a consistent result of how long the current SP takes to
> > run, there's no way of determining whether any changes are beneficial - is
> > there some way of clearing the "cache" or "un-remembering" the Execution
> > Plan so that we can get a consistent result to compare with ?
> >
> > (This is SQL Server 2000 standard edition)
> >
> >
>
Optimiser issues between SQL Server 7 and SQL Server 2000 Enterprise
Can anybody throw some light on this problem...
In SQL Server version 7, running on NT 4, the following query (generated by
Business Objects, not me!) can be successfully executed in query analyser
and returns correct results without heavily utilising tembdb. A seek is
performed, based on date, on a clustered index only returning the rows from
Sales_Details_Extract (fact table) that are required:
SELECT
sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value),
(CASE WHEN
(
Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
BusObj_Control) )
THEN (
sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value)
)
ELSE 0 END),
(CASE WHEN
(
Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
BusObj_Control) )
THEN (
sum(Sales_Enquiry.dbo.Sales_Details_Extract.Quantity)
)
ELSE 0 END),
Sales_Enquiry.dbo.Time_v.Year,
Sales_Enquiry.dbo.Time_v.Month,
dbo.SCFlattened.SCSubT2Desc
FROM
Sales_Enquiry.dbo.Sales_Details_Extract,
dbo.Title_Hierarchy,
dbo.Customers,
dbo.SCFlattened,
Sales_Enquiry.dbo.Time_v
WHERE
( Sales_Enquiry.dbo.Sales_Details_Extract.ISBN=dbo.Title_Hierarchy.ISBN and
Sales_Enquiry.dbo.Sales_Details_Extract.Source_System=dbo.Title_Hierarchy.So
urce_System )
AND (
Sales_Enquiry.dbo.Sales_Details_Extract.Customer=dbo.Customers.Customer )
AND ( Sales_Enquiry.dbo.Sales_Details_Extract.Date between
Sales_Enquiry.dbo.Time_v.Period_Start and
Sales_Enquiry.dbo.Time_v.Period_End )
AND ( dbo.Customers.Sales_Channel=dbo.SCFlattened.SCSubT6 )
AND (
( (
Sales_Enquiry.dbo.Time_v.Year ) = ( (select Current_Year from
BusObj_Control) ) )
AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Type IN (' ', 'S',
'U')
AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Category NOT IN ('V',
'R')
AND
dbo.Customers.Customer NOT IN ('61384 ', '68812 ', '61383 ')
AND
dbo.Title_Hierarchy.ISBN NOT IN ('0141014075')
)
GROUP BY
Sales_Enquiry.dbo.Time_v.Year, Sales_Enquiry.dbo.Time_v.Month,
dbo.SCFlattened.SCSubT2Desc
The table sizes are as follows:
Sales_Details_Extract (view containing ~ 70 million rows)
Title_Hierarchy (Table containing ~180k rows)
Customers, (Table containing ~60k rows)
SCFlattened (Table containing ~30k rows)
Time_v (view containing ~ 80 rows)
Total DB Size ~ 33GB
On new hardware running Windows 2003 Server and SQL Server 2000 Enterprise
Ed. The same query causes tempdb to fill (56GB) before we run out of disk
resource. The database is a direct copy i.e. db detached and attached. The
indexes have recently been rebuilt and the 'auto update stats' option is
selected. The query plan seems to indicate a scan of the entire
Sales_Details (70 million rows) is taking place, and suspect a Cartesian
product is the cause of tempdb growing so large.
We've tried unsuccessfully placing the database including tempdb into 'SQL 7
Compatibility Mode'
Any Suggestions would be gratefully received. I have copies of the
execution plans if interested.
Regards,
Ian (ichinds@.hotmail.com)Have you tried using query hints to influence the plan? "query hints" in
books online has some description.
If you want to try the compatibility mode, then it should be set on your
database, not tempdb.
--
Wei Xiao [MSFT]
SQL Server Storage Engine Development
http://weblogs.asp.net/weix
This posting is provided "AS IS" with no warranties, and confers no rights.
"Ian" <ian.hinds@.pearsontc.co.uk> wrote in message
news:O%23r3sSX0EHA.3908@.TK2MSFTNGP12.phx.gbl...
> Hello All,
>
> Can anybody throw some light on this problem...
>
> In SQL Server version 7, running on NT 4, the following query (generated
> by
> Business Objects, not me!) can be successfully executed in query analyser
> and returns correct results without heavily utilising tembdb. A seek is
> performed, based on date, on a clustered index only returning the rows
> from
> Sales_Details_Extract (fact table) that are required:
>
> SELECT
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value),
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value)
> )
> ELSE 0 END),
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Quantity)
> )
> ELSE 0 END),
> Sales_Enquiry.dbo.Time_v.Year,
> Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
> FROM
> Sales_Enquiry.dbo.Sales_Details_Extract,
> dbo.Title_Hierarchy,
> dbo.Customers,
> dbo.SCFlattened,
> Sales_Enquiry.dbo.Time_v
> WHERE
> ( Sales_Enquiry.dbo.Sales_Details_Extract.ISBN=dbo.Title_Hierarchy.ISBN
> and
> Sales_Enquiry.dbo.Sales_Details_Extract.Source_System=dbo.Title_Hierarchy.So
> urce_System )
> AND (
> Sales_Enquiry.dbo.Sales_Details_Extract.Customer=dbo.Customers.Customer )
> AND ( Sales_Enquiry.dbo.Sales_Details_Extract.Date between
> Sales_Enquiry.dbo.Time_v.Period_Start and
> Sales_Enquiry.dbo.Time_v.Period_End )
> AND ( dbo.Customers.Sales_Channel=dbo.SCFlattened.SCSubT6 )
> AND (
> ( (
> Sales_Enquiry.dbo.Time_v.Year ) = ( (select Current_Year from
> BusObj_Control) ) )
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Type IN (' ', 'S',
> 'U')
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Category NOT IN ('V',
> 'R')
> AND
> dbo.Customers.Customer NOT IN ('61384 ', '68812 ', '61383 ')
> AND
> dbo.Title_Hierarchy.ISBN NOT IN ('0141014075')
> )
> GROUP BY
> Sales_Enquiry.dbo.Time_v.Year, Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
>
>
> The table sizes are as follows:
>
> Sales_Details_Extract (view containing ~ 70 million rows)
> Title_Hierarchy (Table containing ~180k rows)
> Customers, (Table containing ~60k rows)
> SCFlattened (Table containing ~30k rows)
> Time_v (view containing ~ 80 rows)
>
> Total DB Size ~ 33GB
>
>
> On new hardware running Windows 2003 Server and SQL Server 2000 Enterprise
> Ed. The same query causes tempdb to fill (56GB) before we run out of disk
> resource. The database is a direct copy i.e. db detached and attached.
> The
> indexes have recently been rebuilt and the 'auto update stats' option is
> selected. The query plan seems to indicate a scan of the entire
> Sales_Details (70 million rows) is taking place, and suspect a Cartesian
> product is the cause of tempdb growing so large.
>
> We've tried unsuccessfully placing the database including tempdb into 'SQL
> 7
> Compatibility Mode'
>
> Any Suggestions would be gratefully received. I have copies of the
> execution plans if interested.
>
> Regards,
>
> Ian (ichinds@.hotmail.com)
>
>
>
>|||Ian,
I don't know how much influence you have over the query or the schema. I
assume you cannot change the query.
First of all, make sure you statistics are up to date. Run UPDATE
STATISTICS when in doubt.
Next, you could check if all join keys are of the same data type. For
example, column Time_v.Year should have the exact same data type (for
optimal performance) as BusObj_Control.Current_Year.
Also check if the columns in the WHERE clause have the same data type as
the literals of the expressions. For example, column Customers.Customer
should be char or varchar in order to match the data type of the literal
'61384 '.
Hope this helps,
Gert-Jan
Ian wrote:
> Hello All,
> Can anybody throw some light on this problem...
> In SQL Server version 7, running on NT 4, the following query (generated by
> Business Objects, not me!) can be successfully executed in query analyser
> and returns correct results without heavily utilising tembdb. A seek is
> performed, based on date, on a clustered index only returning the rows from
> Sales_Details_Extract (fact table) that are required:
> SELECT
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value),
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value)
> )
> ELSE 0 END),
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Quantity)
> )
> ELSE 0 END),
> Sales_Enquiry.dbo.Time_v.Year,
> Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
> FROM
> Sales_Enquiry.dbo.Sales_Details_Extract,
> dbo.Title_Hierarchy,
> dbo.Customers,
> dbo.SCFlattened,
> Sales_Enquiry.dbo.Time_v
> WHERE
> ( Sales_Enquiry.dbo.Sales_Details_Extract.ISBN=dbo.Title_Hierarchy.ISBN and
> Sales_Enquiry.dbo.Sales_Details_Extract.Source_System=dbo.Title_Hierarchy.So
> urce_System )
> AND (
> Sales_Enquiry.dbo.Sales_Details_Extract.Customer=dbo.Customers.Customer )
> AND ( Sales_Enquiry.dbo.Sales_Details_Extract.Date between
> Sales_Enquiry.dbo.Time_v.Period_Start and
> Sales_Enquiry.dbo.Time_v.Period_End )
> AND ( dbo.Customers.Sales_Channel=dbo.SCFlattened.SCSubT6 )
> AND (
> ( (
> Sales_Enquiry.dbo.Time_v.Year ) = ( (select Current_Year from
> BusObj_Control) ) )
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Type IN (' ', 'S',
> 'U')
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Category NOT IN ('V',
> 'R')
> AND
> dbo.Customers.Customer NOT IN ('61384 ', '68812 ', '61383 ')
> AND
> dbo.Title_Hierarchy.ISBN NOT IN ('0141014075')
> )
> GROUP BY
> Sales_Enquiry.dbo.Time_v.Year, Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
> The table sizes are as follows:
> Sales_Details_Extract (view containing ~ 70 million rows)
> Title_Hierarchy (Table containing ~180k rows)
> Customers, (Table containing ~60k rows)
> SCFlattened (Table containing ~30k rows)
> Time_v (view containing ~ 80 rows)
> Total DB Size ~ 33GB
> On new hardware running Windows 2003 Server and SQL Server 2000 Enterprise
> Ed. The same query causes tempdb to fill (56GB) before we run out of disk
> resource. The database is a direct copy i.e. db detached and attached. The
> indexes have recently been rebuilt and the 'auto update stats' option is
> selected. The query plan seems to indicate a scan of the entire
> Sales_Details (70 million rows) is taking place, and suspect a Cartesian
> product is the cause of tempdb growing so large.
> We've tried unsuccessfully placing the database including tempdb into 'SQL 7
> Compatibility Mode'
> Any Suggestions would be gratefully received. I have copies of the
> execution plans if interested.
> Regards,
> Ian (ichinds@.hotmail.com)
Optimiser issues between SQL Server 7 and SQL Server 2000 Enterprise
Can anybody throw some light on this problem...
In SQL Server version 7, running on NT 4, the following query (generated by
Business Objects, not me!) can be successfully executed in query analyser
and returns correct results without heavily utilising tembdb. A seek is
performed, based on date, on a clustered index only returning the rows from
Sales_Details_Extract (fact table) that are required:
SELECT
sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value) ,
(CASE WHEN
(
Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
BusObj_Control) )
THEN (
sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value)
)
ELSE 0 END),
(CASE WHEN
(
Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
BusObj_Control) )
THEN (
sum(Sales_Enquiry.dbo.Sales_Details_Extract.Quanti ty)
)
ELSE 0 END),
Sales_Enquiry.dbo.Time_v.Year,
Sales_Enquiry.dbo.Time_v.Month,
dbo.SCFlattened.SCSubT2Desc
FROM
Sales_Enquiry.dbo.Sales_Details_Extract,
dbo.Title_Hierarchy,
dbo.Customers,
dbo.SCFlattened,
Sales_Enquiry.dbo.Time_v
WHERE
( Sales_Enquiry.dbo.Sales_Details_Extract.ISBN=dbo.T itle_Hierarchy.ISBN and
Sales_Enquiry.dbo.Sales_Details_Extract.Source_Sys tem=dbo.Title_Hierarchy.So
urce_System )
AND (
Sales_Enquiry.dbo.Sales_Details_Extract.Customer=d bo.Customers.Customer )
AND ( Sales_Enquiry.dbo.Sales_Details_Extract.Date between
Sales_Enquiry.dbo.Time_v.Period_Start and
Sales_Enquiry.dbo.Time_v.Period_End )
AND ( dbo.Customers.Sales_Channel=dbo.SCFlattened.SCSubT 6 )
AND (
( (
Sales_Enquiry.dbo.Time_v.Year ) = ( (select Current_Year from
BusObj_Control) ) )
AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Type IN (' ', 'S',
'U')
AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Categ ory NOT IN ('V',
'R')
AND
dbo.Customers.Customer NOT IN ('61384 ', '68812 ', '61383 ')
AND
dbo.Title_Hierarchy.ISBN NOT IN ('0141014075')
)
GROUP BY
Sales_Enquiry.dbo.Time_v.Year, Sales_Enquiry.dbo.Time_v.Month,
dbo.SCFlattened.SCSubT2Desc
The table sizes are as follows:
Sales_Details_Extract (view containing ~ 70 million rows)
Title_Hierarchy (Table containing ~180k rows)
Customers, (Table containing ~60k rows)
SCFlattened (Table containing ~30k rows)
Time_v (view containing ~ 80 rows)
Total DB Size ~ 33GB
On new hardware running Windows 2003 Server and SQL Server 2000 Enterprise
Ed. The same query causes tempdb to fill (56GB) before we run out of disk
resource. The database is a direct copy i.e. db detached and attached. The
indexes have recently been rebuilt and the 'auto update stats' option is
selected. The query plan seems to indicate a scan of the entire
Sales_Details (70 million rows) is taking place, and suspect a Cartesian
product is the cause of tempdb growing so large.
We've tried unsuccessfully placing the database including tempdb into 'SQL 7
Compatibility Mode'
Any Suggestions would be gratefully received. I have copies of the
execution plans if interested.
Regards,
Ian (ichinds@.hotmail.com)
Have you tried using query hints to influence the plan? "query hints" in
books online has some description.
If you want to try the compatibility mode, then it should be set on your
database, not tempdb.
Wei Xiao [MSFT]
SQL Server Storage Engine Development
http://weblogs.asp.net/weix
This posting is provided "AS IS" with no warranties, and confers no rights.
"Ian" <ian.hinds@.pearsontc.co.uk> wrote in message
news:O%23r3sSX0EHA.3908@.TK2MSFTNGP12.phx.gbl...
> Hello All,
>
> Can anybody throw some light on this problem...
>
> In SQL Server version 7, running on NT 4, the following query (generated
> by
> Business Objects, not me!) can be successfully executed in query analyser
> and returns correct results without heavily utilising tembdb. A seek is
> performed, based on date, on a clustered index only returning the rows
> from
> Sales_Details_Extract (fact table) that are required:
>
> SELECT
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value) ,
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value)
> )
> ELSE 0 END),
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Quanti ty)
> )
> ELSE 0 END),
> Sales_Enquiry.dbo.Time_v.Year,
> Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
> FROM
> Sales_Enquiry.dbo.Sales_Details_Extract,
> dbo.Title_Hierarchy,
> dbo.Customers,
> dbo.SCFlattened,
> Sales_Enquiry.dbo.Time_v
> WHERE
> ( Sales_Enquiry.dbo.Sales_Details_Extract.ISBN=dbo.T itle_Hierarchy.ISBN
> and
> Sales_Enquiry.dbo.Sales_Details_Extract.Source_Sys tem=dbo.Title_Hierarchy.So
> urce_System )
> AND (
> Sales_Enquiry.dbo.Sales_Details_Extract.Customer=d bo.Customers.Customer )
> AND ( Sales_Enquiry.dbo.Sales_Details_Extract.Date between
> Sales_Enquiry.dbo.Time_v.Period_Start and
> Sales_Enquiry.dbo.Time_v.Period_End )
> AND ( dbo.Customers.Sales_Channel=dbo.SCFlattened.SCSubT 6 )
> AND (
> ( (
> Sales_Enquiry.dbo.Time_v.Year ) = ( (select Current_Year from
> BusObj_Control) ) )
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Type IN (' ', 'S',
> 'U')
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Categ ory NOT IN ('V',
> 'R')
> AND
> dbo.Customers.Customer NOT IN ('61384 ', '68812 ', '61383 ')
> AND
> dbo.Title_Hierarchy.ISBN NOT IN ('0141014075')
> )
> GROUP BY
> Sales_Enquiry.dbo.Time_v.Year, Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
>
>
> The table sizes are as follows:
>
> Sales_Details_Extract (view containing ~ 70 million rows)
> Title_Hierarchy (Table containing ~180k rows)
> Customers, (Table containing ~60k rows)
> SCFlattened (Table containing ~30k rows)
> Time_v (view containing ~ 80 rows)
>
> Total DB Size ~ 33GB
>
>
> On new hardware running Windows 2003 Server and SQL Server 2000 Enterprise
> Ed. The same query causes tempdb to fill (56GB) before we run out of disk
> resource. The database is a direct copy i.e. db detached and attached.
> The
> indexes have recently been rebuilt and the 'auto update stats' option is
> selected. The query plan seems to indicate a scan of the entire
> Sales_Details (70 million rows) is taking place, and suspect a Cartesian
> product is the cause of tempdb growing so large.
>
> We've tried unsuccessfully placing the database including tempdb into 'SQL
> 7
> Compatibility Mode'
>
> Any Suggestions would be gratefully received. I have copies of the
> execution plans if interested.
>
> Regards,
>
> Ian (ichinds@.hotmail.com)
>
>
>
>
|||Ian,
I don't know how much influence you have over the query or the schema. I
assume you cannot change the query.
First of all, make sure you statistics are up to date. Run UPDATE
STATISTICS when in doubt.
Next, you could check if all join keys are of the same data type. For
example, column Time_v.Year should have the exact same data type (for
optimal performance) as BusObj_Control.Current_Year.
Also check if the columns in the WHERE clause have the same data type as
the literals of the expressions. For example, column Customers.Customer
should be char or varchar in order to match the data type of the literal
'61384 '.
Hope this helps,
Gert-Jan
Ian wrote:
> Hello All,
> Can anybody throw some light on this problem...
> In SQL Server version 7, running on NT 4, the following query (generated by
> Business Objects, not me!) can be successfully executed in query analyser
> and returns correct results without heavily utilising tembdb. A seek is
> performed, based on date, on a clustered index only returning the rows from
> Sales_Details_Extract (fact table) that are required:
> SELECT
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value) ,
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value)
> )
> ELSE 0 END),
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Quanti ty)
> )
> ELSE 0 END),
> Sales_Enquiry.dbo.Time_v.Year,
> Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
> FROM
> Sales_Enquiry.dbo.Sales_Details_Extract,
> dbo.Title_Hierarchy,
> dbo.Customers,
> dbo.SCFlattened,
> Sales_Enquiry.dbo.Time_v
> WHERE
> ( Sales_Enquiry.dbo.Sales_Details_Extract.ISBN=dbo.T itle_Hierarchy.ISBN and
> Sales_Enquiry.dbo.Sales_Details_Extract.Source_Sys tem=dbo.Title_Hierarchy.So
> urce_System )
> AND (
> Sales_Enquiry.dbo.Sales_Details_Extract.Customer=d bo.Customers.Customer )
> AND ( Sales_Enquiry.dbo.Sales_Details_Extract.Date between
> Sales_Enquiry.dbo.Time_v.Period_Start and
> Sales_Enquiry.dbo.Time_v.Period_End )
> AND ( dbo.Customers.Sales_Channel=dbo.SCFlattened.SCSubT 6 )
> AND (
> ( (
> Sales_Enquiry.dbo.Time_v.Year ) = ( (select Current_Year from
> BusObj_Control) ) )
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Type IN (' ', 'S',
> 'U')
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Categ ory NOT IN ('V',
> 'R')
> AND
> dbo.Customers.Customer NOT IN ('61384 ', '68812 ', '61383 ')
> AND
> dbo.Title_Hierarchy.ISBN NOT IN ('0141014075')
> )
> GROUP BY
> Sales_Enquiry.dbo.Time_v.Year, Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
> The table sizes are as follows:
> Sales_Details_Extract (view containing ~ 70 million rows)
> Title_Hierarchy (Table containing ~180k rows)
> Customers, (Table containing ~60k rows)
> SCFlattened (Table containing ~30k rows)
> Time_v (view containing ~ 80 rows)
> Total DB Size ~ 33GB
> On new hardware running Windows 2003 Server and SQL Server 2000 Enterprise
> Ed. The same query causes tempdb to fill (56GB) before we run out of disk
> resource. The database is a direct copy i.e. db detached and attached. The
> indexes have recently been rebuilt and the 'auto update stats' option is
> selected. The query plan seems to indicate a scan of the entire
> Sales_Details (70 million rows) is taking place, and suspect a Cartesian
> product is the cause of tempdb growing so large.
> We've tried unsuccessfully placing the database including tempdb into 'SQL 7
> Compatibility Mode'
> Any Suggestions would be gratefully received. I have copies of the
> execution plans if interested.
> Regards,
> Ian (ichinds@.hotmail.com)
Optimiser issues between SQL Server 7 and SQL Server 2000 Enterprise
Can anybody throw some light on this problem...
In SQL Server version 7, running on NT 4, the following query (generated by
Business Objects, not me!) can be successfully executed in query analyser
and returns correct results without heavily utilising tembdb. A seek is
performed, based on date, on a clustered index only returning the rows from
Sales_Details_Extract (fact table) that are required:
SELECT
sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value),
(CASE WHEN
(
Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
BusObj_Control) )
THEN (
sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value)
)
ELSE 0 END),
(CASE WHEN
(
Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
BusObj_Control) )
THEN (
sum(Sales_Enquiry.dbo.Sales_Details_Extract.Quantity)
)
ELSE 0 END),
Sales_Enquiry.dbo.Time_v.Year,
Sales_Enquiry.dbo.Time_v.Month,
dbo.SCFlattened.SCSubT2Desc
FROM
Sales_Enquiry.dbo.Sales_Details_Extract,
dbo.Title_Hierarchy,
dbo.Customers,
dbo.SCFlattened,
Sales_Enquiry.dbo.Time_v
WHERE
( Sales_Enquiry.dbo.Sales_Details_Extract.ISBN=dbo.Title_Hierarchy.ISBN and
Sales_Enquiry.dbo.Sales_Details_Extract.Source_System=dbo.Title_Hierarchy.So
urce_System )
AND (
Sales_Enquiry.dbo.Sales_Details_Extract.Customer=dbo.Customers.Customer )
AND ( Sales_Enquiry.dbo.Sales_Details_Extract.Date between
Sales_Enquiry.dbo.Time_v.Period_Start and
Sales_Enquiry.dbo.Time_v.Period_End )
AND ( dbo.Customers.Sales_Channel=dbo.SCFlattened.SCSubT6 )
AND (
( (
Sales_Enquiry.dbo.Time_v.Year ) = ( (select Current_Year from
BusObj_Control) ) )
AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Type IN (' ', 'S',
'U')
AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Category NOT IN ('V',
'R')
AND
dbo.Customers.Customer NOT IN ('61384 ', '68812 ', '61383 ')
AND
dbo.Title_Hierarchy.ISBN NOT IN ('0141014075')
)
GROUP BY
Sales_Enquiry.dbo.Time_v.Year, Sales_Enquiry.dbo.Time_v.Month,
dbo.SCFlattened.SCSubT2Desc
The table sizes are as follows:
Sales_Details_Extract (view containing ~ 70 million rows)
Title_Hierarchy (Table containing ~180k rows)
Customers, (Table containing ~60k rows)
SCFlattened (Table containing ~30k rows)
Time_v (view containing ~ 80 rows)
Total DB Size ~ 33GB
On new hardware running Windows 2003 Server and SQL Server 2000 Enterprise
Ed. The same query causes tempdb to fill (56GB) before we run out of disk
resource. The database is a direct copy i.e. db detached and attached. The
indexes have recently been rebuilt and the 'auto update stats' option is
selected. The query plan seems to indicate a scan of the entire
Sales_Details (70 million rows) is taking place, and suspect a Cartesian
product is the cause of tempdb growing so large.
We've tried unsuccessfully placing the database including tempdb into 'SQL 7
Compatibility Mode'
Any Suggestions would be gratefully received. I have copies of the
execution plans if interested.
Regards,
Ian (ichinds@.hotmail.com)Have you tried using query hints to influence the plan? "query hints" in
books online has some description.
If you want to try the compatibility mode, then it should be set on your
database, not tempdb.
Wei Xiao [MSFT]
SQL Server Storage Engine Development
http://weblogs.asp.net/weix
This posting is provided "AS IS" with no warranties, and confers no rights.
"Ian" <ian.hinds@.pearsontc.co.uk> wrote in message
news:O%23r3sSX0EHA.3908@.TK2MSFTNGP12.phx.gbl...
> Hello All,
>
> Can anybody throw some light on this problem...
>
> In SQL Server version 7, running on NT 4, the following query (generated
> by
> Business Objects, not me!) can be successfully executed in query analyser
> and returns correct results without heavily utilising tembdb. A seek is
> performed, based on date, on a clustered index only returning the rows
> from
> Sales_Details_Extract (fact table) that are required:
>
> SELECT
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value),
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value)
> )
> ELSE 0 END),
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Quantity)
> )
> ELSE 0 END),
> Sales_Enquiry.dbo.Time_v.Year,
> Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
> FROM
> Sales_Enquiry.dbo.Sales_Details_Extract,
> dbo.Title_Hierarchy,
> dbo.Customers,
> dbo.SCFlattened,
> Sales_Enquiry.dbo.Time_v
> WHERE
> ( Sales_Enquiry.dbo.Sales_Details_Extract.ISBN=dbo.Title_Hierarchy.ISBN
> and
> Sales_Enquiry.dbo.Sales_Details_Extract.Source_System=dbo.Title_Hierarchy.
So
> urce_System )
> AND (
> Sales_Enquiry.dbo.Sales_Details_Extract.Customer=dbo.Customers.Customer )
> AND ( Sales_Enquiry.dbo.Sales_Details_Extract.Date between
> Sales_Enquiry.dbo.Time_v.Period_Start and
> Sales_Enquiry.dbo.Time_v.Period_End )
> AND ( dbo.Customers.Sales_Channel=dbo.SCFlattened.SCSubT6 )
> AND (
> ( (
> Sales_Enquiry.dbo.Time_v.Year ) = ( (select Current_Year from
> BusObj_Control) ) )
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Type IN (' ', 'S',
> 'U')
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Category NOT IN ('V',
> 'R')
> AND
> dbo.Customers.Customer NOT IN ('61384 ', '68812 ', '61383 ')
> AND
> dbo.Title_Hierarchy.ISBN NOT IN ('0141014075')
> )
> GROUP BY
> Sales_Enquiry.dbo.Time_v.Year, Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
>
>
> The table sizes are as follows:
>
> Sales_Details_Extract (view containing ~ 70 million rows)
> Title_Hierarchy (Table containing ~180k rows)
> Customers, (Table containing ~60k rows)
> SCFlattened (Table containing ~30k rows)
> Time_v (view containing ~ 80 rows)
>
> Total DB Size ~ 33GB
>
>
> On new hardware running Windows 2003 Server and SQL Server 2000 Enterprise
> Ed. The same query causes tempdb to fill (56GB) before we run out of disk
> resource. The database is a direct copy i.e. db detached and attached.
> The
> indexes have recently been rebuilt and the 'auto update stats' option is
> selected. The query plan seems to indicate a scan of the entire
> Sales_Details (70 million rows) is taking place, and suspect a Cartesian
> product is the cause of tempdb growing so large.
>
> We've tried unsuccessfully placing the database including tempdb into 'SQL
> 7
> Compatibility Mode'
>
> Any Suggestions would be gratefully received. I have copies of the
> execution plans if interested.
>
> Regards,
>
> Ian (ichinds@.hotmail.com)
>
>
>
>|||Ian,
I don't know how much influence you have over the query or the schema. I
assume you cannot change the query.
First of all, make sure you statistics are up to date. Run UPDATE
STATISTICS when in doubt.
Next, you could check if all join keys are of the same data type. For
example, column Time_v.Year should have the exact same data type (for
optimal performance) as BusObj_Control.Current_Year.
Also check if the columns in the WHERE clause have the same data type as
the literals of the expressions. For example, column Customers.Customer
should be char or varchar in order to match the data type of the literal
'61384 '.
Hope this helps,
Gert-Jan
Ian wrote:
> Hello All,
> Can anybody throw some light on this problem...
> In SQL Server version 7, running on NT 4, the following query (generated b
y
> Business Objects, not me!) can be successfully executed in query analyser
> and returns correct results without heavily utilising tembdb. A seek is
> performed, based on date, on a clustered index only returning the rows fro
m
> Sales_Details_Extract (fact table) that are required:
> SELECT
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value),
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Value)
> )
> ELSE 0 END),
> (CASE WHEN
> (
> Sales_Enquiry.dbo.Time_v.Month )=( (SELECT Current_Month FROM
> BusObj_Control) )
> THEN (
> sum(Sales_Enquiry.dbo.Sales_Details_Extract.Quantity)
> )
> ELSE 0 END),
> Sales_Enquiry.dbo.Time_v.Year,
> Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
> FROM
> Sales_Enquiry.dbo.Sales_Details_Extract,
> dbo.Title_Hierarchy,
> dbo.Customers,
> dbo.SCFlattened,
> Sales_Enquiry.dbo.Time_v
> WHERE
> ( Sales_Enquiry.dbo.Sales_Details_Extract.ISBN=dbo.Title_Hierarchy.ISBN an
d
> Sales_Enquiry.dbo.Sales_Details_Extract.Source_System=dbo.Title_Hierarchy.
So
> urce_System )
> AND (
> Sales_Enquiry.dbo.Sales_Details_Extract.Customer=dbo.Customers.Customer )
> AND ( Sales_Enquiry.dbo.Sales_Details_Extract.Date between
> Sales_Enquiry.dbo.Time_v.Period_Start and
> Sales_Enquiry.dbo.Time_v.Period_End )
> AND ( dbo.Customers.Sales_Channel=dbo.SCFlattened.SCSubT6 )
> AND (
> ( (
> Sales_Enquiry.dbo.Time_v.Year ) = ( (select Current_Year from
> BusObj_Control) ) )
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Type IN (' ', 'S',
> 'U')
> AND Sales_Enquiry.dbo.Sales_Details_Extract.Sale_Category NOT IN ('V'
,
> 'R')
> AND
> dbo.Customers.Customer NOT IN ('61384 ', '68812 ', '61383 ')
> AND
> dbo.Title_Hierarchy.ISBN NOT IN ('0141014075')
> )
> GROUP BY
> Sales_Enquiry.dbo.Time_v.Year, Sales_Enquiry.dbo.Time_v.Month,
> dbo.SCFlattened.SCSubT2Desc
> The table sizes are as follows:
> Sales_Details_Extract (view containing ~ 70 million rows)
> Title_Hierarchy (Table containing ~180k rows)
> Customers, (Table containing ~60k rows)
> SCFlattened (Table containing ~30k rows)
> Time_v (view containing ~ 80 rows)
> Total DB Size ~ 33GB
> On new hardware running Windows 2003 Server and SQL Server 2000 Enterprise
> Ed. The same query causes tempdb to fill (56GB) before we run out of disk
> resource. The database is a direct copy i.e. db detached and attached. T
he
> indexes have recently been rebuilt and the 'auto update stats' option is
> selected. The query plan seems to indicate a scan of the entire
> Sales_Details (70 million rows) is taking place, and suspect a Cartesian
> product is the cause of tempdb growing so large.
> We've tried unsuccessfully placing the database including tempdb into 'SQL
7
> Compatibility Mode'
> Any Suggestions would be gratefully received. I have copies of the
> execution plans if interested.
> Regards,
> Ian (ichinds@.hotmail.com)