Hi,
I am used to writing Sub-Correlated queries within my main queries. Although they work fine but i have read alot that they have performance hits. Also, as with time our data has increased, a simple SELECT statement with a few Sub-Queries tends to run slower which may be between 10-15 seconds. Following will be a simple example of what i mostly do:
SELECT DISTINCT C.CusID, C.Name, C.Age,
(
SELECT SUM (Price)
FROM CusotmerOrder
WHERE CusID_fk = CO.CusID_fk
) Total_Order_Price,
(
SELECT SUM (Concession)
FROM CusotmerOrder
WHERE CusID_fk = CO.CusID_fk
) Total_Order_Concession,
(
SELECT SUM (Price) - SUM (Concession)
FROM CusotmerOrder
WHERE CusID_fk = CO.CusID_fk
) Total_Difference
FROM Customer C
INNER JOIN CustomerOrder CO
ON C.CusID = CO.CusID_fk
.....
WHERE (conditions...)
My question is what would be a better way to handle the above query? How can i write a better yet simple query with optimized performance. I would also mention that in some of my asp.net applications, i use inline queries assigned to SqlCommand Object. The reason i mention it that since these queries are written in some class files, how would we still accomplish what i have mentioned above.Kindly could anyQuery Guru guide me writing better queries. I shall be obliged...
SELECT CustomerID, SUM(Price), SUM(Concession), SUM(Price-Concession)
FROM CusttomerOrder
GROUP BY CustomerID
|||You can also use the above as a derived table and join on it, like:
SELECT Customers.CustomerID, Customers.Name, Customers.Age, OrdersTotals.TotalPrice, OrdersTotals.TotalConcession, OrdersTotals.TotalDifference
FROM Customers
INNER JOIN (
SELECT CustomerID, SUM(Price) AS TotalPrice, SUM(Concession) AS TotalConcession, SUM(Price-Concession) AS TotalDifference
FROM Orders
) AS OrdersTotals
ON Customers.CustomerID = OrdersTotals.CustomerID
|||Hi Valenumr,
Thanks alot man, wow, you made things look so easy. The queries i have been working with were really optimized and the time went down from 10 sec to 1 sec. Greate help man and I learnt a great new technique.
Valenumr, now I need one more improvement on the above query. What if I want to get the same query working but on a day by day basis? Actually I use this query to generate a weekly report but it takes a long time. Also I tried to use the same for a monthly report, but the time exceeded 5 minutes. How can I improve the following query (im using the date add function to add days to given date):
Declare @.DateStart DateTime
Set @.DateStart = '2006-05-01'
SELECT DISTINCT C.CusID, C.Name, C.Age,
(
SELECT SUM (Price)
FROM CusotmerOrder
WHERE CusID_fk = CO.CusID_fk
AND Shop_Date = dateadd(dd,0, @.DateStart)
) Customer_Sales_Day1,
(
SELECT SUM (Price)
FROM CusotmerOrder
WHERE CusID_fk = CO.CusID_fk
AND Shop_Date = dateadd(dd,1, @.DateStart)
) Customer_Sales_Day2,
.................. continued for seven days ..................
(
SELECT SUM (Price)
FROM CusotmerOrder
WHERE CusID_fk = CO.CusID_fk
AND Shop_Date = dateadd(dd,6, @.DateStart)
) Customer_Sales_Day7,
FROM Customer C ...
WHERE (condition)
I look forward to your great help, thanks a lot...
|||I think you want something like this:
SELECT Customer.CustomerID, Customer.Name, Customer.Age, OrderSummary.Date, OrderSummary.TotalPriceFROM CustomersINNER JOIN ( SELECT CustomerID, SUM(Price) AS TotalPrice, Shop_Date AS Date FROM Orders GROUP BY CustomerID, Shop_Date HAVING Shop_Date BETWEEN Start_Date AND End_Date) AS OrderSummaryON Customers.CustomerID = OrdersSummary.CustomerID|||
Hi Valenumr,
Thanks for this query once again. It really worked like a charm and taught me a new technique once again. If you dont mind helping a bit more, kindly can i ask for one more help? This query has worked absolutely fine without any problem for me. What if a record does not exists for any given date between the given dates. In that case, i would like to have a 0 for the sum. e.g. For a customer, if records exist for dates 15, 16, 17, 19, 20 and 21 May, 2006 and a record for 18th May does not exists, then i would want to inset a 0 for 18th May, 2006. i want the records to look like the following:
EmpID Name D1 D2 D3 D4 D5 D6 D7
1001 ABC 200 100 90 0 9 74 10
Once again thanks alot for all the help you have given. You help has been very valuable to me. Cheers...
|||one more thing. i have observed that whenever i use the Group By clause, it asks me to include all the SELECT ColumnsNames in the Group By clause. the message is"Column 'ColumnName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause". is there a way that i can include only the required columns in the Group By clause.
|||What you are trying to do is a pivot (or cross-tab) query. I am not very familiar with doing that in SQL, but there is a feature in SQL 2005. It would look something like this.
SELECT *FROM (SELECT EmpID,Name, Price,CONVERT(nvarchar(32), OrderDate, 106)AS OrderDateFROM OrdersWHERE OrderDateBETWEEN N'01/01/2007'AND N'01/07/2007')AS OrdersPIVOT (SUM(Price)FOR OrderDateIN ( [01 Jan 2007], [02 Jan 2007], [03 Jan 2007], [04 Jan 2007], [05 Jan 2007], [06 Jan 2007], [07 Jan 2007] ))AS PivotTable
Anyhow, for the date, you most likely want to do some type of conversion, otherwise the column names will have the long date format. I just used 106 as an example format, which is dd MMM yyyy, which you should note is how the pivot columns are named in the FOR ... IN (column_list).
|||Hi Valenumr,
Yes your absolutely right that this is a PIVOT Technique. Some1 else also pointed out the same technique in a different post. Infact i do have the PIVOT Technique (not the built in PIVOT function) working. Anyways, can you answer one final question (hope its the last :)). Why do we have to include all the columns from the SELECT clause in our Group By clause otherwise it gives the message "Column 'ColumnName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause" or im mistaken in understanding this point.
Valenumn, thanx alot for your time and patience. If i could vote you to be a MVP, i would have :)
|||Say you do something like "SELECT A, B, SUM(C)..."
It seems sort of silly from the point of view that A, B must be specified in the group by clause, since they are in the select list and you are using an aggregate function (SUM) in the select list. You would almost think that by default saying GROUP BY A, B is redundant.
Well, basically, the SUM(C) has to know what perform the aggregate sum over, and the GROUP BY A, B is like the distict set of A / B combinations in the record source (which you should see corresponds to your SELECT DISTINCT statements).
However, you could GROUP BY A, B, D, E, etc. event though D and E are not in the SELECT list. I guess it is probably easier from the SQL specification / implementation perspective to require the GROUP BY to have a complete specification, and that any columns in the select list that aren't aggregated exist in that specification.
It would be awkward to say "SELECT A, B, SUM(C)... GROUP BY D" and have that *imply" A and B in the group by list, because it isn't clear. Also, what if you say "SELECT SUM(C)...", so in this case, you must have *some* group by, or should the engine assume that if you don't, "GROUP BY C" is implied... well, then going back to the original example, should C be included in the GROUP BY list if none is specified? The results of GROUP BY A, B, C will be very different (usually) than GROUP BY A, B.
So, I guess there are just way too many cases to consider that would lack direct clarity if the spec was any different. The GROUP BY clause must be explicit.
Finally, Imagine this... "SELECT A, B, SUM(C)... GROUP BY D"... well, what does that mean exactly? how can you GROUP BY D when for every D you have more than one combination of A / B, so that doesn't really make sense (going back to the SELECT DISTINCT analogy).
No comments:
Post a Comment