Monday, March 19, 2012

Optimiser problem? Takes ages to retrieve next key in 4-segment

Nikki,
You are absolutely correct in your view of this, but unfortunately,
I don't think there is a way in T-SQL to deal with this. It might
become easier with the analytic functions of SQL Server 2005, but
that may not be true and doesn't help you much now... It may also
be that there is a clever way of doing this with cursors that I'm not
seeing, but it would be a shame to have to do that.
If it weren't for the datetime column, you could create (and perhaps
get away with indexing) a computed table column or view column that
concatenated fixed-length string versions of the index columns. Datetime
conversions are considered non-deterministic or imprecise and can't be
indexed, so you would have to use a decimal column or separate serial
date and time columns stored as integers or strings.
Not fun, but if this is a serious concern for you, it might be worth
the trouble of considering. The concatenation is also the closest you can
come to a "clean" way of expressing this, but it won't be as efficient as
it should be:
select top 1 *
from [Requirement Detail]
WHERE
ORDER_11
+ CONVERT(PRTNUM_11,15)
+ CONVERT(CHAR(30),CURDUE_11,121)
+ TYPE_11
>
@.ORDER_11
+ CONVERT(@.PRTNUM_11,15)
+ CONVERT(CHAR(30),@.CURDUE_11,121)
+ @.TYPE_11
ORDER BY ORDER_11, PRTNUM_11, CURDUE_11, TYPE_11
You might get lucky with this, depending on the
distribution of data you have. Adding this or other
conditions you know can use indexes and are true can
help:
ORDER_11 >= @.ORDER_11
At the risk of increasing your frustration more than highlighting your sense
, I'll
point out that if SQL Server implemented row constructors according
to the ANSI SQL standard, it would be very easy:
-- WARNING: DON'T TRY THIS IN T-SQL. :(
-- USING T-SQL TOP also:
SELECT TOP 1 *
FROM [Requirement Detail]
WHERE (ORDER_11, PRTNUM_11, CURDUE_11, TYPE_11)
> (@.ORDER_11, @.PRTNUM_11, @.CURDUE_11, @.TYPE_11)
ORDER BY ORDER_11, PRTNUM_11, CURDUE_11, TYPE_11
ANSI SQL doesn't have TOP, and I don't have my copy handy
to be sure MAX can be used with row constructors, but the
ANSI version might be one of these:
SELECT *
FROM [Requirement Detail]
WHERE (ORDER_11, PRTNUM_11, CURDUE_11, TYPE_11) = (
SELECT MIN((ORDER_11, PRTNUM_11, CURDUE_11, TYPE_11))
FROM [Requirement Detail]
WHERE (ORDER_11, PRTNUM_11, CURDUE_11, TYPE_11)
> (@.ORDER_11, @.PRTNUM_11, @.CURDUE_11, @.TYPE_11)
)
or,
SELECT *
FROM [Requirement Detail] AS RD1
WHERE NOT EXISTS (
SELECT * FROM [Requirement Detail] AS RD2
WHERE (RD2.ORDER_11, RD2.PRTNUM_11, RD2.CURDUE_11, RD2.TYPE_11)
> (@.ORDER_11, @.PRTNUM_11, @.CURDUE_11, @.TYPE_11)
AND (RD2.ORDER_11, RD2.PRTNUM_11, RD2.CURDUE_11, RD2.TYPE_11)
< (RD1.ORDER_11, RD1.PRTNUM_11, RD1.CURDUE_11, RD1.TYPE_11)
)
SK
Nikki Locke wrote:

> Thanks for your reply.
> Unfortunately clustered indexes are out. If we did have a clustered index,
> it would be on the most frequently used key (which is not this one,
> unfortunately).
> Is there a better way or wording the query? All I want it to do is to make
> a key out of the 4 fields provided, look the key up in the index, and
> return either the found record, or the next record in the index if the
> specified one doesn't exist.
> When stated like that, it is obviously a very cheap operation. If I could
> only find a way of telling SQL server that was what I wanted, all would be
> fine.
>SQL Server does support ALL, but Hugo may have been (and I certainly
was) lamenting the fact that SQL Server doesn't support multicolumn
comparisons of the sort (a1,b1,c1) < (a2,b2,c2). Queries that use
ALL can fairly easily be rewritten without ALL, but queries that use
multicolumn comparisons cannot, and the rewrite, in this case
(
(a1 < a2)
OR
((a1 = a2) and (b1 < b2))
OR
((a1 = a2) and (b1 = b2) and (c1 < c2))
)
is not (that I've seen) optimized to take advantage of an index
on (a,b,c), which ought to help out here.
SK
oj wrote:

> Hugo,
> I haven't been following the entire thread. But sqlserver doe support ALL
> (to some extend).
> http://msdn.microsoft.com/library/e..._qd_11_1sz0.asp
>|||Sounds good! Thanks for the followup.
SK
Nikki Locke wrote:

>Thankyou very much for your detailed and useful reply.
>I now have a much better understanding of the problem, and can start to
>approach it from a different angle.
>I already have a query which executes in 3 msecs (as opposed to 3 secs for
>the original), which is as follows...
>declare @.ORDER_11 nchar(10)
>declare @.PRTNUM_11 nvarchar(15)
>declare @.CURDUE_11 smalldatetime
>declare @.TYPE_11 nchar(2)
>declare @.KEY nchar(35)
>set @.ORDER_11 = '5480184500'
>set @.PRTNUM_11 = '548000000000000'
>set @.CURDUE_11 = '2005-04-20 12:00:00'
>set @.TYPE_11 = 'RQ'
>set @.KEY = @.ORDER_11 + @.PRTNUM_11 + CONVERT(nchar(8), @.CURDUE_11, 112) +
>@.TYPE_11
> SELECT TOP 1
> UNQKEY_11,
> Convert(Money, TIMESTAMP_11),
> ORDER_11,
> PRTNUM_11,
> CURDUE_11,
> TYPE_11
> FROM
> [dbo].[Requirement Detail]
> WHERE
> ((ORDER_11>=@.ORDER_11)) -- the major part of the key
> AND
> ORDER_11 + PRTNUM_11 + CONVERT(nchar(8), CURDUE_11, 112) + TYPE_11
>
> ORDER BY
> ORDER_11,
> PRTNUM_11,
> CURDUE_11,
> TYPE_11
>GO
>As far as I can see, provided there aren't thousands of rows where
>ORDER_11>=@.ORDER_11 but the rest of the condition is not satisfied, this is
>pretty optimal. But I'm still going to fiddle to see if I can improve the
>original query to cut down on the amount of rewriting I have to do on the
>stored procedures (there are hundreds of them!).
>[Aside] I used style 113 and length 8 for the date query because I happen
>to know the time part of the date is not significant in the real data.
>
>

No comments:

Post a Comment