
Friday, March 30, 2012

OPTIMIZING QUERY

I query on two fields, myHour and myCountry. There are
about 5 million records.
The myHour, myCountry and cpm fields are indexed, but the query returns too slowly:
SELECT SUM(cpm) from dbilgi where myHour >= '2006-02-05 00:00:00' and
myHour <= '2006-02-05 23:59:59' and myCountry = 'TR'
When I add myCountry = 'TR' it gets slower: it takes 45 seconds to
return the data after I run the query.
How can I optimize my query?|||You mean you have three separate indexes: one on myHour, one on myCountry,
and one on cpm? In that case try a covering index on all three
columns; that should speed things up.
dean
"Savas Ates" <in da club> wrote in message
news:%236XzUo2KGHA.1088@tk2msftngp13.phx.gbl...
|||Without knowing too much about the table structure, or whether there are any
clustered indexes, here is what is (probably) happening.
SELECT SUM(cpm) from dbilgi where myHour >= '2006-02-05 00:00:00' and
myHour <= '2006-02-05 23:59:59'
would (probably) result in an index seek on the nonclustered index for
myHour, with a bookmark lookup to either the clustered index or table. The result
is returned fairly quickly.
Adding the condition for myCountry
SELECT SUM(cpm) from dbilgi where myHour >= '2006-02-05 00:00:00' and
myHour <= '2006-02-05 23:59:59' and myCountry = 'TR'
The optimizer sees that either
- the majority of rows in dbilgi have myCountry = 'TR', or
- the majority of rows in dbilgi, where myHour is between '2006-02-05
00:00:00' and '2006-02-05 23:59:59', have myCountry = 'TR'
and chooses to perform a table scan (or clustered index scan) instead of
using the indexes. Here the optimizer decides that the cost of the bookmark
lookups will be more expensive than just scanning the whole table (or
clustered index).
As Dean mentioned in an earlier reply, you could create a composite index on
myCountry, myHour, and cpm to make the above query faster, but building that
index may take quite a long time on a table with 5 million+ rows.
You could also try using the WITH (INDEX(index_name)) table hint to force
the use of your nonclustered indexes on myCountry and myHour.
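As a sketch of the composite index Dean suggested (the column order here is an assumption: equality column first, then the range column, with cpm last so the index covers the SUM):
```sql
-- Covering index for the SUM(cpm) query: the equality predicate
-- (myCountry) leads, the range predicate (myHour) follows, and cpm
-- is in the key so no bookmark lookups are needed.
CREATE INDEX IX_dbilgi_Country_Hour_Cpm
ON dbilgi (myCountry, myHour, cpm)
```
On SQL Server 2005 and later, cpm could instead go in an INCLUDE clause to keep the index key smaller.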
"Savas Ates" wrote:
|||Savas Ates (in da club) writes:
Judging from this query alone, a clustered index on myHour could be a
good bet. Or a non-clustered index on (myHour, myCountry, cpm) or
even (myCountry, myHour, cpm). But the latter index would not be
used for the query without myCountry.
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||In addition to what everyone else has said: check out the execution plan and see what is
happening and what is most costly in each. Do this again after adding
indexes. Figuring out what goes on internally will make this kind of stuff
easier.
One other little point: the value '2006-02-05 23:59:59' as an end date has
two problems.
For smalldatetime:
declare @date smalldatetime
set @date = '2006-02-05 23:59:59'
select @date
Returns:
2006-02-06 00:00:00
For datetime:
declare @date datetime
set @date = '2006-02-05 23:59:59.003'
select @date
select case when @date <= '2006-02-05 23:59:59' then 1 else 0 end
Returns:
0
Because of this, your where clause leaves a one-second gap. Use '2006-02-05
23:59:59.997' instead (it is only 1 second, but it can make a difference).
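A half-open date range sidesteps the boundary problem entirely and behaves the same for datetime and smalldatetime:
```sql
-- No upper-boundary rounding issue: the range covers the whole day.
SELECT SUM(cpm)
FROM dbilgi
WHERE myHour >= '2006-02-05' AND myHour < '2006-02-06'
  AND myCountry = 'TR'
```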
----
Louis Davidson - http://spaces.msn.com/members/drsql/
SQL Server MVP
"Arguments are to be avoided: they are always vulgar and often convincing."
(Oscar Wilde)
"Savas Ates" <in da club> wrote in message
news:%236XzUo2KGHA.1088@tk2msftngp13.phx.gbl...

Optimizing queries

Friends,
I have the following table with at least 400,000+ records and I want to
update some of its fields with the logic below.
STOCKEXCDOWNLOAD
I ran the below query in Query Analyzer with no indexes on this
table; the time taken was about 40 seconds.
UPDATE StockExcDownload SET CalCheqAmt = Quantity * (CASE WHEN Price =
9999.00
THEN (SELECT MktPrice FROM Issue) ELSE Price END), BSENSEFlag = 'B',
EditDate = getdate()
But the Index Tuning Wizard suggested the following index be
created:
CREATE
INDEX [StockExcDownload1] ON [dbo].[StockExcDownload] ([Price],
[Quantity])
WITH
DROP_EXISTING
ON [PRIMARY]
But after implementing the above index, the query took
about 7 minutes to execute.
What is the reason for this?|||rameshsaive@gmail.com wrote:
The subquery:
(SELECT MktPrice FROM Issue)
is invalid unless Issue contains no more than ONE row. I suspect this
may be part of the problem but without more info I can only guess what
the solution is. Read my signature.
Why do you want to calculate an amount on the table if it can already
be derived from other tables in the database? Don't store calculated
results if you can avoid it. Put the calcs in a view or query.
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--|||Thanks a lot, David.
At any time the Issue table contains only one record.
I've modified the query to:
UPDATE StockExcDownload SET CalCheqAmt = Quantity * (CASE WHEN Price =
9999.00 THEN 34 ELSE Price END), BSENSEFlag = 'B',
EditDate = getdate()
It has improved a little (it's down to 4 minutes 45 seconds), but I still
want to reduce the time, as it is too costly for me to use this query.
It is also one of the queries in the stored procedure. Below is
the list of queries preceding the above query. I've tried the above query
for indexing.
UPDATE StockExcDownload SET CatgNm = (SELECT DISTINCT TOP 1 CAT.CatgNm
FROM Category CAT, InvType INV
WHERE CAT.CatgNm = 'EMP' AND CAT.CatgCD = INV.CatgCD AND INV.ITCD =
StockExcDownload.InvstNm AND StockExcDownload.InvstNm = 'EMP'
ORDER BY CAT.CatgNm) WHERE StockExcDownload.CatgNm IS NULL AND
StockExcDownload.BSENSEFlag = 'B'
UPDATE StockExcDownload SET CatgNm = 'RETAIL' WHERE CalCheqAmt <=
100000 AND StockExcDownload.BSENSEFlag = 'B'
AND StockExcDownload.CatgNm IS NULL
UPDATE StockExcDownload SET CatgNm = (SELECT DISTINCT TOP 1 CAT.CatgNm
FROM Category CAT, InvType INV
WHERE StockExcDownload.CalCheqAmt BETWEEN CAT.MinAmt AND CAT.MaxAmt AND
CAT.CatgCD = INV.CatgCD
AND INV.ITCD = StockExcDownload.InvstNm ORDER BY CAT.CatgNm)
WHERE StockExcDownload.CatgNm IS NULL AND StockExcDownload.BSENSEFlag =
'B'
UPDATE StockExcDownload SET Catgnm = (SELECT DISTINCT TOP 1 CAT.CatgNm
FROM Category CAT
WHERE StockExcDownload.CalCheqAmt BETWEEN CAT.MinAmt AND CAT.MaxAmt
ORDER BY CAT.CatgNm)
WHERE StockExcDownload.CatgNm IS NULL AND StockExcDownload.BSENSEFlag =
'B'
UPDATE StockExcDownload SET CatgNm = 'RETAIL' WHERE CatgNm IS NULL AND
BSENSEFlag = 'B'
UPDATE StockExcDownload SET SyndNm = T.SyndNm, Name = T.BrokerNm,
CityNm = T.CityNm
FROM Terminal T WHERE T.BrokerID = StockExcDownload.UserCD AND
T.TerminalCD = StockExcDownload.BranchCD
AND StockExcDownload.BSENSEFlag = 'B'
UPDATE StockExcDownload SET BrokerFlag = LEFT(CompNm,1) FROM Company
A, Broker B
WHERE B.BrokerID = StockExcDownload.UserCD AND B.DSPML = 1 AND
StockExcDownload.BSENSEFlag = 'B'
UPDATE StockExcDownload SET CleanDirty = 'C' WHERE OrderNo IN(SELECT
TOP 1 OrderNo FROM StockExcDownload STI
WHERE STI.ApplNo1 = StockExcDownload.ApplNo1
ORDER BY STI.ApplNo1 ASC,STI.CalCheqAmt DESC,STI.Quantity DESC)
UPDATE StockExcDownload SET CleanDirty = 'D' WHERE CleanDirty IS NULL
UPDATE StockExcDownload SET SyndNm = (Case WHEN BSENSEFlag = 'B' THEN
UserCD ELSE BrokerCD END) WHERE SyndNm IS NULL
--UPDATE StockExcDownload SET SyndNm = BrokerCD WHERE SyndNm IS NULL
AND BSENSEFlag = 'N'
--UPDATE StockExcDownload SET SyndNm = UserCD WHERE SyndNm IS NULL AND
BSENSEFlag = 'B'
UPDATE StockExcDownload SET Name = BranchCD WHERE Name IS NULL
EXEC ActualCleanBidProcess|||rameshsaive@gmail.com wrote:
Indexes on StockExcDownload aren't going to help because all these
updates will perform a scan of the entire table/clustered index anyway.
Make sure you have indexes in the other tables on the columns used in
your joins.
Combine as many of the UPDATEs as you can. Using joins rather than
subqueries in the UPDATE statements may help. IMPORTANT: Make sure you
only join on keys that are unique in the table you are making the
update from.
Of those UPDATEs, seven of them don't join to any other table. Going by
the table name it looks like this is a "staging" table for
pre-processing before you move the data elsewhere. If that's the case
then you could eliminate those 7 UPDATEs altogether. Do the work in the
INSERT statement when you load to the production tables. Even if that
isn't possible, you should be able to combine those 7 UPDATEs into one,
which will be a very significant improvement.
Where you have a very large update against data that is otherwise
static, it will often help to batch the UPDATE into smaller
transactions. For example:
SET ROWCOUNT 100000
WHILE 1=1
BEGIN
UPDATE StockExcDownload
SET y = T.y
FROM tbl AS T
WHERE StockExcDownload.x = T.x
AND StockExcDownload.y IS NULL ;
IF @@ROWCOUNT = 0
BREAK
END
SET ROWCOUNT 0
Experiment with the SET ROWCOUNT option to see what size of batch works
best for you.
Hope this helps.
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--|||Based on the UPDATE statement in the original post, the advice to add an
index does not make sense, because there is no WHERE clause in this
query, so no index will speed up the data retrieval.
Looking at the other UPDATE queries, a clustered index on (BSENSEFlag)
might increase performance.
For further comments/suggestions, see inline.
"rameshsaive@gmail.com" wrote:
> Thanks a lot david.
> At any time the issue table contains only one record.
> i've modified the query to
> UPDATE StockExcDownload SET CalCheqAmt = Quantity * (CASE WHEN Price =
> 9999.00 THEN 34 ELSE Price END), BSENSEFlag = 'B',
> EditDate = getdate()
If you have chosen to create a clustered index on (BSENSEFlag), then it
might be worth your while to split the above query in the following two:
UPDATE StockExcDownload
SET CalCheqAmt = Quantity * (CASE WHEN Price = 9999.00 THEN 34 ELSE
Price END)
, BSENSEFlag = 'B'
, EditDate = getdate()
WHERE BSENSEFlag <> 'B'
UPDATE StockExcDownload
SET CalCheqAmt = Quantity * (CASE WHEN Price = 9999.00 THEN 34 ELSE
Price END)
, EditDate = getdate()
WHERE BSENSEFlag = 'B'

> It has boosted a little bit (its coming down to 4.45 Mins). But i still
> want to reduce the time as it is too costly for me to use this query.
> And also it is one of the queries in the stored procedure. below is
> the list of queries preceding above query. I've tried the above query
> for indexing.
> UPDATE StockExcDownload SET CatgNm = (SELECT DISTINCT TOP 1 CAT.CatgNm
> FROM Category CAT, InvType INV
> WHERE CAT.CatgNm = 'EMP' AND CAT.CatgCD = INV.CatgCD AND INV.ITCD =
> StockExcDownload.InvstNm AND StockExcDownload.InvstNm = 'EMP'
> ORDER BY CAT.CatgNm) WHERE StockExcDownload.CatgNm IS NULL AND
> StockExcDownload.BSENSEFlag = 'B'
Is there some kind of generator that created this query? Why not write
something like this:
UPDATE StockExcDownload
SET CatgNm = 'EMP'
WHERE StockExcDownload.CatgNm IS NULL
AND StockExcDownload.BSENSEFlag = 'B'
AND StockExcDownload.InvstNm = 'EMP'
AND EXISTS (
SELECT 1
FROM Category CAT
INNER JOIN InvType INV
ON INV.CatgCD = CAT.CatgCD
WHERE CAT.CatgNm = 'EMP'
AND INV.ITCD = StockExcDownload.InvstNm
)
There is one slight difference with your original: it will not set
CatgNm to NULL when there is no match. But since you are only updating
rows where this column is already NULL, this is actually a good thing.

> UPDATE StockExcDownload SET CatgNm = 'RETAIL' WHERE CalCheqAmt <=
> 100000 AND StockExcDownload.BSENSEFlag = 'B'
> AND StockExcDownload.CatgNm IS NULL
> UPDATE StockExcDownload SET CatgNm = (SELECT DISTINCT TOP 1 CAT.CatgNm
> FROM Category CAT, InvType INV
> WHERE StockExcDownload.CalCheqAmt BETWEEN CAT.MinAmt AND CAT.MaxAmt AND
> CAT.CatgCD = INV.CatgCD
> AND INV.ITCD = StockExcDownload.InvstNm ORDER BY CAT.CatgNm)
> WHERE StockExcDownload.CatgNm IS NULL AND StockExcDownload.BSENSEFlag =
> 'B'
The ORDER BY clause in the subquery matches the Selection List. So you
don't need to use TOP, and when you use TOP 1, there is definitely no
use for the DISTINCT keyword. You could write SELECT MIN(..) instead of
SELECT DISTINCT TOP 1 .. ORDER BY ..
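As a sketch of that rewrite, reusing the table and column names from the post:
```sql
UPDATE StockExcDownload
SET CatgNm = (SELECT MIN(CAT.CatgNm)
              FROM Category CAT
              INNER JOIN InvType INV ON INV.CatgCD = CAT.CatgCD
              WHERE StockExcDownload.CalCheqAmt BETWEEN CAT.MinAmt AND CAT.MaxAmt
                AND INV.ITCD = StockExcDownload.InvstNm)
WHERE StockExcDownload.CatgNm IS NULL
  AND StockExcDownload.BSENSEFlag = 'B'
```
MIN over the joined rows returns the same value as DISTINCT TOP 1 with an ascending ORDER BY, without the sort.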

> UPDATE StockExcDownload SET Catgnm = (SELECT DISTINCT TOP 1 CAT.CatgNm
> FROM Category CAT
> WHERE StockExcDownload.CalCheqAmt BETWEEN CAT.MinAmt AND CAT.MaxAmt
> ORDER BY CAT.CatgNm)
> WHERE StockExcDownload.CatgNm IS NULL AND StockExcDownload.BSENSEFlag =
> 'B'
> UPDATE StockExcDownload SET CatgNm = 'RETAIL' WHERE CatgNm IS NULL AND
> BSENSEFlag = 'B'
This query can be combined with one of the later UPDATEs. It just needs
an UPDATE that will cover the entire table, and a CASE expression to
retain non-NULL CatgNm values.

> UPDATE StockExcDownload SET SyndNm = T.SyndNm, Name = T.BrokerNm,
> CityNm = T.CityNm
> FROM Terminal T WHERE T.BrokerID = StockExcDownload.UserCD AND
> T.TerminalCD = StockExcDownload.BranchCD
> AND StockExcDownload.BSENSEFlag = 'B'
> UPDATE StockExcDownload SET BrokerFlag = LEFT(CompNm,1) FROM Company
> A, Broker B
> WHERE B.BrokerID = StockExcDownload.UserCD AND B.DSPML = 1 AND
> StockExcDownload.BSENSEFlag = 'B'
Ouch! This is no good. The table Company is not joined to Broker or
StockExcDownload, making it a cross join, which could be very expensive!
Also, if the column CompNm originates from table Company, then the
BrokerFlag could be set to any random Company first letter.

> UPDATE StockExcDownload SET CleanDirty = 'C' WHERE OrderNo IN(SELECT
> TOP 1 OrderNo FROM StockExcDownload STI
> WHERE STI.ApplNo1 = StockExcDownload.ApplNo1
> ORDER BY STI.ApplNo1 ASC,STI.CalCheqAmt DESC,STI.Quantity DESC)
I would change "IN" to "=".
Please note that if there are multiple OrderNo for a particular
combination of (ApplNo1, CalCheqAmt, Quantity), then the query engine
will select one of these OrderNo 'randomly'. If you want consistent
results, you might want to add OrderNo to the end of the ORDER BY
clause.
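Putting both suggestions together, the statement would look something like this (a sketch):
```sql
UPDATE StockExcDownload SET CleanDirty = 'C'
WHERE OrderNo = (SELECT TOP 1 STI.OrderNo
                 FROM StockExcDownload STI
                 WHERE STI.ApplNo1 = StockExcDownload.ApplNo1
                 ORDER BY STI.ApplNo1 ASC, STI.CalCheqAmt DESC,
                          STI.Quantity DESC, STI.OrderNo ASC)
```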

> UPDATE StockExcDownload SET CleanDirty = 'D' WHERE CleanDirty IS NULL
If CleanDirty is NULL for all rows, prior to the previous query, then
you can merge these two queries to something like this:
UPDATE StockExcDownload
SET CleanDirty = CASE WHEN OrderNo = (
SELECT ... ) THEN 'C' ELSE 'D' END
You could also add the last "CatgNm = 'RETAIL'" query to this one, which would
make it something like
UPDATE StockExcDownload
SET CleanDirty = CASE WHEN OrderNo = (
SELECT ... ) THEN 'C' ELSE 'D' END
, CatgNm = CASE WHEN (CatgNm IS NULL AND BSENSEFlag = 'B') THEN
'RETAIL' ELSE CatgNm END

> UPDATE StockExcDownload SET SyndNm = (Case WHEN BSENSEFlag = 'B' THEN
> UserCD ELSE BrokerCD END) WHERE SyndNm IS NULL
You can add this one to the previous query as well.

> --UPDATE StockExcDownload SET SyndNm = BrokerCD WHERE SyndNm IS NULL
> AND BSENSEFlag = 'N'
> --UPDATE StockExcDownload SET SyndNm = UserCD WHERE SyndNm IS NULL AND
> BSENSEFlag = 'B'
> UPDATE StockExcDownload SET Name = BranchCD WHERE Name IS NULL
And this one as well.
Hope this helps,
Gert-Jan

> EXEC ActualCleanBidProcess

Wednesday, March 28, 2012

Optimizing Job fails on tables with computed fields

Hi,
we have some tables in SQL Server 2000 with computed fields. When we try to
run an optimization job, it returns the error:
"Rebuilding indexes for table 'FI_Period'
[Microsoft SQL-DMO (ODBC SQLState: 42000)] Error 1934: [Microsoft][ODBC SQL
Server Driver][SQL Server]DBCC failed because the following SET options have
incorrect settings: 'QUOTED_IDENTIFIER'"
After removing the field the job runs without errors. I cannot see any link
between a computed field and the Quoted Identifier setting.
Any input is appreciated.
TIA, Norbert Meiss
See MSKB 301292 <http://support.microsoft.com/kb/q301292/>.
Hope this helps.
Dan Guzman
SQL Server MVP
"Norbert Meiss" <NorbertMeiss@discussions.microsoft.com> wrote in message
news:436B7DE5-21CC-413B-A0D6-854FD3721894@microsoft.com...


Wednesday, March 21, 2012

Optimization gurus: Help with varchar vs. text fields decision

Hi, I'm trying to improve performance for a db that stores messages. The message is currently stored in an ntext field, but when I look at all the records from the past 3 months, I found that 88% are less than 1000 characters, and 97% are less than 3000 characters.

I don't want to actually limit the message size, but it seems like I might get much better performance using a varchar(3000) field to hold most of the messages, and a separate text field just used for those 3% that really are long. Is this a good idea? If so, is it better to put the Message and LongMessage fields in the same table; or, have a separate table to hold the long messages? If it is in a separate table, it would need to be left joined with the message table each time messages are retrieved.

Also -- I am getting about 700 new messages daily, and right now have over 150,000 messages stored. The vast majority of activity involves new messages. Is this a good situation to look at using horizontal partitioning?

Thanks for any help, I don't really have anyone to discuss this with and it is really helpful to get some other views!!

Are you able to upgrade to SQL2005? VARCHAR(max) would be a simple solution to your problem.

|||

Yay, I'm already on SQL Server 2005, so I could use varchar(max) -- now that I've heard of it! Are there performance issues to be aware of with max? Any drawback to a design where the row size will vary wildly from row to row?

|||

Celestine:

when I look at all the records from the past 3 months, I found that 88% are less than 1000 characters, and 97% are less than 3000 characters.

I calculated wrong -- it is an ntext field, so each char is two bytes, not one byte. Meaning 97% of the messages are actually less than 1500 characters, not 3000. All the messages are in English, so I'm not going to continue using ntext or nvarchar.

|||

Varchar(max) allows values to span physical pages, hence no row length restriction. Why not look it up in Books Online (BOL)?

|||

I understand that there is no length restriction, I looked it up right away; thanks for making me aware of the max option. What I am asking about is whether there are performance implications to consider with using varchar(max), when you are hoping to get multiple records to fit on a data page.


|||

With varchar(max) most of the records will be fetched with a single read whereas for ntext, two reads will be required for every record irrespective of its size.
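For the conversion itself, something along these lines should work on SQL Server 2005; the table and column names here are hypothetical, since the thread does not give the real schema:
```sql
-- Hypothetical names: dbo.Messages / MessageBody.
ALTER TABLE dbo.Messages ALTER COLUMN MessageBody varchar(max) NULL;

-- Optional: rewrite existing values so that short messages are moved
-- out of the LOB pages and stored in-row where they fit.
UPDATE dbo.Messages SET MessageBody = MessageBody;
```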

Monday, March 19, 2012

optimising a table with lots of boolean fields

I have an application that reads a monitoring devices that produces 200 digital outputs every second and I would like to store them in a table. This table would get quite big fairly quickly as ultimately I would like to monitor over a hundred of these devices.

I would like to construct queries against each of the individual digital channels or combinations of them.

My first thought is to set up a table with 200 separate columns (plus others for date stamp, device ID etc); however, I am concerned that a table with 200 boolean (1-bit) fields would be an enormous waste of space if each field takes maybe one to four bytes on the hard disk to store a single bit. However, this would have the advantage of making the SQL queries more natural.

The other alternative is to create a single 200-bit field and use lots of ANDing and ORing to isolate bits in my queries. This would make my SQL code less readable and may also cause more hassle in the future if the inputs changed, but it would make the file size smaller.

In essence I am asking (hoping) the following : If I create a table with 200 boolean fields, does SQL server express automatically optimise the storage to make it more compact? This means that the server can mess around at the bit level and leave my higher level SQL code looking cleaner and more logical.

hi,

SQL Server can pack up to 8 bit columns together into a single byte to save space, but that is a secondary concern here.

I'd actually go for a more normalized model: you can break the 200 bit columns into a separate entity referencing the same transaction. This is not only more "elegant" and correct, but also solves the problem of having to add/modify a device.

something like

SET NOCOUNT ON;
USE tempdb;
GO
CREATE TABLE dbo.Devices (
    Id int NOT NULL PRIMARY KEY,
    Description varchar(10) NOT NULL
);
CREATE TABLE dbo.DeviceTran (
    Id int NOT NULL IDENTITY PRIMARY KEY, -- for sake of simplicity
    otherData varchar(10) NULL,
    TimeRecorded datetime DEFAULT GETDATE()
);
CREATE TABLE dbo.DeviceTranOutput (
    Id int NOT NULL IDENTITY PRIMARY KEY, -- for sake of simplicity
    IdTran int NOT NULL
        CONSTRAINT fk_DeviceTran_DeviceTranOutput
        FOREIGN KEY REFERENCES dbo.DeviceTran (Id),
    IdDevice int NOT NULL
        CONSTRAINT fk_Devices_DeviceTranOutput
        FOREIGN KEY REFERENCES dbo.Devices (Id),
    TValue bit NOT NULL DEFAULT 0
);
GO
PRINT 'available devices';
INSERT INTO dbo.Devices VALUES ( 1, 'PDA' );
INSERT INTO dbo.Devices VALUES ( 2, 'PBAX' );
INSERT INTO dbo.Devices VALUES ( 3, 'PC' );
INSERT INTO dbo.Devices VALUES ( 4, 'xxx' );
SELECT * FROM dbo.Devices;
PRINT '--';
DECLARE @i int, @y int, @id int;
SET @i = 1;
WHILE @i < 10 BEGIN
    INSERT INTO dbo.DeviceTran VALUES ( 'Data ' + CONVERT(varchar, @i), DEFAULT );
    SELECT @id = SCOPE_IDENTITY();
    SET @y = 1;
    WHILE @y < 5 BEGIN
        INSERT INTO dbo.DeviceTranOutput VALUES ( @id, @y, (@i + @y) % 2 );
        SET @y = @y + 1;
    END
    SET @i = @i + 1;
END;
GO
PRINT 'Transaction Report';
SELECT t.Id, t.otherData, t.TimeRecorded, d.Description, o.TValue
FROM dbo.DeviceTranOutput o
JOIN dbo.Devices d ON d.Id = o.IdDevice
JOIN dbo.DeviceTran t ON t.Id = o.IdTran;
GO
DROP TABLE dbo.DeviceTranOutput, dbo.DeviceTran, dbo.Devices;

resulting in something like:
Transaction Report
Id  otherData  TimeRecorded             Description  TValue
1   Data 1     2006-07-17 18:23:01.717  PDA          0
1   Data 1     2006-07-17 18:23:01.717  PBAX         1
1   Data 1     2006-07-17 18:23:01.717  PC           0
1   Data 1     2006-07-17 18:23:01.717  xxx          1
2   Data 2     2006-07-17 18:23:01.717  PDA          1
2   Data 2     2006-07-17 18:23:01.717  PBAX         0
2   Data 2     2006-07-17 18:23:01.717  PC           1
2   Data 2     2006-07-17 18:23:01.717  xxx          0
3   Data 3     2006-07-17 18:23:01.717  PDA          0
3   Data 3     2006-07-17 18:23:01.717  PBAX         1
-- result abridged

Storage will be better used. You pay some penalty in both reporting and inserting, as JOIN operations are involved, but you get a far better design that you can expand/modify with no worries about changes in the device pattern.

regards

|||Thanks for that.

I'll probably adopt your suggestion.

In order to save space, I am considering storing only the occasions when the digital values change (rather than every time step).
Using your method, I don't need to store every channel at every time step, so that is another reason in favour of it.

It does lead to another question though..

Taking your example: say a DeviceTran record is generated every timestep for the benefit of some analogue channel 'X' that is continuously changing.
Take a digital channel called 'Y' that is only recorded every time it changes.
Then, if I do an outer join to gather all the data at every time step, I might get something like this...

Id TimeRecorded X Y
-- -- --
1 2006-07-17 18:23:01.000 1.0 false
2 2006-07-17 18:23:02.000 1.1 null
3 2006-07-17 18:23:03.000 1.2 null
4 2006-07-17 18:23:04.000 1.1 true
5 2006-07-17 18:23:05.000 1.4 null
6 2006-07-17 18:23:06.000 1.5 null
7 2006-07-17 18:23:07.000 1.2 false
8 2006-07-17 18:23:08.000 0.9 null
9 2006-07-17 18:23:09.000 0.8 null

For a particular value of DATETIME I would like to get the most recent
record for Y.
A straight outer join would just show null values for those not present.
Is there a clever way of joining that uses the 'last' value based on
datetime or id?

Regards
|||

hi,

I'm not sure I fully understand your requirements..

I think you mean you just want to avoid inserting repeated values, as follows..

SET NOCOUNT ON;
USE tempdb;
GO

CREATE TABLE dbo.Devices (
	Id int NOT NULL PRIMARY KEY,
	Description varchar(10) NOT NULL
);

CREATE TABLE dbo.DeviceTran (
	Id int NOT NULL IDENTITY PRIMARY KEY,
	otherData varchar(10) NULL,
	TimeRecorded datetime NOT NULL DEFAULT GETDATE()
);

CREATE TABLE dbo.DeviceTranOutput (
	Id int NOT NULL IDENTITY PRIMARY KEY,
	IdTran int NOT NULL
		CONSTRAINT fk_DeviceTran_DeviceTranOutput
		FOREIGN KEY REFERENCES dbo.DeviceTran (Id),
	IdDevice int NOT NULL
		CONSTRAINT fk_Devices_DeviceTranOutput
		FOREIGN KEY REFERENCES dbo.Devices (Id),
	TValue bit NULL
);
GO

DECLARE @t datetime, @id int, @dev int;
SELECT @dev = 1, @t = GETDATE();
INSERT INTO dbo.Devices VALUES ( @dev , 'PBAX' );
INSERT INTO dbo.DeviceTran VALUES ( 'Data Pbax' , @t);
SELECT @id = SCOPE_IDENTITY();
INSERT INTO dbo.DeviceTranOutput VALUES ( @id , @dev , 0);
WAITFOR DELAY '00:00:01'

SELECT @t = GETDATE();
INSERT INTO dbo.DeviceTran VALUES ( 'Data Pbax' , @t);
SELECT @id = SCOPE_IDENTITY();
--INSERT INTO dbo.DeviceTranOutput VALUES ( @id , @dev , 0);
WAITFOR DELAY '00:00:01'

SELECT @t = GETDATE();
INSERT INTO dbo.DeviceTran VALUES ( 'Data Pbax' , @t);
SELECT @id = SCOPE_IDENTITY();
--INSERT INTO dbo.DeviceTranOutput VALUES ( @id , @dev , 0);
WAITFOR DELAY '00:00:01'

SELECT @t = GETDATE();
INSERT INTO dbo.DeviceTran VALUES ( 'Data Pbax' , @t);
SELECT @id = SCOPE_IDENTITY();
INSERT INTO dbo.DeviceTranOutput VALUES ( @id , @dev , 1);
WAITFOR DELAY '00:00:01'

SELECT @t = GETDATE();
INSERT INTO dbo.DeviceTran VALUES ( 'Data Pbax' , @t);
SELECT @id = SCOPE_IDENTITY();
--INSERT INTO dbo.DeviceTranOutput VALUES ( @id , @dev , 1);

PRINT 'Transaction Report';
PRINT '';
PRINT 'IdDevice is NULL, you can''t directly reference';
PRINT 'the devices, you have to resort on ''specific'' queries';
SELECT t.Id, t.otherData, t.TimeRecorded, d.Description,
	(SELECT TOP 1 o2.TValue
	 FROM dbo.DeviceTranOutput o2
	 WHERE o2.IdDevice = d.Id
	   AND o2.IdTran <= t.Id
	 ORDER BY o2.Id DESC) AS TValue
FROM dbo.DeviceTranOutput o
JOIN dbo.Devices d ON d.Id = o.IdDevice
RIGHT JOIN dbo.DeviceTran t ON t.Id = o.IdTran
ORDER BY t.TimeRecorded;
GO

PRINT 'as passing a [@DeviceId] as a parameter';
DECLARE @DeviceId int;
DECLARE @DeviceDescription varchar(10);
SELECT @DeviceId = 1;
SELECT @DeviceDescription = Description
FROM dbo.Devices
WHERE Id = @DeviceId;
SELECT t.Id, t.otherData, t.TimeRecorded,
	@DeviceDescription AS [Description],
	(SELECT TOP 1 o2.TValue
	 FROM dbo.DeviceTranOutput o2
	 WHERE o2.IdDevice = @DeviceId
	   AND o2.IdTran <= t.Id
	 ORDER BY o2.Id DESC) AS TValue
FROM dbo.DeviceTranOutput o
JOIN dbo.Devices d ON d.Id = o.IdDevice
RIGHT JOIN dbo.DeviceTran t ON t.Id = o.IdTran;
GO

DROP TABLE dbo.DeviceTranOutput, dbo.DeviceTran, dbo.Devices;

--<-

Transaction Report
IdDevice is NULL, you can't directly reference

the devices, you have to resort on 'specific' queries

Id otherData TimeRecorded Description TValue

-- - -- --

1 Data Pbax 2006-07-18 22:56:51.810 PBAX 0

2 Data Pbax 2006-07-18 22:56:52.827 NULL NULL

3 Data Pbax 2006-07-18 22:56:53.827 NULL NULL

4 Data Pbax 2006-07-18 22:56:54.827 PBAX 1

5 Data Pbax 2006-07-18 22:56:55.827 NULL NULL

as passing a [@DeviceId] as a parameter

Id otherData TimeRecorded Description TValue

-- - -- --

1 Data Pbax 2006-07-18 22:56:51.810 PBAX 0

2 Data Pbax 2006-07-18 22:56:52.827 PBAX 0

3 Data Pbax 2006-07-18 22:56:53.827 PBAX 0

4 Data Pbax 2006-07-18 22:56:54.827 PBAX 1

5 Data Pbax 2006-07-18 22:56:55.827 PBAX 1

but my idea is you'll run into trouble both at insert time, as you have to check whether the current value is the same as the last one, and later at query time, as you miss some references..
you can scan for an older value in dbo.DeviceTranOutput from a previous transaction, but you miss the IdDevice value... if you query for a specified device then it's all right, as you pass the IdDevice as a parameter, which becomes a constant, but a general-purpose query listing all transactions (ordered by TimeRecorded and IdDevice) becomes heavy, for every row...
at insert time this is a heavy load as well, as instead of just inserting you have to check, and this can cost a lot in real-time apps..
considering you're collecting data in near real time, I'd go for the quicker way to pump data in, without a trigger to filter out repeated values...

what's more, transactionally it breaks the rule of atomicity of a row, as it depends on the values of previous rows... it makes the whole design trickier, and of course the coding as well.. my $0.02..

regards

|||Thanks for that.
You have understood my requirements exactly.

I was hoping to save disk space by not storing all the values for a timestep that haven't changed. But you are right about breaking the rule of atomicity, because the value of a particular column now depends on its 'last' non-null value. I hadn't thought of it like that.

Whatever the gains I make in disk space I have to seriously consider the penalty in SQL complexity. Trying to extract the 'current' value for just one or maybe a handful of channels can be done (as you have shown above) but a generic query to return values for all channels might be very complex/slow.
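For what it's worth, the "most recent non-null" lookup that makes the generic query heavy is just a forward fill over rows ordered by id. A rough Python sketch of that logic (the row shape and sample values are illustrative, mirroring the outer-join result shown earlier):

```python
# Forward-fill sketch: carry the last recorded digital value forward.
# Rows are (id, analogue_x, digital_y) where digital_y is None when the
# channel was not recorded at that time step.
def forward_fill(rows):
    filled = []
    last_y = None  # no value seen yet
    for rid, x, y in rows:
        if y is not None:
            last_y = y  # a change was recorded; remember it
        filled.append((rid, x, last_y))
    return filled

rows = [
    (1, 1.0, False),
    (2, 1.1, None),
    (3, 1.2, None),
    (4, 1.1, True),
    (5, 1.4, None),
]
print(forward_fill(rows))
```

Note that this is a single linear pass, while the per-row `SELECT TOP 1 ... ORDER BY Id DESC` subquery redoes the scan from scratch for every row, which is where the cost comes from.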

I'll go away and think some more on it.

Thanks once again.

Regards

Monday, March 12, 2012

Optimise multitable update

Hi
I've got the following scenario:
TableA (4 million rows)
TableB (20 000 rows)
I have two fields on TableA that are the unique fields on TableB,
which I use to set the foreign key from A to B:
UPDATE TableA
SET TableA.B_FK = TableB.B_PK
FROM TableA, TableB
WHERE
TableA.Code = TableB.Code
AND TableA.Name = TableB.Name
Code = varchar(5)
Name = varchar(50)
What would suitable indexes be to optimise this query, as it takes 4
hours to run?
I already have an index on TableA on "Code, Name" and TableB on "B_PK"
- takes 4 hours with these!
Any help? Should I have a covering index on TableB, i.e. "Code, Name,
B_PK" ?
Thanks
Sean
On 25 May 2004 08:14:42 -0700, Sean wrote:

>Hi
>I've got the following scenario:
>TableA (4 million rows)
>TableB (20 000 rows)
>
>I have two fields on TableA that are the unique fields on TableB,
>which I use to set the foreign key from A to B:
>UPDATE TableA
>SET TableA.B_FK = TableB.B_PK
>FROM TableA, TableB
>WHERE
>TableA.Code = TableB.Code
>AND TableA.Name = TableB.Name
>
>Code = varchar(5)
>Name = varchar(50)
>What would a suitable indexes be to optimise this query as it takes 4
>hours to run?
>I already have an index on TableA on "Code, Name" and TableB on "B_PK"
>- takes 4 hours with these!
>Any help? Should I have a covering index on TableB, i.e. "Code, Name,
>B_PK" ?
>Thanks
>Sean
Hi Sean,
Is the current index on TableA(Code, Name) a clustered index? Is it
defined as a unique index?
Do all 20000 rows in TableB match a row in TableA? If so, adding an index
on TableB won't do you any good. If all rows in a table have to be
processed anyway, a table scan is always the best way. If only a few of
the 20000 rows will match, an index on TableB(Code, Name) *might* help,
but I'm not sure. Test it. The covering index you suggest *might* help as
well, but you'll have to test that as well. But, as I said - only if the
majority of rows in TableB will not match against TableA.
Is there an index on TableA(B_FK)? If it is, see if you can remove it;
that saves the time to update this index as the update is carried out.
Check that there are no triggers on TableA. (If you have them, can't
disable them and they're the cause of the long execution, forget about the
query and start optimising the triggers first!)
And the most important thing: Check the execution plan!! From your
description, I would expect a table scan of TableB and an index seek on
the index on TableA(Code, Name).
Best, Hugo
(Remove _NO_ and _SPAM_ to get my e-mail address)
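One way to picture why the (Code, Name) lookup matters: a plausible plan here is a hash match, which amounts to building a small map over TableB once and probing it once per TableA row. A Python sketch of that join logic, under made-up sample data (column and table names as in the post):

```python
# Hash-match sketch of:
#   UPDATE TableA SET B_FK = TableB.B_PK
#   WHERE TableA.Code = TableB.Code AND TableA.Name = TableB.Name
table_b = [
    {"B_PK": 10, "Code": "AB", "Name": "Widget"},
    {"B_PK": 11, "Code": "CD", "Name": "Gadget"},
]
table_a = [
    {"Code": "AB", "Name": "Widget", "B_FK": None},
    {"Code": "CD", "Name": "Gadget", "B_FK": None},
    {"Code": "ZZ", "Name": "Nomatch", "B_FK": None},
]

# Build the lookup once (one pass over the 20 000 TableB rows); a covering
# index on TableB(Code, Name, B_PK) plays the same role persistently.
lookup = {(b["Code"], b["Name"]): b["B_PK"] for b in table_b}

# One pass over the 4 million TableA rows; unmatched rows keep their NULL.
for a in table_a:
    a["B_FK"] = lookup.get((a["Code"], a["Name"]), a["B_FK"])

print([a["B_FK"] for a in table_a])  # → [10, 11, None]
```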


Friday, March 9, 2012

Opposite of union - Stupid question for a common problem

Hi there!.
I'm having a simple problem. I have a table with like 20 fields; this table
is always growing since it's a price table accumulator for a set of products
(with their features), and it is loaded from a text file.
When a new text file arrives, I upload it to a temp table prior to importing
it to the real table.
What I need to do is insert into the main table only the NEW records,
so basically any record that differs in any of the 20 fields from the one
inside the database.
It's a stupid thing when you think of it, but I cannot seem to find the solution.
Any help will be appreciated!!!
Victor
|||Sorry.
Please provide DDL and sample data.
http://www.aspfaq.com/etiquette.asp?id=5006
AMB
"Victor Daicich" wrote:

> Hi there!.
> I'm having a simple problem. I have a table with like 20 fields , this table
> is always growing since it's a price table acumulator for a set of products
> (with their features) , this is loaded from a text file.
> When a new text file arrives , I upload it to a temp table prior to importing
> it to the real table.
> What I need to do is to only insert in the main table only the NEW fields,
> so basically any record that differs in any of the 20 fields from the one
> inside the database.
> It's a stupid thing when you think of it, but I cannot seem to find the solution.
> Any help will be appreciated!!!
> Victor
>
>|||As Alejandro said, the shortest path to a solution is to provide DDL, sample
data and expected results. Based on what you have given though, it sounds
like an INSERT INTO with a SELECT statement with a NOT EXISTS and one
*HUMONGOUS* WHERE clause. It's just a big WHERE clause with about 20 AND's
in it. Here's a sample that selects only the unique rows from table #b
(that don't currently exist in #a). I cut it down to just two columns, but
you can expand the SELECT subquery in the NOT EXISTS predicate to include as
many columns as you like:
CREATE TABLE #a (color1 VARCHAR(16),
color2 VARCHAR(16))
CREATE TABLE #b (color1 VARCHAR(16),
color2 VARCHAR(16))
INSERT INTO #a (color1, color2)
SELECT 'blue', 'red'
UNION SELECT 'red', 'green'
UNION SELECT 'black', 'yellow'
UNION SELECT 'black', 'blue'
INSERT INTO #b (color1, color2)
SELECT 'blue', 'green'
UNION SELECT 'black', 'yellow'
UNION SELECT 'yellow', 'purple'
UNION SELECT 'black', 'blue'
SELECT b.*
FROM #b b
WHERE NOT EXISTS
(
SELECT 1
FROM #a a
WHERE a.color1 = b.color1
AND a.color2 = b.color2
)
DROP TABLE #a
DROP TABLE #b
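The same filter can be pictured outside SQL: flatten each record to a tuple of all its compared columns, and NOT EXISTS over every column becomes plain set difference. An illustrative Python sketch using the #a/#b sample data above:

```python
# Rows as tuples of all compared columns (here just two, like #a and #b).
a = {("blue", "red"), ("red", "green"), ("black", "yellow"), ("black", "blue")}
b = {("blue", "green"), ("black", "yellow"), ("yellow", "purple"), ("black", "blue")}

new_rows = b - a  # rows of #b with no exact match in #a
print(sorted(new_rows))  # → [('blue', 'green'), ('yellow', 'purple')]
```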
"Victor Daicich" <victordaicich@.hotmail.com> wrote in message
news:12613abe6ea18c85d1b76245e10@.msnews.microsoft.com...
> Hi there!.
> I'm having a simple problem. I have a table with like 20 fields , this
> table is always growing since it's a price table acumulator for a set of
> products (with their features) , this is loaded from a text file.
> When a new text file arrives , I upload it to a temp table prior to import
> i to the real table.
> What I need to do is to only insert in the main table only the NEW fields,
> so basically any record that differs in any of the 20 fields from the one
> inside the database.
> It's a stupid thing when you think of it, but I cannot seem to find the
> solution.
> Any help will be appreciated!!!
> Victor
>|||Hello Mike C#,
That's right Mike. It's like you said, a big where clause is what I need
then.
Thanks guys you've been very helpful with this issue. I can go on now and
finish it.
Thanks again,
Victor
> As Alejandro said, the shortest path to a solution is to provide DDL,
> sample data and expected results. Based on what you have given
> though, it sounds like an INSERT INTO with a SELECT statement with a
> NOT EXISTS and one *HUMONGOUS* WHERE clause. It's just a big WHERE
> clause with about 20 AND's in it. Here's a sample that selects only
> the unique rows from table #b (that don't currently exist in #a). I
> cut it down to just two columns, but you can expand the SELECT
> subquery in the NOT EXISTS predicate to include as many columns as you
> like:
> CREATE TABLE #a (color1 VARCHAR(16),
> color2 VARCHAR(16))
> CREATE TABLE #b (color1 VARCHAR(16),
> color2 VARCHAR(16))
> INSERT INTO #a (color1, color2)
> SELECT 'blue', 'red'
> UNION SELECT 'red', 'green'
> UNION SELECT 'black', 'yellow'
> UNION SELECT 'black', 'blue'
> INSERT INTO #b (color1, color2)
> SELECT 'blue', 'green'
> UNION SELECT 'black', 'yellow'
> UNION SELECT 'yellow', 'purple'
> UNION SELECT 'black', 'blue'
> SELECT b.*
> FROM #b b
> WHERE NOT EXISTS
> (
> SELECT 1
> FROM #a a
> WHERE a.color1 = b.color1
> AND a.color2 = b.color2
> )
> DROP TABLE #a
> DROP TABLE #b
> "Victor Daicich" <victordaicich@.hotmail.com> wrote in message
> news:12613abe6ea18c85d1b76245e10@.msnews.microsoft.com...
>|||If you are using 2005, then you have the "except" operator.
select... from A
except
Select ... from B
-Omnibuzz (The SQL GC)
http://omnibuzz-sql.blogspot.com/
"Victor Daicich" wrote:

> Hello Mike C#,
> That's right Mike. It's like you said, a big where clause is what I need
> then.
> Thanks guys you've been very helpful with this issue. I can go on now and
> finish it.
> Thanks again,
> Victor
>
>
>|||Good point, I assumed SQL 2000 (I always do when the OP doesn't mention the
platform) :)
"Omnibuzz" <Omnibuzz@.discussions.microsoft.com> wrote in message
news:EFA7D756-F8CA-4F4B-A614-81A78981ADEA@.microsoft.com...
> If you are using 2005, then you have the "except" operator.
> select... from A
> except
> Select ... from B
>
> --
> -Omnibuzz (The SQL GC)
> http://omnibuzz-sql.blogspot.com/
>
> "Victor Daicich" wrote:
>|||Oracle provides INTERSECT and MINUS operators (9i, maybe earlier).
SQL Server 2005 provides INTERSECT and EXCEPT.
Is there an ANSII SQL equivilant that anyone is aware of?
"Mike C#" <xyz@.xyz.com> wrote in message
news:OeBlwT$jGHA.4660@.TK2MSFTNGP05.phx.gbl...
> Good point, I assumed SQL 2000 (I always do when the OP doesn't mention
the
> platform) :)
> "Omnibuzz" <Omnibuzz@.discussions.microsoft.com> wrote in message
> news:EFA7D756-F8CA-4F4B-A614-81A78981ADEA@.microsoft.com...
>|||"Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message
news:eUyihMJkGHA.3512@.TK2MSFTNGP03.phx.gbl...
> Oracle provides INTERSECT and MINUS operators (9i, maybe earlier).
> SQL Server 2005 provides INTERSECT and EXCEPT.
> Is there an ANSII SQL equivilant that anyone is aware of?
ANSI SQL:1999 defines INTERSECT and EXCEPT. Unfortunately SQL 2000 is only
compliant up to ANSI SQL:1992.|||INTERSECT, UNION and EXCEPT are in ANSI SQL. MINUS is not.
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
"Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message
news:eUyihMJkGHA.3512@.TK2MSFTNGP03.phx.gbl...
> Oracle provides INTERSECT and MINUS operators (9i, maybe earlier).
> SQL Server 2005 provides INTERSECT and EXCEPT.
> Is there an ANSII SQL equivilant that anyone is aware of?
>|||Good to know. Thanks.
"Tibor Karaszi" <tibor_please.no.email_karaszi@.hotmail.nomail.com> wrote in
message news:exwUSUJkGHA.3844@.TK2MSFTNGP02.phx.gbl...
> INTERSECT, UNION and EXCEPT are in ANSI SQL. MINUS is not.
> --
> Tibor Karaszi, SQL Server MVP
> http://www.karaszi.com/sqlserver/default.asp
> http://www.solidqualitylearning.com/
>
> "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message
> news:eUyihMJkGHA.3512@.TK2MSFTNGP03.phx.gbl...

Wednesday, March 7, 2012

Operator is not valid for type 'Date' and type 'Date'

I was wondering if someone could assist me in writing an expression that
would subtract 2 DateTimes. I have 2 DateTime fields and want to display
both in a report and in a third column the difference between the 2 values.
However, I get an error:
"Operator is not valid for type 'Date' and type 'Date'"
Many thanks,
Simon
|||You can get the difference from SQL by trying something like this in your
store proc or query
SELECT date1, date2, DATEDIFF(d, date1, date2) AS date3
hope it helps.
- David
"Simon Dingley" wrote:
> I was wondering if someone could assist me in writing an experession that
> would subtract 2 DateTime's. I have 2 DateTime fields and want to display
> both in a report and in a third column the difference between the 2 values.
> However, I get an error:
> "Operator is not valid for type 'Date' and type 'Date'"
> Many thanks,
> Simon
>
>|||Assuming you have two DateTime fields and you want to determine the
difference in total seconds in the report rather than the query, you can use
an expression like this:
=Datediff("s", Fields!End_Time.Value, Fields!Start_Time.Value)
--
This posting is provided "AS IS" with no warranties, and confers no rights.
"David Nevarez" <DavidNevarez@.discussions.microsoft.com> wrote in message
news:33B5D244-F9D5-4ECB-99CB-F9B1B7D9E445@.microsoft.com...
> You can get the difference from SQL by trying something like this in your
> store proc or query
> SELECT date1, date2, DATEDIFF(d, date1, date2) AS date3
> hope it helps.
> - David
> "Simon Dingley" wrote:
> > I was wondering if someone could assist me in writing an experession
that
> > would subtract 2 DateTime's. I have 2 DateTime fields and want to
display
> > both in a report and in a third column the difference between the 2
values.
> > However, I get an error:
> >
> > "Operator is not valid for type 'Date' and type 'Date'"
> >
> > Many thanks,
> >
> > Simon
> >
> >
|||Thank you for the replies. I opted for the in-report solution as opposed to
the SQL solution, but thanks for both. What I want to return is the
difference in days and hours but can't seem to do it. I changed the format
string to "d" instead of "s" to return the number of days, but when I tried
to use "d h" I got "#error".
Thanks for the help.
Simon
"Robert Bruckner [MSFT]" <robruc@.online.microsoft.com> wrote in message
news:eVK4OnNsEHA.832@.TK2MSFTNGP10.phx.gbl...
> Assuming you have two DateTime fields and you want to determine the
> difference in total seconds in the report rather than the query, you can
use
> an expression like this:
> =Datediff("s", Fields!End_Time.Value, Fields!Start_Time.Value)
> --
> This posting is provided "AS IS" with no warranties, and confers no
rights.
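For the "days and hours" breakdown the original poster wanted, Datediff returns a single interval, so the split has to be computed: take the total seconds and divide out whole days, then whole hours. The arithmetic, sketched in Python with made-up timestamps:

```python
from datetime import datetime

# Hypothetical values standing in for Start_Time and End_Time.
start = datetime(2004, 10, 1, 8, 0, 0)
end = datetime(2004, 10, 3, 13, 30, 0)  # 2 days, 5.5 hours later

total_seconds = int((end - start).total_seconds())
days, remainder = divmod(total_seconds, 86400)  # whole days
hours = remainder // 3600                       # whole hours left over
print(days, hours)  # → 2 5
```

The same decomposition can be applied to the result of `Datediff("s", ...)` in the report expression, using integer division instead of a "d h" format string.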

Saturday, February 25, 2012

openxml/'for xml' image data problem

I'm having problems with image fields when trying to insert XML into a table
using 'for xml' and openxml.
The code below demonstrates my problem. I have a table, tblAttachment that
contains a field named 'file' that contains image data.
prsXML is a simple proc that takes an xml document, extracts rows using
openxml and updates the row.
--prsXML
create proc dbo.prsXML2 @XMLDoc text as begin
declare @iDoc int
EXEC sp_xml_preparedocument @idoc OUTPUT, @XMLDoc
select * into #tblAttachment
from openxml (@idoc, '/NewDataSet/tblAttachment', 2) with tblAttachment
update tblAttachment set [ID] = t.[ID], [FileName] = t.[FileName],
[File] = t.[File]
from #tblAttachment t inner join tblAttachment f on f.ID = t.ID
drop table #tblAttachment
exec sp_xml_removedocument @idoc
end
If I call prsXML passing it a simple xml document as shown below it updates
the row in the database.
prsXML '<?xml version="1.0" ?>
<NewDataSet>
<tblAttachment>
<ID>49AA5490-DD38-4A7D-87E7-0525E07930AF</ID>
<FileName>test.jpg</FileName>
<File>QUJDREVGRw==</File>
</tblAttachment>
</NewDataSet>'
However, when I then select this updated row using:
select [file] from tblAttachment where ID =
'49AA5490-DD38-4A7D-87E7-0525E07930AF'
for xml auto, elements, binary base64
it returns:
<tblAttachment>
<file>UQBVAEoARABSAEUAVgBHAFIAdwA9AD0A</file>
</tblAttachment>
The element <File>QUJDREVGRw==</File>
does not match the element: <file>UQBVAEoARABSAEUAVgBHAFIAdwA9AD0A</file>
returned by the last query.
It appears that encoding/decoding is messing up the image data. What is
going on and how can I fix it?
Thanks for your help.
Jay
The issue is that FOR XML binary base64 does encode the content of your
image column using base64 encoding, but the OpenXML with clause does not
perform a decoding (let's say it is a design quirk :-)).
So you have the following workarounds:
1. Extract the encoded image and run a base64-decoder on it before doing the
insertion. The algorithm is well-described in the literature and can be
implemented using T-SQL or an extended stored proc.
2. You are waiting for SQL Server 2005, where the nodes() method will
provide you automatic decoding in the same scenario.
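The mismatch in the original post reproduces exactly: the column ended up holding the base64 text 'QUJDREVGRw==' itself as UTF-16 bytes (it came through as an n-type string), and FOR XML then base64-encoded those bytes a second time. A Python sketch of both steps:

```python
import base64

# What the sender meant to store: the decoded image bytes.
original = base64.b64decode("QUJDREVGRw==")
print(original)  # → b'ABCDEFG'

# What OpenXML actually stored: the base64 text itself, as UTF-16LE bytes.
stored = "QUJDREVGRw==".encode("utf-16-le")

# FOR XML ... BINARY BASE64 then encodes those bytes again.
double_encoded = base64.b64encode(stored).decode("ascii")
print(double_encoded)  # → UQBVAEoARABSAEUAVgBHAFIAdwA9AD0A
```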
Sorry and best regards
Michael
"jamccormick" <jamccormick@.discussions.microsoft.com> wrote in message
news:17A4CDE2-90C8-46B0-A17E-493E2A6AF23E@.microsoft.com...
> I'm having problems with image fields when trying to insert XML into a
> table
> using 'for xml' and openxml.
> The code below demonstrates my problem. I have a table, tblAttachment
> that
> contains a field named 'file' that contains image data.
> prsXML is a simple proc that takes an xml document, extracts rows using
> openxml and updates the row.
> --prsXML
> create proc dbo.prsXML2 @XMLDoc text as begin
> declare @iDoc int
> EXEC sp_xml_preparedocument @idoc OUTPUT, @XMLDoc
> select * into #tblAttachment
> from openxml (@idoc, '/NewDataSet/tblAttachment', 2) with tblAttachment
> update tblAttachment set [ID] = t.[ID], [FileName] = t.[FileName],
> [File] = t.[File]
> from #tblAttachment t inner join tblAttachment f on f.ID = t.ID
> drop table #tblAttachment
> exec sp_xml_removedocument @idoc
> end
> end
> If I call prsXML passing it a simple xml document as shown below it
> updates
> the row in the database.
> prsXML '<?xml version="1.0" ?>
> <NewDataSet>
> <tblAttachment>
> <ID>49AA5490-DD38-4A7D-87E7-0525E07930AF</ID>
> <FileName>test.jpg</FileName>
> <File>QUJDREVGRw==</File>
> </tblAttachment>
> </NewDataSet>'
> However, when I then select this updated row using:
> select [file] from tblAttachment where ID =
> '49AA5490-DD38-4A7D-87E7-0525E07930AF'
> for xml auto, elements, binary base64
> it returns:
> <tblAttachment>
> <file>UQBVAEoARABSAEUAVgBHAFIAdwA9AD0A</file>
> </tblAttachment>
> The element <File>QUJDREVGRw==</File>
> does not match the element: <file>UQBVAEoARABSAEUAVgBHAFIAdwA9AD0A</file>
> returned by the last query.
> It appears that encoding/decoding is messing up the image data. What is
> going on and how can I fix it?
> Thanks for your help.
> --
> Jay

openxml to build a report

Trying to build a report for XML data stored in a table... Employee.EmployeeUDF has six fields... I need to be able to report on all the data in the table... any way to do this?

So far, I have found a plethora of information that will easily allow me to return one row of the table...like below...but can't get more than that...

DECLARE @idoc int
DECLARE @doc varchar(1000)

--The remarked line below will select the udf xml values from the Employee table
SELECT @doc = EmployeeUDF FROM Employee

--The following is an example of getting the udf_date_tamex info from the xml
--SELECT @doc = '<udf>
<udf_text_tam>595297022</udf_text_tam>
<udf_date_tamex>2009-12-20</udf_date_tamex>
<udf_text_sher>2547793</udf_text_sher>
<udf_date_sherex>2010-02-15</udf_date_sherex>
<udf_text_helth> 772469 </udf_text_helth>
<udf_date_expriration>2008-04-28</udf_date_expriration>
</udf>'

-- Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
-- Execute a SELECT statement using OPENXML rowset provider.
SELECT *
FROM OPENXML (@idoc, '/udf', 2)
--specify the fields you wish to return by specifying the tag and datatype
WITH (udf_date_tamex datetime)

EXEC sp_xml_removedocument @idoc

Thanks....
Jason

If you are using SQL Server 2000, you will have to wrap the above into a cursor that loops over every row.

If you are using SQL Server 2005, you should use the nodes() method in the following way:

select n.value('udf_text_tam[1]', 'bigint'), n.value('udf_date_tamex[1]','datetime'), ....
from Employee cross apply EmployeeUDF.nodes('/udf') as E(n)

Best regards
Michael
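As a cross-check of what either approach should return per row, here is the same single-document shred done with Python's ElementTree, using the sample `<udf>` fragment from the question (abridged to three fields):

```python
import xml.etree.ElementTree as ET

doc = """<udf>
<udf_text_tam>595297022</udf_text_tam>
<udf_date_tamex>2009-12-20</udf_date_tamex>
<udf_text_sher>2547793</udf_text_sher>
</udf>"""

root = ET.fromstring(doc)
row = {child.tag: child.text for child in root}  # one XML doc -> one flat row
print(row["udf_date_tamex"])  # → 2009-12-20
```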

|||

Thanks Michael. The cursor solution is the one. Should have specified 2000. It also reminds me to request 2005 install tomorrow...

J.