Hi, I'm trying to improve performance for a db that stores messages. The message is currently stored in an ntext field, but when I look at all the records from the past 3 months, I found that 88% are less than 1000 characters, and 97% are less than 3000 characters.
I don't want to actually limit the message size, but it seems like I might get much better performance using a varchar(3000) field to hold most of the messages, and a separate text field just used for those 3% that really are long. Is this a good idea? If so, is it better to put the Message and LongMessage fields in the same table; or, have a separate table to hold the long messages? If it is in a separate table, it would need to be left joined with the message table each time messages are retrieved.
Also -- I am getting about 700 new messages daily, and right now have over 150,000 messages stored. The vast majority of activity involves new messages. Is this a good situation to look at using horizontal partitioning?
Thanks for any help, I don't really have anyone to discuss this with and it is really helpful to get some other views!!
Are you able to upgrade to SQL2005? VARCHAR(max) would be a simple solution to your problem.
Yay, I'm already on SQL server 2005, so I could use varchar(max) -- now that I've heard of it! Are there performance issues to be aware of with max? Any drawback to a design where the row size will vary wildly from row to row??
Celestine:
when I look at all the records from the past 3 months, I found that 88% are less than 1000 characters, and 97% are less than 3000 characters.
I calculated wrong -- it is an ntext field, so each char is two bytes, not one byte. Meaning 97% of the messages are actually less than 1500 characters, not 3000. All the messages are in English, so I'm not going to continue using ntext or nvarchar.
Varchar(max) allows rows to span physical blocks, hence no row length restriction. Why not look it up on Books-On_line BOL?
I understand that there is no length restriction, I looked it up right away; thanks for making me aware of the max option. What I am asking about is whether there are performance implications to consider with using varchar(max), when you are hoping to get multiple records to fit on a data page.
With varchar(max) most of the records will be fetched with a single read whereas for ntext, two reads will be required for every record irrespective of its size.
No comments:
Post a Comment