I started the series here: http://www.sqlskills.com/BLOGS/KIMBERLY/post/Spring-cleaning-your-indexes-Part-I.aspx and I want to continue with Part II today by clarifying some great questions/comments that have come up on the series. In Part III, I'll give you a few more ways to get rid of (or consolidate) indexes. And, I think there's still a bit more that Paul and I will investigate further (wrt to operational stats) but, I want to address a few comments and a few interesting things that both Paul and I have found.

In the Part I post, I talked about using sys.dm_db_index_usage_stats to see if there are any indexes that just aren't being used at all... A few comments asked why I didn't use operational_stats instead. To address that first, there are a few key differences:

  • dm_db_index_operational_stats is persisted only as long as an object is in cache (however, it's not cleared when objects are forced out of cache with DBCC DROPCLEANBUFFERS). If you want to clear ALL DMVs for a specific database, then a relatively easy way to do this (IN TESTING) is to take the database offline and then immediately bring it online again.
    • ALTER DATABASE <dbname> SET OFFLINE
    • ALTER DATABASE <dbname> SET ONLINE
      • NOTE: If there are any suspect files, you will NOT be able to bring your database back online without FIRST taking all suspect files OFFLINE. And, if you take a FILE offline then it's even more important to know that THERE IS NO WAY TO BRING A FILE ONLINE without restoring it from backups. So, it's VERY important to understand that OFFLINE/ONLINE for a database is really easy IF AND ONLY IF there are no other problems with the DB. You really need to resolve those problems first (or at least know that you're going to need to resolve those problems later through backup/restore) before you take a database offline.
  • dm_db_index_operational_stats is (from BOL) neither persistent nor transactionally consistent. This means you cannot use these counters to determine whether an index has been used or not, or when the index was last used. For information about this, see sys.dm_db_index_usage_stats.

Having said that though, none of these are really any guarantee of perfect information. And, they're not meant to be. I look at these DMVs as being a quick and easy way to get some relatively descent insight into what is or is not happening in my environement. However, even though the sys.dm_db_index_operational_stats might give you insight that you have a problem it still doesn't give you good insight into exactly what that problem might be. For example, it *might* be splits that causes some of your wait times to increase (column: page_io_latch_wait_in_ms) but, it could be something else too (some other system issue).

The main point, you can use these to get insight into which tables have the biggest problems (i.e. the biggest waits) and where they might have a lot of splits (column:nonleaf_allocation_count) but, in all honesty, that's not a guarantee. In fact, the reason I said "might" is that pages that are allocated at the end of the leaf level STILL allocate a page and require an entry to be made in the next level up in the index. So, a lot of nonleaf_allocations COULD be for a perfectly unfragmented index. So, it still doesn't tell you how fragmented the objects are or what the REAL problem is (or even if it is a problem yet).

Basically, these just tell you where you have the most activity and give you a starting point for problem solving. But, none of these (usage or operational) really tell you how to solve the problem. However, sometimes even knowing where to start IS the problem in and of itself. So, I'm not against these DMVs and I really do think you can get some good insight from them. Just use them as a tool to help focus your investigations. Use better tools like sys.dm_db_index_physical_stats to really see if you have fragmentation and where it's the worst.

hth,
kt

This morning, I woke to a wonderful email and I thought I would share it:

Good morning Kimberly, I just wanted to send you an email to let you know how awesome you are! I have applied some of your indexing strategies to our website’s homegrown CMS and we are experiencing a significant performance increase. WOW....amazing tips delivered in an exciting way! 

Thanks again,
Jim

OK, I seriously can't complain about the awesome part ;-)... but, what I think is so cool is that he sent the email. No, I don't expect everyone to send me mail after they implement a tip/trick BUT, it's really nice when I do get an email (especially like this :). There are so many ways to improve performance and some can lead to significant gains (or minimized downtime/data loss, etc.) and this is exactly why we (speakers/RDs/MVPs, etc.) do this stuff!

The long story short is that it does cost you money to go to a conference (or, at least, time to read blogs, etc.) but what we try to do at events like Connections is distill down the key points into our sessions so that you can [hopefully] apply these techniques immediately. It makes for fast-paced sessions and an information packed week but our conference just ended yesterday and I already have email from it. That's just really cool!

So, I know the economy is in a tough state and I know that quantifying the gain of a conference is hard (especially when the cost of the conference is so easy to quantify) but the end result of some of this time away might be more than worthwhile!

THANKS Jim!
kt

Categories:
Conferences | Personal

First and foremost, happy spring! I truly hope we're on the path to summer (although who would know it here - we're in Florida for SQL Connections and the weather is a bit chilly and it's been raining off/on today - I hope this is short term (no, I don't want to look at the forecast as I don't want to jinx it :)). But, wherever you are - I hope you're on your way to nicer weather and minimal cold weather (ok, I guess I only have warm wishes for the northern hem... for you southies - I hope your fall is lovely!!).

But, for everyone - now's a good time to start thinking about cleaning out the [non-literal index] closet... and getting rid of some of those dusty indexes?

Why/when should you get rid of some of your indexes?

  1. It's possible that some of them aren't being used at all...
  2. Especially when they're not being used but even when they're "redundant" (or minimally useful) they're still costing you in many ways:
    1. Wasting space on disk
    2. Wasting space in memory (well, if they are being used then they're cluttering up your cache)
    3. Wasting space/time in your maintenance routines (so, here they're cluttering up your cache for sure!)
    4. Wasting space in your backups
  3. You might be able to reduce your overall indexes with index consolidation...

So, for this post, I'm going to target #1 - are there any indexes that just aren't being used at all...

First, how do you know if your indexes are being used?
In SQL Server 2005 and higher, there's a DMV (dynamic management view) called sys.dm_db_index_usage_stats and it's there to track index usage patterns. However, it's not persisted since the beginning of time and as a result, if you look at this and believe that it's telling you ALL of the indexes that have been used in your database - then you might be mistaken. The index usage stats DMV is cleared when SQL Server is restarted as well as when you detach/attach the database or when you backup/restore the database.

Therefore, you don't want to just run the following query and drop all of the indexes that aren't being used. A better way to "trust" this information is to periodically persist the data from the DMV in your own table and then query it after you've completed a business cycle's worth of activity - logging all of the usage stats. Then you can trust this much more. Again, here are a couple of negatives:
1) it's not persisted
2) it only keeps the database_id and the index_id (which could change over time). You're right in thinking it probably shouldn't change but, a nonclustered index's ID is not permanent so, it's better to track the index name in addition to the index_id. And, when you run your queries to determine what to delete, you can easily verify indexes against the current indexes because your comparison is within the same db (more on this below).

If you want to persist this, then you have two ways to do this:

Store the index usage patterns in a table within the specific database you're tracking:

Pro: it goes with the database when you back it up, etc. and, it's easier to reverse engineer which actual indexes you're referring to (grabbing the names and not just the IDs).

Con: it's a bit more complex of a query to run and you'll need to run it for all of your databases (ok, it's really not all that bad - but, using something like sp_msForEachDB will really help)

Store the index usage patterns from all databases in a table within master or your own "performance database":

Pro: you only need one job to handle all the index usage info AND object_name *does* support TWO parameters (object_id and database_id) so, as long as you trap the name at the time of insert then you'll be good.

Con: it doesn't go with the database (e.g. backup/restore - and if you're restoring to a test system and you want to see what the usage patterns were then you'll need to get this information as well...)

Here's a simple query that you can run that shows all the indexes used right now - and adds the databasename/objectname into the results - in a persisted table you'll also want to add the runtime:

SELECT getdate() AS RunTime
, DB_NAME(i.database_id) as DatabaseName
, OBJECT_NAME(i.object_id, i.database_id) as ObjectName
, *
FROM sys.dm_db_index_usage_stats AS i
WHERE object_id > 100

And, if you want to get a few more insghts into how to persist this on Paul's blog here: http://www.sqlskills.com/BLOGS/PAUL/post/Indexes-From-Every-Angle-How-can-you-tell-if-an-index-is-being-used.aspx.

OK, so, you have a few options to think about and I have a few more parts to post!
kt

PS - The Tipping Point is coming soon too. I'm still adding a few things to that one!!! ;-)

OK, I've been pretty quiet this week... I had high hopes that I'd blog at least once or twice this week but the week turned out to be crazy busy as I presented T-W-Th at the SQL MCM (Microsoft Certified Masters) and then Fr at the Sharepoint MCM. The groups were really fun and wanted to go deeper and deeper into the technologies so on most days, I didn't even end lecture (which started at 8am) until 6:30+. So, it was a looooonnnggg week.

If you're interested, here's a link to the "very intense but fun" MCM programs: Microsoft Certified Master Program

Tomorrow, Paul and I fly out at 6am (to Florida) for SQLConnections. It's a packed week at Connections (a precon, a postcon and 5 sessions between us) and we're really looking forward to seeing many of you there! I'll be back to blogging soon as I know I still owe you the results/answers for the Tipping Point queries.

In the interim, check out Sean's Midnight DBA videos... he seriously needs to get some sleep! ;-)

Cheers to you all!
kt

Categories:
Resources

OK, I was tagged by Ward Pond here. Ward was tagged by Jimmy May here. Jimmy was tagged by Kevin Kline here. Kevin was tagged by Chris Shaw here. Chris was tagged by Michelle Uford here. And, Michelle was tagged by Mike Walsh here.

Ah... at least I know who to blame. Ah, thanks Mike! ;-)

So, as for some things I know now that I wish I knew then and/or what I value the most. First, I took a peek at a lot of the other posts. There are some really good tips and tricks in all of them and so it's hard to come up with something really clever. But, I had to admit the following:

You don't know what you don't know..

Yeah, I know. Then you don't know. So, does it matter?! Actually, yes. There have been times where I wished I had listened better before I jumped to what turned out to be the wrong solution. Sometimes taking more time at design - can REALLY help. This is something that I've been working really hard to prove in a lot of my recent posts. We've all thrown together a schema... got things working... and then found out it wasn't ideal... But, how much can you really know if you don't know the workload? Well, this is where the listening part comes in. Talk to your users (as painful as it may be ;). Talk to the stakeholders. Even if you can't figure out the specific queries that are going to run, give them sheets of paper on which they can draw their own dialogs and tell you what they'd LIKE to see (tell them to visualize what they want to see on the screen - on the paper). This is an ideal way of getting to the meat of what they want/need - without getting lost in the "glitz" of the UI. Even if you only have some insight into what they'll be doing - some is WAY better than none and the more you listen - the better you'll design. The better you design; the better it will scale/perform.

You can't know it all!!!

Give up now. Don't try to know everything - you'll never sleep. I work really hard knowing a lot about a lot of different things - but, I still specialize AND, I'm HAPPY to say "I don't know" (and say it often). In fact, what I pride myself on the most is that I'm never afraid to prototype and/or try something out. One of my favorite sayings for SQL Server is: "Jack of all trades, master of SOME." Know enough to know you need to learn more (because you plan to use that feature)... And, know enough to know when a feature is NOT appropriate. But, don't get lost in all of the minutia. We just don't have time for it. Choose your battles.

And, since I'm starving and I know when to quit. ;-) I'm going to tag a few folks including a couple of NON-SQL folks and hear what they say!! And, Paul was also tagged - check out his post here: http://www.sqlskills.com/BLOGS/PAUL/post/Things-you-know-now.aspx.

And, I'd love to hear what you know now!

Thanks,
kt

Categories:
Opinions | Tips

This is a tough topic. It's a big topic and more than any other - I think there are a lot of misunderstandings about what the log is for, why it's so critical and ESPECIALLY when/why it gets extrememly large. Simply put, it gets large when it's not managed correctly. OK, there are times when it can become large - even if it is well managed. But, more often than not, when a transaction log is wildly out of control (orders of magnitude larger than the data itself) it indicates a management/maintenance problem.

There are a lot of places where you can go to find out the technical details behind the transaction log but I'm going to target this blog post to the relatively straightforward easy (no, really easy!!) facts about transaction log maintenance.

What kind of transaction log management is right for YOUR database?

First and foremost, you MUST decide whether or not you need to do log backups. SQL Server *requires* you to make some form of decision. Well, I take that back. They don't tell you anywhere that you need to make this decision but the transaction log can get wildly out of control if you don't (see the next section for more details on this one :)).

Why? Transaction log backups will allow you better recovery options in the event of a disaster. If you create a good backup strategy, you should be able to recover from a disaster very close (possibly even up-to-the-minute) to the time of the disaster. Howevre, you are not required to do log backups. Instead you can do only database-level backups and recover with those. That's fine. There's really nothing wrong with that strategy. However, it does mean that you have a greater potential for data loss. Basically, if you decide that you're doing to do weekly full backups - then you need to be OK with losing everything that's happened since your last full backup. If that's OK, then performing full database backups (and never worrying about the log) is absolutely fine.

However, if you want more granular control and more recovery options (again, possibly even up-to-the-minute recovery - which is transactional recovery up to the time of the disaster), then you MUST add transaction log backups into your disaster recovery strategy.

So, make this decision FIRST:

  1. Am I OK with some data loss? (then you're probably OK with just database-level backups... but, you will need to do something else! be sure to keep reading!!!)
  2. Do I want to minimize data loss to the smallest amount possible? (then you're going to want to AUTOMATE transaction log backups)

But I didn't do anything - why is the log WAY out of control (in terms of size)?

OK, even if you consciously make the decision to ONLY do database-level backups, you are NOT DONE!!! In fact, this is actually what led me to do this post. I found these two (relatively dated but interesting nonetheless) MSDN forum discussions for TFS (Team Foundation Server) databases:

    MSDN Forum discussion "Recommended SQL Maintenance Plan": http://social.msdn.microsoft.com/forums/en-US/tfsadmin/thread/b23f7018-3eaa-4596-96e4-728b02cf6211/ 
    MSDN Forum discussion "Huge log files":
http://social.msdn.microsoft.com/forums/en-US/tfsadmin/thread/605d51f7-23fd-470c-945e-53fa7ed5aa87/

And, I know EXACTLY what happened in ALL of these cases (and MANY more... Paul and I see this ALL the time, in fact). In the "Huge log files" thread, there's a database mentioned (TfsWareHouse) with a 124MB mdf and a transaction log of 61.8GB. It didn't mention whether or not there were other data files but my guess is that there weren't. My guess is that they were completely shocked by why the data portion had grown to a size that's 510 TIMES the size of the database... The reason is actually somewhat simple (no pun intended). If you're not going to do transaction log maintenance (meaning transaction log backups), then you need to tell SQL Server that. (This is the part that's completely unexpected.)

When a database is created, SQL Server runs that database in a "pseudo simple recovery model". (Yes, I know - that didn't help.) What that means is that SQL Server automatically clears inactive records from the transaction log once it knows that it no longer needs them. It no longer needs them to be stored in the log because no one is using the log (i.e. you're not doing ANY backups). However, once you do start to do backups (and, people generally start by doing a full database backup), then SQL Server looks to your recovery model to determine what to do with log records. If the recovery model is set to full (and, yes, this is the default), then SQL Server gives you the "full feature set" with regard to backup/restore. SQL Server is expecting YOU to manage the transaction log by backing it up. Once it's backed up, SQL Server can remove the inactive records from the transaction log (and when you do a transaction log backup, it automatically clears the inactive records by default).

So, there are really two choices - and ONLY two choices here:

  1. Perform transaction log backups as part of your maintenance plan
  2. Change the recovery model to the SIMPLE recovery model so that SQL Server clears inactive transactions from the log automatically

Is there anything else to do for the transaction log? 

Yes! If you decide that you want to do transaction log backups then I would recommend a few things. I'd first recommend reading 8 Steps to Better Transaction Log Throughput and when you decide how large your transaction log needs to be, then also read Transaction Log VLFs - too many or too few?. These two posts will help you to create a more appropriately sized log as well as one that won't be prone to performance problems (such as internal VLF fragmentation).

If you want to learn more about the transaction log, I'd suggest a few of Paul's resources (it's probably because he has such a fantastic tech editor... oh, I'm asking for trouble with this comment!! ;-):

  1. Read Paul’s blog post to his TechNet article on Logging & Recovery. It’s a great article that covers a lot of different aspects of logging. He also did a great short video on why the transaction log grows wildly out of control. Here’s a link to the blog post that pulls all of the TechNet resources together: http://www.sqlskills.com/BLOGS/PAUL/post/TechNet-Magazine-feature-article-on-understanding-logging-and-recovery.aspx.
  2. Read Paul’s blog post to his TechNet article on Database Maintenance. It’s a great overview of all of these maintenance tasks and will give you a good overview of what each one does. Here’s a link to the blog post that pulls all of the TechNet resources together: http://www.sqlskills.com/BLOGS/PAUL/post/TechNet-Magazine-Effective-Database-Maintenance-article-and-August-SQL-QA-column.aspx

OK, so, I think that sums up part III. I think that's the last one in the series for now. I'll go through and explain "The Tipping Point" next. However, I was hoping for more results to my brain teasers (in those two posts)!!

Cheers,
kt

When you decide to rebuild or reorganzie an index, you have an option to set something called FILLFACTOR. This is probably the MOST IMPORTANT thing to understand about index maintenance and reducing fragmentation (especially in databases that are prone to it). Unfortunately, we need to dive into some internals to really understand why this is so helpful...

What is FILLFACTOR?

FILLFACTOR defines how full the leaf level pages of the index are filled when rebuilding or reorganizing an index. The leaf level of an index is the largest structure of an index and it's the most prone to fragmentation. A good analogy would be to use the phone book (ah, loosely for this one) and imagine that new people come to the area. The *entire* phone book needs to get reprinted. Obviously this is terribly expensive and therefore it's not done often (well, we all use the internet anyway...). But, imagine if EACH page of the phone book were only filled to 90%. When new people come to the area we could just add their information - on the specific page where it should reside - and the phone book stays relatively intact and we don't have to kill a bunch of trees (OK, you could argue that you would have had to kill more trees to make a phone book that has 10% whitespace...but, let's not go there right now. :). OK, I know this analogy is weak at best but think back to "a page in the phone book" and think of it representing a "page" within SQL Server (quite literally here - SQL Server actually stores data on 8KB pages and indexes are doubly-linked lists of these 8KB pages).

So, if the pages are full and there's nowhere to put a particular row (let's say someone with a lastname of 'Tripp'), how does SQL Server "insert" this new row? They can't possibly shift all of the data down and rewrite all of the pages AFTER 'Tripp' could they? I suppose they could but it would get REALLY expensive - especially if someone with the last name of 'Anderson' moves to town...

How does SQL Server make room for new rows? They do something called a page split. Now, this can be A LOT more technical than I'm going to explain here but simply put, they split the page (on which the row is to be located) into half. The first 50% of the rows STAYS on the current page and the 50% MOVES to a new page (and this new page is probably nowhere near the page being split). However, they keep this list LOGICALLY in order by maintaining a doubly-linked list. So, SQL Server knows where to go when it's reading the data - even if the pages are not physically/sequentially in order. If this happens once or twice, it's not a huge problem. But as it continues to happen, it makes scans of your data more and more expensive and the process of the split can be expensive in and of itself. The biggest problems are:

  1. The data is not physically in order (and can create lot random IOs when scanning)
  2. The pages are no longer very dense (they've gone from 100% to 50%). Splitting does has a positive side in that it's going to take awhile before the page splits again BUT, splitting is meant as a temporary fix and ideally, it shouldn't be happening to every page. But, this also negatively impacts cache - you're only using 50% of the page that's split so now you're wasting half of that page in memory (and, on disk). So, while making a table 10% wider from the start seems bad; it reduces the need for splitting and therefore your table will stay more intact over time. 

What indexes have fragmentation?

Some indexes are prone to more fragmentation than others. For example, indexes with a high-order element (the first column of the index) that is NOT in ever-increasing order. This does NOT mean that ALL indexes should be created such that their first column is ever-increasing (that's NOT a good way to create indexes!). But, it does mean that you should pay special attention to some of your larger tables and therefore larger indexes (meaning ESPECIALLY the clustered index since the clustered index is over ALL of the columns). 

In terms of index management, here are the main things to think about for each index type: 

Rebuilding the clustered index is the most expensive index to rebuild. It's also the one that you'd rather avoid if you can. It has the most limitations... A clustered index CANNOT be rebuilt as an online operation IF the table has ANY LOB columns in it at all. What this means is that the clustered index - if fragmented - must be rebuilt using an offline operation. This translates into downtime for that table - while it's being maintained. So, if you can create a good clustering key EARLY, then you might be able to minimize clustered index maintenance.

Rebuilding a nonclustered index is much easier and likely to be an ONLINE operation. Because nonclustered indexes contain a much smaller number of columns - and rarely ever include a LOB column - they can be managed/maintained through ONLINE index operations. So, even if these are prone to more fragmentation (and, in general, they are) then it's not as big of a deal.

How do I solve this problem?

So, the BEST thing to do is to visually picture the ORDER of the data in the index (just think of it like a phone book - a LIST of records on PAGES - ordered by the KEY of the index). If you can visualize that structure then you can visualize what happens on an INSERT, UPDATE and a DELETE. Once you have a good visual you should be able to define a FILLFACTOR based on these two things:

  1. How much fragmentation is happening (is it becoming massively fragmented quickly) - then set a lower fillfactor (like 70 or 80 - depending on how big the table is)
  2. How frequently are you going to be running maintenance (if you can do this daily or weekly - for some objects) then you might not need much free space. If you can run maintenance against the table frequently then you won't need as much free space.

And, of course, you can directly check the level of fragmentation your table has by using the DMV: dm_db_index_physical_stats (which I referenced in Part I) and ideally, you'll create a procedure that runs through your objects determining how much fragmentation they have and the appropriate FILLFACTOR. Sometimes the best way to do this is to create your own "management table" which lists all of your tables and then have definitions for when to rebuild, when to reorg and what fillfactor to use. There are a few folks that have done things like this. Check out Ola's site here. He's put a lot of effort into a more robust and flexible combination of options for database maintenance. It might be more than you need but it's a GREAT EXAMPLE of all of the things that you can evaluate and consider wrt to maintenance.

Solving this using Database Maintenance Plans

Finally, if you CHOOSE to use a maintenance plan you ONLY have two options:

Reorganize pages with the default amount of free space
Does NOT sound like what it actually does (IMO). It sounds like it uses some form of SQL Server default value. However, it actually uses the "OrigFillFactor" value that's set for the index. This is actually a good thing IF you've set the FILLFACTOR on the CREATE INDEX statement OR on a rebuild. Reorganizing does NOT allow you to set this ONLY a rebuild does. So, if you have set this then this choice is probably best. If you run a query against sysindexes (yes, the backward compatibility view NOT the new sys.indexes catalog view), then you can see the OrigFillFactor setting for ALL of your indexes. (And, it's probably 0 - which sounds strange... 0 actually means 100... I could explain this but just trust me on this one.)

Change free space per page percentage to _____ %
Well, this is certainly an option you can consider. If you've never set the FILLFACTOR (or, even if you have...), this would overwrite it AND change the OrigFillFactor value to the percent you use here. If you decide to take this "sledgehammer" approach then I'd suggest something quite high - like 90-95% so that you don't end up wasting a lot of space. Some indexes (ones that have a high-order element that IS ever-increasing) don't really need free space and this might be completely wasted. So, I think it's better to set this indivually using stored procedures and/or a more custom maintenance procedure.

NOTE/TIP FROM COMMENTS POSTED: While I explain this concept (above) using the index option/setting (called FILLFACTOR) the UI is wanting the FREE SPACE percentage. So, if you want to use 90-95% FULL then be sure to set this value to 5-10%. Using the syntax, you will set page fullness.

OK, I still have at least 2 more in this series and EVEN THOUGH I've been completely sucked into facebook (as Paul blogged here), I'm still trying to manage to work and save SQLskills. Paul, on the other hand, just uploaded more photos. ;-)

Cheers,
kt

OK, it seems as though there's A LOT of confusion about what steps are required for proper database maintenance. And, it seems as though some recommendations are being given as "quick fixes" without any real recommendation for root cause analysis. I'm not saying that the generalizations are horribly wrong but in many cases they're just too broad and/or unspecific to actually be useful (and, well, in all honesty, some of them are just really bad recommendations because they’re so ambiguous). And, in my random internet trolling for the day, I found 4 different references that I want to go through (which is why this is only Part I). For this post, I’ll focus just on Sharepoint.

First, what did I see that’s motivating this post?
I found the following KB article – which was referenced by numerous sites as recommended reading. And, without knowing a lot about SQL (and, that’s NOT meant as a dig at all – it’s just that most apps that sit on SQL don’t ever even recommend that you need to know SQL and I can argue certain aspects of that point as well BUT, wrt to maintenance, it can really become a problem if you don't know a few things about these tasks), it does seem like good reading: Information about the Maintenance Plan Wizard in SQL Server 2005 and about tasks that administrators can perform against SharePoint databases

Here is the part that over-simplifies picking what maintenance tasks to run vs. what not to run:

DIRECTLY TAKEN FROM THIS KB ARTICLE IS THE FOLLOWING:

We have tested these tasks and the effects that these tasks have on database schema and performance. The following table summarizes the results of these tests.

Task                                        Safe to perform this task?
Check database Yes
Reduce a database Yes
Reorganize an index Yes
Clean up the history Yes
Update statistics Yes. However, this task is unnecessary because the SharePoint Timer service performs this task automatically.
Rebuild an index No. The task does not restore existing index options before the rebuild operation. However, you can use scripts that restore index options.
Note This problem was corrected in SQL Server 2005 Service Pack 2.

We used the following criteria to determine whether a task was safe to perform:

  • Whether the task modified the database schema from its default state
  • Whether the task decreased performanceResults may vary depending on the environment.

However, if you use the Maintenance Plan Wizard to perform the tasks that are listed in the table as "safe to perform," you are likely to experience increased performance in SQL Server 2005.

The big problem is: this is just too little information about too many VERY important tasks!

Let me break this down task by task and give you a few other places to go for more information.

Check database

The check database task refers to DBCC CHECKDB. This is definitely an important part of any maintenance plan. And, it really is a safe task to run as it’s NOT corrective by default. However, there is nothing mentioned about how this command may completely flush your buffer pool as it reads all of the pages of all of the objects it’s checking. So, this might impact performance but, of all of the tasks, this is the safest to run and it’s definitely a recommended task.

If you want to learn more about DBCC CHECKDB, check out these things: 

  1. Read Paul’s blog post to his TechNet article on Database Maintenance. It’s a great overview of all of these maintenance tasks and will give you a good overview of what each one does. Here’s a link to the blog post that pulls all of the TechNet resources together: http://www.sqlskills.com/BLOGS/PAUL/post/TechNet-Magazine-Effective-Database-Maintenance-article-and-August-SQL-QA-column.aspx  
  2. Read Paul’s blog post on Myths around causing corruption – so that you can get better insight into where/why the actual corruptions are occurring. 
  3. Finally, if you’re really interested in the internals of CHECKDB and how it works – Paul’s written a ton about it in his CHECKDB from Every Angle category. FYI, 3 of his 9 years on the SQL Server Development Team were spent writing CHECKDB and repair for SQL Server 2005 (so, he definitely knows how it works J). Here’s the link to the category: http://www.sqlskills.com/BLOGS/PAUL/category/CHECKDB-From-Every-Angle.aspx

Reduce a database 

OK, I’m sure I’ll get a lot of responses to this one but IMO, a database maintenance plan SHOULD NEVER INCLUDE A SHRINK.

Let me explain… J
To be honest, I'm not even a fan of manually running database-level shrinks (DBCC SHRINKDATABASE) either. Don't get me wrong - there are ACCEPTABLE times to shrink parts of a database but, in general, I'd recommend only using DBCC SHRINKFILE for individual file-level shrinks. I wouldn't schedule shrinks nor would I EVER turn on [the database option] autoshrink. I don't think shrinks should EVER be automated - either through the database option OR through maintenance plans.

If you need to do regular shrinks - then it's likely that you have some other problem. And, without DIRECTLY addressing this problem, you *might* be making things worse.

This is a bigger discussion and I’ve found a few other references that I want to pull together. I’ll post another post about this within the next day or so – and link to it from here BUT, for right now…Know this – free space is generally GOOD. Excessive free space has happened FOR A REASON. Maybe there’s a pattern to it but often shrinking is worse than just leaving the free space for the next data explosion (a bunch of data comes in, the database grows, the data is archived, the free space remains for the next set of data that comes in).

If you shrink the database you might make things worse by fragmenting everything. Paul’s video that goes with the TechNet article on Database Maintenance shows you the [shocking if you didn’t know this] effect of shrinking a database on indexes.

Reorganize an index, Update statistics and Rebuild an index

These need to be grouped together to start because this KB article does NOT address the impact of running these together. In fact, the problem, if you run these together – is that you MIGHT make things worse. First, let me give you an overview of each:

Reoganizing an index removes fragmentation in the largest part of an index (it’s called the leaf level of the index) and removing fragmentation in this level has the greatest (and positive) effect on range query scans and cache. So, this is really the most important type of fragmentation to remove. However, this is NOT the only way to do it… 

Rebuilding an index completely and totally removes ALL forms of fragmentation in all levels of an index; however, this is the most expensive (yet most effective) way to do it. As a result of rebuilding an index, SQL Server also updates the statistics for the indexes that were rebuilt. Therefore you do not need to update statistics OR reorganize an index if it gets rebuilt.

Updating statistics is important for query processing and optimization. The query processor uses statistics on your data to help determine how many rows will be processed by your query/statement. If SQL Server can accurately estimate the rows, then it can choose a more effective plan. However, if it doesn’t have good statistics, then it may not do as good of a job at accurately estimating rows and therefore it might not come up with as optimal of a plan. So, this is an integral part to good database health. However, some of this might be done via the database option: auto update statistics which is ON by default in SQL Server (and, YES, you should leave this on). Check out this post on: Auto update statistics and auto create statistics - should you leave them on and/or turn them on??

However, if you use a maintenance plan then I really see two problems: 

  1. You’ll end up doing maintenance on things that may not need it. The default behavior for these tasks is just to run them on the selected objects. And, since many people will choose all objects (possibly even of all databases) then you’ll probably select objects that won’t really need this as frequently as you run this maintenance plan. 
  2. You might end up running a combination of things that either – wastes cycles/CPU and a MASSIVE amount of log space (which can translate into all sorts of concerns for DR technologies like database mirroring which will need to send all log rows to the secondary server). For example, if you run ALL three of these things then they’ll have to be run in a certain order (you can change this in a maintenance plan). However, the default order is: Reorganize Index(es), Rebuid Index(es), Update Statistics. This means that the work that’s done by reorganizing is effectively wasted as the rebuild would have taken care of it AND the work that’s done for updating statistics could mean that they update statistics TWICE (during the rebuild AND after) and the end result is WORSE because the update statistics command might use a sampling mechanism to generate statistics (which can lead to LESS EFFECTIVE statistics information). However, this is ONLY if you change the wizard’s default. The default is for the updating statistics command to do a “full scan”. So, even if the statistics will end up being the same – it’s still problematic because it means that for all indexes you’ve just rebuilt – you’ve now updated their statistics TWICE.

SUMMARY

A database maintenance plan is CRITICAL for best performance (especially for databases that are prone to some of the problems corrected by these maintenance tasks (yes, you can read SHAREPOINT into that statement). Sharepoint uses GUIDs as PRIMARY KEYs (read this post to hear more about the side effects of this choice) and as a result, as clustering keys. This means that many Sharepoint tables are prone to [potentially a MASSIVE amount of] fragmentation.

You absolutely need to have a maintenance plan. But, what should it be?

My absolute preference is to NOT use the Database Maintenance Plan Wizard UNLESS you really know what you’re doing. It just doesn’t give enough prescriptive advice. And, if you just select the defaults, you will end up with an inoptimal maintenance plan.

A better approach would be to create your own maintenance plan. If you write the code yourself (or leverage one of the custom ones that are already out on the web) then you can strategically target ONLY the objects that have the warning signs and/or are out of date and you can set when to rebuild vs. when to reorganize (generally people rebuild if a table has more than 30% fragentation and they reorganize when it's less than 30%). Fragmentation is something that can be detected programmatically using the DMV: dm_db_index_physical_stats (in SQL 2005/2008) or by using DBCC SHOWCONTIG (in SQL 7.0/2000). Here are a few places to go to see the more flexible and programmatic way of rebuilding/reorganizing indexes:

Smart Indexing Part II - Conditional Rebuilding a blog post (with conditional index rebuild code) from SQLMCA Bob Duffy (a good friend who is located in Dublin, Ireland and whose wife (Carmel) just had a baby last week – congrats again Bob!! You guys are seriously outnumbered now!!!) here: http://blogs.msdn.com/boduff/archive/2007/06/08/smart-indexing-part-ii-conditional-rebuilding.aspx

Custom Index Defrag / Rebuild Procedures - a blog post with some posted code as well. http://www.sqlstuff.info/post/2008/03/Custom-Index-Defrag--Rebuild-Procedures.aspx

Rebuild and Reorganize Indexes in SQL 2005 – an article (with conditional index rebuild code) from SQL Server Central here: http://www.sqlservercentral.com/scripts/31857/  (NOTE: You will need to become a subscriber to get to this article.)

Rebuild Only the Indexes that Need Help - an article by Andrew Kelly (SQL MVP) on SQL Server Magazine here: http://www.sqlmag.com/articles/index.cfm?articleid=99019&pg=1 (NOTE: You will need to become a suscriber to get to the full text of the article.)

Or, build your own! Check out the BOL topic for the sys.dm_db_index_physical_stats for SQL 2005 here: http://msdn.microsoft.com/en-us/library/ms188917(SQL.90).aspx, Example D has sample code to help you get started! For SQL 2008 it’s here: http://msdn.microsoft.com/en-us/library/ms188917.aspx. It’s still Example D for the sample code to leverage. J

The most important thing I can tell you is that a SMALL amount of time getting familiar with what’s really happening in SQL as well as WHY it’s happening to you WILL BE A LOT MORE PRODUCTIVE then just slamming in a maintenance plan that solves some problems but probably creates others.

Hope this helps!
kt

OK, so this is interesting. I've got a few answers to my last survey (Tipping Point Query #1) and well, there's a good mix of answers (and, yes, some are correct! ;)). Be sure to go back and review that last post so that you can evaluate it and these two tipping point questions completely. So, now I want to see if people really know the basis of "the tipping point".

Try these two:

Tipping Point Query #2

Table1 (t1) has 1 million rows at 100 rows per page. The table has 10,000 pages. A nonclustered index exists (on name) but it does not cover the query. At what percentage (of the table) is this nonclustered index no longer selective enough to use:

Tipping Point Query #3

Table2 (t2) has 1 million rows at 2 rows per page. The table has 500,000 pages. A nonclustered index exists (on name) but it does not cover the query. At what percentage (of the table) is this nonclustered index no longer selective enough to use:

OK, so I'd really love to see quite a few responses to these *3* "tipping point" questions. I PROMISE to do a nice long (and detailed) post for what is the actual tipping point AND the answers to all three of these questions. I'll explain the math as well as how you can generalize "what is selective enough" so that you can better create your nonclustered indexes!!!

Thanks for reading - and responding to these brain teasers!!

Cheers,
kt

PS - It's snowing here (ah...again)... maybe I'll spend the day creating brain teasers??! Do you guys like this kind of a post? (well, I suppose you won't really know until I post the answer part of it... but, just in general??). I think it's pretty cool. But, don't worry, I won't (nor will Paul) make all of my posts surveys. But, I think this is a really good one. I'm anxious to see if the asnwers come in correctly for these two as well! Have at it!

Along the same lines of improving database design and getting better performance on SQL Server (which [IMO] DOES take an experienced SQL Server database developer - but, we'll talk more about "whose job this really is" in many more posts and probably even a RunAs - which Richard and I just setup to record on Thursday (Mar 12)), I started thinking about how I could convince people of why they NEED a database developer. So, I thought I'd ask this VERY important question...

What percentage of data IS selective enough to use a nonclustered index which doesn't cover the query... in other words (just in case you're not entirely sure of what I mean :)), think of indexes in the back of a book... if you need to go to the back of the book to reference a bunch of data (this is called a [bookmark] lookup in SQL Server), there's a point where the randomness of the lookups (especially if you think in terms of many rows on a page) becomes too expensive. For example, imagine that the index is customer name and the data (the book) is customer orders - and, each page (of this rather weird book ;)), has 20 orders on it. Doing a query to lookup customer number 12's orders might be really easy (if they only have only a few orders) BUT, what if the query is "show me all of the orders for people that have an 'e' in their name". First, the number of people have have an 'e' in their name is probably better than 50% (that's TOTALLY a guess) and, if there's 20 orders per page then a lookup from the index into the book would require SQL Server to touch every page roughly 10 times. If the table has 50,000 pages (therefore 1 million rows - at 20 rows per page), then to find the 500,000 rows (remember, I'm estimating half), SQL Server would have to do 500,000 bookmark lookups. For a table with only 50,000 pages that's terribly expensive.

So, here's the question - what's the tipping point? When is a nonclustered index on customer name NOT going to be used to lookup rows of sales orders? I'm going to use a survey to see what you think and then within a week, I'll give the specific SQL Server math AND a query you can run within your own DBs to see EVERY one of your table's "tipping points". It's really interesting and I think will really help you to understand why SQL Server might not be using those nonclustered indexes.............

Cheers,
kt

Expanding on the topic of "are you kidding me"... one of the MOST PREVALENT problems I see today is the dreaded "GUIDs as PKs" problem. However, just to be clear, it's not [as much of a] problem that your PRIMARY KEY is a GUID as much as it is a problem that the PRIMARY KEY is probably your clustering key. They really are two things BUT the default behavior in SQL Server is that a PRIMARY KEY uses a UNIQUE CLUSTERED INDEX to enforce entity integrity. So, I thought I'd take this post to really dive into why this is a problem and how you can hope to minimize it.

Relational Concepts - What is a PRIMARY KEY? (quick and basic reminder for what is what and why)

Starting at the very beginning... a primary key is used to enforce entity integrity. Entity integrity is the very basic concept that every row is uniquely identifiable. This is especially important in a normalized database because you usually end up with many tables and a need to reference rows across those tables (i.e. relationships). Relational theory says that every table MUST have a primary key. SQL Server does not have this requirement. However, many features - like replication - often have a requirement on a primary key so that they can guarantee which row to modify on a related database/server (like the subscriber in a replication environment). So, most people think to create one. However, not always...

What happens when a column(s) is defined as a PRIMARY KEY - in SQL Server?

The first thing that SQL Server checks is that ALL of the columns that make up the PRIMARY KEY constraint do not all NULLs. This is a requirement of a PRIMARY KEY but not a requirement of a UNIQUE KEY. They also check to make sure (if the table has data) that the existing data meets the uniqueness requirement. If there are any duplicate rows, the addition of the constraint will fail. And, to check this as well as to enforce this for [future] new rows - SQL Server builds a UNIQUE index. More specifically, if you don't specify index type when adding the constraint, SQL Server makes the index a UNIQUE CLUSTERED index. So, why is that interesting...

What is a clustered index?

In SQL Server 7.0 and higher the internal dependencies on the clustering key CHANGED. (Yes, it's important to know that things CHANGED in 7.0... why? Because there are still some folks out there that don't realize how RADICAL of a change occurred in the internals (wrt to the clustering key) in SQL Server 7.0). It's always (in all releases of SQL Server) been true that the clustered index defines the order of the data in the table itself (yes, the data of the table becomes the leaf level of the clustered index) and, it's always been a [potential] source of fragmentation. That's really not new. Although it does seem like it's more of a hot topic in recent releases but that may solely because there are more and more databases out there in general AND they've gotten bigger and bigger... and you feel the effects of fragmentation more when databases get really large.

What changed is that the clustering key gets used as the "lookup" value from the nonclustered indexes. Prior to SQL Server 7.0, SQL Server used a volatile RID structure. This was problematic because as records moved, ALL of the nonclustered indexes would need to get updated. Imagine a page that "splits" where half of the records are relocated to a new page. If that page has 20 rows then 10 rows have new RIDs - that means that 10 rows in EACH (and ALL) of your nonclustered indexes would need to get updated. The more nonclustered indexes you had, the worse it got (this is also where the idea that nonclustered indexes are TERRIBLY expensive comes from). In 7.0, the negative affects of record relocation were addressed in BOTH clustered tables and heaps. In heaps they chose to use forwarding pointers. The idea is that the row's FIXED RID is defined at insert and even if the data for the row has to relocate because the row no longer fits on the original page - the rows RID does not change. Instead, SQL Server just uses a forwarding pointer to make one extra hop (never more) to get to the data. In a clustered table, SQL Server uses the clustering key to lookup the data. As a result, this puts some strain on the clustering key that was never there before. It should be narrow (otherwise it can make the nonclustered indexes UNNECESSARILY wide). The clustering key should be UNIQUE (otherwise the nonclustered indexes wouldn't know "which" row to lookup - and, if the clustering key is not defined as unique then SQL Server will internally add a 4-byte uniquifier to each duplicate key value... this wastes time and space - both in the base table AND the nonclustered indexes). And, the clustering key should be STATIC (otherwise it will be costly to update because the clustering key is duplicated in ALL nonclustered indexes).

In summary, the clustering key really has all of these purposes:

  1. It defines the lookup value used by the nonclustered indexes (should be unique, narrow and static)
  2. It defines the table's order (physically at creation and logically maintained through a linked list after that) - so we need to be careful of fragmentation
  3. It can be used to answer a query (either as a table scan - or, if the query wants a subset of data (a range query) and the clustering key supports that range, then yes, the clustering key can be used to reduce the cost of the scan (it can seek with a partial scan)

However, the first two are the two that I think about the most when I choose a clustering key. The third is just one that I *might* be able to leverage if my clustering key also happens to be good for that. So, some examples of GOOD clustering keys are:

  • An identity column
  • A composite key of date and identity - in that order (date, identity) 
  • A pseudo sequential GUID (using the NEWSEQUENTIALID() function in SQL Server OR a "homegrown" function that builds sequential GUIDs - like Gert's "built originally to use in SQL 2000" xp_GUID here: http://sqldev.net/xp/xpguid.htm

But, a GUID that is not sequential - like one that has it's values generated in the client (using .NET) OR generated by the newid() function (in SQL Server) can be a horribly bad choice - primarily because of the fragmentation that it creates in the base table but also because of its size. It's unnecessarily wide (it's 4 times wider than an int-based identity - which can give you 2 billion (really, 4 billion) unique rows). And, if you need more than 2 billion you can always go with a bigint (8-byte int) and get 263-1 rows. And, if you don't really think that 12 bytes wider (or 8 bytes wider) is a big deal - estimate how much this costs on a bigger table and one with a few indexes...

  • Base Table with 1,000,000 rows (3.8MB vs. 15.26MB)
  • 6 nonclustered indexes (22.89MB vs. 91.55MB)

So, we're looking at 25MB vs 106MB - and, just to be clear, this is JUST for 1 million rows and this is really JUST overhead. If you create an even wider clustering key (something horrible like LastName, FirstName, MiddlieInitial - which let's say is 64bytes then you're looking at 427.25MB *just* in overhead..... And, then think about how bad that gets with 10 million rows and 6 nonclustered indexes - yes, you'd be wasting over 4GB with a key like that.

And, fragmentation costs you even more in wasted space and time because of splitting. Paul's covered A LOT about fragmentation on his blog so I'll skip that discussion for now BUT if your clustering key is prone to fragmentation then you NEED a solid maintenance plan - and this has it's own costs (and potential for downtime).

So............... choosing a GOOD clustering key EARLY is very important!

Otherwise, the problems can start piling up!

kt

Well, today has definitely been enlightening. I've been around and around with a few folks (mostly Tim Huckaby and Richard Campbell - who are BOTH great friends and people whose opinions I greatly respect!!!) and I've come a conclusion... Paul started down this line of commentary here: http://www.sqlskills.com/BLOGS/PAUL/post/Mandatory-SQL-breathalyzer-test.aspx and this was in reference to something I was saying in our long thread...

But, where I'm at is somewhere that I'm not sure entirely understand AND I need your help. I need to understand where and with whom do you think each role should be.... Here's what I think:

The DBA has to be able to trust the code that a developer submits – plain and simple (yes, this is absolutely true). But, (IMO) it’s still the database developer that writes the code and they’re the ones that need to know what goes on IN the db and the best practices of database design/development. The database developer should know how to create scalable schemas, they should know best practices, they should have a fundamental understanding about their specific platform as well (whether it's SQL Server or DB2 or even that O one :)). At a minimum, they should have a basic understanding of things to avoid... But, I'll give you this - it's not easy and it's not like the best practices really jump out at you. I'll even argue that some of the defaults INSIDE of SQL Server itself aren't ideal (ah, let me give a list: database size and autogrowth settings have HORRIBLE defaults, the PRIMARY KEY of a table is - by default - enforced with a UNIQUE CLUSTERED INDEX (and, well, there are many cases when that's not the best CLUSTERING KEY even though it might be a completely legitimate PRIMARY KEY), databases default to the FULL recovery model but run in a pseudo simple mode where log truncation on checkpoint occurs automatically (ah, until you do your first backup...)). Ugh, many of these things can catch you out and can completely cause you grief... and some are horribly complex to change after the fact (like what is your PRIMARY KEY and/or what is the index type of a key). So, it's definitely NOT easy. BUT, back to my original point - someone needs to know this and I need to find out who YOU think should? I think it's the database developer and NOT the DBA. I think there should be a somewhat blurred line between the database developer and the DBA. IMO, the DBA needs to know operational stuff (backup/restore, availability, security, maintenance… and, this is a HUGE list of things to really understand in and of itself - especially when you're managing this for many servers), it’s not their job (IMO) to know TSQL coding, error handling, stored procedure recompilations (although I can debate both sides of this last one), I would even argue in many cases it's not the DBA that should make the primary decisions on base indexing - I *REALLY* feel that it is the database developer - they should know the database schema, know the general access patterns AND know how the platform uses those objects so that they can make better design decisions EARLY. My favorite line is "the sooner you begin to code, the longer it's going to take" and it's really true.

However, with all of the discussions I've had today (and, I'm debating on posted an edited version of each of our rants... it's colorful in places :)), it feels to me like (and what I’m seeing more and more and more) is something that got mentioned early on in the thread – the database is “just plumbing” and the real application is in the client. Is this true... are there really only two players:

The Application Developer (who doesn't touch (because they're not allowed) OR possibly because they don't want to) the database
The DBA (who doesn't allow the AppDev to touch the database BUT doesn't touch it for them either because they're too busy and/or too protective over changes??)

Seriously, IS there a database developer in this formula above? And, is it a distinct person? I don't think so (personally)... I think the Application Developer SHOULD be the database developer. I think the AppDev should do EVERYTHING in terms of schema, keys, indexes, procedures, etc. AND once in production the two should work together to handle efficient and effective change control policies and procedures. There is a way to make this work. But, when good design doesn't exist - the database performance/scalability suffers, bad practices fester and then everyone blames SQL Server and no one really makes a change. In some cases, people just accept the poor performance as "how it works".

Thoughts??
kt

**** UPDATE March 17, 2009 ****
Be sure to read the comments.
This doesn't appear to be as bad as it seemed...still glad I did the rant though; I learned more about what it's actually doing! :) 
And, it's still good to have good database design - NO MATTER WHAT!
****

OK... today's been a bit of a weird day. I've been doing research for some upcoming events and in general, I'm learning a few things. However, as one does with the web - you can get side-tracked. And, in watching the video (recommended by my good friend Timmie) highlighed in this blog post about DBAs and Developers: http://www.benhblog.com/2009/03/linq-dba-vs-developer.html, I followed a few other links as well and found this direct quote:

"I was surprised to learn that EF decided that since there was not a primary key it would just use all the non-nullable columns as a concatenated primary key.  This might not be what you want."

This quote is also from Ben's blog (and I agree) and specifically this post: http://www.benhblog.com/2008/11/entity-framework-and-tables-with-no.html. And, David Yack (a fellow RD) makes a GREAT comment that you can deselect this but I honestly cannot believe that this is the default???!

My response is....... Seriously, are you FREAKING kidding me?

OK, I don't want to go on a total rant here but this is so bad that I almost cannot believe it. And, well, it's somewhat par for the course with the day that I'm having (with other things that I've learned and been "working on"). I'll come back to more topics that fall under the category of "are you kidding me" in a second. Ah ha... there's a new category for me!!!

Anyway, why is that so bad?

  1. There's no guarantee that all non-null columns are even going to be unique (but, this is the LEAST of the problems)
  2. Making a WIDE composite PRIMARY KEY means that the underlying structure which is a UNIQUE CLUSTERED INDEX won't be as efficient. SQL Server creates this index to enforce the relational rule for a PRIMARY KEY-that it is UNIQUE. So, SQL Server will make this WIDE COMPOSITE SET OF COLUMNS the table's one and only clustered index (ugh - I hope at least it changes the index type to nonclustered.... someone please let me know if this is the case)?? However, if it's not.... then, the clustered index will now force SQL Server to order this table by this combination of columns (which probably don't have any form of pattern) so, this clustered index is probably going to be very prone to fragmentation (which in turn can negatively affect performance). And, even worse, ALL of the non-clustered indexes will ALSO be rebuilt to include ALL of the columns of the clustered index which are not already in the non-clustered index. In fact, this step alone - if a clustered index IS created - will cause ALL of the nonclustered indexes to be rebuilt in addition to the table being rebuilt.

So, please tell me this isn't the case? Sorry if this is a rant but I honestly cannot believe this one.

This is another reason where people just weren't even thinking about what was going on in the backend. Another scenario where a little bit of background knowledge of the "data store" would have REALLY helped.

Oh my?
kt

Just published! A colleague/friend of ours - Mark Pohto - has been working on server consolidation issues and the environmental costs of IT (it's an eye opening article!!!). And, the article "Green IT in Practice: SQL Server Consolidation in Microsoft IT" came out recently:

Here's the link to the The Architecture Journal you can see all of their content (it's a quarterly publication from Microsoft - which I didn't even know)
Here's a link to the article: Green IT in Practice: SQL Server Consolidation in Microsoft IT here.

And, you can even request a free print version of it online... but, that's not very green?! :)

Cheers,
kt

It's an exciting year for us for DevConnections! SQL Server 2008 has now been out for a few months and an SP is coming up soon. This is the sign that some customers wait for to migrate over to the new release feeling that an SP indicates a higher level of stability. But, this is also a time when some companies are shying away from upgrades because of the immediate and very quantifiable costs. And so Paul and I really struggled with what to focus on when we put together our Connections line-up.

What we decided to do is focus on your getting the most from the system that you have now - with best practices that apply to SQL Server 2000, SQL Server 2005 and SQL Server 2008. Personally, Paul and I are going to demo and focus on 2005/2008 but the concepts work on all 3 versions (and even most of the syntax as well - but, for index fragmentation analysis and maintenance the commands changed between 2000 and 2005/2008 so that's one minor difference). For the conference itself, we're focusing on upgrade, new features in 2008 and things to be aware of architecturally in all areas of Administration/Ops, Development and Business Intelligence. And, given that this isn't a "new product year" for SQL, other big conferences are likely to have fewer SQL sessions than usual (and most do...seriously).

So, with SQLConnections you get 46 *SQL* sessions and 3 full-day *SQL* workshops (and workshops on other technologies as well - all of which are spread over 2 pre-con days and 1 post-con day - with none of the SQL ones running concurrently so you could attend all 3). Top it all off with a more intimate event than many others means more interaction to get your tough questions answered! In fact, to help make sure we see as many people as possible, Paul and I usually schedule our sessions before and after lunch so that we can spend the entire lunch gap inside our session room answering even more questions! We even have a session called "Follow the Rabbit" where YOU drive the session with your questions. It's great fun and we've been doing this for the past few years with a lot of success!

If you want to see a bit of the personality and flair offered at Connections - check out MyConnections - it's our conference magazine (note: it's 9.80 MB to download but, it's 84 pages). It's something you get automatically after attending and it's filled with technical articles and all sorts of additional information that comes from Connections. And, here's a link to a fun and fast-paced video with highlights of the conference itself (nothing technical - just fun shots of the event). For example, did you know that EVERY year a Harley is given away at Connections? Here's the quick video: mms://bcast.sswug.org/sswugtv/DevConnectionsFall08.wmv.

Finally, did you know that EVERY attendee gets a FREE SQL Server 2008 Standard Edition license with one CAL. That can cover your attendance right there and get you started on development and learning with SQL Server 2008.

We really have a great time at Connections and we hope to see you there!! (And, Florida in March is a nice destination from the winter weather for many of us as well. :)

We hope to see you there!!
Kimberly and Paul

Theme design by Nukeation based on Jelle Druyts