At PASS we hung out with Sean McCown and Jen McCown (of MidnightDBA) and chatted... and, despite our better judgement, it was recorded. ;-) These aren't overly technical but they are fun! Enjoy!!

And, our favorite "travel tip" from the video is to know your plane and check your seat on http://www.seatguru.com/. In fact, I have a few other travel tips and I'm overdue on a meme that Kevin Kline tagged me in so I'll post that shortly as well.

Enjoy!
kt

Categories:
Interviews | Resources

Well... it's finally been released and last week was the book launch. This book came together first as a goal by Paul Nielsen who was motivated by Steve Ballmer's request that MVPs "give back" even more and even outside of the direct technical community. Paul had the idea of a book - written for charity - and contributed to by MVPs. Well, here it is!

The book contains content from 53 MVPs with Paul Nielsen, Kalen Delaney, Greg Low, Adam Machanic, Paul S. Randal and I as editors. It was great fun going through all of the content and we're really happy that it's now available!

To purchase, click on the book (to the left) and purchase directly from Manning. When purchased through Manning a larger amount of the proceeds go to WarChild.

Finally, many of us were able to get together at PASS and "launch" the book. We gathered around a long table and people (if interested) had their copies signed. It was great fun! When it was all done - we took the group shot below.

Thanks for a great week (to all that were at PASS) and for great fun working on this title!

UPDATE November 16, 2009: After a few comments, I realized that a few people were tagged incorrectly in my facebook post of this photo and as a result, I had a few names switched around too. So, I just completely started over again and double-checked all the names... Also, I've added links to their websites so you can find out more information about the authors and editors of this title. Finally, I added that purchases should be made through the Manning website instead of other sites (so that more money goes to WarChild). And, finally, I added the entire chapter list with as many links as possible to everyone's websites. I'll keep this up-to-date as folks send me corrected lnks, etc. ENJOY!!

 

Present for the photo

Here's the full chapter list from the book - along with links to the authors/editors websites:

PART I

DATABASE DESIGN AND ARCHITECTURE
Paul Nielsen, Editor

 1. Louis and Paul's 10 relational database design rules by
Paul Nielsen and Louis Davidson 
 2. SQL Server tools for maintaining data integrity by Louis Davidson
3. Finding functional dependencies by Hugo Kornelis


PART II

DATABASE DEVELOPMENT
Adam Machanic, Editor
4. Set-based iteration: the third alternative by Hugo Kornelis 
5. Gaps and islands by Itzik Ben-Gan
6. Error handling in SQL Server and applications by Bill Graziano
7. Pulling apart the FROM clause by Rob Farley
8. What makes a bulk insert a minimally logged operation? by Denis Gobo
9. Avoiding three common query mistakes by Kathi Kellenberger
10. Introduction to XQuery on SQL Server by Michael Coles
11. SQL Server XML frequently asked questions by Michael Coles
12. Using XML to transport relational data by Matija Lah
13. Full text searching by Robert C. Cain
14. Simil: an algorithm to look for similar strings by Tom van Stiphout
15. LINQ to SQL and the ADO.NET Entity Framework by Bob Beauchemin
16. Table-valued parameters by Don Kiely
17. Build your own index by Erland Sommarskog
18. Getting and staying connected—or not by William Vaughn
19. Extending your productivity in SSMS and Query Analyzer by Pawel Potasinski

20. Why every SQL developer needs to have a tools database by Denis Gobo
21. Deprecation feature by Cristian Lefter
22. Placing SQL Server in your pocket by Christopher Fairbairn
23. Mobile data strategies by John Baird



PART III

DATABASE ADMINISTRATION
Kimberly L. Tripp and Paul S. Randal, Editors

24. What does it mean to be a DBA? by Brad McGehee
25. Working with maintenance plans by Tibor Karaszi
26. PowerShell in SQL Server by Richard Siddaway
27. Automating SQL Server management using Server Management Objects by Allen White
28. Practical auditing in SQL Server 2008 by Jasper Smith
29. My favorite DMVs and why by Aaron Bertrand
30. Reusing space in a table by Joe Webb
31. Some practical issues in table partitioning by Ron Talmage
32. Partitioning for manageability (and maybe performance) by Dan Guzman
33. Faster, smaller, and more energy-efficient backups without indexes by Greg Linwood
34. Using database mirroring to become a superhero! by Glenn Berry
35. The poor man's SQL Server log shipping by Edwin Sarmiento
36. Some understated changes found in SQL Server 2005 replication by Paul Ibison
37. High performance Transactional Replication by Hilary Cotter
38. Successfully implementing Kerberos delegation by Scott Stauffer
39. Running SQL Server on Hyper-V by John Paul Cook

PART IV

PERFORMANCE TUNING AND OPTIMIZATION
Kalen Delaney, Editor
40. When is an unused index not an unused index? by Rob Farley
41. Speeding up queries with index covering by Alex Kuznetsov
42. Tracing the deadlock by Gail Shaw
43. How to optimize tempdb performance by Brad McGehee
44. Does the order of columns in an index matter? by Joe Webb
45. Correlating SQL Profiler with Windows Performance Monitor by Kevin Kline
46. Using correlation to improve query performance by Gert-Jan Strik
47. Dr. DMV: how to use Dynamic Management Views by Glenn Berry
48. Query performance and disk I/O counters by Linchi Shea
49. XEVENT: the next event infrastructure for SQL Server by Cristian Lefter



PART V

BI DEVELOPMENT
Greg Low, Editor
50. BI for the relational guy by Erin Welker
51. Unlocking the secrets of SQL Server 2008 Reporting Services by William Vaughn
52. Reporting Services tips and tricks by Bruce Loehle-Conger
53. SQL Server Audit: change tracking or change data capture? by Aaron Bertrand 
54. Introduction to SSAS data mining by Dejan Sarka
55. To aggregate or not to aggregate: is there really a question? by Erin Welker
56. Incorporating data profiling in the ETL process by John Welch
57. Expressions in SQL Server Integration Services by Matthew Roche
58. SSIS performance tips by Phil Brammer
59. Incremental loads using T-SQL and SSIS by Andy Leonard

Categories:
Conferences | Resources

Recently, the SQL Server 2008 Internals title was released (and only in the past few days have people actually received their copies)! In fact, I still haven't seen the book in person... soon though!

Anyway, Kalen, Paul, Conor, Adam and I worked to create a comprehensive resource on SQL Server 2008 internals and to supplement the written content, many of us created demo scripts and examples. I've now gone back and created a sample script based on ALL of the code in the entire chapter (and in many cases I extended the code in the samples). And, while this updated content is going to be located on the companion content site, I thought I’d also release it here with some notes.

The zip contains a solution with 3 projects, each with a few scripts:

I know the names seem a bit strange but everything is ordered EXACTLY as it is shown in the book. And, in the book, I referenced "a" script called EmployeeCaseStudy-AnalyzeStructures.sql but that script was so big that I broke it down into 7 parts (hence the naming convention of 06...01, 06...02, etc.). Regardless, each script contains a brief header and a few details about the script. To get the most from the script, do not just open the script and execute it. If you really work your way through the script, you should see all of the comments and they will help you to make instance specific changes so that everything runs without error. Just take your time and really try to step back and think about each command (and what you're expecting the output to be) to test yourself while your working through the results. Taking your time and really grapsing all of these internals is what makes it fun!

Quite a bit of this content can stand alone but it's really best as companion content to the title as there's a lot more "text" and detailed information in the book. But, the scripts are really a great way to dive deeper, learn documented/undocumented commands and really get to know what the heck is really going on internally.

Finally, I only worked on Chapter 6 so here it is: 20090810-IndexInternals-Chapter6-Resources.zip (6.13 mb). As for the other companion content, you'll need to get the links from the book. Ah, or from Kalen :). 

And, certainly, if you find a typo or anything that you think needs more clarification, let me know! I'm more than happy to post updates to this content. 

Have fun,
kt

UPDATES/ERRORLOG:
2009-Apr-13 (8am): Updated the zip after remembering in my sleep (yes, sad, but true!) that one of my comments that referenced some line numbers didn't get updated in the final version. So, no errors and if you don't get this update, it's not going to break anything. But, the script that's been tweaked is script: 05_EmployeeCaseStudy-TableDefinition.sql. Enjoy! kt
2009-Apr-13 (4:30pm): Ha... guess what arrived today. Yes - our copies of the book. Wow, it's great to see it in person. Again, enjoy!
2009-Aug-10: Added a :CONNECT option inside the IndexInternals restore script AND, cleaned up the zip as it had an extra copy of the IndexInternals database in it.

A couple of weeks ago I wrote a blog post titled Whose job is it anyway? It's an interesting debate and something I've been hearing more and more - that SQL Server is a "set it and forget it" technology - a black box where you just don't need to know how it works to do well with it. In fact, I've even had a few folks comment that they think it would be better to "roll their own" database rather than have to learn how to work in a "general purpose" database. And, while there are certainly lots of different angle to this debate - one fact remains... if you don't know anything about the database on which you're developing (whether it's SQL Server, mySQL, Oracle, whatever), I *PROMISE* you won't have a truly scalable, optimal solution. Why do you think there are so many knobs? It's because there are so many different ways to work with data. There is more than one way to query, more than one way to design. This is also why every answer to a "how should I do this" question starts with "It depends". And, while that seems like a scary response it's actually a good one. It means that you have lots of options - options that can offer many different pros/cons. And, as a result of knowing these pros/cons, you can make better decisions - decisions that will ultimately determine how well you can scale.

So..... while I don't think this debate will EVER be finished (as to WHOSE job it is to know these things), I do think a lot of folks are seeing the effects of not knowing more about their store (and, again, this is NOT limited to SQL Server in any way, shape or form).

At a minimum, hear the discussion on RunAsRadio with Richard, Greg and I and let us know what you think!

Kim Tripp on the Roles of Developers and DBAs with the Database!

Cheers,
kt

OK, I've been pretty quiet this week... I had high hopes that I'd blog at least once or twice this week but the week turned out to be crazy busy as I presented T-W-Th at the SQL MCM (Microsoft Certified Masters) and then Fr at the Sharepoint MCM. The groups were really fun and wanted to go deeper and deeper into the technologies so on most days, I didn't even end lecture (which started at 8am) until 6:30+. So, it was a looooonnnggg week.

If you're interested, here's a link to the "very intense but fun" MCM programs: Microsoft Certified Master Program

Tomorrow, Paul and I fly out at 6am (to Florida) for SQLConnections. It's a packed week at Connections (a precon, a postcon and 5 sessions between us) and we're really looking forward to seeing many of you there! I'll be back to blogging soon as I know I still owe you the results/answers for the Tipping Point queries.

In the interim, check out Sean's Midnight DBA videos... he seriously needs to get some sleep! ;-)

Cheers to you all!
kt

Categories:
Resources

Just published! A colleague/friend of ours - Mark Pohto - has been working on server consolidation issues and the environmental costs of IT (it's an eye opening article!!!). And, the article "Green IT in Practice: SQL Server Consolidation in Microsoft IT" came out recently:

Here's the link to the The Architecture Journal you can see all of their content (it's a quarterly publication from Microsoft - which I didn't even know)
Here's a link to the article: Green IT in Practice: SQL Server Consolidation in Microsoft IT here.

And, you can even request a free print version of it online... but, that's not very green?! :)

Cheers,
kt

Given the general state of the economy...many companies are looking to cut back. Going back over what we've done and "optimizing" things -> budgets, expenses, etc. is the norm right now. And, scaling back is not always a bad thing - unless the wrong things are cut. Unless the wrong things are used to motivate you. Prioritizing and/or really assessing what gives you the biggest gains for your dollars is hard. In fact, one of the things that always seems to be first on the cutting block is training. Training is hard to quantify. And, the results of good training are also hard to quantify. Instead of fixing a problem (which you can often see the exact improvement) you might instead avoid a problem. Avoid downtime. Avoid data loss. Process more rows - with the same hardware. But, how do you know the cost of what could have happened. Ugh. To be honest, if I could do that - Paul and I would be on a beach. ;)

But, I do have a reason for this post... what should you be thinking? Where should you focus your attention? What can you cut - safely, temporarily, permanently and what might you help to prioritize?

Should you upgrade software?

  • Is there a feature that makes something easier? Some new features are really powerful "big" features. For example, Policy-Based Management (PBM) might help you to better centralize certain rules (in PBM-speak "policies") and then enforce them on many servers - even 2005 servers... so, you might be able to upgrade a smaller number of servers and still get some of the benefits. Many of the tools work against multiple versions so you might be able to minimize (and/or prioritize) which servers you upgrade and slowly migrate others. Potentially following an every-other-version upgrade strategy... upgrading some servers from 2000 to 2008 and leaving some of your 2005 servers to wait to upgrade until SQL11 (the next version after SQL10 - which is SQL Sever 2008).
  • Are you starting a new project - architecting a new database? Wouldn't it be easier to start on the newer version and get better longevity (maybe?!)? For example, sparse columns might make a major difference in your base table's architecture...and be easier than if you were to architect (and write all of the code) for 2005 but then later need to do a major architectural change to move to 2008 (well, to *really* benefit from things like sparse columns). There are some really good features in 2008 and some *might* warrant upgrading... upgrading now. But, if you don't have a direct need then I'd argue that you could probably stay with 2005 (or even 2000) and then push this out a bit until you absolutely need to move forward.

Should you upgrade hardware?

  • Again, are there features that will directly impact: performance, availability, manageability?
  • Can you wait? I can't really answer this and - for everyone - the answer is going to be "it depends". There might be something that significantly reduces costs and/or minimizes downtime and as a result, you'll just have to do cost-benefit analysis. This is a tough one... but, maybe you can do rolling upgrades and let some of the lesser servers take the hand-me-downs. :)
  • Can you do rolling upgrades moving the most critical to a new server and then a less critical server to the one freed up by the last upgrade...

Is there anything you can do to get more out of what you already have??

In my opinion, this is probably even more important than the two above. Upgrading hardware and software is something you will ALWAYS need to consider but if you could get better performance, scalability and availability out of the hardware/software you have now, then you'll benefit *now* without additional funds spent (actual outgoing funds) and you still be able to leverage what you do today when you do upgrade. So, what this really translates to (IMO) is tweaking and tweaking a bit more - what you already have? How? What can you look for? What can you do to help??

  • Upgrade to the latest service packs/hotfixes (at least upgrade to the free stuff - you might see some gains and in some cases (like SQL Server 2005 SP2+) you might get some new features. (important note: test this on a non-production server FIRST!!)
  • Update your hardware's firmware? You might have missed an update that improves performance (important note: test this on a non-production server FIRST!!)
  • Bottleneck Analysis - Some good resources for this are: Performance Tuning Using Waits and Queues and the SQLCAT team.
  • Workload Analysis - Some good resources for this are: Troubleshooting Performance Problems in SQL Server 2005, Working with Tempdb in SQL Server 2005, Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005...well, there are lots of good whitepapers that are specific to certain types of workloads and/or perf problems...check out our whitepapers page here: http://www.sqlskills.com/whitepapers.asp and the CAT team's whitepapers pages here: http://sqlcat.com/whitepapers/default.aspx and the general SQL Server on microsoft.com pages here: http://www.microsoft.com/sqlserver/2008/en/us/white-papers.aspx and for 2005 here: http://www.microsoft.com/sqlserver/2005/en/us/white-papers.aspx
  • Maintenance - often overlooked and incredibly important. A database that has solid maintenance practices (fragmentation analysis and cleanup, VLF analysis and cleanup, transaction log management, finding corruption in its early stages through automated CHECKDB executions...) performs better, is easier to recover, might naturally stay smaller (more compact) and therefore require less hardware. In fact, analyzing indexes - to get rid of unused indexes and to consolidate redundant indexes can end up saving disk space, backup space, cache, maintenance costs, etc. Both Paul and I have blogged quite a bit about many of these!
  • Other tips and tricks
    • Blogs... which is why you're here and there are so many out there! Here's a link I recently found that lists a bunch of SQL-related blogs: http://technet.microsoft.com/en-us/sqlserver/bb671052.aspx and, of course, Paul's post on "So many blogs" and the PASS list of blogs here: http://www.sqlpass.org/Community/BlogDirectory.aspx.
    • Webcasts... there are lots out there and we now have a page which has most of ours listed on it (thanks to Paul for creating this!!) here: http://www.sqlskills.com/webcasts.asp and there are LOTS more on TechNet, MSDN, etc.
    • Conferences... OK, maybe a shameless plug for conferences like SQLConnections *but* in having put together the agenda (with Paul) where we specifically focused on best practices topics and performance tuning - I can tell you that some of the tips and tricks that we recommend can significantly improve performance, may minimize needed disk space (by creating more optimal and often fewer indexes), may improve availability with better design practices and/or maintenance and much more than that! And, in getting away from the office for a few days and focusing just on learning you might do two things. First, you might learn some tips and tricks that you never would have (or it would have taken *a lot* more time and/or been harder to really understand?). Second, you might come back with a whole new and renewed enthusiam for doing things - and with an ordered/prioritized list of things to try. And, this might even help to motivate you because it also shows that your company really is committed to you/your job (having spent money specifically on your learning) - and you to them.

So, I do think that there are SMARTER ways to save. A well trained employee is worth a lot more than a cheaper one. And, there are smarter things to cut. I hope this might help you think of things to do and/or places to look to get better performance with what you have! I think blanket "no training" or "no upgrades" statements are never good for anything - even the budget (the longer term effects can be much worse - but also much harder to quantify).

Really, the answer is always different. It depends............

kt

I know where I go but where the heck does the time go (it seems to move much faster than I)? First off, I hope everyone had a truly lovely holiday season. I know they've long since past for everyone but maybe this reminder will bring back a nice memory from the holidays as it does for me. Paul and I escaped the world of SQL and focused on finding frogfish. Unfortunately, it just wasn't the right time of year to find them though. Oh well - we did see some very coolstuff including a batfish (the odd-shaped-bottom-dweller kind) and, we also read a ton of books (a completely strage combination of books - list below) and we relaxed.  It was really quite lovely.

Once the holidays were over, we were back into the fire... Paul worked from home (mostly writing) and did a lot of work from there - including blogging (as always) and I was back in the classroom. We were only home for 10 days and then we were off again. Right now, we're both on the road - in Hyderabad, India this week (and last) and Bangkok, Thailand next week - teaching some classes on performance tuning, maintenance and high availability. I thought I'd do a quick post to remind you that I do exist (yes, I know - I really do *want* to blog more but it's not easy for me... first of all, before I publish a blog post I probably read it 50 times and that's after I even get the courage to write one... blogging is just different for me than it is for Paul... for that, I'm sorry. I'm *really* going to try... no, really. I will try to get a really good post out there as often as possible but for right now, I'm not going to give a number or a goal - just know that it drives me nuts that I don't blog all that often.)

In the interim, interviews/podcasts - for some reason - are a lot easier for me. Paul and I did a few interviews with TechNet in December and these are being made available for download as they're edited for publishing. The first one is available now and it's titled: New Features in Partitioning and Indexing for SQL Server 2008. Here are the download links:

OK, that's it for me for today. But, I do have a couple of other posts brewing. I will work to get these out as soon as possible as I have a small break from teaching, writing, and conferences (yes, I've been writing too!). Oh, speaking of which, we've *all* finished the SQL Server 2008 Internals work (Kalen doing the large majority) and it's getting really close! You can pre-order it on Amazon here: http://www.amazon.com/Microsoft%C2%AE-SQL-Server%C2%AE-2008-Internals/dp/0735626243/ref=sr_1_1?ie=UTF8&s=books&qid=1233824791&sr=1-1.

Thanks for reading - I'll be back soon!
Kimberly

As for books, in no particular order:  

  • Robin Maxwell's Mademoiselle Boleyn
  • Josh Grogan's Marley & Me
  • Sheridan Hay's The Secret of Lost Things
  • Tucker Max's I hope They Serve Beer in Hell
  • J.K. Rowling's Harry Potter books 2, 3, 4, 5 (these were re-reads - always a safe bet to bring with me and a fun re-read as Kiera (7) goes through the series now too) and where is the 6th movie? Ugh. ;)
  • Tom Philbin's The Killer Book of Serial Killers: Incredible Stories, Facts and Trivia from the World of Serial Killers
  • Stephenie Meyer's Twilight (last, but not least!)

OK, I told you it was a very wide-ranging book list; a few were purchased at the airport on the way. I've since read the entire Twilight series by Stephenie Meyer: New Moon, Eclipse, and Breaking Dawn (and I have to admit that I loved the series (especially Breaking Dawn))... I really hope she completes Midnight Sun; I absolutely loved Twilight from Edward's perspective (maybe even more than from Bella's). And, if she continues writing in this world - I really hope she writes as Renesmee. The Twilight movie was not spectacular - defnitely read the book before going to the movie - but, I loved the casting of Bella and Edward. And, I'm hopeful that the next movie will be *a lot* better. Next on my reading list, I'm planning to read many Robin Maxwell books starting with The Secret Diary of Anne Boleyn. In the end, the holidays were really relaxing.

Oh, and here's a great shot from India... At least he has a helmet??

Well... I think I had had too much tea that morning ;-). But, as always, chatting with Richard and Greg was great. Here's the specific show link: http://www.runasradio.com/default.aspx?showNum=76.

Oh, and just for the record, I didn't come up with that title. But, I do hope that all your [high-priority and important] queries are indexed!

Enjoy!
kt

OK, I know this post doesn't really apply to most folks but I did want to have the setup instructions on my blog so that I could refer people to it. For the August release we updated all 17 labs and all of the demos for SQL Server 2008 RTM; however, we did use a "security enabled" VPC. And, well, security is bugging people right about now because the password policy in this VPC was set for 42 days. I know all of you run into this on production servers but on a test VPC, it's pretty much nothing more than an annoyance. So... I thought I'd mention how to disable it for the VPC.

(1)    Remove the policy for 42 password expiration

                Local Security Policy Editor (Admin Tools, Local Security Policy)

                Security Settings

                                Account Policies

                                                Password Policy

                                                                Set the Maximum password age to 0 days (which is no password expiration)

 (2)    Change the SQLService account to not require a password on next login

Computer Management

                                Users and Groups

                                                SQLService

                                                                Deselect the option "User must change password at next login"

         you might not actually need to set this but it's good to check!

 (3)    Windows Update

While you're at it then, you should also consider running Windows Update...

 (4)    Shutdown and save changes...

And, here's the full instructions that we usually give out at conferences/workshops: SQL2008Aug HOLs DVD SETUP Instructions.pdf (25.93 KB)

As for accessing this content on the web, it's part of Jumpstart and it will be part of the SQL Server 2008 Resource Kit. Also, I just refreshed the image for the resource kit so that it does NOT have the password policy enabled so you won't need to do that step. Also, the Jumpstart folks will probably refresh the image there so that it also uses this version. Regardless, you might still want to run Windows Update. And, you'll also find a lot of these resources at upcoming events and conferences as well as through these resources.

Enjoy!
kt

OK, in my quest to blog more often, I realized I have the perfect thing to blog about today... stuff I did that I haven't yet told you about. :) :)

At TechEd US, Paul and I were on a panel created to debunk some of the more common questions/concerns when setting up High Availability. It's an interesting panel and I think there are some particularly interesting moments based on the strange turns the discussions take... but, overall, it's expecially good for people seeking a 100-200 level overview of HA (Clustering, Database Mirroring, Replication, Log Shipping) with a few other related topics (Database Snapshots and Backup/Restore). Here's the direct link to the recording: http://mfile.akamai.com/14853/wmv/microsofttec.download.akamai.com/14853/TechEdOnline/Videos/08_NA_ITP_TEOPanel_57_low.asx. And, here's a general link to everything TechNet so that you can check out TechNet Radio, TechTalks, etc. there's some really great stuff out there: http://technet.microsoft.com/en-us/default.aspx. Have fun!

Also at TechEd US, I hung out for a bit in the bloggers lounge...when it was just about to end, it turned out there was a "Women in Tech" panel starting - in which I was then asked to join in. I have to admit that I'm sometimes reluctant to get involved in these because my general (and this is *very* general) opinion is that it doesn't matter if you're a woman or a man (wait - wait - let me explain!!). What I mean by this is that the key to ANY job is that you have to enjoy it. You need to find something in it that's exciting. If you can do that - then you'll be good at your job... plain and simple. If you don't like it (regardless of whether you're a man or a woman) then it's never going to be satisfying and/or you're never going to get ahead. Having said all of this, I do have to admit that I find it especially exciting when I do find other women that can geek out and really enjoy doing this whole IT thing (like MLB, Kate Gregory, Stacia Misner, Julia Lerman, Kalen Delaney, Eileen Brown). So, I don't really intend on actively trying to get more women into technology but the fun part about this panel was the fact that a lot of us talked about how children these days are already way more into IT - without realizing it (our girls asked what film was recently because we were scanning a lot of negatives... and, can you imagine any of our lives without cell phones??). And, my hope, is that if people find ways of getting more into IT (and computers/digital stuff - in general) at an earlier and earlier age... then maybe more kids (girls as well) will find a passion in IT. Here's a link to the panel from TechEd: http://mfile.akamai.com/14853/wmv/microsofttec.download.akamai.com/14853/TechEdOnline/Videos/08_NA_ITP_TEOPanel_67_low.asx.

And, here are a few resources regarding Women In Technology:

And, I think that's it for now...

Thanks for reading,
kt

Categories:
Opinions | Personal | Resources | WIT

I first posted an update to sp_helpindex here. My version of sp_helpindex was solely to expand what sp_helpindex showed and adds 1 or 2 things based on version: for SQL2005+ it adds included columns and for SQL2008 it also adds the filter predicate. So, there were two versions of sp_helpindex2 depending on which verison you're using. A lot of folks like the changes to this sp but, alas, it had a bug (or two :) and in fact, I found a few others when I went back over this as well. So, thanks to Josh (who commented here) and to a private email (thanks Vasco!), I have an updated version of sp_helpindex2:

For SQL Server 2005, here's your new sp_helpindex2 script: sp_helpindex2_2005.zip (2.89 KB)

And, here's a simple test script for 2005:

DROP TABLE tbl1
GO

CREATE TABLE tbl1( c1 int, c2 int, c3 int, c4 int)
GO
CREATE INDEX ix_1 ON tbl1(c1) INCLUDE (c2)
CREATE INDEX ix_2 ON tbl1(c1)
CREATE INDEX ix_3 ON tbl1(c1) INCLUDE (c2, c3)
CREATE INDEX ix_4 ON tbl1(c1, c3) INCLUDE (c2)
CREATE INDEX ix_5 ON tbl1(c3) INCLUDE (c1, c2, c4)
CREATE INDEX ix_6 ON tbl1(c1, c2) INCLUDE (c3, c4)
go

sp_helpindex2 tbl1
go

index_name

index_description

index_keys

included_columns

ix_1

nonclustered located on fg1

c1

c2

ix_2

nonclustered located on fg1

c1

NULL

ix_3

nonclustered located on fg1

c1

c2, c3

ix_4

nonclustered located on fg1

c1, c3

c2

ix_5

nonclustered located on fg1

c3

c1, c2, c4

ix_6

nonclustered located on fg1

c1, c2

c3, c4

For SQL Server 2008, here's your new sp_helpindex2 script: sp_helpindex2_2008.zip (2.84 KB)

And, here's a simple test script for 2008:

DROP TABLE tbl1
GO

CREATE TABLE tbl1( c1 int, c2 int, c3 int, c4 int)
CREATE INDEX ix_1 ON tbl1(c1) INCLUDE (c2)
CREATE INDEX ix_2 ON tbl1(c1)
CREATE INDEX ix_3 ON tbl1(c1) INCLUDE (c2, c3)
CREATE INDEX ix_4 ON tbl1(c1, c3) INCLUDE (c2)
CREATE INDEX ix_5 ON tbl1(c3) INCLUDE (c1, c2, c4)
CREATE INDEX ix_6 ON tbl1(c1, c2) INCLUDE (c3, c4)

CREATE INDEX ix_1f ON tbl1(c1) INCLUDE (c2)
WHERE c3 IS NOT NULL

CREATE
INDEX ix_2f ON tbl1(c1)
WHERE c4 > 2

CREATE INDEX ix_3f ON tbl1(c1) INCLUDE (c2, c3)
WHERE c4 > 2 AND c1 < 50 AND c2 = 12

CREATE INDEX ix_4f ON tbl1(c1, c3) INCLUDE (c2)
WHERE c4 IS NOT NULL AND c1 = 12

CREATE INDEX ix_5f ON tbl1(c3) INCLUDE (c1, c2, c4)
WHERE c1 > 5

CREATE INDEX ix_6f ON tbl1(c1, c2) INCLUDE (c3, c4)
WHERE c4 < 20
go

sp_helpindex2 tbl1
go

index_name

index_description

index_keys

included_columns

filter_definition

ix_1

nonclustered located on PRIMARY

c1

c2

NULL

ix_1f

nonclustered located on PRIMARY

c1

c2

([c3] IS NOT NULL)

ix_2

nonclustered located on PRIMARY

c1

c2

NULL

ix_2f

nonclustered located on PRIMARY

c1

c2

([c4]>(2))

ix_3

nonclustered located on PRIMARY

c1

c2, c3

NULL

ix_3f

nonclustered located on PRIMARY

c1

c2, c3

([c4]>(2) AND [c1]<(50) AND [c2]=(12))

ix_4

nonclustered located on PRIMARY

c1, c3

c2

NULL

ix_4f

nonclustered located on PRIMARY

c1, c3

c2

([c4] IS NOT NULL AND [c1]=(12))

ix_5

nonclustered located on PRIMARY

c3

c1, c2, c4

NULL

ix_5f

nonclustered located on PRIMARY

c3

c1, c2, c4

([c1]>(5))

ix_6

nonclustered located on PRIMARY

c1, c2

c3, c4

NULL

ix_6f

nonclustered located on PRIMARY

c1, c2

c3, c4

([c4]<(20))

Have fun!
kt

I started this post while Paul and I were in the TechEd Bloggers Lounge... which, from the number of folks "stopping by" must be more of an online thing :). We did have lots of folks visit with us in the DAT "Green" area but in the bloggers lounge, I thought I'd blog (maybe that's the point?!). But, as luck would have it, I got side tracked at the end and ended up on the Women In Technology panel, then chatting with a customer, and then another... and, well, now it's Friday and time to pack up and head home. This year's TechEd was a combination of crazy busy highs with a very "where is everyone" lows. I asked about YOUR opinion of the split for TechEd this year and I think folks are waiting for the event to complete before they form opinions but for me, I felt like the event was smaller...without as much buzz as in TEs past. I've always really viewed TE as "the biggest US event of the year" where all of the buzz was created. It's where I often learn about new (usually outside of SQL Server :) technologies and where I hear some of the latest buzz. This year, I just didn't feel as though there was as much buzz (or as many people). And, I guess I've complained about how crowded it's felt in years past so I guess both have their ups/downs - but, overall, TE just didn't (in my opinion) have the same energy of TEs past... I'd definitely like to hear more opinions on this. Overall, it was nice to catch up with a few of the usual suspects.

As for a trip (and resources/tips/tricks) report... here I go :)

Monday was our "Overview of SQL Server 2008" precon and that went really well... we were completely exhausted by the end of the day but we had a lot of great feedback. The general feedback was that it was the perfect way to start the week as it allowed people to get a really good more-than-just overview of the new features and help to determine which topics/sessions might be the most relevant for folks to attend. Also, I learned about a Performance Dashboard bug here that I thought I'd pass on as well - it's not likely to happen to everyone and it's related to a potential timeout problem. I didn't find too many references to this problem though so if someone does have a more detailed link - let me know. The end result (a gentleman named Leif sent this script to me) is that you need to tweak the C:\Program Files\Microsoft SQL Server\90\Tools\PerformanceDashboard\setup.sql file so that the stored procedures deal with the timeout correctly. The change is to line 276:

From:
sum(convert(bigint, datediff(ms, login_time, getdate()))) - sum(convert(bigint, s.total_elapsed_time)) as idle_connection_time,

To:
sum(convert(bigint, CAST ( DATEDIFF ( minute, login_time, getdate()) AS BIGINT)*60000 + DATEDIFF ( millisecond, DATEADD ( minute, DATEDIFF ( minute, login_time, getdate() ), login_time ),getdate() ))) - sum(convert(bigint, s.total_elapsed_time)) as idle_connection_time,

Tuesday, I delivered my Indexing session "Are your indexing strategies working?" and the group really enjoyed some of the things I focused on - from DMVs to DTA to SSMS to Performance Data Collection in SQL Server 2008... there are many aspects to indexing and depending on your environment some are more relevant/important than others. There are definitely a lot of things to talk more about here... INCLUDE columns, statistics... In the interim, I've posted all of my demo scripts to our past conferences page here

Wednesday, Paul delivered his Corruption session and it was excellent. He showed a variety of different situations that all have different workarounds - as well as discussed a few that don't. He's posted a few things about these sessions and he's going to slowly detail each of his demos in blog posts.

Thursday, we delivered a Database Maintenance session which was a "Top 10", per se. This was a great session for us but I wish we had done it as our first session instead of our last. I felt like this session was a wider session (in terms of topics) and our other two were deeper sessions (in terms of technical content)... But, it was still great fun to do. The thing that made it the most fun for me is that I decided at about 5pm the day prior that I wanted to tweak my demos a bit... and, 14 hours later, I was done. I decided that I would create two copies of a database - one for each of two laptops and then I would work really hard to create a "best practices database" and a "worst practices database". What was most stunning to me was that the cumulative effect of all of the worst practices was actually a lot worse than I had thought it would be. In the first test run I did on stage my best practices database (before the index maintenance) ran at about 3500 measured statements per second. My worst practices ran at about 71. I absolutely did not expect that large of a difference... and, one of the things I'm truly looking forward to doing is breaking each of the worst practices down and comparing the breakdown of each to find what has the greatest impact.

The week was a big success overall and it was great to see a few folks that we seem to only see at these types of events....... see you next year!

Cheers,
kt

Memorial Day weekend we were in Chicago to celebrate my Father's life. We did a "Celebration of Life" memorial and we had a few drinks (celebratory Meyers, Tonic and lime - which was my Father's favorite drink), we (7 of us) gave a few heartfelt speeches, and a few friends wrote a song (and passed out the words - to which we all sang along) and we grieved... but, in a refreshingly-not-overly-depressing way. I have to admit - it was exactly what I'd want as well. It was a wonderful day filled with memories and friends. After that, we visited with my Mom as well as my Grandmother. Paul blogged a couple of pics (yes, that chair is VERY big!). Then, we were back in Seattle for only one week...back to work...and preparing for TechEd 2008 ITPro week.

As for TechEd being spread over 2 weeks, well... I think it offers some excellent logistical options (smaller size means more possible venues AND/OR it means that they could possibly grow the size for each event). And, for some topics, I think there is a very strong separation between developer and ITPro (admin/ops) but for SQL - I think it's hard to get it perfectly right. I think there's a lot of developers who need to know more about admin/ops just so that they can develop more optimal (and even manageable solutions) and I think that DBAs should have a really good architectural overview of a lot of features to better administer them. So, for SQL, I'd *love* to hear your comments on what you think............

For Paul and I, the decision is relatively simple, we came for this second week for ITPro/Ops. But, we've also spoken at the developer events (and sometimes we even write/present sessions specifically targeted at developers at our SQL Connections shows and/or at User Groups (we just did a local .NET user group in Redmond and the discussion around Indexes became so popular that we're going back in August (for Indexes) and again (tbd) for Disaster Recovery techniques). Basically, developers tend to say...oh, that's why I should x or y or z....... so, maybe next year we'll hit both? Regardless, I'd still like to know what you think? Were you at the Developers event? Do you wish you could be at both? Are you at both?

As for what we're doing - Paul's already blogged it here: http://www.sqlskills.com/blogs/paul/2008/06/05/OffToTechEdUSITProTomorrow.aspx.

But, I thought I'd do a quick recap so that you can get some insight into our week as well as where to find us to come and chat. We'd love to meet you and/or hear your success (or disaster) stories!

Monday

  • Full day pre-con seminar: SQL Server 2008 Overview for DBAs

This is ACTION packed (and a very full day!) and will include giving out the updated SQL Server 2008 HOLs DVD. We weren't really sure we were going to be able to do it... we didn't really burn too many of the CTP6 version of the DVDs, nor were we sure that CTP6 would still be the most relevant. But, it's still excellent to learn on and this time our DVD includes 17 labs:

HOL Lab Filename

Length

Lab Name

Lab Abstract/Description
Using Policy-based Management.doc 75 minutes SQL Server 2008 Policy-based Management Security, best-practices, proper configuration settings - how do you control these things on one or more server? These hands-on labs show you how to implement and leverage the new policy-based management framework to define and control your business rules and your server's compliance for one or more instances of SQL Server 2008. 
Data Recovery Preventative Techniques.doc 75 minutes SQL Server 2008 Data Recovery and Preventative Techniques Hands-on Lab Can you recover from a dropped table? Can you reconcile tables that have become out of sync due to human error? These hands-on exercises show you how to bring a database back online quickly after a table is dropped as well as how to reconcile the differences between a production environment and a recently restored version of your database - so that you can manually merge the recovered data back into your production database. Once all of the recovery techniques are shown, the last exercise shows how DDL triggers can prevent some of these human errors altogether.
Table and Index Partitioning.doc 75 minutes SQL Server 2008 Table Index & Partitioning Hands-On Lab Table and Index Partitioning allows large tables to be managed more granularly. These hands-on labs show you how to implement and leverage these key features: a partition function, a partition scheme and the sliding window scenario.
Database Mirroring Part I.doc 75 minutes SQL Server 2008 Database Mirroring, Part I Database Mirroring allows you to create a secondary (mirror) database to handle requests either automatically or manually, in the event of a diaster at the principal database. These hands-on labs show you how to implement and leverage as well as when and how to use Database Mirroring. You will setup database mirroring in a High Availability configuration (synchronous mirroring with a witness), see the affects of failover, and see how automatic page repair restores damaged pages in the principal or the mirror.
Peer to Peer Replication.doc 75 minutes SQL Server 2008 Peer-to-Peer Replication Hands-On Lab Replication gives you a scale-out configuration where multiple servers participate in bi-directional transaction replication. Setting up and configuring this topology has a few requirements - many of which are minimized by using the Replication Wizards - but all need to be understood to configure a peer topology correctly. These hands-on exercises show you how to implement a peer topology correctly.
Using Performance Data Collection.doc 75 minutes Performance Data Collection in SQL Server 2008 Performance Data Collection brings together many key tuning features into one cohesive toolset. These hands-on labs show you how to create a Management Data Warehouse, setup and control the collection intervals and analyze the results of system data collection sets.
Instant Initialization.doc 45 minutes SQL Server 2008 Instant Initialization Instant Initialization allows data files of any size to be created instantly - eleminating zero-initialization. These hands-on exercises show you how to configure your server's permissions to leverage instant as well as the security vulnerability created by enabling this feature.
Online Operations.doc 75 minutes SQL Server 2008 Online Operations Hands-On Lab Online Operations are critical to the success of any server that needs to be highly available. These hands-on labs show you how to implement and leverage these key features: online index operations, partial database availability and online piecemeal restore.
Database Development Clients Lab.doc 120 minutes SQL Server 2008: Database Development Hands-On Labs The goal of these hands-on lab materials is to get an understanding of when to use one or more of the advanced features of SQL Server 2008 Database Development. After completing these self-paced labs, you will be able to:
* Set up a Database Project using Visual Studio 2008 Team System Database Edition
* Make changes to the database schema and deploy those changes
* Create and edit a project that uses the LINQ to SQL Object Relational mapper
* Use LINQ to SQL to query and maintain a SQL Server database using the managed classes
* Use LINQ to SQL with stored procedures
* Create and edit a project that uses the ADO.NET Entity Data Model mapper.
* Use the ADO.NET Entity Data Model to model a many-to-many relationship in the database
* Use the ADO.NET Entity Framework classes and LINQ to Entities to query and update a database
* Use Visual Studio 2008 to quickly get an ADO.NET Sync Services application up and running.
* Set up SQL Server 2008 Change Tracking
* Use ADO.NET Sync Services with SQL Server 2008 Change Tracking
Snapshot Isolation.doc 75 minutes SQL Server 2008 Snapshot Isolation Hands-On Lab The goal of these hands-on lab materials is to get an understanding of the appropriate uses of transaction isolation levels as well as how snapshot isolation affects conflicting readers and writers.
Database Mirroring Part II.doc 120 minutes SQL Server 2008 Database Mirroring, Part II Part II of the Database Mirroring HOLs allows you to go through setup, implementation and numerous failover scenarios - step-by-step. While Part I offers quicker setup through SQLCMD scripts, Part II works through the setup process more slowly allowing you to see how things work together. This lab is longer but also goes through changing the mirroring configuration as well as forcing failover. Part I should be completed first and Part II should be completed only if time permits.
Service Oriented Database Architecture.doc 120 minutes SQL Server 2008 Service Oriented Database Architecture Hands-On Lab Manual The goal of these hands-on lab materials is to get an understanding of how and when to use Service Broker in deploying a service-oriented database application.
Database Snapshots.doc 75 minutes SQL Server 2008 Database Snapshots Hands-on Lab The goal of these hands-on lab materials is to get an understanding of how to use the Database Snapshot feature of SQL Server 2008. After completing these self-paced exercises, you will be able to:
* Understand how to create a database snapshot
* Understand how to investigate file sizes and sparse file configuration (using both T-SQL queries and Windows Explorer)
* Understand the benefits and challenges with creating multiple snapshots
* Understand how a database snapshot is created when transactions are in flight as well as when they're not
* Understand how to use database snapshots for testing and reverting databases
* Understand the requirements to drop database snapshots and drop databases that have database snapshots
* Understand how to create a database snapshot on a mirror database
Dynamic Management Views.doc 75 minutes Understanding and Using DMVs Hands-on Lab The goal of these hands-on lab materials is to get an understanding of the more advanced new features of SQL Server 2008 that give access to server information that can be used for performance tuning, server health monitoring, and problem diagnosis. After completing these self-paced labs, you will be able to:
* Determine what DMVs exist, what their input parameters are, how and where their data is stored, and be able to persist DMV data to your own tables.
* Access information from the query plan cache, including determining frequently executed queries and their query plans.
* Access physical statistics information about indexes (e.g. fragmentation).
* Access information about tempdb space utilization.
Resource Governor in Action.doc 45 minutes SQL Server 2008 Resource Governor Hands-on Lab The goal of these hands-on lab materials is to get an understanding of when to use one of the more advanced features of SQL Server 2008: Resource Governor.  After completing these self-paced labs, you will be able to:
* Understand appropriate uses for Resource Governor
* Create Resource Pools
* Create Workload Groups
* Monitor Resource Usage
Understanding Spatial Data.doc 120 minutes SQL Server 2008: Understanding Spatial Data Hands-on Lab The goal of these hands-on lab materials is to get an understanding of one of the more new features of SQL Server 2008:  Spatial Data Support. After completing these self-paced labs, you will be able to:
* Understand what spatial data is
* Understand the different types of spatial data
* Create instances of spatial data
* Investigate the properties of spatial data
* Query the relationships between different instances of spatial data
* Integrate spatial data into a managed code application
* Move spatial data between managed code and the database
* Create spatial data graphically using WPF 
Using SQLCMD.doc 75 minutes Understanding Command-line Management with SQLCMD in SQL Server Hands-on Lab The goal of these hands-on lab materials is to get an understanding of one of the more new features of SQL Server 2008:  Spatial Data Support. After completing these self-paced labs, you will be able to:
* Use SQLCMD with an initialization file, system environment variables and parameters to create customized “master” scripts for automation and administration
* Use SQLCMD and SQL Server Management Studio with the Dedicated Admin Connection for troubleshooting
* Use SQL Server Management Studio to modify and execute SQLCMD mode scripts
* Upgrade databases from SQL Server 2000 using a parameterized script running with SQLCMD

This is a GREAT resource for playing with a lot of these new technologies and it's exciting that we have enough copies to give away to our pre-con attendees! A few of these are featured as HOLs at this year's TechEd as well and some of these can also be found with our prior Jumpstart resources.

Tuesday

  • 13.15 - 14.30 (Room N230) DAT354 Are Your Indexing Strategies Working? (featuring me as speaker/presenter)
  • 15.00 - 16.00 (TechEd Online Stage) Panel: Leveraging SQL Server Technologies to Build a Solid High-Availability Strategy
  • 16.00 - 18.00 DAT track booth (green)

Wednesday

  • 10.15 - 11.30 (Room N220D) DAT375 Corruption Survival Techniques: From Detection to Recovery (featuring Paul as speaker/presenter)
  • 11.30 - 14.45 DAT track booth
  • 15.00 - 16.00 Blogger's Lounge

Thursday

  • 10.15 - 11.30 (Room S230E) DAT363 Essential Database Maintenance (we're co-presenting this one)
  • 11.45 - 13.00 Speader Idol judging (I'll do my best to play Paula and I have hopes that Paul won't play Simon but he does have that British thing going for him)
  • 14.30 - 18.00 DAT track booth

Friday, we fly home... then, we're going to test all of our dive gear and take a little dive vacation at the end of the month. Hopefully, I'll be able to post a couple of nice underwater shots!

Oh, and I've finally tweaked my Indexing post (the one that survived the drive corruption - oh, but as an update to that post... Even though I got that drive repaired, almost all of the jpgs, some of the pdfs and even a few of the Office files were still corrupt. Office opened a few of them and "repaired" them on open (which was really cool) but I did lose the photos I had taken that weekend (well, all of the ones that I removed from my camera's SD card). Anyway, I plan to (well, hope to) post the Indexing post (part 2 to this one) tomorrow!

Cheers,
kt

PS - It's hot as hell here... and the humidity is NASTY!!! But, it beats the SOLID rain that we've been having in Seattle...

Have you ever written something and then lost it... for whatever reason: your own stupidity (come on we've all accidentally done something at some point where we lost data or a spreadsheet or a document or something...), the software eats it (this might be self-inflicted but I've been in apps that just hang and that's it... there's nothing you can do except power off), or....whatever.

Well, during those times... have you ever thought - I'd do almost anything to get that data (and time) back?

Before I go any further - yes, backups are good. No, let me be clearer... Backups are an absolute requirement of ANY environment!!! 

Even personal/home environments should have something to protect the data. Something to consider is even offsite copies. Burn a DVD or two of your precious family photos and then swap DVDs with a friend... there's your simple/cheap/offsite data recovery. We all focus on critical data when a business depends on it - what about the personal stuff... Paul and I always talk about database backups and this post is not really targeting enterprise systems or even business critical systems... Really, there's NOTHING THAT SHOULD EVER REPLACE A GOOD DISASTER RECOVERY STRATEGY...

Having said that though, I have an interesting story (with a data-saved/happy ending)!

I was on a plane flying back from SQL Connections (I was actually flying from Tampa after having visited family in St. Pete) and well, disaster struck again (yes, this has NOT been a good year for hard drives for me as this was my 3rd and MOST catastrophic disaster so far...).

Anyway, Paul and I were "discussing" what I STILL think is a bug in a SQL function (ok, I'll get to that in a later post) and I had been feverishly completing a multi-page blog post AND some index examples/metadata queries, etc. when my laptop slipped off of my crappy airplane table (I was in an exit row so I had a table that came out of an arm chair and folded - it was very wobbly) and crashed to the floor (and, the irony of being in an exit row... had I been in a regular seat there wouldn't have been room for the laptop to have fallen to the ground :).

To my surprise, I picked it up and still saw the screen I was working on. I typed in another line... and then got a little dialog that said "windows hard error" or something... I don't even remember now. The only thing I could do is power it off. My laptop was dead. Very dead. I [expect] that I had had a major head crash when my laptop hit the floor because it's likely the disks were still spinning. However, I still did not know this at the time.

I rebooted and received "Error 2100 - Hard drive initialization error"... so, when we landed in Denver, I was off to one of the small stores to buy an eyeglasses kit (I needed a screwdriver :) and then I went to find a spot to do some laptop surgery. The good news is that I've had so many disasters over the years that almost none really freak me out anymore (this is probably the 10th drive I've had personally fail) and, I always carry at least one spare laptop... But after the terrible time I had March in India, I actually had 2 spare laptops on this trip (yes, airline security hates me even more now - oh, and Paul travels with 2... so, we go through security with 5 laptops... we get some interesting looks!).

Anyway, I took the take the drive out, loaded it into my secondary drive bay (if you travel a lot - having this second bay that can hold a primary/secondary drive is INVALUABLE as these secondary drive bay usually use the same setup (Serial ATA in my case) as the boot drive). So, if your boot drive doesn't boot, you *might* be able to still read and/or save data by using the secondary drive. So, on a second machine, I gave it a try to see if it would spin. No dice. I even tried my third machine (my primary was XP, my secondary is XP and my backup/backup is Vista... I thought... well, maybe?) Ah, I thought wrong. My drive would never spin again. So, on my flight from Denver to Seattle, I was not overly pleased (this is an understatement to be honest - just ask Paul) because I was at least a couple of days from having done a backup AND I was even more frustrated about having lost the detailed indexing post I was working on - and even losing the code that was on screen just THAT day.

We finally got home (which seemed like a much-longer-than-normal trip ;) and I got online thinking/hoping - is there ANYTHING I can do... And, I stumbled on a reference to a possibly out of date BIOS version and the needed update which also happens to generate this SAME error. And, being hopeful (and opportunistically forgetting the drop/crash/thump which led me to this problem), I *attempted* the BIOS update and well, it didn't recognize that a drive was attached. OK, that was my last hope. Hours lost. Let's move over to my backup laptop and shift everything I do have backed up over........ which I did and I was up and running that morning. Yes, I had lost a few things and yes I was pretty frustrated but, I wasn't totally down. It wasn't as bad as it could have been without any backups...... but, I was still annoyed.

Then, I thought... is there any other option(s)? It's been at least 10 years since I sent a drive into a drive recovery place. So, I thought this needed some research. I wanted to see what it might take to (and/or IF I could) recover data. About 10 years ago, I had a single drive of a RAID 0 array fail and the disk recovery place couldn't recover any of it (well, I think they could get 128kb out it and it was going to be 800 bucks). But, that was 10 years agao. Have things changed??! Hmmm... what could they do?

Anyway, I got in touch with Drive Solutions, Inc. and they gave me the rundown of what it would take to get data back. I wasn't sure if it was really going to be worth it (especially for the costs) but I still wanted to go through the process (for multiple reasons - some of it was for the data but some of it was for this post - and to remind people of what's possible (myself included)) given that we talk about the importance of backups and the UNLIKELY potential for data recovery off of damaged disks...

The long story short (ah, too late :) is that they can do amazing things these days (NO GUARANTEES THOUGH!!!) and they can completely rebuild the drives in a clean room - replacing drive heads, etc. Once complete, IF they get anything back, they'll give you a directory structure of what they've recovered and different options for getting it back to you (DVDs of just critical directories (there's a cost for each DVD after the first one or two) OR you can purchase a new comparable drive and they'll copy it over). The whole process took about a week (and this was for expedited service - which was also an extra charge). However, and amazingly, they recovered everything (well, I've only done a bit of spelunking but so far, so good)...

And, here's the coolest part, I was working on some SQL files at the exact time of the disaster and well, I went to the \Documents and Settings\username\My Documents\SQL Server Management Studio\Backup Files\ directory and found a directory of Solution1. In it were 3 files with similarly ugly names (~AutoRecover.~vsC.sql, ~AutoRecover.~vsC.sql~RF93d469.TMP, etc...) and the AutoRecover file was 2KB (the others were 0KB). Anyway, I opened up the file and viola! I actually recovered the .sql script I was working on at the time I dropped my laptop (well, I still blame this on Paul cause he made me lean over to talk to him and this ultimately pushed my laptop off the crappy airline table ;) ;) ;) :).

Needless to say, I am pretty amazed at what they can do now... but, I'm certainly not going to rely on that AND, it wasn't cheap!!!

So... what did I learn??!

1) First and foremost, data recovery is NEVER GUARANTEED. (yes, ok, we all knew that. However, I think we just need to say it out loud a few times :).
2) Even if they can recover some of the data, data recovery is not lightning fast. Even with the more expensive expedited service it takes time to ship (overnight), get the drive into a clean room, rebuild it from parts of an exactly matching (including BIOS/firmware) drive, test it, copy it to another drive, ship it back (overnight). Maybe you can find a place that's local, that would help but, it's still time...
3) It's expensive... expect about $10-15 per GB. And, I guess that some will think that's a crazy amount... Again, I had multiple motivating factors - one of which was also related to some pictures I had taken over the weekend with family in FL, some was for data, some was even better to understand this overall process... I expect that got about 10 hours back plus some photos and, I'm pretty impressed with the overall process (in general).

More than anything, I'm going to get even better at daily full system backups when I'm on the road (scheduled/overnight to an external drive) and I might even copy critical stuff to something like Windows Live Skydrive (or something like that). And, while on a plane, I might keep a small/simple USB stick handy if I do something that I really don't want to lose while on a long plane ride. I think new technologies like "mesh" and "cloud" are really interesting and definitely the direction to better performance AND *possibly* minimizing data loss but you're always at risk if there's only one copy. All of this might seem crazy but well...... I've been called worse ;-)).

So, just to wrap things up, I'll be doing a bit of final tweaking on my indexing blog post + my indexing demo scripts from Connections + my metadata script that I was working on at the time of the laptop disaster (which is also why it's been a while since my last post) and I'll be leveraging some of my favorite tools (Beyond Compare) to determine all of the differences between my recovered data/directories and the system I rebuilt from a backup (especially now that it has also changed over the course of this week since I moved over to my backup laptop). And, I've now ordered a new harddrive for my primary laptop. Sadly, I'm getting good at laptop rebuilding.

Thanks for reading,
kt

PS - When did you last backup your home/personal/less-critical system... is it really less-critical?

Categories:
Opinions | Personal | Resources

OK, we were in Iceland and then Florida for our Accidental DBA workshops and both went really well. People agree that there are quite a few involuntary/accidental DBAs out there and overall, we helped quite a few to see a lot of options for better performance, availability, recovery, and/or just manageability.

So, this is our "resources post". We waited until after the SQL Connections delivery to post these as we figured we might add a few more to the list (as is typical when you deliver content more than once - it's really never the same twice!).

Also, I used a few "interactive" (or build) slides in my presentation - specifically on transaction log backups and the concepts of "clearing the log" which really only clears the inactive portion of the log. To help you visualize this, I've added these slides here: TrippRandal_ClearingTheLog-BuildSlides.zip (647.2 KB).

Finally, we've taken all of the scripts that we demo'ed and placed them on SQLskills on our Past Events page here: http://www.sqlskills.com/pastConferences.asp.

And, if you were there and you think we missed something, feel free to ping me (or Paul!) with an email and we'll make sure to update this resources post (and/or [at least] help you find it what you're looking for!!).

Next stop - Microsoft TechEd ITPro in June (we're back in Orlando again)!
kt

OK - so this has been frustrating me for many months... when you create indexes with included columns (which was a new feature of SQL Server 2005), they're not shown by sp_helpindex or by DBCC SHOW_STATISTICS. I understand this not showing for statistics because included columns are not factored into the histogram (that's only the high order element which is the first column in the index) OR the density vector (which only shows the densities (or averages) for the left-based subsets of the key). So, why doesn't sp_helpindex show it? Well... I guess it just didn't get updated for SQL 2005. So, in SQL 2008, I was hoping I'd not only see included columns but also filtered indexes... well, neither is there and sp_helpindex is still the same old proc. Don't get me wrong, you can use SSMS to see all of the index properties for a single index (pane, by pane for each property) OR you can run queries to find the included columns for a given index:

SELECT
(CASE ic.key_ordinal WHENTHEN CAST(AS tinyint) ELSE ic.key_ordinal END) AS [ID],
clmns.name AS [Name],
CAST(COLUMNPROPERTY(ic.object_id, clmns.name, N'IsComputed') AS bit) AS [IsComputed],
ic.is_descending_key AS [Descending],
ic.is_included_column AS [IsIncluded]
FROM sys.tables AS tbl
   
INNER JOIN sys.indexes AS
      
ON (i.index_id >AND i.is_hypothetical = 0) AND (i.object_id = tbl.object_id)
   INNER
JOIN sys.index_columns AS ic 
      
ON (ic.column_id >AND (ic.key_ordinal >OR ic.partition_ordinal =OR ic.is_included_column != 0)) 
         
AND (ic.index_id = CAST(i.index_id AS int) AND ic.object_id = i.object_id)
   INNER
JOIN sys.columns AS clmns 
   
ON clmns.object_id = ic.object_id AND clmns.column_id = ic.column_id
WHERE (i.name = N'[MyIndex]') AND ((tbl.name = N'[MyTable]' AND SCHEMA_NAME(tbl.schema_id) = N'[MySchema]'))
ORDER BY IsIncluded, [ID] ASC

but, there isn't a nice clean way to show all of the included columns for all indexes for a particular table... until now :)

A couple of weeks ago I sat down and rewrote sp_helpindex. I was actually on a plane from Hyderabad to Frankfurt or from Frankfurt to San Fran or from San Fran to Seattle (it was a long day :) and I was using (and well, forcing myself to learn how to use :) my new Vista laptop. OK, that's a HUGE story in and of itself and it definitely warrants its own post but I'll sum up the story with the fact that I had to purchase a new laptop while in Hyderabad because BOTH my primary laptop (T61p) AND my backup laptop (T60p) BOTH (yes, BOTH!!!) suffered catastrophic disk failures on their boot drives within 24 hours of each other. In the end, I really cannot believe the "coincidence" of two laptops crashing within 24 hours of each other. Yes, I thought MTBF too (at first) but the laptops were two Lenovos - one Lenovo (the T60p) was purchased in Feb 2007 and the second, a Lenovo T61p was purchase in Oct 2007. And, it was the T61p that went first. The only thing I can even begin to speculate about and/or think to attribute it to (as I was in India for 17 days from Mar 3 through Mar 20 and this all started on Mar 17) was an overactive metal detector at the hotel at which I was staying (or something related to St. Patrick but I've since ruled that out - and no, I wasn't drinking green beer either...). OK, I really need to do another post to give you all of the details about this trip BUT, I did get a new laptop... and, having just bought it only shortly before I flew back I felt like I really needed to get my money's worth so I just *had* to work on the flights home (ah, security with *3* laptops was NOT fun and I'm *VERY* glad that none of them asked me to "boot" my laptops to prove they were working... that could have been a VERY bad situation... lol).

OK - so back to the story... I was working on the flights and I was preparing to deliver some content on the Friday after I returned (yes, I taught a full day in India on Wednesday then flew back leaving India at 2:15am Thursday morning so that I could arrive back in Redmond at roughly 7pm Thursday night - about 30 hours later - and then teach Friday morning for an 8:30 start time... ah, I was *really* tired on Friday night :). Anyway, in preparing, I decided that I finally needed to re-write sp_helpindex. When I was first writing it, I was only thinking of SQL Server 2005. So, here's the 2005 version that I wrote: sp_helpindex2_2005.zip (2.71 KB).

So, I had wanted to blog that when I got back to Redmond but in preparing for the trip we're on now AND rebuilding my primary and backup laptops, well, it got tabled. So now, today, Paul and I are in Iceland (working with our great friends at Miracle Iceland) and we're teaching "the Accidental DBA" (this past Monday) and SQL Server 2008 New Features in Database Infrastructure and Scalability (Tue through Thursday)... I was giving a lecture on Filtered Indexes in SQL Server 2008 and I, once again, found myself needing a better sp_helpindex. So, when Paul got up to talk about Compression (which is no short lecture for him :), I had time to rewrite sp_helpindex... again. And, here's what I ended up with...

exec sp_helpindex2 'member'

index_name index_description index_keys included_columns filter_definition
member_corporation_link nonclustered located on PRIMARY corp_no NULL NULL
member_ident clustered, unique, primary key located on PRIMARY member_no NULL NULL
member_region_link nonclustered located on PRIMARY region_no NULL NULL
NCIndexCoveringLnFnMiIncludePhone nonclustered located on PRIMARY lastname, firstname, middleinitial phone_no NULL
NCIndexCoversAll4Cols nonclustered located on PRIMARY lastname, firstname, middleinitial, phone_no NULL NULL
NCIndexLNinKeyInclude3OtherCols nonclustered located on PRIMARY lastname firstname, middleinitial, phone_no NULL
NCIndexLNOnly nonclustered located on PRIMARY lastname NULL NULL
QuickFilterTest nonclustered located on PRIMARY lastname phone ([lastname]>'S' AND [lastname]<'T')

So, in the end, I can quickly see whether or not my index has included_columns and/or a filter_definition. Don't get me wrong, these indexes above are NOT necessarily a good combination of indexes (or recommendation of ANY kind) to have - these were just created to make sure that my code works. And, as my good friend Gunnar would say - "it's not my best code but it's not my worst code either" <G>. And, so, here it is: sp_helpindex2_2008.zip (2.75 KB).

Pretty darn useful for sure! Oh, and I used the undoc'ed sp_MS_marksystemobject so that I could still create the sp_ in master but then execute it in all other databases. It's frustrating that this behavior (with sp_ named objects) no longers works in 2005/2008 but at least the sp_MS_marksystemobject still sets the behavior so that we can create this one proc in master but use it in all other databases.

Have fun!
kt

A couple of weeks ago, Paul and I recorded two interviews with TechNet Radio... both are ready for download and in multiple formats! 

Our specific interviews can be downloaded from the following links/formats:
  SQL 2008 Part 1 of 2: Security and Availability WMA | MP3 High | MP3 Low
  SQL 2008 Part 2 of 2: Management, Troubleshooting and Throttling  WMA | MP3 High | MP3 Low
  More TechNet Radio interviews (and *lots* of other shows), can be found on Channel 9.

Enjoy! 
kt

Categories:
Events | Resources | SQL Server 2008

I know that Paul and I recommended that you subscribe to Conor's blog... but have you? He's posted some great details on Partitioning (Part 1 and Part 2) as well as statistics and it always reminds me of how much I can learn from other people's perspectives!

And, just to dove-tail on some of his statistics comments... I, too, have found that as tables get significantly larger AND have non-standard distributions of more than 200 distinct values (and un-even distribution between those values as well), that the optimizer just cannot possibly do a perfect job. The only way an optimizer can be good is when it can "find a good plan fast" (which I first heard from Nigel Ellis (former Development Manager of the Query Processor team) - back when he delivered a Pacific Northwest SQL Server User Group meeting many moons ago). The most important thing to realize is that it's just not possible to waste time to find the absolutely best plan... mathematically analyzing all permutations would be prohibitive - you'd have to take a vacation between query executions (wait, that's not a bad idea... I digress :).

The point:

  1. Make sure that statistics are up-to-date (either through the database option: auto update stats OR by manually updating statistics)
  2. Consider re-evaluating statistics over large tables (and, when poor performance occurs - look at the estimated rows v. the actual rows - if the estimate/actual are off by a fact of 10, then it could be the statistics). I'd try updating stats first and then if that doesn't work, updating with a fullscan. If neither of those work, I'd also re-evaluate other possible indexes (there are some distributions between tables being joined that just can't show a correct correlation between the values when in multiple indexes... sometimes the best index is a multi-column (ie. composite index)). 
  3. Consider breaking very large tables down into smaller chunks (not just table index partitioning but possibly Partitioned Tables AND Partitioned Views) as this can give the optimizer additional details about partiticularly interesting data sets. Even in SQL Server 2008, statistics are still table-level (filtered indexes can provide some, but not complete, relief... I'll give more details in a later post) but I'd often argue that some of the best table designs are not just for a single table. Consider the statistical, locking, and indexing implications for mixed workloads against a single table (and the tremendous amount of blocking that could occur in addition to varying access patterns). And, even while 2008 will offer Partition-level lock escalation, well-designed tables may not need it! I know I've mentioned this before but different perspectives on statistics, optimizers and the fact that a good optimizer has to be efficient in-and-of itself, remind me of some of the most basic things that are also the most common problems contributing to poor query performance.

Returning to the basics and optimizing a system from the ground up always leads to better scalability!

Enjoy!
kt

 

I had started to write this blog post when we (Paul and I) were on our way back from Zurich on November 21. We had been in Zurich presenting a TechNet DeepDive on Database Maintenance Best Practices...after presenting at ITForum in Barcelona...after presenting at SQLConnections in Las Vegas (well, we did spend 30 hours at home in between those last two conferences :)). Once we returned to the US on Wednesday night, it was just before Thanksgiving and well, a few personal things prevented me from getting as much work done as planned in December... and then the holidays hit... and then I realized I was horribly behind and so I've been playing catchup ever since. Ah, Happy New Year. 

Now, I'm really back in the swing of things and I wanted to let you know what we've done!

First things first - we've posted almost all of our resources from our conferences in November (we still have a few more to tweak/post). We'll catch up with the remaining scripts this week! Check them out on our Past Events page here: http://www.sqlskills.com/pastConferences.asp.

And, now, here are my thoughts as I wrote them on November 21 - with a few updates along the way in this font:

**********

It's been a great 2.5 weeks with 4 full-day workshops, 17 sessions, 3 interviews and 8 flights to conferences/events in 3 countries. OK, Paul has been with me too and that hasn't stopped him from blogging (I know you're all thinking this :) but, he blogs shorter blogs posts and I go for quality rather than quantity (TOTALLY kidding...lol...I am SO in trouble for that one (update: I'm glad I didn't post this until after the holidays!)). And, well I've also been pushing through a bit of a cold (and, Paul doesn't sleep ;=).

So, here's a long post to catch you up on all the travels and even some of the great questions we've been asked while running around from conference to conference. First, it was a great week in Vegas for SQLConnections. We ended that week with a full-day workshop that was all hands-on on SQL Server 2005. Our 80+ attendees downloaded a VPC image and used that for the base environment for labs on Database Mirroring, Database Snapshots, Partitioning, Partial Database Availability and Online Piecemeal Restore... (Update: We received our evaluations from this session (only 37 evals were submitted) but they really seemed to enjoy it! Our reviews were fantastic (literally 4 of 4) and so, we're planning to do this again for the Spring SQL Connections. More details coming.)

And, speaking of the VPC/DVD... We've had a lot of requests for these resources (the DVD, the lab manual, the utilities, etc.) and we're already working on an updated version of this for SQL Server 2008 CTP5 which just came out on Friday (November 16) (update 1/13/2008: we've finished the November CTP update and it will be available on the SQL Server 2008 Readiness Kit). And, that's just a start! (update 1/13/2008: we've also released the DDM, check out Paul's post here). With 9 labs on the current SQL Server 2005 version, a few new exercises planned for the first 2008 version (the labs include exercises on automatic page repair for Database Mirroring AND updates for Peer to Peer including the Peer Topology Viewer, etc.) AND a second DVD on Manageability (already working on labs for SQL Server Policy-based Managment and Performance Studio (update 1/13/2008: these two are done too and they will also be available on the SQL Server 2008 Readiness Kit). I think we'll have a TON of resources to help you get started with SQL Server 2008 by the time it releases. We're also looking (for the first time!) to put together a way for you to access these resources more directly. I *promise* we'll keep you posted on that as soon as we have the final outcome! (update 1/13/2008: and, that's the Readiness Kit! Now, we just need to find out all of the details on how/when you can access it. We'll let you know as soon as it's available! It's likely that you will receive it as an attendee at a launch event - February 27, 2008. Check out the launch portal here and you can see when/where a launch event is coming to you: http://www.microsoft.com/heroeshappenhere/register/default.mspx. I can't promise that they'll all give out the Readiness Kit but, that's the most likely place (of which I'm aware at this point) where you'll receive one. If I find out any additional information, I'll post it on my blog.)

As for some of the favorites from the labs - people seem to love the Database Mirroring SQLCMD master script that sets up the High Availability Configuration for Database Mirroring. So, I thought that this might be an interesting script to post here: GenericDatabaseMirroringSetup.sql (16.73 KB). And, to make it even more flexible, I have modified this script quite a bit and made almost everything parameterized (PrincipalServer, PrincipalDNS, PrincipalPort, MirrorServer, MirrorDNS, MirrorPort, WitnessServer, WitnessDNS, WitnessPort, Database2Mirror, RestoreWithMove, BackupLocation). Also, you can decide to "move" the database being restored on the mirror (to the mirror instance's default data root) OR keep the database backup directories exactly the same (which is generally the recommended configuration). In our lab, we move the database to the instance's default directory when it's restored on the mirror because all three instances are on the same virtual machine. However, in a real-world database mirroring environment, you want to try and avoid moving the database to a different drive location on the mirror because future changes might cause the mirroring partnership to become broken. So, if you want to play with this script, be sure to modify the parameters at the beginning of the script and then make sure you test this in your environment (and, let me know if you run into any snags... I know this works in our VPC but I just whipped up the modified script relatively quickly so it's a *learning* script more than anything!). And, here are a few tips for successfully implementing a database mirroring partnership:

  1. Make sure that the principal and mirror server are either identical to each other (in every way) from disk to memory to CPU, etc. When the principal and mirror are identical, you are more likely to minimize performance problems during a failover AND you're less likely to have additional slowdowns on the principal (in a synchronous mirroring configuration) by slower hardware on the mirror. If you don't have identical hardware then you want to either choose an asynchronous database mirroring configuration OR be sure to thoroughly test your before and after failover configurations to ensure that performance is "good enough" in a failover sitation. Some shops find it acceptable to run slower during a diaster (v. having downtime) and therefore they choose a secondary server (for the mirror) that's not quite as powerful as the principal... However, beware that if you're in a synchronous mirroring configuration then your mirror's performance might affect the performance of the principal.
  2. Test BOTH your OLTP activity load as well as your batch load AND be sure to test over an entire business cycle. You might be surprised to find that month end processing might surprise you AND/OR a batch process that runs maintenance (for example an index rebuild OR an index defrag over a heavily fragmented table). We've seen a few customers that had network problems when under a significant load and if not tested, this could compromise your mirroring partnership by not allowing the mirror to stay in sync OR it might cause "throttling" at the principal. If the database mirroring partnership has over 1MB of of unsent log waiting, then performance on the principal will be slowed to try and help the mirror catch up.
  3. Be sure not to make any assumptions about the way that database mirroring works - even if you have multiple databases being mirrored on the same server. Database mirroring is always between only ONE "principal" database and it's mirror copy. If you want to mirror multiple databases on the same server - to the same secondary server - that's possible. However, this presents additional problems in that a failover is between ONLY a single database and it's mirror. If you have an isolated database failure that causes one of (let's say four) related databases to failover, then three-part naming will suddently start failing as you have a combination of mirrors and principals on the same server and local (three part naming) won't work against a mirror database as the mirror database(s) cannot be accessed directly. As a workaround, you could create an alert on the WMI event: database_mirroring_state_change which then forces a failover for the remaining databases... Effectively all four would then failover. While this will work - and, the alert will fire relatively quickly after the first database fails over, it's important to realize that alerts (and this WMI event) are all *after* the actual failover. The secondary failovers would be asynchronous, response-based events. As a result, there could be a few transactions that fail between the first database failing over and the remaining databases failing over. In a diaster case however, this might be acceptable and if your application is well designed, you might be able to make this relatively seemless...

Also, within the SQL Server 2005 Always On VPC, we found a bug (yes, I know... the shame of it!! ;-) but (and this is our saving grace...) it's fairly minor and only requires a couple of quick changes to database mail configuration options. We decided late in the game (which is always a bad mistake) that we needed wanted to change the Windows Server name (and we all know what a pain that is!). In fact, there have always been problems associated with a server name change (at least problems for SQL Server picking up this change):

  • The server's @@servername setting will not be correct. There are actually a few problems that can occur as a result of this but it's also very easy to fix. The steps are quick and Tibor (an MVP in Sweden) already blogged them here: http://www.karaszi.com/SQLServer/info_change_server_name.asp so I won't repeat.
  • Jobs will have problems when edited. The problem with jobs is that they have a different "originating_server" than the new (when changed) servername and so when you open a job and try to modify it, SQL Server thinks that the server is a target server in a master/target environment. And, when you're just a target server receiving jobs from a master, those jobs cannot be modified, they can only be executed. As a result, the job cannot be changed. If you're going to change your servername, you need to make sure that your jobs change with it. Tibor also blogged the update to msdb.dbo.sysjobs that's necessary.
  • This one is NEW for SQL Server 2005 - The Windows groups that SQL Server creates on non-failover cluster servers (if you're on a failover cluster, you must create the security groups manually - check out this whitepaper for more details on failover cluster setup: http://www.microsoft.com/downloads/details.aspx?FamilyID=818234dc-a17b-4f09-b282-c6830fead499&DisplayLang=en) will no longer correspond to the server name (after the change). So far, I have not seen nor heard of a case where this creates a problem - but I won't be surprised if one of you responds with an issue or two! SQL Server 2005 uses security GROUPs to manage security and service/component-level permissions so that service account changes don't require permissions to be removed from that account name (in 2000 they did this and the side-effect was that permissions that had already been granted for some other reason - and just happened to be duplicated with the required SQL Server permissions - were removed when you changed the service account. As part of the service account name change and cleanup, they removed all of the permissions needed to run an instance of SQL Server. In SQL Server 2005, they changed this model solely to put you into the correct server group (like SQLServer2005MSSQLUser$servername$instancename) and take the former service account out.

And, of course, there are potentially a lot of [additional/other] external dependencies when you make a servername change... and, well, that's where we missed one. Part of it is because we also setup a POP3/SMTP mail server inside the VPC and when we changed the servername, we also changed the mailserver name (and then forgot to change the Database Mail settings, Outlook Express Account and SQL Server Agent Operator settings). So, you need to change the references to it. The servername was SQLHASP1 and in the June edition of the Always On DVD, we upgraded to SP2 and changed the servername to SQLHAVPC (notice the more generic name... duh!). For completeness, I'll put a bit more detail here:

Database Mail - Manage account, View existing account, check correct domain and server name
SQL Agent - Ensure operators have correct domain and NET SEND address
Outlook Express - Ensure correct server name + domain name

And, now, you should be able to sucessfully use this POP3 mail server in the lab exercises on WMI Events with Database Mirroring. For more details and references (even if you don't have the lab content) check out the links to all of the Database Mirroring whitepapers here: http://www.sqlskills.com/whitepapers.asp. (update: Paul blogged about a bunch of new whitepapers here and some of these are very new!)

**********

And, that's all I had in my November 21 version of the blog post (see, this is what happens, I write a huge post and then start researching things and put the post on the side to post later... and, later is WAY later in this case.... Sadly, I have a bunch of half-finished posts like this. More posts than time :)

So, a lot more to post over the coming days/weeks as the SQLskills team is heads down in SQL Server 2008 working to create some great content (for example, the labs mentioned above).

And, HAPPY NEW YEAR! (at least I said this before January was even over :)

Cheers,
kt

Over the past couple of years, we've had *tons* of requests for the DDM (the Dual Database Monitor - which is a little utility for testing and checking out a mirroring partnership...) and finally, we've (ha, pretty much only Paul :)) made it more flexible (it was hard-coded with specific insert procs). He's created an ini to allow implicit or explicit client redirection (to see the benefits of Transparent Client Redirect), written a help file, and he'll keep you posted with updates/bug fixes...

If you're interested, Paul's blogged about it here: http://www.sqlskills.com/blogs/paul/2008/01/13/DualDatabaseMonitorNowAvailableToOrder.aspx

Have fun and HAPPY NEW YEAR!
kt 

Categories:
Resources

Every time we're at a conference, we get asked whether we're going to write any books on the kinds of things we talk about. Well, at present the answer is no - we just don't have time unfortunately. And, honestly, we like doing smaller chunks of more timely content rather than a complete book which takes a lot longer to "get to press". However, we have a plan that we'd like to gauge interest in to see whether or not it's worthwhile (Paul already blogged about this).

Basically, there are two full-day workshops that we presented this year, Database Maintenance: From Planning to Practice to Post-Mortem and Disaster Recovery: From Planning to Practice to Post-Mortem. Each of these has 100+ slides in them and we add *a lot* of content while we're presenting...just looking at the slides really isn't enough. However, our plan is to take each of these slide decks and then add comprehensive notes for each slide. Once complete, we'll create a full printout of the entire deck - with notes - for US $99.99. In addition to the printout, we will create a section on our website that includes relevant whitepaper links, KB article links, and blog post links - corresponding with specific slides/topics from the deck. Additionally, as we'd be shipping the book to you anyway, we'll throw in a free copy of the AlwaysOn DVD and the Manageability DVD, with around 20 hours of self-paced labs on them. Shipping and handling would be included for domestic US orders, and US $10 extra for non-US orders.

So, if there's enough interest in this, then we'll go forward and try to make these available for ordering by January/February 2008. So - if you're interested in this (no obligation) please drop Paul a mail.

Thanks,

kt (& pr)

Categories:
Resources

OK, well, the first day is over and we're starting to relax... just had a nice meal in our room and we're off to each do a blog post (or what might turn into a couple :) regarding the things we each discussed in our full day workshop today.

So, it was a great day and a great way to start the Connections event. We had roughly 170 people attend our Sunday workshop that started at 9am which is especially impressive in Vegas!! And, it was a great way to start because it felt like everyone was ready with questions and a few folks (especially those with jet-lag) said that they didn't think they'd make it through the day but then were surprised at how fast it went and that we managed to keep them awake :)...

As for the questions, we answered a lot of them during the session but as it always happens, we each remembered additional references, sites, and/or details that we always want to post after the fact - this post is the result. I'll try to put headers on the sections but I think this will be a long one!

Here's a link to Paul's Q&A blog post from today's session:
http://www.sqlskills.com/blogs/paul/2007/11/05/ConferenceQuestionsPotPourri1IndexesStatsCorruptionAndEnterpriseonlyFeatures.aspx

So, have fun and we both look forward to seeing you tomorrow in the pre-conference workshop on Disaster Recovery - from Planning to Practice to Post-Mortem. The rest of this post contains random but helpful stuff (and in NO particular order at all). And, there's at least one more post coming after this one as I still have more to add!

Resource Reference List

  • KB#909369: SQL Server 2000 Bug where checkpoint is not appropriately clearing the inactive portion of the log
  • KB#329526: File allocation extension in SQL Server 2000 (64-bit) and SQL Server 2005 (this allows 256KB block allocations)
  • KB#328551: PRB: Concurrency enhancements for the tempdb database (this refers to trace flag -T1118 for SQL Server 2000 AND how to create tempdb on multiple files when on multiproc machines). A good and very complementary read is the "Working with tempdb" whitepaper. AND, the primary author of this whitepaper (Sunil) will be here at Connections this week, so there's another session to visit.
  • KB#224071: How to move SQL Server databases to a new location by using Detach and Attach functions in SQL Server
  • Performance Dashboard Reports (only in SQL Server Management Studio for SQL Server 2005 SP2)
  • sysinternals Zoomit tool: this is what I was using to magnify various parts of the screen, etc.
  • SQL Server Customer Advisory Team Blog and specifically the entry on DMVStats. Also, here's an excellent whitepaper called Troubleshooting Performance Problems in SQL Server 2005 as this has numerous DMV queries and helpful details on troubleshooting.
  • SQLskills Whitepapers list

Progress Report: Online Index Operation
As for the details on this - you want to capture these events: EventClassTextDataEventSubClassObjectIDObjectName, and BigIntData1

And, for the EventSubClass there are multiple values that will be returned. What I think is the most interesting is that many of these events return only one of the values for either ObjectId or ObjectName...so, beware of filtering!

  1-Start (both ObjectID/ObjectName)
  2-Stage1 start (null for both ID/Name)
  6-Insert row count... (ObjectID only)
  3-Stage1 end (null for both ID/Name)
  7-Complete (both ObjectID/ObjectName)

EventSubClass 6 is definitely the most helpful. In the BigIntData1 column, they will show the current row number being processed (essentially) and so, you can guage roughly how far you have gone as well as how long you have to go. Here's a screen shot to help you get some insight into what you will see: OnlineProgressReport.pdf (690.78 KB).

Some operations (e.g. DBCC CHECKDB (and related), SHRINKFILE, ALTER INDEX...REORGANIZE) produce a value for the percent_complete column in sys.dm_exec_requests. Both of these are helpful for assessing where you're at...

Index Creation/Rebuild Order
Paul's written about this AND he's linking to some relevant posts in his blog post from today's session but here's a VERY quick highlight of a few things we chatted about:

Index CREATION order DOES matter (you should always create the clustered index first and then create the non-clustered (as non-clustered indexes depend on the clustering key)
Index REBUILD order does NOT matter as the physical location of the data isn't interesting to the non-clustered indexes as they reference the data by the row's clustering key

VLFs - Virtual Log Files... too many (and typically very small) OR even possibly too large of VLFs....
We had some great discussions around VLFs and I know I've posted a bit on this in the past but I'm not sure I've posted this much detail. So, here are a few relevant points.

First, the size and number of VLFs is determined by the size of the "fragment" added to the transaction log. The "fragment" is added at the time the log is created or anytime the log grows (either manually or automatically). The number of VLFs can be predicted from the following quick guide:

  • If the fragment is less than 64MB, then up to 4 VLFs will be added (the reason I say "up to" 4 VLFs are added is because really small fragments - fragments under 1MB - will actually add even fewer VLFs and I can't remember (nor do I care :) what the exact cut-offs are for < 1MB.
  • If the fragment is greater than or equal to 64MB and less than 1GB, then 8 VLFs will be added.
  • If the fragment is greater than or equal to 1GB, then 16 VLFs will be added.

So, if you create a database with a 100MB log (just as a simple example), you will get 8 - 12.5MB logs... if you then increase the log to 150MB then you will add 4 more VLFs. This really isn't all that bad and in general, I recommend that you have less than 100 VLFs. Often I'll find folks that have thousands and this results in VLF fragmentation. I blogged about this originally here: http://www.sqlskills.com/blogs/kimberly/2005/06/25/8StepsToBetterTransactionLogThroughput.aspx and the steps to help minimize it are there as well. Also, as an update to that post - use DBCC SHRINKFILE with the NOTRUNCATE as it appears to have special meaning on the transaction log... but, I still need to investigate this one a bit more.

Finally, if you pre-create too large of a transaction log then you might have new/different problems... the most likely is that the log doesn't clear as often (because you haven't spilled into a new VLF OR you have a transaction that happens to span two VERY large VLFs) and the effect can be that performance is slowed by the less frequent clearing that has to happen on a very large log... As a way to minimize this, the best way to create larger logs (logs in the 10GB+ range), is to create them in multiple chunks (for example 4-8GB chunks so that you end up with VLFs that are 256MB-512MB instead of 4GB because you created a 48GB log to start).

Installation Instructions for the post-conference workshop AND the HOLs DVD
And, for those of you who want to play with the DVD we handed out today... here are the "generic" setup instructions:
Generic HOLs DVD SETUP Instructions.pdf (20.31 KB)

OK, that's it for today and we look forward to another infomation/question packed session tomorrow!

Cheers,
kt

OK, if you've been reading at all... you know that Paul's leaving Microsoft (ok, he's left... yippie!) and joining SQLskills (ok, he's joined). And, well, that's the reason for this blog entry. He's going to work on some updates to his prior posts (in the very near term) on the SQL Server Storage Engine blog as well as many new ones. And, he's threatening me to blog more as well.

So, be sure to subscribe to Paul S. Randal "In Recovery" (very well named :): http://www.sqlskills.com/blogs/Paul/ and here's his rss: http://www.sqlskills.com/blogs/paul/SyndicationService.asmx/GetRss.

We have a TON of exciting things planned, including some upcoming events that we're delivering together:

And, that's it for now...........but stay tuned - there's a lot more coming from us as SQLskills expands!

THANKS for reading,
kt

Categories:
Events | Resources | SQLskills News

Well, I'm not sure where May went but I do know that part of it went to my 2 weeks in Hyderabad, India at Microsoft IDC (the India Development Center) and what was left was spent getting ready for upcoming events such as TechEd (and some personal events too :). And, the first big hurdle of TechEd is over...yesterday, Bob and I delivered a full day preconference seminar on Leveraging SQL Server Always On Technologies to Achieve High Availability and Scalability and the day could have been a week long (ok, do any of you ever know of a one day event where we couldn't expand into more time! :). But, the precon event went really well and was great fun. We had a lot of questions and ended the day close to on time but then we stayed late (Bob and I - obviously, and Paul Randal, too) for questions. In fact, Paul blogged about something he learned: the black box trace. Check out his blog entry and tips for using the trace.

Today was all about exciting future releases and the one at the top of my list: Katmai - now officially announced as SQL Server 2008. There are certainly many more details coming soon on Katmai but for now, here are a few interesting items to read/watch:

Website: SQL Server Katmai website
Press release: Microsoft SQL Server “Katmai” Builds on Proven Success of SQL Server 2005, Empowering Customers to Manage the Data Explosion
Whitepaper: SQL Server 2008 Product Overview
Download CTP3: SQL Server 2008 CTP3

And, there are quite a few other resources available! So, start checking out the upcoming version so that you are ready when it's released! I'll have a lot more to say...shortly (relatively speaking :).

More to come from TechEd!

Cheers,
kt

Categories:
Events | Resources | SQL Server 2008

OK...SP2, the SP2 refresh and then the parallel/subsequent GDRs has seemingly (and rightly so) confused some of us... However, thanks to the PSS Engineers blog (and specifically Bob Ward - Senior Escalation Engineer, Microsoft PSS), this blog entry clears up a lot of that confusion. The end result is that you should be at 9.00.3054 or 9.00.3159. 3054 is the correct one if you haven't had any special hotfix/GDRs directly from Microsoft PSS and 3159 is for those of you that have. For me, I think the best part was the reiteration of the fact that "Microsoft Update will notify you of this" and the comments made that "Microsoft Update is smart enough to recognize you need this specific version of the GDR2 fix...". The most interesting part of all of this is the reminder that SQL Server IS included in Microsoft Update. What's the most interesting is that most people are still using Windows Update and Microsoft Update is DIFFERENT. You need to (essentially) replace Windows Update with Microsoft Update (although it's not that simple - of course...). Basically, you need to install Microsoft Update and then remove Windows Update. So.... if you haven't done this - you should. At least on your main desktop/laptop machine (at first) and then on other machines from there. I can't remember when this originally came into place but a few folks asked me about the difference, etc. and how SQL Server fits in and well... it's all about Microsoft Update now not Windows Update (however/fyi, Microsoft Update looks and feels exactly like Windows Update but it includes Windows, Office, SQL and Exchange). If you want to find out more, check out the Microsoft Update FAQ here.

And, along the lines of maintenance... Paul Randal (of the SQL Server Storage Engine blog) would like to know if you have time to fill in a survey on YOUR VLDB maintenance practices. This is pretty important for them to know. He explains what they'll use it for and why it's useful to them. Be sure to check out his blog entry here.

Finally........... lots of final session writing/planning going on for TechEd. Bob Beauchemin and I are delivering a pre-conference workshop titled: Leveraging SQL Server Always-On Technologies to Achieve High Availability and Scalability. It's on the Sunday prior to TechEd and it's a new session for us. Here's the abstract:

PRCN06 Leveraging SQL Server Always-On Technologies to Achieve High Availability and Scalability 
System down time and lack of scalability for mission critical applications can result in loss of revenue and business creditability. Planned downtime is typically caused by hardware upgrade, application or OS upgrade, applying a service pack, or performing routine maintenance task. Examples of unplanned downtime are hardware or software failure, natural disasters, and human error. In fact, human error has been identified as the number one cause of downtime. SQL Server 2005 Always-On Technologies provides a full range of options for achieving and maintaining appropriate levels of availability. Because the product offers so many choices, it is difficult to choose features that provide the best availability solution for a given application. In this session, we provide an in-depth description of these technologies and delve into scenarios and best practices in deployment of the availability technologies. The high availability technologies covered include Database Mirroring, Database Snapshots, Peer-to-Peer Replication, Clustering, Online Indexing, Online Restore, Piecemeal Backup & Restore, Partial Database Availability, Table and Index Partitioning, Snapshot Isolation, DDL Triggers, and others. The second part of this session focuses on scalability and building systems that scale-out to multiple servers. Building a scale-out application with SQL Server 2005 may entail using techniques and features that are unfamiliar, or are new. This session provides in-depth information about the internal implementation of scale-out features such as Service Broker, Query Notifications, Distributed Partitioned Views, Scalable Shared Databases, and Peer-to-Peer Replication. The session also includes troubleshooting techniques using Profiler and the new dynamic management views.

As for content, we'll have our lecture content available to all attendees, we're going to giveaway AlwaysOn DVDs (more info coming up) AND Bob, Paul and I are going to hang out after the workshop to answer even more questions... So, if you're looking to burn budget for FY'07 AND you want to attend an information packed (and fun ;) pre-con workshop AND a great conference for breadth/futures (a bunch of Katmai sessions at the event too), then you should sign up for TechEd before it sells out......again. Also, there are a bunch of sessions at the conference that might interest you - Paul and I are doing a Chalk/Talk Q&A on VLDB Maintenance, I'm doing a demo fest on AlwaysOn, Paul's doing a session on Corruption Detection and Recovery, Bob's doing a session Windows PowerShell and SMO Together (oh, and he's listed as Robert Beachemin...not sure why???) ...and that's just to name a few!

Oh, and the AlwaysOn DVDs are cool because:

  1. they have a setup.exe that runs to create vhd/vmc files that allow you to access a predefined VPC image.
  2. Virtual PC is free and Virtual Server is free... you can use EITHER for the Virtual Environment.
  3. the VPC is a Windows 2003 Server setup with SQL Server 2000 and SQL Server 2005 (multiple instances) and allows you to access an environment that's excellent for learning and testing and...self-paced labs
  4. the DVD includes 9 lab manuals for roughly 16 hours of self-paced lab time AND they're really good labs with multiple parts, excellent links and even useful undoc'ed commands too (if I might say so myself as I wrote most of them :)
    1. Database Snapshots - 4 Exercises, 75-90 minutes
      • Exercise 1: Repartition the SalesDB Database
      • Exercise 2: Create and Examine a Database Snapshot
      • Exercise 3: Working with Multiple Snapshots
      • Exercise 4: Creating a Database Snapshot on a Mirror Database
    2. Data Recovery & Preventative Techniques - 4, exercises, 75-90 minutes
      • Exercise 1: Examining Foreign Key Relationships between Tables
      • Exercise 2: Point-In-Time Recovery
      • Exercise 3: Using the tablediff.exe Command-Line Utility to Compare ALL Data Modifications
      • Exercise 4: Using DDL Triggers to Prevent Tables Being Dropped
    3. Instant Initialization - 2 exercises, 30-45 minutes
      • Exercise 1: Enabling Instant Initialization
      • Exercise 2: Security Vulnerabilities Created by Instant Initialization
    4. Peer to Peer Replication - 5 exercises, 75-90 minutes
      • Exercise 1: Implementing a Replication-Ready Schema
      • Exercise 2: Configuring and Implementing Peer-to-Peer Replication Configuration Using the Replication Wizards in SQL Server Management Studio
      • Exercise 3: Using the Dual Database Monitor
      • Exercise 4: Adding a new Peer Server
      • Exercise 5: Monitoring Peer-to-Peer Data Flow after a Fault
    5. Table and Index Partitioning - 4 exercises, 75-90 minutes
      • Exercise 1: Range Partition Function
      • Exercise 2: Partition Scheme
      • Exercise 3: Partitioned Table
      • Exercise 4: The Sliding Window Scenario
    6. Snapshot Isolation - 5 exercises, 75-90 minutes
      • Exercise 1: Pessimistic Locking
      • Exercise 2: Activating Snapshot Isolation & Read Committed with Snapshot Isolation 
      • Exercise 3: Using Snapshot Isolation (SI)
      • Exercise 4: Using Read Committed with Snapshot Isolation (RCSI)
      • Exercise 5: Monitoring Snapshot Isolation & Read Committed with Snapshot Isolation 
    7. Online Operations - 2 Parts, 75-90 minutes
      • Part 1: Online Index Operations
        • Exercise 1: ONLINE Index Move (for better isolation)
        • Exercise 2: Partition an Active Table ONLINE
      • Part 2: Partial Database Availability and Online Piecemeal Restore
    8. Database Mirroring - 2 large sesions with TONS of exercises, 4+ hours
      • Part I: Database Mirroring in Action
        • Exercise 1: Configuring and Implementing the High Availability Database Mirroring Configuration – using Transact-SQL through a SQLCMD master script
        • Exercise 2: Using the Dual Database Monitor and Transparent Client Redirect
        • Exercise 3: Initiating Failover in the High Availability Configuration
      • Part 2: Understanding and Implementing Database Mirroring
        • Exercise 1: Configuring and Implementing Database Mirroring using the SQL Server Management Studio
        • Exercise 2: Configuring the Database Mirroring Monitor, Mirroring Threshold Alerts and WMI Event Alerts
        • Exercise 3: Converting to the High Protection Configuration and Comparing Performance between Synchronous and Asynchronous forms of Database Mirroring 
        • Exercise 4: Configuring and Implementing the High Availability Database Mirroring Configuration – using Transact-SQL through a SQLCMD master script 
        • Exercise 5: Initiating Failover 
          1. Part I: Manual and Automatic Failover in the Synchronous forms of Database Mirroring Configuration 
          2. Part II: Preventing “split brain” in the High Availability configuration 
        • Exercise 6: Converting to the High Performance Configuration and Forcing Failover with Potential Data Loss 
    9. Service Oriented Database Architecture - 5 exercises, 3+ hours
      • Exercise 1: Setting up simple Service Broker messaging
      • Exercise 2: Setting up Inter-instance Services
      • Exercise 3: Setting up dialog security and encryption
      • Exercise 4: Setting up application-specific functions
      • Exercise 5: Using Query Notifications

And........ if that doesn't motivate you - we might also giveaway a Manageability DVD that's packed with Tools demos/labs and some SP2 specific stuff such as customized reports (which we'll talk about in the last part of our pre-conference workshop). OK, so I hope to see you at TechEd.......... the pre-conference alone is worth it!

THANKS,
kt

Categories:
Events | Resources | SQL Server 2005

In the quest for more (and more and more ;) information, I've been told about a new link - from the SQL Server Books Online team... it's call the "SQL Server 2005 Books Online Scoped Search" and it allows you a "live" search format for accessing content in the SQL Server books online. And - because they're online - they are the most up-to-date. I'm not sure how frequently they update these BOL (vs. the downloads that we get) but my guess is that they update the online version frequently and then do a BOL refresh every now and then which includes that which is online.

Anyway, I always like to have the latest version on my laptop BUT it's nice to have a quick/easy way to find content online. Oh - and please make sure you give these guys feedback as this is a new site, still with the ability to make tweaks where necessary!

Have fun: http://search.live.com/macros/sql_server_user_education/booksonline

Cheers,
kt

PS- Check out Buck Woody's Blog here: http://blogs.msdn.com/buckwoody/ and subscribe here. He's a Technical Content Developer for the SQL Server Documentation Team and he's blogged items about the BOL, how to search, updates, etc.

Categories:
Resources | SQL Server 2005

OK, I've been complaining about finding resources - for a long time... AND, I've been complaining about how I can never tell if a whitepaper is on MSDN or on TechNet or on Microsoft.com or on x, y, or z. Well........ finally, I've done something about it. I've *started* to put together (and verify) a list of what I think are the top whitepapers out there. This is by no means a complete list AND I haven't read every one of the papers I've referenced. However, I've only linked to whitepapers written/published by reputable sources and I've checked every link to make sure it works. Primarily, this is the list of whitepapers that I reference the most - in classes, seminars, workshops, etc. And, it's not a blog entry - it's an actual webpage so it should [hopefully] be easy to find. I plan/hope to do this for blogs and other useful stuff BUT, it takes quite a bit of time verifying each of the links (and of course, searching/finding the darn thing when someone has broken the link :).

So.... after the first 5+ hours of work - here it is: http://www.sqlskills.com/whitepapers.asp.

Thanks for reading (and wow - you have a lot more reading to do now :) :) :),
kt

Categories:
Resources | SQL Server 2005

Another great DotNetRocks interview has been completed. It's Paul Randal's session on Disaster Recovery, DBCC, Index fragmentation (and defrag) and [unfortunately for me] a lot more. All I can say is that I was ambushed...

thanks Richard
   thanks Carl...

Enjoy: http://www.dotnetrocks.com/default.aspx?showNum=217
kt

Categories:
Events | Resources | SQL Server 2005

I had a discussion earlier today (with Paul Randal) about many misconceptions that exist about upgrading databases and more importantly, about "downgrading" databases. Really, the issue is that I've heard people get frustrated when they find that things like backup/restore works FROM SQL 7.0/2000 TO SQL Server 2005 but not the other way around - even if the database is in SQL Server 2000 (80) compatibility mode. First and foremost, compatibility mode only affects parsing, query processing, and general data manipulation; it does not affect physical storage (well, there's more to it than that but that's a general overview). When you upgrade a database to SQL Server 2005, you WILL benefit immediately from changes in the storage engine, etc. regardless of compatibility mode. Compatibility modes are there to give you time in updating/upgrading your code - if/when necessary. Most code will work when upgrading but some code may not be supported because of changes to keywords, syntax changes, etc... The best thing to do is check your application compatibility with the Upgrade Advisor. I did a a couple of webcasts on Installation/Upgrade as part of my 11-part series on TechNet. See the blog entry for the entire series here. Part 3 and part 4 are focused on Installation and Upgrade and their associated blog entries have a lot of additional links (including links to the Upgrade Advisor as well as a series of things you might want to do before you upgrade). Also, be sure to checkout the upgrade site off of the main Microsoft SQL Server site.

How to move USER databases around - a quick list of what's supported between versions

Backup/Restore from 7.0 to SQL Server 2000
Detach from 7.0, copy the files, then attach to SQL Server 2000
Backup/Restore from 7.0 to SQL Server 2005
Detach from 7.0, copy the files, then attach to SQL Server 2005
Backup/Restore from 2000 to SQL Server 2005
Detach from 2000, copy the files, then attach to SQL Server 2005

Why use Backup/Restore?

PROs

  1. Because you have a backup! This will allow you to go back to the version from which you came. However, without any changes made on the uplevel version.
  2. Because it doesn't require the database to be taken "offline" when the backup is performed (note: that this is both good and bad - bad because you don't really know the exact point in time to which the database reconciles...which may not matter if you're just testing).
  3. Because the backup will be the size of data only and will not include database free space. Free space is not backed up (e.g. a database with a 100GB data file with only 20GB of data should yield a file that's roughly 20GB in size). I say "roughly 20 GB" because the internals of a backup require that the transaction log records for the activity that occured during the backup process are also backed up with the full database (or differential) backup. This is actually the basis for why transaction log backups are not supported during a full/differential backup in SQL Server 2000 (they are in SQL Server 2005). However, this is the reason why the transaction log cannot be cleared while a full or differential is ALSO running in SQL Server 2005.

CONs

  1. You don't know the exact point in time to which the database reconciles (it will be the time that the backup completed) AND logs CAN be restored uplevel as well. NOTE: If you're interested in creating an exact point in time version of the database - consider putting the database into "restricted user" mode or "single user" mode (so that user transactions are not allowed during the backup). Again, this may not be a concern.
  2. It takes time to complete the backup (there are four phases of a restore: create/initialization, copy, redo, undo). Make the create/initialization *much* faster by enabling Instant Initialization. See my Instant Initialization blog post for more details.

Why use detach/attach?

PROs

  1. It's simple, it's fast... but once detached then the database is OFFLINE.
  2. You know the exact point in time to which it reconciles because no transactions are allowed into the database once it is offline. Again, this may not be a concern.

CONs

  1. You must copy the entire file - including the free space to the other location and the network copy might be the most expensive (meaning time consuming) part of the entire process. However, once copied, the files do NOT need to be created on the destination because on attach, these files will be used.
  2. The database is offline once detached and during copy.
  3. If you don't COPY the files and instead you attach the detached files, you will have ABSOLUTELY NO WAY of getting back to the version from which you detached. (ah, this is probably the single most important reason for why I prefer backup/restore!)

Summary for "How to move USER databases around"
Between these versions "upgrades" are supported ONLY to the uplevel version. There is NO single (or simple) feature that can be used to get back to the version from which you started (without exporting/importing all of the data). There is also no undocumented back-door to do this either (no trace flags, no DBCC commands, NA DA!!! as per Paul).

What about System Databases?
This is a whole other can of worms to open and the easiest thing I can say here is that you generally should not move/upgrade system databases across machines. These are upgraded through "in-place" upgrades of SQL Server (on the same machine) or through manual migrations (to different machines) of the users/objects (SQL Agent Jobs, user-defined system procedures in master, logins in master, etc.). This is not an easy process (manual migration) but may prove to be a better choice over an upgrade in place if something were to go horribly wrong (which is unlikely but I'm a "what's the worst case scenario" person when it comes to availability :). The other benefit of NOT upgrading in place - and instead MOVING databases from one version to another on an upgrade - is that you get to complete some basic "spring cleaning". New hardware, freshly formatted, freshly installed/configured OS, clean disks, etc. This can often alleviate some of the strangest, hard-to-determine-problems, that have plagued you for weeks/months. Like I said, this is a whole other can of worms to open!

But - if you're interested in moving system databases around on the SAME machine, here's a great KB that covers the required options, syntax, rules and restrictions: How to move SQL Server databases to a new location by using Detach and Attach functions in SQL Server

And - if you're interested in transferring logins and passwords between instances (for upgrade or for sync'ing two servers used to create a standby partnership - with Database Mirroring and/or Log Shipping), here's a great KB article that includes links to other articles even uplevel transfering of logins (like 2000 to 2005): How to transfer logins and passwords between instances of SQL Server

And - that's it for this week (probably)... two in a row is not likely to become three in a row (just setting expectations :) :) :),
kt

Categories:
Resources | SQL Server 2005 | Tips

Instant Initialization is a new feature of SQL Server 2005 that is based on an NTFS feature that was added to Windows XP (and therefore is also available in Windows 2003 Server). It's a feature that's seemingly simple; it allows file allocation requests to skip zero initialization on creation. As a result, file allocation requests can occur instantly - no matter what the file size. You might wonder why this is interesting or why this make a difference? Most file allocation requests are small requests, with small incremental changes (like .doc files, .xls files, etc.) but database files can be rather large. In fact, they should be rather large as pre-allocation of a reasonable file size is a best practice to reduce file fragmentation. Additionally, autogrowth causes performance delays (more so in 2000 than 2005) but it's generally something that you want to avoid when possible. As as result, database creation times can take minutes to hours to days, depending on file allocation request. But - it's not just for database creation. ALL file requests can leverage this feature: file creation for a new database, adding a file to an existing database, manually or automatically growing a file and (IMO - the best) restoring a database where the file (or files) being restored does not already exist. The reason I think the last feature is the best is that it can reduce downtime if a database is damaged and allow you to get back up and running more quickly. This is especially important for databases that cannot leverage partial database availability, which is an Enterprise Engine feature. So, to give you some motivation, here is a test that I performed just to have some interesting and comparable numbers.

Performance Test with Zero Initialization
Hardware: Dell Precision 670 Dual Proc (x64) with Dual Core, 4 GB Memory, RAID 1+0 array w/4-142 GB, 15000rpm disks
   CREATE DATABASE with 20 GB Data file = 14:02 minutes
   ALTER DATABASE BY 10 GB = 7:01 minutes
   RESTORE 30 GB DATABASE (EMPTY Backup) = 21:07 minutes
   RESTORE 30 GB DATABASE (11GB Backup) = 38:28 minutes

Performance Test with Instant Initialization
Hardware: Dell Precision 670 Dual Proc (x64) with Dual Core, 4 GB Memory, RAID 1+0 array w/4-142 GB, 15000rpm disks
   CREATE DATABASE with 20 GB Data file = 1.3 seconds
   ALTER DATABASE BY 10 GB = .4 seconds
   RESTORE 30 GB DATABASE (EMPTY Backup) = 5 seconds
   RESTORE 30 GB DATABASE (11GB Backup) = 19:42 minutes

SQL Server can leverage this feature for DATA file requests ONLY; the transaction log must be zero initialized because of its circular nature... which brings me to why this is not "on by default" or more specifically - HOW do you get this feature. First, there's absolutely no syntax change required - SQL Server will use it if it has access to it (so what does that mean?). The SQL Server service must have been granted the Windows permission - "Perform Volume Maintenance Tasks". By default, Windows Administrators have this permission but as yet-another-best-practice, we recommend that your SQL Server run under an account that is a "lower privileged" account (i.e. NOT an administrator). Other ideal options include running as "network service" or running as a dedicated user/domain account that has very few permissions except to SQL Server and it's required resources. A lot of folks recommend using network service for its simplicity (it doesn't have a password and it has limited network/local rights) and I agree with this as long as it's truly dedicated to SQL Server. If network service is used by other services on the same machine then you could compromise security of your SQL Server (or the other services) with the elevated permissions that SQL Server grants or visa versa by the permissions that other applications may have granted to network service. Again, I'm not against network service BUT I would check your local permissions to see if there's anything that jumps out at you. If you've installed other applications/services/etc. you may have already compromised the security of the network service account. I would love to know if anyone has a quick/easy way to check windows permissions to see what may have been granted in addition to the default permissions OR even a link to where the defaults are listed online... I've had trouble doing exactly this when searching, etc... feel free to post links/comments in your comments!. Anyway, with a dedicated user account, you can make sure that it's not compromised because you only use it for SQL Server. But, even these have negative issues - like required passwords that networks invalidate after n days and that you must change on a server/service basis. From a management perspective, this can be difficult.

SIDE NOTE: Managing the service account password is a lot easier in SQL Server 2005 with the SQL Server Configuration Manager (SQL-CM). The SQL-CM allows you to change the password to a service without an active connection (meaning even if the service isn't started) and it invalidates the login token so that password changes don't require a restart of the service. SQL-CM also has a command-line interface and is scriptable with WMI. The WMI Provider allows server settings, client and server network protocols, and aliases to be scripted through the WMI Provider by means of simple VBScript code (or by using the command-line tool). What you could end up doing is creating a script that changes the password of your services on all of your servers (for example, when a password policy is enforced that requires that the passwords of service accounts be changed). I've recently completed a whitepaper that highlights the Management tools (it's really just an overview but even then it turned out to be quite large as we looked at the tools in many different ways). I'll certainly let you know when the whitepaper is published (which should be within the next couple of weeks).

Granting the permission "Perform Volume Maintenance Tasks"
To use instant initialization, your SQL Server service must be running with an account that has the required privilege. If your SQL Server service is running as a local administrator this permission already exists. For a service account which is not a local administrator (again, recommended!), the necessary privilege to grant is Perform Volume Maintenance Tasks. This permission can be granted by an administrator through the Local Security Policy tool (Start, All Programs, Administrative Tools) and once granted, SQL Server automatically uses instant initialization. IMPORTANT NOTE: If this permission is given while SQL Server is running, then SQL Server must be stopped and restarted. However, once the server is restarted, no other syntax or permissions are needed. All data file creation/extension options will use instant initialization for data files created on NTFS volumes when SQL Server 2005 is running on Windows XP or Windows 2003 Server.

Why isn't this on by default?
OK, so after all of this... the gains that you see and the lack of changes to syntax, etc. You're probably wondering why this isn't on by default? It's a security issue. The biggest vulnerability is with SQL Server Administrators who are NOT also Windows Administrators. Windows Administrators have access to local files and can easily see all files stored on the local server. For files that are not encrypted (and are not already open to another process), an Administrator can open and/or modify these files using an appropriate editor. For files that are encrypted, an Administrator can at least view the encrypted information using a hex editor. By granting “Perform Volume Maintenance Tasks” to a SQL Server instance, you are giving administrators of the instance the ability to read the encrypted contents of a recently deleted file (ONLY IF the file system decides to use this newly freed space on the creation of a new database - created with instant initialization) with the undocumented DBCC PAGE command.

SIDE NOTE: The format for DBCC PAGE is undocumented in the Books Online but you will find tips and tricks on many “official” Microsoft blogs. The SQL Server Storage Engine blog (http://blogs.msdn.com/sqlserverstorageengine/ has some very good blog posts on internals and often describes undocumented commands. Specifically, check out the blog entry titled: How to use DBCC PAGE. The first three components are fairly straightforward: database id (or name), file id, and page id. The fourth component is the tricky one: printopt (or print option). The print options for DBCC PAGE are as follows (taken almost verbatim from the SQL Server Storage Engine blog - and Paul said I could :):

0 - print just the page header
1 - page header plus per-row hex dumps and a dump of the page slot array (unless it’s a page that doesn't have one, like allocation bitmaps)
2 - page header plus whole page hex dump
3 - page header plus detailed per-row interpretation (in this case, this option is not available - even if the data you are trying to read is from a previously deleted database because the metadata is not accessible only the raw page data)

Bear in mind, even if you can access this [encrypted] information, making sense out of this data will be challenging if not incredibly difficult.

In production environments, database files should NOT be located on file server drives - especially those where restricted and/or sensitive files are stored. As a result of prudent security measures, the true impact of using instant initialization is low. However, because this vulnerability exists, this feature is off by default.

Want to try this?
I've written a lab on Instant Initialization AND it has an interesting sequence of exercises (using multiple instances):

  • You create a database on instance: SQLDev01
  • Populate a large chunk of pages with very contrived (and easy to find) "junk" data

USE TestWithZeroInitialization
go

 

CREATE TABLE JunkData
(
   
JunkDataID int identity,
   
JunkDataValue char(8000) 
      DEFAULT REPLICATE('Junk', 2000)
)
go

 

SET NOCOUNT ON
go

 

DECLARE @Counter int
SELECT
@Counter = 0
WHILE @Counter < 20000
BEGIN
   
INSERT JunkData DEFAULT VALUES
   
SELECT @Counter = @Counter + 1
END
go

Drop the database from instance: SQLDev01

  • Create a new database on a different instance: SQLDev02 (which has been configured to use Instant Initialization) and hope that it uses the freed space by having dropped the first database
  • Start walking various pages (using DBCC PAGE) to see if you can view the "junk" data from the dropped database.

DBCC PAGE ('TestSecurityExposure', 1, 200, 2)
DBCC PAGE ('TestSecurityExposure', 1, 400, 2)
DBCC PAGE ('TestSecurityExposure', 1, 600, 2)
DBCC PAGE ('TestSecurityExposure', 1, 800, 2)
DBCC PAGE ('TestSecurityExposure', 1, 1000, 2)
DBCC PAGE ('TestSecurityExposure', 1, 1500, 2)
DBCC PAGE ('TestSecurityExposure', 1, 2000, 2)
DBCC PAGE ('TestSecurityExposure', 1, 2500, 2)
DBCC PAGE ('TestSecurityExposure', 1, 3000, 2)
DBCC PAGE ('TestSecurityExposure', 1, 3500, 2)
DBCC PAGE ('TestSecurityExposure', 1, 4000, 2)
DBCC PAGE ('TestSecurityExposure', 1, 4500, 2)

 

In most cases the very first output (for page 200) will return Junk data. If this is not the case, simply drop the TestSecurityExposure database and recreate it again. Sometimes it’s a timing issue and sometimes it could be a background process (like Windows Update) that uses the expected pages. Regardless, if you do get the same pages again - our contrived data should be easy to find.

You can certainly create the environment on your own and see if you can get it to work. OR, you can get a copy of our AlwaysOn DVD that has the appropriate lab environment. I tend to give away the AlwaysOn DVD at events I speak at (on Availability/Disaster Recovery) but I'm also happy to send a few out over snail mail (it needs 2GB of memory and 10GB of disk space for the virtual environment AND you need to have Virtual PC or Virtual Server installed - which are freely downloadable from Microsoft). Paul asked for DBCC CHECKDB information here (to get a free DVD sent to you) and I'm going to ask for instant initialization numbers and how this has helped. Post a comment here and then send me an email with your snail mail address information as well. I'm willing to do this for the first 10 responses... go!

Have fun...and thanks for reading!
kt

PS - For those of you in our UK (London) Event tomorrow, we're giving away the AlwaysOn DVD (and an even cooler SQLskills pen... lol). Now there's motivation if the content doesn't (NOT!).

OK - I feel like I know a fair amount about SQL Server but sometimes I also feel like I don't :) :) I'm continuously amazed at how big a product SQL Server is... today was one of those days when I felt "I don't"!

I've been wanting to know more and more about the new features in the tools and the direction in which the tools are headed so... I setup a meeting with Paul Mestemaker (it helps that I live in Redmond and I'm working on some SP2 resources for the team :); it was a great meeting. Some exciting new features and some great new directions in which the tools are headed. I like the way they're thinking and I especially like the options that are now in place to discover, use and customize the "reports" feature within Management Studio (just to name one!). What I learned (that was the highest on my "I didn't know that" list) is that quite a few gems are released as part of the Feature Pack for SQL Server. The Feature Pack "is a collection of standalone install packages that provide additional value for SQL Server 2005. It has been updated for SP2." From that description, it doesn't leap out at me as exciting AND I often know about many of these tools through other channels - but usually it's just "a tool at a time". The thing that's nice about the FP page is that it seems to be a nice and central, single location for ALL of the "add-ons" for SQL Server. It includes things like the Upgrade Advisor (which I typically point people to individually on it's main page) and will include (once it ships) things like the BPA (SQL Server 2005 Best Practices Analyzer) BUT it also includes things like a standalone download for SQLCMD so that you don't need to install all of the tools if you just want this lightweight client for automation. Additionally, it includes the SQL Server 2000 DTS Designer Components if you want to edit/modify/maintain DTS packages in 2005 before you rewrite/convert them to SSIS.

So - the point is that there's lots of great stuff out there and sometimes it just takes another person, a blog entry, or a few minutes hitting a company site to see what's new. I'd strongly suggest that you and your team pick a morning - maybe once or twice a month (and round robin who brings the coffee/doughnuts :)) to just browse around and see if there's anything new on your hardware, software and other supporting sites - especially those that don't already have a blog/rss feed (or other form for notifications). No one needs to know everything but knowing where to look can really make all the difference when you do need something (or when you have a concern/problem). And browsing a few sites (occasionally) might make the difference in applying a hotfix/patch before something becomes a big problem. Staying current with hotfixes, service packs, bios updates, firmware updates, etc. is difficult so make it a team effort.

Speaking of service packs, here's the primary page for SQL Server 2005 SP2: http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/servicepacks/sp2.mspx
And don't forget the Books Online Update as it is NOT installed when you update an instance to SP2: http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx

Also, be sure to watch the SQL Server Manageability Blog (aka Paul Mestemaker's blog) moving forward as he'll have the first news about BPA and many other tips/tricks with regard to the tools.

Finally, (and this is great timing too), Paul Randal - prolific author at the SQL Server Storage Engine Blog - blogged about all of the active "SQL Server Product Team"-related blogs here.

And........ if that doesn't keep you busy, I'm not sure what will! :)

Instant Initialization technical details are next and then I'll get back to the Clustered Index Debate. Thanks for reading!
kt

Ok - a strange title indeed but it's been a strange couple of months. It all started with a "much needed" vacation and I thought that would get me through the persistent "cold" that I was having all through my travels. Anyway, vacation didn't help and I came back to find that what I had was a sinus infection. Janaury has been filled with antibiotics, sleep and well......still a lingering cough even though the month is over today. The long story short is that I'm starting to come out of it and I promise to start blogging a lot more frequently starting now. In fact, I have 3 or 4 entries in the queue that I'm plotting for upcoming posts.

To get you started - there are some great and NEW resources that were posted just this week by some of my SQLskills colleagues. If you read their blogs then you've probably already seen this but for completeness, I'm going to post them here:

Bob Beauchemin's Blog Entry: http://www.sqlskills.com/blogs/bobb/2007/01/30/TheFirstOfMyScaleoutWhitepapersIsAvailable.aspx
Bob's Whitepaper: Planning, Implementing, and Administering Scaleout Solutions with SQL Server 2005

Liz Vitt's Blog Entry: http://www.sqlskills.com/blogs/liz/2007/01/30/AnalysisServicesPerformanceTuningWhitepaperHasArrived.aspx  
Liz's Whitepaper: Analysis Services 2005 Performance Guide

Ok - there's my first post of 2007 and NOT my last. More to come. Thanks for reading and HAPPY NEW YEAR (at least I got that in January :),
Kimberly

OK, it's been a heck of a long time since I blogged... and for that I apologize. I'm also WAY overdue in my posting my demo scripts from a TON of conferences BUT... now everything has been posted. Check out the past events page on SQLskills and you can find the demo scripts that you're looking for......lots of fun stuff and TONS of scripts to play with and test.

Now - as for the reason(s).... many are business and for that I blame the following (yes, 17 flights [yes, one boarding pass is missing] over ONE 5 week trip with 7 events and 5 continent changes):

The other reason(s) are personal...suffice it to say that the last 6 months have been some life changing times for me and what I'm finding (or trying to find) is that ever important balance between work and life. During this holiday season (and always), I wish you and your loved ones well and I hope that you too can find (and cherish) what's most important to you.

So, you won't see anything else from me for this year but I do hope to be better (and more frequent) with blogging in the New Year and I especially hope to see you again at an upcoming conference. Let me leave you with the most exciting picture I've witnessed this year...it was during my one day of site seeing in Cape Town - where I went cage diving with Great White Sharks (and got horribly sea sick - which is rare for me) but where I was able to witness these amazing and powerful creatures....

Have a happy and safe holiday season!
kt

There are SO many new features in the updated DasBlog that I don't even know where to start. Here's just a quick list of the things I really love:

1) Scrolling through the entries - if you click a specific entry's title, you will get a "last blog entry" | main | "next blog entry" section above the entry's title and you can scroll through the blog entries. It's really a nice way of reading through a series of blog entries.

2) the SUPER cool complete archive by category - here's a link: http://www.sqlskills.com/blogs/kimberly/Archives.aspx

And - those are just a couple of my favorite new features with 1.9 but you can read more about it from the source - Scott Hanselman - here. Definitely worth the upgrade, definitely worth the consideration to move from some other blog engines (depending on how much you blog) OR consider creating your own blog (here's why)........

Enjoy,
kt

Categories:
Opinions | Resources

Hey there everyone - Been a LONG time since I last blogged (sorry!)... key reason (fyi) is that I'm trying to find the ever-challenging work/life balance during the best months of the year (here in Seattle July/Aug are GREAT! months - September is almost always good too). Anyway, it's been a few weeks and I thought I'd catch you up... It all started with some travel (of course) and I was in Chicago for a SQLskills Immersion event and then off to London for another event with SQL Server FAQ (aka Tony Rogerson). I returned during the CRAZY travel restrictions and had to check two laptops (sigh) BUT they both made it back without damage after I purchased kitchen towels, bed sheets, a blanket and a duvet (and two new [cheap] suitcases) in which to pack them. When I got back, I relaxed! I've been up to see the Athabasca Glacier and Johnston Canyon and just this week I took off a day to do the "Lotus Experience" Advanced Driving Course at Pacific Raceways/ProFormance Racing School. I've always LOVED driving so driving 700 miles to Johnston Canyon didn't bother me at all (especially with great company and fantastic tunes - have you heard the latest Snow Patrol?) and the driving course in the Lotus was not only great fun but VERY informational. I've taken the Advanced Driving course before and I think EVERYONE should consider continuing their education in driving - things like Collision Avoidance, Advanced Braking (understanding proper driving under ABS), "high eyes" and so many other things...just make you safer on the road. OK, enough about all that fun/practical stuff.....let's get back to SQL.

Some great new resources are out:

And - it's now time to start enjoying Labor Day weekend, there ain't no labor going on here this weekend. Enjoy - and check out those links next week. ;-)

Cheers everyone,
kt

Hey there everyone - If you're into Analysis Services, Integration Services, Reporting Services and BI in general - you'll want to check out Elizabeth Vitt's new blog. Liz has been specializing in BI since SQL Server started adding BI-centric components. She's got a wealth of information to share and many great insights into performance tuning as she's working on a BI Performance Tuning resource that will probably hit 100 pages (from current guestimates).

And - no surprise from Liz, she's out the gates running with her first entry on Influencing Aggregation Candidates.

Subscribe now!

And a big welcome to blogging for Liz!
kt

Categories:
Resources | SQL Server 2005 | Tips

I've posted all of my demo content from TechEd 2006 and wow - it was a lot of fun! I created many new and fun demos as I tried to keep everyone awake through the sessions (cause it seems like there were way too many evening events - didn't it? ;-)). All of the content is posted here.

Finally, can I just say that Paul (and team) has been on a roll. They're blogging machines. If you're not reading this blog... you're DEFINITELY missing out.

Have a great weekend... I'm off to Switzerland today.
kt

Categories:
Events | Resources | SQL Server 2005

Well, if you're wondering why I've been so quiet this week... it's a myriad of events all coming together and/or being finalized right in time for TechEd. In working really hard (especially crazy was today) for some final TechEd content, I realized that a lot of people don't really know what goes on behind the scenes of some of these really huge events. Brian Marble has been blogging about this and you can learn some interesting things by checking out his blog. And for some fun statistics related to TechEd, here is an idea of the quantity of food and drink that will be consumed:

  • 1,250,000 pieces of "Mikes & Ikes" will be consumed over the course of the Tech Ed 2006 week
  • 18,750 pounds of salad will be prepared and offered at meals
  • 83,700 ice cream novelty/fruit and yogurt bars have been ordered for this function
  • The total amount of fruit ordered will fill 3/4 of full size tractor-trailer
  • 60,000 eggs will be eaten by attendees at breakfast (this is equal to 4,800 dozen cartons of eggs)
  • It will take 4 semis to transport the 150,000 bottles of water consumed
  • 1.6 million ounces of coffee will be poured and consumed (conservative estimate)
  • More than 50,000 pounds of carbohydrates will be consumed at Tech*Ed (Atkins who?)
  • 1,500 table cloths will be used and re-set on a daily basis (7,500 for the week)
  • A minimum of 2,000 antacid tablets are likely to be consumed at this event

As for the technical content, well that's not too shabby either. There are over 900 breakout sessions, chalk talks, ILLs (Instructor-led labs), HOLs (Hands-on labs) and general/keynote sessions. There's a lot of technology that comes together for a show like this and there's even a DVD that's available after the show with all of the breakout sessions on it. The key point is that there's a lot going on and I'd have a hard time believing that you couldn't find something to do during every timeslot (for me there are multiple time slots where I'm torn between delivering my own session and attending another...but, I have a feeling I know where I'll end up :).

One thing that you can do in almost every timeslot is an HOL (Hands-on Lab) and for SQL Server there are more than 10 of them. Each HOL is focused and technical and each covers a specific technology or topic. For TechEd 2006, I've written two of the HOLs: DAT007 and DAT010. Specifically, DAT007 is Database Mirroring in SQL Server 2005 SP1 and DAT010 is Table and Index Partitioning. These HOLs cover everything from design to implemenation to failover to monitoring - for Database Mirroring and for Partitioning the lab goes from design to implementation to performance to the sliding window scenario. They were a lot of fun to write and I hope a lot of fun to go through. If you're interested in hearing more about them, Mark Penaroza did a couple of interviews about them. He blogged about it here and mentioned that the interviews are available on Commnet (the Microsoft TechEd attendee website). I've also posted the interviews here (DAT007 Interview (4MB mp3 file) and DAT010 Interview (2.75MB mp3 file)) so that you can get some insight into the things we're doing to help get you started and ready with these new technologies.

Finally, since TechEd is sold out, I know that not all of you will be there. As a result, there's "Virtual TechEd". Virtual TechEd is a site dedicated to getting some of the content and resources out to folks that just couldn't attend. The Virtual TechEd site is here: http://virtualteched.com/default.aspx

So, I think that's it for now. Still enjoying the comments you're making on the last blog entry about the version you're running and why. Seems like we all have the same problem - time and money ;). Keep those comments coming!

Thanks for reading,
kt

Categories:
Events | Resources | SQL Server 2005

Ha... did that get your attention? Well, what I really hope to do is make everyone aware of what's made the Developer Community rounds this week. On Wednesday, Microsoft announced "Data Dude" (aka the Visual Studio sku for Database Developers). This was an announcement that may be glanced over by many DBAs thinking it's just another tool for developers...what can it offer me? And, well, that's where I think there are some VERY cool things to point out. I've been following Data Dude for a couple of months now (ah... a little birdy told me :) and at first I wasn't sure how much it would impact me. However, after starting to get a better feel for their future directions, I've realized that even though I'm not their initial and/or direct target audience that I'll definitely find some great uses for their product. In fact, in getting ready for their announcement and in chatting with a few press folks, I wrote up a small amount of text. Some of this was quoted in the eWeek article here but there's a few more things that I really think you'll (yes, even DBAs) be interested in. This is the second half of the content that was quoted:

For Administration and Operations, I especially like their direction with regard to unit testing and sample data generation. I work with a myriad of customers who do not let development/operations perform testing/tuning on real production data (even a copy) due to data sensitivity requirements/policies. As a result, performance testing can be horribly flawed. With the ability to generate large sample volumes of statistically "real" data, real-world tuning will be possible without compromising data sensitivity. This is the area that I'm most interested in initially but refactoring and schema comparisons are very interesting as well. One of my favorite sayings is "The sooner you start to code, the longer the program will take." (Roy Carlson) as schema changes can be challenging at best and often things can be missed (data types, columns names, etc.). Often the changes are made on alternative systems and then they need to be integrated in - often through comparing schemas and with hand-created alteration scripts. With the ability to have intelligent refactoring, application and database logic can be fixed through a straightforward and flexible interface. This will help to minimize later errors or even harder to recognize performance problems caused by implicit conversions.

So, in the end, I'm not their primary target but I think I'll probably get really into it and try to consider a variety of ways to leverage it for Operations/Administrations teams for "after the fact" tuning cases. However, I do think teams will be even more productive if they adopt it earlier in their production lifecycle.
Now, if you're even slightly interested. You can get a lot more information about Data Dude already. Here's a beginning list:

And, if you're going to be at Microsoft TechEd 2006 - in Boston in a couple of weeks, there's a LOT more information coming. In fact, my pre-conference workshop co-presenter Brian Randell has authored some Hands-on Labs for Data Dude and those will be available in the HOL area. He's on DotNetRocks this week (to be released on DNR's site next Tuesday) and he's chatted with them in the past about Virtual Server/Virtual PC (for hours...now you know how I feel ;). Honestly though, we use VPC/VS a lot in our HOLs and Brian is REALLY knowledgeable about how to optimize them, compact drives, etc. Anyway, here's the link to his last show on DNR and here's Brian's blog entry on Data Dude. And, just as a small hint... you should consider making a Sched+ for the "Live from TechEd" show from DNR. All I have to say is that it might have some great guests on it (maybe even related to Data Dude, hint, hint)!

Now, the last thing that's the most exciting for me to announce is that there are some new bloggers as a result of the Data Dude annoucement. FINALLY, one of my best friends - Gert Drapers - has started blogging (don't forget his already awesome content site: http://www.SQLDev.Net). If you're at all interested in geeky database development stuff, subscribe now! And - many of his team members are great friends too (Richard and Matt!) and I'm very excited to see them blogging as well (it's just that I've been begging Gert to do it for the past couple of years ;)). Anyway, it will be great fun watching this team grow and watching this product evolve.

Here are the Data Dude team blogs:

Here's the official Visual Studio Team Edition for Database Professionals site.

The times are changing............. for the better!
kt

Categories:
Resources | SQL Server 2005 | Tips

In part 11 of the TechNet webcast series for the ITPro, I spoke briefly about the Oracle Migration Assistant and the recent release of the Community Tech Preview versions of both the Access and Sybase Migration Assistants... A few of you asked for their download location and there were even a few replies that folks had found them... However, after looking around, I figured out that what was found were the OLD and very outdated Access Upsizing Wizards (and that's not this!). The new SQL Server Migration Assistant tools are truly Migration tools - tools that can help convert code, change data types, etc. More than anything they're targeted at being more complete and feature rich than just an "upsizing tool" which is excellent for what it is but still requires a lot of additional work.

Having said all of that, I have the details about the TRUE Migration Assistants.

SSMA for Access download instructions:

Download instructions

  1. Open the download page here.
  2. Select ‘Receive File from Microsoft’.
  3. Enter the Password: w$%dIcKP_TZrf
  4. Download and run ssma-for-access-xxx.msi

SSMA for Sybase download instructions:

Download instructions

  1. Open the ftp download site page here with the following username and password:
         username: SSMA4Syb2
         password: i456$Lk
  2. Download the msi for Sybase or the msi for the Sybase Enterprise Portal (ep).

    IMPORTANT: If you access the files from Internet Explorer, please verify IE Browser Settings using Tools, Internet Options, Advanced, under the Browsing section:

CHECK - Enable folder view for FTP sites
UNCHECK - Use Passive FTP (for firewall and DSL modem compatibility)

So... have fun with those downloads and if you run into any issues/concerns - be sure to post in the SQL Server Migration Assistant newgroup.

Cheers,
kt

The SQL Server team has a few *very* interesting blogs and the Engine Team just started blogging - check it out here: http://blogs.msdn.com/sqlserverstorageengine/ (thanks for the heads up Sunil).

For completeness, here are the bulk of the other SQL team blogs - which I leveraged (aka stole - thanks Euan!) from Euan Garden's EXCELLENT list (his blog roll) of SQL Server Team Blogs.

SQL Server Team Blogs

Excellent CORE/Related SQL Server Team Blogs

Now there's some entertainment for the [holiday] weekend ;). Hope that all of you enjoyed a bit of rest and relaxation this weekend.... now, back to work!

Cheers,
kt

Categories:
Resources | SQL Server 2005

Hey there everyone - The series has completed and I know that many of you struggled to get access to the surveys... Microsoft has asked me to post links to the surveys...so, for completeness, I decided to create this blog entry to have links for every session, every blog link (resources, demo scripts, etc.) and the survey links. I really did have a lot of fun on the series and I hope we can do this again!

TechNet Webcast Series

Session 1: A Fast-Paced Feature Overview and Series Introduction (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry is here
   Session's survey is here.

Session 2: Security (Level 200)
   Presenter: Bob Beauchemin, SQLskills.com, 
   Session's corresponding blog entry is here
   Session's survey is here.

Session 3: Understanding Installation Options and Initial Configuration (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry is here
   Session's survey is here.

Session 4: Upgrade Considerations and Migration Paths (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry is here
   Session's survey is here.

Session 5: Effective Use of the New Management Tools (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry is here
   Session's survey is here.

Session 6: New Application Design Patterns for Scalability and Availability and the Operational Implications of Service Broker (Level 200)   
   Presenter: Bob Beauchemin, SQLskills.com, 
   Session's corresponding blog entry is here
   Session's survey is here.

Session 7: Technologies and Features to Improve Availability (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry is here
   Session's survey is here.

Session 8: Implementing Database Mirroring, Part 1 of 2 (Level 200)
   Presenter: Mark Wistrom, SQL Server Program Manager - Microsoft Corp., 
   Session's corresponding blog entry is here
   Session's survey is here.

Session 9: Implementing Database Mirroring, Part 2 of 2 (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry is here
   Session's survey is here.

Session 10: Recovering from Isolated Disasters and Human Error (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry here. And a second blog entry here.
   Session's survey is here.

Session 11: Best Practices in Building Robust, Recoverable, and Reliable Systems (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry here.
   Session's survey is here.

And that's about it! I hope you really enjoy the series... and if you like that one, you might want to checkout the entire 10-part series on MSDN. The link to the blog entry that has all the links (like this one) is here.

Have fun,
kt

Well... 11 of 11 has completed. Friday was our last chat - until next time ;). It was a summary event where I took a slightly different spin on things focusing on grouping technologies by the amount of effort that's needed to implement them. Simply put, we looked at the technologies in order of what gives you the biggest bang for the buck. We ended the session with a ton of great questions (as always!) and there was even a question on the origin of foo (make sure to also see fubar).

First, there were a few links that I wanted to provide from the session, I'll start with those:

And, we also talked about Migrations:

Finally, capacity planning:

  1. Calculate the amount of space needed for your tables (calculate this as rows per page and then required pages as MB)
  2. Calculate the amount of space needed for your indexes (you can use sp_spaceused to get a current ratio of index to data and then use that OR you can estimate 1-3times your current data in indexes...yes, if you have 10GB of tables - you should estimate 10-30GB for indexes)
  3. Calculate in your estimate on future growth
  4. Take your single largest table and multiply by 1.5 for free space. (Use 2.5 IF you're going to use ONLINE index operations). So, if the single largest table is 3GB then I'd add 7-8GB for free space)
  5. Add a "just in case" extra 10-20%
  6. And, I didn't mention this BUT you should also include alerts to help you monitor space usage and significant changes to your free space!

And that wraps up the series. Wow - I can't believe how many of you joined in for questions as well as stayed on until the end. It's really great that so many of you are still having fun with SQL Server as well. I look forward to another series with you...at some point! In the interim, here are a few places where I'll be:

SQLskills Immersion Events - in the US... will be announced shortly. The BEST place to be when we announce the dates for these events is a subscriber on SQLskills. Subscribing is FREE and the announcements are going to be later this month. Here's a link to directly subscribe on SQLskills: http://www.sqlskills.com/login.aspx.

Thanks again for attending the series! It was great fun. I'll post a final blog entry with ALL of the links as well as all of the survey links. I know that they're going to send me these so that you can get easier access to them.

See you next time,
kt

In part 9 of our webcast series titled: Implementing Database Mirroring, we covered the steps from setup to failover to monitoring. There were lots of great questions and I think we could easily go back and do a couple more hours on database mirroring, failover combinations - including manual failover and client application questions. Having said that, there were a few interesting scenarios that came up that I thought I'd add a bit more details about here. For simplicity I created sections...

Where to go for more information on Database Mirroring and SQL Server SP1

Database Mirroring between Editions

Database Mirroring is supported in both the Standard Edition (SE) and the Enterprise Engine (EE) Edition(s): Enterprise, Enterprise Eval and Developer. In the EE Editions all configurations (synchronous and asynchronous) are supported: High Availability (sync), High Protection (sync) and High Performance (async). In the SE, only the synchronous forms of Database Mirroring are supported: High Availability and High Protection. One thing that is true however, (and I learned this as well - durin the webcast in Part 9 - thanks to the question submitted and Mark being present...thanks Mark!), is that even while synchronous mirroring is supported in both SE and EE, you can only create a mirroring partnership between servers of the same edition.

Database Mirroring between Platforms

Database Mirroring is supported in both the Standard Edition (SE) and the Enterprise Engine (EE) Edition(s): Enterprise, Enterprise Eval and Developer. In the EE Editions all configurations are supported: High Availability, High Protection and High Performance. In the SE, only the synchronous forms of Database Mirroring are supported: High Availability and High Protection but not the asynchronous High Performance configuration. One thing that is true however, (and I learned this as well in Part 9 - thanks Mark!), is that even while synchronous mirroring is supported in both SE and EE, you can only create a mirroring partnership between servers of the same edition.

Combining Database Mirroring with Other Technologies

The Books Online has a section targeting exactly this discussion. Review this section in the SQL Server 2005 Books Online (April Update): Database Mirroring and Other Features and Components. Additionally, I've provided a few comments for you to review as well as links to some of the specific BOL topics that exist on these combinations.

Database Mirroring with Failover Clustering

These two technologies CAN be combined but there are multiple things with which you should be aware. First, a failover of a cluster is SLOWER than a failover of a Mirror pair... as a result, it is likely that your secondary server will come online as the new principal in the time that it takes your principal (which is on a cluster) to recover. In a lot of cases, this is good because this keeps you online longer and results in less downtime but it may also be undesireable when your primary is now running at your alternate operations site - which is unstaffed. So, in some cases you may want to prevent automatic failover and instead only use the secondary mirror when you absolutely have to (i.e. NOT just when the cluster fails). If this is the case then you might prefer running with the High Protection configuration of Database Mirroring instead of the High Availability configuration.

This will allow you to manually failover when desired.

As another option - you can increase the timeout for Database Mirroring failover to 90 seconds. If the cluster comes back online within 90 seconds then the automatic detection/failover of the High Availabilty configuration will not occur unless the cluster does not come back online (as the principal) within x seconds. You can configure the Database Mirroring Failover timeout by using ALTER DATABASE.

ALTER DATABASE dbname SET PARTNER TIMEOUT x

Please note, this is only one timeout of many. There are many different types of timeouts in the system that can cause a failover. However, a hard error code generally starts the failure procedure sooner.  Mark pointed this out in his failure detection slides in our TechNet webcast series, Part 8.

Review this section in the SQL Server 2005 Books Online (April Update): Database Mirroring and Failover Clustering.

Database Mirroring with Replication

These two technologies CAN be combined together but not all configurations are supported and where supported, there are specific setup requirements. From the BOL: Replication supports mirroring the publication database for merge replication and for transactional replication with read-only Subscribers or queued updating Subscribers. Immediate updating Subscribers, Oracle Publishers, Publishers in a peer-to-peer topology, and republishing are not supported.

 
Review this section in the SQL Server 2005 Books Online (April Update): Replication and Database Mirroring

Database Mirroring with Log Shipping

These two technologies CAN be combined together but it will require a bit of manual configuration to continue log shipping when a mirror becomes the new principal.

Review this section in the SQL Server 2005 Books Online (April Update): Database Mirroring and Log Shipping.

And - there are others in the BOL. Please reference the sections listed above for more details.

And - with that - we're caught up with our resources and references for this series. Part 11 - the LAST one - is this Friday, May 19. I look forward to your being there LIVE. Register here and come ready with your questions, this one is going to be VERY focused on best practices, ideas/architectures and your questions. Those of you that are there LIVE will help to direct the session.

Thanks!
kt

In the last few minutes of the webcast (part 10), I goofed up one line of code and didn't realize it until today. As my very last demo (and there were at least 10 different scenarios/concepts/demos yesterday) in my webcast, I decided to show a Database Snapshot on a Mirror database. It was the second database snapshot that I had created so my first database snapshot demo was just fine. However, when I went to create the database snapshot on the mirror, I inadvertently left off the most important part "AS SNAPSHOT OF AdventureWorks". The irony is that I tried to query some tables and just ended up (because we were right at the end of the webcast ;)) saying that I probably wasn't getting the table names right. Ha - there were no tables... I hadn't created a database snapshot, I had created just another database - so the only tables I was seeing were the catalog views.

Anyway, just for clarity, I corrected the "Demo Scripts" zip that's associated with Part 10 BUT if you've already downloaded it then you'll have the old (and incorrect) version of this script (SnapshotOnMirror.sql). And, for completeness, I'll put the code that I executed during the webcast here:

USE AdventureWorks
go

USE master
go

CREATE DATABASE AdventureWorksSnap
ON
( NAME = N'AdventureWorks_Data',
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\AdventureWorksSnap_Data.mdfss')
-- , SIZE = 167872KB , MAXSIZE = UNLIMITED, FILEGROWTH = 16384KB )
go

and the code that I should have executed here:

USE AdventureWorks
go

USE master
go

CREATE DATABASE AdventureWorksSnap
ON
( NAME = N'AdventureWorks_Data',
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\AdventureWorksSnap_Data.mdfss')
-- , SIZE = 167872KB , MAXSIZE = UNLIMITED, FILEGROWTH = 16384KB )
AS SNAPSHOT OF AdventureWorks  <<<< ----------
go

USE AdventureWorksSnap
go

SELECT * FROM person.contact
go

So, quick demos right at the end of the webcast might not have been my best idea ;). But - I'm surprised none of you called me on it?! I'll blame it on this for now.

Have a great weekend,
kt

OK - today's session was quite fun... lots of demos and quite a few "tie-ins" where I tried to bring together many things that we've touched on in our series. And - that's really the point of the series - creating a reliable, robust, scalable and available environment takes MANY different features. You really need to architect a complete solution in order to handle the many potential problems that may occur. And, unfortunately, it's a never ending process; you're never done and you're never going to get everything (sorry!). You will need to re-evaluate, monitor, and manage your system as long as it runs to keep it reliable, available and fast. Something will come up...someday...that you didn't think about, evaluate and/or prevent. But, then you'll know and then you'll put something into place to keep it from happening again.

So - to tie back into some of the other sessions and resources, here is a list of everything to date in the series as well as a few specific references I made during the session.

Demo Scripts are here: 20060512_TechNetWebcast-Part10.zip (25.46 KB) (updated on Sat, May 13 at 2:55 PDT)
Credit Database zip is here. NOTE: This is a 48MB zip which expands to a 175MB backup and restores to a 700 MB database (with a lot of free space for testing, etc.).

TechNet Webcast Series

Session 1: A Fast-Paced Feature Overview and Series Introduction (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry, here

Session 2: Security (Level 200)
   Presenter: Bob Beauchemin, SQLskills.com, 
   Session's corresponding blog entry, here

Session 3: Understanding Installation Options and Initial Configuration (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry, here

Session 4: Upgrade Considerations and Migration Paths (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry, here

Session 5: Effective Use of the New Management Tools (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry, here

Session 6: New Application Design Patterns for Scalability and Availability and the Operational Implications of Service Broker (Level 200)   
   Presenter: Bob Beauchemin, SQLskills.com, 
   Session's corresponding blog entry, here

Session 7: Technologies and Features to Improve Availability (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry, here

Session 8: Implementing Database Mirroring, Part 1 of 2 (Level 200)
   Presenter: Mark Wistrom, SQL Server Program Manager - Microsoft Corp., 
   Session's corresponding blog entry, here

Session 9: Implementing Database Mirroring, Part 2 of 2 (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry, here.

Session 10: Recovering from Isolated Disasters and Human Error (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   You're reading it! :-)

Recovery Models and Backup/Restore

  • MSDN Webcast Parts I and II cover Recovery Models and some issues/best practices related to changing recovery models. Check out the blog entry here which has links to the sessions and their associated blog entries.
  • MSPress Title: SQL Server 2000 High Availability, Chapter 9: Database Environment Basics for Recovery is here. The MSPress page for this title is here.
  • SQL Server Magazine Article on Isolated Disasters and Recovery (using RESTORE with STANDBY/STOPAT to investigate when a database became damaged) is here. Check out a consolidated list of all of my SQL Server Magazine Articles here and SQL Server Magazine here.

Table and Index Partitioning

RAID 0+1 and RAID 1+0

There was a question that came up on this and the question basically asked - which is better. Well, this is a hard question to answer because they both have pro's and con's BUT before I get to the pro's/con's there's also another [more important] issue; these two get confused and swapped all the time. In fact, many vendors USED to refer to these interchangeably and even just lumped them together as RAID 10. Today, most people don't do this and most people also try to refer to the underlying technlogy instead of the numbers. Having said all of that, RAID 1+0 is Striped Mirrors and is my general recommendation because it tends to be more reliable than 0+1 and can tolerate more drive failures than 0+1. RAID 0+1 is Mirrored Stripes - which generally outperforms RAID 1+0 but cannot tolerate the loss of more than one drive and because of that it's more vulnerable. In the end, I'd suggest a simple "educational" site here (it's on a commercial site but it has a nice - and short - description of the different types of RAID arrays).

See you next week - for our LAST part in this series - Part 11: Best Practices in Building Robust, Recoverable, and Reliable Systems (Level 200).

Thanks for reading, listening and continuing to ask great questions!
kt

I completely spaced in blogging about a recent interview I did...poolside, in Orlando, FL when I was at SQL Connections back in April. I had the pleasure of meeting Chuck Boyce (a DBA from Philly, PA) who feverishly works in his spare time to spread the word about technology and specifically about all things SQL. His blog is here and he does a great job of summarizing good links and useful resources - almost daily (just so you don't have to!) on his "WHERE Clause" resource blog posts. You should check that out while you have your morning coffee. A great way to quickly find some useful stuff.

Additionally, Chuck has a radio program (What's Happening in SQL Server) that he does for SSWUG (Steven Wynkoop's excellent SQL Server Worldwide User Group). The entire list of SSWUG Broadcasts are here and specifically, the chat that we did poolside is here.

So - sorry that took me so long to remember... I wish I could blame it on too much sun (and/or drinks) poolside but......... sadly, I can't.

See you Friday on our 10th part of our TechNet Webcast series. Wow, we're on the home stretch!

Thanks,
kt

Last week Mark Wistrom (Program Manager in the SQL Server Team at Microsoft), delivered part 8 of our TechNet webcast series. Most of the resources needed to prepare for this session - as well as learn more about Database Mirroring - have already been posted in the blog entry for part 7 (as homework!). However, there were two things that we wanted to post from Mark's session:

(1) The case study that was presented during the session is here.
(2) The Q&A that was created by a few of Mark's team who were answering during the session (and then Mark did a scrub of it as well to clean it up- THANKS Mark) is here (29.1 KB).

Enjoy!
kt

Well, Part 7 has completed and we're on the home stretch... focusing on part of the new Always On technologies of SQL Server 2005. We've made our way through quite a few discussions and my main point for the sequence - as defined - was to make clear that keeping a system available takes a myriad of choices, features, configurations - and more. In fact, even once you think you've done it you still need to monitor, manage and re-evaluate your configuration if unexpected events occur and bring your system offline and/or unavailable in any way. And - well, that's also a big part of my focus... what does "availability" mean to you? Do you believe that only unplanned downtime counts or that *any* impact to the system's availability counts as "downtime"? (btw - I'd really like to know!)

Regardless, that's been our primary focus for the series... I believe that the Enterprise Edition of SQL Server 2005 can keep your system available through a very wide number of system hiccups, damage and even more catastrophic disasters. In the previosu sessions we looked at migration and installation (ensuring a proper configuration - right from the start), we covered creating a secure environment (which also impacts availability), we looked at "finding the right tool for the job" and then we started looking into alternative designs that may help to improve availability by scaling out our design. If you missed any of the sessions you might want to go back and see what's what! Here's the list of sessions at a quick glance:

Session 1: A Fast-Paced Feature Overview and Series Introduction (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry, here

Session 2: Security (Level 200)
   Presenter: Bob Beauchemin, SQLskills.com, 
   Session's corresponding blog entry, here

Session 3: Understanding Installation Options and Initial Configuration (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry, here

Session 4: Upgrade Considerations and Migration Paths (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry, here

Session 5: Effective Use of the New Management Tools (Level 200)
   Presenter: Kimberly L. Tripp, SQLskills.com, 
   Session's corresponding blog entry, here

Session 6: New Application Design Patterns for Scalability and Availability and the Operational Implications of Service Broker (Level 200)   
   Presenter: Bob Beauchemin, SQLskills.com, 
   Session's corresponding blog entry, here

Session 7: Technologies and Features to Improve Availability (Level 200)

Session 7 was a great deal of fun - we covered 11 different technologies (12 if you count partitioning) and discussed the architectural reasons to choose each teachnology - as well as the barriers it provides protection against. We talked about a lot of technologies and a lot of resources:

  • Remote Mirroring - Always consult your hardware vendor and make sure they support block size preservation and write-order preservation. Ideally, RM should be combined with Failover Clustering - when that's the case you have a "Geographically dispersed failover cluster" which removes the single point of failure in Failover Clustering. See the Windows Server Catalog, specifically for the Geographically Dispersed Cluster Solution category.
  • Failover Clustering - A combination of hardware and software to provide protection against server failure. Only solutions from the Windows Server Catalog, specifically for the Cluster Solution category for supported solutions in failover technologies.
  • Database Mirroring - See the homework references below as well as come back for the next two upcoming sessions where we cover DBM for two weeks.
  • Log Shipping - While this is still supported *and* while there are still some excellent uses for Log Shipping, this is not a "favorite" solely for failover. If you're looking for a "warm" failover solution (warm = no automatic detection, no automatic failover) with less potential for data loss - you should consider the "High Performance" configuration of Database Mirroring. If you would like to continue using Log Shipping for a more latent secondary (a log load delay) for managing disasters (either investigation or recovery) of data from an older "version" of the database then LS is an option but Database Snapshots can also help in *some* cases. This technology is well documented as well as written about.
  • Peer to Peer Replication - I demo'ed and discussed this in session 1 as well as referenced a few helpful links for TechNet sessions, etc. See the session and "blog" links as listed above.
  • RAID - Redundant Array of Independant Disk
  • Partial Database Availability, Online Piecemeal Restore and Database Snapshots - come back for Session 10 where I'll cover these and demo these!
  • Raid.edu - a short - but interesting overview of all the different raid types.
  • MSPress title: Microsoft SQL Server 2000 High Availability, Chapter 9: Database Environment Basics for Recovery
  • SQL Server 2000 and SQL Server 2005 support for mounted volumes
  • MSDN "Developer/Design" Webcast Series: Blog entry with all of the links
    • Online Index Operations, Part 5
    • Snapshot Isolation, Part 6
    • Partitioning, Part 8
  • Scalable shared databases are supported by SQL Server 2005
  • Oracle Real Application Clusters and Industry Trends in Cluster Parallelism and Availability

Finally, be ready to watch Mark's session on Friday, April 28. Here's your homework for Session 7:

  1. Review: Release notes and information for SQL Server 2005 Service Pack 1 
  2. Watch: TechNet Webcast: How to Increase Availability Using Database Mirroring in SQL Server 2005 (Level 200) 
  3. Read: Database Mirroring in SQL Server 2005 

And the details for Session 8:

TechNet Webcast: SQL Server 2005 for the IT Professional (Part 8 of 11): Implementing Database Mirroring in SQL Server 2005 (Part 1 of 2) (Level 200)
Presenter: Mark Wistrom, Program Manager, Microsoft Corporation

Database mirroring was released for testing when Microsoft SQL Server 2005 shipped in November. As the first service pack has shipped, it's time to get prepared for database mirroring in production! In this session, understand the barriers of what database mirroring will protect against, what constitutes a "failover", what the performance criteria are and how the monitoring has been brought together for release. Attend this first part of two - as the eighth webcast in the SQL Server 2005 for the IT Professional series to obtain better insight for when database mirroring should be implemented as well as what to expect moving forward in service pack 1 (SP1). Part 9 will cover implementation from start to finish - as an end to end demo.

Start Time:   Friday, April 28, 2006 9:30 AM (GMT-08:00) Pacific Time (US & Canada) 
End Time:   Friday, April 28, 2006 11:00 AM (GMT-08:00) Pacific Time (US & Canada) 

See you in Part 9: TechNet Webcast: SQL Server 2005 for the IT Professional (Part 9 of 11): Implementing Database Mirroring in SQL Server 2005 (Part 2 of 2) (Level 200) on May 5th.
kt

In doing my final preparations for part 7 of my TechNet webcast series on Building Robust, Reliable and Recoverable Systems, I decided to (once again) review my abstract. I do this as a last step to make sure I cover everything I said I would cover. Here's the abstract:

TechNet Webcast: SQL Server 2005 for the IT Professional (Part 7 of 11): Technologies and Features to Improve Availability

Find the right technology for the job in this seventh webcast of the SQL Server 2005 for the IT Professional series. Join us to learn which technologies provide the right solution for a specific problem, as well as the pros and cons of each technology. Designing a system to protect you against the faults most likely to occur is the first and most important strategy, but finding the right combination to minimize both downtime and data loss is critical. This webcast covers many of the “AlwaysOn” technologies at a glance: remote mirroring, failover clustering, database mirroring, log shipping, [peer to peer] replication, RAID, partial database availability, piecemeal online restore, database snapshots, snapshot isolation, and online index operations.

Start Time: Friday, April 21, 2006 9:30 AM (GMT-08:00) Pacific Time (US & Canada) 
End Time: Friday, April 21, 2006 11:00 AM (GMT-08:00) Pacific Time (US & Canada)  

So, in re-reading this it certainly sounds like a lot to cover. But - rest assured, this session is what we're going to use to lead into the rest of the series. Parts 8-11 go into more detail on some of the new and more complex topics covered in that list. For example, parts 8 and 9 cover Database Mirroring and part 10 covers Partial Database Availability, Online Piecemeal Restore and Database Snapshots. Also, for a few topics, I'll point you to some great resources to keep you going in learning these other technologies. In the end, my goal for Friday is make sure you understand the best use case for each of these technologies. Once you know when it's best to use them, you can really begin to architect the *right* solution for your system! Parts 8-11 will focus more on implementation and demos!

If you're wondering what your options are and how to get better direction on the architecture to implement, join us on Friday: http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032290562&EventCategory=4&culture=en-US&CountryCode=US

Oh, and in the actual abstract, there's a typo...not sure if we'll have time to cover log hipping. ;) ;)

Talk to you on Friday!
kt

Well, Friday brought another flood of great questions from everyone as we moved our way through many of the new 2005 tools. The one thing that I really wanted to stress was that *many* SQL Server 2005 tools (SQLCMD, SSMS and SQL Profiler) offer important features that can be leveraged today, even if your primary production servers are still SQL Server 2000. I did move through the tools quickly and showed quite a few new features; there are a lot of excellent resources to help you dive in deeper now that you're interested, ready and know some of the rewards of starting now. Here are a few of those resources:

For deleting old database backup history, there are a couple of stored procedures in msdb that can be used:

  • sp_delete_backup_and_restore_history
  • sp_delete_backuphistory
  • sp_delete_database_backuphistory

For cycling errorlogs, use: sp_cycle_errorlog.

And - lots of other questions that I primarily answered online in the last 40+ minutes. We had a great group and I hope everyone had fun. For the second half+ of the series we're going to focus on architectures and solutions - mostly related to disaster recovery and avoidance. However, the next part of the series is going to branch into a new (and *very* interesting) area of SQL Server 2005 - Service Broker. There are many impacts of Service Broker on the SQL Server system AND you might find a few applications of the technology within your own application as well. Have a great time with Bob for part 6 and I'll be back for part 7 next Friday.

See you soon!
kt

And another one bites the dust! Wow - what a great group today... soooooo many questions! For those of you that weren't there - the lecture was 80 minutes and the additional Q&A went on for another 45 minutes. So - as a result, there were *a lot* of additional resources needed. Let me get started with all of those right away.

To prepare for moving to SQL Server 2005 there are a few EXCELLENT resources with which you should start:

Phase 1 - Prepare to Upgrade/Migrate

Phase 2 - Database-level Testing

  • Copy Database Wizard or
  • Backup/Restore or
  • Detach/Attach

Phase 3 - Server-level Testing

  • Consider upgrade in-place or
  • Make sure that you manually migrate all EXTERNAL objects, logins, jobs, error messages, etc.

Phase 4 - Testing/Updating after the upgrade/migration

  • Update statistics immediately
  • Test application code, database compatibility modes, session settings
  • Check for "broken code" in terms of system table changes
  • MOST of this should have already been done and assessed in Phase 1 but better to be safe!

And - finally - the other things we talked about and the rest of the links are here:

And - that's it for this week. See you next Friday when we chat about the new Management Tools and how to effectively use them!

Thanks for listening/watching and asking GREAT questions,
kt

OK - so Bob Beachemin delivered Part 2 and I was back for Part 3. We had lots of folks on board with this session (more than 400) and as a result, I had a lot of questions. More than anything it seems like a lot of you wanted to know which versions of which came with what and could go with what (in terms of OS)... so, even getting started - and probably installing at home to play around ;). I was expecting tons of questions on the technical tidbits of installation options so you sure kept me on my toes!! Here are probably two very useful MSDN Links to SQL Server BOL Topics:

Hardware and Software Requirements for Installing SQL Server 2005 - In fact, this has a GREAT matrix of all the different platforms and which versions can be installed where!

Features Supported by the Editions of SQL Server 2005 - The BOL topic is very detailed.

And - in addition to those, there were quite a few more topics discussed during the webcast. The rest of the blog entry focuses on those questions! I hope this helps... enjoy!

Resource Links for all On-demand TechNet Sessions in our series for the ITPro

Part 1 - A Fast-Paced Feature Overview and Series Introduction
   On-demand link
   My blog entry for the session

Part 2 - Security
   On-demand link
   Bob's Blog Entry for the session

Part 3 - Understanding Installation Options and Initial Configuration (Level 200)
   On-demand link
   Blog entry link (well, you're already here ;)

Session 3 Resource Links as discussed during the session:

Submitting Product Feedback

  • MSDN Product Feedback Center: http://lab.msdn.microsoft.com/productfeedback/
  • Tips for Submitting Feedback on the Feedback Center (tips were related to Visual Studio but there are some great general tips about how to file useful feedback!): http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=138235&SiteID=1
  • Add the Help Toolbar and connect to Product Feedback from within the SQL Server 2005 Tools. In SQL Server Management Studio, select View, Toolbars, add Help. Once the Help toolbar is visable, select the Send Feeback button which should be in the "Ask a Question" section at the end of the toolbar.

If you're thinking about downloading - check out the newly bundled SP1 downloads:

SQL Server 2005 RTM Enterprise Eval Edition
SQL Server 2005 RTM Express Edition

SQL Server Express Edition with SP1 (SQLEXPR.EXE)
SQL Server 2005 Express Edition with SP1

SQL Server Express Edition with Advanced Services (SQLEXPR_ADV.EXE)
SQL Server 2005 Express Edition with SP1 + Advanced Services includes SQL Server Management Studio Express (SSMSE), support for full-text catalogs, and support for viewing reports via report server.

SQL Server Express Edition Toolkit (SQLEXPR_TOOLKIT.EXE)
SQL Server 2005 Express Edition Toolkit (SQL Server Express Toolkit) provides tools and resources to manage SQL Server Express and SQL Server Express Edition with Advanced Services. It also allows creating reports by using SQL Server 2005 Reporting Services (SSRS).

SQL Server Management Studio Express (SQLServer2005_SSMSEE.msi)
SQL Server Management Studio Express (SSMSE) provides a graphical management tool for managing SQL Server 2005 Express Edition and SQL Server 2005 Express Edition with Advanced Services instances. SSMSE can also manage relational engine instances created by any edition of SQL Server 2005. SSMSE cannot manage Analysis Services, Integration Services, SQL Server 2005 Mobile Edition, Notification Services, Reporting Services, or SQL Server Agent.

And - that's it for this week. See you on Friday, March 31 when we'll chat more about Upgrade and Migration. Here's the link to register for this upcoming session: http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032290477&EventCategory=4&culture=en-US&CountryCode=US

See you soon,
kt

Hey there everyone - Well there was lots of excitement around our first session...so much so that apparently a Live Meeting server went down and caused MANY of you to get booted-out or even blocked-from attending (figures, right!).... Ugh (talk about the irony here - a series on high availability that isn't available because a server crashes...hhmmm, I think I know where to go for my next potential customers ;) ;). Regardless, I'm glad that at least a couple hundred of you did get in. For the more than 1000 others that were registered but unable to get in - I truly want to apologize!

The good news is that we now have the on-demand link available and for all of you who registered, it should have been sent to you via email. Also, as promised, I've attached the resources and demo scripts we talked about today.

Partial Database Availability Demo Scripts: PartialDBAvail-DemoScripts.zip (4.19 KB)
Database Mirroring Demo Scripts: DatabaseMirroring-DemoScripts.zip (3.74 KB)
Replication
Demo Scripts - Since this demo was completed through the UI, here are some useful references on Replication:

Other Resources:

SQLCMD Resources: My blog entry after Michiel Worries' Webcast (includes links to webcast, etc.)
TechNet Resource Center: SQL Server 2005 Mission Critical High Availability
Demo: Windows Server System Reference Architecture Design Considerations for SQL Server 2005 High Availability
Whitepaper: Choosing a Database for High Availability: An Analysis of SQL Server and Oracle

Also, to get you ready for SQL Server 2005 - check out the Upgrade/Migration Resource Center: Upgrading to SQL Server 2005

And... that should keep you busy between now and next week!

Have fun,
kt

Hey there everyone! I know I still owe you a few Q&A entries (for sessions 7, 8 and 9) but I wanted to get this blog entry out there so that you can play a bit with some of the resources. This series was targeted at developers but really helps to "bridge the gap" between development and administration by always remembering the impacts of what you implement (and techniques to help you to see if you do). More specifically, everything you do and/or design, has the potential for a negative impact to something else - there's no free lunch, eh?

So, this series focused more on Scalability but always remembered the impact to availability and/or reliability. This last session brought together the three primary areas to remember while developing a scalable and reliable architecture:

  1. Know your data
    • Design for Performance - Sessions 1, 2, 3, 6, and 8
  2. Know your users
    • Indexing for Performance - Sessions 4, 5 and 9
    • Optimizing Procedural Code - Session 7
    • Controlling Mixed Workloads and Concurrency - Session 6
  3. Users lie
    • Profile - to make sure that you're tuning what's really happening as opposed to what you think was going to happen! - Session 9

This last session had some great questions and as a result, a few new resources were used. Here are a few of the things we talked about:

Event Notifications and DDL Triggers

DMVs

Webcast links for the entire series!

Part 1: Creating a Recoverable Database
For the MSDN Download for Part 1, click here.
For the SQLskills Blog Entry for Part 1, click here.

Part 2: Creating a Reliable and Automated Backup Strategy
For the MSDN Download for Part 2, click here.
For the SQLskills Blog Entry for Part 2, click here.

Part 3: Designing Tables that Scale, Best Practices in Data Types and Initial Table Structures
For the MSDN Download for Part 3, click here.
For the SQLskills Blog Entry for Part 3, click here.

Part 4: SQL Server Indexing Best Practices
For the MSDN Download for Part 4, click here.
For the SQLskills Blog Entries for Part 4
Resource links blog entry, click here.
Q&A blog entry, click
here.

Part 5: SQL Server Index Defrag Best Practices
For the MSDN Download for Part 5, click here.
For the SQLskills Blog entry, click here.

Part 6: Mixed Workloads, Secondary Databases, Locking and Isolation
For the MSDN Download for Part 6, click here.
For the SQLskills Blog Entry for Part 6, click here.

Part 7: Understanding Plan Caching and Optimizing Procedure Performance
For the MSDN Download for Part 7, click here.

Part 8: Data Loading and Aging Strategies - Table and Index Partitioning
For the MSDN Download for Part 8, click here.

Part 9: Profiling for Better Performance
For the MSDN Download for Part 9, click here.

Part 10: Session Summary - Common Roadblocks to Scalability
For the MSDN Download for Part 10, click here.
Transcript can be found here.

So, the series comes to an end (even though I still have more work to do). I have to say that it was a lot of fun and I enjoyed everyone's questions. And /start shameless plug here/ starting in March, SQLskills will begin a 10-12 part series on TechNet. The series will include sessions from my colleague Bob Beauchemin as well as me. This will definitely be more Operations and DBA focused but for all of you developers - it may help you better understand the system, High Availability and a myriad of New Features in SQL Server 2005.

I hope to see you there - or at least your DBA... ;-)

Thanks again everyone,

Kimberly

Effectively Designing a Scalable and Reliable Database

A Primer to Proper SQL Server Development

SQL Server Mixed Workloads, Secondary Databases, Locking and Isolation, Part 6 of 10

Presented by Kimberly L. Tripp, SQLskills.com

Q: Can I view a recording of this webcast?Part 6 can be replayed by clicking here.

Q: Where can we get the demo scripts AND the sample database: Credit? The demo scripts are in this zip (20050916MSDNDemoScripts.zip (6.11 KB)); here in this blog entry. However, at the series completion, I will also create an entry under Past Event Resources for the entire webcast series.  To download the ZIP of the Credit Database Backup click here. Once unzipped, restore this backup to SQL Server 2000 or SQL Server 2005. The backup is a SQL Server 2000 backup and can be restored to either version! If restoring to SQL Server 2005, you might want to change the destination for the data and log file as the path will probably be different.

Q: Where are the links to all prior Webcast Q&As from this series?

Part 1: Creating a Recoverable Database
For the MSDN Download for Part 1, click here.
For the SQLskills Blog Entry for Part 1, click here.

Part 2: Creating a Reliable and Automated Backup Strategy
For the MSDN Download for Part 2, click here.
For the SQLskills Blog Entry for Part 2, click here.

Part 3: Designing Tables that Scale, Best Practices in Data Types and Initial Table Structures
For the MSDN Download for Part 3, click here.
For the SQLskills Blog Entry for Part 3, click here.

Part 4: SQL Server Indexing Best Practices
For the MSDN Download for Part 4, click here.
For the SQLskills Blog Entries for Part 4
Resource links blog entry, click here.
Q&A blog entry, click here.

Part 5: SQL Server Index Defrag Best Practices
For the MSDN Download for Part 5, click here.
For the SQLskills Blog entry, click here.

Q: How can I replay previous sessions? I thought we were going to get emails for replaying -- but I haven't received any replay emails. You will receive replay emails ONLY when you register for these sessions through MSDN. We’ve come to find out that there are other ways to register but it’s only through MSDN that we know for sure you will receive the replay information.

Technical Questions

Q: I know you have covered indexes and backups in other webcasts, but here is my question I use heavily temporary tables. My TempDB grows up to 5 GIG. Should I backup or truncate the transaction log in order to bring it back to its normal size? No, there is no need to specifically maintain the transaction log of the TempDB database. If the transaction log (and subsequently, the database) grows large – there could be multiple reasons for that and instead of thinking in terms of trying to manage the log, I’d look at long running transactions and/or large transaction. You can use Profiler to help you see long running and/or large transactions.

Q: Can I perform a database snapshot to another server? No, database snapshots must be created on the same server as the database on which the snapshot is being based.

Q: Can I snapshot by filegroup? No, however if what you want to do is create a snapshot which does NOT include certain files – you can take those filegroups offline and then create the snapshot. In the snapshot the only file/filegroups available will be those which were online when the snapshot was created…even if those files/filegroups are brought online after the snapshot was created.

Q: Could a reader be blocked on the snapshot DB while SQL updates the changed page? No. The copy on write mechanism is really a copy before write mechanism and the pages will be copied before the write and essentially before the locks, etc. The only possible “blocking” could be caused by the excess I/Os that need to be performed. However, the I/Os are performed only on the FIRST change to the page after the snapshot is created – so it’s minimal!!

Q: Are DMVs in SQL Server 2005 only?Yes, DMVs = Dynamic Management Views and these are a feature of SQL Server 2005.

Q: Is read uncommitted the lowest/least in terms of data consistency? Yes, read uncommitted is also known as “dirty read.” A dirty read is a read against an “in-flight” transaction; this transaction could be rolled back. As a result, the query that read that data would be inaccurate.

Q: What is the effect of versioning on fragmentation and performance (I'm assuming I create a split of the page is full)? Actually, I’m not sure I’m following this one… But – I think I can answer it by just giving you some insight into how things work. Versioning – in terms of the data overhead added to the data row – does add a 14 byte value to help store the offset. This overhead is added ONLY once, to each row, after one of the snapshot isolation options is turned on (either or both – the READ_COMMITTED_SNAPSHOT or ALLOW_SNAPSHOT_ISOLATION). When this 14 byte value is added to each of the rows, the additional 14-bytes might cause the page to split. Again, this is only a one-time addition. The trick to optimizing this structural change is to change the database option and then rebuild your indexes. This will make the data contiguous and versioning will have no additional affects on the data row. Now, if what you were thinking is that the versions were stored in the data row – then this is NOT the case. The version store comes from the TempDB and as a result, there is no additional overhead (over the 14-bytes) needed within the data row.

Q: What if we're not using transactions? Will repeatable read still lock the table during the read (particularly if the select is long)? Repeatable reads locks – and holds – the resources as they are read. So, YES, in the case of a select statement, you will acquire and hold the read locks for the life of the transaction.

Q: How much additional overhead does versioning require from the SQL Server engine? Most of the overhead comes from TempDB but there’s also a bit of overhead in looking up the version. There are no direct numbers associated with the overhead but in a lot of cases you should think in terms of roughly 10% additional costs for your transaction… So, I guess the best point is that you will have slower overall performance when implementing row version; however, you might solve a lot of your blocking problems. Slower but not blocked is better than not running at all – even when it does run at all quickly. J  In all seriousness though, if blocking is NOT your primary problem, you will add overhead without a possible benefit.

Q: So, is it the new transaction data or the old transaction data held in the snapshot store (seems like it might be different for statement vs. transaction level snapshotting)? It’s always the BEFORE image. The general process of the write is called “copy on write” but I think of it better as copy before write.

Q: Can we optimize the snapshot store (different physical device, file group(s), etc.)? No. However, you should look at optimizing TempDB. There are multiple things that you might want to consider. I discuss those thing in this blog entry here.

Q: If I don’t need locking why shouldn’t I use read-uncommitted? Hmm, you can… you just need to be aware of the fact that the data is “dirty” and is not guaranteed to persist.

Q: Where does SQL store all the row versions (with snapshot isolation turned on)? The version store is in TempDB.

Q: How do I view all of the Report options from the summary page? I am looking at Adventureworks (compatibility level = 90), but all that I see is the General report. Ah, ha! The new summary windows were added to after the beta II April CTP. So, what this tells me is that you’re running a build lower than 9.00.1187.07. At this point, I’d go for the September CTP which is build 9.001314.06.

Q: This question is from previous webcast... Is there anything new with SQL 2005 that does datetime support data types? Time datatype or Date datatype only? No, SQL Server 2005 only includes the datetime datatype for date/time data. However, by using “custom types” you can create your own types which are date only or time only (just for one example). There were separate SQLCLR types of date only and time only in SQL Server 2005; however these were non-native types and subsequently removed. Instead, they will be shipped as examples in a resource kit which ships after RTM.

Q: Kimberly, the downloadable zip file from your blog for at least the first session will only unzip to a "C:" drive (which my system doesn't have ;^) Could you please re-zip it to allow election of the drive to which it should unzip? This one still perplexes me. I didn’t set any options that would restrict this…  

For the next session, we’re going to cover how SQL Server keeps plans, where you can look to see what's in cache AND how you can know better if the stored procedure's plan should be kept...or not? If you’re interested in hearing more - here’s the registration link:

MSDN Webcast: A Primer to Proper SQL Server Development (Part 7 of 10): Understanding Plan Caching and Optimizing Procedure Performance

 See you on Friday!

kt

Brian A. Randell's Blog: http://www.mcwtech.com/CS/blogs/brianr/default.aspx
MCWTechnologies Website: http://www.mcwtech.com/

Kimberly L. Tripp's Blog: http://www.SQLskills.com/blogs/Kimberly
SQLskills Website: http://www.SQLskills.com

Presentation Resources
Presentation in PDF form
Kimberly's Demo Scripts
Brian's Demo Scripts and Code

Running SQL Server 2000 tools and SQL Server 2005 tools side-by-side
We talked about re-registering all of your COM components and I didn't have a slide for this. So, if SQL Server 2000 Enterprise Manager crashes when you try to access database properties OR SQL Server 2000 Query Analyzer doesn't seem to do the color coding correctly, then you need to re-register your COM componenets in BOTH of the following directories:
   c:\program files\microsoft sql server\80\tools\binn 
   c:\program files\microsoft sql server\mssql\binn

To re-register the components, execute: FOR %i IN (*.dll) DO regsvr32 /s %i

Resources and Presentations on Indexing Best Practices
First, start by reviewing the blog entries listed in the Indexes category here.
As for the Webcasts - there are 6 from which to choose! Each webcast has an associated Q&A posted to my blog - make sure to look for the Q&As. Usually they are posted within 1 week (give or take :) from the actual webcast.
   MSDN Webcast: Indexing for Performance - Finding the Right Balance (SQL Server 2000), recorded 11 June 2004
   MSDN Webcast: Indexing for Performance - Index Maintenance Best Practices (SQL Server 2000), recorded 19 July 2004
   TechNet It’s Sh0wtime Webcast: Index Creation Best Practices with SQL Server 2005, recorded at Tech Ed Amsterdam, July 2005
   TechNet It’s Sh0wtime Webcast: Index Defragmentation Best Practices with SQL Server 2005, recorded at Tech Ed Amsterdam, July 2005
   MSDN Webcast Series: Part 4 of 10, Best Practices in Indexing, recorded 26 August 2005
   MSDN Webcast Series: Part 5 of 10, New Features in Indexing and Index Maintenance Best Practices, recorded 2 September 2005

MSDN Webcast Series: Building Highly Reliable and Available Systems with SQL Server 2005
Watch one on-demand and/or sign up to attend one of the remaining!

Part 1: Creating a Recoverable Database

      For the MSDN Download for Part 1, click here.

      For the SQLskills Blog Entry for Part 1, click here

 

Part 2: Creating a Reliable and Automated Backup Strategy

      For the MSDN Download for Part 2, click here.

      For the SQLskills Blog Entry for Part 2, click here.

 

Part 3: Designing Tables that Scale, Best Practices in Data Types and Initial Table Structures

      For the MSDN Download for Part 3, click here.

      For the SQLskills Blog Entry for Part 3, click here.

 

Part 4: SQL Server Indexing Best Practices

      For the MSDN Download for Part 4, click here.

      For the SQLskills Blog Entries for Part 4

         Resource links blog entry, click here.

         Q&A blog entry, click here.

         Part 5: SQL Server Index Defrag Best Practices
         
      For the MSDN Download for Part 5, click here.
               
For the SQLskills Blog entry, click here.

         Part 6: SQL Server Mixed Workloads, Secondary Databases, Locking and Isolation
         
      For the MSDN Download for Part 6, click here.

         Part 7: Understanding Plan Caching and Optimizing Procedure Performance 
         
      To register to attend, click here.

         Part 8: Data Loading and Aging Strategies 
         
      For the MSDN Download for Part 6, click here.

         Part 9: Profiling for Better Performance 
         
      For the MSDN Download for Part 6, click here.

         Part 10: Most Common Roadblocks to Scalability and Reliability 
         
      For the MSDN Download for Part 6, click here.

Profiling SQL Server and Creating a Server-side Trace
INF: How to Create a SQL Server 2000 Trace (283790)
HOW TO: Programmatically Load Trace Files into Tables (270599)
How To: Stop a Server-Side Trace in SQL Server 2000 (822853)
INF: How to Monitor SQL Server 2000 Traces (283786)
INF: Stored Procedure to Create a SQL Server 2000 Blackbox Trace (281671)
BUG: BOL Incorrectly States That Users Do Not Need to Be Sysadmin to Use Profiler or SQL Profiler SPs (310175) 
   NOTE: This is ONLY a SQL Server 2000 limitation.
INF: Job to Monitor SQL Server 2000 Performance and Activity (283696)
Support WebCast: SQL Server 2000 Profiler: What's New and How to Effectively Use It

Great KB to Start with for Troubleshooting
HOW TO: Troubleshoot Application Performance with SQL Server

What about Whitepapers - we referenced quite a few!
Get a Lean, Mean Dev Machine with the Express Editions of Visual Basic and SQL Server 2005 by Brian A. Randell 
SQL Server 2005 Beta 2 Transact-SQL Enhancements by Itzik Ben-Gan 
SQL Server 2005 Partitioned Tables and Indexes by Kimberly L. Tripp
SQL Server 2005 Snapshot Isolation by Kimberly L. Tripp
SQL Server 2005: the Database Administrator’s Guide to the SQL Server Database Engine .NET Common Language Runtime Environment by Kimberly L. Tripp

Other Whitepapers, Websites, and Webcasts
Blog Entry: 8 Steps to Better Transaction Log Throughput
MSDN Whitepaper: An Overview of SQL Server 2005 for the Database Developer
MSDN Whitepaper: Processing XML Showplans Using SQLCLR in SQL Server 2005
MSDN Whitepaper: Using CLR Integration in SQL Server 2005
MSDN Whitepaper: XML Support in Microsoft SQL Server 2005
MSDN Whitepaper: XML Options in Microsoft SQL Server 2005
MSDN Whitepaper: What's New in FOR XML in Microsoft SQL Server 2005
MSDN Whitepaper: XML Best Practices for Microsoft SQL Server 2005
MSDN Whitepaper: Usage Scenarios for SQL Server 2005 Native Web Services
MSDN Whitepaper: Managed Data Access Inside SQL Server with ADO.NET and SQLCLR
MSDN On-demand Webcasts 
MSDN Live Webcasts 
SQL Server 2005 Hands-On Labs
         
SQLCLR Hands-On Lab Manual
Microsoft SQL Server TechCenter on TechNet
Sample Book Chapters for SQL Server 2005 is a list of chapters posted from a variety of authors for books related to SQL Server 2005.
Hosting the .NET Runtime in Microsoft SQL Server on the Association for Computing Machinery (www.ACM.org). To access this article you need membership in SIGMOD, the ACM, or you can purchase just this article for download.
Service Oriented Database Architecture by David Campbell, also on the Association for Computing Machinery (www.ACM.org). To access this article you need membership in SIGMOD, the ACM, or you can purchase just this article for download.

Gert E.R. Drapers' website
Microsoft SQL Server Development Customer Advisory Team
PDC Information Site

Well, if that doesn't keep you busy, I don't know what will!

Enjoy!
Kimberly

Categories:
Events | Resources | SQL Server 2005

This is a much needed and much overdue blog entry... In 8 Steps to Better Transaction Log throughput, I mentioned a customer that was helped by TWO typical optimization problems I see. In that blog entry, I said I would write two blog entries - that one on transaction log optimization and another on common tempdb optimizations. Well, I forgot...until I was reminded with an email this morning (thanks Marcus!).

First - a bit of understanding of TempDB - what goes there?

  • Internal temporary objects needed by SQL Server in the midst of other complex operations. For example, worktables created by a hash aggregate will be stored in TempDB or interim tables uses in hash joins (almost anything that shows as "hash" something in your query plan output is likely to go to tempdb).
  • User objects created with either # (for local temporary objects), ## (globabl temporary objects) or @ (table variables)
    • # = Local temporary object
      Local temp objects are objects accessible ONLY in the session that created it. These objects are also removed automatically when the session that created it ends (unless manually dropped).
    • ## = Globabl temporary object
      Global temporary objects are objects that are accessible to ANYONE who can login to your SQL Server. They will only persist as long as the user that created it lasts (unless manually dropped) but anyone who logs in during that time can directly query, modify or drop these temporary objects. These objects are also removed automatically when the session that created it ends (unless manually dropped) OR if being used by another session when the session that created it ends, when the session using it finishes using it (and it's only as long as any locks are held). If other sessions need more permanent use of a temporary object you should consider creating a permanent objects and dropping it manually.
    • @ = User-defined Table Variable
      User-defined Table Variables were introduced in SQL Server 2000 (or, wow - was it 7.0?) and provide an alternative to temporary tables by allowing you to create a variable defined as type TABLE and then you can populate and use it in a variety of ways. There has been A LOT of debate over whether or not you should always use table variables or always use temp tables. My response is that I ALWAYS avoid the word always! My point is that table variables are NOT always better nor are temp tables always better. There are key uses to each. I tend to like temp tables in scenarios where the object is used over a longer period of time - I can create non-key indexes on it and it's more flexible to create to begin with (SELECT INTO can be used to create the temp table). I also have the ability to use the temporary table in nested subprocedures because it's not local to the procedure in which it was created. However, if you don't need any of those things then a table variable might be better. When it is likely to be better - when you have smaller objects that don't need to be accessed outside of the procedure in which it was created and when you only need KEY indexes (a table variable ONLY supports the indexes created by a create table statement - meaning PRIMARY KEY and UNIQUE KEY).
  • Objects created by client applications - this is possibly a large part of your problem... Profiling can help you to determine if there's a lot of TempDB usage from your client applications.

OK, so now that you know what goes there - how do you make it optimal?

First and foremost, TempDB is in cache just as any other database is in cache. TempDB does not spill to disk unless you are low on cache and/or if you have a lot of inflight transactions in TempDB. Although TempDB is not persisted from shutdown to restart - it still needs to do some logging and therefore you should consider its optimization a lot like other databases.

Things you should do for TempDB (that are a lot like what you should do for every database):

  1. Isolate the data and log portion of TempDB.
  2. Place them on clean, defragmented disks.
  3. Pre-allocate them so they don't need to do a lot of autogrowth.
  4. Make sure you have sufficient memory to support active objects (check for disk activity to the disks that contain TempDB files).
  5. Make sure that transactions are written efficiently so that there are no unusually long running transactions that are unnecessarily holding resources (and therefore locks and therefore log activity).

And - if you need to move TempDB, you should review this KB Article: Moving SQL Server databases to a new location with Detach/Attach

Things you should do SPECIFICALLY for TempDB (especially if you're running on a multiproc machine):

Before I say what... let me tell you why? TempDB has a large number of objects being created all the time. For an object to be created, space must be allocated to it. Space allocation is determined by looking at some of the internal system pages (the GAM, and SGAM). In the end, it is these pages that start to have significant contention (with just one file) in a VERY active TempDB. To minimize that contention you can create multiple files.

  1. Consider creating multiple files for TempDB (even if on the same physical disks) so that there is less of a bottleneck when objects are being allocated. Make sure to read associated KB.
  2. Consider setting a trace flag to have object allocation grab extents rather than pages. Make sure to read associated KB.

BOTH of these last two are described in detail by a KB article: FIX: Concurrency enhancements for the tempdb database.

OK - so that should really help! Moving forward (meaning SQL Server 2005), having multiple files can still help for TempDB.

Effectively Designing a Scalable and Reliable Database

A Primer to Proper SQL Server Development

New Features in Indexing and Index Maintenance Best Practices, Part 5 of 10

Presented by Kimberly L. Tripp, SQLskills.com

Q: Can I view a recording of this webcast? Part 5 can be replayed by clicking here.

Q: Where can we get the demo scripts AND the sample database: Credit? The demo scripts are in this zip (20050902MSDNDemoScripts.zip (8.52 KB)); here in this blog entry. However, at the series completion, I will also create an entry under Past Event Resources for the entire webcast series.  To download the ZIP of the Credit Database Backup click here. Once unzipped, restore this backup to SQL Server 2000 or SQL Server 2005. The backup is a SQL Server 2000 backup and can be restored to either version! If restoring to SQL Server 2005, you might want to change the destination for the data and log file as the path will probably be different.

Q: Where are the links to all prior Webcast Q&As from this series?

Part 1: Creating a Recoverable Database
For the MSDN Download for Part 1, click here.
For the SQLskills Blog Entry for Part 1, click here.

Part 2: Creating a Reliable and Automated Backup Strategy
For the MSDN Download for Part 2, click here.
For the SQLskills Blog Entry for Part 2, click here.

Part 3: Designing Tables that Scale, Best Practices in Data Types and Initial Table Structures
For the MSDN Download for Part 3, click here.
For the SQLskills Blog Entry for Part 3, click
here.

Part 4: SQL Server Indexing Best Practices
For the MSDN Download for Part 4, click here.
For the SQLskills Blog Entries for Part 4
Resource links blog entry, click here.
Q&A blog entry, click here.

Q: How can I replay previous sessions? I thought we were going to get emails for replaying -- but I haven't received any replay emails. You will receive replay emails ONLY when you register for these sessions through MSDN. We’ve come to find out that there are other ways to register but it’s only through MSDN that we know for sure you will receive the replay information. Regardless, you can always find the “on-demand” version of the sessions here.

Related Resources

MSDN Webcast: Indexing for Performance – Proper Index Maintenance MSDN Whitepaper: Microsoft SQL Server 2000 Index Defragmentation Best Practices TechNet It’s ShOwtime Webcast: Index Defragmentation with SQL Server 2005 

Technical Questions

Q: In your script, what is "HA Requirements"? HA = High Availability. This is the requirement that your table stay online and available. Some companies are trying to achieve 99.999% uptime, this is especially challenging when even maintenance operations take a table offline.

Q: If you create extra indexes is there a easy to configure utility that you can run across an application after it has run for a few months to list keys that were never or infrequently used? Use one of the new DMVs: sys.dm_db_index_usage_stats. To see the complete list of DMV objects, use the following query:SELECT * FROM sys.system_objects WHERE [name] LIKE 'dm[_]%'

Q: What is DMV again? Dynamic Management View. These are new objects which give information about in-memory objects and state information.

Q: What are the parameters and their usage – for sys.dm_db_index_physical_stats? (DatabaseID, ObjectID, IndexID, PartitionNumber, Mode)

DatabaseID = [ NULL | 'DatabaseID' ] NULL: returns information for ALL databases, if NULL is used no other options can be supplied. This returns ALL indexes for all objects in all databases. Easy but possibly slow.
DatabaseID: smallint type. Refers to the ID for a specific database. DB_ID() or DB_ID('DatabaseID') can be used. The latter allows you to run this from ANY database as long as you have access. However, 3-part naming must be used.
ObjectID = [ DEFAULT | NULL | 'ObjectID' ]DEFAULT/NULL: return ALL base data: CL, Heap, LOB for the specified database.
ObjectID: int type. Refers to the ID for a specific object. OBJECT_ID('TableName') can be used. When using OBJECT_ID, you can use 1/2/3-part naming. Be sure to use 3-part when executing outside of database.
IndexID = [ DEFAULT | NULL | 'IndexID' ]DEFAULT/NULL: All indexes
IndexID: tinyint type. Refers to the ID of a specific index.
PartitionNumber = [ DEFAULT | NULL | # ]DEFAULT/NULL/0: return ALL partitions
#: returns only the details about specific partition. When a PartitionNumber is specified then an IndexID must also be specified.
Mode = [ DEFAULT | NULL | 'SpecificMode' ]DEFAULT/NULL/LIMITED: return FAST scan and use only an IS (Intent Shared) Table-level lock. This lock blocks ONLY eXclusive TABLE-level locks and schema changes. Excellent, relatively unobtrusive way to get fragmentation details.
LIMITED: IS Lock. Same as SQL 2000 WITH FAST, only page counts and EXTERNAL fragmentation displayed. Does not detail INTERNAL fragmentation and page density.SAMPLED:  IS Lock. For tables less than 10,000 pages (~80MB), all details are produced. For tables of more than 80MB, two samples are done (1% and 2%) at every nth page. The samples are compared and if close, 2% sampling output returned. If not close, then up to 10% will be sampled.DETAILED: S Lock. Entire table analyzed for both internal and external fragmentation. Returns one row for each level of the index from the leaf level (level 0) all the way up to the root level. This can help you determine fragmentation in the non-leaf levels but at the expense of holding a shared table level lock.
Q: How often should you run DEFRAG on your SQL server box? Should this be a part a regular schedule? Taking down SQL is their any other consideration? First, the only thing that’s not available is the table being REBUILT. Defragging an index does not take that table/index offline. So, more than anything, it depends on what you’re trying to achieve. If you want achieve better availability on SQL Server 2000 then you might choose to defrag rather than rebuild – to keep your tables available.

Q: How often do you get such perfect tables in practice? A table is always completely clean and contiguous after a rebuild. To periodically fix a table, you should use consistent and automated rebuild strategies.

Q: Do you have suggestions for developers using MSDE when customer’s demands can vary? Vary from few transactions to a large customer with many transactions. The general best practices in database and table design scale from the low end all the way up to the high-end and in the end – helps your database scale!

Q: Can you touch on rules of thumb for "pad index"? If fragmentation in the leaf level is minimized through proper index maintenance and fillfactor – then fragmentation in the non-leaf levels should be low as well. You rarely need to specify padindex unless you have widely varying distribution of data and really want to leave larger gaps because of strange densities of data.

Q: Do most of these "Index Rules" apply to Indexed-Views? Yes! All indexes can become fragmented after data modifications... Your scripts should always look for fragmentation across all scripts.

Q: Can you discuss fragmentation WRT horizontal partitioning, especially range partitioning on the primary key? SQL Server 2005 offers more granular rebuild options –but not necessarily online. In many cases, you might want to design a read-only partitioned table and keep the volatile portion of the table (especially if only one partition), in its own separate table – possibly using a partition view (or an inline table valued function) over these two tables.

Q: If I'm selecting from a table with a where FirstName = ... and LastName = ... and I have 2 indexes, one on LastName and another on FirstName. Are they both used? With an AND – maybe. The optimizer will look at the Index statistics to determine if either of them selective enough to use only one index. If neither is selective alone and a better index does not exist (a better index for AND would be one that includes BOTH of the columns in the SAME index – as a composite index), then SQL Server may choose to join the indexes (index intersection).

Q: URLs on the Resources slide can't be read. Could you type then into the Q&A, please? When the session is available for download (which is what happens when MSDN put this online), then you can access the URLs there as well. Typically, I place all of the links at the beginning of the Q&A – resources section. I’ll make sure to do this consistently!

Q: How does an uniqueidentifier used as a clustered primary key effect performance? This is best answered by session 4. In short, a non-sequential GUID can cause a lot of fragmentation.

Q: What is ExtentFragmentation as reported by DBCC SHOWCONTIG and is it less important than Logical Fragmentation? Extent Fragmenation refers to how many extents are next to each other. This is a bit more important than Logical Fragmentation as logical fragmentation shows whether or not the pages are next to each other.

Q: How much danger is there in the defrag processes? What kind of backup procedures do you suggest when you defrag? More frequent transaction log backups. A defrag generates a lot of log information. However, it does so in mini transactions. As a result, transaction log backups can occur concurrently with the defrag process and even though the defrag is not complete, the transaction can still be cleared because the defrag process runs as small transactions instead of one long running transaction. This also improves concurrency because the locks are released throughout the process.

Q: Defraging a large index can cause the log file to grow quite large. Is there a way to minimize this other than frequently log backups? Yes, you’re correct – defraging a large index WILL grow the log file quite large! As for minimizing this activity in the log – no way to do that. But – you’re correct in increasing the frequency of log backups!

Q: With very large tables, how much available disk space (both transaction logs and data drive) do you need to have to rebuild? Does it take less space to defrag than to rebuild? Well, this is really a multipart question… First, log space for rebuilds is mostly dependent on the recovery model. If you are running the FULL recovery model then creating and/or rebuilding indexes will take enough log space for the entire rebuild to complete. If you are running in the BULK_LOGGED or SIMPLE recovery models then this operation will run as a bulk operation and will be minimally logged. While this will take less time and significantly less log space, you are giving up some features when switching recovery models. I would strongly suggest reviewing the second session to see if this is appropriate.Now, as for data space – a rebuild will always require at least the table size in free space and possibly as much as double (if an online rebuild is being performed). Typically, when space estimates are being done (when capacity planning the database) I always recommend taking the largest table size and multiplying it by 2 or 3 – in order to make sure you have enough space for rebuilds. There is space needed for sorting as well – this can come from the database OR from tempdb (using the SORT_IN_TEMPDB option).Defraging doesn’t move an object so it doesn’t take additional data space BUT it does require more overall log space because it runs as mini transactions instead of just one.

Q: Should we look at different fragmentation stats if there are multiple files in the same filegroup? No, you still want to review average fragmentation. However, you may have more “fragments” in a table that spans filegroups; this does not necessarily mean that your table is fragmented.

Q: Are there any good third party tools for checking fragmentation and performing maintenance? Unfortunately no revolutionary ones (that I know of and/or can recommend)...but I still have high hopes :)

Q: How do you determine the appropriate fill factor? Unfortunately, there isn't a magic number... but, you can test your guestimate by seeing how fragmented the table becomes between your regularly scheduled defragmentation routines.

Q: Does it matter if I build the clustered index before/after rebuilding the non-clustered indexes? You should always create the clustered index before creating non-clustered index but as for rebuilding - you can rebuild your indexes independently as a rebuild of the clustered does not (generally) cause a rebuild of the non-clustered. There is one exception to this rule in SQL Server 2000 – in 2000 ONLY, SQL Server will automatically rebuild the non-clustered indexes when a non-unique clustered index is rebuild. Why? Because the uniqueifiers are rebuilt.

Q: Will doing a defrag followed later by a rebuild decrease the work of the rebuild? Not really. A defrag doesn’t move the object – only a rebuild does. However, you might minimize the cost of the sort…

Q: How does cache map to table pages, i.e., does free space in table pages have a 1:1 correspondence to wasted cache? SQL Server reads the 8K page from disk into memory. The number of bytes that are wasted on disk are also wasted in memory. This is often the motivation for vertical partitioning! You might refer back to session three for more details on row/page structures!

Q: If switching a varchar cluster to a bigint and vice-versa in 2000, what would the best order of drop/create index? Actually, this is the reason that CREATE with DROP_EXISTING was created… so that you could “change” the definition of the clustered

DROP TABLE test
go
CREATE TABLE test
(
      testid      int               not null,
      col1       varchar(100)      not null
)
go
CREATE CLUSTERED INDEX testind ON test(col1)
go
CREATE CLUSTERED INDEX testind ON test(testid) WITH DROP_EXISTING
go
sp_rename 'test.testind', 'NewIndexName', 'INDEX'
go
sp_helpindex test
go
Q: What about instances of one name only? (like Madonna, Cher, etc. ;^) Well, this is a good question and this is something that you might need to plan for in design. In these cases, you might allow NULLs in the lastname column and then make sure to search both when a lookup is performed. To be honest, I probably won’t do all that much to find these special first names – unless you wanted to do searches across both columns without knowing whether or not this is a first or last name. You might do something like this in a lookup
SELECT * FROM NamesTable
WHERE LastName = @variable
      OR (FirstName = @variable AND LastName IS NULL)
Comment: Just wanted to say I appreciate the blog you have put together.

Thanks for the thanks! It's a lot of work but I think it's great as a reference!! Even for me! To be honest, I can't always remember where to find things either! J

Thanks! So – we’re half way there – 5 more to go! And, lots more questions coming I’m sure J For the next session, we’re going to cover Isolation and options in Isolation in SQL Server 2005. If you’re interested in hearing more isolation, locking/blocking – here’s the registration link:MSDN Webcast: A Primer to Proper SQL Server Development (Part 6 of 10): Mixed Workloads, Secondary Databases, Wait States, Locking and Isolation See you on Friday!

kt

Effectively Designing a Scalable and Reliable Database A Primer to Proper SQL Server Development
Indexing Best Practices, Part 4 of 10
Presented by Kimberly L. Tripp, SQLskills.com For a list of the non-technical Resources related to this webcast, please review the following blog entry:MSDN Webcast Q&A for Resources: Indexing Best Practices, Part 4 of 10  Technical Questions

Q: When a clustered index is added to a table what happens physically to the data pages? SQL Server moves the data to the new location defined by the ON clause of the index creation statement. The ON clause can specify a different filegroup or even a partition scheme and the heap will move to the new location.

 Q: I'm not getting the follow-up emails with the replay link. How do I get on this email list for replay?

You *must* register through MSDN. If you are using a partner site then that is probably why you’re not getting the email. IF you are registering through MSDN and still not receiving the email, please contact eventsup@microsoft.com with your information.

 Q: Does the copy affect the transaction log?

Yes. A create index or rebuild index statement does require the statement to be a single transaction. If you are running with your database set to the FULL recovery model then this single transaction will create a lot of log activity. You could switch to the BULK_LOGGED or SIMPLE recovery model and get less log activity but switching to the SIMPLE recovery model should be avoided as it will break the continuity of the transaction log. I would recommend two resources to get more information about Recovery Models and their impact on performance, logging and recovery.

MSDN Webcast Q&A: A Primer to Proper SQL Server Development, Creating a Reliable and Automated Backup Strategy, Part 2 of 10

Review the sample chapter: Chapter 9: Database Environment Basics for Recovery from the MSPress title: Microsoft SQL Server 2000 High AvailabilityQ: Can one switch back and forth between simple and full recovery mode?

Ideally, you would switch ONLY between the FULL Recovery model and BULK_LOGGED Recovery model but not with SIMPLE. The two resources listed above will also help to clarify this point.

 Q: What about re-building a clustered index? Will it be that slow too? Is it recommended to do it off-hours?

Rebuilding a clustered index takes a lot of the same requirements that creating the CL index does. It can also be slow and yes – as a result, you might want to do this off hours. You can speed up the process by changing recovery models but switching recovery models can also have an impact on recovery. See the two resources above for more details to help clarify this point. And, in part 5, I’ll cover more of the fragmentation details.

 Q: When is it desirable to move a table?

This is not necessarily an easy one to answer simply. But, I often recommend separating transaction processing data from read-only data as well as possibly placing a large table on its own filegroup (especially true with partitioned objects). But, I don’t often recommend separating indexes. Here’s a link to a SQL Server Magazine article I wrote – it does give some guidance. In general I don’t separate too many objects. However, if you’re building a database that you will put into a read-only state you might want to work a bit harder to “shift” objects to filegroups in such a way that each of those objects gets a nice and clean and contiguous chunk of space within the file. Another time that it’s useful to “move” an object is when you load data into a heap and then want to build the clustered index on another filegroup – where that heap will become a partition of a partitioned table. This is good when the clustered index will move the object into a filegroup that does not have other objects in it. Again, the clustered index will be nice and clean and contiguous in the destination filegroup.

 Q: So, if I understand correctly… a good approach to creating a table would be to have a clustered index independent of the primary key. So, ideally you could use a GUID for the PK and then have a clustered index in order to minimize the fragmentation.

The most ideal combination is when the clustered index is on an identity column – and even better if this is the Primary Key. However, if you do want to keep a GUID as the primary key then yes, you can create a clustered index which is independent of the primary key.

Q: In general, do you think folks have more problems setting up indexes for OLTP, OLAP/DWH, or Web apps?

In general, I think Decision Support environments can have more indexes with less of a negative… OLTP is a bit harder to find the right balance. The hardest is combination environments…

 Q: What was the diff between 0 and 1 on the third argument of DBCC IND?

The third parameter is for Index ID. 0 = Heap, 1 = Clustered Index, 2-250 are non-clustered index ids.

 Q: Should we keep the install defaults like Collation, and others? Or do you have recommendations?

Generally speaking, yes – I would recommend keeping the defaults. However, I do recommend that some development environments – especially those who are producing software for sale. When the database can be installed/setup on other servers that may have different code pages, etc., you might want to use binary case-sensitive sort. This will make your application more consistent – even with case-sensitive servers…and you never know if one of your customers may have one! It will also help to make you a better coder.

 Q: What design considerations should I take into account when planning my primary keys, indexes and clustered indexes for databases that will be used with replication?

There are no specific requirements for Replication other than the fact that you must have a Primary Key… but the same best practices discussed today apply and should be used in replication as well.

 Q: I am using SQL Server 2000. I have a table that is used a lot on the web server. The scan density is 97%. Should I adjust the indexing and/or Primary Key?

I tend to think more about how fast it drops rather than the actual number. But – 97% is pretty good in general. What you might want to do is a rebuild IF you have off hours time to rebuild (in SQL Server 2000 rebuilding can be done offline ONLY). Otherwise, you might just want to setup a defrag procedure. However, this is NOT

 Q: Why are so many SQL commands undocumented and how do you get to know how to use them if they are undocumented?

First and foremost, I want to strongly urge you to take your time with commands that are undocumented. If the commands are really useful for day-to-day operations, they’re probably already documented. Occasionally, development teams add additional functionality during development to test their functionality. These commands – while proving useful in some cases – do not generally go through the same quality assurance procedures that fully documented commands go through and as a result, I would be very cautious before using most undocumented commands.

 

Having said all of that, you’ll find a lot of useful tidbits in articles, on webcasts, in books, etc.  Just make sure you test the command thoroughly and don’t expect to get support on it. In general, undocumented commands are NOT supported.

 Q: What is the command for scan density on SQL Server 2000?

DBCC SHOWCONTIG. If you’re interested in learning more about defragmentation on SQL Server 2000, please review this MSDN Webcast: SQL Server 2000 Performance Tuning: Index Defragmentation Best Practices

 Q: In SQL Server 2000, DBCC SHOWCONTIG, can you explain extent scan fragmentation and if we should pay attention to it?

We’re going to go over this more with our next session on Index Defrag but Extent Scan Fragmentation describes whether or not the extents that an object owns are “next” to each other. If there are gaps then this implies that there are other objects “interleaved” with this one. This is not ideal, defragmenting the object will help to improve scans and should.

 Q: The size of the database doubles when a clustered index is added?

Once the index is created the space needed to build the index is released. However, while the index is being created you need roughly 2-3 times the TABLE size to build it. If you have a table which is 1GB then you should plan for at least 2-3GB of free space in order to build or rebuild a clustered index. However, it’s only the table’s size that doubles and it’s only during the index creation NOT after the index has been created.

 Q: Aren't GUIDs too large to use for an efficient index?

No, an index can be useful even if it’s a large percentage of the size of the table – mostly, because of the order of the columns and the criteria used in the query. Hard to give a fast answer to this but in general 16 bytes isn’t so bad – much larger may even be acceptable.

 Q: Does level 1 of a clustered index point to rows, or does it describe the minimum and maximum row on a page?

Basically, it’s a pointer to the page and it has a pointer to the lowest value on the page. Each pointer in the non-leaf always points to the “first” value on the page (meaning the lowest).

 

Q: Why would you be inserting using an index other than using the primary key? Assuming the primary key is appropriate, I wouldn’t. However, there are a lot of environments where the primary key is chosen and kept – for a variety of reasons. SQL Server creates the clustered index on the Primary Key by default but it is not required’ you can create the clustered index on something else. It’s the clustered index which defines the insert order – not the Primary Key; however, they are often one and the same.

 Q: Should all indexes be clustered if unique? Would this increase performance on search results from SP?

I think the question should really be should all clustered indexes be unique. Generally, the answer to this is yes. But to answer the second part – would this increase performance on an SP – the answer here is not necessarily. Stored procedures have many reasons for being – or not being – optimal. Plan choice, optimization and recompilation issues all play a major factor in whether or not a stored procedure is optimal. However, in session 7, I will talk about plan caching and stored procedure optimizations!

 Q: If there is no natural clustering key, what is the advantage of "coming up with one". Or, said another way, what is the basic reason why I want a cluster rather than a heap?A heap does not inherently have order to it (that’s the point) but when the records are not ordered then SQL Server needs to find a location in which to insert the row. This causes a lookup in something called a PFS (page free space) as well as a lookup in the table’s Allocation Map (actually called an Index Allocation Map in SQL Server 2000 and a Heap or B-Tree Map in SQL Server 2005). To help the insert performance you generally want to have the insert location defined. However, just having a clustered index is not the best answer as there are other internal dependencies on the log. Because of these internal dependencies you typically want the clustered index to be unique, narrow and static.  If you’re interested in reading more about the reasons behind having a clustered index which is unique, narrow and static:Ever-increasing clustering key - the Clustered Index Debate..........again! MSDN Webcast Q&A: Index Defrag Best Practices - Clustering Key Choice In some cases this means adding a column to a table solely for the purposes of clustering on it. Q: Is it a good practice to just have non-clustered indexes?Well, that’s a different way of answering the prior question…and while heaps can be useful for high performance loading they are generally not as useful for day to day OLTP or combination OLTP/DSS tables. So, my recommendation is almost always – the RIGHT clustered index + additional supporting non-clustered indexes. Q: In large OLTP databases is it recommended to move reporting to a separate database which has a lot of non-clustered indexes and no clustered indexes?No… Again, it’s not really a binary thing (should you always have a clustered, or not always have a clustered). Really, there are cases for both BUT in most databases – even DSS (Decision Support System) – a clustered index helps to define a narrow and unique value that all indexes have in common. This can help other operations…  So, even in a DSS environment I generally recommend a clustered, as well as additional supporting non-clustered. Having said that there are cases where a single clustered – setup for order by and range queries can be useful but that’s generally only when the large majority of queries all want * (all columns) and you typically want the exact search and order by in almost every case. Q: I've read that you shouldn't cluster a column that is an Identity column because it hurts performance for OLTP as all writes are on same page/area of disk? Your thoughts?

There are some cases where insert performance can be compromised – in very high volume OLTP systems (typically more than 400 inserts per second but this can vary) but in general, inserting into the same “hot” spot of a table improves performance because the activity is isolated and the needed pages are already in cache. Lots more on this on in other Q&As.

 Q: If I add an identity column to my table to have a good clustered key, it will never be used in a query. Would I naturally also include the primary key, even though it would make the clustered key less narrow?

You don’t need to include the primary key is the clustered key is already unique, narrow, static and ever-increasing.

 Q: Does this mean that vertical partitioning of a table is only helpful when the table is scanned?

No, there are still many benefits in vertical partitioning – such as with locking, scanning. I’d refer you to review the Webcast that was part 3 in this series, for more details.

 

Q: Can you use DTA against SQL 2000 databases?

Yes. You can use either the DTA or the ITW against SQL Server 2000. If you're looking for details about how to use the SQL Server 2000 Index Tuning Wizard you can review this whitepaper: Index Tuning Wizard for Microsoft SQL Server 2000.

 Q: Is DTA in the Express version, too?

DTA does not come with the Express Edition but you can certainly do tuning on a higher version and then use that tuned database in Express.

 Q: Can you explain briefly Index View?

It’s a materialized view – in the sense that the data as defined by the view – is actually materialized and stored on disk. This has both pro’s and con’s and needs to be evaluated fully before you implement these. Please check out these whitepapers for more details:

 Improving Performance with SQL Server 2005 Indexed Views by Author: Eric Hanson and Contributors: Gail Erickson, Lubor Kollar, Jason WardImproving Performance with SQL Server 2000 Indexed Views by Author: Gail Erickson and Contributors: Lubor Kollar, Jason Ward contributor Q: Do you think one can become expert in SQL Server and other aspects of Design/Programming or does one need to specialize?

Wow, I think it this is a tough question; more than anything I would recommend learning all of the options available so that you can better architect the right solution. Once you think you have a good solution, then move forward to really learn that area/topic/features. I’ve been using a phrase with SQL Server 2005 a lot – Jack of all trades, master of some...

 Q: Can you repeat the new function INCLUDE on index?

In SQL Server 2000 the maximum size of an index key is 900 bytes or 16 columns – whichever comes first. This key helps to keep an index’s b-tree smaller and more scalable. In SQL Serer 2005, the leaf-level of the index can include additional non-key columns. This continues to help keep the tree structure scalable but also allows you to cover more queries.

 Q: I have installed SQL Server 2005 CTP, but can only find SQL Configuration Manager. I don't see the management tools that you are using. Any suggestions?

Actually, no. This seems a bit strange?! I might try uninstalling and re-installing, as well as reviewing the setup logs to see if there are errors within them. For SQL Server 2005, all installation log files are stored in C:\Program Files\Microsoft SQL Server\90\Setup Bootstrap\LOG.

 Q: if your non clustered index either includes all or part of the clustered index are the clustered index columns added again to the non clustered index?

NO! SQL Server only adds the column(s) of the clustering key – if they are missing.

 Q: Did you mention the cool timebound option in DTA?

Yes, but only briefly. Vipul Shah discusses DTA in greater detail in these two webcasts.

 TechNet Webcast: Performance Diagnosis in SQL Server 2005 (Level 300) by Vipul ShahTechNet Webcast: SQL Server 2005 Database Tuning Advisor (Level 300) by Vipul Shah Q: What is the fill factor and how is it used?

I think I’m going to defer you to the next session! Now, there’s some motivation. J

 

And here are a few comments that I’ve received! I just wanted to say thanks for your comments and thank yous… J It's your questions that keep me on my toes and current. This is still fun so keep it coming.

 

Comment: Thank you again, particularly for thoroughness and enthusiasm.

Comment: Let me add my voice to the chorus of "thank you's", Kim. You're  presentations and verbal Q&As are great, and your online ones are a truly generous gift to the community. No other webcaster has come close to doing what you do. Many thanks from all of us.  

Comment: I must compliment you on your Q&A in your blog - I've just checked it out - couldn't be more comprehensive!

 

So – 4 down, 6 to go! Lots more questions coming I’m sure J

 

If you’re interested in hearing more about fragmentation, how to view it, how to clean it up and finally, how to prevent it – here’s the registration link:

MSDN Webcast: A Primer to Proper SQL Server Development (Part 5 of 10): New Features in Indexing and Index Maintenance Best Practices (Level 200) Thanks!kt

Effectively Designing a Scalable and Reliable Database A Primer to Proper SQL Server Development

Indexing Best Practices, Part 4 of 10

Presented by Kimberly L. Tripp, SQLskills.com

Q: Can I view a recording of this webcast? The webcast will be available for viewing within 24 hours. All “on-demand” webcasts can be found here.

Q: Where can we get the demo scripts AND the sample database: Credit? The demo scripts are in this zip (20050826MSDNDemoScripts.zip (12.17 KB)); here in this blog entry. However, at the series completion, I will also create an entry under Past Event Resources for the entire webcast series.  To download the ZIP of the Credit Database Backup click here. Once unzipped, restore this backup to SQL Server 2000 or SQL Server 2005. The backup is a SQL Server 2000 backup and can be restored to either version! If restoring to SQL Server 2005, you might want to change the destination for the data and log file as the path will probably be different.

Q: Where are the links to all prior Webcast Q&As from this series?

Part 1: Creating a Recoverable Database
For the MSDN Download for Part 1, click here.
For the SQLskills Blog Entry for Part 1, click here.

Part 2: Creating a Reliable and Automated Backup Strategy
For the MSDN Download for Part 2, click here.
For the SQLskills Blog Entry for Part 2, click here.

Part 3: Designing Tables that Scale, Best Practices in Data Types and Initial Table Structures
For the MSDN Download for Part 3, click here.
For the SQLskills Blog Entry for Part 3, click here.

Q: How can I replay previous sessions? I thought we were going to get emails for replaying -- but I haven't received any replay emails. You will receive replay emails ONLY when you register for these sessions through MSDN. We’ve come to find out that there are other ways to register but it’s only through MSDN that we know for sure you will receive the replay information. Regardless, you can always find the “on-demand” version of the sessions here.

Resources mentioned in today’s Session:

If you’re interested in hearing more of the theory behind the indexing recommednations, as well as more of the reasoning behind unique, narrow and static criteria for the clustering key –  watch the TechNet "It's Showtime" recording, recorded at Tech Ed Europe titled: Index Creation Best Practices with SQL Server 2005 which covers Index Creation Best Practices on SQL Server 2005 and really gets into a lot of the "why's". If you’re interested in hearing a lot more about how to use Profiler effectively with ITW, watch the MSDN Webcast titled: SQL Server 2000: Performance Tuning - Finding the Right Balance of Indexes which covers Index Creation Best Practices on SQL Server 2000.

As for whitepapers:

SQL Server 2005 Beta II Whitepaper: Partitioned Tables and Indexes by Kimberly L. Tripp

SQL Server 2005 Beta II Whitepaper: Snapshot Isolation by Kimberly L. Tripp

PREVIEW: The Database Administrator’s Guide to the SQL Server Database Engine .NET Common Language Runtime Environment by Kimberly L. Tripp

Improving Performance with SQL Server 2005 Indexed Views by Author: Eric Hanson and Contributors: Gail Erickson, Lubor Kollar, Jason Ward

Improving Performance with SQL Server 2000 Indexed Views by Author: Gail Erickson and Contributors: Lubor Kollar, Jason Ward contributor 

Additional Webcasts:

TechNet Webcast: Performance Diagnosis in SQL Server 2005 (Level 300) by Vipul Shah

TechNet Webcast: SQL Server 2005 Database Tuning Advisor (Level 300) by Vipul Shah Support WebCast: SQL Server 2000 Profiler: What's New and How to Effectively Use It by Sri Kasam and Ajay Manchepalli TechNet Support WebCast: I/O performance problems and resolutions in Microsoft SQL Server 2000 by David G. Brown TechNet Webcast: SQL Server 2005 Series (Part 7 of 10): Indexing Enhancements (Level 200) By Bryan Von Axelson

Summary!

Finally, check out the Indexes category on my blog – for a variety of Q&A postings, other resources and lots of great links! The rest of the technical Q&A, I should be able to post soon. Stay tuned! Thanks,kt

MSDN Webcast: A Primer to Proper SQL Server Development
Designing Tables that Scale, Best Practices in Data Types and Initial Table Structures, Part 3 of 10

Presented by Kimberly L. Tripp, SQLskills.com

Q: Can I view a recording of this webcast? Here’s the specific replay link.

Q: Where can we get the demo scripts? The demo scripts are in this zip (20050819MSDNScripts.zip (3.67 KB)); here in this blog entry. However, at the series completion, I will also create an entry under Past Event Resources for the entire webcast series.

Q: Does Kimberly L. Tripp have a blog? Yes, www.SQLskills.com/blogs/Kimberly and if you’re reading this on my blog…well, then you already know! J

Q: Where are the links to all prior Webcast Q&As from this series?

Part 1: Creating a Recoverable Database
For the MSDN Download for Part 1, click here.
For the SQLskills Blog Entry for Part 1, click here.

Part 2: Creating a Reliable and Automated Backup Strategy
For the MSDN Download for Part 2, click here.
For the SQLskills Blog Entry for Part 2, click here.

Q: Is this Session 2 of the "Effectively Designing a Scalable and Reliable Database" webcast series? The Title bar here reads "MSDN Webcast: MSDN: A Primer to Proper SQL Server Development (Part...)". A bit misleading don't you think? So, I finally figured out the mystery here. When I originally submitted my proposal for this series, I titled it like this: Effectively Designing a Scalable and Reliable DatabaseA Database Developer’s Primer to Proper SQL Server Development (Level 200-300) But somehow, through a few edits and since each module also has a more detailed title shown in this following list, then I think the title just got edited down to fit. In fact, with subtitles like these, I can barely fit the entire title, subtitle and session title on a single slide. J

Part 1: Creating a Recoverable Database

Part 2: Creating a Reliable and Automated Backup Strategy

Part 3: Designing Tables that Scale – Best Practices in Data Types and Initial Table Structures

So, the mystery ends here… It’s all the same content and the general idea is that building a scalable and reliable database takes a variety of best practices that together make it happen; there is no “DBCC MAKEITGOFASTER(‘TRUE’)” option. I'll unveal a new title slide in part 4... oh, the excitement!

Q: How can I replay previous sessions? I thought we were going to get emails for replaying -- but I haven't received any replay emails. You will receive replay emails ONLY when you register for these sessions through MSDN. We’ve come to find out that there are other ways to register but it’s only through MSDN that we know for sure you will receive the replay information. Regardless, you can always find the “on-demand” version of the sessions here.

Q: What does CLR stand for? Common Language Runtime. Quick WHATIS.COM Definition here.

Q: If something uses sp_OA* now in SQL 2000, am I guaranteed it will use SQLCLR in SQL 2005? Not automatically; however, you should consider rewriting you sp_OA* is used to automate COM objects; these, by definition, are not framework objects. Moving to SQL Server 2005, you can continue to run with your previous sp_OA procedures to access COM objects but if you can write SAFE .NET Frameworks code to do the same thing then you should convert your sp_OA* code where possible. Not everything can be converted but the basic principle is that which you use sp_OA* is a good thing to target for conversion. For some great details on the differences and how to move forward – go here. Also helpful is this KB Article regarding COM and .NET in SQL Server 2000 Using extended stored procedures or SP_OA stored procedures to load CLR in SQL Server is not supported

Q: I loved how XML data is available interactively in SQL Server Management Studio (SSMS); can HTML data columns be displayed in the same way? Yes, if you store HTML in a column of type XML then the column data will appear as a link and when clicked, you can edit it in the XML Editor window in SSMS.

Q: Is the "XML" datatype really a "varchar(max)"? No. In previous releases the XML type really was just a “blob” stored in the database. There were no inherent optimizations and you were not able to define a scheme for an XML column. In SQL Server 2005, XML data is stored natively and offers the ability to put indexes on the columns as well as better access and manipulation. You *can* store XML as varchar(max) but it will be harder to access/code. There’s a very nice description of the tradeoffs in this whitepaper:XML Options in Microsoft SQL Server 2005 

Q: What is the difference between numeric and decimal? In SQL Server they are synonyms. I seem to remember hearing that there was a difference in the standards…somewhere but with a few references and looking around I can’t find any. Regardless, I’d make sure that you standardize on one or the other – just in case they do change, at least all of your data is consistent.

Q: I use decimal datatype for money values. Why it may be suboptimal to use decimal (instead of money datatype)? I wouldn’t say that it’s suboptimal – except maybe in storage length. However, decimal can be more precise.

Q: When should you use a float? A number of my clients use Float – i.e. Banks, Stocks, etc. What would be your argument against this? Just the lack of an absolute minimum precision/scale.

Q: We have a database where datatypes are set. But to use a simple percentage calculation we had to add .0 to the calculation to get the correct percentage. For example 100(5/9) is truncated to 0, but 100((5+.0)/9) yields correct result. Is there any standard procedure without massive conversions to assure a correct result? It’s a matter of data type. In the case of 100, 5 and 9 – these are all integers – without any decimal place. So – the division of 5 divided by 9 yields 0. However, 5 divided by 9.0 or 5.0/9 – both have a decimal and in that case the other data types are implicitly converted to the higher/more precise data value. A cool way to find out what the actual “base type” and/or the level of precision/scale is to ask these questions as if the data is of type sql_variant:

SELECT SQL_VARIANT_PROPERTY(5/9.0, 'BaseType')
SELECT SQL_VARIANT_PROPERTY(5/9.0, 'Precision')
SELECT SQL_VARIANT_PROPERTY(5/9.0, 'Scale')   -- numeric(8,6)
SELECT SQL_VARIANT_PROPERTY(5/9, 'BaseType')
SELECT SQL_VARIANT_PROPERTY(5/9, 'Precision')
SELECT SQL_VARIANT_PROPERTY(5/9, 'Scale')   -- int

Q: I am surprised by the nchar/nvarchar suggestions. I've always been told to NOT use these data types. Can you give the "reasoning" for this suggestion here? The primary reason that you want to use Unicode data is to avoid a cost in “translation” between the client and the server – when you’re using web-based applications that natively use Unicode. Additionally, with ever growing customer bases – you can more effectively store international data properly by supporting Unicode in the client and in the database. You are correct that these take 2 bytes per character rather than one – but that’s what gives them the larger range of characters to use/display/manipulate. Also, if you plan to use the SQLCLR against these columns, you will need to use Unicode as opposed to ASCii as the SQLCLR requires it.

Q: Does Unicode use twice as much space? Yes, Unicode columns are 2 bytes per character. If you defined a column of type nchar(100), that column will require 200 bytes of storage.

Q: Sometimes, and not always, it becomes a tradeoff in Storage Space v. Performance, doesn’t it? That’s an excellent point. You always need to look at the trade-offs associated with any decision and in fact, this is yet another. So, I always want the row size to be as optimal and as reasonable as possible. However, there are many compelling reasons – international and/or web-based applications as well as SQLCLR functionality.

Q: Is the N' keyword usable in MSSQL 2000? Yes, all Unicode character strings should be preceded with an upper case N in all releases where Unicode is supported (SQL Server 7.0 was the first release that supported Unicode).

Q: Is there any relationship between the OS's code page and nvarchar/varchar? Yes. When SQL Server is installed- it chooses the server’s default code page from the Operating System. A code page (or collation) defines the order in which characters are evaluated. The additional attributes a collation can define are: whether or not the characters are case-sensitive; the sensitivity of width and kana-type. This may not seem obvious but if these characters sets are native to you then the order to the data would probably make more sense.  Once the collation is set for the server, the database inherits the server’s collation by default, when tables are created – each character-based column inherits the database’s default collation by default. However, at ANY of these levels the collation can be changed. There are benefits to doing so – in that you can keep data within a specific table ordered for a specific character set. Wow, now that I’m getting started, this could be a session in and of itself. So, just to give you a few tips:

  • Be careful using temporary tables if your database’s collation is different than your server’s collation.
  • Changing collation on the fly (for example, if you want to do case-sensitive searching through a view – of which I’ve included a sample file) can be expensive.
  • Changing a database’s collation is supported but you might have troubles converting from a case-sensitive to a case insensitive database.

The files that can help you to get more familiar with verifying collations, changing collations, etc. are CaseSensitiveSearchingInView.sql and ChangingDatabaseCollation.sql.

Q: Another great use of adjunct tables is to isolate culture-specific data from the base (and usually indexed) data. So, this is a great point. If you change the collation for a column – and index it – then you can get better performance when accessing that column. Changing a collation on the fly (as in the script sample for the previous questions), is what’s expensive. However, you should make sure to do a lot of testing.

Q: I have a US code page OS but Latin general 850 collation set for SQL Server; I have no problems storing and retrieving data. Exactly! That’s the benefit of collation information being stored with each column. It allows the application to store many types of data – natively and then just deal with their display at the client.  For those of you who want more details on how to do this, see COLLATE in the books online.

Q: Somewhere along the line I was always told to define varchars in multiples of 8 characters. Something to do with space allocated with new rows.  Any ideas if it has any premise? Well, my first response to this was that I’ve never heard of this… So, I thought I’d check around and well, the couple of folks I’ve chatted with haven’t heard this either… IF anyone has – feel free to comment on this blog.

Q: How about images? Is it better to store them in the db or just the url? This is really an old debate. I still think there’s validity to both sides of it as well. I’d say that you should go through the pro’s and con’s and really choose what’s best for your application. If you use a lot of urls then you might be able to reference more without having to manage a copy of that data… If the url is your data then you might find that managing it in the database (from a backup/restore perspective) is easier – and you know that it’s always going to be there v. the url occasionally not working. Wow, I’d almost like to get a panel of folks together to have everyone hash out their opinions on this one. J SQL Server 2005 does make managing images easier but there are still points to both arguments!

Q: How would SQL Server support becoming a data server of video file? The only real format for this data would be binary. However, there is a 2GB limitation to a single value.

Q: Are you aware of a stored procedure to handle blob data effectively? i.e. for image storage? No. But there is an executable that comes with SQL Server 2000’s samples and I think it’s called textcopy.exe. That might be some good sample code to review.

Q: Does SQL Server support the SQL 2003 NULLS FIRST and NULLS LAST keywords of the ORDER BY clause? No support for NULLS FIRST or NULLS LAST but the order by can include ASC for ASCending or DESC for DESCending. NULL values are treated as the lowest value so NULLS FIRST would be ASC and NULLS LAST would be DESC. But – that’s the only support for specific placement of NULL values.

Q: Is there a "best practice" for the new C# nullable types? i.e. int? vs. int -- re:MSSQL2005? When programming with the SQLCLR, always use types that allows NULLs (assuming your data can contain nulls) and then make sure to always follow the appropriate behavior with NULL. For example, concatenating NULL should yield NULL.

Q: Can I change (refactor) the underlying type of a UDT (e.g., from int to bigint) in SQL Server 2005?  Unless I've missed something, I can't do it in SQL Server 2000 unless I drop the type from all uses. No, as the user-defined data type (UDDT) is only used at create time to map to the datatype; the connection is then broken (altering the UDDT definition will only impact new tables and variable definitions). User-defined Types (UDTs) are even more schema bound than UDDTs and it's a much tougher job to alter them (you need to write conversion from the old to the new). This is much more complex since they fall outside the conversion matrix and the only thing SQL knows is the binary representation. So if the data type was smart enough to add a signature to the serialized data, so that the next version can understand which data type is represents and determine if it can convert to it you might have a shot, otherwise you have to add a column, and convert columns and drop a column afterwards.

Q: Can I easily replace the SQL Server 2000 UDT with a SQL Server 2005 CLR Custom Type? I have a client who wants to -- eventually -- migrate to SQL Server 2005, but for the immediate future wants me to build their application with SQL Server 2000. I'm very interested in SQLCLR types, especially WRT to maintaining type consistency across T-SQL, CLR SP's and external data objects, so using CLR custom types. This is a typical thought for the SQLCLR – “I am going to build an object database.” Type fidelity between the client and the server exists today, but they want class fidelity, so they are going to create the customer object UDT, which is not really for what the SQLCLR was intended. You should be thinking scalar types with additional custom characteristics – not object types – as the scalar types will give you optimal and predictable performance.

Q: Can you find ANSI_NULL_DFLT_OFF setting as on or off? Does DBCC USEROPTIONS do that? Yes, DBCC USEROPTIONS is a nice, quick way of seeing the currently set LIST of session settings. However, there are better ways to check these settings programmatically. it’s not as nice to check it programmatically  The files that can help you to get more familiar with ANSI Null Issues is ANSINullIssues.sql.

Q: It would be great if you could show us a list of  classic fields "Account No", "Account Balance", "Description" etc… and tell us what type you would use inc. "Null/Not Null" etc. Well, this is a bit subjective and with a bank account there would probably be a lot of business rules that surround the account number. However, if this were a sales table and you needed an ordered then it would be a bit easier (again, business rules might change this). Typically, I’d go for a Primary Key which is simple and narrow and not necessarily a “natural” key. However, some of this reasoning comes from how tables are joined and how SQL Server internally stores data. Account Balance – I’d go with a precise numeric. Based on length of storage and the business rules behind required precision, you might choose a numeric(19,6). This will require 9 bytes of storage. For description, I would definitely go with a variable type – probably nvarchar(200) depending on what description really means. Here’s how the table would probably look on creation:

CREATE TABLE Account(
AccountID     int           IDENTITY  NOT NULL CONSTRAINT AccountPK          PRIMARY KEY,
Balance       numeric(19,6)           NOT NULL CONSTRAINT AccountBalanceDflt DEFAULT 0,
Description   nvarchar(200)           NULL)
go
INSERT Account (Description) VALUES ('The account of Kimberly Tripp.')
PRINT 'Kimberly Tripp''s account number is ' + convert(varchar(10), @@IDENTITY) + '.'
go

Q: In the Q&A you might want to tell your listeners that in SQL the string 'abc' is equal to 'abc ' when compared – unlike for C, C++ ,C# , VBNET, etc. This will help with the char v. varchar issue… It’s a good point! You should make sure that you do some testing to make sure that data behaves the way you expect. Trailing blanks may be trimmed with some data types and not others and once again, ANSI settings may impact this behavior.

Q: What are the NULL block and varchar block in the row layout? The NULL block helps to store and identify rows that are set to NULL. In the past, SQL Server stored this as a single space in a varchar column ad there was no real way to differentiate between a real space (as a value) from a NULL value (stored as a space). When the storage engine was re-architected for SQL Server 7.0 the NULL block was added to store a NULL value specifically.

Q: How would the table structures differ between the 32-bit and 64-bit in SQL Server 2005 and SQL Server 2000? They don’t! There is no difference in the on-disk format between any of the versions of SQL Server. In fact, this was even true of the MIPS/Alpha/Win32 releases when SQL Server 6.0 released on all of those platforms.

Q: do you have a script to calculate rows per pages (using the formula you showed? Ha! Seemed a bit painful I know… But that formula works.But, yes – it’s a heck of a lot easier to just do this yourself if you have a table already. In SQL Server 2000 you need to use DBCC SHOWCONTIG with TABLERESULTS and in SQL Server 2005, you use a new DMV called sys.dm_db_index_physical_stats. This takes 5 parameters and has some really nice new features to it.

Q: I might have missed this, but is the SQL Server Management Studio (SSMS) the replacement for Enterprise Manager? Yes, SSMS replaces the Enterprise Manager and the Query Analyzer. And, for a more scaled down query tool, you can use Express Manager (XM). To download XM, you can go here  If you want more details regarding the latest SQL Server CTP, you can go here.

Q: Is Management Studio backward compatible for use with SQL Server 2000? Yes! You can use SSMS to connect to SQL Server 2005 and SQL Server 2000.

Q: How do we write our queries for Insert/Update/Select on a database that we have no idea how it was setup - so we don’t experience the local issue that you talked about? Basically, the best way is for the developer to create an interface that masks the complexities of the database schema through stored procedures, functions and views.

Q: Do indexes point to the row level or the page level? There are two primary parts to an index – the leaf level and the non-leaf level. The leaf-level of a clustered index IS the data (a clustered index is an ordered table).

Q: What is the best practice or rule of thumb on using identity primary keys on tables in a database? And would that be all tables and/or some tables or not all? This is a great question. However, I might phrase it a bit differently because I *think* I know where you’re going with this… Generally, I recommend that large tables have a clustered index on an identity column. This can help to improve performance of inserts by not causing splits (inserts go to the end) and by helping non-clustered indexes look up the corresponding rows in the clustered table by using a clustering key which is unique, narrow and static. This is really quite a bit related to the session on indexing… I think you should definitely attend session!

Q: How could a LOB end up in the leaf level? This is really another great question for the indexing session – and a bit on internals as well. In SQL Server 2005, rows can span pages (so you can have a row greater than 8060 bytes) and rows can have LOB types (even in the leaf level of an index). I’ll explain how this can happen with the new INCLUDE option for indexes – in the next session!

Q: Are you talking at the Vegas launch of 2005? Yes! You can see the complete list of my upcoming events here.

Comment: By the way, thanks for the great Web cast. I don't know of many other companies that do this sort of thing -- take questions directly from their customers.

Thanks! This is part of what I love about these webcasts… it allows me to hear what you’re interested in and it allows me to keep learning. To be honest, often I have to look things up and/or hit the KB and/or ask some of the other SQL geeks I know – in order to answer some of these questions. It really helps to keep me on my toes! In fact, this week I bothered two of my favorite people, my partner Patrick and my good friend Gert.

 

Thanks again for another GREAT week of questions!

kt

See you later today – Friday, Aug 26th

MSDN Webcast: A Primer to Proper SQL Server Development (Part 4 of 10): Best Practices in Indexing (Level 200) 

Two days isn't enough time!

In fact, that's really true. There are so many interesting (well, I think they're interesting) aspects to Indexing that we could in fact do a whole 10-part Series on Indexing and it's likely that we'd still have other things to look at and talk about. So, in preparation for part 4 of my MSDN Webcast Series, I thought I'd take a few minutes to pull together a number of related resources. If you watch/read them prior to the webcast then you'll be a step ahead. To minimize overlap, part 4 is a myriad of best practices with more of the "what's" and "how's" to implement the right indexes - but not quite as much of the theory and/or the "why's" as I've done quite a bit of that before and even if you don't watch/read these resources BEFORE the session...you can use these resources to fill in the gaps and answers the "why's" AFTER the session!

So, let's talk about the session... I'm going to focus on SQL Server 2005 for all of my demos (and I'm going to try to be VERY demo heavy in this one) but quite a bit of the theory also applies to SQL Server 2000...

If you're interested in some of the "why's" and/or you want to hear the information for SQL Server 2000 and/or you just want to see what other folks have asked, here are your options:

Watch the MSDN Webcast titled: SQL Server 2000: Performance Tuning - Finding the Right Balance of Indexes which covers Index Creation Best Practices on SQL Server 2000
Watch the TechNet "It's Showtime" recording, recorded at Tech Ed Europe titled: Index Creation Best Practices with SQL Server 2005 which covers Index Creation Best Practices on SQL Server 2005 and really gets into a lot of the "why's"

Read many of the past blog entries - including A LOT of Q&As created from former webcasts by accessing my "Indexes" category on my blog entries here.

So, if you're not scared to spend quite a bit of time with indexes... I think there's lots of good stuff to review! At a minimum you should think about showing up tomorrow, I promise to give you Practical Indexing Strategies and a lot of demos to prove my point!

Effectively Designing a Scalable and Reliable Database
A Database Developer’s Primer to Proper SQL Server Development
Part 4: Best Practices in Indexing

Attend Part 4 LIVE on 26 August 2005 by registering at this link!

MSDN Webcast Q&A: A Primer to Proper SQL Server Development

Creating a Reliable and Automated Backup Strategy, Part 2 of 10

Presented by Kimberly L. Tripp, SQLskills.com

Q: Can I view a recording of this webcast? Here’s the specific replay link.

Q: Where can we get the demo scripts? The demo scripts are in this zip (20050812MSDNWebcast.zip (4.86 KB)); here in this blog entry. However, I will also create an entry www.SQLskills.com under Past Event Resources for the entire webcast series.

Q: Does Kimberly L. Tripp have a blog? Yes, www.SQLskills.com/blogs/Kimberly and if you’re reading this on my blog…well, then you already know! J

Q: Where are the links to all prior Webcast Q&As from this series? Click here for the MSDN Download for Part 1, click here. For the SQLskills Blog Entry for Part 1, click here.

Q: Is this Session 2 of the "Effectively Designing a Scalable and Reliable Database" webcast series? The Title bar here reads "MSDN Webcast: MSDN: A Primer to Proper SQL Server Development (Part...)". A bit misleading don't you think? So, I finally figured out the mystery here. When I originally submitted my proposal for this series I titled it like this: Effectively Designing a Scalable and Reliable DatabaseA Database Developer’s Primer to Proper SQL Server Development (Level 200-300) But somehow, through a few edits and since each module also has a more detailed title like the following, then I think the title just got edited down to fit. In fact, with subtitles like the following… I can’t even fit the entire title, subtitle and session title on a single slide. JPart 1: Creating a Recoverable DatabasePart 2: Creating a Reliable and Automated Backup StrategyPart 3: Designing Tables that Scale – Best Practices in Data Types and Initial Table StructuresSo, the mystery ends here… It’s all the same content and the general idea is that building a scalable and reliable database takes a variety of best practices that together make it happen; there is no “DBCC MAKEITGOFASTER(‘TRUE’)” option.

Q: WRT "best practices," what do you think of SQLBPA? It’s a great (and free J) tool that can help you to recognize some of the most commonly overlooked best practices. Part of the intent is also to help you determine if you’re violating certain practices that might impact your upgrade strategy. For example, if you’re accessing the system tables you will want change that code to use information_schema views, stored procedures (sp_*) or system functions to gather the system information. If you’re looking solely for information about your upgrade path, the BPA has been updated/replaced with a new tool called the Upgrade Advisor.

Best Practices Analyzer Tool for Microsoft SQL Server 2000 1.0 

Upgrade advisor URL - Microsoft SQL Server 2005 upgrade Advisor - CTP June version

Q: In the SBS2003 environment the default backup does a live backup using a volume shadow snapshot of the MSDE and SQL databases on the server. How bad is this practice v. doing a SQL backup and backing up the backup and the logs? Well, there are really a couple of issues here… First, the positives of controlling backups within SQL Server:(1)   You can backup databases individually(2)   With certain database backup strategies (specifically, Database/Log strategies in the FULL Recovery Model), you can recover up to a specific point in time (P.I.T. Recovery) or to a marked transaction or even up-to-the-minute (if the transaction log is still accessible at the time of the disaster.(3)   The process of backing up the transaction log helps to keep the transaction log small and manageable.Second, the negatives:(1)   You have to manage each database backup strategy individually – however, tools like the Database Maintenance Wizard can help to minimize this.(2)   You’ll need to do backups more frequently if you want to take more advantage of these features – however, this would also give you better granularity on your backups AND result in less data loss because your snapshots probably aren’t very frequent.Now – as for volume shadow snapshot – this is GREAT for a complete image of the system and it does work (especially well) when you have to replace the complete system or when you want access to specific files on the system. Where you can run into troubles (and I’m not saying that this is always going to be the case, one way or the other) is when a SQL Server database becomes damaged and you revert back to just a file (or even all of the database files) from a shadow copy image. In almost all cases (and I really can’t think of too many where this would work except in some read only cases), you won’t be successful *just* restoring an old image of just a single database file. So, the damaged database *might* end up being lost. Instead, you could restore the entire database image from the shadow copy point in time and that *might* work; however, if the database structures have changed since you backed it up then the best choice would be to drop the database, copy over the files (the mdf and ldf) and then re-attach the database. That should work and all you’ll need to do from there is add any logins and/or changes since the last backup. Where things could be a lot worse is if one of your system databases becomes corrupt. This might limit your ability to recover. Now, having said all of this – I *know* that your SQL backup strategy WILL work and I know the complete set of options related to it. While it is a bit more administrative work, if your data is critical and minimizing data loss is *very* high on your list; I would use BOTH. Volume Shadow Copy for the system and file recovery strategies and SQL Server database and log backups for your databases. OK, so I decided that I wanted to check things out a bit more…especially, since I am not all that well-versed on VSS; I decided to look up as much as I could find and well, I’m going to stop with this. First, pre-Windows 2003 sp1 it looks like even backing up a volume that includes databases that are in the FULL or BULK_LOGGED Recovery models, generates an error. As of sp1 (or the hotfix associated with this KB), the error is no longer generated BUT I did find the following paragraph very useful:

This hotfix does not let Windows Backup perform a Volume Shadow Copy service backup of a SQL Server database with a recovery model that is set to Full or Bulk-Logged. You must use the SQL Server backup procedure to back up your SQL Server databases that have a recovery model that is set to Full or Bulk-Logged. And – if you do a backup with NTBackup, it will log a message in your backup log of the following: NtBackup does not support backing up SQL databases which are configured in non-simple recovery mode.If the database has been backed up, it is likely corrupted. Please do not restore this database.The recommended way of backing up non-simple SQL database is to use the backup solution that comes with SQL Server.

So, while I’d really like to dig into this deeper – I have to use my best judgement at this point and say that you should probably work to handle backups through SQL Server.

Q: What is the default recovery model in MSDE and SQL Express? I ended up demo’ing this in the presentation. I was sure about MSDE but I wanted to do a quick demo on SQL Express. The quick check is just to check the database properties of the model database… Another option, create a new database and see what the properties are. For both MSDE and SQL Express, the default database recovery model is SIMPLE.

Q: How would you compare the native backup to products like LiteSpeed from Imceda (which is used internally at Microsoft - I'll avoid the typical dog food comments) and SQL Backup from the chaps at Red-Gate? Third-party products can offer significant additional benefits over just the native backup types – for example, encryption and compression. This can result in improved security, faster backup times and smaller backup files.  

For more details about SQL Lite Speed, please go to: http://www.imceda.com/For more details about SQL Backup, please go to: http://www.red-gate.com/

However, I would also like to make a comment about the fact that Lite Speed is used internally at Microsoft. Backup products are an interesting special case in terms of how they work with regard to SQL Server. Most backup products use the VDI interface (the Virtual Device Interface provided for SQL Server) in order to get access to essentially the same pages that SQL Server Backup does… So, in fact, using a third-party product like SQL Lite Speed, is actually still testing the VDI interface and still working SQL Server pretty well.

Q: Can you use 3-rd party backup with log shipping? Depends on whether or not they support it and/or have automated software to help you configure it. Many products do offer ways to help automate log shipping with compressed versions of your backup files… See earlier links for references to other products.

Q: How do you fix a log file that won't truncate? The best way to fix a transaction log that won’t clear:(1)   Terminate all connections to the database (best to set the database to SINGLE_USER mode)(2)   Backup the transaction log with a normal BACKUP LOG command(3)   Shrink the transaction log to the smallest size possible (DBCC SHIRNKFILE WITH TRUNCATEONLY)(4)   Alter the database to increase the transaction log to a more appropriate size with ALTER DATABASE.(5)   Return the database to MULTI_USER mode.These steps are exactly the same as the steps necessary to remove fragmentation in VLFs. The file that can help you to get more familiar with this process is located in the scripts associated with this session and is named: VLF Optimization - Northwind Walkthrough.sql.

Q: If you do the steps recommended to clean up the fragmented VLFs, does it break the continuity of the transaction log backup chain? No! You can backup the transaction log before, as part of (step 2) and after you clear up the fragmentation – and no other backups are necessary. It is a complete chain that can be used in recovery.

Q: The transaction log portion of the tempdb seems to be growing and growing- even with the simple recovery model. In fact, it only seems to release space when CHECKPOINT is run manually? Well, I have to admit that this is a bit strange and should not otherwise be the case. I would suggest seeing if you have any open transactions in the database (DBCC OPENTRAN) but that shouldn’t make a difference if it DOES clear when you execute CHECKPOINT. So, I have to admit that this is a bit bizarre. I would try to do some profiling and see if you can find any consistencies with the types of commands that are running at the time when it doesn’t seem to clear. I would also wonder if this is ALL the time or just occasionally. You may want to contact PSS for more assistance with this.

Q: TRUNCATE TABLE does not benefit from BULK_LOGGED? When I say that something benefits from the change to the BULK_LOGGED recovery model, it means that the change must offer a performance gain or otherwise. In the case of TRUNCATE TABLE, it is always executed in a manner which is logged – efficiently. So, there is no gain in switching; it is always an optimal command to execute.

Q: We have a habit of switching to "bulk-logged" mode when we are doing index -rebuilding, etc. which seem to benefit from this setting. Then, when we are done, we "switch" back to full mode. Does this affect our ability to recover later on using the logs that we are creating because of this switching back and forth? and Q: If I switch between the FULL and BULK_LOGGED recovery model, do I need to take full database backup after the switch to BULK_LOGGED? No! In fact, it’s recommended as long as you’re aware of the options that you lose when you are running in the BULK_LOGGED recovery model. However, the transaction log continuity is not broken – given the fact that you can successfully back up the log after the bulk operation completes. To minimize the time of which you are vulnerable, make sure to backup your log right before the switch (to BULK_LOGGED) as well as right after the switch (to FULL).  The file that can help you to get more familiar with this process is located in the scripts associated with this session and is named: DBAlterForBatchOperation.sql.

Q: Is it advisable to use transactions for each and every SQL statement? Even if they are a single, simple, one-command transaction? If not, how can we use the marked transaction? Well, the answer to the first part of the statements are: NO, it is not necessary and it is not advisable to wrap every command in a BEGIN TRAN/COMMIT TRAN block. However, as per the second part – you cannot mark a transaction without using the following complete syntax: BEGIN TRANSACTION TransactionName      WITH MARK 'description of mark'SQL_StatementsCOMMIT TRANThe file that can help you to get more familiar with this process is located in the scripts associated with this session and is named: Restore Log and Marked Transactions.sql.

Q: Do you dynamically create Transaction MARKs to make the mark names unique? I would assume a restore would stop at the first mark it came across with the specified name? You can certainly use dynamic string execution to build your transaction mark names dynamically. However, you can also use a mark with a datetime parameter. From the BOL: The STOPATMARK and STOPBEFOREMARK options both support an AFTER datetime clause. When used with datetime, mark names need not be unique. In a RESTORE DATABASE statement, datetime is required to recover to a marked transaction. In a RESTORE LOG, however, datetime is optional. If AFTER datetime is omitted, roll forward stops at the first mark with the specified name. If AFTER datetime is specified, roll forward stops at the first mark having the specified name exactly at or after datetime.

Q: When configuring a database maintenance plan through the Database Maintenance Plan Wizard, there is no possibility to switch between recovery models…Correct. However, you can modify the jobs created by the Database Maintenance Plans to include changes to the recovery model. BUT – having said that, I’m not sure why you’d want to change it during the actual backups… Instead, you should consider changing the recovery model as part of your batch process that would benefit from the change.

Q: How do you clear the log in SQL Server 2000 through the Enterprise Manager? The only way to clear the transaction log through the SQL Server Enterprise Manager is to do a regular transaction log backup. The way you can do this is to right-click on databases, tasks, backup - then choose a transaction log backup. The default behavior is to clear the inactive portion when the backup completes.There is no way to just “clear the transaction log” as that would break the continuity of the log. You could execute a command to clear the log but if you find that you’re manually clearing the log and NOT keeping transaction log backups, I would recommend setting the recovery model to SIMPLE so that the transaction log is cleared automatically.

Q: Can you explain when you need to recover .mdf and .ldf files? Recovering directly from the .mdf and .ldf files (through sp_attach_db) is really only guaranteed when the files were closed properly – through either a sp_detach_db or the files were copied when the server was shutdown.

Q: How do you recover if the master database becomes corrupt? Is the only option uninstalling and reinstalling SQL Server? No, SQL Server 2000 includes a utility to rebuild the master database called rebuildm. SQL Server 2005 uses setup – but with special parameters that only rebuild the system databases as opposed to do a full installation.

Q: Any BP on restores on replicated database i.e. distribution matching database? Transaction Log marking can help to ensure that multiple servers are all restored and rolled forward to the same point in time. Additionally, there are options related to whether or not you KEEP_REPLICATION (this is for the publisher). However, outside of those couple of options, you should review the Books Online for best practices in this configuration.

Q: Is it possible to backup database in the standby mode? No. But, I wasn’t completely sure during the webcast so I decided to try it. On the backup, I received:Database 'test' is in warm-standby state (set by executing RESTORE WITH STANDBY) and cannot be backed up until the entire load sequence is completed. BACKUP DATABASE is terminating abnormally.

Q: With log shipping and 2000, and the destination database is exactly the same as the source, how about backing up the destination rather than the source? What happens to the shipped logs during the backup? You can’t backup a database that is in the NORECOVERY or the STANDBY recovery completion states. As a result, you can only backup the source, not the destination.

Q: What are my risks if I use Simple Recovery and what are the advantages of it? Risks are mostly in the potential for work-loss exposure and data loss in the event of a disk failure for your database. The advantages are in simplicity and ease of management because you don’t need to backup the transaction log. However, through simple automation procedures you can automate the backups of transaction logs while getting the most options for recovery. If you want more details, you should review the chapter that’s available for download on SQLskills. This chapter covers the differences in the Recovery Models as well their pros and cons in management and recovery! Here’s a direct link: www.sqlskills.com/resources/SQLServerHAChapter9.pdf 

Q: How come you have both the SQL Server 2005 and Express on the same machine? Is it because you don't have Visual Studio 2005 installed? The documentation of VS.Net 2005 states to uncheck the SQL Server 2005 Express if also installing SQL Server 2005 CTP. Part of the reason that Visual Studio doesn’t want you to install Express if you already have the CTP of SQL Server is that they may not be the same versions. However, if you have the appropriately matched builds of SQL Server and Visual Studio, you can install multiple versions of SQL Server on the same machine- with Visual Studio Whidbey. Having said that – of what’s available today- the correct combination would be SQL Server CTP June (IDW15) with Visual Studio Beta II. On my machine, I’m actually running SQL Server 2000 (as a default instance), two Developer editions and one Express edition. AFTER SQL Server was successfully installed, I installed the appropriately matched Visual Studio installation. There won’t be any need for this once the products release in November but while they’re both being developed – and while there are “later” releases that leap-frog over each-other, it’s a bit confusing.

Q: SQL 2000 Maintenance Plans for transaction logs (say, all user databases) indicate job failure if just one user db is in the SIMPLE recovery model. Any alternatives short of creating our own backup script to avoid this SQL Maintenance backup job failure? Yes, this is because the sqlmaint.txt does not verify the recovery model.

Q: Is this a SQL 2005 behavior too? Database Maintenance Plan Wizard has been changed significantly in SQL Server 2005 and it’s now a lot easier to create more interesting/more complex plans on one or more databases – I would work harder to create a class of jobs for your SIMPLE recovery model databases (just full backups and maybe differentials) and a different class of jobs for your FULL/BULK_LOGGED recovery model databases.

Q: Does the execution of a transaction log backup impact simultaneous database usage? If well optimized the cost of a transaction log backup should be relatively low. Might be a good time to make sure that the transaction log is optimized! Go figure, I wrote another blog entry to help you optimize the transaction log. Check out 8 Steps to Better Transaction Log Throughput.

Q: Why does the log backup restore slower than restoring a full database backup? In general, a transaction log is a more intense operation. When restored, the transaction log must be redone (“redo”) and then undone (“undo”). This process is what applies the changes to get you up to what the database looked like when the transaction log backup was executed. And, while each log row’s redo operation is generally much faster than the original execution (they do not have to do any calculations, etc. as the FINAL version of the changes for the row is what’s in the log), there might be a lot of them!

Q: Do I have to backup the ReportServerTempDB? I’m not a reporting services expert but my understanding is that yes, you do need to backup a ReportServerTempDB. The best thing to read is from the Books Online: Backup and Restore Operations for a Reporting Services Installation.

Q: Will the 2005 Backup wizard allow scripting of the 'scheduling'? The backup command itself will be in the scheduled job but the statement that is used to create the job can only be scripted by scripting the job – not the backup command.

Q: Did you talk about partial database availability? Yes! SQL Server 2000 does not offer this new feature, but SQL Server 2005 does. Partial database availability is a new feature that allows a damaged database to stay online – even while secondary data files are damaged and unavailable.

Q: What is an optional secondary file?An optional secondary file is a data file that is NOT the mdf. Secondary data files are created in larger databases and store user-defined data. They might be used to store different types of data and/or even possibly part of a large table (partitioning). Imagine having 100s of millions of rows – because you have three years of historical data – if year 2003 is in a secondary data file and becomes damaged, it will not require the database to be taken offline.

Q: I have not seen Management Studio. Is this new with SQL Server 2005? Yes, SQL Server Management Studio (SSMS) replaces both Query Analyzer and SQL Server Enterprise Manager – in a more robust UI that also include Solution/Project control and Source Integration. Check out the online labs to get some hands-on lab time in the Virtual Lab environment here.

Q: A FULL BACKUP has the option "remove inactive entries from transaction log" --- why doesn't it seem to do this? It would seem from this option that nightly full database backups would keep the log size small? This is a SQL Server Enterprise Manager bug. It should NOT even be an option on that tab. It should be grayed out. If you click around to differential and then back to full on the General Tab, before you go to the Options tab, then you will no longer see that option available. As far as this being something desired – not really… The reason why clearing the log is NOT performed after a full backup (or as part of a full backup) is so that if the full database backup becomes corrupt in some way, you can fall back on your transaction log sequence to successfully restore! Basically, this ensures that nothing ever breaks the continuity of the transaction log!

Q: You recommend Trace Flag 3231, what does it do? Trace Flag 3231 is an undocumented Trace Flag that disables the ability to clear the transaction log in a database that is running in the FULL or BULK_LOGGED recovery models. This significantly reduces the ability for someone to break the continuity of the transaction log… ah, so that would probably be the next question – what else could possibly break the continuity of the transaction log? Changes to the SIMPLE recovery model. It is highly recommended that ONLY changes between FULL and BULK_LOGGED be performed in databases where transaction log backups are relied upon for recovery.

Q: How can we test our backup strategy? How do we test our differential backup content? When we test the differential backup content, do we need to restore the full backup content at the same time?The best test of your backup strategy is a complete restore sequence to a secondary server – which you then follow up by testing your application. There *are* other dependencies outside of the database that could cause your application to fail. However, the restore to the secondary server will ensure that at least your data is accessible. Below are some KB articles that you can review for more details on some of these additional dependencies. To access them, go to http://support.microsoft.com.  Q240872 – INF: How to Resolve Permission Issues When a Database is Moved Between SQL ServersQ246133 – INF: How To Transfer Logins and Passwords Between SQL ServersQ307775 – INF: Disaster Recovery Articles for Microsoft SQL ServerQ224071 – INF: Moving SQL Server Databases to a New Location with Detach/Attach.

Q: I have a database server hosted by a 3rd party on the web ... is there a secure, automated way for me to make remote copies of the Database Backup? You'll need to talk to your ISP to see if they support that. More than anything it depends on the interface they give you, the services you've paid for and whether or not you have secure channels over which you can communicate with your ISP.

Q: Is there a way for you to answer these questions – that is – release the question from the question manager so that another question can be asked? I had a question in the "queue" for most of the webcast and because of that, I could not ask additional questions which I had wanted to ask... I ended up deleting my original question just so the question window would allow me to ask the next question... This is a LiveMeeting issue. I’m trying to put just a quick “a” in the response during the webcast but really, it’s an issue with the LiveMeeting Query Manager UI. I REALLY wish that questions just went into the queue and didn’t hold your window as well… I’ll see if we can pass your suggestion on to the LiveMeeting folks!

Thanks again for another GREAT week of questions. See you in the next session…which because this is posted so late, is session 4 - on Friday, August 26th. You can register for this session here: MSDN Webcast: A Primer to Proper SQL Server Development (Part 4 of 10): Best Practices in Indexing (Level 200)

Thanks for reading!

kt

MSDN Webcast Q&A: A Primer to Proper SQL Server Development

Creating a Recoverable Database, Part 1 of 10

Presented by Kimberly L. Tripp, SQLskills.com

Q: Can I view a recording of this webcast? Here’s the specific Replay Link: http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032278585&Culture=en-US

Q: Where can we get the demo scripts? The demo scripts are in this zip (20050805 MSDN Webcast ScalableSys01.zip (5.47 KB)); here in this blog entry. However, I will also create an entry www.SQLskills.com under Past Event Resources for the entire webcast series.

 

Q: Does Kimberly L. Tripp have a blog? Yes, www.SQLskills.com/blogs/Kimberly and if you’re reading this on my blog…well, then you already know! J

Q: How about running on a VMware Virtual Machine? Sure.  

Q: Is the placement of data and log important when using a disk array or SAN? It’s important to make sure that you don’t have bottlenecks at the physical disk. You should see if your SAN software supports monitoring at the disk level and if so, consider reconfiguring if bottlenecks occur. It’s harder to say that a SAN will have the same problems that you would have with direct attached storage since SANs often place larger logical disks on more than one physical disks (by using 9, 18 or 36 GB chunks from a variety of physical drives). The main point is that SANs offer a lot of caching and other optimizations, so you don’t often have to worry as much. However, I would strongly suggest getting to know your SAN config as well as work with your SAN vendor to really get an understanding of how the resources are allocated as well as used. Also, make sure you’re current with all bios, firmware, drivers, etc.

Q: Where will be the demo code available? What version of SQL Server will you be using? All demos were done on SQL Server 2005; however, for this (and even for many others), a lot of the demo code works on both SQL Server 2000 as well as SQL Server 2005. At a minimum, many of the concepts apply and where a feature is new or only supported on one version, it will be pointed out. For example, fast file initialization is on SQL Server 2005 only.

Q: Does SQL Server support log mirroring (i.e. dual logging)? No. A very old version did (I think 4.2 on OS/2 did) but the key problem was performance. You are a lot better off letting the hardware handle the mirroring rather than software talking to os talking to hardware.

Q: Do Secondary files allow partial backups? Not really sure I follow this question but…if you’re asking whether or not you can perform a backup of just a part of the database – yes! In fact, any file can be backed up independently of the filegroup of which it’s a member (there was a restriction in 7.0 that didn’t allow this but as of SQL Server 2000, files can be backed up at any time). If you’re really inrerested in file and filegroup backups, I wrote a couple of articles for SQL Server Magazine and you can find the complete list here: http://www.sqlskills.com/articles.asp 

Q: What is a page? Generally speaking a page is a unit of storage. Specifically speaking with regard to to SQL Server, a page is 8K in size and is the smallest unit of I/O in terms of data. When a table is stored on disk, it is stored in 8K chunks (and most of the time SQL Server allocates 8 – 8K chunks to objects). A 64KB block of the database is called an extent. SQL Server allocates extents once an object reaches a minimum size (which is also 64KB) to try to keep an object more contiguous.

Q: Is caching user specific or server specific. If two users are accessing same table, does it create two separate pages in cache or just one? Well, data cache is not user specific (at least not with regard to your question). However, there are a few things that would be local in scope – like a user-defined temporary table… However, a database table being accessed by many users, would only have one set of pages in cache.

Q: Why would a user of the database do a checkpoint? Users cannot perform checkpoints, only database owners and administrators can force a checkpoint. Generally speaking, it is not often that a checkpoint needs to be force. SQL Server controls and handles the checkpoint automatically.

Q: Wouldn't the engine know the best time to do a checkpoint? Yep! And it does. The default setting for checkpoint is “0” minutes. Meaning – SQL Server decides!

Q: How does page caching function during the 'redo' process? Same as it would normally. Data being “redone” is loaded into cache and changed – based on what’s held in the transaction log. (Keep reading, more in the next question/answer.)

Q: So does the transaction log contain all the information about say an update… i.e. what columns, what data? Yes, for the most part what is contained in the transaction log is the “after” version of the modifications so that log rows do not need to execute any functions or other code. The idea is that log rows can be processed extremely quickly – but have sufficient information to make sure that the data is modified properly. (And keep reading, more in the next question/answer.)

Q: When you say transaction is re-done, what exactly happens? Does SQL Server automatically take values again from cache and copy to the disk? What if we loose the cache also during the process? Then redo will begin again when the system restarts…Generally, the process is – go to the log, redo, undo, checkpoint and the information is not deemed inactive until after it checkpoints. So, even if you had the equivalent to “truncate log on checkpoint” this information would not be lost in the midst of restart recovery (redo) because it’s still active until it’s checkpointed!

Q: Is Roll back notification done asynchronously or is another client request required, e.g. long running client with long SQL call intervals…I’m not sure if I follow this exactly BUT I think you’re wondering how you would know that your transaction has been rolled back? The key way that many applications know – is that they never got actual confirmation and maybe they’ve lost their connection and just timed out. But, if this doesn’t answer your question…send me mail!

Q: Can you pre-grow a database at non-peak times grow the database when space starts getting low? Sure, there are a few options really. One way, create a SQL Agent Job that checks space allocation at 2am and if it’s getting close to full – execute a manual increase in space. With a bit of dynamic string execution I think you could get this to be a very flexible and easily automated process!

Q: Best drive configuration for data, logs, indexes? Well, typically, I like to separate data by access pattern and type rather than data from indexes. Typically, I like RO v. RW v. a single large table v. LOB Data.

Q: I have a 24/7 SQL Server production environment with a database that's well over 300 GB - when do I defrag it? You really have a couple of options to defrag a table – truly defrag (and only defrag it) or rebuild it (which does a lot more than just defrag). To make the answer even more clear – defrag often as it doesn’t take the table offline (in 2000). Rebuild the table when/if you can afford downtime against that table (in SQL Server 2000, a rebuild requires that the table be either read-only [when rebuilding a non-clustered index] or completely inaccessible [when rebuilding the clustered index]). So, now this gets a bit harder to answer! Check out the webcast on SQL Server 2000 – Index Fragmentation Best Practices here.

Q: If you have multiple data file, can you merge back to a single data file? Only when the files are a member of the same filegroup. If/when you want to do this you have two steps: DBCC SHRINKFILE(file_to_remove_logical_name, EMPTYFILE) This will empty the contents of the file into the other files within the same filegroup. Once emptied, use:ALTER DATABASE dbnameREMOVE FILE file_to_remove_logical_name

Q: What is zero initialization?Where the entire contents of the file are zero’ed out. This is done for security reasons.

Q: Doesn’t truncateonly make us lose continuity between transaction backups? First – this was with regard to my recommendation to use DBCC SHRINKFILE with the TRUNCATEONLY option NOT using with TRUNCATE_ONLY on a transaction log backup… So, specifically, NO. DBCC SHRINKFILE with the TRUNCATEONLY option does not break the continuity of the transaction log. Now – just to add a bit of irony here… BACKUP LOG with TRUNCATE_ONLY no longer breaks the continuity of the transaction log in SQL Server 2005. In fact, BOTH the TRUNCATE_ONLY and the NO_LOG options have been changed to ONLY perform a CHECKPOINT. In a database running in the FULL or BULK_LOGGED Recovery Model, this will have NO real impact on the transaction log. In a database running in the SIMPLE Recovery Model, this will execute a checkpoint and the database setting of simple truncates the inactive portion of the transaction log when a checkpoint occurs.

Q: I've read the BOLs about faster performance when putting files on raw partitions? Hmm… I’d love to see the reference. They might have said that raw partitions may offer performance benefits but I would generally doubt it. More importantly, you’d lose other key features if you didn’t use NTFS – like Database Snapshots. So, my main point – don’t use raw partitions! Even if they did offer a performance gain, what you’d lose isn’t generally worth AND I can get better gains elsewhere (indexing, optimizing procedural code, etc.)

Q: Could you please tell us quickly why raid 1+0 is better? Basically, RAID 1+0 offers better availability than RAID 0+1 because it can tolerate the loss of more than one drive. If a drive in a RAID 1 array is lost, all other drives still function. If a drive in a RAID 0 array is lost, all other drives in the RAID 0 array stop functioning. Here’s a good link to review more about different RAID configurations: http://www.raidarray.com/04_00.html

Q: I recently came across a 200meg db with a 50GB log. We tried everything to truncate it but finally could only fix it by changing to a SIMPLE recovery model. What is the most likely culprit for such an outrageous log growth? Well, I wish I could say I hadn’t seen this… But, it’s due to autogrowth and it’s due to running in the full recovery model without performing transaction log backups (but you are performing database backups). Now, the reason why you had so much trouble with it – was because it was horribly fragmented (because of the autogrowth defaults).  So, the main point here – if you follow the steps of the VLF Optimization content from the session, you would have also solved the problem. I’d at least check your current fragmentation and make sure that everything is fixed!

Q: What is a "long" transaction (seconds, minutes, or hours?) There is no specific time but the longer a transaction is – especially relative to transaction log backups – the more possible larger growth of the log and less control.

Q: Will *you* be the one doing parts 2-10? Yep… I’m a glutten for punishment. Just kidding! To be honest, I really enjoy these webcasts!!! See you in the next one.

Q: Should SQL Server developers invest the time to learn Microsoft's Enterprise Library (the Application Blocks thing)? OK, I checked around a bit on this and the general consensus is that developers should be aware of all of the resources that exist. It’s by no means an absolute but there are quite a few *very* useful resources there. Review things lightly to see if anything seems useful to your application and then dig deeper if/when they do.

Q: Should the SQL Server Disks be defragged at the os layer with disk tools to defrag? Sure. It doesn’t hurt and can help. The only negative is that you must shutdown SQL Server in order to do so. The good news, once defragged you don’t really need to do it again (unless you have a lot of autogrowth and other files on the disks).

Q: I had a lot of problems convincing my clients to create a new instance of SQL in their servers. The objective was to isolate our systems from theirs. What are the pros and cons of creating an additional instance against adding more DBs to an existing one? Well, you’re correct that it’s mostly an isolation issue. A big concern in some SQL Server environments is the access to metadata that everyone has. In SQL Server 2000, logins can see that other databases and other logins exist – even if they can’t use them. This presents a security concern. In SQL Server 2005, metadata is restricted so that this doesn’t happen.

Finally, here is the link to the Part 2 Registration page.

See you in Part 2,

kt

Wow, I've been horribly bad at blogging these days and I owe you a few entries as well... But, good news, I have a new MSDN Webcast series that will help to answer a lot of best practices in database design strategies that can help you achieve better scalability, availability and reliability. All of my demos are on SQL Server 2005 but many of the concepts apply to both SQL Server 2000 as well as 2005.

The sessions are all targeted at best practices and as such we went with a base of 200 level. Knowing me I can't go through an entire session without some internals and/or technical depth so you should expect 200-300 overall. I'm very excited about this series because everything counts! What I mean is that each and every best practice you implement helps to bring about a more scalable, available and reliable solution - there are no magic bullets only great overall design can achieve these targets!

So, what are we going to talk about in 10 - 90 minute presentations..... a lot!

Here's a link to the first one and as we get rolling, I'll blog a lot more about additional resources, best practices and of course, I'll blog the webcast Q&A as I've done in the past.

Enjoy!

Scott Stanfield, a Microsoft Regional Director, had a great idea for Tech*Ed US (well, it was a great idea before he knew exactly how much work was involved :)...

He decided that he would get other Microsoft Regional Directors together to discuss (and record) interesting topics for just 10 minutes (of course, some of us went over... SORRY!). We could pick anything we wanted and then he would - with the help of a few other RDs such as Scott Golightly, Patrick "Beach Master" Hynds, Kate Gregory, and J. Michael Palermo - record our sessions to later edit and post on the GrokTalk site.

So, after A LOT OF HARD WORK - they're done! All 35 of the GrokTalks are posted and you can find them on the GrokTalk site: http://www.groktalk.net/

Finally, since I delivered a GrokTalk on Stored Procedures, I'm going to make sure that you have a comprehensive set of resources related to stored procedure optimizations here:

Well, now that should keep you busy...

Enjoy!!

NOTE/UPDATE: Be sure to read this post as well: Transaction Log VLFs too many or too few? after reading this one. Not only can you have too many small VLFs but if incorrectly sized, you can have too few! Enjoy!!

On a few of my last trips, I've had the pleasure of doing a lot of customer visits - visiting some of the larger implementations of SQL Server. Many of these implementations are at banks where both performance and recovery are critical. After my trip to Turkey (where I did customer visits prior to presenting at the Microsoft Professional Developers Summit), I received this email:

As you might remember we talked at xxxbank, Turkey about performance problems in transaction log backups and splitting tempdb data file into equal size chunks equal to number of CPUs. I have implemented both optimizations in my troubled server, and the results are great! Transaction log backups do not impact the server at all and with lowered congestion in tempdb, overall system performance has gone up as well.

So - this is my first of two blog entries targeting these two VERY typical problems:
* Transaction log fragmentation (both internal and external - this customer was having problems with internal fragmentation)
* Optimizing TempDB

We'll start with 8 Steps to Optimizing your Transaction Log - not necessarily in any order. It's best to review all of these to make sure you have a comprehensive view of how to improve both performance and recovery of your transaction logs!

1) Try to dedicate the transaction log portion of your database to its own phyiscal disk. In high volume OLTP system, isolating the transaction log can allow the disk head to be ready for the next write by not having other files contend for the physical disk resource. If your database already exists, the best way to "move" your transaction log is to detach your database (sp_detach_db) and then reattach it (sp_attach_db). Here's a great KB article titled: Moving SQL Server databases to a new location with Detach/Attach.

2) Defrag the disk(s) on which your transaction logs reside. This will get rid of external fragmentation of the transaction log - better known as disk file fragmentation. This will require that your server be taken offline but if your files have had a lot of autogrowth and/or they reside on a disk with a lot of other files that have been modified, then all of your files (incl. the transaction log file) are likely to be interleaved and fragmented. The good news is that this is a one time operation (unless the files continue to grow and shrink excessively - but we'll fix that as well). The bad news is that you'll need to shutdown SQL Server and then use some form of system tool - Microsoft or third party - to defrag. On Windows XP and Windows Server 2003 machines, the disk defragmenter tool can be launched by running: dfrg.msc.

3) Create only ONE transaction log file. Even though you can create multiple transaction log files, you only need one... SQL Server DOES not "stripe" across multiple transaction log files. Instead, SQL Server uses the transaction log files sequentially. While this might sound bad - it's not. If you want to get better performance out of the transaction log, place it on faster disks and/or a more performant disk (RAID) configuration. You could place the transaction log on a RAID 0 array but you'll likely want fault tolerance as well as performance. Isolation, as per number 1 is a good start but if you also need increased capacity then using hardware based RAID to combine disks for both better performance and redundancy is your next choice. See 4.

4) Not only should you try to isolate the transaction log to its own physical disk but you should make sure that the logical/physical disk configuration is as efficient as possible. Try to use an isolated RAID 1 mirroring set if you don't need significant capacity. If you need a greater capacity OR you want better performance, consider a combination of RAID 0 and RAID 1 (either RAID 0 + 1 or RAID 1 + 0). While RAID 0 + 1 can often offer better performance, RAID 1 + 0 offers better reliability. If you're new to RAID and are interested in learning more - check out the RAID Tutorial here: http://www.raidarray.com/04_01_00.html.

5) Don't be caught up in nothing but transaction log speed, you'll also want to make sure that your transaction log is always available as this can help you in times of disaster. Even if the data is damaged, if the transaction log is available and you have a series of backups up to and including the last transaction log then you can make a final backup of the transaction log that will represent all of the changes since your last transaction log backup. If this backup is possible (it's called backing up the "tail" of the log), then you can achieve up-to-the-minute recovery. This is only possible when the log is available. By placing the log on mirrored disks you can increase your chances of recovering data and minimize data loss!

6) Create transaction log files with a reasonable initial size. When you create a database it's ideal to PRE-ALLOCATE both your data files and your transaction log file. A little bit of capacity planning goes a long way... Now, if you think that you've got absolutely no idea how to size your transaction log you're going to need, well - here are the things that have the greatest impact:

  • Type of activity - transaction processing or decision support
  • Frequency of that activity - the more frequent the changes, the faster the transaction log will grow
  • Recovery Model - the recovery model of the database
  • Frequency of transaction log backups
  • Whether or not replication is used (since the log reader relies on the transaction log)

I wish I could give you a rough idea on sizing but if you look only at database size and none of these other factors, you could end up with a transaction log that's seriously oversized or seriously undersized. I've seen recommendations of 10-25% of the size of the data and you can use that but I would also add a bit of common sense. A larger database with very frequent transaction log backups may not need a transaction log that's even 1% of the size of the data... The best way is to setup your development environment similar to that of your production environment (including backup jobs) and then see how the transaction log grows. If you have a lot of autogrowth (because your guess was wrong), you can later clean up the fragmentation that has occurred and get back to a reasonable, intact, and optimal transaction log. If you want to know more about Recovery Models and their impact on the transaction log, review a sample book chapter from an MSPress book (Microsoft® SQL Server™ 2000 High Availability) that I helped to co-author. I only helped out on a couple of chapters but this specific chapter explains a lot about the general database environment settings that impact recovery. You can download Chapter 9: Database Environment Basics for Recovery here: http://www.sqlskills.com/resources/SQLServerHAChapter9.pdf

7) Don't let autogrowth get out of control. As important as capacity planning, you're not likely to be spot-on in your estimates. I don't recommend completely turning off autogrowth but in general I also don't like SQL Server 2000's default growth rate (or max size). In general, I would recommend setting ALL of the transaction log files attributes: initial size, growth rate AND maximum size. For the growth rate, I recommend something that can be allocated somewhat quickly and something of a fixed size. In general, I recommend a value which is less than or equal to 1GB (based on total size) but something that doesn't mean that you're going to autogrow again soon. So, for databases whose transaction logs are under 1GB then you might set autogrow to somewhere between 20 and 100 MB. For those of you who have transaction logs measured in GB, then I'd set the autogrowth to 500MB or 1GB. In general, I don't like the percentage because it needs to be calculated (which I realize isn't really all that big of a deal) but the larger the file the larger the autogrowth and the longer it takes. In SQL Server 2000, autogrowth can create blocking...so, it's best to minimize this in general.

8) Check/fix your internal fragmentation. OK, so this is the one that really helped the customer in Turkey. Often, when transaction logs are not pre-allocated and/or when there's been a lot of autogrowths, the transaction log can become internally fragmented. Internally your transaction logs are broken down into smaller more granular chunks called VLFs (Virtual Log Files). The size and number of VLFs you'll have depends largely on the size that the chunk is when it's added to you transaction log. If you add a new chunk to the transaction log which is 20MB (through autogrowth or through manual growth) then the number of VLFs that are added is 4. If you add a chunk which is greater than 64MB but less than or equal to 1GB, you'll add 8 VLFs. If you add more than 1GB then you'll add 16VLFs. In general, most transaction logs will only have 20 or 30 VLFs - even 50 could be reasonable depending on the total size of the transaction log. However, in many cases what happens is that excessive autogrowths can cause an excessive number of VLFs to be added - sometimes resulting in hundreds of VLFs. Having an excessive number of VLFs can negatively impact all transaction log related activities and you may even see degradation in performance when transaction log backups occur. To see how many VLFs you have solely look at the number of rows returned by DBCC LOGINFO. The number of rows returned equals the number of VLFs your transaction log file has. If you have more than 50, I would recommend fixing it and adjusting your autogrowth so that it doesn't occur as fequently. To get rid of all of the execessive VLFs, follow these easy steps to shrink off the fragmented chunk and add a new, clean chunk to your transaction log:

1. Wait for an inactive time of day (ideally, it would be best to put the database into single user mode first) and then clear all transaction log activity through a regular transaction log backup. If you're using the simple recovery model then you don't need to do a log backup... Instead, just clear the transaction log by running a checkpoint.

 BACKUP LOG databasename TO devicename

2. Shrink the log to as small a size as possible (truncateonly)

DBCC SHRINKFILE(transactionloglogicalfilename, TRUNCATEONLY)

NOTE: if you don't know the logical filename of your transaction log use sp_helpfile to list all of your database files.

3. Alter the database to modify the transaction log file to the appropriate size - in one step

ALTER DATABASE databasename
MODIFY FILE 

      NAME = transactionloglogicalfilename 
    , SIZE = newtotalsize
)

And, that should do it!

Many of you in the .NET Community are already familar with theServerSide.NET but many of you in the SQL Server community are not...well, here's your chance to bridge the gap over to .NET and hear a bunch of great .NET interviews! My interview is more SQL-centric but there are lots of great ones on Development in general and even one on SQL Server 2005 Reporting Services by Jason Carlson.

My interview was recorded with them back in February when I was speaking at VSLive in San Francisco and they've been saving them up, editing them and then posting an interview roughly each week (mine is the 29th posted!). The interview was great fun and Paul Ballard asked some really interesting questions about tuning, indexes, stored procedures and SQL Server 2005. Mostly it's an interview about optimization tips in both SQL Server 2000 and 2005 and just in general - things to look forward to in SQL Server 2005.

Here's a link to all of theServerSide.NET interviews: http://www.theserverside.net/talks/index.tss 
Here's a link specifically to mine: http://www.theserverside.net/talks/videos/KimberlyTripp/interview.tss?bandwidth=dsl
Here's the link to Jason Carlson's Reporting Services interview: http://www.theserverside.net/talks/videos/JasonCarlson/interview.tss?bandwidth=dsl and finally, here's a TechNet Webcast that Jason did titled: Authoring Reports in SQL Server 2000 Reporting Services (Level 200)

Enjoy!

On Tuesday, June 21, I had the pleasure of assisting on a TechNet webcast delivered by Michiel Wories - a Program Manager from the SQL Server Development Team at Microsoft. My assistance was limited as I really just listened and helped answer some of the questions that came up during the session. In fact, there were a few of us fielding questions and in general, there seems to be a lot of excitement around this new tool. The focus of the webcast was a new utitlity called SQLCMD. SQLCMD is a command-line tool that replaces OSQL and allows a lot of new functionality, such as: parameterization, access to environment variables, complex scripts that connect and execute against multiple servers, MUCH better code page support, input scripts, output scripts and error handling within scripts. Well, you'll probably learn more if you just watch the webcast! Below are a bunch of resources related to SQLCMD, I hope you enjoy.

Michiel Wories webcast titled: Scripting in Microsoft SQL Server 2005 (Level 300)
Michiel Wories blog entry with all of his webcast demo scripts: http://blogs.msdn.com/mwories/archive/2005/06/21/scripting_webcast.aspx

As for additional resources:
I've written a couple of blog entries that related to new SQL Server 2005 tools and specifically SQLCMD here:
     SQL Server 2005 - Management Studio Scripting Options and SQLCMD (loving it!)
Click here for the full group of my SQL Server 2005 blog entries.

Click here for the full list of all TechNet on-demand webcasts.

Click here for the list of “Get Ready for SQL Server 2005” resources!

Last but not least, here are the blogs of the other MVP's who helped answer questions during the SQLCMD session - lots of great stuff to keep reading!
Louis Davidson, Steve Kass and Kent Tegels

Categories:
Events | Resources | SQL Server 2005

I think there are numerous reasons for why I love technology but at the top of the list: learning. It's amazing to me that not a day goes by where I don't dig deeper into something or clarify it further. Even learning something trivial, like a new keystroke, can make our work easier to do and make us more productive at it. Things are constantly changing; the one thing I do know about technology is that there's a lot to learn!

So, to start, thanks for everyone's feedback (esp. Steffen Krause here) on some of my recent posts regarding LEFT and RIGHT based partitioning functions here. There was a lot feedback regarding the simplicity in syntax and declaration of a RIGHT-based partition function and by making the first partition of a RIGHT-based partition function empty, YOU'RE RIGHT! We can eliminate the need for data movement. There's no performance difference and it doesn't matter internally which type you choose but - I've heard you all loud and clear! You don't like dealing with the imprecision of a datetime data type when specifying upper boundaries. So, having said that - I need to make a few changes. In my next revision of my presentation materials, whitepaper, scripts, etc. I'll work to give both perspectives. For some reason, I still like LEFT-based partition function but RIGHT is definitely easier to define.

So, keep it coming everyone. You've hit the nail on the head. The fun part about technology is... no one knows everything and we're all always learning!

Enjoy Tech*Ed!

Blog edits brought to you by Richard Campbell (long story)

For the past couple of years, the early betas of SQL Server 2005 were very limited and even the events were somewhat limited - limited primarily to large customers under NDA, MVPs, RDs, and very dedicated SQL Server specialists. So far, only the SQL Express 2005 Edition made it on microsoft.com for download (click here if you're interested in SQL Express and the SQL Express Manager downloads)...

SQL Server Beta 2 and soon SQL Server Beta 3 have had slightly wider distribution, even being handed out to SQL Server conference attendees (such as at now PAST SQL Connections conference in Orlando), etc. but in most cases you need to have your own clean machine to install on and play with and you really need to have a completely spare machine as you don't want to impact any of your production and/or otherwise critical applications. For some, this meant not installing it at all.

If you're still waiting to get your hands on it, you have a new option! The SQL team has created a virtual lab environment.

Below is the promotional blurb I received. And, I decided to check it out for a second. When you click you'll need to install an ActiveX control, register and then login. Once logged in, you'll select your lab environment and then they will “build it”. Once selected, you'll have 90 minutes to complete the lab (and you can only enter one lab at a time) and you can download/print your lab manual (pdf format) to help you along. The feel is very similar to using Virtual PC and the build that was used in the labs I entered - was the December CTP of SQL Server (IDW 11). The current beta on Beta Place (for SQL Server 2005 Beta Subscribers is the February CTP which is IDW 13). Performance in my quick look around (of menu items, SSMS, etc.) seemed pretty good but I also have a decent internet connection. I started this process but didn't actually complete a lab yet... it's certainly worth checking out! Now you don't even need to install it yourself, you can use a Virtual Lab environment. If you do check it out - good or bad - let me know!

*************************************************

Are you ready to experience SQL Server 2005?

Announcing the launch of the SQL Server 2005 Virtual Hands on labs. In these labs, you will get to experience many of the new features in SQL Server 2005 including CLR integration, XML support and deep business intelligence integration.

Just follow the link and experience SQL Server 2005 for yourself!

Registration link:
http://msdn.demoservers.com/login.aspx?group=sql2005

*************************************************

Starting the week of December 6th, Microsoft will launch a series of webcasts focused on SQL Server 2005 Development. During the week of December 6th, Microsoft/MSDN will host 15 webcasts introducing all the new features of SQL Server 2005 for Developers. Then, starting in January, there will be at least one webcast per week focused on specific developer topics for SQL Server 2005.

There are a bunch of benefits to watching this series (in addition to the wealth of information provided ;):

  • Everyone who attends a webcast will receive a copy of the Beta 2 Resource Kit and the Beta 3 Resource Kit when it ships. 
  • The first 1500 people that watch 5 or more webcasts will also receive a special, limited edition SQL Server 2005 Webcast T-shirt.
  • All webcast viewers will be entered into a competition to win an XBOX, one for each day of the week. The official rules are here.

The official site with session information, dates, times and abstracts is: http://msdn.microsoft.com/SQL/2005Webcasts and if I hear of any updates I will also keep you posted on further details, etc...

Categories:
Events | Resources | SQL Server 2005

The title is a common question I've received in the past and I thought I'd take a few minutes to explain a bit about keys and indexes...

This is by no means a lot of detail regarding relational theory, etc. but there are a few things that we should quickly review to make sure that the basis for indexes being created makes sense. First, from a relational theory perspective every table must have a primary key. From SQL Server's perspective it's not a requirement but it's generally a good idea. A primary/unique key are entity identifiers. Each are a unique way of identifying a row. There are subtle differences between the two - in implementation:

Primary Key

  • In SQL Server the Primary Key is enforced through a Primary Key Constraint.
  • None of the columns that make up the primary key allow nulls.
  • The values in the Primary Key must be unique - to enforce uniqueness (as well as make it efficient), SQL Server creates a unique clustered [composite] index on the column(s) that make up the key.

Unique Key

  • In SQL Server the Primary Key is enforced through a Unique Key Constraint.
  • The columns that make up the unique key CAN allow nulls but not for more than one complete key. Meaning that allowing Nulls on a single column unique key really only allows NULL (only one NULL value). Overall, allowing Nulls in a column (like Social Security Number) doesn't really make much sense but if you need to then you can't go with a unique constraint - instead consider a unique index. At that point, I typically get the question of what's the difference between a unique key and a unique index...
  • Allowing Nulls values in the columns that make up a composite unique key makes more sense as long as the complete key is not null for more than one row.
  • The values in the unique Key must be unique and to enforce uniqueness (as well as make it efficient), SQL Server creates a unique NON-clustered [composite] index on the column(s) that make up the key.

What's the difference between a unique index and a unique constraint?

  • A unique key CAN be referenced by a foreign key constraint and a column which has only a unique index cannot be referenced.
  • Constraints are checked before indexes and this can lead to a large multi-row insert/select or update to fail before modification. However, indexes might (for a large modification) be validated at the end (instead of row by row) so a large modification that has a failure will need to rollback at the end of the modification rather than before. This is a good point - and one I hadn't really thought of until I was poking around some of the Q&A on SQLMag's website. Here's where I found it.

So, all of this leads me up to the original question (yes, you knew I get here someday :) and that's “When did SQL Server stop putting indexes on Foreign Key columns?”

First, SQL Server has NEVER put an index on a foreign key column... Indexes are used (as described above) to make the lookup (in a primary or unique key) for a duplicate value FAST. If the keys are ordered then checking to see if one already exists is trivial (i.e. fast). There is NO reason for SQL Server to put an index on a foreign key column as the column does not (and probably would never be) unique (if it is then it's likely to have a primary or unique key on it as well - as in a 1-1 relationship). So, that leads me to another key point...

Are there any benefits to indexing foreign key columns? YES

  • Better performance on maintaining the relationship on a delete of a primary/unique key. When you delete a key row, SQL Server must check to see if there are any rows which reference the row being deleted.
    • If the foreign key relationship is defined with NO ACTION (on update/delete) then a referenced row CANNOT be deleted as it would leave the referencing rows “orphaned.” To find the rows efficiently an index on the foreign key column helps!
    • If the foreign key relationship is defined with CASCADE (on update/delete) then when a referenced row is modified all of the referencing rows must be modified as well (either updated to reflect the new value or on cascade delete). To find the rows to modify efficiently, an index on the foreign key column helps!
  • Better join performance - for many of the reasons above, SQL Server can more effectively find the rows to join to when tables are joined on primary/foreign key relationships. However, this is NOT always the “best” indexing choice for joins but it is a good start. 

Finally, if you want a few titles related to relational theory check out these links:

An Introduction to Database Systems, Eighth Edition
     by C.J. Date
     E.F. Codd

Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design
     Mike Hernandez

Have fun!
kt

Categories:
Design | Indexes | Resources | Tips

While getting ready for the PASS Community Summit in Orlando this week (where I plan to talk about SQL Server 2005 Partitioned Tables/Indexes), I realized we were close but not quite ready to release the post-beta II whitepaper on MSDN.

So, in preparation for PASS we decided to post a pre-release of the partitioning paper here.

Enjoy!

See you in Orlando... eventually?!

Categories:
Events | Resources | SQL Server 2005

I began setting up for Profiler and Performance Monitor by first launching Profiler and then launching Performance Monitor within it (there's an icon for PerfMon second from the Right or you can select Performance Monitor from the Tools Menu). I had already created a Performance Monitor Counter Log so all I needed to do was start it (more details on PerfMon Counter Logs later).

 

Tips:

·                     If you want to correlate Perf Mon events with Profiler you will need to do this from SAVED Profiler Traces and PerfMon logs. Real-time analysis is not supported.

·                     The correlation is done based on the time of the events so it is important that the times are correct. You should make sure to either:

·                     Run these on the same machine

·                     Make sure that the machines are in sync in terms of time.

·                     The correlation is a bit tough if there are TONs of events as Perf Mon’s granularity for sampling is 1 sec and Profiler can return a large number of events in a single 1 sec range… SO – filtering (and you can do this later) will be a very good idea. Regardless, this is a GREAT feature to see how certain counters (memory, disk, CPU, etc.) look over the course of long running activities.

 

If you're not familiar with creating a Performance Monitor Counter Log click here.

 

Ok, so that was mainly setup. For my first real demo I wanted to show Profiler and a few of my favorite new features. Here’s a quick list:

 

When creating a trace there are ONLY two dialogs: General and Events Selection. The General dialog is similar to the old General dialog where you choose the template, the save to file and/or table options, etc. The Events Selection dialog however, is ALL NEW. The Events Selection Dialog has a ton of new and incredibly useful changes:

  • Only data columns that produce values for each of the specific events are shown and only those data columns can be selected (check boxes). See this dialog here: EventSelectionDialog.jpg (56.51 KB). Notice that there are blank spots where an event doesn't produce a data column.. yeah!
  • Filters apply at the column level. See this dialog here: EditFilterDialog.jpg (16.74 KB). And you can state whether or not events that do NOT generate a value for the filter are sent (yeah!!!). However, I do remember having a bit of trouble with this one...
  • You can pause an active profiler trace, change the events and/or data columns and restart it… all of the additional events go to the same trace file and you do not lose the previously started trace, etc.
  • Deadlock graphing – this is an Event – and it produces a graphic display of the spids that were involved in the deadlock. You can “extract” these events (right click on the event “Deadlock Graph“ and choose “Extract Event Data“) and then open this xdl file within SQL Management Studio so you can spend more time analyzing it.

So, once setup I created a deadlock scenario, showed the profiler output for deadlock graphing, paused the trace, removed the deadlock related events (really only because I could J) and then opened another window to generate some simple activity. So – speaking of the simple activity... I created a “demotable” that had defaults for all columns and then I created an insert statement using the DEFAULT VALUES clause. To get this to execute repetitively I could have written a complex (not!) loop such as WHILE 1=1 but instead I chose to use go N (mostly because people don’t tend to know about this one). Here’s the full extent of my “code”

 

            INSERT DEMOTABLE DEFAULT VALUES

            GO 100000

 

This will cause that statement/batch to execute 100000 times.

 

Once that was running, I wanted to generate more activity to profiler so I decided to go back to show another favorite from Management Studio. In Management Studio, I had two registered servers waiting: one is a SQL Server 2000 server and the other SQL Server 2005. I connected to both. One of the databases on my SQL Server 2000 system is a schema from a Microsoft.com data warehouse. The reason why this is interesting is that this database holds 300+ tables and 1300+ views (not to mention many other objects). While this isn’t the largest database out there (at 450GB) it certainly has issues in terms of finding objects and managing your “view” of what you’re trying to manage. So – in the SQL Server 2005 Management studio you’re able to create “Filtered” Views. By right clicking on Views, you can create a name filter. For example, all of their user objects have a certain naming convention which includes “WMU”, all of their rollup counts have the word “count” and the counts refer to a dimension of time (weekly, monthly, etc.)… So, if I want to see all of their user related views with counts by week, I can create this filter WMU%Count%Week. Using this filter, the object list drops from 1300 views down to only 12. And – this demo was using the SQL Server 2005 Management Studio against a SQL 2000 server. Nice.

 

So - now with a bunch of activity generated I can go back over and get the data to correlate. I stop the trace. I stop the Profiler Log...

 

I re-open the profiler trace (only because this would likely to be done later... I'm not sure if I've ever just stopped it and then integrated perfmon... that should work?). Anyway, only AFTER the entire trace file is loaded with the menu option to integrate perf mon counters become available. Choose “Import Performance Data“ from the File dropdown and wham - IF they have corresponding times you will see how the times match (remember, if this is on two machines and the times aren't in sync then the correlation could be virtually meaningless). Also, if they don't intersect you'll get an error: Correlation is not possible because there is no intersection between trace and performance data time ranges. Not bad!

 

If the two DO correlate in terms of time, then you'll get a new dialog which will show all of the performance monitor counters that are IN the perfmon log. The idea here is to “limit“ the number of counters to correlate. In fact, the dialog is called “Performance Counters Limit Dialog.“ OK, so my log has only a few counters so I'll just select them all. Click Ok. Once clicked your performance monitor dialog will show both - profiler data and perfmon data. Here you can select points in time and see what performance events occurred or choose profiler events and see what the perfmon counters look like. Here's what it looks like:ProfilerPerfmon.jpg (167.51 KB). You can also highlight chunks of the graph and expand them to show only a specific time range. You do this by clicking in the graph and then drag the mouse to create a rectangle.

 

OK, so in summary - I REALLY like this BUT I think you'll need to play with this a bit. I think it's a great feature but the range of values for correlation for Performance Monitor only being 1 second makes it a bit tough to see exactly what event caused what spike (if you have thousands of events per second in Profiler) BUT - it's good to get you close!!! And much closer than anything we've had before!

 

Have fun.

kt

Categories:
Events | Resources | SQL Server 2005