Coming up quickly, there are some great events to help you with best practices, tips and tricks and learning SQL Server 2008 R2 (which has just been announced to release in May this year). You can read about the upcoming R2 release on the "Data Platform Insider" blog post here.

So, how do you keep up?

A great way is through conferences (for broad tips/tricks and new features) and/or workshops/classes (for deeper content - usually with more focus) - where you can spend 3-5 days to completely focus on learning, networking and creating your new work ToDo list!

As for where we're going to be, here's our upcoming list:

March 8 - 12: Toronto, Canada - DevTeach

Here we have a precon (Indexing for Performance) and postcon (Developers Can't Ignore Database Maintenance) and 2 sessions (Optimizing Procedural Code and Tips/Tricks for Proper Table Design). This conference will be packed full of real-world advice on a variety of topics - with some SQL (more developer-oriented) - and has a lot of recognized speakers from around the world.

March 29 - April 2: Boston, MA - SQL Server Immersion: Internals, Performance, and Availability 

This is a full five day Immersion Event where we get to "make your brains hurt" (that's a quote :) with content. We delivered a similar event in Ireland (and Australia) last Fall and you can read the event report by clicking through the photo to get to the pdf. And, we're currently in the planning stages for heading back to Ireland and Australia this year as well! As for Boston - we can't wait to get back there. I lived there in the early 90s (but only for a short while before moving to Redmond) and Boston is where Paul and I met. It's a great city, we're really looking forward to it.

April 12 - 16: Las Vegas, NV - Microsoft SQL Server Conference and Expo (SQLConnections)

This year is a part of DevConnections where we have a full set of tracks solely focused on SQL Server and this year - specially focused on best practices, tips and tricks - and SQL Server 2008 R2. This conference is also focusing on the Visual Studio 2010 launch. 

We hope to see you there!
kt

Categories:
Conferences | Events

Ha... I'm finally sitting down today to write my homework. Homework created by my lovely husband when he tagged me in his blog post: http://www.sqlskills.com/BLOGS/PAUL/post/What-three-events-brought-you-here.aspx.

Obviously, there are many factors which lead people to where they are now. Some influences are small – almost undetectable. While others are large – and can’t possibly go without notice. Finally, some changes are fleeting while some last a lifetime. Really, it takes a lot to make a person – to make me/you. It’s hard to break this down into 3 – especially as I’m planning to focus on where I am – in this industry (and wrt to SQL Server - as it's been such a large part of my life!). As a person and the values that I have, I thank (blame? ;-) my Mother and Father. They were instrumental in giving me the personality that I have and the compassion that I value. Technically, however, my Mother doesn’t even have a computer and it wasn’t until last year that I convinced her that she needs a cell phone. Ironically, I think I’ve only called her on it once (maybe she didn't need it?). My Father is no longer with us but I’m certain his biggest influence was his value for adventure. (Donald Wayne Tripp: http://www.sqlskills.com/BLOGS/KIMBERLY/post/Donald-Wayne-Tripp-August-6-1944-November-28-2007.aspx)

As for computers, computing, databases and SQL Server…

#1: Jeremy Smith, PCDistributing -> Computers!

The first and most consequential was meeting Jeremy Smith - who, at the time, was President/Founder of PCDistributing (PCD). The funny part about how we met was that I worked at the local AM/PM Mini Mart and gas station (I even did full-serve on occasion) on the corner of Lake Ave. and Milwaukee Ave. in Northbrook, IL - only 1 block from PCD. PCDistributing had an onsite credit card (a card they could use only at our store) for the executives (Jeremy and Marie). I knew he worked with computers and well, I was interested in a better job; I was 16. During one of his stops in the store, I asked if they were hiring and he said - stop by and ask for me.

I worked after school and summers for PCD through my sophomore, junior and senior years and my job mainly consisted of data entry. However, I was pretty good at it and computers definitely interested me. I ended up starting night school at Loyola University (which was just down the street from where my Father lived in Rogers Park, Chicago) while in the 2nd half of my senior year of high school (my Mother moved from Northbrook to Vernon Hills, IL).

At Loyola I really learned a lot - some of which I learned from my partners in crime at the Data Center (where I worked). This would be Ross Naheedy (who I affectionately *still* call geekbag), Yatrik Shah (aka Yatman), Michael Dillon (aka Matt), James Kalemis, Carlos Talbot, Alvin Paul, Amy Overmyer, Cesar Lopex, John Long, Shelley Naheedy (now ;), Mike Abel, and Rita Moy - all of whom are on facebook (which is excellent!). This was an absolute blast. I remember many LATE nights and/or all-nighters in the data center. Probably the most infamous is one (probably more than one) where Ross and I wrote a program to trigger the PC alarm to wake us up from a nap we finally decided to take on the floor of one of the classrooms - where we had been geeking out all night. I have no memory of what we were playing with/on or if we were just writing code (Pascal probably, maybe LISP [I really enjoyed LISP]) - and I think it was on PCs but it may have been the mini or even the System 370 mainframe.

And, that brings me to #2 (even though I have to admit - my escapades with this motley crew definitely had an effect...).

#2: System 370 Assembler, Dr Hamilton and IBM -> Ward L. Christensen

At Loyola, I majored in Math/Computer Science and minored in Fine Arts (Graphics/Design) and Philosophy. Ah, it's no wonder that I never finished my degree. Can we say left-brain/right-brain confusion (or ADHD ;-) with too many distractions (boys, bars, jobs that paid money!). I did end up finishing most of my math/comp sci and even a fair portion of fine arts/Philosophy (I loved ethics) but I never did get my degree as I just loved working. And, in fact, this is where things are really interesting. My instructor for System 370 assembler (Dr. Hamilton) gave my name to a Marketing Manager at IBM (Dennis Kapral). Dennis called me one afternoon and I remember the call well, “Hi Kimberly, my name is Dennis Kapral and I work for IBM. We'd be interested in having you come in and apply for a job as a Marketing Support Assistant for the GET (Government, Education and Transportation) branch.” Er, I thought this was a joke. I interviewed with Dennis and then he passed me on to another gentleman for my technical interview. I remember this interview still - we chatted about how harddisks worked and the interview flew by. I remember this gentleman dropping me back in Dennis' office where he said, "Hire her!" and walked out. Dennis joked that he was impressed that I held a conversation with Ward and well, that was that - I was hired.

Working for IBM was a blast. I worked with a sharp group (including Ward, Mike Jania and many other folks that were fantastic but I’m struggling to remember everyone’s names) and my position was all about marketing. In '89, I learned PageMaker (on the 286/386) and I created an advertisement for our College Education team (that was the main area with whom I worked in GET). We were marketing the new PC line (286/386) to colleges around the country and I designed a simple advertisement that ended up going nationwide. See here:

But, you really need to read the text of the PCs for sale... especially the "Lightning Bolt" 386 (click the picture for the larger version):

To track all of the customizations (school name on the front, school colors for the titles, 8x11 brochure or folded/mailer style, etc.), I needed a database. I used DataEase. The advertisement was a huge success. In fact, we cross charged the other branches a *VERY* small fee (something under 1 cent per brochure) and we actually made some money for the branch. I won a "Branch Managers Award" (it was a first for an MSA) and I was even invited to the branch party at Arlington Race Track (I think it was Arlington). And, this was a HUGE deal because MSAs were not supposed to attend; they had to get Regional Manager sign-off for this. So, that was pretty cool!

The long story short is that I loved the advertising side but this database stuff was even better. A new project came up (working with FoxBase) and they asked Ward who they should hire and he said me. They asked him if I knew FoxBase and he said it didn't really matter (THANKS Ward!!). After 9 months coding in a large corner window office in 1 IBM Plaza (fyi – the office/view was gorgeous – I must see if I can find a few pics) with Stan Podolsky, the project was over and I really loved databases. Ward had heard of a small company in Chicago that was into training (and he knew I enjoyed training as I had delivered courses on WordPerfect at the Computer Center at Loyola) and the next thing I knew, I was working with LAN/mind. They were a training and consulting firm that specialized in NetWare but were branching (more and more) into OS/2, Windows, LanManager and this new thing - SQL Server (thanks to Marshall Olsen).

Oh, and here's a throwback to the 80s - my badge pictures from my IBM badges (I can't believe I'm posting these??):

#3 Marshall Olsen, SQL Server and finally, Microsoft

I started working for LAN/mind and immediately loved the Microsoft side of things (absolutely LOVED the team - Marshall Olsen, Jeffrey Starzec, Garry Forman, Charlie Spencer, Alice VonHansen (RIP Alice) and many others - we had a great time there). With LAN/mind I became a Microsoft Certified Professional (MCP #128 (I just recently stumbled on my old card :)) and soon after I applied to work for Microsoft - specifically, Microsoft University. I still feel like I could have spent a lot more time at LAN/mind but I loved Microsoft and I had fallen in love with the Pacific Northwest on my first trip out here. As for MSU, it was a great place to work as well - trainers played practical jokes on other trainers and once my good friend Robbie Laws (RIP Robbie) rollerbladed through my classroom.

However, it wasn't long after I joined Microsoft that they realized local training by their own trainers just wouldn't scale. This is when Certified Partners and Solution Centers really kicked off and MSU slowly turned into only content development (this is now Microsoft Learning). I joined the SQL team (and learned a lot and worked with some amazing folks!) but never really found my groove.

I loved writing, training, speaking at conferences and consulting... what do I do? I start my own company.

The rest has been here (SQLskills) for the past 8-9 years (well, that's when I started the web presence) and before that I did consulting/training mostly through word of mouth (since 1995). Speaking at conferences has also been an amazing experience. Not only have a traveled the world and met some of my best friends but it’s conferences that brought Paul and I together. The funny part is that we had exchanged emails on a few problems (as early as 2002 – we think) but it wasn't until TechEd 2006 where we first met. That was probably my biggest life changing event... As you know, there’s a lot of energy around Paul and as you can imagine, he came into my life like what I expect a huge steam engine roaring into the station would seem like... he asked me to come aboard and well, the rest has been smooth traveling (and lots of it!) since.

So, in a nutshell - I blame (but in a good way):

#1: Jeremy Smith

#2: Ward Christensen

Sidenote: It's wonderful to cross post with Brent Ozar's “What inspired him” post on how BBSes were so instrumental… why? Because Ward (and his business partner at the time - Randy Seuss) invented XMODEM and it’s this which led to the first public dial-up BBS. So, I think both Brent and I (and quite a few other folks too ;-) owe a lot of our direction to Ward! And, I still chat with Ward after we reconnected on Facebook. Even today, he commented on a picture from my early days (it’s “retro photo posting time on FB”) and quite a few of us were trying to think of where the time has gone.

#3: Marshall Olsen

What has and who have inspired you?

Or possibly better yet, where do you want to go from here?

Thanks for reading,
kt

PS - Be sure to check Paul's blog post for trackbacks and links to other great stories/posts. Enjoy!

Categories:
Personal

Data Dude (Visual Studio "DBPro") has generally been an expensive tool - only in the "Team Systems" Edition of Visual Studio. But, that's all going to change in VS 2010. Here's a picture of the new editions and which components of Data Dude are where:

This is a HUGE improvement (after the clarification isn't not quite as huge but still VERY useful that the tools are in different editions) and is going to allow many of you to use/leverage these key features (such as Data Generation and Static Code Analysis - those are some of my favorites :)) in smaller shops - without having to purchase the full-on Team Systems Edition of VS. If you're interested in more info - check out the following:

CLARIFICATION!!

In 2010 “Team System” goes away (as a brand), and it's been replaced with VS 2010 Ultimate and Premium (VS Pro is still there). Basically VS Premium is similar to VSTS Development Edition + Database Edition (and is priced the same). Ultimate is similar to Team Suite. 

The functionality didn't really move down the product line (or the price point), the big different is that there's more than only DB Pro tooling in there now!

Thanks to Doug "D7" Seven for setting me straight!

Enjoy!
kt

Categories:
Tips | VS DBPro

This is a bit overdue as Kevin Kline tagged me in a meme post to help people get ready for their travel to PASS. However, we were too busy with travel (we were in Australia at the time) to get to it. And, while this year's PASS has past, it's not too late for travel in general OR for next year! I suspect that a few folks are already planning for PASS, or Connections, or other business travel! So, here we go - my travel recommendations/tips.

Travel tips:
  • Join the frequent flyer program even if you're not a frequent flyer. As long as you travel within another 12 months (and sometimes even longer) your miles won't expire... Miles (for upgrades or free tickets) are always good :) and even if you're only a 2 or 3 times a year flyer - they'll add up (again, as long as you keep traveling).
  • Try to pick an airline and stick with it. The more you fly, the higher the bonuses. In fact, I'm so focused on flying just one airline that I'll often fly two flights (for example, SEA -> DEN, DEN->MCO [Orlando]) even when another airline (like American) has a direct SEA->MCO. This happens more than I would like but it also means that I get things like extra bonuses as the end of the year. For example, as I hit each 25K after 100K (ah, I really do fly *A LOT*), I get 2 extra systemwide upgrades. These allow me to upgrade (at the time of booking) for any international flight from departure to destination. So, I can upgrade SEA->MEL (even though that's actually SEA->LAX, LAX->SYD and SYD->MEL). But, systemwides are only given at the highest levels... so, you really have to fly a lot and you MUST stick with the same airline. My airline of choice: United. Also, to help you earn points faster - be sure to check the airline promotions REGULARLY. For example, many airlines offered a "double your EQM (Elite Qualifying Miles)" for Sept to Dec flight... all solely for registering. This might get you to the next level with little work! Here's United's page on promotions: http://www.united.com/page/article/1,,52102,00.html?navSource=Dropdown07&linkTitle=earnmiles.
  • When you book, start by doing some research online. Try general travel sites (Expedia, Travelocity, etc.) first but be sure to check the specific airlines as well. And, you *might* need to get creative if your itinerary is messy. Paul and I recently had to travel to Dublin and then Warsaw on the same trip but when we booked the entire set of flights the tickets were OUT OF CONTROL expensive. We ended up booking SEA->DUB and then WAW->SEA (it's called an open jaw) and we saved over 60% (no, I'm NOT kidding). Now, don't get me wrong - we spent something like 5 hours coming up with different options/combinations both online and on the phone. And, in many cases this kind of thing costs more not less so it can get tricky; it really isn't all that common but can really help. One thing of which to be aware is that connecting flights on a different ticket may cause you A LOT of grief (even though they can save you $$$). If your final destination if DUB and you purchase one RT ticket (let's say SEA to LHR) and then a separate RT ticket LHR to DUB but you plan to connect in LHR to fly immediately to DUB... some airlines/partners do NOT allow the baggage to be checked all the way through - this means that you'll need to pick up baggage in LHR and check-in again (and since this is international - it means going through customs/immigration). What this means is that you'll need more time (so pick those connections with 2-3 hours - maybe more). And, if there are any delays, etc. you won't be a high priority for them... this is where it can get messy. But, I have to admit this is something I do OFTEN. I always *try* to fly on partner airlines (meaning Star Alliance) as I have "Gold" privileges BUT, this doesn't always help (when checking in). However, it can help while en route. Again, fly the same airline/group and it does have its benefits.
  • Seats – when you book, try and get your seats assigned. And, be sure to check out your seat on SeatGuru (http://www.SeatGuru.com). If you book online (which can often offer discounts) be sure to follow up by calling the specific airline to get your seats assigned. In fact, if the price isn’t all that much better you might be MUCH better off using the specific company’s website and/or phone system so that you CAN get your seats assigned. And, in the worst case, if you can’t get your seats assigned – get to the airport EARLY (really target the 1.5-2 hour rule) and then be VERY nice to the folks that check you in… see if you can get an exit row or bulk head. Hmmmm, bulkhead seating is debatable. On some planes, bulkhead
Getting ready for travel:
  • A couple of weeks prior – go through all of your dates and itineraries and double-check everything. Once, I arrived at an airport for a flight and I had been booked on the wrong day. I remember the phone call well – I had been going back/forth about specific dates (and costs) and in the end, my E-ticket was wrong. Had I double-checked it, it wouldn’t have been a problem… Luckily, seats were available and all was well (and, I had elite status) otherwise, it could have been a nightmare!
  • Is there anything special you’re taking? Does it need to make it to the cleaner, etc.? Are you going to remember to pick it up in time? Really, planning a bit for this is important. I’ve forgotten to go to the cleaner the night before a trip only to realize that my flight is so early that I can’t even stop on the way to the airport. Now I need to go to plan b. This is why Paul creates a comprehensive “Pack List” and he’s got it online. When he thinks of something he wants to take on a trip, he updates this doc. Then, the day before he packs – he double-checks it, prints it and then starts packing. I’m pretty good at remembering most things so I haven’t done this but I have forgotten a few things over the years and travels. So, I’d really recommend this and I’m probably going to do this moving forward too!
Making the travel less painful:
  • Sleeping. Does melatonin work for you? Or, have you even tried it? If you’re crossing many time zones, you might want to pick up a bottle of this. It’s an anti-stress/herbal sleep aid and taking it right before bedtime might help you sleep better and adjust more quickly.
  • Sinus rinse. I swear by this and I honestly think this helps me avoid getting sick on some flights. I rinse my sinuses twice daily regardless of travel but given that you’re traveling with so many other folks and breathing recycled air – rinsing your sinuses can’t hurt. And, you might want to pick up some saline solution for in-flight moisture. Here's a link for the specific sinus rinse I recommend: http://www.neilmed.com/usa/products.php#isotonic. And, they have a nasal moisturizer as well: http://www.neilmed.com/usa/products.php#nasogel.
  • Flight socks. Honestly, I just recently found out about these and I LOVE them. My feet can swell up a bit while flying (I just recently had foot surgery) and now, I wear these even for shorter flights.
  • Slippers. Yes, I actually travel with wool slippers. Seriously, I get cold...really cold... And, when I travel with these they make a HUGE difference. Especially for long haul flights. These aren't hard slippers - more like thick socks (easy to pack).
  • Snack. Even if you do get an upgrade – having a snack with you might make an otherwise hungry time a lot more bearable.
  • Water. I buy water in the airport prior to boarding so that I’m never without. Flying can be very dehydrating and that will only help you get sick.
  • Entertainment. Have something ready for you to read, watch, work-on… whatever. This is always my excuse to buy the trashy mags at the airport before I board. However, I have to admit that I know fewer and fewer of the stars referenced. Not sure how I’m supposed to feel about that?! J

Well… I’ll plan to come back to this and update it with other tips/tricks as I think about them!

Safe – and happy – travels to all! And, I look forward to seeing you at an event sometime soon…if not, at PASS, or Connections again next year.

Thanks for reading!
kt

Categories:
Conferences | Opinions

At PASS we hung out with Sean McCown and Jen McCown (of MidnightDBA) and chatted... and, despite our better judgement, it was recorded. ;-) These aren't overly technical but they are fun! Enjoy!!

And, our favorite "travel tip" from the video is to know your plane and check your seat on http://www.seatguru.com/. In fact, I have a few other travel tips and I'm overdue on a meme that Kevin Kline tagged me in so I'll post that shortly as well.

Enjoy!
kt

Categories:
Interviews | Resources

Well... it's finally been released and last week was the book launch. This book came together first as a goal by Paul Nielsen who was motivated by Steve Ballmer's request that MVPs "give back" even more and even outside of the direct technical community. Paul had the idea of a book - written for charity - and contributed to by MVPs. Well, here it is!

The book contains content from 53 MVPs with Paul Nielsen, Kalen Delaney, Greg Low, Adam Machanic, Paul S. Randal and I as editors. It was great fun going through all of the content and we're really happy that it's now available!

To purchase, click on the book (to the left) and purchase directly from Manning. When purchased through Manning a larger amount of the proceeds go to WarChild.

Finally, many of us were able to get together at PASS and "launch" the book. We gathered around a long table and people (if interested) had their copies signed. It was great fun! When it was all done - we took the group shot below.

Thanks for a great week (to all that were at PASS) and for great fun working on this title!

UPDATE November 16, 2009: After a few comments, I realized that a few people were tagged incorrectly in my facebook post of this photo and as a result, I had a few names switched around too. So, I just completely started over again and double-checked all the names... Also, I've added links to their websites so you can find out more information about the authors and editors of this title. Finally, I added that purchases should be made through the Manning website instead of other sites (so that more money goes to WarChild). And, finally, I added the entire chapter list with as many links as possible to everyone's websites. I'll keep this up-to-date as folks send me corrected lnks, etc. ENJOY!!

 

Present for the photo

Here's the full chapter list from the book - along with links to the authors/editors websites:

PART I

DATABASE DESIGN AND ARCHITECTURE
Paul Nielsen, Editor

 1. Louis and Paul's 10 relational database design rules by
Paul Nielsen and Louis Davidson 
 2. SQL Server tools for maintaining data integrity by Louis Davidson
3. Finding functional dependencies by Hugo Kornelis


PART II

DATABASE DEVELOPMENT
Adam Machanic, Editor
4. Set-based iteration: the third alternative by Hugo Kornelis 
5. Gaps and islands by Itzik Ben-Gan
6. Error handling in SQL Server and applications by Bill Graziano
7. Pulling apart the FROM clause by Rob Farley
8. What makes a bulk insert a minimally logged operation? by Denis Gobo
9. Avoiding three common query mistakes by Kathi Kellenberger
10. Introduction to XQuery on SQL Server by Michael Coles
11. SQL Server XML frequently asked questions by Michael Coles
12. Using XML to transport relational data by Matija Lah
13. Full text searching by Robert C. Cain
14. Simil: an algorithm to look for similar strings by Tom van Stiphout
15. LINQ to SQL and the ADO.NET Entity Framework by Bob Beauchemin
16. Table-valued parameters by Don Kiely
17. Build your own index by Erland Sommarskog
18. Getting and staying connected—or not by William Vaughn
19. Extending your productivity in SSMS and Query Analyzer by Pawel Potasinski

20. Why every SQL developer needs to have a tools database by Denis Gobo
21. Deprecation feature by Cristian Lefter
22. Placing SQL Server in your pocket by Christopher Fairbairn
23. Mobile data strategies by John Baird



PART III

DATABASE ADMINISTRATION
Kimberly L. Tripp and Paul S. Randal, Editors

24. What does it mean to be a DBA? by Brad McGehee
25. Working with maintenance plans by Tibor Karaszi
26. PowerShell in SQL Server by Richard Siddaway
27. Automating SQL Server management using Server Management Objects by Allen White
28. Practical auditing in SQL Server 2008 by Jasper Smith
29. My favorite DMVs and why by Aaron Bertrand
30. Reusing space in a table by Joe Webb
31. Some practical issues in table partitioning by Ron Talmage
32. Partitioning for manageability (and maybe performance) by Dan Guzman
33. Faster, smaller, and more energy-efficient backups without indexes by Greg Linwood
34. Using database mirroring to become a superhero! by Glenn Berry
35. The poor man's SQL Server log shipping by Edwin Sarmiento
36. Some understated changes found in SQL Server 2005 replication by Paul Ibison
37. High performance Transactional Replication by Hilary Cotter
38. Successfully implementing Kerberos delegation by Scott Stauffer
39. Running SQL Server on Hyper-V by John Paul Cook

PART IV

PERFORMANCE TUNING AND OPTIMIZATION
Kalen Delaney, Editor
40. When is an unused index not an unused index? by Rob Farley
41. Speeding up queries with index covering by Alex Kuznetsov
42. Tracing the deadlock by Gail Shaw
43. How to optimize tempdb performance by Brad McGehee
44. Does the order of columns in an index matter? by Joe Webb
45. Correlating SQL Profiler with Windows Performance Monitor by Kevin Kline
46. Using correlation to improve query performance by Gert-Jan Strik
47. Dr. DMV: how to use Dynamic Management Views by Glenn Berry
48. Query performance and disk I/O counters by Linchi Shea
49. XEVENT: the next event infrastructure for SQL Server by Cristian Lefter



PART V

BI DEVELOPMENT
Greg Low, Editor
50. BI for the relational guy by Erin Welker
51. Unlocking the secrets of SQL Server 2008 Reporting Services by William Vaughn
52. Reporting Services tips and tricks by Bruce Loehle-Conger
53. SQL Server Audit: change tracking or change data capture? by Aaron Bertrand 
54. Introduction to SSAS data mining by Dejan Sarka
55. To aggregate or not to aggregate: is there really a question? by Erin Welker
56. Incorporating data profiling in the ETL process by John Welch
57. Expressions in SQL Server Integration Services by Matthew Roche
58. SSIS performance tips by Phil Brammer
59. Incremental loads using T-SQL and SSIS by Andy Leonard

Categories:
Conferences | Resources

Time flies when you're having fun, eh? This week Paul and I are in Dublin, Ireland delivering a deep technical "SQL Immersion Event" on Internals, Performance Tuning and Maintenance with our partner ProData (and having a blast - we have a really fun and very interactive group). And, today's an even more interesting day to be in Dublin as it's the 250th anniversary of the Guinness Brewery...everyone around Dublin will be toasting the most popular drink in Ireland today (Sept 24) at 5:59pm GMT. Unfortunately, our class runs until 6pm but Paul tells me that we're still going to celebrate. ;-)

For now, we'll just have to celebrate some SQL (I can hear minor grumblings everywhere :)...

Filtered indexes and filtered stats are an incredibly powerful feature in SQL Server 2008 - one that I've posted on before and will again. Today's post is about some of the gotchas with filtered index stats and filtered stats. Cutting to the punchline - you might have some trouble with their accuracy as your data changes. The SIMPLE solution is to update these specific stats manually and more frequently (meaning that you don't want to rely on the database option AUTO_UPDATE_STATISTICS). In fact, depending on your data volatility, you might want to update these statistics on a VERY frequent basis. Part of the question you'll have to answer is how much of the filtered set changes over the course of a day/week/month. And, in whatever frequency you reach 5-10% (let's say weekly), I'd suggest updating the statistics weekly. If you update 5-10% within the course of a day - then daily. Let me give you the full story...

Statistics are traditionally updated when roughly 20% (+ a minimum of 500 rows) of the data has changed. If you have a table with 100,000 rows then SQL Server will hit the threshold at 20,500 rows (just a tad over 20%) and a table with 10,000 rows will hit the threshold at 2,500 rows (which is 25%). Additionally, in SQL Server 200x (2000, 2005 and 2008) statistics are NOT immediately updated when the threshold is reached, instead they are invalidated. It's not until someone needs the statistic that SQL Server updates it. This reduces thrashing that occurred in SQL Server 7.0 when stats were updated immediately instead of just being invalidated. Another interesting point is what is meant by "20% of the data has changed?"... How is that defined? Is it based on updates to columns or inserts of rows? Of course the answer is... it depends - here, it depends on the version of SQL Server that you're using:

  • SQL Server 2000 defines 20% as 20% of the ROWS have changed. You can see this in sysindexes.rcmodctr.
  • SQL Server 2005/8 defines 20% as 20% of the COLUMN data has changed. You cannot see this unless you are accessing SQL Server through the DAC as it's in a base system table (2005: sysrowsetcolumns.rcmodified and for 2008: sysrscols.rcmodified).

Now, for regular indexes (those that do not include a filter) both strategies have pros and cons. Let's start with the con for SQL 2000... If you have a particularly volatile column then the rowmodctr will be increased quickly and then ALL statistics (even those that have not changed) are invalidated. Without a way to understand a column's volatility, you will invalidate some statistics too soon. So, 2005/2008 created a column modification counter to better define when a statistic should be invalidated. But, the con for SQL 2005/2008 (which is the pro for row-based algorithm of SQL 2000) is that a lot more data must change in a single column before the statistics get updated. If your modifications are fairly well balanced (across the table) then the row-based algorithm could suit you better. If your modifications are isolated (to a specific column) then the column-based algorithm will suit you better. But, neither is perfect... and, the new column-based algorithm poses another HUGE problem for filtered indexes.

For filtered indexes the threshold invalidation algorithm is tied SOLELY to the column and not just to the filtered set. So, if your table has 10,000 rows it takes 2,500 modifications in that column to update statistics. If your filtered index only has 1,000 rows... then you could theoretically modify this specific filtered set 2.5 times before IT would be updated. Yes, I realized that this sounds unlikely but it's really not. What if your filter is for ACTIVE = 1. It's only 1% of your table and it's the most active data set. You will effectively have to modify this set 20 times over before the statistics for your filtered index get updated.

But, having said that - I also have a simple solution. Don't wait... For filtered indexes (especially on volatile data) add a special UPDATE STATISTICS job that specifically updates their stats with a FULLSCAN. The good news is that they are likely to be smaller indexes (and therefore easier/less-costly statistics to update) and you'll clearly get a lot better uses out of them. Additional good news is that many filtered indexes are designed to be covering indexes, covering indexes aren't going to rely as heavily on statistics so there are many covered queries that won't care. The following query will get the list of all of your tables/indexes that have filters. Once you have this list you can create a loop to update statistics.

SELECT object_name(object_id) AS [Table Name]
       , name AS [Index Name]
       , stats_date(object_id, stats_id) AS [Last Updated]
FROM sys.
stats
WHERE has_filter = 1

So, there are lots of options and definitely a few things of which to be aware... filtered indexes and filtered stats are a powerful option to leverage, just make sure you keep those stats up to date!

Cheers... it's time for a pint of Guinness!

kt

PS - For more information on statistics, check out the SQL Server whitepapers:

We've finally booked all of our tickets and started planning final details for all our of our classes and conferences this fall. We've got quite a few things planned and tons of great content at each. By continent, here's the plan:

Europe

  • Dublin, Ireland: September 21-25. Paul and I will be teaching a week-long Immersion Event - combining internals, performance tuning, database maintenance and more.
  • Warsaw, Poland: September 29-30. Paul and I will be presenting a full day of SQL Server sessions at the Microsoft Technology Summit 2009

Australia - Melbourne

  • October 15-16. Paul and I will be teaching a 2-day class "SQL Server 2008: New Features - Updating Your Administrations Skills in Database Infrastructure and Scalability"
  • October 19-22. Paul will be teaching a 4-day class "SQL Server 2005/2008: DB Maintenance and Availability: From Performance to Disaster Recovery"
  • October 19-22. I will be teaching a 4-day class "SQL Server 2005/2008: Performance Tuning – From Design to Indexing to Optimizing Procedural Code"

US

  • Seattle, WA for SQLPASS: November 2-5. Paul and I will be teaching two full-day workshops and a Spotlight Session each.
  • Las Vegas, NV for SQL Connections: Nobember 9-12: Paul and I will be teaching two full-day workshops and 5 sessions. Stacia will be teaching 4 sessions on BI.
  • Redmond, WA for SQL MCMs and Sharepoint MCM's through the end of this year

You can find more details and links to the various registration sites on our Upcoming Events page. And, if you have any questions, feel free to email me: Kimberly AT SQLskills.com or Paul AT SQLskills.com.

We hope to see you there and be sure to swing by and say hi!

Cheers,
kt

Categories:
Conferences | Events

OK, for years I've been saying that SQL Server doesn't care about the order in which you define the columns of your table because internally SQL Server will re-arrange your columns to store all of the fixed width columns first and the variable columns last. In both the fixed-width portion of the row as well as the variable-width portion of the row, the columns are defined in the order in which they are declared. So, what does matter?

It's all in the cost of the variable array's offset values. If the large majority of your NULLable records are at the end, then SQL Server doesn't need to completely populate the variable block array (which saves 2 bytes per column). If you have a table where 36 columns are NULLable and generally they are NULL, then defining those columns at the end of the row can save you space.

The following script will show you how the maximum length of the row changes based on whether or not a later column in the variable block is NOT NULL - even when most/all of the prior columns are!

CREATE TABLE RowSizeVariableBlock
(
ID
int NOT NULL identity
,
c01 char(10) NOT NULL default 'test'
,
c02 datetime2(7) NOT NULL default sysdatetime
(),
c03 char(80) NOT NULL default 'junk'
,
c04 varchar(100)
NULL,
c05 varchar(100)
NULL,
c06 varchar(100)
NULL,
c07 varchar(100)
NULL,
c08 varchar(100)
NULL,
c09 varchar(100)
NULL,
c10 varchar(100)
NULL,
c11 varchar(100)
NULL,
c12 varchar(100)
NULL,
c13 varchar(100)
NULL,
c14 varchar(100)
NULL,
c15 varchar(100)
NULL,
c16 varchar(100)
NULL,
c17 varchar(100)
NULL,
c18 varchar(100)
NULL,
c19 varchar(100)
NULL,
c20 varchar(100)
NULL,
c21 varchar(100)
NULL,
c22 varchar(100)
NULL,
c23 varchar(100)
NULL,
c24 varchar(100)
NULL,
c25 varchar(100)
NULL,
c26 varchar(100)
NULL,
c27 varchar(100)
NULL,
c28 varchar(100)
NULL,
c29 varchar(100)
NULL,
c30 varchar(100)
NULL,
c31 varchar(100)
NULL,
c32 varchar(100)
NULL,
c33 varchar(100)
NULL,
c34 varchar(100)
NULL,
c35 varchar(100)
NULL,
c36 varchar(100)
NULL,
c37 varchar(100)
NULL,
c38 varchar(100)
NULL,
c39 varchar(100)
NULL,
c40 varchar(100)
NULL
)
go

insert RowSizeVariableBlock DEFAULT VALUES
go

select * from RowSizeVariableBlock
go

select * from sys.dm_db_index_physical_stats
(db_id(), object_id('RowSizeVariableBlock'), null, null, 'detailed'
)
-- review "max" record size = 114
go

insert RowSizeVariableBlock (c01, c03, c20)
values ('med row', 'up to c20', 'test'
)
go

select * from RowSizeVariableBlock
go

select * from sys.dm_db_index_physical_stats
(db_id(), object_id('RowSizeVariableBlock'), null, null, 'detailed'
)
-- review "max" record size = 154
go

insert RowSizeVariableBlock (c01, c03, c30)
values ('med+ row', 'up to c30', 'test'
)
go

select * from RowSizeVariableBlock
go

select * from sys.dm_db_index_physical_stats
(db_id(), object_id('RowSizeVariableBlock'), null, null, 'detailed'
)
-- review "max" record size = 174
go

insert RowSizeVariableBlock (c01, c03, c40)
values ('large row', 'up to c40', 'test'
)
go

select * from RowSizeVariableBlock
go

select * from sys.dm_db_index_physical_stats
(db_id(), object_id('RowSizeVariableBlock'), null, null, 'detailed'
)
-- review "max" record size = 194
go

While there are some other optimizations at this level, most tables cannot benefit from this as the data populations aren't as predictable nor are most tables filled with so many variable-width and NULLable columns. However, if you do have this data pattern, defining these columns at the end of your table's definition - MIGHT save a tremendous amount of space, especially when this table is very large!

Paul's blogged more on these structures as well as the NULL bitmap here: http://www.sqlskills.com/BLOGS/PAUL/post/Misconceptions-around-null-bitmap-size.aspx.

Enjoy! And, thanks for reading,
kt

Paul and I will be teaching a week-long public Immersion Event in Dublin, September 21-25, in partnership with our good friends at Prodata and Microsoft Ireland.

The class will cover:

  • Day 1: SQL Server Internals (On-disk structures, index internals, logging, recovery, transaction log architecture)
  • Day 2: Designing for Performance (data types, table and index partitioning)
  • Day 3: Indexing for Performance (Access patterns, covering, INCLUDE, indexing strategies)
  • Day 4: Essential Database Maintenance (data and log files, tempdb, index and statistics maintenance, backup and restore)
  • Day 5.1: Essential Database Maintenance (consistency checking and disaster recovery)
  • Day 5.2: SQL Consolidation and Virtualization

This is going to be a great workshop where we combine the best of all our various classes into a superb learning opportunity.

Checkout the registration site for more in-depth details about the content, location, and cost.

Hope to see you there!

Categories:

I've always been concerned with security and I've always stressed the importance of auditing the REAL user context not just the current user (see this post on EXECUTE AS and auditing). So, I generally try to avoid using dynamic string execution and if necessary create well tested/protected parameters (fyi - using QUOTENAME can be a fantasic solution to protectng identifiers as input parameters but it can't protect more complex strings).

Having said that, what if I'm looking at a database for the first time... just poking around trying to see if there's anything that needs further attention? I've come up with a quick query... And, while it's not going to "solve" your problem (as that's going to take some re-writing of code) or even truly verify if you're vulnerable, it gives you a "quick list" of where you should look first! If your code uses dynamic strings AND it's elevated - then start there! 

SELECT OBJECT_NAME(object_id) AS [Procedure Name],
  CASE
      WHEN sm.definition LIKE '%EXEC (%' OR sm.definition LIKE '%EXEC(%' THEN
'WARNING: code contains EXEC'
      WHEN sm.definition LIKE '%EXECUTE (%' OR sm.definition LIKE '%EXECUTE(%' THEN
'WARNING: code contains EXECUTE'
  END AS [Dynamic Strings]
,
  CASE
     
WHEN execute_as_principal_id IS NOT NULL THEN N'WARNING: EXECUTE AS ' + user_name(execute_as_principal_id
)
      ELSE
'Code to run as caller - check connection context'
  END AS [Execution Context Status]
FROM sys.sql_modules AS sm
ORDER BY [Procedure Name]

Is this enough? Anything else you'd check? What do you think?

THANKS!
kt

OK, I'll definitely take a beating from all of you for having gone so long between my survey posts and now. I won't even go into the details but between some crazy work schedules, multiple sinus problems and even migraines... well, I've been a bit behind. Let's just say that April/May were rough at best. I'm feeling better and well, now I'm trying to catch up. I had really gotten the blogging bug in March but I completely lost it in April. But, this tipping point series is in dire need of lots of explaining so I'm really hoping to get a few posts done in this area for sure!

First, I started the discussion around this in a few surveys:

Survey/Question 1

Q1 was described as this: if a table has 1 million rows at 20 rows per page (50,000 pages), at what percentage (roughly) of the data would a nonclustered index no longer be used. Blogged here. Here's what the survey said as of today:

And, for Q1 the correct result (Between 0-2% of the rows) is actually the best result (but, by no means the overwhelming majority at only 28%). However, often people just "think" the answer is very small. So... I did a few more questions/surveys. 

Survey/Question 2

Q2 was described as this: if a table has 1 million rows at 100 rows per page (10,000 pages), at what percentage (roughly) of the data would a nonclustered index no longer be used. Blogged here. Here's what the survey said as of today:

And, for Q2 the correct result (Less than .5% of the rows) is actually at a tie for the best (but, again, even a small percentage at only 22%). Again, often people just "think" the answer is very small. So... I did one more question/survey. 

Survey/Question 3

Q3 was described as this: if a table has 1 million rows at 2 rows per page (500,000 pages), at what percentage (roughly) of the data would a nonclustered index no longer be used. Blogged here. Here's what the survey said as of today:

And, for Q3 the correct result (Between 10-20% of the rows) is actually NOT the highest answer. And, this is even more convincing that there's confusion around what's going on and why.

The Tipping Point

What is the tipping point?

It's the point where the number of rows returned is "no longer selective enough". SQL Server chooses NOT to use the nonclustered index to look up the corresponding data rows and instead performs a table scan.

When does the tipping point occur?

It depends... it's MOSTLY tied to the number of pages in the table. Generally, around 30% of the number of PAGES in the table is about where the tipping point occurs. However, parallelism, some server settings (processor affinity and I/O affinity), memory and table size - all can have an impact. And, since it can vary - I typically estimate somewhere between 25% and 33% as a rough tipping point (and, you'll see from a bunch of my examples, that number is not EXACT). Then, I translate that into rows.

Math for Tipping Point Query 3: If a table has 500,000 pages then 25% = 125,000 and 33% = 166,000. So, somewhere between 125,000 and 166,000 ROWS the query will tip. Turning that into a percentage 125,000/1million = 12.5% and 166,000/1million = 16.6%. So, if a table has 500,000 pages (and 1 million rows) then queries that return less than 12.5% of the data are likely to USE the nonclustered index to lookup the data and queries over 16.6% of the data are LIKELY to use a table scan. For this table, that percentage seems "reasonable". But, most of us say that the tipping point happens at a much lower percentage... why? Because row size - which determines table size (and therefore pages) is really what has the greatest impact. So, let's look at Tipping Point Query 2... 

Math for Tipping Point Query 2: If a table has 10,000 pages then 25% = 2,500 and 33% = 3,333. So, somewhere between 2,500 and 3,333 ROWS the query will tip. Turning that into a percentage 2,500/1million = .25% and 3,333/1million = .33% (not even 1%). So, if a table has only 10,000 pages (and 1 million rows) then queries that return less than a quarter of 1% of the data are likely to USE the nonclustered index to lookup the data and queries over one third of one percent are LIKELY to use a table scan. For this table, that percentage seems really low BUT, at the same time it makes sense (to a point) that a small table would be scanned... but, for less than 1%. 1% is NOT selective enough. For small tables, it might not matter all that much (they're small, they fit in cache, etc.) but for bigger tables - it might be a big performance problem. 

Math for Tipping Point Query 1: If a table has 50,000 pages then 25% = 12,500 and 33% = 16,666. So, somewhere between 12,500 and 16,666 ROWS the query will tip. Turning that into a percentage 12,500/1million = 1.25% and 16,666/1million = 1.66% (under 2%). So, if a table has 50,000 pages (and 1 million rows) then queries that return less than 1.25% of the data are likely to USE the nonclustered index to lookup the data and queries over 1.66% are LIKELY to use a table scan. Again, this seems like a low number. Again, for small tables, it might not matter all that much (they're small, they fit in cache, etc.) but as tables get larger and larger - it CAN be a big performance problem. 

Why is the tipping point interesting?

  • It shows that narrow (non-covering) nonclustered indexes have fewer uses than often expected (just because a query has a column in the WHERE clause doesn't mean that SQL Server's going to use that index)
  • It happens at a point that's typically MUCH earlier than expected... and, in fact, sometimes this is a VERY bad thing!
  • Only nonclustered indexes that do not cover a query have a tipping point. Covering indexes don't have this same issue (which further proves why they're so important for performance tuning)
  • You might find larger tables/queries performing table scans when in fact, it might be better to use a nonclustered index. How do you know, how do you test, how do you hint and/or force... and, is that a good thing?

Real example of an interesting tipping point

Earlier today, I went on facebook and twitter and gave the following information - very vaguely - and I asked "why" is Q2 so much slower than Q1 if Q2 returns only 10 more rows. Same table and no hints (other than MAXDOP)...

Q1: SELECT * FROM table WHERE colx < 597420 OPTION (MAXDOP 1)

  • returns 197,419 rows
  • takes 116,031 ms (1 minute, 52 seconds)
  • 1,197,700 logical reads, 5 physical reads, 137,861 read-ahead reads
  • 7,562 ms CPU time

    Q2: SELECT * FROM table WHERE colx < 597430 OPTION (MAXDOP 1)

  • returns 197,429 rows
  • takes 244,094 ms (4 minutes, 4 seconds)
  • 801,685 logical reads, 1410 physical reads, 801,678 read-ahead reads
  • 9,188 ms CPU time

There were lots of great guesses... but, it's the tipping point. SQL Server chose to "tip" the second query because it was "over the line". But, it's important to realize that there are cases when that's NOT a good idea. And, what are your options?

In SQL Server 2005 - the only option is to force the nonclustered index to be used:

Q2: SELECT * FROM table WITH (INDEX (NCInd)) WHERE colx < 597430 OPTION (MAXDOP 1)

But, this can be TERRIBLY bad on some machines where the IOs could be a lot faster (and where data might already be in cache). These specific numbers are exactly that - specific to this HARDWARE (and, I chose not-so-optimal HW in this case to highlight this problem). And, depending on what number you use (what if this is a parameter in sps?) you might force SQL Server to do WAY more IOs by forcing the index than allowing the tipping point to do its job. But, depending on your hardware (and/or what you know to be in cache at the time of execution), it might be better to force an index instead of letting SQL Server choose. So, should I force the index? Be careful, if you're wrong - it could take more time and actually be slower.

In SQL Server 2008 - there's a new hint - FORCESEEK:

Q2: SELECT * FROM table WITH (INDEX (FORCESEEK)) WHERE colx < 597430 OPTION (MAXDOP 1)

FORCESEEK is better because it doesn't tie you to a particular index directly but it also doesn't let SQL Server tip to a table scan. However, just like forcing an index - you can be wrong!

So, what should you do? It depends. If you know your data well and you do some extensive testing you might consider using a hint (there are some clever things you can do programmatically in sps, I'll try and dedicate a post to this soon). However, a much better choice (if at all possible) is to consider covering (that's really my main point :). In my queries, covering is unrealistic because my queries want all columns (the evil SELECT *) but, if your queries are narrower AND they are high-priority, you are better off with a covering index (in many cases) over a hint because an index which covers a query, never tips.

That's the answer to the puzzle for now but there's definitely a lot more to dive into. The Tipping Point can be a very good thing - and it usually works well. But, if you're finding that you can force an index and get better performance you might want to do some investigating and see if it's this. Then consider how likely a hint is to help and now you know where you can focus.

Thanks for reading,
kt

Whenever I setup a new machine, I like to set a few default options in SSMS. Here are my favorites:

Tools, Options

  • Environment
    • Fonts and Colors
      • Text Editor font: Lucida Console (a bit thicker and it's a fixed-width font)
      • Text Editor: Selected Text (under display items)
        • Item foreground: Black
        • Item background: Yellow (looks like a highlighter)
      • Sometimes I'll also make the results windows have larger fonts - especially if it's a presentation machine
    • Keyboard
      • Keyboard scheme: SQL Server 2000 (ok, maybe I'm old-school but the QA keyboard shortcuts still seem a lot more natural than the VS keyboard shortcuts... but, if you're more of a VS person, then stick with Standard)
      • Query shortcuts:
        • Ctrl+F1: sp_helpindex2 (if you highlight an object and then hit Ctrl+F1, then it passes the highlighted object in as a parameter. The only negative is that it doesn't delimit it so you can't highlight schema.object unless it's already quoted for the sp 'schema.object')
        • Ctrl+3: SELECT object_name(object_id) AS ObjName, * FROM sys.dm_db_index_physical_stats(db_id(), NULL, NULL, NULL, NULL) ORDER BY 1, 4
        • Ctrl+4: other common queries/procedures...
  • Text Editor
    • All Languages
      • Line numbers
    • Editor Tab and Status Bar - I usually change the colors of the "Group connections" setting to something very noticeable like Fuschia. I've already blogged why here.
  • Query Execution
    • By default, open new queries in SQLCMD mode (however, if you set this - you will disable Intellisense and Transact-SQL Debugging... and, there's no "hint" or warning that you're doing it.)
  • Query Results
    • SQL Server, Results to Grid
      • Include the query in the result set (this will show you what you executed in the results window while you wait for completion... also confirms what you think you executed. This gets annoying for large batches but is sometimes useful to see what you last executed)
      • Include column headers when copying or saving the results (VERY useful if you're pushing data over to Excel, etc.)
      • Display results in a separate tab (this gives you Tabbed mode instead of split-window mode meaning that results will go to a results Tab instead of the lower half of your window. I prefer this when I have a lot of results to review AND when I'm presenting and typically run with a lower screen res.)
        • Switch to results tab after the query executes (I prefer this so that I'm waiting for the results in the results window)
    • SQL Server, Results to Text
      • I set most of the same settings as Results to Grid (Include column headers, Include the query, Display results in separate tab and Switch to the results tab) BUT, I usually turn off Scroll results as received. If I'm looking at rows at the top of the set (while it's still processing) it often takes my cursor down to the end. I also turn this off in Profiler.

The primary reason for this post though - is to make sure that you realize that one specific option - Tools, Options, Query Execution, "By default, open new queries in SQLCMD mode" - can really surprise you. I used to think that this option was really helpful (and benign) because SQLCMD is a superset of Transact-SQL. So, I typically recommended that you turn it on. However, turning on this option disables Intellisense and the Transact-SQL Debugger. It's documented here: http://msdn.microsoft.com/en-us/library/ms174187.aspx (thanks Paul Mestemaker! :) and I thought this was a bit unintuitive at first but it really does make sense. Both the Transact-SQL Debugger and Intellisense are debugging and/or prompting you with Transact-SQL (not SQLCMD commands). So, disabling it seems right (well, I guess I could argue that it could work with Intellisense but debugging would be much harder). However, I just wish there were some better information in the Tools/Options dialog. I'd like my 2 hours back (trying to figure out which option disabled Intellisense, chatting with my friends on fb to see if they knew, filing a connect bug when I figured out that it was SQLCMD, and then learning from Paul Mest that it's actually expected behavior (duh!), changing the Connect bug to say that the dialog might be better with a warning... it was definitely one of those mornings!  :) :) :)

So, my main point for this post - save you the time I lost in trying to figure out where Intellisense went. It's expected behavior (and it does make sense........now :).

Cheers,
kt

Categories:
Manageability | Tools

OK, I know I don't blog all that often but when I do, I do try and post as much useful information as I can :). I've got a few posts in the queue and a few more tests to do and code to write before I can wrap them up. In the interim, Paul and I have both decided to throw a bit of our "spare time" to keeping up with friends and family more and just staying "more connected." In that effort, we've both joined facebook and twitter (www.twitter.com/KimberlyLTripp and www.twitter.com/PaulRandal). Our end goal is:

  • Blogging: large/complex posts with detailed info/code, etc.
  • Twitter: short, quick posts on things to check out, review, etc. Maybe I'll do a weekly summary of tweets that I do and/or followed? Would you be interested in that? And, as for my interface, I'm currently using TweetDeck and Twitter.com and I've also joined via WeFollow.com: #sqlserver, #mvp and #womenintech.
  • Facebook: more fun stuff and keeping up with friends, etc. I have to admit that I really love the interface and I'm constantly impressed at how easy it is to upload video, photos, etc... It's just NOT what I expected before I tried it out. I'd truly recommend this to anyone that wants to asynchronously keep up with a large number of people and wants to share photos, comments, video, etc. It's really well done. However, beware of many of the facebook apps. They tend to spam your friends list - sometimes even when they don't ask. Outside of better requirements on fb app developers, I don't have a lot of complaints there.

Having said that, neither of us is doing anything else (no Plaxo, no LinkedIn, etc.) so if you want to find us - we're definitely around but we're going to stay somewhat focused. ;-)

And, now that I'm back home again, I hope to have a few more of my longer posts done.

Cheers everyone!
kt

Kimberly L. Tripp's Facebook profile Paul Randal's Facebook profile

Categories:
Opinions | Personal

Recently, the SQL Server 2008 Internals title was released (and only in the past few days have people actually received their copies)! In fact, I still haven't seen the book in person... soon though!

Anyway, Kalen, Paul, Conor, Adam and I worked to create a comprehensive resource on SQL Server 2008 internals and to supplement the written content, many of us created demo scripts and examples. I've now gone back and created a sample script based on ALL of the code in the entire chapter (and in many cases I extended the code in the samples). And, while this updated content is going to be located on the companion content site, I thought I’d also release it here with some notes.

The zip contains a solution with 3 projects, each with a few scripts:

I know the names seem a bit strange but everything is ordered EXACTLY as it is shown in the book. And, in the book, I referenced "a" script called EmployeeCaseStudy-AnalyzeStructures.sql but that script was so big that I broke it down into 7 parts (hence the naming convention of 06...01, 06...02, etc.). Regardless, each script contains a brief header and a few details about the script. To get the most from the script, do not just open the script and execute it. If you really work your way through the script, you should see all of the comments and they will help you to make instance specific changes so that everything runs without error. Just take your time and really try to step back and think about each command (and what you're expecting the output to be) to test yourself while your working through the results. Taking your time and really grapsing all of these internals is what makes it fun!

Quite a bit of this content can stand alone but it's really best as companion content to the title as there's a lot more "text" and detailed information in the book. But, the scripts are really a great way to dive deeper, learn documented/undocumented commands and really get to know what the heck is really going on internally.

Finally, I only worked on Chapter 6 so here it is: 20090810-IndexInternals-Chapter6-Resources.zip (6.13 mb). As for the other companion content, you'll need to get the links from the book. Ah, or from Kalen :). 

And, certainly, if you find a typo or anything that you think needs more clarification, let me know! I'm more than happy to post updates to this content. 

Have fun,
kt

UPDATES/ERRORLOG:
2009-Apr-13 (8am): Updated the zip after remembering in my sleep (yes, sad, but true!) that one of my comments that referenced some line numbers didn't get updated in the final version. So, no errors and if you don't get this update, it's not going to break anything. But, the script that's been tweaked is script: 05_EmployeeCaseStudy-TableDefinition.sql. Enjoy! kt
2009-Apr-13 (4:30pm): Ha... guess what arrived today. Yes - our copies of the book. Wow, it's great to see it in person. Again, enjoy!
2009-Aug-10: Added a :CONNECT option inside the IndexInternals restore script AND, cleaned up the zip as it had an extra copy of the IndexInternals database in it.

A couple of weeks ago I wrote a blog post titled Whose job is it anyway? It's an interesting debate and something I've been hearing more and more - that SQL Server is a "set it and forget it" technology - a black box where you just don't need to know how it works to do well with it. In fact, I've even had a few folks comment that they think it would be better to "roll their own" database rather than have to learn how to work in a "general purpose" database. And, while there are certainly lots of different angle to this debate - one fact remains... if you don't know anything about the database on which you're developing (whether it's SQL Server, mySQL, Oracle, whatever), I *PROMISE* you won't have a truly scalable, optimal solution. Why do you think there are so many knobs? It's because there are so many different ways to work with data. There is more than one way to query, more than one way to design. This is also why every answer to a "how should I do this" question starts with "It depends". And, while that seems like a scary response it's actually a good one. It means that you have lots of options - options that can offer many different pros/cons. And, as a result of knowing these pros/cons, you can make better decisions - decisions that will ultimately determine how well you can scale.

So..... while I don't think this debate will EVER be finished (as to WHOSE job it is to know these things), I do think a lot of folks are seeing the effects of not knowing more about their store (and, again, this is NOT limited to SQL Server in any way, shape or form).

At a minimum, hear the discussion on RunAsRadio with Richard, Greg and I and let us know what you think!

Kim Tripp on the Roles of Developers and DBAs with the Database!

Cheers,
kt

I started the series here: http://www.sqlskills.com/BLOGS/KIMBERLY/post/Spring-cleaning-your-indexes-Part-I.aspx and I want to continue with Part II today by clarifying some great questions/comments that have come up on the series. In Part III, I'll give you a few more ways to get rid of (or consolidate) indexes. And, I think there's still a bit more that Paul and I will investigate further (wrt to operational stats) but, I want to address a few comments and a few interesting things that both Paul and I have found.

In the Part I post, I talked about using sys.dm_db_index_usage_stats to see if there are any indexes that just aren't being used at all... A few comments asked why I didn't use operational_stats instead. To address that first, there are a few key differences:

  • dm_db_index_operational_stats is persisted only as long as an object is in cache (however, it's not cleared when objects are forced out of cache with DBCC DROPCLEANBUFFERS). If you want to clear ALL DMVs for a specific database, then a relatively easy way to do this (IN TESTING) is to take the database offline and then immediately bring it online again.
    • ALTER DATABASE <dbname> SET OFFLINE
    • ALTER DATABASE <dbname> SET ONLINE
      • NOTE: If there are any suspect files, you will NOT be able to bring your database back online without FIRST taking all suspect files OFFLINE. And, if you take a FILE offline then it's even more important to know that THERE IS NO WAY TO BRING A FILE ONLINE without restoring it from backups. So, it's VERY important to understand that OFFLINE/ONLINE for a database is really easy IF AND ONLY IF there are no other problems with the DB. You really need to resolve those problems first (or at least know that you're going to need to resolve those problems later through backup/restore) before you take a database offline.
  • dm_db_index_operational_stats is (from BOL) neither persistent nor transactionally consistent. This means you cannot use these counters to determine whether an index has been used or not, or when the index was last used. For information about this, see sys.dm_db_index_usage_stats.

Having said that though, none of these are really any guarantee of perfect information. And, they're not meant to be. I look at these DMVs as being a quick and easy way to get some relatively descent insight into what is or is not happening in my environement. However, even though the sys.dm_db_index_operational_stats might give you insight that you have a problem it still doesn't give you good insight into exactly what that problem might be. For example, it *might* be splits that causes some of your wait times to increase (column: page_io_latch_wait_in_ms) but, it could be something else too (some other system issue).

The main point, you can use these to get insight into which tables have the biggest problems (i.e. the biggest waits) and where they might have a lot of splits (column:nonleaf_allocation_count) but, in all honesty, that's not a guarantee. In fact, the reason I said "might" is that pages that are allocated at the end of the leaf level STILL allocate a page and require an entry to be made in the next level up in the index. So, a lot of nonleaf_allocations COULD be for a perfectly unfragmented index. So, it still doesn't tell you how fragmented the objects are or what the REAL problem is (or even if it is a problem yet).

Basically, these just tell you where you have the most activity and give you a starting point for problem solving. But, none of these (usage or operational) really tell you how to solve the problem. However, sometimes even knowing where to start IS the problem in and of itself. So, I'm not against these DMVs and I really do think you can get some good insight from them. Just use them as a tool to help focus your investigations. Use better tools like sys.dm_db_index_physical_stats to really see if you have fragmentation and where it's the worst.

hth,
kt

This morning, I woke to a wonderful email and I thought I would share it:

Good morning Kimberly, I just wanted to send you an email to let you know how awesome you are! I have applied some of your indexing strategies to our website’s homegrown CMS and we are experiencing a significant performance increase. WOW....amazing tips delivered in an exciting way! 

Thanks again,
Jim

OK, I seriously can't complain about the awesome part ;-)... but, what I think is so cool is that he sent the email. No, I don't expect everyone to send me mail after they implement a tip/trick BUT, it's really nice when I do get an email (especially like this :). There are so many ways to improve performance and some can lead to significant gains (or minimized downtime/data loss, etc.) and this is exactly why we (speakers/RDs/MVPs, etc.) do this stuff!

The long story short is that it does cost you money to go to a conference (or, at least, time to read blogs, etc.) but what we try to do at events like Connections is distill down the key points into our sessions so that you can [hopefully] apply these techniques immediately. It makes for fast-paced sessions and an information packed week but our conference just ended yesterday and I already have email from it. That's just really cool!

So, I know the economy is in a tough state and I know that quantifying the gain of a conference is hard (especially when the cost of the conference is so easy to quantify) but the end result of some of this time away might be more than worthwhile!

THANKS Jim!
kt

Categories:
Conferences | Personal

First and foremost, happy spring! I truly hope we're on the path to summer (although who would know it here - we're in Florida for SQL Connections and the weather is a bit chilly and it's been raining off/on today - I hope this is short term (no, I don't want to look at the forecast as I don't want to jinx it :)). But, wherever you are - I hope you're on your way to nicer weather and minimal cold weather (ok, I guess I only have warm wishes for the northern hem... for you southies - I hope your fall is lovely!!).

But, for everyone - now's a good time to start thinking about cleaning out the [non-literal index] closet... and getting rid of some of those dusty indexes?

Why/when should you get rid of some of your indexes?

  1. It's possible that some of them aren't being used at all...
  2. Especially when they're not being used but even when they're "redundant" (or minimally useful) they're still costing you in many ways:
    1. Wasting space on disk
    2. Wasting space in memory (well, if they are being used then they're cluttering up your cache)
    3. Wasting space/time in your maintenance routines (so, here they're cluttering up your cache for sure!)
    4. Wasting space in your backups
  3. You might be able to reduce your overall indexes with index consolidation...

So, for this post, I'm going to target #1 - are there any indexes that just aren't being used at all...

First, how do you know if your indexes are being used?
In SQL Server 2005 and higher, there's a DMV (dynamic management view) called sys.dm_db_index_usage_stats and it's there to track index usage patterns. However, it's not persisted since the beginning of time and as a result, if you look at this and believe that it's telling you ALL of the indexes that have been used in your database - then you might be mistaken. The index usage stats DMV is cleared when SQL Server is restarted as well as when you detach/attach the database or when you backup/restore the database.

Therefore, you don't want to just run the following query and drop all of the indexes that aren't being used. A better way to "trust" this information is to periodically persist the data from the DMV in your own table and then query it after you've completed a business cycle's worth of activity - logging all of the usage stats. Then you can trust this much more. Again, here are a couple of negatives:
1) it's not persisted
2) it only keeps the database_id and the index_id (which could change over time). You're right in thinking it probably shouldn't change but, a nonclustered index's ID is not permanent so, it's better to track the index name in addition to the index_id. And, when you run your queries to determine what to delete, you can easily verify indexes against the current indexes because your comparison is within the same db (more on this below).

If you want to persist this, then you have two ways to do this:

Store the index usage patterns in a table within the specific database you're tracking:

Pro: it goes with the database when you back it up, etc. and, it's easier to reverse engineer which actual indexes you're referring to (grabbing the names and not just the IDs).

Con: it's a bit more complex of a query to run and you'll need to run it for all of your databases (ok, it's really not all that bad - but, using something like sp_msForEachDB will really help)

Store the index usage patterns from all databases in a table within master or your own "performance database":

Pro: you only need one job to handle all the index usage info AND object_name *does* support TWO parameters (object_id and database_id) so, as long as you trap the name at the time of insert then you'll be good.

Con: it doesn't go with the database (e.g. backup/restore - and if you're restoring to a test system and you want to see what the usage patterns were then you'll need to get this information as well...)

Here's a simple query that you can run that shows all the indexes used right now - and adds the databasename/objectname into the results - in a persisted table you'll also want to add the runtime:

SELECT getdate() AS RunTime
, DB_NAME(i.database_id) as DatabaseName
, OBJECT_NAME(i.object_id, i.database_id) as ObjectName
, *
FROM sys.dm_db_index_usage_stats AS i
WHERE object_id > 100

And, if you want to get a few more insghts into how to persist this on Paul's blog here: http://www.sqlskills.com/BLOGS/PAUL/post/Indexes-From-Every-Angle-How-can-you-tell-if-an-index-is-being-used.aspx.

OK, so, you have a few options to think about and I have a few more parts to post!
kt

PS - The Tipping Point is coming soon too. I'm still adding a few things to that one!!! ;-)

OK, I've been pretty quiet this week... I had high hopes that I'd blog at least once or twice this week but the week turned out to be crazy busy as I presented T-W-Th at the SQL MCM (Microsoft Certified Masters) and then Fr at the Sharepoint MCM. The groups were really fun and wanted to go deeper and deeper into the technologies so on most days, I didn't even end lecture (which started at 8am) until 6:30+. So, it was a looooonnnggg week.

If you're interested, here's a link to the "very intense but fun" MCM programs: Microsoft Certified Master Program

Tomorrow, Paul and I fly out at 6am (to Florida) for SQLConnections. It's a packed week at Connections (a precon, a postcon and 5 sessions between us) and we're really looking forward to seeing many of you there! I'll be back to blogging soon as I know I still owe you the results/answers for the Tipping Point queries.

In the interim, check out Sean's Midnight DBA videos... he seriously needs to get some sleep! ;-)

Cheers to you all!
kt

Categories:
Resources

Theme design by Nukeation based on Jelle Druyts