This is a tough topic. It's a big topic and more than any other - I think there are a lot of misunderstandings about what the log is for, why it's so critical and ESPECIALLY when/why it gets extrememly large. Simply put, it gets large when it's not managed correctly. OK, there are times when it can become large - even if it is well managed. But, more often than not, when a transaction log is wildly out of control (orders of magnitude larger than the data itself) it indicates a management/maintenance problem.

There are a lot of places where you can go to find out the technical details behind the transaction log but I'm going to target this blog post to the relatively straightforward easy (no, really easy!!) facts about transaction log maintenance.

What kind of transaction log management is right for YOUR database?

First and foremost, you MUST decide whether or not you need to do log backups. SQL Server *requires* you to make some form of decision. Well, I take that back. They don't tell you anywhere that you need to make this decision but the transaction log can get wildly out of control if you don't (see the next section for more details on this one :)).

Why? Transaction log backups will allow you better recovery options in the event of a disaster. If you create a good backup strategy, you should be able to recover from a disaster very close (possibly even up-to-the-minute) to the time of the disaster. Howevre, you are not required to do log backups. Instead you can do only database-level backups and recover with those. That's fine. There's really nothing wrong with that strategy. However, it does mean that you have a greater potential for data loss. Basically, if you decide that you're doing to do weekly full backups - then you need to be OK with losing everything that's happened since your last full backup. If that's OK, then performing full database backups (and never worrying about the log) is absolutely fine.

However, if you want more granular control and more recovery options (again, possibly even up-to-the-minute recovery - which is transactional recovery up to the time of the disaster), then you MUST add transaction log backups into your disaster recovery strategy.

So, make this decision FIRST:

  1. Am I OK with some data loss? (then you're probably OK with just database-level backups... but, you will need to do something else! be sure to keep reading!!!)
  2. Do I want to minimize data loss to the smallest amount possible? (then you're going to want to AUTOMATE transaction log backups)

But I didn't do anything - why is the log WAY out of control (in terms of size)?

OK, even if you consciously make the decision to ONLY do database-level backups, you are NOT DONE!!! In fact, this is actually what led me to do this post. I found these two (relatively dated but interesting nonetheless) MSDN forum discussions for TFS (Team Foundation Server) databases:

    MSDN Forum discussion "Recommended SQL Maintenance Plan": http://social.msdn.microsoft.com/forums/en-US/tfsadmin/thread/b23f7018-3eaa-4596-96e4-728b02cf6211/ 
    MSDN Forum discussion "Huge log files":
http://social.msdn.microsoft.com/forums/en-US/tfsadmin/thread/605d51f7-23fd-470c-945e-53fa7ed5aa87/

And, I know EXACTLY what happened in ALL of these cases (and MANY more... Paul and I see this ALL the time, in fact). In the "Huge log files" thread, there's a database mentioned (TfsWareHouse) with a 124MB mdf and a transaction log of 61.8GB. It didn't mention whether or not there were other data files but my guess is that there weren't. My guess is that they were completely shocked by why the data portion had grown to a size that's 510 TIMES the size of the database... The reason is actually somewhat simple (no pun intended). If you're not going to do transaction log maintenance (meaning transaction log backups), then you need to tell SQL Server that. (This is the part that's completely unexpected.)

When a database is created, SQL Server runs that database in a "pseudo simple recovery model". (Yes, I know - that didn't help.) What that means is that SQL Server automatically clears inactive records from the transaction log once it knows that it no longer needs them. It no longer needs them to be stored in the log because no one is using the log (i.e. you're not doing ANY backups). However, once you do start to do backups (and, people generally start by doing a full database backup), then SQL Server looks to your recovery model to determine what to do with log records. If the recovery model is set to full (and, yes, this is the default), then SQL Server gives you the "full feature set" with regard to backup/restore. SQL Server is expecting YOU to manage the transaction log by backing it up. Once it's backed up, SQL Server can remove the inactive records from the transaction log (and when you do a transaction log backup, it automatically clears the inactive records by default).

So, there are really two choices - and ONLY two choices here:

  1. Perform transaction log backups as part of your maintenance plan
  2. Change the recovery model to the SIMPLE recovery model so that SQL Server clears inactive transactions from the log automatically

Is there anything else to do for the transaction log? 

Yes! If you decide that you want to do transaction log backups then I would recommend a few things. I'd first recommend reading 8 Steps to Better Transaction Log Throughput and when you decide how large your transaction log needs to be, then also read Transaction Log VLFs - too many or too few?. These two posts will help you to create a more appropriately sized log as well as one that won't be prone to performance problems (such as internal VLF fragmentation).

If you want to learn more about the transaction log, I'd suggest a few of Paul's resources (it's probably because he has such a fantastic tech editor... oh, I'm asking for trouble with this comment!! ;-):

  1. Read Paul’s blog post to his TechNet article on Logging & Recovery. It’s a great article that covers a lot of different aspects of logging. He also did a great short video on why the transaction log grows wildly out of control. Here’s a link to the blog post that pulls all of the TechNet resources together: http://www.sqlskills.com/BLOGS/PAUL/post/TechNet-Magazine-feature-article-on-understanding-logging-and-recovery.aspx.
  2. Read Paul’s blog post to his TechNet article on Database Maintenance. It’s a great overview of all of these maintenance tasks and will give you a good overview of what each one does. Here’s a link to the blog post that pulls all of the TechNet resources together: http://www.sqlskills.com/BLOGS/PAUL/post/TechNet-Magazine-Effective-Database-Maintenance-article-and-August-SQL-QA-column.aspx

OK, so, I think that sums up part III. I think that's the last one in the series for now. I'll go through and explain "The Tipping Point" next. However, I was hoping for more results to my brain teasers (in those two posts)!!

Cheers,
kt

Paul and I started discussing a comment that came up regarding the many issues surrounding logging & recovery. It's one of our favorite topics and in fact was the title to an article that Paul recently wrote for TechNet here: http://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx. And, as a sidenote, depending on how much you already know about the transaction log - you might want to review that article first!

The comment that came up was related to a common misunderstanding on what is and what is not required to make a backup transactionally consistent when restored. And, in my opinion, some of the confusion as to whether or not log backups are "required" is because many changes have occurred release to release. Also, a lot of us say "log backups are required for better recovery" and while restoring log backups is what allows features like up-to-the-minute recovery and point-in-time recovery, not all strategies or recovery procedures actual require additional and/or separate log backups (some backups actually backup part of the log during their backup - and this is actually something that has changed release to release). And so, this is the reason for this post, I want to try and clear up a few of the many misconceptions about what happens with regard to the log during backup and restore. What's really interesting is that some of the best features (seemingly minor) have been around exactly this - the behavior of the transaction log during other backups and the requirement during restore. So, I thought I'd give a play by play from 2000 to 2005/2008 to discuss the differences and what's changed and why those changes were significant. The biggest changes were between 2000 and 2005.

First and foremost, the log portion of a database is required to make that database transactionally consistent. The transaction log is the key to SQL Server's durability (data integrity even after power loss). Transaction log backups are the key to our being able to recover from more catastrophic failures (possibly even point-in-time recovery if the right backup strategy exists). Inside the database, SQL Server doesn't really need all of the transaction details after they've guaranteed a transaction's durability (or, more simply put, once the effect of the change has been reflected in the data portion of the database then the details of that change are no longer needed in the log portion of the database). As a result, you can have SQL Server clear the "inactive" portion of the log by setting the database's recovery model to the SIMPLE recovery model. Loosely translated the SIMPLE recovery model means "when SQL Server no longer requires the transaction information to guarantee durability - then the log information can be removed from the log". Setting the recovery model to SIMPLE limits your backup options and makes administration easier (i.e. simple :)); however, it does not offer any other protection in the event of a more catastrophic disaster (because the log is being regularly cleared then there's no transactional information to backup). For some development/test databases and databases where data loss is not a major concern, then this can be an easy choice because log management (i.e. backups) does not need to be performed. However, if you want to minimize data loss - you can't choose the SIMPLE recovery model; you must choose either the FULL (which is the default) or the BULK_LOGGED recovery model. However, the discussion on when/why to choose BULK_LOGGED is a lengthy one and it does NOT impact the rest of this blog post. However, I did write a chapter for a SQL Server 2000 HA book and I described in detail the best uses for the BULK_LOGGED recovery model as well as the benefits and concerns. While this was written for SQL Server 2000, most of it *still* applies (and there are a few timeline based examples as well). You can download a pdf of this chapter here: http://www.sqlskills.com/resources/SQLServerHAChapter9.pdf.

As for the main purpose of this post - there are basically a few key questions that I want to answer/clarify by version:

  • Is the log backed up as part of the other backups?
  • Is it cleared?
  • Is there anything else that's affected?
  • What happens to the log during other database, filegroup, file, database-differential, filegroup-differential and file-differential backups? And, since the behaviors and internals seem to be grouped into two groups, I will differentiate between these two different groups of backup strategies with the following types:   
    • Database-level backup strategies are backups that use database and optionally database-differential backups
    • Granular backup strategies are backup strategies that use file and/or filegroup backups and optionally file-differential/filegroup-differential backups

SQL Server 2000
Database-level backups cannot occur simultaneously with log backups. However, granular backups *CAN* occur concurrently with log backups.
If a log backup is attempted while a database-level backup is running, then the log backup is paused. This can have the following affects:

  • the transaction log may require auto-growth and become very large
  • the transaction log for a secondary server (i.e. through log shipping) can fall *very* far behind the primary server. And, this is a HUGE concern for high availability. If a full backup takes 4 hours to run, then logs cannot be shipped for 4 hours. As a result of this limitation, some chose to use a granular backup strategy. The reason why log backups CAN occur concurrently in SQL Server 2000 is because, in implementation, SQL Server does NOT to backup the log as part of these more granular backups. As a result, transactional integrity is not guaranteed until the appropriate log chain is rolled forward. This has the following affects:
    • Granular backups only support the BULK_LOGGED or FULL recovery models (somewhat negative but not really)
    • The transaction log backups could run and even clear the inactive portion of the log while these granular backups were running (this is a huge benefit because it limits the need to auto-grow during these backups)
    • Recovery during restore is required (for transactional integrity) which means that all logs need to be restored to cover the time of the granular backup (and then all of those up-to-the-minute or to the desired point-in-time). And, even if a filegroup is set to READ_ONLY - *all* transaction logs need to be restored (this is a big negative but there is a trick: perform periodic file/filegroup differentials (after setting the filegroup to READ_ONLY) so that you can avoid having to perform numerous transaction log restores).

So, to answer the questions for SQL Server 2000:

  • Is the log backed up as part of the other backups?
    • for Database-level backups: YES
    • for Granular backups: NO
  • Is it cleared?
    • When a transaction log backup is performed then the default behavior is to clear the inactive portion of the log
    • When a database-level backup is performed AND there's no corresponding log chain (meaning the first time you backup the database OR the first time you backup the database after the transaction log chain was broken), then the transaction log is cleared. NOTE: Breaking the continuity of the log is relatively easily done in SQL Server 2000 when someone executes a BACKUP LOG with NO_LOG or a BACKUP LOG with TRUNCATE_ONLY command. To disable these from execution (for the FULL or BULK_LOGGED recovery model, use TRACE FLAG 3231). This is a VERY COOL and *SAFE* trace flag. I blogged about this trace flag in a "MSDN webcast Q&A" here. An important side note here is that in SQL Server 2000, log backups can be performed AFTER the continuity of the log has been broken. So, if someone manually cleared the log (using NO_LOG or TRUNCATE_ONLY) and did NOT follow that with a database-level (or appropriate granular backups), then scheduled log backups could continue to run without failure or errors. However, log backups performed AFTER the continuity of the log has been broken CANNOT be restored. So, during recovery you might receive an error that a log backup cannot be applied because it's too "late" to apply. Using Trace Flag 3231 reduces this possibility. However, SQL Server 2005 fixes some of these issues.
  • Is there anything else that's affected?
    • Log backups are paused during database-level backups
    • When restored, database-level backups are transactionally consistent (and can be recovered directly - without restoring additional logs)
    • When restored, granular backups require transaction log backups to guarantee transactional integrity (note: this can be complex to determine the "minimum effective log sequence" and I wrote a series of articles for SQL Server Magazine on how to determine the appropriate log sequence here)
  • What happens to the log during other database, filegroup, file, database-differential, filegroup-differential and file-differential backups?
    • for Database-level backups: log backups cannot occur concurrently 
    • for Granular backups: log backups can occur concurrently and are required for recovery

SQL Server 2005
The biggest improvement in SQL Server 2005 was that log backups are no longer paused by database-level backups - they *can* occur simultaneously; however, this change to database-level backups also applied to granular backups. While 2000 did allow log backups at the same time as a granular backup, they did so by NOT maintaining transactional integrity in the backup. In SQL Server 2000, you need to restore logs to make the granular backup transactionally consistent. In SQL Server 2005, they changed ALL backup strategies to follow the same behavior - database-level and granular backup strategies ALL backup the required log information needed to recover the backup to a transactionally consistent point in time which is essentially when the backup completes (this is a lot more complex than it sounds but Paul wrote a comprehensive post on exactly what this means here). Simply put, this requirement means that transaction log backups CAN occur concurrently; however, the log CANNOT be cleared until the backup completes. The primary negative effect is that the transaction log may require auto-growth and become very large. However, the positives are that you can do granular backups in any recovery model (although there are still some limitations to how this works in the SIMPLE recovery model but they added a new option during backup to allow a backup of ALL of the READ_WRITE_FILEGROUPS as a unit - separately from the read-only file groups which could be backed up at any time after they are set to READ_ONLY).

So, to answer the questions for SQL Server 2005:

  • Is the log backed up as part of the other backups?
    • for Database-level backups: YES
    • for Granular backups: *YES*
  • Is it cleared?
    • When a transaction log backup is performed then the default behavior is to clear the inactive portion of the log
    • When a database-level backup is performed AND there's no corresponding log chain (meaning the first time you backup the database), then yes, the inactive portion of the log is cleared. As far as breaking the continuity of the transaction log... In SQL Server 2005, they significantly reduced the problems that occur after the log chain is broken by NOT allowing log backups to continue. If a log backup is attempted after the continuity of the log is broken then you will receive error: 
        Msg 4214, Level 16, State 1, Line 1
        BACKUP LOG cannot be performed because there is no current database backup.

      So, this means that you don't necessarily need the trace flags. However, I still recommend using the trace flag because it would be better to not break the continuity of the trace flag to begin with! And, in fact, in SQL Server 2005, there are two trace flags: 3231 and 3031. They are both safe and here's how the two differ:
      • Trace Flag 3231 (same as 2000): When set, BACKUP LOG with TRUNCATE_ONLY and BACKUP LOG with NO_LOG do not allow a log backup to run if the database's recovery model is FULL or BULK_LOGGED.
      • Trace Flag 3031 (new in 2005): When set, BACKUP LOG with TRUNCATE_ONLY and BACKUP LOG with NO_LOG run as a CHECKPOINT - regardless of recovery model.
  • Is there anything else that's affected?
    • Log backups are *NOT* paused during database-level backups
    • When restored, database-level backups are transactionally consistent (and can be recovered directly - without restoring additional logs)
    • When restored, granular backups are transactionally consistent (and can be recovered directly - without restoring additional logs). However, you must always remember that the database cannot be brought online until the entire database is at a single transactionally consistent point in time. All read-write-filegroups must be restored as a unit (if in the SIMPLE recovery model) OR you must use transaction log backups to recover the entire database up to the SAME point in time.
  • What happens to the log during other database, filegroup, file, database-differential, filegroup-differential and file-differential backups?
    • for Database-level backups: log backups *can* occur concurrently (but the log will not be cleared until the backup completes) 
    • for Granular backups: log backups can occur concurrently (but the log will not be cleared until the backup completes) 

SQL Server 2008
Almost everything is the same in SQL Server 2008 as it was in 2005 - they made the largest number of improvements in 2005. However, one thing did change. In SQL Server 2008, the BACKUP LOG with NO_LOG and BACKUP LOG with TRUNCATE_ONLY options are not allowed at all. There is no need for the trace flags (3231/3031) because breaking the continuity of the log is not allowed (well, there is still a way... I'll get to that in a moment :)). In SQL Server 2008, if BACKUP LOG with NO_LOG or BACKUP LOG with TRUNCATE_ONLY are attempted, you will receive this error:
     Msg 3032, Level 16, State 2, Line 1
   One or more of the options (no_log) are not supported for this statement. Review the documentation for supported options.
But, what if you really don't want to backup the log? Why? Take this scenario (from a real customer!)... You have a 10GB database that's been around for quite some time AND you're doing regular full database backups... then, all of a sudden you run out of disk space. In looking around for large files (to investigate why you ran out of space), you find that this 10GB database's log is 987GB... so, you wonder - what happened? A database that is in the FULL recovery model (remember, this is the default) requires transaction log management. The easiest way to manage the log is with regular log backups; however, you're only doing full database backups (which do NOT clear the log). As a result, the transaction log grows and grows and grows and grows - until you're out of disk space (Paul demo'ed this in a TechNet Podcast here). At this point, how do you get rid of this 987GB transaction log? In prior releases, you can "clear" the log by using TRUNCATE_ONLY or NO_LOG but in 2008, what do you do? Switch to the SIMPLE recovery model. And, if you only want to do full database backups, stay there. And, if you want to physically shrink down the size of the transaction log file to a reasonable size - check out these two related blog posts: 8 Steps to Better Transaction Log Throughput and Transaction Log VLFs - Too many or too few?. And, in related news, Linchi Shea posted a good post on some tests he ran related to too many VLFs here and a second post that shows that some workloads don't see any issues wrt to lots of VLFs here. But, the long story short is that you still want to be proactive about creating a reasonably sized transaction log (my two other previously mentioned posts). Significant auto-growth can cause problems and backup operations (and managment in general) can be more difficult with lots of VLFs.

Wow, that was much longer than I was expecting... and, in writing it all down - pretty complex (I had a hard time trying to section things but I think this works?!). Regardless, all the facts are there so this should help to clarify what happens the when, where and why - wrt to the transaction log. Let me know if you have more questions!

Thanks for reading,
kt

Way back in June 2005, I blogged about '8 Steps to better transaction log throughput'. I did this blog post after seeing (again and again) overly fragmented transaction logs... Transaction logs can become *VERY* fragmented when they are not preallocated and instead they grow excessively through unmanaged (and probably the default settings for) auto-growth.

While having WAY too many VLFs because of auto-growth is still the most common form of problem within transaction logs, another problem has been creeping up more and more... too few VLFs. If you preallocate a very large transaction log (10s to 100s of GB), SQL Server may only allocate a few VLFs - as a result, log backups will be allowed to run normally but, SQL Server only clears the inactive VLFs when you've moved into a different VLF. If your VLFs are 8GB in size, then you need to accumulate 8GB of log information before the log can be cleared...so, many of your log backups will occur normally but then one (the one that finally hits > 8GB in used size) will take quite a bit more time AND possibly cause you performance problems because it's now clearing 8GB of log information.

First, here's how the log is divided into VLFs. Each "chunk" that is added, is divided into VLFs at the time the log growth (regardless of whether this is a manual or auto-grow addition) and it's all dependant on the size that is ADDED not the size of the log itself. So, take a 10MB log that is extended to 50MB, here a 40MB chunk is being added. This 40MB chunk will be divided into 4 VLFs. Here's the breakdown for chunksize:

chunks less than 64MB = 4 VLFs

chunks of 64MB and less than 1GB = 8 VLFs

chunks of 1GB and larger = 16 VLFs

And, what this translates into is that a transaction log of 64GB would have 16 VLFs of 4GB each. As a result, the transaction log could only clear at more than 4GB of log information AND that only when it's completely inactive.

To have a more ideally sized VLF, consider creating the transaction log in 8GB chunks (8GB, then extend it to 16GB, then extend it to 24GB and so forth) so that the number (and size) of your VLFs is more reasonable (in this case 512MB).

Have fun and thanks for reading!!
kt

PS - I've been made aware of a bug when you use an exact size of 4096MB. I'll get more details and post them here but the long story short is to avoid 4096MB as an exact value. I've been told (and I haven't played with this one yet), that 4095 doesn't have the problem. Oh, and the problem is that the 4GB does NOT get divided into equally sized VLFs.

NOTE/UPDATE: Be sure to read this post as well: Transaction Log VLFs too many or too few? after reading this one. Not only can you have too many small VLFs but if incorrectly sized, you can have too few! Enjoy!!

On a few of my last trips, I've had the pleasure of doing a lot of customer visits - visiting some of the larger implementations of SQL Server. Many of these implementations are at banks where both performance and recovery are critical. After my trip to Turkey (where I did customer visits prior to presenting at the Microsoft Professional Developers Summit), I received this email:

As you might remember we talked at xxxbank, Turkey about performance problems in transaction log backups and splitting tempdb data file into equal size chunks equal to number of CPUs. I have implemented both optimizations in my troubled server, and the results are great! Transaction log backups do not impact the server at all and with lowered congestion in tempdb, overall system performance has gone up as well.

So - this is my first of two blog entries targeting these two VERY typical problems:
* Transaction log fragmentation (both internal and external - this customer was having problems with internal fragmentation)
* Optimizing TempDB

We'll start with 8 Steps to Optimizing your Transaction Log - not necessarily in any order. It's best to review all of these to make sure you have a comprehensive view of how to improve both performance and recovery of your transaction logs!

1) Try to dedicate the transaction log portion of your database to its own phyiscal disk. In high volume OLTP system, isolating the transaction log can allow the disk head to be ready for the next write by not having other files contend for the physical disk resource. If your database already exists, the best way to "move" your transaction log is to detach your database (sp_detach_db) and then reattach it (sp_attach_db). Here's a great KB article titled: Moving SQL Server databases to a new location with Detach/Attach.

2) Defrag the disk(s) on which your transaction logs reside. This will get rid of external fragmentation of the transaction log - better known as disk file fragmentation. This will require that your server be taken offline but if your files have had a lot of autogrowth and/or they reside on a disk with a lot of other files that have been modified, then all of your files (incl. the transaction log file) are likely to be interleaved and fragmented. The good news is that this is a one time operation (unless the files continue to grow and shrink excessively - but we'll fix that as well). The bad news is that you'll need to shutdown SQL Server and then use some form of system tool - Microsoft or third party - to defrag. On Windows XP and Windows Server 2003 machines, the disk defragmenter tool can be launched by running: dfrg.msc.

3) Create only ONE transaction log file. Even though you can create multiple transaction log files, you only need one... SQL Server DOES not "stripe" across multiple transaction log files. Instead, SQL Server uses the transaction log files sequentially. While this might sound bad - it's not. If you want to get better performance out of the transaction log, place it on faster disks and/or a more performant disk (RAID) configuration. You could place the transaction log on a RAID 0 array but you'll likely want fault tolerance as well as performance. Isolation, as per number 1 is a good start but if you also need increased capacity then using hardware based RAID to combine disks for both better performance and redundancy is your next choice. See 4.

4) Not only should you try to isolate the transaction log to its own physical disk but you should make sure that the logical/physical disk configuration is as efficient as possible. Try to use an isolated RAID 1 mirroring set if you don't need significant capacity. If you need a greater capacity OR you want better performance, consider a combination of RAID 0 and RAID 1 (either RAID 0 + 1 or RAID 1 + 0). While RAID 0 + 1 can often offer better performance, RAID 1 + 0 offers better reliability. If you're new to RAID and are interested in learning more - check out the RAID Tutorial here: http://www.raidarray.com/04_01_00.html.

5) Don't be caught up in nothing but transaction log speed, you'll also want to make sure that your transaction log is always available as this can help you in times of disaster. Even if the data is damaged, if the transaction log is available and you have a series of backups up to and including the last transaction log then you can make a final backup of the transaction log that will represent all of the changes since your last transaction log backup. If this backup is possible (it's called backing up the "tail" of the log), then you can achieve up-to-the-minute recovery. This is only possible when the log is available. By placing the log on mirrored disks you can increase your chances of recovering data and minimize data loss!

6) Create transaction log files with a reasonable initial size. When you create a database it's ideal to PRE-ALLOCATE both your data files and your transaction log file. A little bit of capacity planning goes a long way... Now, if you think that you've got absolutely no idea how to size your transaction log you're going to need, well - here are the things that have the greatest impact:

  • Type of activity - transaction processing or decision support
  • Frequency of that activity - the more frequent the changes, the faster the transaction log will grow
  • Recovery Model - the recovery model of the database
  • Frequency of transaction log backups
  • Whether or not replication is used (since the log reader relies on the transaction log)

I wish I could give you a rough idea on sizing but if you look only at database size and none of these other factors, you could end up with a transaction log that's seriously oversized or seriously undersized. I've seen recommendations of 10-25% of the size of the data and you can use that but I would also add a bit of common sense. A larger database with very frequent transaction log backups may not need a transaction log that's even 1% of the size of the data... The best way is to setup your development environment similar to that of your production environment (including backup jobs) and then see how the transaction log grows. If you have a lot of autogrowth (because your guess was wrong), you can later clean up the fragmentation that has occurred and get back to a reasonable, intact, and optimal transaction log. If you want to know more about Recovery Models and their impact on the transaction log, review a sample book chapter from an MSPress book (Microsoft® SQL Server™ 2000 High Availability) that I helped to co-author. I only helped out on a couple of chapters but this specific chapter explains a lot about the general database environment settings that impact recovery. You can download Chapter 9: Database Environment Basics for Recovery here: http://www.sqlskills.com/resources/SQLServerHAChapter9.pdf

7) Don't let autogrowth get out of control. As important as capacity planning, you're not likely to be spot-on in your estimates. I don't recommend completely turning off autogrowth but in general I also don't like SQL Server 2000's default growth rate (or max size). In general, I would recommend setting ALL of the transaction log files attributes: initial size, growth rate AND maximum size. For the growth rate, I recommend something that can be allocated somewhat quickly and something of a fixed size. In general, I recommend a value which is less than or equal to 1GB (based on total size) but something that doesn't mean that you're going to autogrow again soon. So, for databases whose transaction logs are under 1GB then you might set autogrow to somewhere between 20 and 100 MB. For those of you who have transaction logs measured in GB, then I'd set the autogrowth to 500MB or 1GB. In general, I don't like the percentage because it needs to be calculated (which I realize isn't really all that big of a deal) but the larger the file the larger the autogrowth and the longer it takes. In SQL Server 2000, autogrowth can create blocking...so, it's best to minimize this in general.

8) Check/fix your internal fragmentation. OK, so this is the one that really helped the customer in Turkey. Often, when transaction logs are not pre-allocated and/or when there's been a lot of autogrowths, the transaction log can become internally fragmented. Internally your transaction logs are broken down into smaller more granular chunks called VLFs (Virtual Log Files). The size and number of VLFs you'll have depends largely on the size that the chunk is when it's added to you transaction log. If you add a new chunk to the transaction log which is 20MB (through autogrowth or through manual growth) then the number of VLFs that are added is 4. If you add a chunk which is greater than 64MB but less than or equal to 1GB, you'll add 8 VLFs. If you add more than 1GB then you'll add 16VLFs. In general, most transaction logs will only have 20 or 30 VLFs - even 50 could be reasonable depending on the total size of the transaction log. However, in many cases what happens is that excessive autogrowths can cause an excessive number of VLFs to be added - sometimes resulting in hundreds of VLFs. Having an excessive number of VLFs can negatively impact all transaction log related activities and you may even see degradation in performance when transaction log backups occur. To see how many VLFs you have solely look at the number of rows returned by DBCC LOGINFO. The number of rows returned equals the number of VLFs your transaction log file has. If you have more than 50, I would recommend fixing it and adjusting your autogrowth so that it doesn't occur as fequently. To get rid of all of the execessive VLFs, follow these easy steps to shrink off the fragmented chunk and add a new, clean chunk to your transaction log:

1. Wait for an inactive time of day (ideally, it would be best to put the database into single user mode first) and then clear all transaction log activity through a regular transaction log backup. If you're using the simple recovery model then you don't need to do a log backup... Instead, just clear the transaction log by running a checkpoint.

 BACKUP LOG databasename TO devicename

2. Shrink the log to as small a size as possible (truncateonly)

DBCC SHRINKFILE(transactionloglogicalfilename, TRUNCATEONLY)

NOTE: if you don't know the logical filename of your transaction log use sp_helpfile to list all of your database files.

3. Alter the database to modify the transaction log file to the appropriate size - in one step

ALTER DATABASE databasename
MODIFY FILE 

      NAME = transactionloglogicalfilename 
    , SIZE = newtotalsize
)

And, that should do it!

Theme design by Nukeation based on Jelle Druyts