SQL Server 2005 and 2008 – Compression

OK, I still have a way to go in learning about data compression in SQL Server 2008 but one thing that I do know is that nothing is free. So, the trade-off will be performance (i.e. CPU) v. space. And, that’s not really a new trade-off wrt to compression. Sometimes that trade-off has other benefits that still minimize the overall cost (for example, backup compression compresses in-memory and before it goes to disk… this actually makes the overall backup process faster because the actual backup written to disk is smaller). However, if we’re talking about data and data access, then we need to think more about how the data is going to be used as well as the impact on performance. I can definitely think of many reasons to compress older (and read-mostly, if not read-only) data (mostly due to volume) but depending on the queries and the impact to uncompress it (based on the volume of data being accessed), I’m going to do a lot of testing before I compress high performance/OLTP data. To help estimate the savings on space, SQL Server 2008 offers a stored-proc: sp_estimate_data_compression_savings.


Compression in SQL Server 2005
SQL Server 2005 offers the ability to have read-only data compressed using Windows NTFS file compression. File compression is only supported for secondary non-primary data files and only when they’re set to read-only. If the entire database is set to read-only then all files (incl. the primary and log) can be on compressed drives. While supported, and it can make sense to do this when you have large amounts of historical data, it’s still not very granular.


The other form of compression in SQL Server 2005 was introduced in SP2 as data compression for the decimal/numeric data types, called vardecimal. First, you enable compression at the database level and then you turn it on at the table level. The primary form of compression used by vardecimal is when your actual values are generally much smaller than the defined/declared decimal/numeric column. For example, if you’ve chosen to define a lot of columns as precision/scale (38,4) then as a decimal column each value (per column, per row) will take 17 bytes whether you use all of it or not. If you only store the value 87.5 (which would normally take only 5 bytes as a decimal(3,1)) then you’re wasting 12 bytes. This form of compression will still be supported in SQL Server 2008 so if you’re interested in how the vardecimal type works, check out this whitepaper. As for the new forms of compression… row-level compression is similar to vardecimal, but the other forms are quite different, and very interesting (especially the page-level dictionary compression)!


Compression in SQL Server 2008
In addition to offering support for NTFS file compression and vardecimal, SQL Server 2008 offers row-level compression or page-level compression (which includes row-level compression) AND it offers the ability to turn these on at the partition-level or at the table-level for all partitions. While I think the per-partition option is excellent, you might still want to separate your OLTP and read-only data into separate tables for other benefits (like online index operations which I mentioned here) but, the “table-level only” options are certainly starting to decrease! And, more granular options always means better manageability.


So, how does compression work in SQL Server 2008:
   Paul wrote about backup compression here.
   Sunil wrote about data compression here and here.
   Chad Boyd wrote about both here.


Paul and I will post more on compression… I really want to get some numbers regarding performance and Paul will dive into all of the internals using DBCC PAGE (go figure! :).


Enjoy!
kt

2 thoughts on “SQL Server 2005 and 2008 – Compression

  1. Hi
    Having problem to kill user process.

    OS: windows server 2003
    database version: sql server 2005

    i want to kill user process.when i tried to kill user process using its spid… kill 57
    i got error SPID 57: transaction rollback in progress. Estimated rollback completion: 52%. Estimated time remaining: 448373 seconds.
    i also found that this process has suspended status and wait_type is BACKUPTHREAD
    one more thing i wait to complete rollback transcation since last 3 days.
    but after 3 days getting same error and estimated time increase every time i issue KILL command.
    even i tried activity monitor to kill process.but no luck
    is there any way to slove this issue???
    how to kill this process.??
    well i havnt try to stop and restart sql server processes.because its producation server.will database complete shudown and server process restart help to overcome this issue.
    your help is really appreciate.
    thanks,
    saurabh

  2. A great use of the compression might be to analyze IIS logs. They are HUGE text files but compressed the data is very small.

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Wow! Wow! Wow! THANK YOU!

I announced my retirement from SQL/tech here and your comments on my blog, on LinkedIn, and on Facebook were overwhelming and humbling! I’m so touched

Explore

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.