The Database Compression feature in SQL Server 2008 Enterprise Edition can provide some significant reductions in storage requirements for SQL Server databases, and in the right implementations and scenarios performance improvements as well. There isn’t really a whole lot of information about the operations of database compression that is documented as being available in the DMV’s or SQL Trace. Paul Randal pointed out on Twitter today that sys.dm_db_index_operational_stats() provides the page_compression_attempt_count and page_compression_success_count available. Beyond that the only other documented information for monitoring Data Compression are the Page Compression Attempts/sec and Pages Compressed/sec Performance Counters of the SQL Server:Access Methods object in Perfmon (http://msdn.microsoft.com/en-us/library/cc280449.aspx).
There is one thing in common about the documented methods of monitoring Data Compression, and that is they all only deal with Page compression, and not Row compression, and in Extended Events we find the same commonality as there are no Row compression Events in Extended Events. There are two Page compression Events in Extended Events; sqlserver.page_compression_attempt_failed and sqlserver.page_compression_tracing. These two Events can be used to track Page compression operations at multiple levels, including database, object, index, and even down to the individual page. The sqlserver.page_compression_tracing Event provides Start and End tracing of Page compression operations inside of the Database Engine and returns the database_id, index_id, rowset_id, page_id, and duration of the compression operation. The sqlserver.page_compression_attempt_failed is really poorly named, and doesn’t provide information about failures in the sense that something broke, but provides information for why a page compression attempt did not actually change the compression of the data in the page. It also returns the database_id, index_id, rowset_id, and page_id for the compression attempt, and it also includes a failure_reason column which correlates to the page_compression_failure_reason Map Value.
-- Get the payload information for the Events SELECT object_name, column_id, name, type_name FROM sys.dm_xe_object_columns WHERE object_name IN ('page_compression_tracing', 'page_compression_attempt_failed') AND column_type = 'data'
To demonstrate how these Events function, I am going to use the LineItem table from the TPC-H Benchmark that was created by Quest Benchmark Factory using Level 2 for the table sizing, which makes the table just at 1.8GB in size. All of the indexes on the table will be rebuilt using PAGE compression, and then 10,000 rows will be added to the table. To setup the environment, first load the TPC-H LineItem table with the appropriate seed of data, this can be done with the free trial version of Benchmark Factory. Then rebuild all of the indexes on the LineItem table using PAGE compression, and review the PAGE compression statistics from sys.dm_db_index_operational_stats for the database and object.
USE [TPCH] GO -- Rebuild the indexes with Page compression ALTER INDEX ALL ON dbo.H_Lineitem REBUILD WITH (DATA_COMPRESSION = PAGE) GO -- Look at the compression information in sys.dm_db_index_operational_stats SELECT database_id, object_id, index_id, page_compression_attempt_count, page_compression_success_count, (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null) GO
Once the table and its indexes have been rebuilt using PAGE compression, we can then create our Event Session, start it, and add 10,000 rows to the LineItem table. After we add the rows, we can then check the page compression statistics in sys.dm_db_index_operational_stats, and drop our Event Session from the server.
-- Create an Event Session to Track the Failed attempts CREATE EVENT SESSION PageCompressionTracing ON SERVER ADD EVENT sqlserver.page_compression_attempt_failed, ADD EVENT sqlserver.page_compression_tracing ADD TARGET package0.asynchronous_file_target( SET filename='C:\SQLBlog\PageCompressionTracing.xel', metadatafile='C:\SQLBlog\PageCompressionTracing.xem') WITH (MAX_MEMORY = 8MB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY=5SECONDS) GO -- Start the Event Session ALTER EVENT SESSION PageCompressionTracing ON SERVER STATE=START GO -- Insert 10000 rows into the H_Lineitem table INSERT INTO H_Lineitem (l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment) SELECT TOP 10000 l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment FROM H_Lineitem GO -- Look at the compression information in sys.dm_db_index_operational_stats SELECT database_id, object_id, index_id, page_compression_attempt_count, page_compression_success_count, (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null) GO -- Drop the Event Session DROP EVENT SESSION PageCompressionTracing ON SERVER GO
Now we can parse the Events that were captured by our Event Session and compare the information presented by sys.dm_db_index_operational_stats() with what was collected by Extended Events.
-- Create our result Analysis database CREATE DATABASE [PageCompTestResults] GO USE [PageCompTestResults] GO -- Create intermediate temp table for raw event data CREATE TABLE RawEventData (Rowid int identity primary key, event_data xml) GO -- Read the file data into intermediate temp table INSERT INTO RawEventData (event_data) SELECT CAST(event_data AS XML) AS event_data FROM sys.fn_xe_file_target_read_file('C:\SQLBlog\PageCompressionTracing*.xel', 'C:\SQLBlog\PageCompressionTracing*.xem', null, null) GO -- Fetch the Event Data from the Event Session Target SELECT RowID, event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name, DATEADD(hh, DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp], COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'), event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id, event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') AS [file_id], event_data.value('(event/data[@name="page_id"]/value)[1]', 'int') AS [page_id], event_data.value('(event/data[@name="rowset_id"]/value)[1]', 'bigint') AS [rowset_id], event_data.value('(event/data[@name="failure_reason"]/text)[1]', 'nvarchar(150)') AS [failure_reason], event_data.value('(event/action[@name="system_thread_id"]/value)[1]', 'int') AS [system_thread_id], event_data.value('(event/action[@name="scheduler_id"]/value)[1]', 'int') AS [scheduler_id], event_data.value('(event/action[@name="cpu_id"]/value)[1]', 'int') AS [cpu_id] INTO ParsedResults FROM RawEventData GO
After parsing out the data, we can begin to really leverage the information we’ve gathered. If we join the ParsedResults table to sys.partitions for our TPCH database by rowset_id = hobt_id, we can get the object_id and index_id and aggregate the failure reasons up to the object and index level.
SELECT pr.database_id, p.object_id, p.index_id, failure_reason, COUNT(*) as failure_count FROM TPCH.sys.partitions p JOIN ParsedResults pr ON pr.rowset_id = p.hobt_id WHERE event_name = 'page_compression_attempt_failed' GROUP BY pr.database_id, p.object_id, p.index_id, failure_reason GO -- Look at the compression information in sys.dm_db_index_operational_stats SELECT database_id, object_id, index_id, page_compression_attempt_count, page_compression_success_count, (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null) GO
With this we can se that the Extended Events sqlserver.page_compression_attempt_failed Event tracks failures and attempts that are not counted in sys.dm_db_index_operational_stats(). The PageModCountBelowThreshold failure isn’t really a failed attempt at compression. This reason shows that the page was evaluated for recalculation, and the modified counter for the page hadn’t passed the internal threshold for recalculation so the actual compression operation wasn’t performed. If we look at the sqlserver.page_compression_tracing Event information, we can see how the numbers begin to come together to match what is output by sys.dm_db_index_operational_stats().
SELECT pr.database_id, p.object_id, p.index_id, COUNT(*) as attempt_count FROM TPCH.sys.partitions p JOIN ParsedResults pr ON pr.rowset_id = p.hobt_id WHERE event_name = 'page_compression_tracing' AND opcode = 'Begin' GROUP BY pr.database_id, p.object_id, p.index_id GO -- Look at the compression information in sys.dm_db_index_operational_stats SELECT database_id, object_id, index_id, page_compression_attempt_count, page_compression_success_count, (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null) GO
We have 193 attempts by this Event, and we have 72 PageModCountBelowThreshold failures, matching our actual attempts of 121 from the DMF. We can then subtract out the other failures and get the 93 successful operations matching the DMF as well.