This is a quick post inspired by a question I was sent in email (thanks Marcos!) which very neatly lets me show a DMV I’ve been meaning to blog about for a while. And the weather here in Redmond really sucks right now so I can’t go outside – blogging will serve as my work-avoidance strategy this afternoon :-).
The (paraphrased) question is: A checkpoint is a process that writes all dirty pages to disk, and is per-database. So, if the data cache can hold a page from any database, how does checkpoint know which pages to check for a dirty status? Does it scan through buffer pool looking for pages for database X and process only those? Or is data cache somehow partitioned by database? I’d like to know a bit better how it works under the covers.
The answer is that pages are stored in buffers in the buffer pool (aka buffer cache or data cache), and the buffers are indeed hashed so they can easily be found by database. You can see what pages are currently in the buffer pool, and their status using the sys.dm_os_buffer_descriptors DMV in 2005:
SELECT * FROM sys.dm_os_buffer_descriptors; GO
database_id file_id page_id page_level allocation_unit_id page_type row_count free_space_in_bytes is_modified ----------- -------- -------- ----------- -------------------- -------------- ----------- ------------------- ----------- 1 1 9 0 6488064 BOOT_PAGE 1 7362 0 1 1 6 0 6488064 DIFF_MAP_PAGE 2 6 0 1 1 7 0 6488064 ML_MAP_PAGE 2 6 0 1 1 104 0 262144 DATA_PAGE 100 4196 0 1 1 105 0 851968 DATA_PAGE 65 5041 0 1 1 106 0 262144 DATA_PAGE 197 413 0 1 1 107 0 262144 DATA_PAGE 207 23 0 1 1 108 1 262144 INDEX_PAGE 7 7949 0 . .
I cut off the output rather than list all 3258 pages in the buffer pool on my laptop. The DMV gives you back some info from the pages themselves as well as you can see(remember all this is in memory so it’s quick to find).
I played around with the DMV a little bit and came up with a neat script that will tell you may many clean and dirty pages there are in the buffer pool per-database.
SELECT (CASE WHEN ([is_modified] = 1) THEN N'Dirty' ELSE N'Clean' END) AS N'Page State', (CASE WHEN ([database_id] = 32767) THEN N'Resource Database' ELSE DB_NAME ([database_id]) END) AS N'Database Name', COUNT (*) AS N'Page Count' FROM sys.dm_os_buffer_descriptors GROUP BY [database_id], [is_modified] ORDER BY [database_id], [is_modified]; GO
Page State Database Name Page Count ---------- ----------------------- ---------- Clean master 302 Dirty master 1 Clean tempdb 88 Dirty tempdb 52 Clean model 56 Clean msdb 622 Dirty msdb 5 Clean adventureworks 110 Clean DemoRestoreOrRepair 64 Clean DBMaint2008 88 Clean DemoFatalCorruption1 64 Clean DemoFatalCorruption2 64 Clean broken 64 Clean DemoFatalCorruption3 64 Clean DemoCorruptMetadata 111 Clean DemoDataPurity 88 Clean SalesDB 123 Clean DemoNCIndex 88 Clean shrinktest 88 Clean DemoRestoreOrRepairCopy 64 Clean DemoSuspect 64 Clean FileHeaderTest 96 Clean MultiFileDB 96 Clean HA2008 88 Clean SalesDB_Snapshot 21 Clean BootPageTest 88 Clean Resource Database 599
Later this week I’ll try to blog a script that can tell you how much of a particular table is in memory ([Edit 12/8/11: that script is here]). Enjoy!
14 thoughts on “Inside the Storage Engine: What’s in the buffer pool?”
Paul,
I like your script using the sys.dm_os_buffer_descriptors DMV.
In the interest of readability, I revised it to show just one line per database, with both the clean and dirty page counts on the same line. It will usually have fewer result lines than the original version, and will show counts of zero if a given database didn’t have both clean and dirty pages in the buffer pool.
The revised SQL statement is:
SELECT
(CASE WHEN ([database_id] = 32767) THEN ‘Resource Database’ ELSE DB_NAME (database_id) END) AS ‘Database Name’,
SUM(CASE WHEN ([is_modified] = 1) THEN 1 ELSE 0 END) AS DirtyPageCount,
SUM(CASE WHEN ([is_modified] = 1) THEN 0 ELSE 1 END) AS CleanPageCount
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY DB_NAME(database_id)
GO
The resulting output for your example data would be (recreated manually – I couldn’t get tabs to work in this post – sorry!):
Database Name DirtyPageCount CleanPageCount
—————– ————– ————–
Resource Database 0 599
adventureworks 0 110
BootPageTest 0 88
broken 0 64
DBMaint2008 0 88
DemoCorruptMetadata 0 111
DemoDataPurity 0 88
DemoFatalCorruption1 0 64
DemoFatalCorruption2 0 64
DemoFatalCorruption3 0 64
DemoNCIndex 0 88
DemoRestoreOrRepair 0 64
DemoRestoreOrRepairCopy 0 64
DemoSuspect 0 64
FileHeaderTest 0 96
HA2008 0 88
master 1 302
model 0 56
msdb 5 622
MultiFileDB 0 96
SalesDB 0 123
SalesDB_Snapshot 0 21
shrinktest 0 88
tempdb 52 88
I also changed the “Order By” clause to sort alphabetically by DB name (rather than DB ID).
Scott R.
Nice – thanks!
I ran this script on a few production servers & noticed that tempdb seems to always be the DB with the most dirty pages. Even busy DBs tended to have only a few hundred pages dirty at max, whilst tempdb was often in the 10’s of thousands.
Any explanation as to why this is the case? Are they not checkpointed perhaps?
Hey Greg – tempdb does not have automatic checkpoints the same as user databases, as there’s no such thing as crash recovery of tempdb. As such, the only automatic checkpoint of tempdb is when the transaction log becomes 70% full. This is what’s causing the behavior you’re seeing. Cheers
Hi Paul –
Thanks for the script but the original question was regarding the checkpoint how does it know which are dirty pages.
Does it scan all the pages or goes to the particular page because it is hashed.
Thanks
Scans looking for buffer with a DIRTY bit set.
What is “clean page” in the above script?
A clean page is one that has not been changed since it was last read from or written to disk. A dirty page has been changed.
Paul, i runned the script and found some dbs showing clean page count with more than 10000 , does it mean the clean pages stored in the buffer pool causes the memory occupied for sql engine ?
Yes, anything in the buffer pool is using memory (8KB per page) but that’s perfectly normal.
When I see lot of ‘dirtypages’ in different databases should I issue run ‘checkpoint’ to clears dirty pages? or what do I need to do? Please explain.
No – you don’t need to do anything. SQL Server will automatically checkpoint the various databases over time.
Hello Paul,
I’m facing with a question that drives me crazy:
I try to figure out if plan cache (like other memory areas) are included in buffer pool.
I consider the output of sys.dm_os_memory_clerks, where MEMORYCLERK_SQLBUFFERPOOL, MEMORYCLERK_SQLCONNECTIONPOOL, CACHESTORE_SQLCP seem to be distinct memory areas. From that output I understand that plan cache (CACHESTORE_SQLCP + CACHESTORE_OBJCP + CACHESTORE_PHDR + CACHESTORE_XPROC) is not included in MEMORYCLERK_SQLBUFFERPOOL because they are different clerk types.
At the same time if I run DBCC FREEPROCCACHE, MEMORYCLERK_SQLBUFFERPOOL is not touched at all.
The same I say about connection memory (MEMORYCLERK_SQLCONNECTIONPOOL) and others…
Am I correct?
Thanks!
It’s complicated. See the various blog posts on memory on Slava Ok’s old blog.