Now that I’ve done all the business-related blog posts, back to the good stuff to stop people complaining!
Something that’s cropped up a few times over the summer so far is people trying to repair boot page corruptions.
First off, what’s a boot page? Every database has a single page that stores critical information about the database itself. It’s always page 9 in file 1 (the first file in the PRIMARY filegroup). You can examine the page using DBCC PAGE and it will interpret all the fields for you, but there’s another command, DBCC DBINFO, that also dumps all this info (in fact the DBCC PAGE code calls the same underlying dumping code). This command is undocumented and unsupported but widely known and ‘documented’ in lots of places on the web – given that it uses the same code as DBCC PAGE, it’s just as safe to use IMHO.
Note that the output of DBCC DBINFO and DBCC PAGE of the boot page may change from release to release!
So what’s on the boot page?
DBCC DBINFO (‘BootPageTest’);
GO
DBINFO STRUCTURE:
DBINFO @0x5BF6EF84
dbi_dbid = 19 dbi_status = 65536 dbi_nextid = 2073058421
dbi_dbname = BootPageTest dbi_maxDbTimestamp = 2000 dbi_version = 611
dbi_createVersion = 611 dbi_ESVersion = 0
dbi_nextseqnum = 1900-01-01 00:00:00.000 dbi_crdate = 2008-07-10 15:53:18.843
dbi_filegeneration = 0
dbi_checkptLSN
m_fSeqNo = 41 m_blockOffset = 29 m_slotId = 55
dbi_RebuildLogs = 0 dbi_dbccFlags = 2
dbi_dbccLastKnownGood = 1900-01-01 00:00:00.000
dbi_dbbackupLSN
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
dbi_oldestBackupXactLSN
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
dbi_LastLogBackupTime = 1900-01-01 00:00:00.000
dbi_differentialBaseLSN
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
dbi_createIndexLSN
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
dbi_versionChangeLSN
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
dbi_familyGUID = a4e88c13-b4cf-4320-834e-92b237244d4b
dbi_recoveryForkNameStack
entry 0
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
m_guid = a4e88c13-b4cf-4320-834e-92b237244d4b
entry 1
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
m_guid = 00000000-0000-0000-0000-000000000000
dbi_differentialBaseGuid = 00000000-0000-0000-0000-000000000000 dbi_firstSysIndexes = 0001:00000014
dbi_collation = 872468488 dbi_category = 0 dbi_maxLogSpaceUsed = 231936
dbi_localState = 0 dbi_roleSequence = 0
dbi_failoverLsn
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
dbi_dbmRedoLsn
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
dbi_dbmOldestXactLsn
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
dbi_dbMirrorId = 00000000-0000-0000-0000-000000000000
dbi_pageUndoLsn
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
dbi_disabledSequence = 0
dbi_dvSplitPoint
m_fSeqNo = 0 m_blockOffset = 0 m_slotId = 0
dbi_CloneCpuCount = 0 dbi_CloneMemorySize = 0
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
There’s all kinds on interesting things in there, for instance:
-
dbi_version and dbi_createversion: the physical version number of the database (and when it was created). See question 1 in the August 2008 SQL Q&A column in TechNet Magazine for an explanation (see here).
-
dbi_RebuildLogs: a count of the number of times the transaction log has been rebuilt for the database. PSS can use this to tell whether corruption problems could have been caused by DBAs rebuilding the log
-
dbi_dbccLastKnownGood: the completion time of the last ‘clean’ run of DBCC CHECKDB
-
a bunch of different LSNs related to checkpoint, backups, database mirroring
-
dbi_LastLogBackupTime: self-explanatory
-
dbi_differentialBaseGuid: the GUID generated by the last full database backup. Differential backups can only be restored on top of a matching full backup – so an out-of-band full backup could screw-up your disaster recovery – see this blog post for more info.
Now, what about if this page is corrupt in some way? I corrupted the BootPageTest database to have a corrupt boot page. Let’s see what happens:
USE BootPagetest;
GO
Msg 913, Level 16, State 4, Line 1
Could not find database ID 19. Database may not be activated yet or may be in transition. Reissue the query once the database is available. If you do not think this error is due to a database that is transitioning its state and this error continues to occur, contact your primary support provider. Please have available for review the Microsoft SQL Server error log and any additional information relevant to the circumstances when the error occurred.
Okay – let’s try setting the database into EMERGENCY mode:
ALTER DATABASE BootPageTest SET EMERGENCY;
GO
Msg 824, Level 24, State 2, Line 1
SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xcdee22fa; actual: 0xcb6ea2fa). It occurred during a read of page (1:9) in database ID 19 at offset 0x00000000012000 in file ‘C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\BootPageTest.mdf’. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
Hmm. What about running DBCC CHECKDB?
DBCC CHECKDB (‘BootPageTest’) WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO
Msg 922, Level 14, State 1, Line 1
Database ‘BootPageTest’ is being recovered. Waiting until recovery is finished.
It’s not looking good. Obviously the change to EMERGENCY mode couldn’t complete properly. What’s the database status?
SELECT [state_desc] FROM sys.databases WHERE [name] = ‘BootPageTest’;
GO
state_desc
————————————————————
RECOVERY_PENDING
The boot page is inaccessible so in effect the database is inaccessible too, this is what the database state means in this case.
So what does this mean? If the boot page is corrupt, you can’t run DBCC CHECKDB so you can’t possibly run repair, and you can’t put the database into EMERGENCY mode so you can’t extract data into a new database. It means that there’s NO WAY to recover from a corrupt boot page EXCEPT to restore from backups. One more reason to have backups… [Edit 2015: You can swap out the entire boot page using a hex editor with one from a restored backup.]
5 thoughts on “Search Engine Q&A #20: Boot pages, and boot page corruption”
Could this be enough reason to have two copies of the boot page? Similar to the MFT (Master File Table) in NTFS where there are two copies kept in case the MFT gets corrupted.
Absolutely! And it would be great if the file header pages at the start of each file were duplicated somewhere else too. There are know plans that I know of to do this – it was discussed a few times when I was in the team but nothing ever came of it. It’s a pretty rare corruption (the chance of that page being corrupted are 1 / #_of_pages_in_database) and can be mitigated with backups.
However, as you say, NTFS does duplicate the MFT, as did the VMS filesystem that I used to be responsible for when I was at DEC.
Thanks
Would you get similar errors if the file header pages of any of the files in Primary were damaged?
What does the dbi_firstSysIndexes refer to? The first IAM of the 2005 equivalent to sysindexes?
How to create two copies of boot page/ file header information
What do you mean two copies?