Time for the first post in the Inside the Storage Engine series. I’m going to focus on SQL Server 2005 in this series and I’ll point out major differences between 2005 and previous versions. Please drop me a line if there’s something you’d like to see explained and demo’d. Before jumping into how things work, I’d like to go over two commands I’ll be using a lot – DBCC PAGE and DBCC IND. These are both undocumented and unsupported commands, but are very safe to use as they’re used extensively inside and outside Microsoft when troubleshooting. Nevertheless, use at your own risk. They’re quite well known in the SQL community and I and others have publicized them before (I even demo’d them last year at ITForum in Spain). To illustrate their use, I’m going to use a simple script I wrote to prove that page splits never roll back. I was having a discussion with someone a while ago about this question and the answer is always no. A page split occurs when an insert or update has to happen at a certain point in an index page, and there’s no room on the page to accomomodate the new or updated record. Page splits are done internally as separate ‘system’ transactions. Once a system transaction commits, it cannot be rolled back – even if the user transaction it was part of rolls back. So, let’s run through the script. First thing to do is create a database containing a table with an index (as page splits only happen in indexes). USE
MASTER;
GO
IF
DATABASEPROPERTY (N‘pagesplittest’, ‘Version’) > 0 DROP DATABASE pagesplittest;GO
CREATE
DATABASE pagesplittest;GO
USE
pagesplittest;GO
CREATE
TABLE t1 (c1 INT, c2 VARCHAR (1000));CREATE
CLUSTERED INDEX t1c1 ON t1 (c1);GO
Now I’m going to fill up a page in the index, but leave a gap in the c1 values so that I can force a page split by inserting the missing key value.
INTO t1 VALUES (1, REPLICATE (‘a’, 900));
INSERT
INSERT
INTO t1 VALUES (2, REPLICATE (‘b’, 900));INSERT
INTO t1 VALUES (3, REPLICATE (‘c’, 900));INSERT
INTO t1 VALUES (4, REPLICATE (‘d’, 900));— leave a gap at 5
INSERT
INTO t1 VALUES (6, REPLICATE (‘f’, 900));INSERT
INTO t1 VALUES (7, REPLICATE (‘g’, 900));INSERT
INTO t1 VALUES (8, REPLICATE (‘h’, 900));INSERT
INTO t1 VALUES (9, REPLICATE (‘i’, 900));GO
I can find out what the first index page is using the DBCC IND command:
IND (‘pagesplittest’, ‘t1’, 1);
DBCC
GO
This command list all the pages that are allocated to an index. Here’s the output in this case:
The columns mean:
PageFID – the file ID of the page
PagePID – the page number in the file
IAMFID – the file ID of the IAM page that maps this page (this will be NULL for IAM pages themselves as they’re not self-referential)
IAMPID – the page number in the file of the IAM page that maps this page
ObjectID – the ID of the object this page is part of
IndexID – the ID of the index this page is part of
PartitionNumber – the partition number (as defined by the partitioning scheme for the index) of the partition this page is part of
PartitionID – the internal ID of the partition this page is part of
iam_chain_type – see IAM chains and allocation units in SQL Server 2005
PageType – the page type. Some common ones are:
1 – data page
2 – index page
3 and 4 – text pages
8 – GAM page
9 – SGAM page
10 – IAM page
11 – PFS page
IndexLevel – what level the page is at in the index (if at all). Remember that index levels go from 0 at the leaf to N at the root page (except in clustered indexes in SQL Server 2000 and 7.0 – where there’s a 0 at the leaf level (data pages) and a 0 at the next level up (first level of index pages))
NextPageFID and NextPagePID – the page ID of the next page in the doubly-linked list of pages at this level of the index
PrevPageFID and PrevPagePID – the page ID of the previous page in the doubly-linked list of pages at this level of the index
So you can see we’ve got a single page clustered index with an IAM page. Note that the page IDs returned may differ on your server. Let’s look at the data page:
TRACEON (3604);
DBCC
GO
DBCC
PAGE (pagesplittest, 1, 143, 3);GO
The traceflag is to make the output of DBCC PAGE go to the console, rather than to the error log. The syntax for DBCC PAGE is:
dbcc page ( {‘dbname’ | dbid}, filenum, pagenum [, printopt={0|1|2|3} ])
The filenum and pagenum parameters are taken from the page IDs that come from various system tables and appear in DBCC or other system error messages. A page ID of, say, (1:143) has filenum = 1 and pagenum = 143.
The printopt parameter has the following meanings:
0 – print just the page header
1 – page header plus per-row hex dumps and a dump of the page slot array (unless its a page that doesn’t have one, like allocation bitmaps)
2 – page header plus whole page hex dump
3 – page header plus detailed per-row interpretation
The per-row interpretation works for all page types, including the allocation bitmaps. In our case, we asked for a detailed output. I’ll explain the various parts of the output in a post about the anatomy of a page. Here’s the output from DBCC PAGE, with a bunch of the repeated per-row info removed for brevity:
PAGE: (1:143)
BUFFER:
BUF @0x02C49720
bpage = 0x05400000 bhash = 0x00000000 bpageno = (1:143)
bdbid = 8 breferences = 0 bUse1 = 22163
bstat = 0xc0010b blog = 0x32159bb bnext = 0x00000000
PAGE HEADER:
Page @0x05400000
m_pageId = (1:143) m_headerVersion = 1 m_type = 1
m_typeFlagBits = 0x4 m_level = 0 m_flagBits = 0x8000
m_objId (AllocUnitId.idObj) = 68 m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594042384384
Metadata: PartitionId = 72057594038386688 Metadata: IndexId = 1
Metadata: ObjectId = 2073058421 m_prevPage = (0:0) m_nextPage = (0:0)
pminlen = 8 m_slotCnt = 8 m_freeCnt = 744
m_freeData = 7432 m_reservedCnt = 0 m_lsn = (18:113:2)
m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0
m_tornBits = 0
Allocation Status
GAM (1:2) = ALLOCATED SGAM (1:3) = ALLOCATED
PFS (1:1) = 0x60 MIXED_EXT ALLOCATED 0_PCT_FULL DIFF (1:6) = CHANGED
ML (1:7) = NOT MIN_LOGGED
Slot 0 Offset 0x60 Length 917
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP VARIABLE_COLUMNS
Memory Dump @0x5C24C060
00000000: 30000800 01000000 0300f802 00110095 †0……………
00000010: 03616161 61616161 61616161 61616161 †.aaaaaaaaaaaaaaa
<snip> I’ve removed this section to save space
00000380: 61616161 61616161 61616161 61616161 †aaaaaaaaaaaaaaaa
00000390: 61616161 61††††††††††††††††††††††††††aaaaa
UNIQUIFIER = [NULL]
Slot 0 Column 1 Offset 0x4 Length 4
c1 = 1
Slot 0 Column 2 Offset 0x11 Length 900
c2 = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
a
Slot 1 Offset 0x3f5 Length 917
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP VARIABLE_COLUMNS
Memory Dump @0x5C24C3F5
<snip> And again…
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
As you can see from the output, each row is 917 bytes long and there’s only 744 bytes free (look at the m_freecnt value in the PAGE HEADER section). This means that we can’t insert another row on that page of the same length – there just isn’t space – but that’s what we’re going to do! Remember that this page currently has nine rows on it. Let’s force a page split:
TRAN;
BEGIN
GO
INSERT
INTO t1 VALUES (5, REPLICATE (‘a’, 900));GO
Now we know there wasn’t enough room so the page must have split. Let’s check DBCC IND again to see if another page was allocated to the index – here’s the output:
Two pages have been added – an index page and another data page. Before we added the extra row and caused the page split, the index only needed one page. Now that there are two data pages, there needs to be an index page to allow searches through the index b-tree. Let’s take a look at the two data pages to see which rows are stored on which page. Doing DBCC PAGE on them shows that page (1:143) has 5 rows, with c1 values 1 through 5, and page (1:154) has the 4 rows with c1 values 6 through 9 (I’m not going to post all the DBCC PAGE output – that would make the post way too long and it gives you an incentive to try the commands out). This is what we’d expect, as the page split occurs at the insertion point, and the row being inserted is put onto the page that split.
Now let’s rollback the user transaction and see what happens:
TRAN;
ROLLBACK
GO
Running the DBCC PAGE commands again shows that the index structure remains the same as after the split. Page (1:154) has the 4 rows on it from the split and page (1:143) has the other rows but not the one we inserted in the explicit transaction.
So, proof that a page split is never rolled back. I’ll be making much more use of these two DBCC commands in future posts and I’ll do the page anatomy one later this week. Let me know if there’s anything in particular you’d like to see described in this series.
5 thoughts on “Inside the Storage Engine: Using DBCC PAGE and DBCC IND to find out if page splits ever roll back”
It’s so great!but I can’t know where and how the index key value is before split in page(1,143),here is key value not data,if possible,can you show me the detail,thanks.
Excellent work
excellent post, had good time reading and executing. One minor thing, I wasn’t not able to run DBCC IND (‘pagesplittest’, ‘t1’, 1) during an open transaction of INSERT INTO t1 VALUES (5, REPLICATE (‘a’, 900)). DBCC IND was blocked until I Rolled back the INSERT, although internal split was obviously executed as you mentioned above. Could see that after the ROLLBACK.
Not sure what you were doing, but it works perfectly for me on all current versions. DBCC IND doesn’t acquire page locks so there’s no way for it to be blocked by the insert.
i faced the same issue , you can avoid it by running DBCC IND (‘pagesplittest’, ‘t1’, 1) on the same session window where an open transaction runs INSERT INTO t1 VALUES (5, REPLICATE (‘a’, 900)).