Resolving In Place Alters - Part 3

aford's picture

In Resolving In Place Alters - Part 2 I decided that I want to try to identify only the pages with a pending IPA and only "fix" those data pages in an attempt to speed up the resolving of IPAs and limit the work load added to the system when doing so.

To do this, I will need to read the actual data pages stored on disk and use information in the page header and slot table to determine if a page needs to be fixed. More on this in the next blog, today I just want to talk about how to read in the raw data pages.

As far as I know, I only have 2 good options. Bypass the engine and open and read the data in the chunk directly or use the sysmaster:sysrawdsk table. I tried both options and did not notice a performance difference either way, so I decided to use sysrawdsk simply because it made things easier since I can use SQL to query and get the raw data pages.

sysmaster:sysrawdsk looks like this (for those concerned about the Internal Use Only warning, just don't do any of this outside and you'll be fine):

    create table informix.sysrawdsk             { Internal Use Only             }
(
pg_chunk integer, { physical page address - chunk }
pg_offset integer, { physical page address - offset}
offset smallint, { bytes into page }
loc char(23), { location representation }
hexdata char(40), { 16 bytes hexdumped from offset}
ascdata char(16) { 16 bytes ascii-dumped }
);
create unique index informix.sysrawdskidx on sysrawdsk (pg_chunk, pg_offset, offset);

Each row in the table represents 16 bytes of raw data. For example:

select 
hexdata
from
sysrawdsk
where
pg_chunk = 4 and
pg_offset = 1000 and
offset = 64;

hexdata

34313037 30353632 36302020 20202020

Gives me the 16 bytes of data (in hex) that starts on the 64th byte of page 1000 in chunk 4.

We are making progress. Now that I can query the raw data to use to my advantage later, I need to figure out how to find the pages that belong to the table I'm trying to fix. sysmaster to the rescue yet again, this time in the form of the sysextents table.

select
chunk,
offset,
size
from
sysextents
where
dbsname = "mydb" and
tabname = "mytab"
order by
offset asc;

chunk offset size

4 851706 77423
4 929129 77423
4 1006552 77423
4 1083975 77423
4 1161398 77423
4 1238821 77423
4 1316244 77423
4 1393667 77423
4 1471090 77423
4 1548513 77423
4 1625936 77423
4 1703535 524288
4 2227823 524288
4 2752111 524288
4 3276399 524288
4 3800687 524288
4 4324975 524288
4 4849263 524288
4 5373551 524288
....

In review, we now have:

  1. A way to read the raw data pages from sysrawdsk
  2. A way to find the pages that belong to our table

Now we need to find the data pages in these extents, back to good old sysmaster.

The sysmaster:sysptnbit table holds the partition bit maps, this data will tell us if a page we are looking at is a "Data Page with Room for another Row" (bitmap value of 4), a "Data Page without Room for another Row" (bitmap value of 12) or some page type we don't care about. Now would be a good time to mention that my row size is fixed with no variable length fields and one row fits completey on a page so there are no remainder pages. If you have either of these conditions, well things are going to be a little more difficult for you.

{ Partition Bit Maps }
create table informix.sysptnbit
(
pb_partnum integer, { partnum for this partition }
pb_pagenum integer, { logical pagenum represented }
pb_bitmap integer { bitmap value for page }
);
create unique index informix.sysptnbitidx on sysptnbit (pb_partnum, pb_pagenum);

Here things get a little annoying for us. The data in sysptnbit is referenced by logical page and the pages we are looking at in sysrawdsk are referenced by physical page. If you need a refresher course on how to convert a physical page reference (chunk and offset) to a logical page reference (partnum and page number) then here is my best attempt...

Logical page numbers in a partition are sequential starting with page 0 in the first extent. The second page in the first extent is page number 1, the third page in the first extent is page number 2 and so on and so forth. If the first extent has 1000 pages, the last page in the first extent will be logical page number 999 and the first page in the second extent will be logical page number 1000. The second page in the second extent will be page number 1001, the third page in the second extent will be 1002 and so on and so forth. We can use this logic and the sysmaster:sysptnext table to find the logical page number for any physical page.

Here is an example with chunk 4, page 5373552.  

Step 1 - Find the first page of the extent that this page belongs to

select
offset
from
sysextents
where
chunk = 4 and
5373552 between offset and offset + size;

offset

5373551

Step 2 - Find the partnum and extent number that maps to this extent

select
pe_partnum,
pe_extnum
from
sysptnext
where
pe_chunk = 4 and
pe_offset = 5373551; -- first page of the extent that page 5373552 of chunk 4 belongs to

pe_partnum pe_extnum

4194317 1

Step 3 - Find the total number of pages that belong to the extents created before this extent

select
nvl(sum(pe_size), 0) pages
from
sysptnext
where
pe_chunk = 4 and
pe_partnum = 4194317 and
pe_extnum < 1;

pages

77423

Step 4 - We now know the logical page number for the first page of the extent that our page belongs to, just need to calculate our page number

Our Logical Page Number = 77423 + (5373552 - 5373551) = 77424

Taking our new found logical page address (partnum and pagenum) we can query sysptnbit and see if this is a page we are interested in (bitmap value 4 or 12)

select
pb_bitmap
from
sysptnbit
where
pb_partnum = 4194317 and
pb_pagenum = 77424;

pb_bitmap

12

One last thing to note before I conclude Part 3, if no bitmap row had been found then I would know that I am at the end of valid data for this extent and can stop looking at the pages it contains and move on to the next extent.

Next time - Part 4: How to figure out if the data page I just found has a pending IPA or not.

Now, have some ads. I insist.