Are your statistics (distributions) are really up to date ? - PART III

mholzbauer's picture

Part I an d Part II discussed the behavior of UST MEDIUM/HIGH on "normal"

or non-/un-fragmented/-partitioned tables.

This Part shows how it works on on fragmented/partitioned tables.





First I generate a fragmented/partitioned table with 3 partitions

and load some test data into it.

 

$ dbaccess stores_demo - <
CREATE TABLE tst

  (

    f1 INTEGER

  )

  FRAGMENT BY EXPRESSION

    PARTITION tst_p1 (f1 <= 333333 )


              IN datadbs,

    PARTITION tst_p2 ((f1 > 333333 ) AND (f1 <= 666666 ) )

              IN datadbs,

    REMAINDER IN datadbs;

EOF





$ UNLFILE=zzz.unl

$ echo "" | awk '{for(i=1;i<=1000000;i++){printf("%d;\n",i)}}' > $UNLFILE

$ echo "LOAD FROM $UNLFILE DELIMITER ';' INSERT INTO tst" | dbaccess stores_demo -



$ dbaccess stores_demo -



Database selected.



> SELECT COUNT(*) FROM tst;



      (count(*))



         1000000



1 row(s) retrieved.



> SELECT FIRST 10 * FROM tst;





         f1



          1

          2

          3

          4

          5

          6

          7

          8

          9

         10



10 row(s) retrieved.



> SELECT tabname, colname, mode, constr_time FROM systables t, syscolumns c,

> sysdistrib d WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND c.colno = d.colno

> AND d.seqno = 1 AND t.tabtype = 'T' AND t.tabname = 'tst';



No rows found.



> UPDATE STATISTICS HIGH FOR TABLE tst(f1);



Statistics updated.



> SELECT tabname, colname, mode, constr_time FROM systables t, syscolumns c,

> sysdistrib d WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND c.colno = d.colno

> AND d.seqno = 1 AND t.tabtype = 'T' AND t.tabname = 'tst';





tabname      tst

colname      f1

mode         H

constr_time  2015-03-26 10:03:59.00000



1 row(s) retrieved.



> SELECT c.colname[1,12], d.constr_time, d.mode, d.nupdates::INT nup,

> d.ninserts::INT nins FROM systables t, syscolumns c, sysdistrib d

> WHERE t.tabid = c.tabid AND t.tabid = d.tabid

> AND c.colno = d.colno AND d.seqno = 1 AND t.tabname = 'tst';





colname      constr_time               mode         nup        nins



f1           2015-03-26 10:03:59.00000 H              0     1000000



1 row(s) retrieved.



> SELECT t.tabname[1,12], c.colname[1,12], p.nupdates::INT nup,

> p.ninserts::INT nins FROM systables t, syscolumns c, sysmaster:sysptnhdr p,

> sysdistrib d WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND c.colno = d.colno

> AND d.seqno = 1 AND t.partnum = p.partnum AND t.tabname = 'tst';





tabname      colname              nup        nins





No rows found.



> SELECT t.tabname[1,12], c.colname[1,12], p.nupdates::INT nup,

> p.ninserts::INT nins FROM systables t, syscolumns c, sysmaster:sysptnhdr p,

> sysfragdist d WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND c.colno = d.colno

> AND d.seqno = 1 AND t.partnum = p.partnum AND t.tabname = 'tst';





tabname      colname              nup        nins





No rows found.




> SELECT tabid FROM systables WHERE tabname = 'tst';





      tabid



        114



1 row(s) retrieved.



> SELECT * FROM sysfragments WHERE tabid = 114;





fragtype    T

tabid       114

indexname   

colno       0

partn       2097476

strategy    E

location    L

servername  

evalpos     0

exprtext    

(f1 <= 333333 )

exprbin    

exprarr    

flags       0

dbspace     datadbs

levels      0

npused      1323.000000000

nrows       333333.0000000

clust       0.00

partition   tst_p1

version     2

nupdates    0.00

ndeletes    0.00

ninserts    0.00



fragtype    T

tabid       114

indexname   

colno       0

partn       2097477

strategy    E

location    L

servername  

evalpos     1

exprtext    

((f1 > 333333 ) AND (f1 <= 666666 ) )

exprbin    

exprarr    

flags       0

dbspace     datadbs

levels      0

npused      1323.000000000

nrows       333333.0000000

clust       0.00

partition   tst_p2

version     2

nupdates    0.00

ndeletes    0.00

ninserts    0.00



fragtype    T

tabid       114

indexname   

colno       0

partn       2097478

strategy    E

location    L

servername  

evalpos     2

exprtext    

remainder

exprbin    

exprarr    

flags       1

dbspace     datadbs

levels      0

npused      1323.000000000

nrows       333334.0000000

clust       0.00

partition   datadbs

version     2

nupdates    0.00

ndeletes    0.00

ninserts    0.00



3 row(s) retrieved.



> SELECT * FROM sysfragdist WHERE tabid = 114;





No rows found.



> INSERT INTO tst VALUES(1000001);



1 row(s) inserted.



> UPDATE STATISTICS LOW FOR TABLE tst DROP DISTRIBUTIONS;



Statistics updated.



> SELECT count(*) FROM tst;





      (count(*))



         1000001



1 row(s) retrieved.



> UPDATE STATISTICS HIGH FOR TABLE tst(f1);



Statistics updated.



> SELECT * FROM sysfragdist WHERE tabid = 114;





tabid             114

fragid            2097476

colno             1

seqno             1

mode              H

resolution        0.5

confidence        0.00

rowssmpld         333333.0000000

constr_time       2015-03-26 10:25:36.00000

ustbuildduration    0:00:00.39438

ustnrows          333333.0000000

minibinsize       16.00000000000

nupdates          0.00

ndeletes          0.00

ninserts          333333.0000000

version           0

dbsnum            4

encdist                                                                       

                              



tabid             114

fragid            2097477

colno             1

seqno             1

mode              H

resolution        0.5

confidence        0.00

rowssmpld         333333.0000000

constr_time       2015-03-26 10:25:36.00000

ustbuildduration    0:00:00.31651

ustnrows          333333.0000000

minibinsize       16.00000000000

nupdates          0.00

ndeletes          0.00

ninserts          333333.0000000

version           0

dbsnum            4

encdist                                                                       

                              



tabid             114

fragid            2097478

colno             1

seqno             1

mode              H

resolution        0.5

confidence        0.00

rowssmpld         333335.0000000

constr_time       2015-03-26 10:25:36.00000

ustbuildduration    0:00:00.31581

ustnrows          333335.0000000

minibinsize       16.00000000000

nupdates          0.00

ndeletes          0.00

ninserts          333335.0000000

version           0

dbsnum            4

encdist                                                                       

                              



3 row(s) retrieved.

--

-- after inserting 1mio + 1 row sysfragments + running update statistics (high)

-- ninserts field is updated in sysfragdist.

--




> SELECT (TRIM(t.tabname)||'['||TRIM(f.partition)||']')::CHAR(18) tabname,

> c.colname[1,12],p.nupdates::INT nup, p.ninserts::INT nins FROM systables t,

> syscolumns c, sysmaster:sysptnhdr p, sysfragdist d,sysfragments f

> WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND t.tabid = f.tabid

> AND d.fragid = f.partn AND c.colno = d.colno AND d.seqno = 1

> AND f.partn = p.partnum AND t.tabname = 'tst';





tabname            colname              nup        nins



tst[tst_p1]        f1                     0      333333

tst[tst_p2]        f1                     0      333333

tst[datadbs]       f1                     0      333335



3 row(s) retrieved.







> SELECT c.colname[1,12], d.constr_time, d.mode, d.nupdates::INT nup,

> d.ninserts::INT nins FROM systables t,syscolumns c, sysdistrib d

> WHERE t.tabid = c.tabid AND t.tabid = d.tabid

> AND c.colno = d.colno AND d.seqno = 1 AND t.tabname = 'tst';





colname      constr_time               mode         nup        nins



f1           2015-03-26 10:25:37.00000 H              0     1000001



1 row(s) retrieved.



> SELECT (TRIM (c.colname)||'['||TRIM(f.partition)||']')::CHAR(18) colname,

> d.constr_time, d.mode, d.nupdates::INT nup, d.ninserts::INT nins FROM systables t,

> syscolumns c, sysfragments f, sysfragdist d WHERE t.tabid = c.tabid

> AND t.tabid = d.tabid AND t.tabid = f.tabid AND d.fragid = f.partn

> AND c.colno = d.colno AND d.seqno = 1 AND t.tabname = 'tst';





colname            constr_time               mode         nup        nins



f1[tst_p1]         2015-03-26 10:25:36.00000 H              0      333333

f1[tst_p2]         2015-03-26 10:25:36.00000 H              0      333333

f1[datadbs]        2015-03-26 10:25:36.00000 H              0      333335



3 row(s) retrieved.



> UPDATE tst SET f1=f1 WHERE f1 <= 33333;



33333 row(s) updated.



> SELECT (TRIM(t.tabname)||'['||TRIM(f.partition)||']')::CHAR(18) tabname,

> c.colname[1,12],p.nupdates::INT nup, p.ninserts::INT nins FROM systables t,

> syscolumns c, sysmaster:sysptnhdr p, sysfragdist d,sysfragments f

> WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND t.tabid = f.tabid

> AND d.fragid = f.partn AND c.colno = d.colno AND d.seqno = 1

> AND f.partn = p.partnum AND t.tabname = 'tst';





tabname            colname              nup        nins



tst[tst_p1]        f1                 33333      333333

tst[tst_p2]        f1                     0      333333

tst[datadbs]       f1                     0      333335



3 row(s) retrieved.



> SELECT (TRIM (c.colname)||'['||TRIM(f.partition)||']')::CHAR(18) colname,

> d.constr_time, d.mode, d.nupdates::INT nup, d.ninserts::INT nins FROM systables t,

> syscolumns c, sysfragments f, sysfragdist d WHERE t.tabid = c.tabid

> AND t.tabid = d.tabid AND t.tabid = f.tabid AND d.fragid = f.partn

> AND c.colno = d.colno AND d.seqno = 1 AND t.tabname = 'tst';





colname            constr_time               mode         nup        nins



f1[tst_p1]         2015-03-26 10:25:36.00000 H              0      333333

f1[tst_p2]         2015-03-26 10:25:36.00000 H              0      333333

f1[datadbs]        2015-03-26 10:25:36.00000 H              0      333335



3 row(s) retrieved.



> UPDATE STATISTICS HIGH FOR TABLE tst(f1) DISTRIBUTIONS ONLY;



Statistics updated.



> SELECT (TRIM (c.colname)||'['||TRIM(f.partition)||']')::CHAR(18) colname,

> d.constr_time, d.mode, d.nupdates::INT nup, d.ninserts::INT nins FROM systables t,

> syscolumns c, sysfragments f, sysfragdist d WHERE t.tabid = c.tabid

> AND t.tabid = d.tabid AND t.tabid = f.tabid AND d.fragid = f.partn

> AND c.colno = d.colno AND d.seqno = 1 AND t.tabname = 'tst';





colname            constr_time               mode         nup        nins



f1[tst_p1]         2015-03-26 10:25:36.00000 H              0      333333

f1[tst_p2]         2015-03-26 10:25:36.00000 H              0      333333

f1[datadbs]        2015-03-26 10:25:36.00000 H              0      333335



3 row(s) retrieved.



> UPDATE tst SET f1=f1 WHERE f1 <= 1;



1 row(s) updated.



> UPDATE STATISTICS HIGH FOR TABLE tst(f1) DISTRIBUTIONS ONLY;



Statistics updated.



> SELECT (TRIM (c.colname)||'['||TRIM(f.partition)||']')::CHAR(18) colname,

> d.constr_time, d.mode, d.nupdates::INT nup, d.ninserts::INT nins FROM systables t,

> syscolumns c, sysfragments f, sysfragdist d WHERE t.tabid = c.tabid

> AND t.tabid = d.tabid AND t.tabid = f.tabid AND d.fragid = f.partn

> AND c.colno = d.colno AND d.seqno = 1 AND t.tabname = 'tst';





colname            constr_time               mode         nup        nins



f1[tst_p1]         2015-03-26 10:25:36.00000 H              0      333333

f1[tst_p2]         2015-03-26 10:25:36.00000 H              0      333333

f1[datadbs]        2015-03-26 10:25:36.00000 H              0      333335



3 row(s) retrieved.



> SELECT (TRIM(t.tabname)||'['||TRIM(f.partition)||']')::CHAR(18) tabname,

> c.colname[1,12],p.nupdates::INT nup, p.ninserts::INT nins FROM systables t,

> syscolumns c, sysmaster:sysptnhdr p, sysfragdist d,sysfragments f

> WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND t.tabid = f.tabid

> AND d.fragid = f.partn AND c.colno = d.colno AND d.seqno = 1

> AND f.partn = p.partnum AND t.tabname = 'tst';





tabname            colname              nup        nins



tst[tst_p1]        f1                 33334      333333

tst[tst_p2]        f1                     0      333333

tst[datadbs]       f1                     0      333335



3 row(s) retrieved.





> UPDATE tst SET f1=f1 WHERE f1 <= 1;



1 row(s) updated.



> UPDATE STATISTICS HIGH FOR TABLE tst(f1) DISTRIBUTIONS ONLY;



Statistics updated.



> SELECT (TRIM (c.colname)||'['||TRIM(f.partition)||']')::CHAR(18) colname,

> d.constr_time, d.mode, d.nupdates::INT nup, d.ninserts::INT nins FROM systables t,

> syscolumns c, sysfragments f, sysfragdist d WHERE t.tabid = c.tabid

> AND t.tabid = d.tabid AND t.tabid = f.tabid AND d.fragid = f.partn

> AND c.colno = d.colno AND d.seqno = 1 AND t.tabname = 'tst';





colname            constr_time               mode         nup        nins



f1[tst_p1]         2015-03-26 10:48:29.00000 H          33335      333333

f1[tst_p2]         2015-03-26 10:25:36.00000 H              0      333333

f1[datadbs]        2015-03-26 10:25:36.00000 H              0      333335



3 row(s) retrieved.



> SELECT (TRIM(t.tabname)||'['||TRIM(f.partition)||']')::CHAR(18) tabname,

> c.colname[1,12],p.nupdates::INT nup, p.ninserts::INT nins FROM systables t,

> syscolumns c, sysmaster:sysptnhdr p, sysfragdist d,sysfragments f

> WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND t.tabid = f.tabid

> AND d.fragid = f.partn AND c.colno = d.colno AND d.seqno = 1

> AND f.partn = p.partnum AND t.tabname = 'tst';





tabname            colname              nup        nins



tst[tst_p1]        f1                 33335      333333

tst[tst_p2]        f1                     0      333333

tst[datadbs]       f1                     0      333335



3 row(s) retrieved.



> SELECT tabname, colname, mode, constr_time FROM systables t, syscolumns c,

> sysdistrib d WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND c.colno = d.colno

> AND d.seqno = 1 AND t.tabtype = 'T' AND t.tabname = 'tst';





tabname      tst

colname      f1

mode         H

constr_time  2015-03-26 10:48:30.00000



1 row(s) retrieved.



> SELECT c.colname[1,12], d.constr_time, d.mode, d.nupdates::INT nup,

> d.ninserts::INT nins FROM systables t, syscolumns c, sysdistrib d

> WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND c.colno = d.colno

> AND d.seqno = 1 AND t.tabname = 'tst';





colname      constr_time               mode         nup        nins



f1           2015-03-26 10:48:30.00000 H          33335     1000001



1 row(s) retrieved.



> SELECT nupdates FROM sysdistrib WHERE seqno = 1 AND tabid = 114;





      nupdates



33335.00000000



1 row(s) retrieved.



> SELECT nupdates FROM sysfragdist WHERE seqno = 1 AND tabid = 114;





      nupdates



33335.00000000

          0.00

          0.00



3 row(s) retrieved.





> UPDATE tst SET f1=f1 WHERE f1 > 333333 AND f1 < (333333+33334);



33333 row(s) updated.



> UPDATE tst SET f1=f1 WHERE f1 = 333334;



1 row(s) updated.



> UPDATE STATISTICS HIGH FOR TABLE tst(f1) DISTRIBUTIONS ONLY;



Statistics updated.



> SELECT (TRIM (c.colname)||'['||TRIM(f.partition)||']')::CHAR(18) colname,

> d.constr_time, d.mode, d.nupdates::INT nup, d.ninserts::INT nins FROM systables t,

> syscolumns c, sysfragments f, sysfragdist d WHERE t.tabid = c.tabid

> AND t.tabid = d.tabid AND t.tabid = f.tabid AND d.fragid = f.partn

> AND c.colno = d.colno AND d.seqno = 1 AND t.tabname = 'tst';





colname            constr_time               mode         nup        nins



f1[tst_p1]         2015-03-26 10:48:29.00000 H          33335      333333

f1[tst_p2]         2015-03-26 10:25:36.00000 H              0      333333

f1[datadbs]        2015-03-26 10:25:36.00000 H              0      333335



3 row(s) retrieved.



> SELECT (TRIM(t.tabname)||'['||TRIM(f.partition)||']')::CHAR(18) tabname,

> c.colname[1,12],p.nupdates::INT nup, p.ninserts::INT nins FROM systables t,

> syscolumns c, sysmaster:sysptnhdr p, sysfragdist d,sysfragments f

> WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND t.tabid = f.tabid

> AND d.fragid = f.partn AND c.colno = d.colno AND d.seqno = 1

> AND f.partn = p.partnum AND t.tabname = 'tst';





tabname            colname              nup        nins



tst[tst_p1]        f1                 33335      333333

tst[tst_p2]        f1                 33334      333333

tst[datadbs]       f1                     0      333335



3 row(s) retrieved.



> UPDATE tst SET f1=f1 WHERE f1 = 333334;



1 row(s) updated.



> UPDATE STATISTICS HIGH FOR TABLE tst(f1) DISTRIBUTIONS ONLY;



Statistics updated.



> SELECT (TRIM (c.colname)||'['||TRIM(f.partition)||']')::CHAR(18) colname,

> d.constr_time, d.mode, d.nupdates::INT nup, d.ninserts::INT nins FROM systables t,

> syscolumns c, sysfragments f, sysfragdist d WHERE t.tabid = c.tabid

> AND t.tabid = d.tabid AND t.tabid = f.tabid AND d.fragid = f.partn

> AND c.colno = d.colno AND d.seqno = 1 AND t.tabname = 'tst';





colname            constr_time               mode         nup        nins



f1[tst_p1]         2015-03-26 10:48:29.00000 H          33335      333333

f1[tst_p2]         2015-03-26 11:01:59.00000 H          33335      333333

f1[datadbs]        2015-03-26 10:25:36.00000 H              0      333335



3 row(s) retrieved.



> SELECT (TRIM(t.tabname)||'['||TRIM(f.partition)||']')::CHAR(18) tabname,

> c.colname[1,12],p.nupdates::INT nup, p.ninserts::INT nins FROM systables t,

> syscolumns c, sysmaster:sysptnhdr p, sysfragdist d,sysfragments f

> WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND t.tabid = f.tabid

> AND d.fragid = f.partn AND c.colno = d.colno AND d.seqno = 1

> AND f.partn = p.partnum AND t.tabname = 'tst';





tabname            colname              nup        nins



tst[tst_p1]        f1                 33335      333333

tst[tst_p2]        f1                 33335      333333

tst[datadbs]       f1                     0      333335



3 row(s) retrieved.



> SELECT c.colname[1,12], d.constr_time, d.mode, d.nupdates::INT nup,

> d.ninserts::INT nins FROM systables t, syscolumns c, sysdistrib d

> WHERE t.tabid = c.tabid AND t.tabid = d.tabid AND c.colno = d.colno

> AND d.seqno = 1 AND t.tabname = 'tst';





colname      constr_time               mode         nup        nins



f1           2015-03-26 11:01:59.00000 H          66670     1000001



1 row(s) retrieved.


 

SUMMARY





- we have fragment level statistics (sysdistrib + sysfragdist) if we have more than 1 mio rows

  and table is fragmented/partitioned.

- table level statistics (only sysdistrib) if we have less or equal 1 mio rows.

  even if table is fragmented/partitioned.

- with fragment level statistics, sysfragdist stores statistic data (inserts/updates/deletes) for each fragment

  and sysdistrib stores the sum of all fragments.

- fragment statistics (sysfragdist) will be updated if 10% (STATCHANGE) rows are changed

  (updated/inserted/deleted)

- fragment level statistics are stored in a sbspace, so you need to define SYSSBSPACENAME

- we can use UPDATE STATISTICS MEDIUM/HIGH ... FORCE to "force" a "real" UST MED/HIGH



From the manuals:



Note: The SYSSBSPACENAME configuration parameter, which must be set when

the database server instance is initialized, specifies the sbspace in which the

database server stores fragment-level data distribution statistics. These are stored

as BLOB objects in the encdist column of the syfragsdist system catalog table. For

the database server to support fragment level statistics, the SYSSBSPACENAME

configuration parameter setting must specify an existing sbspace.



If you use the Statistics Options clause to set the STATLEVEL property to

FRAGMENT, the database server returns an error -9814 ("Invalid default sbspace

name") if either of the following is true:



- The SYSSBSPACENAME configuration parameter is not set

- The sbspace that SYSSBSPACENAME specifies was not properly allocated by the

  onspaces -c -S command.