Storage Model

storage information

The storage footprint for any given database schema or table or column can be obtained by calling one of the storage() table producing functions:

sys.storage(sname varchar(1024), tname varchar(1024), cname varchar(1024))
sys.storage(sname varchar(1024), tname varchar(1024))
sys.storage(sname varchar(1024))
sys.storage()

To illustrate below we see the storage characteristics of the lineitem table in TPCH SF-10. The column width for variable length strings, e.g. l_comment represents the average length found using a sample over the underlying column storage.

sql>select * from sys.storage('tpch','lineitem');
schema  table     column                                 type     mode      location  count  typewidth  columnsize  heapsize  hashes  phash  imprints  sorted  revsorted  unique  orderidx
------  --------  -------------------------------------  -------  --------  --------  -----  ---------  ----------  --------  ------  -----  --------  ------  ---------  ------  --------
tpch    lineitem  l_orderkey                             bigint   writable  26/2601       0          8           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_partkey                              bigint   writable  11/1151       0          8           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_suppkey                              bigint   writable  15/1553       0          8           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_linenumber                           int      writable  05/500        0          4           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_quantity                             decimal  writable  20/2062       0          8           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_extendedprice                        decimal  writable  12/1251       0          8           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_discount                             decimal  writable  12/1213       0          8           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_tax                                  decimal  writable  25/2576       0          8           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_returnflag                           char     writable  12/1247       0          1           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_linestatus                           char     writable  26/2616       0          1           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_shipdate                             date     writable  25/2540       0          4           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_commitdate                           date     writable  16/1663       0          4           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_receiptdate                          date     writable  11/1126       0          4           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_shipinstruct                         char     writable  20/2077       0          1           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_shipmode                             char     writable  23/2344       0          1           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  l_comment                              varchar  writable  17/1754       0          1           0         0       0  false         0  true    true       <null>         0
tpch    lineitem  lineitem_l_orderkey_l_linenumber_pkey  oid      writable  25/2572       0          8           0         0       0  false         0  true    true       <null>         0

You may also use the system convenience views which excludes storage info of system tables:

select * from sys.storage where "schema" = 'tpch' and "table" = 'lineitem';
select * from sys.tablestorage where "schema" = 'tpch' and "table" = 'lineitem';
select * from sys.schemastorage where "schema" = 'tpch';

Note that the table producing storage() functions are faster than the views when filtering on specific schema, table and column.

To estimate the storage footprint for a variation of this scheme, we first construct a relation with the model input. Thereafter it can be updated to reflect the expected database size and varwidth properties. The footprint then becomes available as table producing function storagemodel(). Also be aware that dictionary encoding may have taken place, which leads to a much less storage footprint (see l_shipinstruct). Dictionary encoding is currently a runtime feature, it depends on insertion sequence and the dictionary size. It may lead to an overshoot in the estimated size. Therefore, the size of varchar columns should be taken with a grain of salt.

sql>call storagemodelinit();
sql>select * from storagemodelinput where "schema"='tpch' and "table"='lineitem';
schema  table     column                                 type     typewidth  count  distinct  atomwidth  reference  sorted  unique  isacolumn
------  --------  -------------------------------------  -------  ---------  -----  --------  ---------  ---------  ------  ------  ---------
tpch    lineitem  l_comment                              varchar          1      0         0          1  false      true    <null>  true
tpch    lineitem  l_commitdate                           date             4      0         0          4  false      true    <null>  true
tpch    lineitem  l_discount                             decimal          8      0         0          8  false      true    <null>  true
tpch    lineitem  l_extendedprice                        decimal          8      0         0          8  false      true    <null>  true
tpch    lineitem  l_linenumber                           int              4      0         0          4  false      true    <null>  true
tpch    lineitem  l_linestatus                           char             1      0         0          1  false      true    <null>  true
tpch    lineitem  l_orderkey                             bigint           8      0         0          8  false      true    <null>  true
tpch    lineitem  l_partkey                              bigint           8      0         0          8  false      true    <null>  true
tpch    lineitem  l_quantity                             decimal          8      0         0          8  false      true    <null>  true
tpch    lineitem  l_receiptdate                          date             4      0         0          4  false      true    <null>  true
tpch    lineitem  l_returnflag                           char             1      0         0          1  false      true    <null>  true
tpch    lineitem  l_shipdate                             date             4      0         0          4  false      true    <null>  true
tpch    lineitem  l_shipinstruct                         char             1      0         0          1  false      true    <null>  true
tpch    lineitem  l_shipmode                             char             1      0         0          1  false      true    <null>  true
tpch    lineitem  l_suppkey                              bigint           8      0         0          8  false      true    <null>  true
tpch    lineitem  l_tax                                  decimal          8      0         0          8  false      true    <null>  true
tpch    lineitem  lineitem_l_orderkey_l_linenumber_pkey  oid              8      0         0          8  false      true    <null>  false

sql>update storagemodelinput set count=1000000 where "schema"='tpch' and "table"='lineitem';
sql>update storagemodelinput set "distinct"=1000 where "schema"='tpch' and "table"='lineitem' and "type"='char';
sql>update storagemodelinput set "distinct"=330000 where "schema"='tpch' and "table"='lineitem' and "column"='l_comment';
sql>select * from storagemodelinput where "schema"='tpch' and "table"='lineitem';
schema  table     column                                 type     typewidth    count  distinct  atomwidth  reference  sorted  unique  isacolumn
------  --------  -------------------------------------  -------  ---------  -------  --------  ---------  ---------  ------  ------  ---------
tpch    lineitem  l_comment                              varchar          1  1000000    330000          1  false      true    <null>  true
tpch    lineitem  l_commitdate                           date             4  1000000         0          4  false      true    <null>  true
tpch    lineitem  l_discount                             decimal          8  1000000         0          8  false      true    <null>  true
tpch    lineitem  l_extendedprice                        decimal          8  1000000         0          8  false      true    <null>  true
tpch    lineitem  l_linenumber                           int              4  1000000         0          4  false      true    <null>  true
tpch    lineitem  l_linestatus                           char             1  1000000      1000          1  false      true    <null>  true
tpch    lineitem  l_orderkey                             bigint           8  1000000         0          8  false      true    <null>  true
tpch    lineitem  l_partkey                              bigint           8  1000000         0          8  false      true    <null>  true
tpch    lineitem  l_quantity                             decimal          8  1000000         0          8  false      true    <null>  true
tpch    lineitem  l_receiptdate                          date             4  1000000         0          4  false      true    <null>  true
tpch    lineitem  l_returnflag                           char             1  1000000      1000          1  false      true    <null>  true
tpch    lineitem  l_shipdate                             date             4  1000000         0          4  false      true    <null>  true
tpch    lineitem  l_shipinstruct                         char             1  1000000      1000          1  false      true    <null>  true
tpch    lineitem  l_shipmode                             char             1  1000000      1000          1  false      true    <null>  true
tpch    lineitem  l_suppkey                              bigint           8  1000000         0          8  false      true    <null>  true
tpch    lineitem  l_tax                                  decimal          8  1000000         0          8  false      true    <null>  true
tpch    lineitem  lineitem_l_orderkey_l_linenumber_pkey  oid              8  1000000         0          8  false      true    <null>  false

sql>select * from sys.storagemodel where "schema"='tpch' and "table"='lineitem';
schema  table     column                                 type       count  columnsize  heapsize  hashsize  imprintsize  orderidxsize  sorted  unique  isacolumn
------  --------  -------------------------------------  -------  -------  ----------  --------  --------  -----------  ------------  ------  ------  ---------
tpch    lineitem  l_comment                              varchar  1000000     4000000   2978192         0            0             0  true    <null>  true
tpch    lineitem  l_commitdate                           date     1000000     4000000         0         0       800000             0  true    <null>  true
tpch    lineitem  l_discount                             decimal  1000000    16000000         0         0      3200000             0  true    <null>  true
tpch    lineitem  l_extendedprice                        decimal  1000000    16000000         0         0      3200000             0  true    <null>  true
tpch    lineitem  l_linenumber                           int      1000000     4000000         0         0       800000             0  true    <null>  true
tpch    lineitem  l_linestatus                           char     1000000     4000000     17192         0            0             0  true    <null>  true
tpch    lineitem  l_orderkey                             bigint   1000000     8000000         0         0      1600000             0  true    <null>  true
tpch    lineitem  l_partkey                              bigint   1000000     8000000         0         0      1600000             0  true    <null>  true
tpch    lineitem  l_quantity                             decimal  1000000    16000000         0         0      3200000             0  true    <null>  true
tpch    lineitem  l_receiptdate                          date     1000000     4000000         0         0       800000             0  true    <null>  true
tpch    lineitem  l_returnflag                           char     1000000     4000000     17192         0            0             0  true    <null>  true
tpch    lineitem  l_shipdate                             date     1000000     4000000         0         0       800000             0  true    <null>  true
tpch    lineitem  l_shipinstruct                         char     1000000     4000000     17192         0            0             0  true    <null>  true
tpch    lineitem  l_shipmode                             char     1000000     4000000     17192         0            0             0  true    <null>  true
tpch    lineitem  l_suppkey                              bigint   1000000     8000000         0         0      1600000             0  true    <null>  true
tpch    lineitem  l_tax                                  decimal  1000000    16000000         0         0      3200000             0  true    <null>  true
tpch    lineitem  lineitem_l_orderkey_l_linenumber_pkey  oid      1000000     8000000         0         0            0             0  true    <null>  false

sql>delete from storagemodelinput;

Delta structures

When data updates are done on columns, the changes are first stored in so-called delta structures and later merged (in bulk) into the column storages. For analysis, monitoring and debugging purposes we provide three table producing functions:

sys.deltas("schema" string, "table" string, "column" string)
sys.deltas("schema" string, "table" string)
sys.deltas("schema" string)

Some examples to query these table producing functions:

SELECT * FROM sys.deltas('sys','privileges','auth_id');

SELECT id, segments, "all", inserted, updates, deletes, level
   FROM sys.deltas('sys','privileges');

SELECT s.name as "schema", t.name as "table", c.name as "column", d.*
   FROM sys.schemas s
   JOIN sys.tables t ON s.id = t.schema_id
   JOIN sys.columns c ON t.id = c.table_id
   JOIN sys.deltas('sys') d ON c.id = d.id;

SELECT s.schema, s.table, s.column, d.*
  FROM sys.deltas('sys') d
  JOIN statistics() s on d.id = s.column_id;