Deep dive into interval partitioning & rolling window table in IBM Informix
Keshava Murthy
IBM Informix Development

• Partitioning 101
• Interval partitioning
• Rolling window table partitioning

Partitioning 101
What? Ability to partition a table or index into multiple physical partitions. Applications have a single schema or table. Underneath, the table or index is organized by multiple partitions; query processing and tools understand this and combine the partitions to provide a single view of the table. E.g. UNION (ALL) of States making UNITED STATES of AMERICA.
Why? Capacity, parallelism, query performance (parallelism, partition elimination), time cyclic data management, faster statistics collection, multi-temperature data storage.
How?
- CREATE TABLE with PARTITION (FRAGMENT) BY clause
- ALTER TABLE INIT
- CREATE INDEX on a partitioned table
- CREATE INDEX explicitly with PARTITION clause
Query Processing and more:
- Scans all the fragments to complete the scan
- Parallelization
- Partition elimination during scan and join
- Parallelized index builds

Tables, Indices and Partitions
[Diagram: Customer_table, Store and sales_table, with indexes idx_cust_id and idx_store_id; each table and index is stored in one or more partitions.]

CREATE TABLE customer_p (id int, lname varchar(32))
FRAGMENT BY ROUND ROBIN
    PARTITION part1 IN dbs1,
    PARTITION part2 IN dbs1,
    PARTITION part3 IN dbs2;

CREATE TABLE customer_p (id int, state varchar(32))
FRAGMENT BY EXPRESSION
    PARTITION part1 (state = "CA") IN dbs1,
    PARTITION part2 (state = "KS") IN dbs1,
    PARTITION part3 (state = "OR") IN dbs1,
    PARTITION part4 (state = "NV") IN dbs1;

CREATE TABLE customer (id int, state char(2), zipcode decimal(5,0))
FRAGMENT BY EXPRESSION
    PARTITION partca93 (state = "CA" AND zipcode <= 93000) IN dbs1,
    PARTITION partcagt93 (state = "CA" AND zipcode > 93000) IN dbs2,
    PARTITION partks (state = "KS") IN dbs3,
    PARTITION partor (state = "OR") IN dbs1,
    PARTITION part4 (state = "NV") IN dbs1;

Top IDS features utilized for building warehouse
• Multi-threaded Dynamic Scalable Architecture (DSA)
  – Scalability and performance
  – Optimal usage of hardware and OS resources
• DSS parameters to optimize memory
  – DSS queries
  – Efficient hash joins
• Parallel Data Query for parallel operations
  – Light scans, extensive calculations, sorts, multiple joins
  – Ideal for DSS queries and batch operations
• Data compression
• Time cyclic data mgmt
  – Fragment elimination, fragment attach and detach
  – Data/index distribution schemas
  – Improve large data volume manageability
  – Increase performance by maximizing I/O throughput
• Configurable page size
  – On disk and in memory
  – Additional performance gains
• Large chunks support
  – Allows IDS instances to handle large volumes
• Quick sequential scans
  – Essential for table scans common to DSS environments
[Callout on slide: Fragmentation Features]

List fragmentation
CREATE TABLE customer
    (id SERIAL, fname CHAR(32), lname CHAR(32), state CHAR(2), phone CHAR(12))
FRAGMENT BY LIST (state)
    PARTITION p0 VALUES ("KS", "IL", "IN") IN dbs0,
    PARTITION p1 VALUES ("CA", "OR", "NV") IN dbs1,
    PARTITION p2 VALUES ("NY", "MN") IN dbs2,
    PARTITION p3 VALUES (NULL) IN dbs3,
    PARTITION p4 REMAINDER IN dbs3;

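The routing implied by the list definition above can be sketched as a small lookup. This is a toy Python model, not Informix code: the `route` function and the dictionary layout are illustrative assumptions, and only the partition names and value lists come from the slide.

```python
# Toy model of FRAGMENT BY LIST routing (illustration only, not server code).
# Partition names and value lists are copied from the CREATE TABLE above.
LIST_FRAGMENTS = {
    "p0": {"KS", "IL", "IN"},
    "p1": {"CA", "OR", "NV"},
    "p2": {"NY", "MN"},
    "p3": {None},  # PARTITION p3 VALUES (NULL)
}
REMAINDER = "p4"   # PARTITION p4 REMAINDER


def route(state):
    """Return the partition that would hold a row with this state value."""
    for name, values in LIST_FRAGMENTS.items():
        if state in values:
            return name
    # Any value not named in a list falls through to the remainder fragment.
    return REMAINDER


if __name__ == "__main__":
    for s in ("CA", None, "TX"):
        print(s, "->", route(s))
```

Because each value appears in at most one list, an equality filter on `state` maps to exactly one partition in this model.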
Open Loops with Partitioning – As of 11.50
1. UPDATE STATISTICS on a large fragmented table takes a long time
2. Need to explicitly create new partitions for each new range of data
3. Need database & application down time to manage the application

Smarter UPDATE STATISTICS
• Statistics collection by partition
  – Distinct histograms for each partition
  – All the histograms are combined
  – Each data partition has a UDI counter
  – Subsequently, only recollect modified partitions & update the global histogram
• Smarter statistics
  – Only recollect if 10% of the data has changed
  – Automatic statistics during attach, detach

UPDATE STATISTICS during ATTACH, DETACH
• Automatically kick off update statistics refresh in the background – need to enable fragment level statistics
• Tasks eliminated by interval fragmentation
  – Running of update statistics manually after ALTER operations
  – Time taken to collect statistics is reduced as well.

Fragment Level Statistics (FLS)
• Generate and store column distribution at fragment level
• Fragment level stats are combined to form column distribution
• System monitors UDI (Update/Delete/Insert) activities on each fragment
• Stats are refreshed only for frequently updated fragments
• Fragment level distribution is used to re-calculate column distribution
• No need to re-generate stats across entire table

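The refresh decision described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Fragment` record, the function name and the exact comparison (UDI operations since the last statistics run, measured against the fragment's row count) are hypothetical, and only the idea of a per-fragment UDI counter and a 10% change threshold comes from the slides.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    name: str
    rows: int             # rows in the fragment when statistics were last built
    udi_since_stats: int  # updates + deletes + inserts since that point


def stale_fragments(fragments, threshold_pct=10):
    """Return names of fragments whose statistics would be recollected.

    Assumption: a fragment is stale when its UDI count reaches
    threshold_pct percent of its row count; an empty fragment is stale
    as soon as it sees any activity.
    """
    stale = []
    for f in fragments:
        if f.rows == 0:
            changed = f.udi_since_stats > 0
        else:
            changed = f.udi_since_stats * 100 >= threshold_pct * f.rows
        if changed:
            stale.append(f.name)
    return stale


if __name__ == "__main__":
    frags = [
        Fragment("jan", 1000, 50),   # 5% changed: keep existing statistics
        Fragment("feb", 1000, 100),  # 10% changed: recollect
        Fragment("mar", 0, 0),       # empty and untouched: keep
        Fragment("apr", 0, 5),       # newly populated: recollect
    ]
    print(stale_fragments(frags))
```

Only the fragments returned here would be rescanned; the table level distribution is then rebuilt from the stored fragment level distributions rather than from the data.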
Generating Table Level Statistics
• Distribution is created for the entire column dataset from all fragments.
• Stored in sysdistrib with the (tabid, colno) combination.
• The dbschema utility can decode and display the encoded distribution.
• Optimizer uses in-memory distribution representation for query optimization.
[Diagram: column data from Frag 1, Frag 2, ... Frag n is fed to SORT; the sorted data is fed to the bin generator & encoder; the encoded distribution is stored in the sysdistrib catalog table, and is decoded into the data distribution cache.]

STATLEVEL property
STATLEVEL defines the granularity or level of statistics created for the table.
Can be set using CREATE or ALTER TABLE.
STATLEVEL [TABLE | FRAGMENT | AUTO] are the allowed values for STATLEVEL.
TABLE – entire table dataset is read and table level statistics are stored in the sysdistrib catalog.
FRAGMENT – dataset of each fragment is read and fragment level statistics are stored in the new sysfragdist catalog. This option is only allowed for fragmented tables.
AUTO – when update statistics is run, the system determines whether TABLE or FRAGMENT level statistics should be created.

UPDATE STATISTICS extensions
• UPDATE STATISTICS [AUTO | FORCE];
• UPDATE STATISTICS HIGH FOR TABLE [AUTO | FORCE];
• UPDATE STATISTICS MEDIUM FOR TABLE tab1 SAMPLING SIZE 0.8 RESOLUTION 1.0 [AUTO | FORCE];
• Mode specified in the UPDATE STATISTICS statement overrides the AUTO_STAT_MODE session setting. The session setting overrides the ONCONFIG's AUTO_STAT_MODE parameter.

UPDATE STATISTICS extensions
• New metadata columns (nupdates, ndeletes and ninserts) in sysdistrib and sysfragdist store the corresponding counter values from the partition page at the time of statistics generation. These columns are used by subsequent update statistics runs to evaluate whether statistics are stale or reusable.
• Statistics evaluation is done at fragment level for tables with fragment level statistics and at table level for the rest.
• Statistics created by MEDIUM or HIGH mode (column distributions) are evaluated.
• The LOW statistics are saved at the fragment level as well and are aggregated to collect global statistics.

Alter Fragment Attach/Detach
• Automatic background refreshing of column statistics after executing ALTER FRAGMENT ATTACH/DETACH on a table with fragmented statistics.
• Refreshing of statistics begins after the ALTER has been committed.
• For the ATTACH operation, fragmented statistics of the new fragment are built and table level statistics are rebuilt from all fragmented statistics. Any existing fragments with out-of-date column statistics will be rebuilt at this time too.
• For the DETACH operation, table level statistics of the resulting tables are rebuilt from the fragmented statistics.
• The background task that refreshes statistics is "refreshstats"; it will print errors in online.log if any are encountered.

Design for Time Cyclic data mgmt
create table mytrans(
    custid integer,
    proc_date date,
    store_loc char(12)
    ….
) fragment by expression
    ......
    (proc_date < DATE('01/01/2009')) in fe_auth_log20081231,
    (MONTH(proc_date) = 1) in frag2009Jan,
    (MONTH(proc_date) = 2) in frag2009Feb,
    ….
    (MONTH(proc_date) = 10 and proc_date < DATE('10/26/2009')) in frag2009Oct,
    (proc_date = DATE('10/26/2009')) in frag20091026,
    (proc_date = DATE('10/27/2009')) in frag20091027,
    (proc_date = DATE('10/28/2009')) in frag20091028,
    (proc_date = DATE('10/29/2009')) in frag20091029,
    (proc_date = DATE('10/30/2009')) in frag20091030,
    (proc_date = DATE('10/31/2009')) in frag20091031,
    (proc_date = DATE('11/01/2009')) in frag20091101;

Feature                 Round Robin   List   Expression   Interval
Parallelism             Yes           Yes    Yes          Yes
Range Expression        No            Yes    Yes          Yes
Equality Expression     No            Yes    Yes          Yes
FLS                     Yes           Yes    Yes          Yes
Smarter Stats           Yes           Yes    Yes          Yes
ATTACH ONLINE           No            No     No           Yes
DETACH ONLINE           No            No     No           Yes
MODIFY ONLINE           No            No     No           Yes -- MODIFY transition value
Create index ONLINE     Yes           Yes    Yes          Not yet
Storage Provisioning    No            No     No           Yes

Fragment elimination
Type of filter         Nonoverlapping,        Overlapping on a      Nonoverlapping,
(WHERE clause)         single fragment key    single column key     multiple column key
Range expression       Can eliminate          Cannot eliminate      Cannot eliminate
Equality expression    Can eliminate          Can eliminate         Can eliminate

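The "can eliminate" case for a nonoverlapping single fragment key can be sketched as an interval intersection test. This is a toy Python model, not the optimizer's algorithm: the function names, the integer keys and the half-open `(start, end)` fragment bounds are assumptions made for illustration.

```python
def surviving_fragments(fragments, lo, hi):
    """Fragments that may hold rows with lo <= key < hi.

    fragments maps a name to a half-open (start, end) key range.
    A fragment survives only if its range intersects the filter range;
    every other fragment is eliminated without being scanned.
    """
    return [name for name, (start, end) in fragments.items()
            if start < hi and lo < end]


def surviving_for_equality(fragments, value):
    """Equality on an integer key is the one-value range [value, value + 1)."""
    return surviving_fragments(fragments, value, value + 1)


# Hypothetical table fragmented by quarter on a month number (1..12).
MONTH_FRAGMENTS = {"q1": (1, 4), "q2": (4, 7), "q3": (7, 10), "q4": (10, 13)}

if __name__ == "__main__":
    print(surviving_fragments(MONTH_FRAGMENTS, 5, 8))
    print(surviving_for_equality(MONTH_FRAGMENTS, 10))
```

With overlapping fragment expressions, a range filter can no longer be matched against a single contiguous range per fragment, which is consistent with the "cannot eliminate" entries in the table.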
New fragmentation strategies in Informix v11.70
• List Fragmentation
  – Similar to expression based fragmentation
  – Syntax compatibility
• Interval Fragmentation
  – Like expression, but policy based
  – Improves availability of the system

Time Cyclic Data management
• Time-cyclic data management (roll-on, roll-off)
• Attach the new fragment
• Detach the fragment no longer needed
• Update the statistics (low, medium/high) to keep everything up to date.
• Enables storing data over time
[Figure: table rows spread across monthly fragments, Dec 08, Jan, Feb, Mar, Apr, May 09.]

Time Cyclic Data management
[Figure: table rows spread across monthly fragments, Dec 08, Jan, Feb, Mar, Apr, May 09.]
• ATTACH, DETACH and the rest of the ALTERs require exclusive access
  – Planned downtime
• These can be scripted, but still need to lock out the users
  – Informix 11.50.xC6 has DDL_FORCE_EXEC to lock out the users
• Expression strategy gives you flexibility, but elimination can be tricky.

Fragment by Expression
create table orders
(
    order_num int,
    order_date date,
    customer_num integer not null,
    ship_instruct char(40),
    backlog char(1),
    po_num char(10),
    ship_date date,
    ship_weight decimal(8,2),
    ship_charge money(6),
    paid_date date
)
partition by expression
    partition prv_partition (order_date < date('01-01-2010')) in mydbs,
    partition jan_partition (order_date >= date('01-01-2010') and
                             order_date < date('02-01-2010')) in mydbs,
    partition feb_partition (order_date >= date('02-01-2010') and
                             order_date < date('03-01-2010')) in mydbs,
    partition mar_partition (order_date >= date('03-01-2010') and
                             order_date < date('04-01-2010')) in mydbs,
    partition apr_partition (order_date >= date('04-01-2010') and
                             order_date < date('05-01-2010')) in mydbs,
    …

Fragment by Interval
create table orders
(
    order_num int,
    order_date date,
    customer_num integer not null,
    ship_instruct char(40),
    backlog char(1),
    po_num char(10),
    ship_date date,
    ship_weight decimal(8,2),
    ship_charge money(6),
    paid_date date
)
partition by range(order_date)      -- partition key
interval(1 units month)             -- interval value
store in (dbs1, dbs2)               -- dbspaces
partition prv_partition values < date('01-01-2010') in dbs3;   -- initial partition

Interval Fragmentation
• Fragments data based on an interval value
  – E.g. a fragment for every month or every million customer records
• Tables have an initial set of fragments defined by a range expression
• When a row is inserted that does not fit in the initial range fragments, IDS will automatically create a fragment to hold the row (no DBA intervention)
• No X-lock is required for fragment addition
• All the benefits of fragment by expression

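The automatic fragment creation described above can be sketched as follows. This is a toy Python model, not the server's implementation: the class and function names are hypothetical, the model only handles month intervals with a transition value on the first day of a month, and it ignores dbspace selection. The transition value of 1 January 2010 and the `prv_partition` name are taken from the orders example.

```python
from datetime import date


def month_index(d):
    """Months elapsed since year 0, used to do month arithmetic on dates."""
    return d.year * 12 + (d.month - 1)


def index_to_date(idx):
    """First day of the month identified by a month index."""
    return date(idx // 12, idx % 12 + 1, 1)


def interval_bounds(d, transition, months=1):
    """Half-open [start, end) bounds of the interval fragment holding d.

    Returns None when d lies below the transition value, that is, when
    the row belongs to one of the initial range fragments.
    """
    if transition.day != 1:
        raise ValueError("this sketch only handles transition values on the 1st")
    if d < transition:
        return None
    k = (month_index(d) - month_index(transition)) // months
    start = month_index(transition) + k * months
    return index_to_date(start), index_to_date(start + months)


class IntervalTable:
    """Rows keyed by fragment; interval fragments appear on first insert."""

    def __init__(self, transition, months=1):
        self.transition = transition
        self.months = months
        self.fragments = {"prv_partition": []}  # initial range fragment

    def insert(self, order_date, row):
        key = interval_bounds(order_date, self.transition, self.months)
        if key is None:
            key = "prv_partition"
        created = key not in self.fragments
        self.fragments.setdefault(key, []).append(row)
        return key, created


if __name__ == "__main__":
    orders = IntervalTable(date(2010, 1, 1))
    print(orders.insert(date(2010, 3, 15), "order 1001"))  # new March fragment
    print(orders.insert(date(2010, 3, 20), "order 1002"))  # reuses it
    print(orders.insert(date(2009, 6, 1), "order 1003"))   # initial fragment
```

Note that no fragment exists for January or February in this run: in the model, as in the slide's description, a fragment is created only when a row needs it.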
ONLINE attach, detach
• ATTACH
  – Load the data into a staging table and create the indices exactly as you have them in your target table.
  – Then simply attach the table as a fragment into another table.
• DETACH
  – Identify the partition you want to detach
  – Simply detach the partition with the ONLINE keyword to avoid attempts to get exclusive access

Attach Example
ALTER FRAGMENT ONLINE ON TABLE "sales".orders
    ATTACH december_orders_table AS PARTITION december_partition
    VALUES < DATE('01-01-2011');

Attaching online
[Diagram: december_orders_table (table to attach) and the orders table, with query1 to query4 running against orders over time.]
1. query1 and query2 are running on orders. Issue ALTER ATTACH ONLINE.
2. query1 and query2 continue and won't access the new partition.
3. Modify the dictionary entry to indicate online attach is in progress. Other sessions can read the list but cannot modify it.
4. New queries (query3) will work on the table and won't consider the new table fragment.
5. Get exclusive access to the partition list (in the dictionary). The dictionary entry gets modified, and new dictionary entries are used for new queries from here on.
6. New queries (query4) will work on the table and will consider the new table fragment.
7. ONLINE ATTACH operation is complete. Table is fully available for queries.

ONLINE operations
• ATTACH a fragment
• DETACH a fragment
• MODIFY transition value
• Automatic ADDing of new fragments on insert or update
• Tasks eliminated by interval fragmentation
  – Scheduling downtime to get exclusive access for ADD, ATTACH, DETACH
  – Defining proper expressions to ensure fragment elimination
  – Running of update statistics manually after ALTER operations
  – Time taken to collect statistics is reduced as well.

Agenda
• Reasons for feature
• Syntax
• Examples
• Limitations

Reasons for feature - Enterprise
• Enterprise customers and customer applications have a policy
  – Keep 13 months of sales data
  – Every month, purge/compress/move this data
• Customers write scripts to implement this policy
• All these require database & system down time.
• Scripts have to be tested and maintained.
• Why not support the policy itself in the database?

Reasons for feature - Embedded
• Embedded applications need to manage a limited amount of space automatically
• OEMs have written thousands of lines of SPL to limit the amount of space taken by tables
• Offering the ability to control table space usage declaratively simplifies applications
• Why not embed the policy itself into the table?

Syntax
• Start with Interval fragmentation
• Augment the following:
FRAGMENT BY RANGE (<column list>)
INTERVAL (<value>)
[ [ROLLING (<integer value> FRAGMENTS)]
  [LIMIT TO <integer value> <SIZEUNIT>]
  [DETACH | DISCARD] ]
[ [ANY | INTERVAL FIRST | INTERVAL ONLY] ]
STORE IN (<dbspace list> | <function_to_return_dbspacename()>) ;

SIZEUNIT: [K | KB | KiB | M | MB | MiB | G | GB | GiB | T | TB | TiB]

• ALTER FRAGMENT ... MODIFY INTERVAL is augmented as such:
  [ROLLING (<integer value> FRAGMENTS)] [LIMIT TO <integer value> <SIZEUNIT>] [DETACH | DISCARD]
• ALTER FRAGMENT … MODIFY DROP ALL ROLLING removes the rolling window policy altogether
• ALTER FRAGMENT … MODIFY INTERVAL DISABLE disables rolling window policies without dropping them
• ALTER FRAGMENT … MODIFY INTERVAL ENABLE reinstates the current rolling window policy, if any is defined

ROLLING clause
• Used to specify the number of active interval fragments
• When the number of interval fragments exceeds the set value (that is, when a new one is created), the interval fragment holding the lowest set of values will be detached

LIMIT clause
• Specifies the maximum size of the table
• When the limit is exceeded, fragments holding the lowest values will be detached until space used is below the limit
• The comparison is done against the overall size (data and index pages allocated) of the table
• Both interval and initial range fragments could be detached depending on the action specified

DETACH | DISCARD clause
• Decides the fate of victim fragments
• DISCARD will eliminate the fragment for good
• DETACH will preserve the data by detaching the fragment into a new table
• Applications can detect detached fragments and archive their contents into different tables or databases, aggregate them, etc.
• The actual detach / discard is done through the DBScheduler

ANY | INTERVAL FIRST | INTERVAL ONLY
• The LIMIT TO clause has three modes of operation
  – ANY – any fragment will be detached, starting from the lowest
  – INTERVAL FIRST – interval fragments will be detached starting from the lowest; if the table still exceeds LIMIT, range fragments will be detached from the lowest: intended as an emergency action
  – INTERVAL ONLY – range fragments are preserved even if the table still exceeds LIMIT
• Default is INTERVAL FIRST
• Range fragments will be detached but preserved empty

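The interaction of ROLLING, LIMIT TO and the three modes can be sketched as a victim selection routine. This is a toy Python model, not the server's purge logic: the function name, the `(name, kind, size)` tuples and the choice to apply ROLLING before LIMIT are assumptions, and sizes are plain numbers rather than allocated pages.

```python
def pick_victims(fragments, rolling=None, limit=None, mode="INTERVAL FIRST"):
    """Return names of fragments to detach or discard, in order.

    fragments: list of (name, kind, size) ordered from lowest to highest
    key values, where kind is "range" or "interval".
    rolling:   maximum number of interval fragments to keep, or None.
    limit:     maximum total size of the table, or None.
    """
    remaining = list(fragments)
    victims = []

    def detach(frag):
        # In this model a detached range fragment simply drops out of the
        # size total; per the slide, the real fragment stays in place, empty.
        remaining.remove(frag)
        victims.append(frag[0])

    # ROLLING: keep only the newest `rolling` interval fragments.
    if rolling is not None:
        intervals = [f for f in remaining if f[1] == "interval"]
        for frag in intervals[:max(0, len(intervals) - rolling)]:
            detach(frag)

    # LIMIT TO: detach lowest-valued candidates until the total fits.
    if limit is not None:
        def candidates():
            intervals = [f for f in remaining if f[1] == "interval"]
            ranges = [f for f in remaining if f[1] == "range"]
            if mode == "ANY":
                return list(remaining)
            if mode == "INTERVAL FIRST":
                return intervals or ranges
            if mode == "INTERVAL ONLY":
                return intervals
            raise ValueError("unknown mode: " + mode)

        while sum(f[2] for f in remaining) > limit:
            pool = candidates()
            if not pool:
                break  # INTERVAL ONLY: range fragments are never touched
            detach(pool[0])

    return victims


if __name__ == "__main__":
    table = [("prv", "range", 5), ("jan", "interval", 4),
             ("feb", "interval", 4), ("mar", "interval", 4)]
    print(pick_victims(table, rolling=2))
    print(pick_victims(table, limit=8, mode="ANY"))
    print(pick_victims(table, limit=3, mode="INTERVAL ONLY"))
```

The last call shows why INTERVAL ONLY can leave a table above its limit: once every interval fragment is gone, the model has nothing left that it is allowed to remove.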
STORE clause
• The STORE clause has been extended to be able to take a function which returns the name of the dbspace where the next fragment is to be created.
• The function takes four arguments: table owner, table name, value being inserted, retry flag
• If creating the fragment fails the first time round, the function will be invoked again with the retry flag set
• On a new failure, the DML being executed will fail
• The statement will fail if the UDR cannot be determined

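The call-and-retry protocol described above can be sketched from the caller's side. This is a toy Python model, not server code: `create_fragment_via_udr`, `try_create` and the exception type are hypothetical names, and treating a NULL (None) return as "no dbspace offered" is an assumption. Only the four arguments and the single retry come from the slide.

```python
class FragmentCreationError(Exception):
    """Stands in for the DML statement failing."""


def create_fragment_via_udr(choose_dbspace, try_create, owner, table, value):
    """Ask the user function for a dbspace, retrying once on failure.

    choose_dbspace(owner, table, value, retry) -> dbspace name or None
    try_create(dbspace) -> True if the fragment could be created there
    """
    for retry in (0, 1):
        dbspace = choose_dbspace(owner, table, value, retry)
        if dbspace is not None and try_create(dbspace):
            return dbspace
    # Second failure: the insert or update that needed the fragment fails.
    raise FragmentCreationError(
        "no fragment could be created for %s.%s" % (owner, table))


if __name__ == "__main__":
    # Hypothetical policy: prefer dbs1, fall back to dbs2 on retry.
    def choose(owner, table, value, retry):
        return "dbs2" if retry else "dbs1"

    full = {"dbs1"}  # pretend dbs1 has run out of space
    print(create_fragment_via_udr(
        choose, lambda d: d not in full, "sales", "orders", "2010-03-15"))
```

A function that returns no dbspace on retry gives up after the first attempt, while one that returns an alternative dbspace gets a second chance before the DML fails.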
Examples
create table orders
(
    order_num serial(1001),
    order_date date,
    customer_num integer not null,
    ship_instruct char(40),
    backlog char(1),
    po_num char(10),
    ship_date date,
    ship_weight decimal(8,2),
    ship_charge money(6),
    paid_date date
)
partition by range(order_date) interval(1 units month)
ROLLING (12 FRAGMENTS) LIMIT TO 20 GB DETACH
store in (mydbname())
partition prv_partition values < date('01-01-2010') in mydbs;

Examples #2
• Sample UDR to determine dbspace
CREATE FUNCTION mydbname(owner CHAR(255), table CHAR(255),
                         value DATETIME YEAR TO SECOND, retry INTEGER)
    RETURNING CHAR(255);
    -- If it does not work the first time, do not try again.
    IF (retry > 0) THEN
        RETURN NULL;
    END IF;
    -- First half of the year goes to dbs1, second half to dbs2.
    IF (MONTH(value) < 7) THEN
        RETURN "dbs1";
    ELSE
        RETURN "dbs2";
    END IF;
END FUNCTION;

UDRs and DBScheduler
• Detaching or dropping fragments can be done manually by executing the function syspurge() from the database in which the fragments should be dropped or detached
• It will be done automatically every day through the purge_tables DBScheduler system task.
• Task is enabled by default
• By default, when is the purge_tables task scheduled?

Limitations
• Due to primary key constraint violations, the feature will not be applicable to tables with a primary key that has referential constraints to it (a primary key with no references is fine).
• Only indices following the same fragmentation strategy as the table are allowed (to allow real time fragment detach).
• This means indices have to be created with no storage option, and no ALTER FRAGMENT MODIFY or ALTER FRAGMENT INIT is allowed on the indices.

Deep dive into interval and rolling window table partitioning in IBM Informix
Keshava Murthy
IBM
email@example.com