DB2 – Differentiating Business Value

Key features and business value of DB2 10.
This presentation was given at IBM Data Server Day on May 22 in Stockholm by Les King, Director, Distributed Data Server Product Management, IBM.

  • Adaptive Compression is included only in the Storage Optimization Feature, and as such is available only for Enterprise Server Edition and Advanced Enterprise Server Edition. Significant enhancements to DB2's industry-leading compression technologies arrive in the new Adaptive Compression, which can further reduce your storage needs. The enhancements deliver efficient compression of large volumes of new and changing data. The improved compression ratio further reduces storage needs and allows more data to be kept in memory, thereby increasing performance. The new approach used in Adaptive Compression also reduces the need for table reorganization. As a consequence, overall maintenance of compressed data is reduced, providing additional cost savings.
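As a sketch of how this is enabled at the DDL level (the slide itself shows no DDL; table and column names here are hypothetical), the compression mode is chosen per table:

```sql
-- Hypothetical table; COMPRESS YES ADAPTIVE (DB2 10.1) enables both
-- table-level (static dictionary) and page-level compression.
CREATE TABLE sales_history (
    sale_id    BIGINT NOT NULL,
    sale_date  DATE,
    details    VARCHAR(2000)
) COMPRESS YES ADAPTIVE;

-- Classic DB2 9.x-style static row compression only:
CREATE TABLE sales_archive (
    sale_id    BIGINT NOT NULL,
    sale_date  DATE,
    details    VARCHAR(2000)
) COMPRESS YES STATIC;
```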
  • This chart shows features that are new in DB2 9.7. DB2 is extending its leadership with additional breakthroughs in data compression. Currently, IBM clients enjoy data compression rates of up to 83%, which translates into storage savings of up to 50%; actual rates vary depending on the type of data. With the following new features, IBM expects to push these storage savings higher: compression of indexes, compression of log files, compression of temporary tables, compression of XML data (currently only inline XML data is compressed), and compression of Large Objects (LOBs).
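As an illustration of one of these features (index compression, introduced in DB2 9.7; index and table names here are hypothetical), compression can be requested explicitly per index:

```sql
-- Indexes on a compressed table default to compression in DB2 9.7;
-- COMPRESS YES can also be stated explicitly per index.
CREATE INDEX ix_sales_date
    ON sales_history (sale_date)
    COMPRESS YES;
```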
  • SAP warehousing workload. Overview for the 5 largest tables, 54 GB total size; data is sorted. Shows the offline reorg case: an additional 39% savings from Adaptive Compression after reorg, with the compression ratio increasing from 5.2x to 8.5x in the offline reorg case.
  • SAP warehousing workload. Overview for the 5 largest tables, 54 GB total size; data is sorted. Shows the online ADC (Automatic Dictionary Creation) case: an additional 60% savings from Adaptive Compression, with the compression ratio increasing from 2.5x to 6.4x in the ADC case.
  • DB2 Galileo compression is far superior to Oracle's for a number of reasons. In order to get the maximum compression possible, a customer needs to buy Exadata and use columnar compression. The standard compression in Oracle uses only page-level compression, so it is not as efficient as using table-level and page-level compression together. Data must be pre-sorted to get the best possible compression ratio, which is often not possible in a production database. And Exadata requires the use of “classic” compression if the workloads that are running need to do updates, inserts, and deletes.
  • Key point: the amount of business information in XML form is already as great as or greater than other forms and growing faster - failure to leverage it as efficiently as structured data means high cost and/or missed opportunity.

In 2006 IBM introduced a new-generation data server with the availability of DB2 9 (formerly known as “Viper”). The explosive growth of XML-based data standards in all industries means competitive advantage for those businesses that use it most effectively and efficiently. Client, policy and claims processing in insurance; supply chain management in retail; financial transactions and asset management in banking; patient care in healthcare; citizen service in government; implementing Service Oriented Architectures (SOA) in computing software and services - and many other processes across all industries - increasingly rely on information captured and exchanged in XML form.

Our clients are increasingly managing XML-format text documents in a content management system for proper governance and efficient use in the business process workflow. But few are realizing the full value of all the business data they possess that is in XML format. Early users of the pureXML feature of DB2 9 are taking advantage of the fact that data in XML format is well structured and can be queried via standard languages such as XPath and XQuery. By doing so they are bringing that data to bear in both transactional and analytic processes - with higher performance and lower development costs than previously possible with a relational database.

The difference is that DB2 9 supports both relational (tabular) and XML (hierarchical) structures in the same database, so that both can be easily, efficiently and securely managed, analyzed and delivered. Unlike other relational data servers - and previous versions of DB2 - pureXML eliminates the overhead of fitting the “square peg” XML tree structure into the “round hole” row-and-column relational structure.
Until DB2 9, managing XML data records with a relational data server meant decomposing the data into columns - a process known as shredding - or storing the entire data record in a single cell as a character large object, known as a CLOB.

The CLOB approach incurs no overhead as the data records go in. But when you query these records you pay the overhead of parsing each one at runtime, which can be a significant performance impact on the application. With shredding, overhead is paid up front to turn the data into a relational record that can be queried efficiently. But overhead is also paid later if the record needs to be recreated for delivery in XML format. This process also affects the fidelity of the record itself - leading to an approach that uses both shredding and CLOB methods for applications that require both performance and fidelity. This results in even more overhead to ensure the records remain in sync.

The impact of pureXML is seen in a large banking client with a requirement to update over 500,000 XML data records per day. Attempts to use a competitor's relational data server failed. Using DB2 9 with pureXML, the application was able to update more than half a million data records in less than an hour. And a large insurance client has seen the impact of pureXML on development time and cost, with a 65% reduction in lines of code and a more than 75% reduction in the time required to develop services accessing XML data.
  • XMLELEMENT returns an XML value that is an XQuery element node; here the element name is <Department ...> ... </Department>. XMLATTRIBUTES constructs XML attributes from its arguments: e.dept (column DEPT of table EMPLOYEE with alias e) becomes the attribute named "name". XMLAGG returns an XML sequence containing an item for each non-null value in a set of XML values. The inner XMLELEMENT returns the element <name> ... </name> with values from e.firstname. The result is returned as dept_list, grouped by department name.
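The slide's query itself is not reproduced in these notes; a reconstruction consistent with the description above (table and column names assumed) would be:

```sql
SELECT XMLELEMENT(
         NAME "Department",
         XMLATTRIBUTES(e.dept AS "name"),
         XMLAGG(XMLELEMENT(NAME "name", e.firstname))
       ) AS dept_list
FROM employee e
GROUP BY e.dept;
```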
  • The FOR clause iterates over xmlcolumn('MOVIES.DOC'), which returns an XML sequence of the values in the XML column DOC of the MOVIES table. LET $actors := $movie//actor assigns the actor values to the $actors variable. WHERE $movie/duration > 90 selects movies with a duration greater than 90 minutes. ORDER BY $movie/@year orders the movies by year. RETURN <movie> {$movie/title, $actors} </movie> returns the sequence <movie> ...
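The FLWOR expression these notes walk through is not shown in full; a reconstruction consistent with the description (using DB2's db2-fn:xmlcolumn input function; the MOVIES.DOC column is as described in the notes) would be:

```xquery
xquery
for $movie in db2-fn:xmlcolumn('MOVIES.DOC')
let $actors := $movie//actor
where $movie/duration > 90
order by $movie/@year
return <movie> {$movie/title, $actors} </movie>
```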
  • Overview. Table partitioning will be made available by a new PARTITION BY clause on CREATE TABLE. For example:

CREATE TABLE foo(a INT)
  IN tbsp1, tbsp2, tbsp3, tbsp4, tbsp5
  PARTITION BY RANGE(a)
    (STARTING FROM (1) ENDING (100) EVERY (20))

This creates a table where rows with a >= 1 and a <= 20 are stored in table space tbsp1, rows with 21 <= a <= 40 are in table space tbsp2, and so on.

This functionality leads to a three-level data organization scheme:
- DISTRIBUTE BY to spread data across EEE database partitions
- PARTITION BY to spread data across DMS objects in one or more tablespaces
- ORGANIZE BY to spread data across extents within a tablespace

These clauses can be combined in a single table to create more complicated partitioning schemes. This, for example, allows functionality similar to Informix's hybrid partitioning, by combining DISTRIBUTE BY and PARTITION BY to spread data both across EEE database partitions and across multiple tablespaces. Each clause includes an algorithm (for example HASH, DIMENSIONS or RANGE) after the BY to indicate how the data should be spread out. Not all clauses will support all of the algorithms, but this syntax allows consistency between the clauses as well as allowing future extensions to add new data layout algorithms.

DML operations against a partitioned table will yield the same results as they would for an ordinary table. Data inserted or loaded into the table will be transparently placed in the correct data partition. Updates that move data from one data partition to another will automatically work as expected. All of the usual SQL features such as triggers, constraints, etc. will be supported for partitioned tables. The user-visible differences will be in capacity, performance, and availability. Data partitioned tables can contain vastly more data than an ordinary table: tables with up to 32767 data partitions can be created. The current limit on the number of DMS objects is 4096 tablespaces, each with approximately 55000 objects in them.
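The three clauses can be combined in one CREATE TABLE; a sketch (table, column and tablespace names hypothetical, and the EVERY duration syntax hedged against the exact release) of a table using all three levels:

```sql
-- Spread rows across database partitions by hash, then by date range
-- across tablespaces, then clustered by region within each range.
CREATE TABLE orders (
    order_id   BIGINT NOT NULL,
    order_date DATE   NOT NULL,
    region     INT    NOT NULL
  )
  IN tbsp1, tbsp2, tbsp3, tbsp4
  DISTRIBUTE BY HASH (order_id)
  PARTITION BY RANGE (order_date)
    (STARTING FROM ('2001-01-01') ENDING ('2004-12-31') EVERY (1 YEAR))
  ORGANIZE BY DIMENSIONS (region);
```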
Query processing will be enhanced to automatically eliminate data partitions based on the predicates of the query, resulting in better query performance. This feature is called "data partition elimination", which corresponds to "fragment elimination" in the Informix products and SmartScan in the Redbrick products. Many decision support queries benefit greatly from this.

Some operations that currently can take a long time on a large table, such as backup, will work partition by partition. Thus, it will be possible to back up one data partition of a partitioned table at a time. In future releases this will be extended to other administration operations such as reorg. Even if applications cannot function properly with a portion of the table unavailable, it will at least allow people to break one intolerably long maintenance operation into a series of smaller ones.

One notable difference between data partitioned tables and non-partitioned tables (i.e. not MDC or EEE) occurs when an update where current of cursor (WCOC) operation moves the row from one partition to another: the cursor will no longer be positioned on a row. In this case, the cursor can be repositioned by fetching the next row. So you can run 'update table t1 set col1 = col1 + 1' multiple times, but after the first update that changes the row position the cursor will no longer be positioned, and no further WCOC operations will be allowed until a subsequent fetch positions it on a new row. This is consistent with the behavior of MPP partitioning and MDC partitioning.

The table partitioning line item will provide the following features in the first release:
- The ability to partition a table by key range into multiple tablespaces. This includes the ability to have multiple ranges of a single table in one tablespace (each range will be in a separate DMS object).
- A long and a short form of the CREATE TABLE syntax. The short form allows easy creation of large numbers of data partitions when required.
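To make the elimination idea concrete (hypothetical table, range-partitioned on sale_date), a predicate on the partitioning column lets the compiler or runtime skip every partition whose range cannot match:

```sql
-- With sales range-partitioned by month on sale_date, only the
-- partitions covering Q1 2003 are scanned; all others are eliminated.
SELECT COUNT(*)
FROM sales
WHERE sale_date BETWEEN '2003-01-01' AND '2003-03-31';
```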
- The long form supports low-level control of placement when required, most likely for data skew cases.
- The ability to create an index on a partitioned table. In this release, the index will be non-partitioned and will hold references to data rows across all of the data partitions on an EEE database partition (i.e. each EEE database partition will have its own index). This means that a single DMS object will be used per EEE database partition to hold the index. Each index on the table is stored in a separate DMS object within an EEE database partition. MDC will use a non-partitioned index; MDC block indexes for partitioned tables will include a data partition identifier in addition to the BID.
- The ability to scan only the data partitions needed, based on the WHERE clause and the range definitions for each data partition. This is known as data partition elimination.
- ALTER commands to ATTACH/DETACH tables to a partitioned table. This feature allows easy roll-in/roll-out. Tables can be loaded offline and then attached to a live partitioned table, for easy roll-in. In addition, a data partition from a partitioned table can be detached into a stand-alone table, for easy roll-out. For discussions of the concurrency rules for ATTACH and DETACH, please refer to the usage scenarios document.
- ALTER commands to ADD empty data partitions. ALTER TABLE ... DROP PARTITION will not be supported in the first release; use DETACH + DROP TABLE to accomplish the same result.
- DB2LOOK support for dumping out schemas of partitioned tables. Only the long form of the syntax will be output - any tables created with the short form of the syntax will be dumped as if they had been created with the long form.
- RUNSTATS will generate stats from all data partitions. RUNSTATS will gather statistics from all data partitions residing on the same database partition (EEE node).
The ability to generate statistics from all nodes in an EEE environment is in another line item (see LI 1459) - currently statistics are only gathered on one EEE node and extrapolated. Note there is NO dependency on LI 1459 for this line item; we will simply expand the current algorithm to cover all data partitions within the same database partition.

REORG will be supported at the table level (it won't be possible to reorg an individual data partition in this release). However, reorg of an individual partition can be achieved by detaching the partition, reorging the resulting non-partitioned table and then re-attaching the partition. Please see the Usage Scenarios document for details of this procedure.

REORGCHK will be supported at the table level. Table and index statistics will be calculated based on the statistics for the whole table.

REORG INDEXES ALL ALLOW READ/WRITE will not be supported for partitioned tables in the initial release. Instead, only the ALLOW NO ACCESS option will be supported, making REORG INDEXES ALL available to reorg all the indexes of a partitioned table offline. Note that ALLOW NO ACCESS must be explicitly specified for REORG INDEXES ALL on a partitioned table because the default is ALLOW READ ACCESS. In addition, there will be new syntax to support reorg of individual indexes as specified by the user.

LOAD into a partitioned table will be supported, as will IMPORT/EXPORT from a partitioned table. This line item does not address whether or not High Performance Unload (HPU) will be supported for data partitioned tables.

Point-in-time ROLLFORWARD will be enhanced to handle partitioned tables much as it does indexes: all tablespaces belonging to a partitioned table will be rolled forward together.

LOCK TABLE will support locking all of the data partitions via a single table lock. Partition-level locks may be obtained as well to satisfy various locking requirements.
All existing ALTER TABLE commands will be supported. Parallel scans will be supported on multiple data partitions via a straw scan of each data partition individually. Insert/update/delete for data partitioned tables will be serialized just as it is for ordinary tables.

Partitioning of hierarchical (typed) tables will NOT be supported in the first release. Partitioned MQTs are allowed, but you cannot do ALTER ATTACH/DETACH on them. Not all internal optimizations available on non-partitioned tables will necessarily be supported on partitioned tables (for example, some of the TPCC optimizations introduced in V8, such as parameterized base tables, may not be supported - the features will still work with partitioned tables, just without the performance optimizations). DATALINK and XML columns will not be supported on data partitioned tables.

CREATE INDEX IN <tablespace> enables the user to specify a tablespace for each individual index on a partitioned table. It will override the tablespace specified in the CREATE TABLE statement, or selected by the default rule, for this index, regardless of whether the base table's tablespace is DMS or SMS.

Redistribute will be supported for partitioned tables, but only to move data between database partitions, NOT to move data between ranges of a data partitioned table. Redistribute for partitioned tables has the following restrictions: the table must have an access mode (SYSTABLES.ACCESS_MODE) of Full Access, and the table must not have outstanding ATTACHed or DETACHed partitions. This support is being implemented as part of the Fast Redistribute line item.

RENAME TABLE will rename a data partitioned table.

The ALTER operations interact strongly with SET INTEGRITY. This aspect of the project has been split off as a separate line item, LI 2369.

Details. Table partitioning will require the creation of two new catalog tables, SYSDATAPARTITIONS and SYSDATAPARTITIONEXPRESSION.
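A sketch of the per-index tablespace override mentioned above (index, table and tablespace names hypothetical):

```sql
-- Place this index's (non-partitioned) object in its own tablespace,
-- overriding the tablespace chosen at CREATE TABLE time.
CREATE INDEX ix_sales_cust
    ON sales (cust_id)
    IN idx_tbsp;
```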
This is because each data partition of a partitioned table will be a separate DMS object, which needs to be tracked via a tuple in SYSDATAPARTITIONS. This tuple will include the starting and ending key values for that partition's range, and other information. An additional SYSDATAPARTITIONEXPRESSION catalog table will hold the details of the columns used to partition on (and, in later releases, the partitioning expressions). A partitioned table will have one entry in SYSTABLES, one entry in SYSDATAPARTITIONEXPRESSION per partitioning expression, and n entries in the new catalog table SYSDATAPARTITIONS, where n is the number of data partitions.

The PARTITION BY clause will specify the range of data values that goes into each DMS object on a database partition (the DISTRIBUTE BY clause will specify which EEE database partition the data should live on). As the data is now contained in separate DMS objects based on key range, we can eliminate one or more DMS objects from data scans based on the WHERE clause conditions. This functionality is called "data partition elimination"; it corresponds to "fragment elimination" in the Informix products and SmartScan in the Redbrick products.

Because the table is now spread out across multiple tablespaces, we are no longer limited by the 4-byte record id. Each tablespace can have its own record id range, and we will supplement that with a data partition identifier internally to uniquely address each row (much as updates to union views are handled in V8). Also, because the map of what data is in which object is in the catalogs, we can reasonably easily add or remove ranges of data from the table. The ALTER statements allow addition/deletion of data partitions from a partitioned table, thus solving the roll-in/roll-out requirement.
The ranges of data in each data partition can be specified in one of two ways.

Automatically generated. This mirrors the proposed MDC syntax, which allows expressions in the ORGANIZE BY DIMENSIONS clause, and allows expressions to specify how the ranges are derived. As with the current MDC version, we won't support table partitioning by expressions in the first version - generated columns can be used in the meantime to support this functionality. For example:

CREATE TABLE t(a INT, b INT GENERATED ALWAYS AS (a/10))
  IN tbsp1, tbsp2, tbsp3, tbsp4, tbsp5, tbsp6, tbsp7, tbsp8, tbsp9, tbsp10
  PARTITION BY RANGE(b)
    (STARTING FROM (1) ENDING (1000) EVERY (100))

would result in 10 data partitions, each with 100 key values in them, i.e.:

(a/10) >= 1 and (a/10) < 101 in tbsp1
(a/10) >= 101 and (a/10) < 201 in tbsp2
...
(a/10) >= 901 and (a/10) <= 1000 in tbsp10

The starting value of the first data partition (the one in tbsp1) will be inclusive because the overall starting bound (1) was inclusive (the default). Similarly, the ending bound of the last data partition (the one in tbsp10) will be inclusive because the overall ending bound (1000) was inclusive (again by default). The remaining STARTING values are all inclusive and the remaining ENDING values are all exclusive, and each data partition holds n key values, where n is given by the EVERY clause. We use the formula (start + every) to find the end of each data partition range. The last data partition may have fewer key values if the EVERY value does not divide evenly into the START/END range.

Now say we specify:

CREATE TABLE t(a INT, b INT GENERATED ALWAYS AS (a/10))
  IN tbsp1, tbsp2, tbsp3, tbsp4, tbsp5, tbsp6, tbsp7, tbsp8, tbsp9, tbsp10
  PARTITION BY RANGE(b)
    (STARTING FROM (1) EXCLUSIVE ENDING (1000) EVERY (100))

This would result in 10 data partitions, each with 100 key values in them, i.e.:

(a/10) > 1 and (a/10) <= 101 in tbsp1
(a/10) > 101 and (a/10) <= 201 in tbsp2
...
(a/10) > 901 and (a/10) <= 1000 in tbsp10

The starting value of the first data partition (the one in tbsp1) will be exclusive because the overall starting bound (1) was exclusive. Similarly, the ending bound of the last data partition (the one in tbsp10) will be inclusive because the overall ending bound (1000) was inclusive. The remaining STARTING values are all exclusive and the remaining ENDING values are all inclusive, and each data partition holds n key values, where n is given by the EVERY clause.

Finally, if both the starting and the ending bound of the overall clause are exclusive, the starting value of the first data partition (the one in tbsp1) will be exclusive because the overall starting bound (1) was exclusive, and the ending bound of the last data partition (the one in tbsp10) will be exclusive because the overall ending bound (1000) was exclusive. The remaining STARTING values are all exclusive and the remaining ENDING values are all inclusive, and each data partition (except the last) holds n key values, where n is given by the EVERY clause. Note we are still using the formula (start + every) to find the end of each data partition range.

Tables created in this manner are constrained to use numeric or datetime types in their PARTITION BY columns. For an example of the EVERY clause using a date column, refer to the definition of the table LINEITEM in the Usage Scenarios section. Ranges are ascending. The increment in the EVERY clause must be greater than zero. The ENDING value must be greater than or equal to the STARTING value. MINVALUE and MAXVALUE are not allowed in the automatically generated form of the syntax.

Manually generated. This is like the traditional DB2 z/OS style of partitioning, where high values are specified for each data partition. A new data partition is created for each data partition boundary listed in the PARTITION BY clause.
For example:

CREATE TABLE foo(a INT, b INT GENERATED ALWAYS AS (a/10))
  PARTITION BY RANGE(b)
    (STARTING FROM (1)
     ENDING (100) IN tbsp1, ENDING (200) IN tbsp2, ENDING (300) IN tbsp3,
     ENDING (400) IN tbsp4, ENDING (500) IN tbsp5, ENDING (600) IN tbsp6,
     ENDING (700) IN tbsp7, ENDING (800) IN tbsp8, ENDING (900) IN tbsp9,
     ENDING (1000) IN tbsp10)

would result in data partitions with the same ranges as above. As on DB2 z/OS, only one end of each range needs to be specified - the other end is implied by the adjacent data partition. This is felt to be much simpler for the user than forcing both ends of each range to be specified; in particular, with character and float columns the other end of the range could be difficult to specify.

The following types (including synonyms) are supported for use as a RANGE partitioning column: SMALLINT, INTEGER, INT, BIGINT, FLOAT, REAL, DOUBLE, DECIMAL, DEC, NUMERIC, NUM, CHARACTER, CHAR, VARCHAR, CHARACTER VARYING, CHAR VARYING, CHARACTER FOR BIT DATA, CHAR FOR BIT DATA, VARCHAR FOR BIT DATA, CHARACTER VARYING FOR BIT DATA, CHAR VARYING FOR BIT DATA, DATE, TIME, TIMESTAMP, GRAPHIC, VARGRAPHIC, and user-defined types (distinct).

Here are examples of types that can appear in a range partitioned table but are not supported for use as a range partitioning column (other types not mentioned here, because they have not yet been implemented, may or may not work in a range partitioned table): user-defined types (structured), LONG VARCHAR, LONG VARCHAR FOR BIT DATA, BLOB, BINARY LARGE OBJECT, CLOB, CHARACTER LARGE OBJECT, DBCLOB, LONG VARGRAPHIC, REF, varying-length string for C, and varying-length string for Pascal.

The following are examples of types that are not supported for use in a range partitioned table at all: XML, DATALINK, BOOLEAN (the BOOLEAN data type is currently only supported internally).

In the manually generated form of the syntax, multiple columns can be used as the range partitioning key.
For example:

CREATE TABLE foo(year INT, month INT)
  IN tbsp1, tbsp2, tbsp3, tbsp4, tbsp5, tbsp6, tbsp7, tbsp8
  PARTITION BY RANGE(year, month)
    (STARTING FROM (2001, 1)
     ENDING (2001, 3) IN tbsp1, ENDING (2001, 6) IN tbsp2,
     ENDING (2001, 9) IN tbsp3, ENDING (2001, 12) IN tbsp4,
     ENDING (2002, 3) IN tbsp5, ENDING (2002, 6) IN tbsp6,
     ENDING (2002, 9) IN tbsp7, ENDING (2002, 12) IN tbsp8)

This results in 8 data partitions, one for each quarter in the years 2001 and 2002. Note that when multiple columns are used as the table partitioning key, they are treated as a "composite" key (similar to composite keys in an index), in the sense that trailing columns are dependent on the leading columns. Table partitioning is multi-column, not multi-dimension: all columns used are part of a single dimension, like a B-tree with multiple keys. Each starting or ending value (all of the columns together) must be specified in 512 characters or less. This limit corresponds to the size of the columns SYSDATAPARTITIONS.LOWVALUE and SYSDATAPARTITIONS.HIGHVALUE. A starting or ending value specified with more than 512 characters will result in error SQL0636N, reason code 9.

Roll-in proceeds as follows. Create a new table P1 and load data into it. Do any necessary data cleansing and validation; P1 is an ordinary table, so you can create indexes, constraints, whatever helps you accomplish this step. Then do ALTER TABLE T1 ATTACH PARTITION ... FROM TABLE P1. At this point, P1 no longer exists as a separate table; rather, the data from P1 is now part of T1. A z-lock is acquired on P1, and the data in P1 only comes online when SET INTEGRITY commits. The ATTACH operation does not require that the data in P1 be read or written, so it is a more or less instantaneous operation. At this point, the table is fully read/write accessible except for the partition that has just been ATTACHed. Running SET INTEGRITY on T1 will make the new data visible. See LI 2369 (SET INTEGRITY) for details.
Once the ATTACH returns, it can be committed or aborted as required, but a call to SET INTEGRITY will be required to complete the operation and to bring any newly created data partitions online. ATTACH completes more or less instantaneously; all work that involves scanning or other DML on the contents of the table takes place during the SET INTEGRITY operation. This SET INTEGRITY call will do the following:
- check that rows satisfy the range defined on the attached partition
- insert into indexes the keys for the attached rows
- check RI/check/generated-column constraints as applicable
- generate identity or generated column values as applicable

LOAD into the staging table:

CREATE TABLE dec03(.....);
LOAD FROM data_file OF DEL REPLACE INTO dec03;

While LOAD is running, the staging table is offline. However, since it is a separate table from the target table, all existing data is fully accessible.

[Application-specific data cleansing, transformation, checking, etc.] Depending on the application, data may or may not have been massaged prior to loading. Since it has been loaded into a staging table that is completely independent of the target table, it may be convenient to do cleansing, checking and transformation on the staging table after LOAD has completed.

ATTACH the staging table to the target table:

ALTER TABLE stock ATTACH PARTITION dec03 STARTING '12/01/2003' ENDING '12/31/2003';

ALTER is more or less instantaneous; this operation completes in at most a few seconds. However, the data is not yet visible. Note that ALTER acquires locks on some rows in the catalogs that are necessary for compiling new queries against the target table. Thus no new queries can be compiled for this table until the ALTER is committed, releasing the locks. Existing queries do continue to run (no query draining for ATTACH), and new queries can start even before the commit if they are pre-compiled (static SQL in packages).
It is recommended that COMMIT be run immediately after ATTACH to avoid any interruption in access to the data. COMMIT the ALTER:

COMMIT WORK;

At this point, all existing data in the target table is fully available. However, data in the newly ATTACHed partition is not yet visible. Validate the new data using SET INTEGRITY:

SET INTEGRITY FOR stock IMMEDIATE CHECKED FOR EXCEPTION IN stock USE stock_ex;

While SET INTEGRITY is running, all existing data in the target table is fully accessible for select/insert/update/delete. Data in the newly ATTACHed partition is not yet visible. SET INTEGRITY is potentially a long-running operation. During this period, DDL and utility-type operations are not allowed - for example, LOAD or ALTER TABLE ... ADD COLUMN. The operations I am aware of in these categories are: LOAD, REORG, REDISTRIBUTE, Datalink reconcile, ALTER TABLE (e.g. add columns, ADD, ATTACH, DETACH, truncate via alter to "not logged initially"), and index create.

When SET INTEGRITY completes, it will drain all queries accessing the target table. Then it will transition the state of the table to make the new data visible. This transition is more or less instantaneous (it completes in a few seconds at most), but it cannot be performed until all existing queries have been drained.

Summary of data availability at each phase of roll-out. DETACH the partition to be rolled out:

ALTER TABLE stock DETACH PART dec01 INTO TABLE junk;

DETACH drains all queries accessing the table. The DETACH operation is more or less instantaneous (it completes in a few seconds at most), but it cannot be performed until all existing queries have been drained and all DDL and utility-type operations have completed. All packages that reference the table will be invalidated at this time. Note that after this state transition, the table remains offline until DETACH is committed.
It is recommended that COMMIT be run immediately after DETACH to avoid any interruption in access to the data. COMMIT the DETACH:

COMMIT WORK;

At this point, the DETACHed data is no longer visible in the data partitioned table.
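Collecting the inline statements above, the whole roll-in/roll-out cycle reads roughly as follows (table, partition and file names as in the notes):

```sql
-- Roll-in: stage, attach, commit, then validate to make data visible.
LOAD FROM data_file OF DEL REPLACE INTO dec03;
ALTER TABLE stock ATTACH PARTITION dec03
      STARTING '12/01/2003' ENDING '12/31/2003';
COMMIT WORK;
SET INTEGRITY FOR stock IMMEDIATE CHECKED
      FOR EXCEPTION IN stock USE stock_ex;

-- Roll-out: detach into a stand-alone table and commit promptly.
ALTER TABLE stock DETACH PART dec01 INTO TABLE junk;
COMMIT WORK;
```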
  • Distribution clause. The syntax is DISTRIBUTE BY HASH (col-name, ...) or DISTRIBUTE BY REPLICATION, with room reserved for future algorithms. The DISTRIBUTE BY clause replaces the PARTITIONING KEY clause used in previous releases; the old PARTITIONING KEY syntax is deprecated but will still be supported for backwards compatibility, and there is no restriction on mixing the old syntax with PARTITION BY RANGE. The associated syntax for adding or dropping a partitioning key via ALTER TABLE is changed to ADD/DROP DISTRIBUTION KEY instead of ADD/DROP PARTITIONING KEY; the partitioning-key-definition becomes DISTRIBUTION KEY (column-name, ...) with an optional USING HASHING. DISTRIBUTE BY REPLICATION, as in previous releases, is only supported with MQTs; an error will be returned if it is supplied for any other type of table. This syntax change frees the PARTITIONING term for use in this project and makes all of the CREATE TABLE data layout clauses match the {DISTRIBUTE|PARTITION|ORGANIZE} BY <algorithm> pattern.
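The old and new spellings side by side, as a sketch (table and column names hypothetical; the deprecated form is the pre-V9 PARTITIONING KEY clause the notes mention):

```sql
-- Deprecated pre-V9 form, still accepted for compatibility:
CREATE TABLE t_old (a INT, b INT)
  PARTITIONING KEY (a) USING HASHING;

-- Equivalent new form:
CREATE TABLE t_new (a INT, b INT)
  DISTRIBUTE BY HASH (a);
```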
Organization clause

    ORGANIZE BY [DIMENSIONS] ( column-name | ( column-name, ... ) [, ...] )
    ORGANIZE BY KEY SEQUENCE ( sequence-key-spec )

This is mainly in this document for completeness (and because the syntax came out of the table partitioning meetings). The changes necessary to change MDC were put in late in V8. MDC: A table can be both multi-dimensional clustered and data partitioned. In such a table, columns can be used both in the table partitioning range-partition-spec and in the MDC key. This is useful for achieving a finer granularity of partition and block elimination than could be achieved by either feature alone. There are also many applications where it is useful to specify different columns for the MDC than those on which the table is data partitioned. Refer to the usage scenarios for more details. Note that table partitioning is multi-column but not multi-dimension, while MDC is multi-dimension.
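A sketch combining the two clauses on one hypothetical table, so that SHIP_DATE drives partition elimination while REGION drives MDC block elimination:

```sql
-- Hypothetical table that is both range partitioned and
-- multi-dimensionally clustered.
CREATE TABLE sales (
    ship_date DATE,
    region    CHAR(12),
    amount    DECIMAL(10,2)
)
PARTITION BY RANGE (ship_date)
    (STARTING FROM ('2010-01-01') ENDING ('2010-12-31') EVERY (3 MONTHS))
ORGANIZE BY DIMENSIONS (region);
```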
  • DB2 provides tremendous flexibility in partitioning techniques to meet your design, performance and operational requirements. Hash partitioning, available with the Database Partitioning Feature (DPF), spreads data across nodes, allowing massively parallel I/O to speed your query results. Range partitioning simplifies roll-in and roll-out for adding and removing data from the active database. Multi-dimensional clustering (MDC) allows partitioning by like data attributes, or dimensions, providing optimized, high density, high yield access.
  • In this example, 4 partitions are used, but DB2 can support up to 1,000. Hash partitioning divides the data using a built-in hash algorithm and allows massively parallel access to the data. In the example above, each partition contains only ¼ of the data. DB2 parallelizes the I/O and can scale to handle your largest databases.
  • Range partitioning allows data to be physically segregated based upon the range of an attribute, most frequently a date. It enables an optimizer technique, referred to as pruning, that limits the data accessed. For instance, in the example above, a query accessing only 2006 data does not need the 2005 data, so the 2005 partition can be eliminated from the scan. Range partitioning can be used alone, or in conjunction with other techniques such as hash partitioning, as shown. Using hash and range partitioning together further reduces the data accessed and spreads the work of accessing that data across more resources, speeding query results even more.
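The pruning scenario described above can be sketched like this (names and ranges are illustrative):

```sql
-- One partition per year (hypothetical names); a predicate on SALE_DATE
-- lets the optimizer skip the 2005 partition entirely.
CREATE TABLE sales_hist (
    sale_date DATE,
    amount    DECIMAL(10,2)
)
PARTITION BY RANGE (sale_date)
   (PARTITION p2005 STARTING ('2005-01-01') ENDING ('2005-12-31'),
    PARTITION p2006 STARTING ('2006-01-01') ENDING ('2006-12-31'));

-- Touches only P2006 (partition elimination, i.e. "pruning"):
SELECT SUM(amount)
FROM sales_hist
WHERE sale_date BETWEEN '2006-01-01' AND '2006-12-31';
```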
  • The blocks represent Multi-dimensional Clustering (MDC) blocks. Data with like attributes, or dimensions, is partitioned into blocks, allowing fast, high density, high yield, block IOs compared with record-at-a-time access. Entire blocks of data can be read, and MDC technology assures that all records in the block contain the required attributes. MDC can be used alone or in conjunction with Hash and Range partitioning, as shown. Using a combination of Hash, Range and MDC allows division of the work among more resources and parallelism, as well as the opportunity for pruning through Range partitioning, and then further speeds access through MDC block access.
  • Multi-Temperature Data Management is included only in: - Enterprise Server Edition and - Advanced Enterprise Server Edition. Tiered storage is also known as Hierarchical Storage Management (HSM).
  • Top-level overview with everything expanded; note use of grey shading to distinguish ungrouped databases in the production group from grouped databases in the intranet group
  • DB2 workload manager (WLM) is a resource management and monitoring tool that is built right into the database engine. The primary client benefits include CPU control, detection and control of rogue queries (limiting excessive, unexpected resource consumption), and monitoring of work on the database. The new WLM architecture is also designed with integration of external WLM products, such as AIX workload manager, in mind, allowing DB2 to take advantage of their capabilities (on platforms where they exist) while improving the end-to-end workload management story. WLM is simple to administer, based on workload identification and then prioritization. An excellent introductory article on DB2 workload management is available for download at http://www.ibm.com/developerworks/forums/servlet/JiveServlet/download/1116-166950-13965175-231542/Introduction%20to%20DB2%20workload%20management.pdf These workload management capabilities help ensure proper resource allocation and utilization, which can help meet service level agreements and reduce your total cost of ownership. Now you don’t have to worry about someone hogging CPU time with a monster query, and you can ensure that high-priority work gets done first.
  • Rows Read: internal SQL activities, such as those initiated by the setting of a constraint or the refreshing of a materialized query table, are also not counted for this condition.
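A minimal WLM sketch along these lines, using hypothetical service class, workload, and threshold names (SQLROWSREAD is the rows-read threshold condition the note refers to):

```sql
-- Hypothetical setup: route a reporting application into its own
-- service class and stop any activity that reads an excessive
-- number of rows.
CREATE SERVICE CLASS reporting_sc;

CREATE WORKLOAD reporting_wl
    APPLNAME ('reports.exe')
    SERVICE CLASS reporting_sc;

-- Internal SQL (constraint checking, MQT refresh) is not counted
-- against this condition.
CREATE THRESHOLD big_readers
    FOR SERVICE CLASS reporting_sc ACTIVITIES
    ENFORCEMENT DATABASE
    WHEN SQLROWSREAD > 1000000
    STOP EXECUTION;
```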
  • Data Studio
    - Administration of Adaptive Compression: create and alter table/index for adaptive compression; provide compression actions; update Row Compression action in control flow for new options
    - Implementation, Deployment and Maintenance of Storage Groups: create, alter, drop storage groups; create, alter, drop tablespaces within storage groups; generate DDL script from physical database; generate DDL script from compare and synch; compare and synch to validate consistent deployment; analyze impacted objects
    - Implementation, Deployment and Query Analysis of Temporal Tables: create, alter, drop tables with temporal attributes; generate DDL script from physical database; generate delta DDL script from compare and synch; compare and synch to validate consistent deployment; deploy script to multiple servers using deployment manager; analyze impacted objects
    - Implementation, Deployment and Query Analysis of Row and Column Access Control: create, alter, drop permissions and masks for users, groups, and roles; create or promote secure functions or secure triggers; activate and deactivate row and column access control; generate DDL script from physical database; analyze impacted objects (e.g. objects needing rebind, objects made secure)
    - Support for developing Java, C, and .NET applications against a DB2 pureScale environment
    - Perform common administration tasks across DB2 pureScale members and CFs
    - Support for high speed unload utility
    - Enhanced Visual Explain, Access Plan Explorer and index advice to include jump scan and joins
  • Optim Configuration Manager
    - Recommend compression savings opportunities for tables, indexes and XML objects
    - View compression savings for all objects in the database
    - Policy management
  • Optim Performance Manager
    - Storage group to tablespace alerts
    - Storage group report with drill down to tablespace
    - Route and remap based on data tags
    - Integrated alerting and notification of DB2 pureScale members
    - Seamless view of status and statistics across all DB2 pureScale members and CFs
  • Optim Query Workload Tuner
    - Query, statistics, and tuning advice for applications on DB2 pureScale systems
  • InfoSphere Data Architect
    - Logical modeling: system-period, business-period, or bi-temporal
    - Transformation from logical modeling to physical modeling
    - Physical modeling: system-period, business-period, or bi-temporal
    - Reverse engineering from database and DDL to physical model
  • There will be new and improved tooling available in DB2 Galileo. This list just summarizes some of the enhancements that will be made and more details will be made available at GA time.
  • The high availability disaster recovery (HADR) feature supports multiple standby databases. This allows you to have your data in more than two sites, providing improved data protection with a single technology. When you deploy the HADR feature in multiple standby mode, you can have up to three standby databases in your setup. You designate one of these databases as the principal HADR standby; any other standby database is an auxiliary HADR standby. Both types of HADR standby are synchronized with the HADR primary through a direct TCP/IP connection, both types support reads on standby, and both types can be configured for time-delayed log replay. In addition, you can issue a forced or non-forced takeover on any standby. There are a few important distinctions between the principal and auxiliary standbys, however: IBM® Tivoli® System Automation for Multiplatforms (TSA) automated failover is supported only for the principal standby; to make an auxiliary standby the primary, you must issue a takeover on it manually. Also, all of the HADR sync modes are supported on the principal standby only; the auxiliary standbys always run in SUPERASYNC mode.
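A sketch of the primary-side configuration for such a multiple-standby setup; the host names, ports, and database name are hypothetical. The first entry in hadr_target_list designates the principal standby:

```sql
-- CLP commands on the primary (SALES, hosts and ports are illustrative).
-- standby1 becomes the principal standby; standby2 is an auxiliary
-- and will run in SUPERASYNC mode regardless of HADR_SYNCMODE.
UPDATE DB CFG FOR sales USING
    HADR_TARGET_LIST standby1.example.com:4000|standby2.example.com:4000
    HADR_LOCAL_HOST  primary.example.com
    HADR_LOCAL_SVC   4000
    HADR_SYNCMODE    SYNC
```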
  • DB2 pureScale is a new optional DB2 feature that reduces the risk and cost of business growth by providing unlimited capacity, continuous availability, and application transparency. DB2 pureScale allows you to have multiple database servers in a system that all share a common set of disks. The system was developed in partnership with IBM Power Systems and helps DB2 both scale out and stay up. Key features of DB2 pureScale:
    - Unlimited Capacity: buy only what you need, and add capacity as your needs grow. We’ll discuss how this applies not only to your IT infrastructure but also to your licensing costs.
    - Application Transparency: avoid the risk and cost of application changes. DB2 pureScale will help you scale without expensive and risky application changes.
    - Continuous Availability: deliver uninterrupted access to your data with consistent performance. DB2 has learned from the undisputed gold standard of reliability – System z – and based DB2 pureScale on the z architecture that businesses have learned to trust for their most critical systems.
  We’ll be walking through how DB2 pureScale can help your business and how DB2 pureScale works.
  • Based on the industry-leading System z data sharing architecture, DB2 pureScale integrates IBM technologies to keep your critical systems available all the time. It includes: Automatic workload balancing to ensure that no node in the system is overloaded. DB2 will actually route transactions or connections to the least heavily used server. This workload balancing is hidden from the end user, and even from applications, by having the DB2 client handle all the workload balancing. The client will periodically check the workload levels and re-route transactions to different servers. The workload balancing can occur at either the transaction or connection level. Transaction-level support was added because many customers and ERP systems use connection pooling, and without it workloads might never be moved. DB2 pureScale is built on the most reliable UNIX system available – Power Systems. Other platforms will be available in the future. DB2 and Power Systems worked very closely on DB2 pureScale to ensure that it is optimized for AIX at all levels, be it memory, networking or storage. The technology for globally sharing locks and memory is based on technology from z/OS, which has a great track record of being the most reliable and scalable architecture available. Tivoli System Automation has been integrated deeply into DB2 pureScale. It is installed and configured as part of the DB2 installation process, and DBAs and system administrators never even know it’s there. The DB2 fixpaks will even include and apply any Tivoli updates, so DBAs and system administrators never need to understand another software product. The networking infrastructure leverages InfiniBand, and all additional clustering software is included as part of the DB2 pureScale installation. This technology has allowed us to avoid many scaling problems other vendors have run into. The core of the system is a shared disk architecture.
  • DB2 pureScale has a large technology demonstration being announced later this quarter that demonstrates the great scaling that can be achieved as more members are added. The technology can clearly scale beyond a simple 4-member configuration. This near-linear scaling will allow a business to allocate as much capacity as it needs without fear of the clustering technology failing. No other vendor can demonstrate scaling beyond a hundred members. Not all vendors are upfront with their scaling; some don’t even allow scaling numbers to be published. Instead of showing the scaling, they just put up a large number and never really show you what you could achieve with less hardware. Even worse, they tend to flat-line at 4 servers because their architecture isn’t designed to scale, or to scale without a lot of work as you add capacity. The workload that the demonstration ran is described below (note: only go into detail if asked). It demonstrated transparent application scaling with a Web commerce workload: read-mostly, but not read-only, and the application is not cluster-aware, so there was no routing of transactions to members. The original goal was to stop when the scalability hit 80%, or when the member count hit triple digits, whichever came first. The results of this 112-member test show near-linear scaling even out to 112 members in the cluster. Up to 64 members, the scalability (compared to the 1-member result) is still above 90%, and at 112 members the scalability was 81%. Note that this is a validation of the architecture and includes some capabilities under development that will not be in the December GA code. Other vendors have not publicly stated what scalability they can obtain in such a large cluster with a workload that is not cluster-aware. Note that their offerings are limited to 100 nodes.
  • pureScale now supports the DB2 SET WRITE command. The SET WRITE command allows a user to suspend or resume I/O writes for a database. Typical use of this command is for splitting a mirrored database; this type of mirroring is achieved through a disk storage system. Cluster caching facilities (CFs) now support multiple low-latency, high-speed cluster interconnects. With multiple cluster interconnects on the CFs, you can connect each CF to more than one switch. Adding cluster interconnects, and adding a switch to a DB2 pureScale environment, both improve fault tolerance. A one-switch multiple cluster interconnect configuration increases the throughput of requests to the CFs; a two-switch configuration helps with both increased throughput and high availability. DB2 pureScale environments do not require multiple cluster interconnects. You can also build geographically dispersed clusters. This is not a product feature per se and is not in the official documentation; however, there is a whitepaper on it: http://www.ibm.com/developerworks/data/library/long/dm-1104purescalegdpc/index.html The article describes the geographically dispersed DB2® pureScale™ cluster (GDPC). Like the Geographically Dispersed Parallel Sysplex™ configuration of DB2 for z/OS®, GDPC provides the scalability and application transparency of a regular single-site DB2 pureScale cluster, but in a cross-site configuration that enables active/active system availability, even in the face of many types of disasters. The active/active part is important because it means that during normal operation, the DB2 pureScale members at both sites share the workload between them as usual, with workload balancing (WLB) maintaining an optimal level of activity on all members, both within and between sites. This means that the second site is not a standby site, waiting for something to go wrong.
Instead, the second site is pulling its weight, returning value for investment even during day-to-day operation. The article describes the prerequisites for a geographically dispersed DB2 pureScale cluster, followed by the steps to deploy one, as well as some of the performance implications of different site-to-site distances and different workload types. The article covers the following topics: GDPC concepts; GDPC infrastructure and prerequisites; GDPC setup and configuration; performance factors; detailed configuration steps.
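The SET WRITE usage mentioned above, as CLP commands (the storage-layer split itself happens outside DB2):

```sql
-- Suspend writes before splitting a storage-level mirror, then resume.
-- Run while connected to the database.
SET WRITE SUSPEND FOR DATABASE;
-- ... split the disk mirror at the storage layer ...
SET WRITE RESUME FOR DATABASE;
```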
  • Available only in DB2 Advanced Enterprise Server Edition. Easy to administer (powerful scripting capability)
  • Both compilers are part of the core DB2 engine. They produce runtime code of equivalent quality, so there is no preference between them. Tooling and monitoring hook in at the runtime-code level, so there are no problems either way.
  • Main Point: For more than six decades now, IBM has pioneered the development of systems that are not only optimized for performance but also maintain transactional integrity – something that is key to any large-scale, real-time business system. As early as 1962, IBM delivered PARS (Programmable Airline Reservation System), a large-scale airline reservation application, and TPF (Transaction Processing Facility), the underlying transaction operating system that became widely implemented by major airlines, credit cards, hotels, rental car reservations, police emergency response systems, and package delivery systems. In 1970, E. F. Codd of IBM Research published a paper that led to a new way for computers to manage information. Four years later, two IBMers published a paper that would become the basis for the SQL language standard. This set the stage for new, more powerful questions that could be asked of the data that lay within organizations. Continuing in its rich tradition of technology innovation, IBM has delivered top-notch transactional systems built for the mainframe, such as CICS and IMS, and tight integration with the operating systems and server hardware, such as the zSeries Parallel Sysplex architecture. On the AS/400, for instance, DB2 is implemented as part of the operating system itself, which supports single-server and multi-server parallel processing and clustering. Into the 2000s, IBM continues to rapidly innovate to provide its customers best-of-breed transactional systems optimized for the demands, expectations and evolving workloads of its customers in the 21st century.
  • DB2 – Differentiating Business Value

    1. IBM Information Management
       DB2 – Differentiating Business Value
       Les King, Director, Product Management, lking@ca.ibm.com
       May, 2012
       © 2006 IBM Corporation
    2. IBM Client Confidential – Do Not Distribute
       DB2 Overview - Agenda
       - Reducing Storage Costs
       - A hybrid database with pureXML
       - Industry unique clustering for storage optimization, performance & scalability
       - Autonomics, Workload Management and Advanced Tooling – let DB2 do the work
       - High Availability and Extreme Scalability for all workloads
       - Protecting your existing application investment
       - The form-factor of your choice
       - Security, Data Governance & Regulatory Compliance
       - Going forward
       Data Server Innovation to Deliver True Business Value
    3. DB2 – A continuous delivery of content !!
       - DB2 10.1 – Just GA’d in 2Q 2012
       - DB2 9.8 – Introduction of DB2 pureScale
       - DB2 9.7 – About 2/3 of our customers are here
       - DB2 9.5 – About 1/3 of our customers are here
       - DB2 9.1 – Coming up to end of support
    4. Compression – Reducing Overall Storage Costs
    5. Breakthrough Savings with Adaptive Compression – Lower Storage Costs; Lower Administration Costs
       Compression timeline: DB2 9.1 – Table Compression; DB2 9.5 – On-line Enablement of Compression; DB2 9.7 – Temp Space & Index Compression; Galileo – Adaptive Compression
       - Adaptively apply both table-level compression and page-level compression
       - Table re-orgs not required to maintain high compression
       - Compress archive logs
    6. Row Compression – Reduces the cost of data storage
       Example: the rows “Fred, Dept 500, 10000, Plano, TX, 24355, …” and “John, Dept 500, 20000, Plano, TX, 24355, Site 3” compress to “Fred, (01), 10000, (02), …” and “John, (01), 20000, (02), …”, where the dictionary maps 01 = “Dept 500” and 02 = “Plano, TX, 24355”. The dictionary contains repeated information from the rows.
       179.9 GB → 42.5 GB (76% smaller!)
    7. …. And if that’s not enough
       - Performance gains of up to 40% – all data is compressed in memory
       - Reduced outages for utilities – Backup, Reorg now run 2X-4X faster
       - 6X+ volume of data as compared to production size – Dev/Test, HA, DR, Backup repositories
    8. Improving the Best Compression in the Industry
       - Multiple algorithms for automatic index compression (unique in the industry)
       - Automatic compression for temporary tables (unique in the industry)
       - Intelligent compression of large objects and XML
    9. Adaptive Compression – What is it
       - Technology provides compression rates approaching 7X
       - Provides significant cost benefits
         - Storage reduction: acquisition cost, floor space, power and cooling
         - I/O reduction: reduced response time and improved throughput
         - Reduced backup and recovery times
       - Very simple to configure and use
       - Next generation compression is adaptive
         - Improve compression rates by up to 2X (approaching 15X)
         - Maintain compression rates with data skew
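A minimal sketch of enabling the feature (table name hypothetical). In DB2 10.1, COMPRESS YES ADAPTIVE layers page-level compression on top of the table-level, dictionary-based compression of DB2 9.x:

```sql
-- Hypothetical fact table created with adaptive compression.
CREATE TABLE sales_fact (
    order_id  INTEGER,
    ship_date DATE,
    amount    DECIMAL(10,2)
) COMPRESS YES ADAPTIVE;

-- Existing tables can be altered; newly written pages are then
-- compressed adaptively, with no reorg required to keep rates high.
ALTER TABLE sales_fact COMPRESS YES ADAPTIVE;
```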
    10. Almost 40% Improvement over DB2 9.7 Compression – When Using Offline Reorg and Compression
        Storage size: Uncompressed 54.1 GB; DB2 9.7 Compression 10.5 GB (5.2x); Galileo Adaptive Compression 6.4 GB (8.5x)
    11. Up to 60% Improvement over DB2 9.7 Compression – When Using Online Automatic Dictionary Creation (ADC)
        Storage size: Uncompressed 54.1 GB; DB2 9.7 Compression 21.2 GB (2.5x); Galileo Adaptive Compression 8.4 GB (6.4x)
    12. Adaptive Compression Shrinks your Data Storage Needs
        - Higher performance: faster queries for I/O-bound environments; faster backups
        - Lower costs: postpone upcoming storage purchases; lower ongoing storage needs; easier administration with reduced need for table re-orgs
        “Page-level dynamic compression is one of the new DB2 features that will reduce planned outages by 40% and storage savings up to 50%” —Jessica Tatiana Flores Montiel, DAFROS Multiservicios
        “Our migration from Oracle Database to DB2 resulted in a 40% storage savings. Upgrading to DB2 9.7 and index compression brought our average savings to 57%. Now adaptive compression brings our average savings to 77%, dramatic savings!” —Andrew Juarez, Lead SAP Basis / DBA, Coca Cola Bottling Company
    13. More Proof Points ….
    14. pureXML – A Hybrid Database
    15. DB2 XML - A New Generation Hybrid Data Server
        Business data in XML form managed in a relational database: high cost development, poor performance. Or: business data in XML form managed with DB2 pureXML™: streamlined development, high performance.
    16. DB2 XML - Benefits
        Without pureXML, applications shred and compose XML themselves: client mapping and retrieval code, XPath parsing, shredded content, and a proprietary catalog. With DB2 pureXML, XML is stored and retrieved natively via XPath.
        A simplified and streamlined solution:
        - No mapping code to write and maintain
        - No complex schema to manage and maintain
        - No proprietary catalog
        - No XPath parsing and result set composition
        Improved performance and flexibility; lower development and maintenance costs and faster to market
    17. DB2 XML – A First Class Citizen
        - Data Definition: create table dept(deptID int, deptdoc xml);
        - Insert: insert into dept(deptID, deptdoc) values (?,?)
        - Retrieve: select deptID, deptdoc from dept
        - Query: select deptID, xmlquery('$d/dept/name' passing deptdoc as "d") from dept where deptID <> 'PR27';
    18. SQL/XML: Use SQL to produce XML
        SELECT XMLELEMENT(NAME "Department",
                 XMLATTRIBUTES (e.dept AS "name"),
                 XMLAGG( XMLELEMENT(NAME "emp", e.firstname) )
               ) AS "dept_list"
        FROM employee e
        WHERE …
        GROUP BY e.dept;
        Available functions: XMLELEMENT, XMLATTRIBUTES, XMLFOREST, XMLCONCAT, XMLAGG, XML2CLOB, XMLNAMESPACES, XMLCAST
        Start with rows (firstname, lastname, dept): SEAN LEE A00; MICHAEL JOHNSON B01; VINCENZO BARELLI A00; CHRISTINE SMITH A00
        Produce dept_list:
        <Department name="A00">
          <emp>CHRISTINE</emp>
          <emp>VINCENZO</emp>
          <emp>SEAN</emp>
        </Department>
        <Department name="B01">
          <emp>MICHAEL</emp>
        </Department>
    19. The FLWOR Expression
        - FOR: iterates through a sequence, binds a variable to items
        - LET: binds a variable to a sequence
        - WHERE: eliminates items of the iteration
        - ORDER: reorders items of the iteration
        - RETURN: constructs query results
        FOR $movie in xmlcolumn('movies.doc')
        LET $actors := $movie//actor
        WHERE $movie/duration > 90
        ORDER by $movie/@year
        RETURN <movie> {$movie/title, $actors} </movie>
        Result:
        <movie>
          <title>Chicago</title>
          <actor>Renee Zellweger</actor>
          <actor>Richard Gere</actor>
          <actor>Catherine Zeta-Jones</actor>
        </movie>
    20. Storage Optimization and Multi Dimensional Clustering – Unique in the Industry
    21. Problem: Optimizing for Multiple Access Keys
        - Database systems try to store the records of a table in a particular order (e.g. in part number order) to enable fast/ordered retrieval – called ‘clustering’
        - Speeds up queries – but only for a single ‘key’ (aka ‘dimension’); queries involving other dimensions suffer, and clustering eventually degrades
        SELECT * FROM Sales WHERE Region = SW (clustered on Region: … NW SW SW SW …)
        - Usually does not require a page I/O when reading the next record (because it’s usually on the same page as the previous record)
        - The page I/Os that are required are sequential (efficient)
        SELECT * FROM Sales WHERE Year = 2009 (data ordered … 2009, 2010, 2010, 2010, …)
        - Usually does require a page I/O when reading the next record (because it’s usually on a different page than the previous record)
        - Each of these page I/Os is random (inefficient)
    22. Solution: Multi-Dimensional Clustering (MDC)
        - Divides the table up into ‘extents’ and ensures that each record in an extent contains the same value in all interesting dimensions
          - Extent = consecutive group of pages, big enough for efficient I/O (typically 32 pages; 4 in the example below)
        - Queries in all dimensions benefit
        - This clustering is always maintained by DB2; it never degrades
        SELECT * FROM Sales WHERE Region = SW – 2 big block I/Os to retrieve pages containing region SW; all sequential I/O
        SELECT * FROM Sales WHERE Year = 2010 – 2 big block I/Os to retrieve pages containing year 2010; all sequential I/O
        Have your cake and eat it too !
    23. MDC : Simple and Flexible Syntax
        Example 1:
        CREATE TABLE Sales (YEAR DATE, REGION CHAR(12), PRODUCT CHAR(30), …)
        ORGANIZE BY (YEAR, REGION, PRODUCT);
        Example 2:
        CREATE TABLE Sales (SALES_DATE DATE, REGION CHAR(12), PRODUCT CHAR(30), …,
          MONTH GENERATED ALWAYS AS ((INTEGER(SALES_DATE)/100)) …)
        ORGANIZE BY (MONTH, REGION, PRODUCT);
        For the query:
        select * from sales where sales_date > '2010/03/03' and sales_date < '2011/01/01'
        the compiler generates the additional predicates: month >= 201003 and month <= 201101
    24. Range Partitioning
        Allows a single logical table to be broken up into multiple separate physical storage objects
        - Each corresponds to a ‘partition’ of the table
        - Partition boundaries correspond to specified value ranges in a specified partition key
        Examples of benefits:
        - Allows for partition elimination during SQL processing
        - Allows for optimized roll-in / roll-out processing (e.g. minimized logging)
        - Allows for “divide and conquer” table management
        Without partitioning: Table_1 in Tablespace 1. With partitioning: Table_1.p1 in Tablespace A, Table_1.p2 in Tablespace B, Table_1.p3 in Tablespace C.
    25. Distributing, Partitioning, Clustering – World’s Richest Slice & Dice Capability
        CREATE TABLE ORDERS (ORDER_ID, SHIP_DATE, REGION, CATEGORY)
          IN TBSP1, TBSP2, TBSP3, TBSP4
          DISTRIBUTE BY (ORDER_ID)
          PARTITION BY (SHIP_DATE)
            (STARTING FROM ('01-01-2010') ENDING ('9-31-2010') EVERY (3 MONTHS))
          MONTH GENERATED ALWAYS AS ((INTEGER(DATE)/100))
          ORGANIZE BY (MONTH, REGION, PRODUCT)
        Data rows are distributed via hash on ORDER_ID, partitioned by range on SHIP_DATE across Tablespaces A/B/C (parts 1–3), and organized by (Month, Region, Product) within each partition.
    26. No Partitioning – Data
    27. Distribute by Hash – Divide & Conquer Parallelism (P1 P2 P3 P4)
    28. Hash + Partition by Range – Partition Elimination – Massive Parallelism with Massive IO Reduction (P1–P4, ranges 2010 and 2011)
    29. Hash + Range + MDC – High Density, High Value, Low IO Reads (P1–P4, ranges 2010 and 2011)
    30. Multi-Temperature Data Management – Increase Ability to Meet SLAs; Postpone Hardware Upgrades
        Storage pools for different tiers of storage: HOT (SSD RAID), WARM (SAS RAID), COLD (SATA RAID), ARCHIVE (Optim Data Growth)
        - For range partitions, policy-based automated movement of data
        - Higher performance: improved ability to meet SLAs while retaining a greater amount of data for analysis
        - Lower costs: embrace new lower-cost storage technology; further reduces the cost of meeting SLAs
        “The multi-temperature database management feature of DB2 V10.1 is great because the hardware world is not just RAM and hard disks. There are many types of storage options with different I/O speeds and prices. This feature allows administrators to make optimal use of these different devices, balancing expensive SSDs with cheaper SATA disks and everything in between. Using SSDs for indexes and logs and a SATA array for the data, we noticed fantastic improvements in I/O speeds, especially for synchronous reads. Additionally, the background movement of data between the storage groups is very fast.” —Thomas Kalb, CEO, ITGAIN GmbH
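A sketch of a multi-temperature layout using storage groups; the paths and names are hypothetical:

```sql
-- Hypothetical multi-temperature layout: one storage group per tier.
CREATE STOGROUP hot_sg  ON '/ssd/fs1';
CREATE STOGROUP warm_sg ON '/sas/fs1';
CREATE STOGROUP cold_sg ON '/sata/fs1';

-- Place a tablespace on the hot tier...
CREATE TABLESPACE tbsp_2012q2 USING STOGROUP hot_sg;

-- ...and later demote it as the data cools; DB2 moves the
-- underlying data in the background.
ALTER TABLESPACE tbsp_2012q2 USING STOGROUP warm_sg;
```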
31. Autonomics, Workload Management and Advanced Tooling
Intelligence and Simplicity
32. Adaptive Self-Tuning Memory Management
DB2 9 introduced a revolutionary memory tuning system called the Self-Tuning Memory Manager (STMM)
- Works on the main database memory parameters: sort, locklist, package cache, buffer pools, and total database memory
- Hands-off online memory tuning – requires no DBA intervention
- Senses the underlying workload and tunes memory based on need
- Adapts quickly to workload shifts that require memory redistribution
- Adapts tuning frequency based on workload
[Diagram: STMM control loop – a statistics collector feeds benefit-per-page models, and MIMO control algorithms compute tuning steps for each memory consumer in the DB2 server]
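Enabling STMM is a configuration change rather than code. A sketch of the usual DB2 CLP commands, assuming a hypothetical database name `mydb`:

```shell
# Turn on the self-tuning memory manager for the database:
db2 update db cfg for mydb using SELF_TUNING_MEM ON
# Let total database memory and the individual consumers float:
db2 update db cfg for mydb using DATABASE_MEMORY AUTOMATIC
db2 update db cfg for mydb using SORTHEAP AUTOMATIC
db2 update db cfg for mydb using LOCKLIST AUTOMATIC
db2 "ALTER BUFFERPOOL ibmdefaultbp SIZE AUTOMATIC"
```

With these set, STMM redistributes memory between the buffer pools, sort heap, lock list, and package cache as the workload shifts, which is what the next three charts illustrate.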
33. STMM in Action – Two Databases on the Same Box
[Chart: database memory in 4K pages over time; memory is rebalanced between the two databases when the second database starts and reclaimed when it stops]
34. STMM in Action – Dropping an Important Index
[Chart: TPC-H Query 21 average times for the 10 streams, in order of execution. After the indexes are dropped the average jumps from 959 s to 6,205 s; STMM then retunes memory, reducing it 63% to 2,285 s]
35. STMM – Comparing Different Configurations
[Chart: transactions per minute by configuration – default (no tuning) 16,713; benchmark-tuned 63,302; STMM-tuned 63,796. STMM beats the default configuration by nearly 4x and even edges out the hand-tuned benchmark system]
36. “Time Spent” Metrics (example)
Total time breaks down into wait times and processing (non-wait) time. Default wait-time metrics include:
- Bufferpool read wait / bufferpool write wait
- Direct I/O read wait / direct I/O write wait
- Lock wait
- Agent wait
- WLM queue wait
- FCM send wait / FCM receive wait
- Network send wait / network receive wait
- Log write wait / log buffer insert wait
37. “Component Time” Metrics (example)
38. Health Monitor
[Screenshot: health dashboard listing contexts – Production, Web, eCommerce, Support, Retail, New York, Los Angeles, Accounts, Marketing, Test, Development – with critical and warning alert counts per context and per-system status columns such as CPU usage, disk usage, memory usage, logging, SQL monitoring, connections, and transactions]
39. Meet Your Service Level Agreements
Optimize performance with workload management
- Create controls ahead of time
- Override them on the fly
- Adjust to changing priorities throughout the day
Lower your costs by automating resource allocation and utilization
- Control both applications and users
- Establish controls based on business priority
Workload management
- Part of the database engine
- Request management
- Time-based resource management
40. Summary of Key WLM Features
- DB2 Service Class – serves as the primary point of resource control for executing work; acts as the point of integration with AIX WLM for work being done within the database
- DB2 Workload – serves as the primary point of control for submitters of work; acts as the primary router of work to a specific DB2 Service Class
- DB2 Threshold – provides limits to control the behaviour of database activities based on predictive and reactive elements; provides limits to control the rate of concurrency for database activities
- DB2 Work Action Set – provides the ability to discriminate between different types of database activities for service subclass mapping or for DB2 Threshold assignment
- DB2 WLM Monitor and Control capabilities – new table functions, event monitors, and stored procedures to provide monitoring and control mechanisms for DB2 WLM
41. Workload Management – Enhanced Thresholds
- Rows read – does not include index access; checked on a user-configurable time interval; scopes: database, work action set, service class, workload
- Processing time (CPU) – scopes: database, work action set, service class, workload; calculated on a user-specified check interval
- Aggregate system temp – controls overall aggregate temp space within a service class
Tiered service class model
- Use in service class thresholds to remap work to a new class
- Based on processing time or number of rows read
Defining thresholds on the workload domain
- Estimated SQL cost, SQL rows returned, activity total time, SQL temp space
- Rows read
- Processing time
Bufferpool sensitivity to I/O priority
- Introduces high/medium/low bufferpool priority for a service class
- Pages are swapped out based on the priority they were fetched under
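The WLM objects summarized on the previous two slides are created with ordinary DDL. A minimal sketch, assuming illustrative names, an illustrative application, and an illustrative limit (none of these are from the deck):

```sql
-- Service class: the point of resource control for executing work.
CREATE SERVICE CLASS reporting_sc;

-- Workload: routes work from a known application into that class.
CREATE WORKLOAD reporting_wl
    APPLNAME('report_app')
    SERVICE CLASS reporting_sc;

-- Reactive threshold: stop any activity in the class that reads
-- more than 10 million rows (index access is not counted).
CREATE THRESHOLD big_scan
    FOR SERVICE CLASS reporting_sc ACTIVITIES
    ENFORCEMENT DATABASE
    WHEN SQLROWSREAD > 10000000
    STOP EXECUTION;
```

Swapping `STOP EXECUTION` for `CONTINUE` plus an event monitor turns the same threshold into a purely observational control.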
42. Accelerate Value for New Features
Increase Ability to Meet SLAs; Lower Administration Costs
- Updated database administration solutions: IBM Data Studio, InfoSphere Data Architect
- Updated performance management solutions: InfoSphere Optim Performance Manager, InfoSphere Optim Query Workload Tuner, InfoSphere Optim Configuration Manager
- Higher performance – immediate support for new performance features; enhanced Visual Explain, Access Plan Explorer and Index Advice; Extended Insight identifies the source of performance issues
- Lower costs – immediate support for new time-saving features (incl. Temporal, Multi-Temperature Data Management, and Row and Column Access Control); IBM solutions are integrated and consistent
43. New and Enhanced Tooling
- IBM Data Studio 3.1.1 – merges functionality from Data Studio, Optim Development Studio and Optim Database Administrator; includes all functionality available in Control Center, and more; supports DB2 Galileo features
- Optim Query Workload Tuner – tunes multiple queries in parallel
- IBM Data Studio Console – replaces Health Center
- Optim Performance Manager – monitors workloads and events
- Workload Manager (WLM) – replaces some Query Patroller and Governor functionality
Adding additional value to Advanced Enterprise Edition
44. High Availability and Extreme Scalability
Optimized for the workload
45. HADR Now Supports Multiple Standby Servers
Increase Ability to Meet SLAs; Disaster Recovery
- HADR now supports more than one standby server
- If the primary server fails, the principal standby takes over
- If the principal standby then fails, DB2 can switch to an auxiliary standby
- An auxiliary standby can provide complete offsite availability while maintaining the speed of a local standby
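Multiple standbys are configured through the `hadr_target_list` database configuration parameter introduced in DB2 10.1: the first entry is the principal standby, later entries are auxiliaries. A sketch with hypothetical database and host names:

```shell
# Principal standby first, then auxiliary standbys (host:port, '|'-separated):
db2 update db cfg for sales using HADR_TARGET_LIST \
    "standby1.example.com:4001|aux1.example.com:4002|aux2.example.com:4003"
db2 start hadr on db sales as primary
```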
46. DB2 pureScale – OLTP Workloads
- Unlimited capacity – buy only what you need, add capacity as your needs grow
- Application transparency – avoid the risk and cost of application changes
- Continuous availability – deliver uninterrupted access to your data with consistent performance
Learning from the undisputed gold standard… System z
47. DB2 pureScale Architecture
- Automatic workload balancing
- Cluster of DB2 members running on Linux or Power servers
- Leverages the global lock and memory manager technology from z/OS
- Integrated Tivoli System Automation
- InfiniBand network & DB2 Cluster Services
- Shared data
48. DB2 pureScale: Near-Linear Scaling for OLTP Workloads
[Chart: throughput vs. number of members in the cluster – over 95% scalability at 2, 4, 8, 16 and 32 members; 91% at 64 members; 87% at 88 members; 81% at 112 members]
49. DB2 pureScale Enhancements
Increase Ability to Meet SLAs; Easily Add or Remove Capacity
Further improving IBM's shared-disk cluster capability:
- NEW! Workload management for DB2 pureScale
- NEW! Multiple database support – easy multi-tenancy
- NEW! Range partitioning support
- NEW! Additional backup/restore options
- NEW! Support for 10-gigabit Ethernet
- NEW! Support for multiple InfiniBand adapters and switches
- Configurable geographically dispersed clusters
“Vormetric’s integration with DB2 pureScale GPFS provides IBM customers with a fantastic combination of Vormetric Data Security with pureScale availability, capacity and scalability. Improved performance and availability with data security offers our mutual customers a phenomenal solution.”
—Todd Thiemann, Senior Director, Product Marketing, Vormetric, Inc.
50. Real-Time Data Warehousing
Faster Business Decisions; More Accurate Business Decisions
- Continuous feed of data
- Parallel processing
- Supports multiple connections
- Higher performance – faster availability of data; minimal impact on query performance; no downtime (even for large volumes of data)
- Lower costs – costs less than solutions outside the database; reduced infrastructure costs
“You can now continuously feed data into your data warehouse at a high rate even whilst you are running queries against the tables in your data warehouse. DB2 10 represents a greatly strengthened offering for the data warehouse market.”
—Ivo Grodtke, LIS.TEC GmbH
51. Active–Active Warehousing
- Full database or subset
- Bi-directional or uni-directional
- Integrated packaging
- Multiple standbys
- Read-only or read/write
- Time delay
- Seamless application failover
[Diagram: primary DB2 warehouse (16 partitions spread across pSeries servers on a storage area network) feeding two standby warehouses via Q Replication of the database log buffers; DB2 client connections reroute from the primary to a standby on failover]
52. Application Portability
Protection of existing investment
53. Move Your Applications to DB2 … As-Is: Proven Results
- Use the Oracle skills you have with DB2
- Simple drag and drop of schemas to DB2 – achieve high productivity
- Integrated, cross-platform tools – deliver high performance
- Applications moved to DB2 run with full native execution – 98%+ of application code runs as-is
- IBM can rapidly assess your application
54. PL/SQL in DB2
- Built-in PL/SQL compiler
- Source-level debugging and profiling
[Diagram: Data Studio editor, debugger and profiler on the client side; PL/SQL and SQL PL compilers feed a unified SQL runtime engine inside the DB2 server]
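With the Oracle compatibility mode enabled (the `DB2_COMPATIBILITY_VECTOR=ORA` registry setting, applied before the database is created), DB2's built-in compiler accepts Oracle-style PL/SQL directly. A small sketch using features the deck itself lists as supported (`RAISE_APPLICATION_ERROR`, `NUMBER`); the procedure and table names are illustrative:

```sql
-- Compiled natively by the DB2 PL/SQL compiler; no translation step.
CREATE OR REPLACE PROCEDURE raise_salary(
    p_empno IN NUMBER,
    p_pct   IN NUMBER
)
IS
BEGIN
    UPDATE emp
       SET salary = salary * (1 + p_pct / 100)
     WHERE empno = p_empno;
    IF SQL%ROWCOUNT = 0 THEN
        RAISE_APPLICATION_ERROR(-20001, 'No such employee');
    END IF;
END;
```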
55. Debugging PL/SQL in DB2
[Screenshot]
56. Oracle Compatibility Layer in DB2
- Aggressive delivery into DB2 9.7 ship vehicles
- Significant reduction in the cost and risk of application migration from Oracle to DB2: many applications simply work as-is, with equal or better performance; now over 98% compatibility with DB2 10
- Prioritized focus on:
  – Extending reach – eliminating show-stoppers; more client APIs to increase the number of applications which can be enabled to DB2 in an enterprise
  – Improving the “out of the box” experience – popular features which reduce effort significantly; 98% → 99% compatibility means a 50% reduction in effort
- Utilize with partners and community – support for Oracle Forms via multiple partners; support for more Oracle packages via developerWorks
57. Average PL/SQL Compatibility Moves Above 98%
Easily move from the more expensive Oracle Database; leverage Oracle skills with DB2

Release | Feature | Benefit
9.7.1 | SUBSTRB | Increase compatibility
9.7.1 | UDF parameters: INOUT | Increase compatibility
9.7.1 | Improved BOOLEAN | Increase compatibility
9.7.1 | Conditional compilation | Enhancement
9.7.1 | Basic DPF support | Broaden coverage
9.7.1 | OCI support | Broaden coverage
9.7.2 | UDF parameters: DEFAULT | Increase compatibility
9.7.2 | Obfuscation | Enhancement
9.7.2 | NCHAR, NVARCHAR, NCLOB | Increase compatibility
9.7.3 | NUMBER performance | Performance
9.7.3 | Runtime “purity level” enforcement | Increase compatibility
9.7.3 | RATIO_TO_REPORT function | Increase compatibility
9.7.3 | RAISE_APPLICATION_ERROR | Increase compatibility
9.7.3 | Small LOB compare | Increase compatibility
9.7.4 | Multi-action trigger & update-before trigger | Increase compatibility
9.7.4 | Autonomous transaction improvements | Increase compatibility
9.7.4 | LIKE improvements, LISTAGG | Increase compatibility
9.7.4 | ROW & ARRAY of ROW JDBC support | Increase compatibility
9.7.5 | Pro*C support | Increase compatibility
9.7.5 | Nested complex objects | Increase compatibility
10 | Local procedure definitions | Increase compatibility
10 | Local type definitions | Increase compatibility
10 | PL/SQL performance | Performance

Reliance Life Insurance: “The total cost of ownership with DB2 running on IBM systems is almost half the cost of Oracle Database on Sun systems.”
Banco de Crédito del Peru: “We switched from Oracle Database to IBM DB2 and cut our costs in half, while improving performance and reliability of business applications.”
JSC Rietumu Banka: moved from Oracle Database to IBM DB2 using the compatibility features; 3–30x faster query performance; 200% improvement in data availability.
* Based on internal tests and reported client experience from 28 Sep 2011 to 07 Mar 2012.
58. Workload Optimized Systems
One size does NOT fit all…
59. IBM – Proven Track Record of Optimizing Systems
- 1950s–1960s: TPF – airline reservation system; IMS & S/360 – transaction & database
- 1970s–1980s: System/38 and AS/400 – integrated application and data serving; DB2 & S/370 – online transaction processing
- 1990s: IMS, CICS, and DB2 Parallel Sysplex – high-scale application and data serving
- 2000s: IBM Smart Analytics System – high-scale business intelligence and relational/XML data warehousing; DB2 pureScale on PowerHA – high-scale database; WebSphere Edge Server – high-scale web application serving; DataPower – XML & web services appliance
60. Workload Optimized Right Out of the Box
- Up to 3x database throughput improvement on a brokerage application
- System configuration: POWER7 (Model 9117-MMB), DB2 9.7 FP1, 4.6 TB SSD + 38.4 TB 15K HDD
61. IBM Smart Analytics System – For Operational Analytics
Leveraging DB2 10 (coming soon). Faster time to value, faster business results.
- 5600 – based on System x; designed for business analytics workloads; optional solid-state disk to reduce data latency
- 5710 – based on System x; cost-effective solution for analytics, BI and reporting; compact, integrated single analytics solution; available for mid-market
- 7700 – based on POWER7 servers; scales to hundreds of terabytes of data; extract insights from untapped information
- 7710 – based on a POWER7 server; a single-server warehousing and analytics solution for production data warehouses under 10 TB, and for development and non-production use
- 9600 – based on System z; advanced query/workload management; database designed and optimized for the system; disk controller optimized to reduce data latency
Meeting clients where their information is
62. Form-Factor Coverage
- InfoSphere Warehouse (flexible deployment) – deployed on the platform of choice (Linux, UNIX, Windows, on DB2, z, or Power); sophisticated architectures and workloads; mixed updates and queries; configurable/tunable platform; no-copy analytics; software-only solution, configurable/customizable
- IBM Smart Analytics System (expert-optimized, configurable system) – highly optimized and adaptable platform; sophisticated architectures and workloads; mixed updates and queries; configurable/tunable platform; no-copy analytics; workload optimized to simplify, accelerate value and reduce cost; for customers with operational analytics requirements
- IBM Netezza (warehouse appliance) – fully integrated, high-performance analytical appliance; low cost of management; rapid deployment; industry vertical applications; low touch; for customers who want a warehouse appliance for deep analytics
63. Security, Data Governance & Regulatory Compliance
A complete set of capabilities
64. Security, Regulatory Compliance and Data Governance
Components of a data compliance offering, leveraging DB2, Optim and Guardium:
- Audit – audit updates made to data within the database, with minimal impact on performance; applicable US regulations: Sarbanes-Oxley, HIPAA, PCI…
- Encryption – for data at rest, backup data, and data within the database; encryption on disk, on backups, and on the wire
- Archive strategy – keep data for a specified period of time; includes the audit record repository as well as sensitive data
- Test database creation – “changing” sensitive data while maintaining production-like data and referential integrity, for development and test
- Vulnerability detection – proactively detect areas of vulnerability; important to ensuring compliance of all the other pillars
- Access control – restrict and manage access in DB2: individual users; map the database to business security models; SECADM; label-based access control; roles; fine-grained access control and masking; identity assertion and trusted context
65. Conclusion
Continued investment; consistency of focus
66. Accelerating an Information-Led Transformation
IBM has invested $12B in R&D and acquisitions: Cognos, SPSS, ILOG, InfoSphere, DB2, Informix, Netezza, FileNet, Optim
67. IBM DB2 Technology Roadmap
Investment, Innovation and Industry-Leading Capabilities
[Roadmap chart spanning 2011 through 2014 and beyond; the column layout is garbled in this extraction, but the recoverable items are:
- DB2: Oracle compatibility enhancements; temporal query; fine-grained access control; 2x compression; HADR multiple standby; simple integrated disaster recovery (logical HADR); SSD and PCM memory hierarchy; reorg-free database; transparent archival; Java object cache; Hibernate integration; cross-database de-duplication; XML; “unbreakable” DB2; seamless application recovery; continuous application availability; built-in encryption; no-SQL, triple store, RDF, etc.
- DB2 pureScale: geographic cluster; multiple CF HCA support; range-partitioned tables; tablespace recovery; RDMA over 10Gb Ethernet for Power; online rolling upgrades; multiple-database support; incremental backup; explicit hierarchical locking; virtual database and tenant support; multitenancy support
- Consolidation platform / Database as a Service: virtualized applications; web workload deployment; deployment into DB2 pureScale; performance improvements for Tier 2 and below applications; common platform with SMAS; DbaaS for Tier 1; single model for Tiers 1–3; bare-metal DbaaS deployment of pureScale
…and more]
68. DB2 – Differentiating Business Value
Les King
Director, Product Management
lking@ca.ibm.com
May 2012
© 2006 IBM Corporation
