IBM DB2 Analytics Accelerator Trends & Directions by Namik Hrle

IBM DB2 Analytics Accelerator has drawn a lot of attention from DB2 for z/OS users. In many respects it presents itself as just another DB2 access path (but what a powerful one!), and its deep integration into DB2, together with application transparency, makes it one of the most exciting DB2 enhancements in years. Powered by the Netezza engine, the IBM DB2 Analytics Accelerator complements DB2 with industry-leading performance for data-intensive, complex queries, turning DB2 into a database management system that delivers the best of both worlds: transactional as well as analytical workloads. This presentation brings the latest news from IDAA development and shows the trends and directions in which the technology is heading.

Published in: Data & Analytics, Technology

IBM DB2 Analytics Accelerator Trends & Directions by Namik Hrle

  1. #IDUG IBM DB2 Analytics Accelerator Trends and Directions. Namik Hrle, IBM. April 16, 2013 | Platform: DB2 for z/OS. © 2014 IBM Corporation
  2. Agenda
     ■ What is IBM DB2 Analytics Accelerator
     ■ Fast evolution of DB2 Analytics Accelerator
     ■ Static SQL support
     ■ Workload balancing across multiple accelerators
     ■ Incremental update enhancements
     ■ High Performance Storage Saver improvements
     ■ Easy workload acceleration eligibility assessment
     ■ More acceleration, improved functionality, simpler management
  3. What Is IBM DB2 Analytics Accelerator?
     [Architecture diagram: applications reach DB2 through the application interfaces (standard SQL dialects); DBA tools, the z/OS console and similar reach it through the operation interfaces (e.g. DB2 commands). Inside DB2 sit the Log Manager, IRLM, Buffer Manager and Data Manager, with the IBM DB2 Analytics Accelerator attached.]
     ■ System z: superior availability, reliability, security, workload management, OLTP performance ...
     ■ Accelerator, powered by PDA: true appliance, industry leading ease of performance
     ■ Uniform DB2 service, maintenance, database administration, ...
     ■ Uniform and transparent access for transactional and analytical applications
  4. IBM zEnterprise and DB2 Analytics Accelerator: Driving Revolutionary Change
     The hybrid computing platform on zEnterprise for transaction processing and analytics workloads.
     DB2 Analytics Accelerator and DB2 for z/OS: a self-managing, hybrid, workload-optimized database management system that runs every query workload in the most efficient way, so that each query is executed in its optimal environment for greatest performance and cost efficiency.
     ➔ Supports transaction processing and analytics workloads concurrently, efficiently and cost-effectively
     ➔ Delivers industry leading performance for mixed workloads
  6. Fast Evolution of IBM DB2 Analytics Accelerator
     • Version 1 (Nov 2010)
       – IBM Smart Analytics Optimizer
       – In-memory, column-store, multi-core and SIMD algorithms
       – Discontinued and replaced by IBM DB2 Analytics Accelerator
     • Version 2 (Nov 2011)
       – New name: IBM DB2 Analytics Accelerator
       – Incorporates Netezza query engine
       – Preserves key V1 value propositions and adds many more
     • Version 3 (Nov 2012)
       – Better performance, more capacity
       – Incremental update
       – High Performance Storage Saver
     • Version 4 (Nov 2013)
       – Much broader acceleration opportunities
       – More enterprise features
  7. IDAA V3 Highlights: generally available since November 2012
     ■ Propagating DB2 changes to the accelerator as they happen: Incremental Update
     ■ Reducing disk storage cost by archiving data in the accelerator while maintaining the excellent performance for analytical queries: High Performance Storage Saver
     ■ Workload Manager integration
     ■ Automatic detection of the need to refresh data in the accelerator
     ■ More query routing control for applications (all, eligible)
     ■ More query offload (e.g. DB2 OLAP functions)
     ■ Speeding up data refresh and reducing associated CPU cost on System z (1)
     ■ Accelerating in-database transformation (1)
     ■ Enhancing high availability and scaling out (1)
     ■ Improving performance of queries that generate very large result sets (1)
     ■ Supporting multi-byte EBCDIC data encoding (phase 1) (1)
     ■ Increasing capacity to more than 1 petabyte (1)
     ■ Support for SAP workloads (1)
     (1) – features retrofitted to V2
  8. IDAA V3 Highlights: additions since GA
     ■ Additional query engine: PureData System for Analytics N2001
     ■ Support for Netezza operating system 7
     ■ Further reduction of CPU time associated with the IDAA load process
       – Up to 30%
       – Enhancements in DFSMS BSAM routines managing data on the USS pipes
       – z/OS PTFs:
         ● z/OS V1.12 UA68971
         ● z/OS V1.13 UA68972
         ● z/OS V2.1 UA68973
     ■ Multiple time zones in the same accelerator
     ■ Limited support for LOCAL DATE setting
     ■ Support for BITAND and TIMESTAMPDIFF functions
     ■ Support for DECFLOAT when used as implicit cast
       – e.g. when comparing different data types
     ■ Enhancements to incremental update
  9. IBM PureData System for Analytics Models Comparison
     Attribute | N1001 | N2001/N2002
     Blade type | HS22 | HX-5
     CPU sockets & cores per blade | 2 x 4 Core Intel CPUs | 2 x 8 Core Intel CPUs
     # Disks | 96 x 3.5” / 1 TB SAS (92 Active) | 288 x 2.5” / 600 GB SAS2 (240 Active)
     Raw Capacity | 96 TB | 172.8 TB
     Total Disk Bandwidth | ~11 GB/s | ~32 GB/s
     S-Blades per Rack (cores) | 14 (112) | 7 (112)
     S-Blade Memory | 24 GB | 128 GB
     Rack Configurations | ¼, ½, 1, 1 ½, 2, 3, … 10 | ½, 1, 2, 4
     FPGA Cores / Blade | 8 (2 x 4 Engine Xilinx FPGA) | 16 (2 x 8 Engine Xilinx Virtex 6 FPGA)
     User Data / Rack (assuming 4x compression) | 128 TB | 192 TB
  10. Speed Through Taking Most of Streaming Capabilities
     [Diagram, one panel per model: on each S-Blade, data streams from disk through an FPGA core (decompress, project, restrict, visibility) and a CPU core (complex joins, aggregations, etc.), with the S-Blade table cache, on its way to DB2 for z/OS. 4x compression assumed.]
     ■ N200x panel: rates shown are 130 MB/s, 65 MB/s, 325 MB/s (2.5 drives per core), 1300 MB/s, 1000 MB/s and 1000 MB/s
     ■ N1001 panel: rates shown are 120 MB/s, 480 MB/s, 500 MB/s and 800 MB/s
  11. IBM DB2 Analytics Accelerator Supports All Models
     Capacity = user data space. Effective Capacity* = user data space with compression (4x compression assumed).

     N1001 Models | 002 | 005 | 010 | 015 | 020 | 030 | 040 | 060 | 080 | 100
     Cabinets | ¼ | ½ | 1 | 1 ½ | 2 | 3 | 4 | 6 | 8 | 10
     S-Blades | 4 | 7 | 14 | 18 | 28 | 42 | 56 | 84 | 112 | 140
     Processing Units | 32 | 56 | 112 | 144 | 224 | 336 | 448 | 672 | 896 | 1120
     Capacity (TB) | 8 | 16 | 32 | 48 | 64 | 96 | 128 | 192 | 256 | 320
     Effective Capacity (TB)* | 32 | 64 | 128 | 192 | 256 | 384 | 512 | 768 | 1024 | 1280

     N2001 Models | 005 | 010 | 020 | 040
     Cabinets | ½ | 1 | 2 | 4
     S-Blades | 4 | 7 | 14 | 28
     Processing Units | 64 | 112 | 224 | 448
     Capacity (TB) | 24 | 48 | 96 | 192
     Effective Capacity (TB)* | 96 | 192 | 384 | 768

     N2002 Models | 002 | 005 | 010 | 020 | 040
     Cabinets | ¼ | ½ | 1 | 2 | 4
     S-Blades | 2 | 4 | 7 | 14 | 28
     Processing Units | 32 | 64 | 112 | 224 | 448
     Capacity (TB) | 8 | 24 | 48 | 96 | 192
     Effective Capacity (TB)* | 32 | 96 | 192 | 384 | 768
  12. Growth On Demand Example
     One rack for approximately the same price as half a rack.
     • Model name “(Minimum capacity) N2001-010”: defined as 24 TB (raw) and 50% performance
     • Model name “(Extra capacity) N2001-010”: defined as 6 TB storage (raw) and a GRA resource increment of 12.5% performance
     • There is a small premium for buying as you grow
     [Chart: Growth on Demand vs. Standalone Purchase]
  13. BITAND and TIMESTAMPDIFF Support
     BITAND
     ■ Queries using the following functions with INTEGER, SMALLINT and BIGINT data types are eligible for routing to the accelerator:
       ● BITAND
       ● BITANDNOT
       ● BITOR
       ● BITXOR
       ● BITNOT
     ■ Queries using these functions with DECIMAL, DOUBLE, REAL and DECFLOAT data types are not eligible for routing to the accelerator
     TIMESTAMPDIFF
     ■ DB2 execution of TIMESTAMPDIFF is an estimate
       ● 1 month = 30 days
       ● 1 year = 365 days
     ■ However, if the function is executed by the accelerator, the calculation accounts for leap years and months with 31 days
     ■ Therefore, the same query can return different results on DB2 and on the accelerator
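The difference between the two TIMESTAMPDIFF behaviours on this slide can be illustrated with a small Python sketch. It only applies the two rules stated on the slide (30-day months and 365-day years versus real calendar arithmetic); the function names and the sample dates are assumptions made for the illustration, and it is not how DB2 or the accelerator implement the function.

```python
from datetime import date

def estimated_days(years: int, months: int, days: int) -> int:
    """Days in a duration under the slide's estimate: 1 year = 365 days, 1 month = 30 days."""
    return years * 365 + months * 30 + days

def calendar_days(start: date, end: date) -> int:
    """Days between two dates using the real calendar (leap years, 31-day months)."""
    return (end - start).days

# Illustrative interval: 2013-01-01 to 2013-08-01 is a duration of 0 years, 7 months, 0 days.
start, end = date(2013, 1, 1), date(2013, 8, 1)
estimate = estimated_days(0, 7, 0)
exact = calendar_days(start, end)

print(f"estimate: {estimate} days, calendar: {exact} days")
```

For this interval the estimate comes to 210 days while the calendar count is 212, which is the kind of discrepancy the slide warns about when the same query runs in both places.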
  14. Limited DECFLOAT Support
     ■ IDAA still does not support explicitly defined DECFLOAT columns, nor queries that explicitly or implicitly return a DECFLOAT column, e.g. SELECT C2+'2147483648' FROM ... where C2 is an integer.
     ■ However, if DECFLOAT is used implicitly by DB2, for example when comparing different data types, that is no longer an obstacle to routing queries to the accelerator
     ■ DB2 will cast to DOUBLE instead of DECFLOAT before routing to the accelerator
     ■ Examples:
       – SELECT … FROM … WHERE C2+'2147483648' > 12
       – The OLAP functions CORR, COVAR, and COVAR_SAMP will now offload as long as none of the arguments are DECFLOAT. The result data type for these OLAP functions is actually DOUBLE, not DECFLOAT. DB2 uses DECFLOAT for processing the OLAP function, but the return data type is DOUBLE.
       – There may be other scalar functions that were previously blocked from offload because they returned a DECFLOAT result and that will now offload if the function is not in the top SELECT list.
     ■ Note that a loss of precision can occur
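The precision note at the end of this slide follows from the data types themselves: DECFLOAT(34) carries 34 decimal digits, while a DOUBLE is a 64-bit binary float with roughly 15 to 17 significant decimal digits. The Python sketch below uses `decimal.Decimal` as a stand-in for DECFLOAT and `float` as a stand-in for DOUBLE; the sample value is invented and nothing here calls DB2.

```python
from decimal import Decimal, getcontext

# Stand-in for DECFLOAT(34): 34 significant decimal digits.
getcontext().prec = 34

# Illustrative value with 26 significant digits.
exact = Decimal("1234567890.1234567890123456")

# Stand-in for the cast to DOUBLE performed before routing to the accelerator.
as_double = float(exact)

# Convert back to see what survived the cast.
round_trip = Decimal(repr(as_double))
precision_lost = round_trip != exact

print(exact)
print(round_trip)
print("precision lost:", precision_lost)
```

Whether such a loss matters depends on the query; a comparison against a small constant is unaffected, while an equality test on a long decimal value might not be.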
  15. Version 4 at a Glance
     Features shown on the slide under the headings More Query Acceleration, Enhanced Capabilities and Improved Transparency:
     ■ Static SQL
     ■ Improved scalability of Incremental Update
     ■ Automatic workload balancing with multiple accelerators
     ■ DB2 11 (2)
     ■ Better performance of Incremental Update
     ■ New RTS 'last-changed-at' timestamp (2)
     ■ Multi-row fetch from local applications
     ■ Improved performance for large result sets (2)
     ■ Automated NZKit installation
     ■ EBCDIC and Unicode in the same DB2 system and accelerator
     ■ Better access control for HPSS archived partitions
     ■ Built-in Restore for HPSS
     ■ NOT IN and ALL predicates (3)
     ■ HPSS archiving to multiple accelerators
     ■ Protection for image copies created by HPSS archiving process
     ■ FOR BIT DATA support (3)
     ■ Extending WLM support to local applications
     ■ Profile controlled special registers (2)
     ■ 24:00:00 time value (3)
     ■ Rich system scope monitoring
     ■ Improved continuous operations for Incremental Update
     ■ MEDIAN support (3)
     ■ Reporting prospective CPU cost and elapsed time savings
     ■ Refreshing an IDAA table without table lock even if incremental update is active (3)
     ■ Separation of duties for accelerator system administration operations
     ■ Static SQL and workload balancing enablement migration tool (3)
     ■ Support for N2002 hardware (3)
     ■ Incremental Update continues replicating even for tables in AREO state (3)
     Enabling new use cases:
     ■ Loading from flat file or image copy (1)
     ■ Loading in parallel to DB2 and accelerator (1)
     ■ Loading data as of any past point in time (1)
     ■ Loading data to accelerator only (1)
     (1) – delivered by a separate tool
     (2) – DB2 11 only
     (3) – IDAA V4.1.2 (PTF2 – March 2014)
  17. Static SQL Support
     ● The most requested feature since the accelerator's first release
       ➔ Presumably many customers implemented reporting workloads on System z using static SQL
     ● Well, the request is addressed in V4
       ➔ Statically bound queries on active or archived data can be routed to the accelerator
     ● New BIND options
       ➔ QUERYACCELERATION
       ➔ GETACCELARCHIVE
       ➔ The possible values match the existing special register and zparm semantics
     ● Acceleration for static queries is determined and fixed at package bind time
       ➔ Tables must be defined to an accelerator and enabled for acceleration prior to binding the package
     ● The accelerator must be active and started when the static query runs
  18. Workload Assessment Techniques for Static SQL
     [Flowchart; the steps are listed in the order they appear on the slide.]
     ■ Collect the statements from one of these sources:
       – the catalog: select collid, name, statement from sysibm.syspackstmt where explainable = 'Y' and collid = '…' and name = '…'
       – or the application: extract SQL from EXEC SQL
       – or SQL monitoring tools such as OMPE, QM, ...
     ■ Get the top 10 - 50 statements and identify acceleration candidates
     ■ Get referenced tables and views DDL (no data)
     ■ Convert and run statements as dynamic SQL
     ■ Send to IBM; EXPLAIN on IBM system; IBM produces list of statements and their acceleration eligibility
     ■ Create virtual accelerator and add relevant tables to it; run EXPLAIN using virtual accelerator; produce list of statements and their acceleration eligibility
     ■ Follow existing workload assessment procedure – engage IBM; IBM produces standard PDF
     The three resulting paths are labelled Option A, Option B and Option C.
  20. Efficient Workload Balancing across Multiple Accelerators
     V3
     • No workload balancing across multiple accelerators
       ➔ DB2 selects an eligible accelerator in a non-deterministic fashion
       ➔ All eligible queries are routed to that accelerator
     • Workaround involves manually distributing tables across accelerators
       ➔ All tables can be defined and loaded in every accelerator but the sets of enabled tables differ across the accelerators
       ➔ Requires careful planning and very good understanding of the query workload
       ➔ Inflexible
       ➔ Suboptimal use of the combined accelerator resources
       ➔ High availability procedures must include steps to enable tables that are normally disabled in the failover accelerator
  21. Workload Balancing across Multiple Accelerators in V3
     [Diagram: DB2 holds tables F1, F2 and D1 to D4. Accelerators X and Y each hold all six tables, with tables marked as disabled on each accelerator. Queries of the form "Select … from F1, Dx …" go to one accelerator and queries of the form "Select … from F2, Dx ..." go to the other.]
     … but only assuming uniform distribution of queries across F1 and F2
  22. Workload Balancing across Multiple Accelerators in V3
     [Diagram: same setup as the previous slide, with DB2 and Accelerators X and Y each holding tables F1, F2 and D1 to D4, and tables marked as disabled on each accelerator. The workload now consists of "Select … from F1, Dx …", "Select … from F2, Dx ..." and three queries of the form "Select … from Dx ...".]
  23. Efficient Workload Balancing across Multiple Accelerators
     V3
     • No workload balancing across multiple accelerators
       ➔ DB2 selects an eligible accelerator in a non-deterministic fashion
       ➔ All eligible queries are routed to that accelerator
     • Workaround involves manually distributing tables across accelerators
       ➔ All tables can be defined and loaded in every accelerator but the sets of enabled tables differ across the accelerators
       ➔ Requires careful planning and very good understanding of the query workload
       ➔ Inflexible
       ➔ Suboptimal use of the combined accelerator resources
       ➔ High availability procedures must include steps to enable tables that are normally disabled in the failover accelerator
     V4
     • Automated workload balancing across multiple accelerators
     • Accelerators notify DB2 about their resource utilization
       ➔ Utilization is determined from the accelerator's capacity and request queue length
       ➔ Regularly sent to all the attached DB2 systems via the heartbeat signal
       ➔ DB2 checks the utilization of every eligible accelerator and routes the query to the least utilized one
     • Migration was not smooth at GA, but this is addressed in PTF2
       ➔ In order to benefit from workload balancing the associated tables needed to be redefined to the accelerators
       ➔ Workload balancing functions across mixed V3 and V4 accelerators, but at least two of them need to be at V4
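The V4 routing decision described above can be sketched in Python. The slide only says that utilization is derived from capacity and request queue length and arrives via a heartbeat; the ratio used below, the `Heartbeat` class, the `pick_accelerator` function and the sample figures are all assumptions made for illustration, not the product's actual formula or interface.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set

@dataclass
class Heartbeat:
    """Hypothetical summary of what an accelerator reports to DB2."""
    name: str
    capacity: int       # relative processing capacity (assumed unit)
    queue_length: int   # requests currently waiting

    @property
    def utilization(self) -> float:
        # Assumed formula: queued work relative to capacity.
        return self.queue_length / self.capacity

def pick_accelerator(heartbeats: Iterable[Heartbeat], eligible: Set[str]) -> Optional[str]:
    """Return the least utilized accelerator among those that can run the query."""
    candidates = [h for h in heartbeats if h.name in eligible]
    if not candidates:
        return None  # no eligible accelerator: the query would have to run in DB2
    return min(candidates, key=lambda h: h.utilization).name

# Illustrative heartbeats from two accelerators.
latest = [Heartbeat("X", capacity=112, queue_length=20),
          Heartbeat("Y", capacity=56, queue_length=5)]

print(pick_accelerator(latest, eligible={"X", "Y"}))  # least utilized of the two
print(pick_accelerator(latest, eligible={"X"}))       # only X holds the tables
```

Because the choice is made per query from the most recent heartbeat, every table can stay enabled on every accelerator, which removes the manual table distribution that the V3 workaround required.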
  24. Workload Balancing across Multiple Accelerators in V4
     [Diagram: DB2 and Accelerators X and Y each hold tables F1, F2 and D1 to D4, with no tables disabled. Both accelerators report utilization and capacity to DB2, and each receives a mix of "Select … from F1, Dx …", "Select … from F2, Dx, ..." and "Select … from Dx ..." queries.]
  26. Incremental Update Enhancements
     V3
     • Each DB2 system using incremental update requires a dedicated replication apply agent on the accelerator
       ➔ Replication apply agent needs at least 4 GB of memory
       ➔ This limits the number of DB2 systems connected to the same N1001 accelerator to theoretically 4, but practically 2
     • If a table enabled for replication needs to be reloaded, replication is stopped for all tables
       ➔ Disruptive for continuous operations
     • Log reader returns all the log records to the capture agent, which needs to select only those belonging to the tables that are enabled for replication
     V4
     • A single replication apply agent can service up to 10 connected DB2 systems
       ➔ Still, exercise common sense to avoid overloading the accelerator
     • Reloading a table enabled for replication does not affect any other table; they continue to be replicated
       ➔ Particularly useful when replicated tables were changed via non-logged operations
     • Log reader filters the relevant log records during the retrieval
       ➔ Enabled by IFI enhancements in DB2 11 (and ported back to DB2 10 and IDAA V3)
       ➔ Better performance and lower overall CPU utilization on System z
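The last V3/V4 pair on this slide, where the filtering of log records happens, can be pictured with a toy model. Everything in the Python sketch below (the `LogRecord` type, the two functions, the sample log) is hypothetical and only contrasts filtering after retrieval with filtering during retrieval; it does not reflect the actual IFI interface.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple

class LogRecord(NamedTuple):
    table: str
    change: str

def capture_side_filter(log: Iterable[LogRecord], replicated: Set[str]) -> Tuple[List[LogRecord], int]:
    """V3 style: the reader hands over every record, the capture agent discards the rest."""
    handed_over = list(log)
    kept = [r for r in handed_over if r.table in replicated]
    return kept, len(handed_over)

def reader_side_filter(log: Iterable[LogRecord], replicated: Set[str]) -> Tuple[List[LogRecord], int]:
    """V4 style: only records of replicated tables leave the log reader."""
    kept = [r for r in log if r.table in replicated]
    return kept, len(kept)

# Illustrative log: only ORDERS is enabled for replication.
sample_log = [LogRecord("ORDERS", "insert"), LogRecord("AUDIT", "insert"),
              LogRecord("ORDERS", "update"), LogRecord("SESSIONS", "delete")]

v3_records, v3_transferred = capture_side_filter(sample_log, {"ORDERS"})
v4_records, v4_transferred = reader_side_filter(sample_log, {"ORDERS"})

print(v3_transferred, v4_transferred)
```

Both variants replicate the same changes; the difference is how many records have to be passed to the capture agent, which is where the slide locates the CPU saving.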
  28. High Performance Storage Saver: major saving of host disk space for historical data
     [Timeline chart: seven years of data by quarter (1Q to 4Q), from Year -7 to the current year, split into Historical Data and Current Data.]
     ■ One quarter = 3.57% of 7 years of data
     ■ One month = 1.19% of 7 years of data
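The shares quoted on this slide follow from simple division, assuming data volume is spread evenly over the seven years (an assumption of this sketch, not a statement from the slide). A minimal Python check:

```python
YEARS_RETAINED = 7

# Share of the whole table held by a single quarter or a single month,
# assuming every period holds the same amount of data.
quarter_share = 1 / (YEARS_RETAINED * 4)
month_share = 1 / (YEARS_RETAINED * 12)

print(f"one quarter: {quarter_share:.2%} of {YEARS_RETAINED} years of data")
print(f"one month:   {month_share:.2%} of {YEARS_RETAINED} years of data")
```

Under that assumption one quarter is 1/28 of the data and one month is 1/84, so keeping only the most recent period on host disk and archiving the rest to the accelerator frees well over 95% of the space the table would otherwise occupy in DB2.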
  29. High Performance Storage Saver: storing historical data in accelerator only
     [Diagram: a query from the application goes to DB2 or to the accelerator. The table has partitions Part #1 to Part #7, split into active and historical; the accelerator holds all of them, while the historical partitions are no longer present on DB2 storage.]
     • Time-partitioned tables where:
       – only the recent partitions are used in a transactional context (frequent data changes, short running queries)
       – the entire table is used for analytics (data intensive, complex queries)
     • High Performance Storage Saver’s “Archive” process:
       – Data is loaded into the accelerator if not already loaded
       – Automatically takes an image copy of each partition to be archived
       – Automatically removes data from the archived DB2 table space partitions
       – DBA starts archived partitions as read-only
  30. High Performance Storage Saver Enhancements at a Glance
     V3
     • Data integrity exposure
       ➔ Inserts and updates to archived partitions are not systematically prevented
       ➔ Based on the usage scenarios such changes are not supposed to happen, but there is no guarantee that they will not
     • Image copies generated during the archiving process have special importance
       ➔ Need to be handled with special care
     • Restoring archived partitions is a complex procedure that must be performed manually
     • A table cannot be archived to multiple accelerators
       ➔ No appropriate support for high availability
     V4
     • Archived partitions are placed into a new PRO state that prevents data modifications
     • Several image copy enhancements
       ➔ No new image copies can be created for partitions in the PRO state
       ➔ Up to 4 image copies per partition can be created
       ➔ Naming scheme based on templates
     • Restore of archived partitions is encapsulated in an administrative stored procedure
     • A table can be archived in multiple accelerators
       ➔ Image copy used as the source for subsequent accelerators
  31. Initial Situation Before Archiving
     [Diagram: tableX with partitions 1 to n resides both in DB2 and in the accelerator. Backups of parts 1 to n exist, including at the DB2 recovery site. The application issues SELECT FROM X and a routing decision sends it to the accelerator (yes) or to DB2 (no).]
  32. Supplied Stored Procedure Encapsulates Archiving Procedure
     [Diagram: same setup as the previous slide. The application calls the stored procedure ACCEL_ARCHIVE_TABLES with a partitions specification.]
     ■ The 'partitions specification' states which tables and which partitions should be moved to the accelerator. In this particular example only the last two partitions, “n” and “n-1”, of table X should stay in DB2.
     ■ As of V4, the names and the number of image copies to be created, up to 2 local and 2 remote, are specified in the template member AQTENV.
     ■ In this particular example the table is already defined and loaded to the accelerator. You can also archive partitions first and load the active ones into the accelerator afterwards. The effect is the same.
  33. Partitions to be Archived Are Firstly Backed Up
     [Diagram: after the call to the stored procedure ACCEL_ARCHIVE_TABLES with a partitions specification, additional backups of the partitions to be archived are created alongside the existing backups of parts 1 to n, including at the DB2 recovery site. tableX still has partitions 1 to n in both DB2 and the accelerator.]
     ■ As of V4, the archiving process can generate multiple image copies according to the specification in the template.
  34. Old Partitions are Deleted from DB2
     [Diagram: after the call to the stored procedure ACCEL_ARCHIVE_TABLES with a partitions specification, tableX in DB2 holds data only in part n and part n-1, while the accelerator holds parts 1 to n. Backups of all parts exist, including at the DB2 recovery site.]
     ■ Old partitions are still present in the table, but they are empty and the disk space use is limited to the primary allocation quantity, which can be made very small.
     ■ As of V4, these partitions are set to the PRO status (PERSISTENT READ ONLY), which prevents data modifying operations. Offending applications receive -904, 00C90635.
     ■ As of V4, the PRO status for archived partitions implicitly protects image copies, i.e. no further image copies can be created.
  35. Applications Have Transparent Access: No SQL Statement Changes Needed
     [Diagram: the application issues SELECT FROM X. A routing decision sends it to DB2, where tableX holds part n and part n-1 (no), or to the accelerator, where tableX holds parts 1 to n (yes).]
     ■ Set zparm (1) or set special register (2)
       (1) Set once on global scope, without application changes
       (2) Set within the application; allows changing the scope on a per-statement level
     ■ No changes in V4
  36. Restoring Archived Partitions (new in V4)
     [Diagram: DB2 holds part n and part n-1 of tableX plus backups of parts 1 to n; the accelerator holds parts 1 to n.]
     ■ In V3, restoring archived partitions was a manual procedure. In V4, it is automated.
  37. Restoring Archived Partitions (new in V4)
     [Diagram: the application calls the stored procedure ACCEL_RESTORE_ARCHIVE_TABLES with a partitions specification; part 2 is restored from its backup into tableX in DB2, next to part n and part n-1. The accelerator holds parts 1 to n.]
     ■ The 'partitions specification' states which tables and which partitions should be restored. In this particular example only partition 2 is to be restored.
     ■ Automated procedure for restoring any number (including all) of archived partitions and making them active in DB2 again. The acceleration status does not change.
  38. Exploiting TEMPLATE Capabilities
     • New AQTENV environment variables allow up to 4 image copies:
       AQT_ARCHIVE_COPY1
       AQT_ARCHIVE_COPY2
       AQT_ARCHIVE_RECOVERYCOPY1
       AQT_ARCHIVE_RECOVERYCOPY2
     • Each value is a template specification as in the DB2 TEMPLATE utility, e.g.
       AQT_ARCHIVE_COPY1=&USERID..&DB..&TS..P&PART..&UNIQ.
     • The following requirements apply to the values:
       ● variables can be used as documented for the DB2 COPY utility
       ● variables &SEQ, &LIST, &DSNUM cannot be used for IBM DB2 Analytics Accelerator
       ● values must evaluate to qualifiers that are mapped by DFSMS to a suitable data class (as in V3 with static HLQ prefix)
       ● values must ensure uniqueness of names among all archived partitions. Recommendation: use variables &PART and &UNIQ for that purpose
       ● values must evaluate to valid z/OS data set names
     • V3 environment variable AQT_ARCHIVECOPY_HLQ is no longer supported
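To make the naming requirements on this slide concrete, the Python sketch below expands the example template and checks the result against the basic z/OS data set naming rules (at most 44 characters, period-separated qualifiers of 1 to 8 characters, each starting with a letter or national character). The expansion logic is a simplification written for this illustration, not the DB2 TEMPLATE implementation, and all substituted values (user ID, database, table space, partition numbers, unique strings) are made up.

```python
import re
from typing import Dict

def expand_template(template: str, values: Dict[str, str]) -> str:
    """Replace each &NAME. with its value; the first period after a name ends the variable."""
    return re.sub(r"&([A-Z]+)\.?", lambda m: values[m.group(1)], template)

# One qualifier: 1-8 characters, first one alphabetic or national (#, @, $).
QUALIFIER = re.compile(r"[A-Z#@$][A-Z0-9#@$-]{0,7}")

def is_valid_dataset_name(name: str) -> bool:
    """Check overall length and every qualifier of a data set name."""
    return len(name) <= 44 and all(QUALIFIER.fullmatch(q) for q in name.split("."))

TEMPLATE = "&USERID..&DB..&TS..P&PART..&UNIQ."

# Hypothetical values for three archived partitions of one table space.
names = [
    expand_template(TEMPLATE, {"USERID": "IDAAUSR", "DB": "SALESDB", "TS": "ORDERTS",
                               "PART": part, "UNIQ": uniq})
    for part, uniq in [("00001", "D2K9Q7LA"), ("00002", "D2K9Q7LB"), ("00003", "D2K9Q7LC")]
]

for name in names:
    print(name, is_valid_dataset_name(name))
print("unique:", len(set(names)) == len(names))
```

The literal "P" in front of &PART matters in this example: a qualifier consisting only of digits would not be a valid data set name qualifier, so the partition number needs a leading letter.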
  39. Archiving Table on Multiple Accelerators
     V3
     • A given table can be archived to one accelerator only
       ➔ No appropriate support for high availability including disaster recovery
     V4
     • A table can be archived on multiple accelerators
     • Archiving is performed once per accelerator
       ➔ Not a single step for multiple accelerators
     • For the accelerator on which a table is archived, column SYSACCELERATEDTABLES(ARCHIVE) is set to 'A'
       ➔ For other accelerators the column is set to 'C'
     • If ACCEL_ARCHIVE_TABLES is called for an already archived table, it also archives the table on the new accelerator by using the available image copy data
       ➔ To archive a table from image copy to an accelerator, at least one accelerator that already has this table archived must be active
  41. Workload Acceleration Eligibility Assessment
     V3
     • IBM CoE assessment procedure
       ➔ Most detailed assessment
       ➔ Dynamic SQL only
       ➔ Low intensity engagement by customers
     • Virtual accelerators
       ➔ Reports acceleration eligibility only
       ➔ Dynamic SQL only
       ➔ Run by customers, moderate effort needed
     • Optim Query Workload Tuner
       ➔ What-if analysis
     V4
     • IBM CoE assessment procedure
       ➔ Most detailed assessment
       ➔ Low intensity engagement by customers for dynamic SQL
       ➔ Moderate intensity engagement by customers for static SQL
     • Virtual accelerators
       ➔ Reports acceleration eligibility only
       ➔ Dynamic and static SQL
       ➔ Run by customers, moderate effort needed
     • Optim Query Workload Tuner
       ➔ What-if analysis
     • Accelerator modelling
       ➔ Provided by DB2 instrumentation
       ➔ Dynamic and static SQL
       ➔ Very low effort for customers
       ➔ Prospective CPU cost and elapsed time savings
       ➔ It might require follow-up with the IBM CoE assessment procedure
  42. Accelerator Modelling
     ■ Provides indicators for possible CPU and elapsed time savings if IBM DB2 Analytics Accelerator were available
       – It does not require presence of the accelerator
     ■ DB2 11 or DB2 10
     ■ Controlled by the new zparm ACCELMODEL, which can be set to YES or NO
       – Changeable online
       – If set to YES, DB2 accounting records (IFCIDs 3 and 148) include projected CPU and elapsed time savings
       – Both the zparm QUERY_ACCELERATION and the special register CURRENT QUERY ACCELERATION must be set to NONE
         ● However, EXPLAIN will still indicate if the query is eligible for acceleration and, if not, the reason why in DSN_STATEMNT_TABLE.REASON
       – As with any DB2 instrumentation, the new timers need to be formatted and reported by a monitor
     ■ For more granular and detailed analysis, such as projected cost saving per statement, request the existing IBM CoE assessment procedure or use Optim Query Workload Tuner
     ■ Functionality delivered via two DB2 10 APARs
       – PM90886: Covers the existing V3 acceleration capability
       – PM95035: Adds the new V4 acceleration capabilities, such as static SQL
         ● REBIND needed to enable acceleration modelling
  43. Accelerator Modelling as Reported by OMPE

     MEASURED/ELIG TIMES   APPL (CL1)   DB2 (CL2)
     -------------------   ----------   ----------
     ELAPSED TIME            4.830139     4.740227
      ELIGIBLE FOR ACCEL          N/A     4.442327   (1)
     CP CPU TIME             6.337894     6.336111
      ELIGIBLE FOR SECP      4.990042          N/A
      ELIGIBLE FOR ACCEL          N/A     6.329119   (2)
     SE CPU TIME             0.000000     0.000000
      ELIGIBLE FOR ACCEL          N/A     0.000000   (3)

     (1) Elapsed time that can be significantly reduced because the qualifying statements in the reported plan execution could be routed to the accelerator. If the statements are executed in parallel, the reduced elapsed time relates to the parent task only.
     (2) The part of CPU time spent on general purpose processors that can be saved to a large extent because the qualifying statements in the reported plan execution could be routed to the accelerator. If the statements are executed in parallel, the CPU saving includes the parent and all the subordinated parallel tasks.
     (3) The part of CPU time spent on specialty engine processors that can be saved to a large extent because the qualifying statements in the reported plan execution could be routed to the accelerator. If the statements are executed in parallel, the CPU saving includes the parent and all the subordinated parallel tasks.
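The report figures above can be turned into shares with a few lines of Python. This sketch only divides the "eligible for accel" values by the corresponding class 2 totals from the slide; the dictionary layout and function name are assumptions, and the resulting percentages are upper bounds on what could move to the accelerator, not predicted savings.

```python
# (DB2 class 2 total, portion eligible for acceleration), in seconds, from the OMPE report.
report = {
    "elapsed": (4.740227, 4.442327),
    "cp_cpu": (6.336111, 6.329119),
    "se_cpu": (0.000000, 0.000000),
}

def eligible_share(total: float, eligible: float) -> float:
    """Fraction of the measured time spent in statements that qualify for acceleration."""
    return 0.0 if total == 0 else eligible / total

shares = {name: eligible_share(total, eligible) for name, (total, eligible) in report.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} of class 2 time is eligible for acceleration")
```

In this sample nearly all of the class 2 elapsed and CP CPU time sits in eligible statements, which marks the plan as a strong candidate for a more detailed assessment.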
  45. Setting Acceleration Options without Application Changes
      What to accelerate? NONE | ENABLE | ENABLE WITH FAILBACK | ELIGIBLE | ALL
      Read archived data? NO | YES
      V4
      • System scope: zparms
        ➔ QUERY_ACCELERATION
        ➔ GET_ACCEL_ARCHIVE
      • Statement scope: special registers
        ➔ CURRENT QUERY ACCELERATION
        ➔ CURRENT GET_ACCEL_ARCHIVE
      • Application scope
        ➔ JDBC and ODBC applications
        ➔ BIND PACKAGE
        ➔ Application can be qualified by any identifier supported by DB2 profile tables
      V3
      • System scope: zparms
        ➔ QUERY_ACCELERATION
        ➔ GET_ACCEL_ARCHIVE
      • Statement scope: special registers
        ➔ CURRENT QUERY ACCELERATION
        ➔ CURRENT GET_ACCEL_ARCHIVE
      • Application scope
        ➔ JDBC applications: specify special registers in connection URL
        ➔ ODBC applications: specify special registers in db2dsdriver.cfg
  46. Workload Manager Integration Enhancements
      ■ Workload Manager integration introduced in V3
        – DB2 detects the WLM service class and importance level and sends them to the accelerator with each query submitted from a remote application
        – Local applications such as SPUFI, TEP3, CICS, IMS are not supported
        – The accelerator maps the importance level to a Netezza priority and alters the session prior to query execution, using the corresponding priority. The priorities of scheduled threads are adjusted as well.
          ● Changes in prioritization after query start are not reflected
          ● Netezza supports only 4 different priority levels, therefore multiple WLM importance levels have to be mapped to the same Netezza priority
      ■ V4 extends the support to local applications as well
      ■ Mapping changes – apply to both remote and local applications

      WLM Importance Level   Netezza Priority (V3)   Netezza Priority (V4)
      --------------------   ---------------------   ---------------------
      System                 Critical                Critical
      Importance 1           Critical                Critical
      Importance 2           High                    Critical
      Importance 3           Normal                  High
      Importance 4           Normal                  Normal
      Importance 5           Normal                  Low
      Discretionary          Low                     Low
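The V3 and V4 mappings from WLM importance level to Netezza priority can be written down as two lookup tables. This is an illustrative sketch only: the dictionary and function names are hypothetical, and the accelerator's internal implementation is not described in the slides.

```python
# WLM importance level -> Netezza priority, as listed in the slide's mapping table.
V3_MAPPING = {
    "System": "Critical",
    "Importance 1": "Critical",
    "Importance 2": "High",
    "Importance 3": "Normal",
    "Importance 4": "Normal",
    "Importance 5": "Normal",
    "Discretionary": "Low",
}

V4_MAPPING = {
    "System": "Critical",
    "Importance 1": "Critical",
    "Importance 2": "Critical",
    "Importance 3": "High",
    "Importance 4": "Normal",
    "Importance 5": "Low",
    "Discretionary": "Low",
}


def netezza_priority(importance, version="V4"):
    """Look up the Netezza priority for a WLM importance level (hypothetical helper)."""
    mapping = V4_MAPPING if version == "V4" else V3_MAPPING
    return mapping[importance]


# Importance levels whose Netezza priority differs between V3 and V4.
changed = [level for level in V3_MAPPING if V3_MAPPING[level] != V4_MAPPING[level]]
print(changed)
```

Comparing the two tables shows that V4 changes the priority for Importance 2, 3 and 5, so the middle importance levels no longer all collapse onto Normal.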
  47. Multi-row Fetch Support for Local Applications
      V3
      • If a cursor within an application that is locally connected to DB2 uses multi-row fetch, the query does not qualify for acceleration
        ➔ This disables acceleration for local applications that use good programming practices to improve performance and reduce CPU cost
        ➔ Remotely connected applications are not exposed to this deficiency
      V4
      • The restriction is removed
      • Queries need to specify:
        ➔ WITH ROWSET POSITIONING on PREPARE or DECLARE CURSOR
        ➔ FETCH NEXT ROWSET with FOR N ROWS clause when fetching
        ➔ Rowset size must be the same for each FETCH NEXT ROWSET
        ➔ Target host variables must be specified
        ➔ The local query must use WITHOUT RETURN (the default clause) on PREPARE or DECLARE CURSOR
        ➔ Query is not a part of an SQL PL routine
  48. System Scope Monitoring
      V3
      • Basic statistics about DB2 resources involved in communicating with the accelerator
      • Basic statistics about accelerator activity
        ➔ The data is collected by the accelerator and sent back to DB2 via the heartbeat mechanism, i.e. every 20 seconds
      • Both types of statistics are externalized by DB2 Statistics Trace in SMF 100, i.e. IFCID 2
      • Formatted and reported by traditional DB2 performance monitors
      • Displayed by the DISPLAY ACCEL command
      V4
      • Existing set of statistics is greatly enhanced by new indicators to help with:
        ➔ Performance monitoring
        ➔ Charge-back
        ➔ Capacity planning
        ➔ Problem determination
      • Many indicators are provided in pairs:
        ➔ Per accelerator
        ➔ Per DB2 connected to the accelerator
      • Includes comprehensive statistics about incremental update
      • The mechanism for collecting and externalizing the statistics stays the same
  49. System Scope Monitoring as Reported by OMPE

      Q100 FOR SUBSYSTEM ONLY                QUANTITY
      -----------------------------------    --------------------
      QUERIES SUCCESSFULLY EXECUTED          1.00
      QUERIES FAILED TO EXECUTE              1.00
      CURRENTLY EXECUTING QUERIES            0.00
      MAXIMUM EXECUTING QUERIES              1.00
      CPU TIME EXECUTING QUERIES             1.290000
      CPU TIME LOAD/ARCHIVE/RESTORE          15:42.600000
      CONNECTS TO ACCELERATOR                4.00
      REQUESTS SENT TO ACCELERATOR           6.00
        TIMED OUT                            0.00
        FAILED                               0.00
      BYTES SENT TO ACCELERATOR              7618.00
      BYTES RECEIVED FROM ACCELERATOR        2707.00
      MESSAGES SENT TO ACCELERATOR           33.00
      MESSAGES RECEIVED FROM ACCEL           33.00
      BLOCKS SENT TO ACCELERATOR             0.00
      BLOCKS RECEIVED FROM ACCELERATOR       2.00
      ROWS SENT TO ACCELERATOR               0.00
      ROWS RECEIVED FROM ACCELERATOR         53.00
      TCP/IP SERVICES ELAPSED TIME           28:18.061328
      ELAPSED TIME IN ACCELERATOR            7.791182
      WAIT TIME IN ACCELERATOR               0.099476
      CPU TIME FOR REPLICATION               N/P
      LOG RECORDS READ                       N/P
      LOG RECORDS FOR ACCEL TABLES           N/P
      LOG RECORD BYTES PROCESSED             N/P
      INSERT ROWS FOR ACCEL TABLES           N/P
      UPDATE ROWS FOR ACCEL TABLES           N/P
      DELETE ROWS FOR ACCEL TABLES           N/P
      REPLICATION LATENCY IN SECONDS         N/P
      REPLICATION STATUS CHANGE              N/P

      Q100 TOTAL ACCELERATOR                 QUANTITY
      ------------------------------------   --------------------
      QUERIES SUCCESSFULLY EXECUTED          1.00
      QUERIES FAILED TO EXECUTE              1.00
      CURRENTLY EXECUTING QUERIES            0.23
      MAXIMUM EXECUTING QUERIES              1.00
      CPU TIME EXECUTING QUERIES             1.290000
      CPU TIME LOAD/ARCHIVE/RESTORE          15:42.600000
      ACCELERATOR SERVER START               09/05/13 13:36:48.19
      ACCELERATOR STATUS CHANGE              09/09/13 11:47:05.22
      DISK STORAGE AVAILABLE (MB)            48000959.97
      IN USE FOR ACCEL DB - ALL DB2 (MB)     1932487.60
      IN USE FOR ACCEL DB - THIS DB2 (MB)    64322.60
      MAXIMUM QUEUE LENGTH                   0.00
      CURRENT QUEUE LENGTH                   0.00
      AVG QUEUE WAIT ELAPSED TIME            0.021328
      MAX QUEUE WAIT ELAPSED TIME            0.945941
      WORKER NODES                           7.00
      WORKER NODES DISK UTILIZATION (%)      2.40
      WORKER NODES AVG CPU UTILIZATION (%)   23.14
      COORDINATOR CPU UTILIZATION (%)        8.71
      PROCESSORS                             224.00
      DATA SLICES                            240.00
      CPU TIME FOR REPLICATION               N/P
      LOG RECORDS READ                       N/P
      LOG RECORDS FOR ACCEL TABLES           N/P
      LOG RECORD BYTES PROCESSED             N/P
      INSERT ROWS FOR ACCEL TABLES           N/P
      UPDATE ROWS FOR ACCEL TABLES           N/P
      DELETE ROWS FOR ACCEL TABLES
  50. NOT IN and ALL Predicate Support
      ■ Queries using the following predicates are now eligible for routing to the accelerator:
        ➔ NOT IN <subquery>
        ➔ <op> ALL, where <op> can be any of =, <>, >, <, >=, <=
      • It requires NPS 7.0.4
      • For example, the following two queries are supported in V4:

        SELECT A.C1, A.C1, A.C2
          FROM TABLEA A
         WHERE A.C1 NOT IN (SELECT B.C1 FROM TABLEB B WHERE B.C2 = 3);

        SELECT A.C1, A.C1, A.C2
          FROM TABLEA A
         WHERE A.C1 < ALL (SELECT B.C1 FROM TABLEB B WHERE B.C2 = 3);
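To make the difference between the two predicates concrete, the sketch below evaluates both filters over small in-memory lists. The sample rows are invented for illustration, the variable names are hypothetical, and the sketch assumes no NULL values (SQL's three-valued logic for NULLs is deliberately not modelled).

```python
# Hypothetical sample rows as (C1, C2) tuples; not taken from the presentation.
table_a = [(1, "x"), (5, "y"), (9, "z")]
table_b = [(5, 3), (7, 3), (2, 4)]

# SELECT B.C1 FROM TABLEB B WHERE B.C2 = 3
subquery = [c1 for (c1, c2) in table_b if c2 == 3]

# WHERE A.C1 NOT IN (subquery): keep rows whose C1 matches no subquery value.
not_in_rows = [row for row in table_a if row[0] not in subquery]

# WHERE A.C1 < ALL (subquery): keep rows whose C1 is below every subquery value.
less_than_all_rows = [row for row in table_a if all(row[0] < v for v in subquery)]

print(not_in_rows)
print(less_than_all_rows)
```

With these rows the subquery yields 5 and 7, so NOT IN keeps the rows with C1 = 1 and C1 = 9, while < ALL keeps only the row with C1 = 1.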
  51. Version 4 at a Glance
      Categories: More Query Acceleration | Enhanced Capabilities | Improved Transparency
      • Static SQL
      • Improved scalability of Incremental Update
      • Automatic workload balancing with multiple accelerators
      • DB2 11 (2)
      • Better performance of Incremental Update
      • New RTS 'last-changed-at' timestamp (2)
      • Multi-row fetch from local applications
      • Improved performance for large result sets (2)
      • Automated NZKit installation
      • EBCDIC and Unicode in the same DB2 system and accelerator
      • Better access control for HPSS archived partitions
      • Built-in Restore for HPSS
      • NOT IN and ALL predicates (3)
      • HPSS archiving to multiple accelerators
      • Protection for image copies created by HPSS archiving process
      • FOR BIT DATA support (3)
      • Extending WLM support to local applications
      • Profile controlled special registers (2)
      • 24:00:00 time value (3)
      • Rich system scope monitoring
      • Improved continuous operations for Incremental Update
      • MEDIAN support (3)
      • Reporting prospective CPU cost and elapsed time savings
      • Refreshing IDAA table without table lock even if incremental update is active (3)
      • Separation of duties for accelerator system administration operations
      • Static SQL and workload balancing enablement migration tool (3)
      • Support for N2002 hardware (3)
      • Incremental Update continues replicating even for tables in AREO state (3)
      Enabling new use cases
      • Loading from flat file or image copy (1)
      • Loading in parallel to DB2 and accelerator (1)
      • Loading data as of any past point in time (1)
      • Loading data to accelerator only (1)
      (1) – delivered by a separate tool
      (2) – DB2 11 only
      (3) – IDAA V4.1.2 (PTF2 – March 2014)
  52. Namik Hrle, IBM, hrle@de.ibm.com
      Title: IBM DB2 Analytics Accelerator Trends and Directions
