Aetna's Production Experience Using
IBM DB2 Analytics Accelerator
Session Number IDW-1146A
Jeff Kohan, Aetna
Daniel Martin, IBM
© 2013 IBM Corporation
About Aetna
 3rd largest Health Insurer in the US (based on revenue)
 2012 Revenue: $35.54 Billion
 Employees worldwide: 34,000+
 Business locations: International (China, Dubai, London)
 Membership:
 22 million medical members
 14.3 million dental members
 13.8 million pharmacy members
 Health Care Networks: 1 million health care professionals, 5,300 hospitals, 597,000 doctors and specialists
Product naming: IBM DB2 Analytics Accelerator (powered by Netezza), PureData for Analytics (PDA), DB2 Accelerator, or simply “the accelerator”
Agenda
 Aetna Environment
 Results Obtained
 Business Value
 Technical “Deep Dive”
 Quiz
 Summary
Aetna Environment
 Production Environment: DB8G – a 6-member DB2 data sharing reporting environment
 Data: 400+ tables of various sizes, about 9 TB in total
 1 table with over a billion rows
 40% with over half a billion rows
 40% with over 100 million rows
 10% between 10 and 100 million rows
 Major reporting applications targeted:
 Member – MF warehouse of all enrollment data
 Over Payment Tracker (OPT)
 Plan Sponsor Reporting
 Claim Reporting System
Aetna Environment – Why “the Accelerator”
 Ideal use case – long-running DB2 reports
 Saved $ over application redesign
 This was an Infrastructure-funded project
 There is no chargeback to applications
Aetna DB2 Analytics Accelerator Environment
(Diagram: OLTP data in DB2 z/OS – the DB3G source – is moved by DataStage ETL into the DB8G Reporting Warehouse, DB2 z/OS V10 NFM. DB2 queries from the reporting tools – Cognos, Business Objects, Webi, Crystal, MS Access, SAS, Tableau – run against DB8G, with query acceleration off-loaded to the DB2 accelerator (V3), which is kept current by Load and Incremental Update.)
Netezza TwinFin 1000-6* (Aetna slide)
 No application code changes
 No tuning*
 No indexes
 48 TB Total Storage
 16 TB Dedicated to User Data
 4 to 1 Compression Rate
What is IBM DB2 Analytics Accelerator?
 DB2 software offering that comes with a hardware appliance
 Start the accelerator with a DB2 command (see the sketch below)
 DB2 detects the “heartbeat” of the attached appliance
 The DB2 optimizer recognizes when a query can be off-loaded
 zPARM: CURRENT QUERY ACCELERATION = ENABLE WITH FAILBACK
 Data needs to be loaded into the appliance
 Data Studio with a plug-in is the user interface to the accelerator
 DB2 EXPLAIN was modified to show why a query will not off-load
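As a minimal sketch of what this looks like from the DB2 for z/OS side (command and special-register syntax as documented for DB2 10; the accelerator name AQT01 is only a placeholder):

   -- Start the accelerator and check that DB2 sees its heartbeat
   -START ACCEL(AQT01)
   -DISPLAY ACCEL(*) DETAIL

   -- Allow acceleration for the current session; queries fall back
   -- to DB2 if the accelerator cannot run them
   SET CURRENT QUERY ACCELERATION = ENABLE WITH FAILBACK;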
Let’s look at the results
Results: LOAD Run Times
 LOAD data
 5.1 GB loaded in 85 seconds
 589,324,806 rows in 9.5 minutes (62M rows/minute)
 1.5 billion rows in 15 minutes (100 million rows/minute)
 15 minutes for 18.3 gigabytes of data (33 tables, the largest had > 550,000,000 rows)
Threads Off-loaded
DATE      NAME            Accelerator threads   Total DB2 threads   % off-loaded
10/28/13 WIREPORTSERV 176 213 83%
10/27/13 WIREPORTSERV 189 211 90%
10/26/13 WIREPORTSERV 333 343 97%
10/25/13 WIREPORTSERV 826 971 85%
10/24/13 WIREPORTSERV 911 1,101 83%
10/23/13 WIREPORTSERV 672 815 82%
10/22/13 WIREPORTSERV 955 1,176 81%
10/21/13 WIREPORTSERV 1,002 1,247 80%
10/20/13 WIREPORTSERV 298 383 78%
10/19/13 WIREPORTSERV 273 364 75%
10/18/13 WIREPORTSERV 866 1,026 84%
10/17/13 WIREPORTSERV 1,076 1,276 84%
10/16/13 WIREPORTSERV 735 978 75%
10/15/13 WIREPORTSERV 662 948 70%
10/14/13 WIREPORTSERV 1,309 1,559 84%
10/13/13 WIREPORTSERV 408 503 81%
10/12/13 WIREPORTSERV 357 441 81%
10/11/13 WIREPORTSERV 603 766 79%
10/10/13 WIREPORTSERV 840 1,033 81%
10/09/13 WIREPORTSERV 936 1,109 84%
10/08/13 WIREPORTSERV 3,891 4,163 93%
10/07/13 WIREPORTSERV 6,628 9,270 71%
Accelerator Modeler - APAR PM90535
 This APAR provides new function to allow a DB2 subsystem to
model the existence of an accelerator to evaluate the CPU and
elapsed time spent in DB2 for static SQL queries that would be
eligible for acceleration if an accelerator were active. No
accelerator is required or needs to be active for this modeling to
occur.
 PM90886 also required
What the customers are saying
No, we can’t run these queries now, only overnight.
(But when they did)  WOW! The answer came back so fast, I thought it must have failed.
This was a CPU time-out after the query ran for 82 wall-clock minutes yesterday.
It ran successfully this morning in 27 sec.
Whatever you paid for this, it was well worth it! Thank you!!!
Hey… what a difference. I just re-ran a query that fails on DB2 workspace. It just ran successfully in 12 min.
I just created a query to go after a report that I am asked for regularly. I usually have to build it in stages because of the two name fields. The last time I ran the reporting I had to parse it into 6 queries and schedule the reporting. On average it takes 45 minutes to 1 hour to pull the reporting back – I just ran the same reporting in 17 seconds!
More accolades
- We’re noticing much faster response times today. I re-ran a few reports, without any modifications, to compare times. Here are the results.
  06-Feb    12-Feb
  301 sec   23 sec
  240 sec   13 sec
  262 sec   13 sec
  322 sec   13 sec
And Finally
- Our business partners are very pleased. The throughput and the ability to meet some previously unfulfilled needs are being well received.
And just a few side benefits…
- No sort-space failures in the reporting environment
- MIPS savings with cost avoidance of $$$
- Process changes – the Member warehouse can now process on weekends
Business Value - OPT
 Plan Sponsor/Provider performance guarantees
 e.g. time spent with patient, mean time to medication, readmission rate
 Payments to be recovered if performance is not met
 Fines involved per regulatory requirements
 500+ reports per quarter for recoveries
 Monthly aggregate rollup on overpayment metrics – very slow turnaround time
 Trending reports – where are overpayments occurring and why
 Currently in our reporting environment: inadequate table design and structure
 No full view of providers
 Enterprise view of overpayments returned, lag time, duration for collection
 Identify providers not responding
 Trend analysis over time
 Root cause resolution = $ savings plus better healthcare outcomes
Business Value - OPT
 Manages work better
 Root Cause Team:
 Ability to run yearly trending reports of overpayments
 Ability to scale yearly reports down and target anomalies
 Gives the ability to size issues quickly
 Business people now do requests themselves and no longer need to rely on a technical person for assistance (doesn’t mean IT staff!)
 Able to do whatever we want to make informed and right decisions, where we were so handcuffed before
There were some bumps along the way
 Software offering that comes with hardware – who supports it?
 Requires short-range OSA cards in the data center
 Configure switches for jumbo frame support
 Ensure WLM environments are defined correctly
 DSNX881I – critical error messages; you must alert on this message
 SQLSTATE 57011 – increase NZ_SPRINGFIELD_SIZE to 4096; concurrency could be reduced (Netezza tuning knob)
 Corrupted date field returned from a query – Update 5 (PM75749), or DB2 Connect v9.7 fp7 (V2)
 SQLCODE -516, currently open – >32K result set from Access and Crystal; possible fix LUW APAR IC86946 – patch supplied for the 9.7 fp3a runtime client (V2)
 Business Objects – two-part query predicate runs slow – UK92607 (V2)
There were some bumps along the way
(Below are V3 Netezza)
 Reason code 00D35011 on an accelerated query – PM90148 10/26
 Includes 35 PTFs
 Accelerator stopping for no apparent reason – new GUI, V3 PTF3 UK96194
 Query performance degrading when replication is enabled – APAR fix, PTF 3 prerequisite
 Query statistics being reset – PTF4
 Accelerated query failing [57011] – frequency statistics on the Netezza-resident objects needed a manual update
DSNX881I Messages – Capture and Alert through automation
Email to Data Center and team members
 DSNX881I *DB8C 2 E 101 (07-MAY-13, 13:39:41 EDT) NPS SYSTEM NZ82011-H1 - SERVICE REQUESTED FOR SPU 1188 AT 07-MAY-13, 13:39:41 EDT SYSTEM. LOCATION:LOGICAL NAME:'SPA1.SPU9' PHYSICAL LOCATION:'1ST RACK, 1ST SPA, SPU IN 9TH SLOT' ERROR STRING:SPU PHYSICAL INTERFACE ERROR
 DSNX881I *DB8H 10 E 50 (28-JAN-13, 11:49:47 EST) NPS SYSTEM NZ82011-H1 - DISK ERROR ON DISK 1146. SPUHWID:1153 DISK LOCATION:LOGICAL NAME:'SPA1.DISKENCL4.DISK 9' PHYSICAL LOCATION:'1ST RACK, 4TH DISKENCLOSURE, DISK IN ROW 3/COLUMN 1' ERRTYPE:3 ERRCODE:116 OPER:0 DATAPAR
What’s Next with the Accelerator at Aetna?
 Current V4 beta participant
 Current DB2 Loader v1.1 beta participant
 Installation of Version 4 – Static SQL support
 Consider eliminating DB8G Indexes
 Consider eliminating DB8G subsystem members
 Evaluate ETL needs
 Workload Manager feature
 High Performance Storage Saver exploitation
 High Availability and DBAR
 Performance Monitoring and Reporting
 New zBLC Workgroup created for accelerator
Deep Dive
(Architecture, Data maintenance)
Hardware Acceleration – PureData System for Analytics N2001
(Diagram: data streams from disk through the FPGA core – Decompress, Project, Restrict/Visibility – and on to the CPU core for SQL and advanced analytics (FROM, SELECT, WHERE, GROUP BY). Roughly 2.14 drives feed each core; the quoted per-drive rate is ~130 MB/sec, ~275 MB/sec per core after compression.)
1-rack system: 112 cores, 240 disks --> 124 GB/s raw scan rate (after 4x compression)
N2001 Hardware Overview
 User Data Capacity: 192 TB (240 x 600 GB, 1/3 user data)*
 Data Scan Speed: 478 TB/hr (240 x 130 MB/s)*
 Load Speed: 1.5 TB/hr
 Scales from ½ rack to 4 racks
 12 Disk Enclosures: 288 x 600 GB SAS2 drives (240 user data, 14 S-Blade, 34 spare), RAID 1 mirroring
 2 Hosts (active-passive): 2 x 6-core Intel 3.46 GHz CPUs, 7 x 300 GB SAS drives, Red Hat Linux 6 64-bit
 7 PureData for Analytics S-Blades™: 2 x Intel 8-core 2+ GHz CPUs, 2 x 8-engine Xilinx Virtex-6 FPGAs, 128 GB RAM + 8 GB slice buffer, 64-bit Linux kernel
* 4X compression assumed
Hybrid Database System
DB2 for z/OS – high volume, high concurrency, transactional workload and batch processing:
• Data shared across all members
• Lock-based concurrency control
• Write-ahead log (WAL)
• Index
• …
IBM DB2 Analytics Accelerator (powered by PureData System for Analytics) – low volume, low concurrency, complex queries:
• Data partitioned across worker nodes
• Multi-version concurrency control
• Immutable rows (no in-place updates)
• Automatic zone maps, auto-reorg
• …
(Data maintenance flows from DB2 for z/OS to the accelerator.)
Query Execution Process Flow
(Diagram: the application connects through the application interface to DB2 for z/OS. The optimizer decides per query whether to run it in the DB2 query execution run-time – for queries that cannot be or should not be off-loaded – or to route it through the accelerator DRDA requestor to the accelerator, an SMP host with SPU worker nodes that each pair a CPU and an FPGA with local memory. A heartbeat carries availability and performance indicators from the accelerator back to DB2.)
DB2 for z/OS Optimizer Decisions
 System checks: zPARM value, PROFILE values, QUERY ACCELERATION special register value
 Status checks: accelerator available (heartbeat)? accelerator ready to accept queries?
 Table checks: referenced tables loaded to the accelerator? referenced tables enabled for acceleration?
 SQL checks: check for off-load limitations (UDF? XML?) – see the EXPLAIN sketch below
 Heuristics: matching index? no grouping or aggregation (just I/O)? big result set? size of tables very small? cost threshold
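As a hedged illustration of the “modified EXPLAIN” mentioned earlier, the sketch below EXPLAINs a candidate query and reads the acceleration eligibility information back; the DSN_QUERYINFO_TABLE column names are our assumption from the DB2 10 explain-table layout, and <your report query> is a placeholder:

   EXPLAIN ALL SET QUERYNO = 100 FOR
     <your report query>;

   -- REASON_CODE / QI_DATA describe why the query is (not) eligible
   -- for off-load to the accelerator
   SELECT QUERYNO, TYPE, REASON_CODE, QI_DATA
     FROM DSN_QUERYINFO_TABLE
    WHERE QUERYNO = 100;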
Data Maintenance
Full table re-load / partition re-load (see the stored-procedure sketch below):
• Snapshot of a table
• Can use the RTS change detection feature
• Very efficient: ~0.5 CPU seconds per 100 MB of net changes
• High throughput: up to 1.5 TB/h total, 220 GB/h per stream, but actual throughput varies
• Redundant reloads if not all rows of a table or partition have changed; overhead for the duplicate work
Incremental Update:
• Applies updates continuously, no snapshots
• Not as efficient as UNLOAD: 31 - 65 CPU seconds per 100 MB of net changes
• Throughput: up to 18 GB/h, but actual throughput varies
• Granularity: changes at row level, independent of a table's partitioning scheme
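Re-loads are typically driven through the accelerator stored procedures (or Accelerator Studio). Below is a hedged sketch of such a call; the parameter order of SYSPROC.ACCEL_LOAD_TABLES (accelerator name, lock mode, XML table specification, message output) is our assumption from that era's documentation, and AQT01 plus the table-specification string are placeholders:

   -- Re-load a table snapshot into the accelerator (illustrative only)
   CALL SYSPROC.ACCEL_LOAD_TABLES(
        'AQT01',            -- accelerator name (placeholder)
        'NONE',             -- lock mode used while unloading from DB2
        '<XML table specification naming the tables/partitions to load>',
        ?);                 -- message output parameter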
Latency Detection
 Tables managed by UNLOAD-based refresh
 SYSACCEL.SYSACCELERATEDTABLES.REFRESH_TIME (queried in the sketch below)
 Tables managed by Incremental Update
 Stored procedure SYSPROC.ACCEL_CONTROL_ACCELERATOR with the <getAcceleratorInfo/> command
 <replicationInfo state="STARTED" lastStateChangeSince="2012-11-11T10:33:42.487678" latencyInSeconds="38">
 All functions are available via Accelerator Studio as well
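A small sketch of checking refresh latency from the DB2 catalog (REFRESH_TIME comes from the slide; the CREATOR and NAME columns are assumed from the SYSACCEL table layout and may differ by release):

   -- When was each accelerated table last refreshed?
   SELECT CREATOR, NAME, REFRESH_TIME
     FROM SYSACCEL.SYSACCELERATEDTABLES
    ORDER BY REFRESH_TIME;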
Latency Management
 Automate snapshot-based table refresh
 Table (or partition) is available for queries during refresh
 Old version of table is used until refresh is done
 Consider using “Change detection” feature (based on DB2 RTS)
 May stop off-loading if latency is too high (scopes illustrated in the sketch after this list)
 Session scope (QUERY ACCELERATION special register)
 Table scope (SET_TABLES_ACCELERATION SP)
 Accelerator scope (-STO ACCEL command)
 Consider using the <waitForReplication /> command of
SYSPROC.ACCEL_CONTROL_ACCELERATOR
 The Stored Procedure returns when all commits that happened
before CALLing the SP have been applied to the Accelerator
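To make the scopes concrete, a minimal sketch of turning acceleration off at the session and accelerator level (the accelerator name AQT01 is a placeholder; the table-scope stored procedure is omitted because its parameter list varies by version):

   -- Session scope: this session's queries run in DB2 only
   SET CURRENT QUERY ACCELERATION = NONE;

   -- Accelerator scope: stop the whole accelerator
   -STOP ACCEL(AQT01)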
Quiz Time
Our session today covered the IBM DB2
Analytics Accelerator.
What is another name for the accelerator?
A. International Digital Arts Awards
B. International Diabetic Athletic Association
D. Indiana Dental Assistants Association
E. Interior Designers Association of Australia (est. 1948)
F. Infectious Disease Association of America
G. It Does Amazing Acceleration!
H. None of the above
Thank You
Jeff Kohan - kohanjm@aetna.com
Daniel Martin - danmartin@de.ibm.com
Thank You
Your feedback is important!
• Access the Conference Agenda Builder to
complete your session surveys
o Any web or mobile browser at
http://iod13surveys.com/surveys.html
o Any Agenda Builder kiosk onsite
IBM Insight 2013 - Aetna's production experience using IBM DB2 Analytics Accelerator

  • 1.
    Aetna's Production ExperienceUsing IBM DB2 Analytics Accelerator Session Number IDW-1146A Jeff Kohan, Aetna Daniel Martin, IBM © 2013 IBM Corporation
  • 2.
     3rd largestHealth Insurer in US (Based on revenue)  2012 Revenue: $35.54 Billion  Employees worldwide : 34,000+  Business locations: International (China, Dubai, London)  Membership:  22 million medical members  14.3 million dental members  13.8 million pharmacy member  Health Care Networks : 1 million health care professionals, 5300 hospitals, 597,000 doctors and specialists About Aetna
  • 3.
    IBM DB2 AnalyticsAccelerator Powered by Netezza Pure Data for Analytics (PDA) DB2 Accelerator or “the accelerator” Agenda  Aetna Environment  Results Obtained  Business Value  Technical “Deep Dive”  Quiz  Summary
  • 4.
     Production Environment:DB8G – 6 member DB2 data sharing reporting environment  DATA - 400+ tables of various sizes, total about 9TB  1 over a Billion Rows  40% over Half a Billion Rows  40% over 100 Million Rows  10% between 10 and 100 Million Rows  Major reporting applications targeted  Member – MF Warehouse of all enrollment data  Over Payment Tracker (OPT)  Plan Sponsor Reporting  Claim Reporting System Aetna Environment
  • 5.
     Ideal UseCase – Long running DB2 reports  Saved $ over application redesign  This was an Infrastructure funded project  There is no chargeback to applications Aetna Environment – Why “the Accelerator”
  • 6.
    Aetna DB2 AnalyticsAccelerator Environment OLTP Data - DB2 zOS ETL DB8G Reporting Warehouse DB2 zOS (v10 NFM) Query Acceleration DB2 accelerator (v3) Reporting Tools Cognos Business Objects Webi Crystal MS Access SAS Tableau Source DB3G Data Stage Load Incremental Update DB2 Queries
  • 7.
     No applicationcode changes  No tuning*  No indexes  48TB Total Storage  16 TB Dedicated to User Data  4 to 1 Compression Rate Netezza TwinFin 1000-6* (Aetna slide)
  • 8.
    What is IBMDB2 AnaIytics Accelerator?  DB2 Software offering that comes with a hardware appliance  Start accelerator with DB2 command  DB2 Detects “heartbeat” of attached appliance  DB2 OPTIMIZER Recognizes Query can be offloaded  zPARM Current Query Acceleration = Enable with Failback  Data needs to be loaded into appliance  Data Studio with a plug-in is user-interface to the accelerator  DB2 Explain modified to determine why query will not off-load
  • 9.
    Lets look atthe results
  • 10.
    Results: LOAD RunTimes  LOAD data  5.1GB loaded in 85 seconds  589,324,806 rows in 9.5 minutes (62M rows/Minute)  1.5 Billion rows in 15 minutes (100 Million rows/minute)  15 minutes for 18.3 gigabytes of data (33 tables, the largest had > 550,000,000 rows).
  • 11.
    Threads Off-loaded DATE NAME TOTAL Accelerator THREADS TOTAL DB2 THREADS % offloaded 10/28/13WIREPORTSERV 176 213 83% 10/27/13 WIREPORTSERV 189 211 90% 10/26/13 WIREPORTSERV 333 343 97% 10/25/13 WIREPORTSERV 826 971 85% 10/24/13 WIREPORTSERV 911 1,101 83% 10/23/13 WIREPORTSERV 672 815 82% 10/22/13 WIREPORTSERV 955 1,176 81% 10/21/13 WIREPORTSERV 1,002 1,247 80% 10/20/13 WIREPORTSERV 298 383 78% 10/19/13 WIREPORTSERV 273 364 75% 10/18/13 WIREPORTSERV 866 1,026 84% 10/17/13 WIREPORTSERV 1,076 1,276 84% 10/16/13 WIREPORTSERV 735 978 75% 10/15/13 WIREPORTSERV 662 948 70% 10/14/13 WIREPORTSERV 1,309 1,559 84% 10/13/13 WIREPORTSERV 408 503 81% 10/12/13 WIREPORTSERV 357 441 81% 10/11/13 WIREPORTSERV 603 766 79% 10/10/13 WIREPORTSERV 840 1,033 81% 10/09/13 WIREPORTSERV 936 1,109 84% 10/08/13 WIREPORTSERV 3,891 4,163 93% 10/07/13 WIREPORTSERV 6,628 9,270 71%
  • 12.
    Accelerator Modeler -APAR PM90535  This APAR provides new function to allow a DB2 subsystem to model the existence of an accelerator to evaluate the CPU and elapsed time spent in DB2 for static SQL queries that would be eligible for acceleration if an accelerator were active. No accelerator is required or needs to be active for this modeling to occur.  PM90886 also required
  • 13.
    What the customersare saying No we can’t run these queries now, only overnight. (But when they did)  WOW!, The answer came back so fast, I thought it must have failed This was a CPU time out after the query ran for 82 wall-clock minutes yesterday. It ran successfully this morning in 27 sec. Whatever you paid for this, it was well worth it! Thank you!!! Hey…what a difference. I just re-ran a query that fails on DB2 workspace. It just ran successfully in 12 min I just created a query to go after a report that I am asked for regularly.. I usually have to build in stages because of the two name fields. The last time I ran reporting had to parse into 6 queries and schedule the reporting. On average it takes 45 minutes to 1 hour to pull the reporting back - Just ran the same reporting in 17 seconds!
  • 14.
    More accolades -We’re noticingmuch faster response times today. I re-ran a few reports, without any modifications to compare times. Here are the results. 06-Feb 12-Feb 301 sec 23 sec 240 sec 13 sec 262 sec 13 sec 322 sec 13 sec
  • 15.
    And Finally - Ourbusiness partners are very pleased. The throughput and the ability to meet some previously unfulfilled needs are being well received. And just a few side benefits… - No sort space failures in reporting environment - MIPS Savings with cost avoidance of $$$ - Process Changes – Member can process on weekend
  • 16.
    Business Value -OPT  Plan Sponsor/Provider performance guarantees  E.G. time spent w/patient, mean time to medication, readmission rate  Payments to be recovered if performance is not met  Fines involved per regulatory requirements  500+ reports/per quarter for recoveries  Monthly aggregate rollup on overpayment metrics – very slow turnaround time  Trending Reports – where are overpayments occurring and why  Currently in our reporting environment, inadequate table design and structure  No full view of providers  Enterprise view of overpayments returned, lag time, duration for collection  Identify Providers not responding  Trend Analysis over time  Root Cause Resolution = $ savings plus better Healthcare Outcomes
  • 17.
    Business Value -OPT  Manages work better  Root Cause Team -  Ability to run yearly trending reports of overpayments  Ability to scale yearly reports down and target anomalies  Gives ability to size issues quickly  Business people now do requests themselves and do not need to rely on technical person for assistance (doesn’t mean IT staff!)  Able to do whatever we want to make informed and right decisions where we were so handcuffed before
  • 18.
    There were somebumps along the way  Software offering that comes with hardware – Who Supports?  Require Short Range OSA cards in Data Center  Configure switches for jumbo frame support  Ensure WLM environments are defined correctly  DSNX881I - Critical error messages, you must alert for this message  SQLSTATE 57011 - Increase NZ_SPRINGFIELD_SIZE to 4096 – concurrency could be reduced (Netezza tuning knob)  Corrupted date field returned from query – Update 5 (PM75749), or DB2 Connect v9.7 fp7 (V2)  SQLCODE-516 currently open - >32K result set from Access and Crystal possible fix LUW APAR IC86946 – patch supplied 9.7 fp3a runtime client (V2)  Business Objects - two part query predicate runs slow – UK92607 (V2)
  • 19.
    There were somebumps along the way (Below are V3 Netezza)  Reason Code 00D35011 on accelerated query PM90148 10/26  Includes 35 ptf’s  Accelerator stopping for no apparent reason New GUI V3 PTF3 UK96194  Query performance degrading when replication is enabled. APARFIX PTF 3 prereq  Query statistics being reset PTF4  Accelerated query failing [57011] frequency statistics on the Netezza-resident objects needed manual update
  • 20.
    DSNX881I Messages – Captureand Alert through automation Email to Data Center and team members  DSNX881I *DB8C 2 E 101 (07-MAY-13, 13:39:41 EDT) NPS SYSTEM NZ82011-H1 - SERVI CE REQUESTED FOR SPU 1188 AT 07-MAY-13, 13:39:41 EDT SYSTEM. LOCATION:LOGICAL NA ME:'SPA1.SPU9' PHYSICAL LOCATION:'1ST RACK, 1ST SPA, SPU IN 9TH SLOT' ERROR STRI NG:SPU PHYSICAL INTERFACE ERROR  DSNX881I *DB8H 10 E 50 (28-JAN-13, 11:49:47 EST) NPS SYSTEM NZ82011-H1 - DISK ERROR ON DISK 1146. SPUHWID:1153 DISK LOCATION:LOGICAL NAME:'SPA1.DISKENCL4.DISK 9' PHYSICAL LOCATION:'1ST RACK, 4TH DISKENCLOSURE, DISK IN ROW 3/COLUMN 1' ERRTY PE:3 ERRCODE:116 OPER:0 DATAPAR
  • 21.
    What’s Next withthe Accelerator at Aetna?  Current V4 beta participant  Current DB2 Loader v1.1 beta participant  Installation of Version 4 – Static SQL support  Consider eliminating DB8G Indexes  Consider eliminating DB8G subsystem members  Evaluate ETL needs  Workload Manager feature  High Performance Storage Saver exploitation  High Availability and DBAR  Performance Monitoring and Reporting  New zBLC Workgroup created for accelerator
  • 22.
  • 23.
    Hardware Acceleration 23 FPGA CoreCPU Core Decompress Project Restrict Visibility SQL & Advanced Analytics From Select Where Group by 15 MB/sec 130 MB/sec 130 MB/sec ~275 MB/sec (2.14 drives / core) PureData System for Analytics N2001 1100 MB/sec 1-rack system: 112 cores, 240 disks --> 124 GB/s raw scan rate (after 4x compression)
  • 24.
    © 2013 IBMCorporation N2001 Hardware Overview  User Data Capacity: 192 TB (240 x 600GB, 1/3 user data)*  Data Scan Speed: 478 TB/hr (240 x 130MB/s)*  Load Speed: 1.5 TB/hr * 4X compression assumed Scales from ½ Rack to 4 Racks 12 Disk Enclosures  288 600 GB SAS2 Drives 240 User Data, 14 S-Blade 34 Spare  RAID 1 Mirroring 2 Hosts (Active-Passive)  2 6-Core Intel 3.46 GHz CPUs  7x300 GB SAS Drives  Red Hat Linux 6 64-bit 7 PureData for Analytics S-Blades™  2 Intel 8 Core 2+ GHz CPUs  2 8-Engine Xilinx Virtex-6 FPGAs  128 GB RAM + 8 GB slice buffer  Linux 64-bit Kernel 24 * 4X compression assumed
  • 25.
    Hybrid Database System 25 DB2for z/OS IBM DB2 Analytics Accelerator (powered by PureData System for Analytics) High volume, high concurrency, transactional workload and batch processing Low volume, low concurrency, complex queries • Data shared across all members • Lock-based concurrency control • Write-ahead log (WAL) • Index • … • Data partitioned across worker nodes • Multi-version concurrency control • Immutable rows (no in-place updates) • Automatic Zone-Map, auto-reorg • … Data Maintenance
  • 26.
    © 2013 IBMCorporation26 Optimizer AcceleratorDRDARequestor Application Application Interface Queries executed with Accelerator Queries executed without Accelerator Heartbeat (availability and performance indicators) Query execution run-time for queries that cannot be or should not be off-loaded to Accelerator SPU Memory SPU Memory SPU Memory SPU Memory SMPHost Heartbeat DB2 for z/OS CPU FPGA CPU FPGA CPU FPGA CPU FPGA CPU FPGA CPU FPGA CPU FPGA CPU FPGA Accelerator Query Execution Process Flow
  • 27.
    DB2 for z/OSOptimizer Decisions 27 System checks • zPARM value • PROFILE values • QUERY ACCELERATION special register value Status checks • Accelerator available (heartbeat)? • Accelerator ready to accept queries? Table checks • Referenced tables loaded to accelerator? • Referenced tables enabled for acceleration? SQL checks • Check for offload limitations (UDF?, XML?) Heuristic • Matching index? • No grouping or aggregation (just i/o?) • Big result set? • Size of tables very small? • Cost threshold
  • 28.
    Data Maintenance 28 Operation Properties Fulltable re-load / Partition re-load • Snapshot of a table • Can use the RTS change detection feature • Very efficient: ~0.5 CPU seconds per 100 MB net changes • High throughput: up to 1.5 TB/h total, 220 GB/h per stream, but actual throughput varies • Redundant reloads if not all rows of a table or partition have changed. Overhead for the duplicate work Incremental Update • Applies updates continuously, no snapshots • Not as efficient as UNLOAD: 31 - 65 CPU seconds per 100 MB net changes • Throughput: up to 18 GB/h, but actual throughput varies • Granularity: changes at row level, independent of a table's partitioning scheme
  • 29.
    Latency Detection 29  Tablesmanaged by UNLOAD-based refresh  SYSACCEL.SYSACCELERATEDTABLES.REFRESH_TIME  Tables managed by Incremental Update  Stored Procedure SYSPROC.ACCEL_CONTROL_ACCELERATOR with <getAcceleratorInfo/> command  <replicationInfo state="STARTED" lastStateChangeSince="2012-11- 11T10:33:42.487678" latencyInSeconds=“38">  All functions are available via Accelerator Studio as well
  • 30.
    Latency Management 30  Automatesnapshot-based table refresh  Table (or partition) is available for queries during refresh  Old version of table is used until refresh is done  Consider using “Change detection” feature (based on DB2 RTS)  May stop offloading if latency is too high  Session scope (QUERY ACCELERATION special register)  Table scope (SET_TABLES_ACCELERATION SP)  Accelerator scope (-STO ACCEL command)  Consider using the <waitForReplication /> command of SYSPROC.ACCEL_CONTROL_ACCELERATOR  The Stored Procedure returns when all commits that happened before CALLing the SP have been applied to the Accelerator
  • 31.
    Quiz Time Our sessiontoday covered the IBM DB2 Analytics Accelerator. What is another name for the accelerator? A. International Digital Arts Awards B. International Diabetic Athletic Association D. Indiana Dental Assistants Association E. Interior Designers Association of Australia (est. 1948) F. Infectious Disease Association of America G. It Does Amazing Acceleration! H. None of the above
  • 32.
    Thank You Jeff Kohan- kohanjm@aetna.com Daniel Martin - danmartin@de.ibm.com
  • 33.
    Thank You Your feedbackis important! • Access the Conference Agenda Builder to complete your session surveys o Any web or mobile browser at http://iod13surveys.com/surveys.html o Any Agenda Builder kiosk onsite