Pure Genius: How To Get
Mainframe-Like Scalability &
Availability For Midrange DB2
James Gill & Julian Stuhler
GSE November 2010
Agenda
• pureScale Overview
Why should you care?
Architectural overview
• Experiences
Triton’s pureScale environment
Installation
Performance
Resilience
Why pureScale?
• Availability
Any system outage has a direct impact on profitability and
customer retention
Serving multiple geographies makes planned downtime more
difficult
• Agility / Scalability
Almost every business has major workload spikes, with
significant unused capacity at other times
Need to be able to rapidly scale up/down in a cost-effective way,
with little or no change needed to the application
What is pureScale?
• Optional feature for DB2 for Linux, UNIX and
Windows
Current support is for AIX on System p and
SUSE Linux on IBM System x servers only
• Implements a shared-disk clustering solution
to support high scalability and availability
Up to 128 members in initial release
• Based on proven “data sharing” technology
used in DB2 for z/OS for the past 15 years
• Capacity-based charging model allows
cluster to be easily expanded/contracted
• Little or no application change required
pureScale Architecture
[Diagram: Members A, B and C share a single database on disk and communicate over an InfiniBand (IB) interconnect with a primary and a secondary CF, which host the group buffer pool (GBP), global lock manager (GLM) and shared communication area (SCA).]
Architecture - Members
[Diagram: each member has its own agents and threads, bufferpools, log buffers and local memory (dbheap, sort heap, etc.), and writes its own logs, while all members access the one shared database.]
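For completeness, a minimal sketch of how a member can be added to grow the cluster, assuming an instance named db2lco (as in the later slides) and a hypothetical new host node104 with InfiniBand netname ib104; the db2iupdt options are as we recall them on the 9.8 image and may differ by fix pack:

# Register the hypothetical new host as member 2 in the instance (run as root)
db2iupdt -add -m node104:ib104 db2lco

# Start just the new member, then check that it has joined the cluster
db2start member 2
db2instance -list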
Architecture - CFs
[Diagram: the CF holds the GBP (page directory plus data and index pages), the GLM (hash table and lock entries) and the SCA.]
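The memory given to these structures is visible in the database configuration; a quick way to inspect it on our cluster (database DTW, with the CF-related parameters prefixed CF_ on the 9.8 image we used):

# List the CF-related configuration parameters (group buffer pool,
# lock and SCA structure sizes) for database DTW
db2 get db cfg for DTW | grep -i "CF_"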
Architecture - Infiniband
• Low latency (1 – 1.3 microseconds)
• High speed (300Gb/s – EDR 12x)
• RDMA – remote direct memory access
NIC managed
► No processor interrupt
► 5 – 30 microsecond access time
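On a cluster that does use InfiniBand (our commodity kit below runs over Ethernet), the adapters can be sanity-checked from the OS; a small sketch using ibstat from the infiniband-diags / OFED packages:

# Confirm each host's IB ports are up and running at the expected rate
ibstat
ibstat | grep -E "State|Rate"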
Architecture – Infiniband 2
www.infinibandta.org/content/pages.php?pg=technology_overview
pureScale Scalability
[Chart: throughput relative to a single member as members are added, showing near-linear scaling with measured points at 1, 1.98, 3.9, 7.6 and 10.4. X-axis: Number of pureScale Members (0-14); Y-axis: Throughput vs 1 Member. Source: Internal IBM Lab Tests.]
Practical Experiences
• Triton’s Commodity Cluster
• IBM’s nano-cluster
• Architecture
• Installation
• Performance
• Resilience
Triton’s Commodity Cluster
• Objectives
Undertake basic validation of IBM’s performance &
scalability claims
Build technical experience in a pureScale environment
and establish platform for ongoing R&D
Assist IBM with early beta testing
• Constraints
Budget < £1K
Easily portable for customer demos etc.
Commodity Cluster - Architecture
[Diagram: Node 1 and Node 2 (members) plus a CF node connected over 1Gb Ethernet, with shared storage on an iSCSI NAS and a separate Technology Explorer (TE) app server.]
Triton’s Commodity Cluster
• 2 member nodes and one CF
• Each node:
Intel D510M0 (Dual core 1GHz Atom)
4GB RAM
40GB SSD
• Shared disk
iSCSI 1TB (QNAP TS110)
• DB2 9.8 pureScale FP2 development image
• Technology Explorer used for workload and monitoring
www.sourceforge.net/projects/db2mc
IBM nanoCluster - Architecture
[Diagram: node102 and node103 each run a member and a CF; node101 hosts the GPFS shared disk and the app servers; all three are connected over 1Gb Ethernet.]
IBM’s pureScale nanoCluster
• 2 combined member and CF nodes
• One shared disk and app server tier node
• Each node:
Intel D510M0 (dual core 1GHz Atom)
4GB RAM
Disk
► 40GB SSD (pureScale nodes)
► 100GB 7200rpm SATA (shared disk/app server node)
Triton pureScale Experiences
• Installation experiences
SLES 10 requires
► compat-libstdc++-5.0.7-22.2.x86_64.rpm
db2cluster resolves iSCSI mount issues
Ensure FQDN names in /etc/hosts
Very slick considering the component count
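A rough sketch of the pre-install checks we ran on each SLES 10 node, using the package named above; the /etc/hosts entry is only an illustration, as addresses and names are site-specific:

# Install the older libstdc++ compatibility package the DB2 9.8 installer needs
rpm -ivh compat-libstdc++-5.0.7-22.2.x86_64.rpm

# Confirm the node resolves itself and its peers by fully-qualified name
hostname -f
grep node10 /etc/hosts
# e.g.  192.168.1.102   node102.purescale.demo   node102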
Triton pureScale Experiences
• Commodity Cluster Performance
Technology Explorer
► WMD Java workload driver (WLB enabled)
► 2.5M row table
► Vanilla installation
► 32 threads, 25ms think time
Delivered 1000 tps @ 95% CPU load
Performance - nanoCluster
• 14 simulated client connections
• WLB ACR enabled
• 1ms think time
• 250,000 row table
• Delivering c. 5500 tps @ 50% CPU load
Workload Balancing (WLB)
• Members track and share available capacity
db2pd -serverlist
• Server list slipstreamed to clients periodically
Requires a DB2 9.7 FP1 or later client
• Transaction-level workload balancing
Connections rebalanced on UR (unit of recovery) boundaries
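A simple way to see which member a connection has landed on is the CURRENT MEMBER special register; the sketch below assumes the DTW database shown on the next slide and a 9.7 FP1 or later client, and is only meant to make the routing visible, not to drive a balanced workload:

# Connect, ask which member served us, then look at the weights
# that WLB is currently advertising to clients
db2 connect to DTW
db2 -x "values current member"
db2pd -serverlist
db2 connect reset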
WLB
• db2pd -serverlist
Database Member 0 -- Active -- Up 0 days 00:20:43
Server List:
Time: Tue Nov 2 07:26:54
Database Name: DTW
Count: 2
Hostname Non-SSL Port SSL Port Priority
node102.purescale.demo 50001 0 52
node103.purescale.demo 50001 0 47
Resilience – CF Failure
• Simulate failure of the primary CF
db2lco@node102:~> db2instance -list
ID TYPE STATE HOME_HOST CURRENT_HOST ALERT PARTITION_NUMBER
-- ---- ----- --------- ------------ ----- ----------------
0 MEMBER STARTED node102 node102 NO 0
1 MEMBER STARTED node103 node103 NO 0
128 CF PRIMARY node102 node102 NO -
129 CF PEER node103 node103 NO -
HOSTNAME STATE INSTANCE_STOPPED ALERT
-------- ----- ---------------- -----
node103 ACTIVE NO NO
node102 ACTIVE NO NO
db2lco@node102:~> ps -ef | grep ca-server
db2lco 4158 4153 0 05:30 ? 00:00:31 ca-mgmnt-lwd -i128 -p56000 -k8521f27a -d/home/db2lco/sqllib/db2dump -e/home/db2lco/sqllib/cf/ca-server -f
db2lco 4164 4158 18 05:30 ? 00:12:41 /home/db2lco/sqllib/cf/ca-server -i 128 -p 56000 -k 1474562 -s 0 -f
db2lco 32120 21933 0 06:37 pts/0 00:00:00 grep ca-server
db2lco@node102:~> kill -9 4164
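A sketch of how the recovery can be watched afterwards, using the same db2instance command; the expected behaviour (based on the pureScale design rather than a captured log) is for CF 129 to take over as primary while 128 is restarted and catches back up as the new secondary:

# Re-run every few seconds and watch the CF roles change:
# 129 (node103) should move from PEER to PRIMARY, and 128 should be
# restarted on node102 and work its way back to PEER (the new secondary)
watch -n 5 db2instance -list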
Resilience – CF Failure - Impact
pureScale Summary
• Robust clustering technology based on a proven
architecture
Scalability
Resilience
• No code change to scale out
• Excellent price/performance characteristics
• Initial customer implementations are under way
• What are you waiting for?
Feedback / Questions
James Gill – james.gill@triton.co.uk
Julian Stuhler – julian.stuhler@triton.co.uk
www.triton.co.uk
pureScale webcast series each Tuesday