Showdown: DB2 vs. Oracle Database for OLTP
Conor O’Mahony | Email: [email_address] | Twitter: conor_omahony | Blogs: db2news.wordpress.com, database-diary.com
Agenda What Improves OLTP Performance? Comparing OLTP Performance Comparing Cluster Scalability Comparing Cluster Availability What Users are Saying
Technology for OLTP Performance: Efficient I/O (high-performing data access; the logger is the most critical choke point). Large memory exploitation (multiple buffer pools). Ability to handle lots of users (low memory footprint per user; connection concentrator).
Efficient I/O: Logging is the most critical factor in high-volume OLTP, and TPC-C results show that DB2 has a more efficient logger. DB2 writes less than half as much log as Oracle Database and SQL Server: DB2 9 = 2.4 KB of log per transaction; Oracle 10gR2 = 4.9 KB of log per transaction; SQL Server 2005 = 6.0 KB of log per transaction. Reduced logging = increased performance.
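To get a feel for what those per-transaction log volumes mean for the logging device at high transaction rates, here is a small, purely illustrative Python sketch. Only the KB-per-transaction figures come from the slide; the transaction rate is a hypothetical example, not a benchmark result.

```python
# Illustrative arithmetic only: per-transaction log volumes are the figures
# quoted above; the transaction rate is a hypothetical example.
LOG_KB_PER_TXN = {"DB2 9": 2.4, "Oracle 10gR2": 4.9, "SQL Server 2005": 6.0}

def log_mb_per_second(kb_per_txn: float, txns_per_minute: float) -> float:
    """Convert per-transaction log volume into sustained log-write bandwidth."""
    return kb_per_txn * (txns_per_minute / 60.0) / 1024.0

if __name__ == "__main__":
    rate = 1_000_000  # hypothetical transactions per minute
    for product, kb in LOG_KB_PER_TXN.items():
        print(f"{product:>15}: {log_mb_per_second(kb, rate):5.1f} MB/s of log")
```

At the same transaction rate, the engine that writes the least log per transaction puts the least pressure on the log device, which is the point the slide is making.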
Large Memory and Efficient Memory Usage: The TPC-C benchmark with DB2 on a p5 595 used 2 TB of real memory, with 1.9 TB of buffer pool space allocated by DB2 using multiple buffer pools of different page sizes. DB2 is now threaded: one flat memory address space for all of DB2, no more private memory (all shared), reducing the per-connection footprint by 1 MB. More efficient memory access = better performance.
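The point about multiple buffer pools of different page sizes can be illustrated with a small planning sketch. The pool names and individual sizes below are hypothetical; only the idea of several pools per page size and the roughly 2 TB / 1.9 TB figures come from the slide.

```python
# Hypothetical buffer pool plan: name -> (page size in KB, size in GB).
# Only the ~2 TB budget and ~1.9 TB aggregate reflect the slide's figures.
TB_IN_GB = 1024

buffer_pools = {
    "BP_CUSTOMER_4K": (4, 512),
    "BP_ORDERS_4K":   (4, 384),
    "BP_INDEX_8K":    (8, 512),
    "BP_HISTORY_16K": (16, 338),
    "BP_CATALOG_32K": (32, 200),
}

total_gb = sum(size_gb for _page_kb, size_gb in buffer_pools.values())
budget_gb = 2 * TB_IN_GB
print(f"Aggregate buffer pool space: {total_gb / TB_IN_GB:.2f} TB "
      f"({total_gb / budget_gb:.0%} of a 2 TB memory budget)")
```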
User Scalability: DB2 has a proven ability to support lots of users. It supports two connection methods: each client has a connection process on the server, or server processes are shared amongst incoming requests. DB2 has a lower memory footprint per agent (connection) than other vendors, which enables more client connections for a given amount of memory. Better user scalability = more throughput.
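A back-of-the-envelope sketch of the user-scalability claim: if each connection agent needs less memory, more connections fit in the same budget. The memory budget and absolute footprints below are assumptions for illustration; only the roughly 1 MB-per-connection saving is taken from the earlier slide.

```python
# Hypothetical figures: only the ~1 MB-per-connection saving comes from the deck.
def max_connections(memory_budget_gb: float, footprint_mb_per_conn: float) -> int:
    """How many connection agents fit in a fixed memory budget."""
    return int(memory_budget_gb * 1024 // footprint_mb_per_conn)

budget_gb = 32                                  # memory reserved for connections (example)
process_model_mb = 3.0                          # assumed footprint per connection (process model)
threaded_model_mb = process_model_mb - 1.0      # ~1 MB smaller per connection when threaded

print("process model :", max_connections(budget_gb, process_model_mb), "connections")
print("threaded model:", max_connections(budget_gb, threaded_model_mb), "connections")
```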
Agenda What Improves OLTP Performance? Comparing OLTP Performance Comparing Cluster Scalability Comparing Cluster Availability What Users are Saying
Transactional Performance: Two large-scale standardized performance benchmarks for transactional workloads are often used for comparisons: TPC-C, the industry-standard OLTP benchmark, and SAP SD, which represents an SAP R/3 Sales and Distribution application.
Longevity in TPC-C Performance Results as of April 21, 2008
Apples-to-Apples Comparison: Both results on exactly the same server (8-way 1.9 GHz p5 570), DB2 v8 vs. Oracle 10g. DB2 leads Oracle in performance by 16%. Results current as of Feb 24, 2008; check http://www.tpc.org for the latest results.
Longevity in SAP 3-Tier SD Performance Results as of Jan 8, 2008
SAP SD 3-tier: This benchmark represents a 3-tier SAP R/3 environment, with the database on a separate server. DB2 outperforms Oracle by 68% with half the number of cores: DB2 running on a 32-way p5 595, Oracle running on a 64-way HP.
2-tier SAP SD Benchmarks: DB2 on IBM Power leads Oracle on HP by 18% with half the number of cores. DB2 on IBM Power delivers significantly more SAPS/Watt, saving you money. Results as of April 8, 2008.
Agenda What Improves OLTP Performance? Comparing OLTP Performance Comparing Cluster Scalability Comparing Cluster Availability What Users are Saying
Oracle RAC - Single Instance Wants to Read a Page: A process on Instance 1 wants to read page 501, which is mastered by Instance 2. The system process checks the local buffer cache: page not found. The system process sends an IPC to the Global Cache Service (GCS) process to get page 501, requiring a context switch to schedule GCS on a CPU. GCS copies the request into kernel memory to make a TCP/IP stack call and sends the request to Instance 2; the IP receive call requires interrupt processing on the remote node. The remote node responds via an IP interrupt to GCS on Instance 1, and GCS sends an IPC to the system process (another context switch to process the request). The system process then performs I/O to get the page. [Diagram: Instance 1 and Instance 2 buffer caches, GCS processes on both instances, and page 501, with the request/response steps numbered 1-6.]
What Happens in DB2 pureScale to Read a Page: An agent on Member 1 wants to read page 501. db2agent checks the local buffer pool: page not found. db2agent performs a Read And Register (RaR) RDMA call directly into CF memory; there is no context switching and there are no kernel calls, just a synchronous request to the CF. The CF replies that it does not have the page (again via RDMA), and db2agent reads the page from disk. [Diagram: Member 1 buffer pool with db2agent, the PowerHA pureScale CF with its group buffer pool, and page 501, with the steps numbered 1-4.]
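The difference between the two page-request paths just described can be seen in a toy latency model. All of the per-step costs below are invented placeholders rather than measurements; only the sequence of steps follows the two slides above.

```python
# Toy model, not a measurement: invented step costs in microseconds.
RAC_STYLE_STEPS = [                      # multi-hop path through GCS and the IP stack
    ("check local buffer cache",              2),
    ("IPC to local GCS + context switch",    15),
    ("TCP/IP send to mastering instance",    60),
    ("interrupt + GCS work on remote node",  20),
    ("TCP/IP reply + interrupt",             60),
    ("IPC back to server process",           15),
]
PURESCALE_STYLE_STEPS = [                # single synchronous RDMA round trip to the CF
    ("check local buffer pool",               2),
    ("RaR RDMA round trip to CF",            15),
]

def total_us(steps):
    return sum(cost for _name, cost in steps)

print(f"RAC-style request (toy model):       {total_us(RAC_STYLE_STEPS)} us before any disk I/O")
print(f"pureScale-style request (toy model): {total_us(PURESCALE_STYLE_STEPS)} us before any disk I/O")
```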
The Advantage of DB2 Read and Register with RDMA: The DB2 agent on Member 1 writes directly into CF memory with the page number it wants to read and the buffer pool slot that it wants the page to go into. The CF responds by writing directly into memory on Member 1, either indicating that it does not have the page or delivering the requested page of data. Total end-to-end time for RaR is measured in microseconds. The calls are so fast that the agent may even stay on the CPU for the response. This approach is much more scalable and does not require locality of data. [Diagram: db2agent on Member 1 issues a direct remote memory write with the request “I want page 501. Put it into slot 42 of my buffer pool.”; a CF thread answers with a direct remote memory write of the response “I don’t have it, get it from disk”; sample rows of a data page are shown.]
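Schematically, the Read-and-Register exchange carries very little: the request names a page and a destination buffer pool slot, and the response is either a miss or the page itself. The sketch below is an illustration of that shape only; it is not DB2's wire format or API.

```python
# Schematic only: not DB2's actual wire format or API.
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class RaRRequest:
    page_number: int       # e.g. page 501
    buffer_pool_slot: int  # e.g. slot 42 in the requesting member's buffer pool

@dataclass
class RaRResponse:
    found_in_group_buffer_pool: bool
    page_data: Optional[bytes] = None  # present only on a hit

def handle_rar(req: RaRRequest, group_buffer_pool: Dict[int, bytes]) -> RaRResponse:
    """Sketch of the CF side: answer from the group buffer pool, or report a miss."""
    data = group_buffer_pool.get(req.page_number)
    return RaRResponse(found_in_group_buffer_pool=data is not None, page_data=data)

# A miss tells the member to read page 501 from disk itself.
print(handle_rar(RaRRequest(page_number=501, buffer_pool_slot=42), group_buffer_pool={}))
```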
Transparent Application Scalability: scalability without application or database partitioning. Centralized locking and a real global buffer pool with RDMA access deliver real scaling without making the application cluster-aware. Sharing of data pages happens via RDMA from a true shared cache (not synchronized access via process interrupts between servers). There is no need to partition the application or the data for scalability, resulting in lower administration and application development costs. Distributed locking in RAC, by contrast, results in higher overhead and lower scalability. Oracle RAC best practices recommend fewer rows per page (to avoid hot pages), partitioning the database to avoid hot pages, and partitioning the application to get some level of scalability; all of these result in higher management and development costs.
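The application-side consequence can be illustrated with a routing sketch. With partition-oriented tuning (the kind of advice listed above for RAC), the application has to know how data is spread across nodes; with a shared-cache design, any member can serve any request. Node names and keys below are hypothetical.

```python
# Illustrative only: node names and keys are invented.
from itertools import cycle

NODES = ["node1", "node2", "node3"]
_round_robin = cycle(NODES)

def route_partition_aware(customer_id: int) -> str:
    # The application hard-codes knowledge of the data partitioning so that a
    # given customer's hot pages are always touched by the same node.
    return NODES[customer_id % len(NODES)]

def route_any_member(_customer_id: int) -> str:
    # Shared-cache model: any member can serve the request, so simple
    # round-robin (or driver-side workload balancing) is enough.
    return next(_round_robin)

for cid in (101, 102, 103):
    print(cid, "->", route_partition_aware(cid), "|", route_any_member(cid))
```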
Scalability for OLTP Applications: 2, 4, and 8 members: over 95% scalability. 16 members: over 95% scalability. 32 members: over 95% scalability. 64 members: 95% scalability. 88 members: 90% scalability. 112 members: 89% scalability. 128 members: 84% scalability.
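Reading those percentages as efficiency relative to a single member, the effective throughput of the cluster is roughly members × per-member throughput × efficiency. The sketch below uses the slide's figures (treating "over 95%" as 0.95) and an arbitrary per-member throughput unit.

```python
# Efficiencies are the slide's figures; "over 95%" is treated as 0.95 here.
efficiency = {2: 0.95, 4: 0.95, 8: 0.95, 16: 0.95, 32: 0.95,
              64: 0.95, 88: 0.90, 112: 0.89, 128: 0.84}

single_member_throughput = 1.0  # arbitrary unit
for members, eff in efficiency.items():
    effective = members * single_member_throughput * eff
    print(f"{members:3d} members: ~{effective:6.1f}x single-member throughput ({eff:.0%} efficiency)")
```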
Agenda What Improves OLTP Performance? Comparing OLTP Performance Comparing Cluster Scalability Comparing Cluster Availability What Users are Saying
Online Recovery: The DB2 pureScale design point is to maximize availability during failure recovery processing. When a database member fails, only in-flight data remains locked until member recovery completes (in-flight = data being updated on the failed member at the time it failed). Target time to row availability: under 20 seconds. [Diagram: DB2 members with their logs, two CFs, and shared data; a chart of % of data available vs. time (~seconds) showing that only data with in-flight updates is locked during recovery after a database member failure.]
Steps Involved in DB2 pureScale Member Failure: 1) Failure detection. 2) The recovery process pulls directly from the CF the pages that need to be fixed and the location of the log file to start recovery from. 3) A restart light instance performs redo and undo recovery.
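A runnable sketch of that recovery sequence is shown below. All of the classes and field names are invented for illustration and are not DB2 APIs; the point it captures is that the pages to fix and the log starting position come straight from the CF, so no log scan is needed to find them.

```python
# Illustrative sketch only: these classes and fields are not DB2 APIs.
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

@dataclass
class ToyCF:
    """Toy caching facility: tracks in-flight pages, log positions, and locks per member."""
    inflight_pages: Dict[str, Set[int]] = field(default_factory=dict)
    log_position: Dict[str, int] = field(default_factory=dict)
    locks: Dict[str, Set[int]] = field(default_factory=dict)

    def pages_needing_recovery(self, member: str) -> Set[int]:
        return self.inflight_pages.get(member, set())

    def recovery_log_start(self, member: str) -> int:
        return self.log_position.get(member, 0)

    def release_locks(self, member: str) -> None:
        self.locks.pop(member, None)

def recover_failed_member(cf: ToyCF, member: str, log_records: List[Tuple[int, int]]) -> List[int]:
    """log_records are (lsn, page_id) pairs from the failed member's log."""
    pages_to_fix = cf.pages_needing_recovery(member)   # pulled directly from the CF
    start_lsn = cf.recovery_log_start(member)          # also pulled from the CF
    redone = [page for lsn, page in log_records if lsn >= start_lsn and page in pages_to_fix]
    cf.release_locks(member)                            # recovered rows become available again
    return redone

cf = ToyCF(inflight_pages={"member1": {501, 502}},
           log_position={"member1": 900},
           locks={"member1": {501, 502}})
print("pages redone:", recover_failed_member(cf, "member1", [(880, 7), (910, 501), (920, 502)]))
```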
Failure Detection for a Failed Member: DB2 has a watchdog process to monitor itself for software failure. The watchdog is signaled any time the DB2 member is dying, and it interrupts the cluster manager to tell it to start recovery; software failure detection times are a fraction of a second. The DB2 cluster manager performs very low-level, sub-second heartbeating (with negligible impact on resource utilization) and performs other checks to distinguish congestion from failure. The result is hardware failure detection in under 3 seconds without false failovers.
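The heartbeating side of this can be pictured with a minimal monitor sketch: sub-second heartbeats, and a member is declared failed only after roughly 3 seconds of silence. The intervals mirror the slide; the code itself is purely illustrative and is not how DB2 cluster services are implemented.

```python
# Illustrative heartbeat monitor; intervals mirror the slide, implementation is invented.
import time
from typing import Dict, List

HEARTBEAT_INTERVAL_S = 0.5   # a sender would call heartbeat() roughly this often
FAILURE_TIMEOUT_S = 3.0      # declare a hardware failure after ~3 s of silence

class HeartbeatMonitor:
    def __init__(self) -> None:
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, member: str) -> None:
        self.last_seen[member] = time.monotonic()

    def failed_members(self) -> List[str]:
        now = time.monotonic()
        return [m for m, t in self.last_seen.items() if now - t > FAILURE_TIMEOUT_S]

monitor = HeartbeatMonitor()
monitor.heartbeat("member1")
monitor.heartbeat("member2")
# If member2 stops sending heartbeats, it appears in failed_members() after ~3 s.
print("failed so far:", monitor.failed_members())
```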
Member Failure Summary: DB2 Cluster Services automatically detects the member’s death, informs the other members and the CFs, and initiates an automated member restart on the same or a remote host. Member restart is like crash recovery in a single system, but much faster: redo is limited to in-flight transactions and benefits from the page cache in the CF. Clients are transparently re-routed to healthy members. The other members are fully available at all times (“Online Failover”): the CF holds the update locks held by the failed member, and the other members can continue to read and update data not locked by the failed member. When member restart completes, the locks are released and all data is fully available. [Diagram: a single database view with clients, cluster services (CS), DB2 members writing log records and pages, shared data, and primary/secondary CFs holding updated pages and global locks; the failure is induced with kill -9.]
Steps involved in a RAC node failure: node failure detection; data block remastering; locking of pages that need recovery; redo and undo recovery. Unlike DB2 pureScale, Oracle RAC does not centralize locks or the data cache.
With RAC – Access to the GRD and Disks is Frozen: Global Resource Directory (GRD) redistribution takes place, with no lock updates and no more I/O until the pages that need recovery are locked. [Diagram: Instance 1 fails; I/O requests are frozen on Instances 1-3 while the GRD is redistributed.]
With RAC – Pages that Need Recovery are Locked: The recovery instance reads the log of the failed node and locks the pages that need recovery. It must read the log and lock these pages before the freeze is lifted. [Diagram: Instance 1 fails; I/O requests remain frozen on Instances 1-3 while the recovery instance scans the failed node's redo log and locks affected pages.]
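A schematic of those two RAC recovery steps, purely for illustration (the data structures and page numbers are invented): mastership of the failed instance's resources is reassigned to the survivors, and a first pass over the failed node's redo log locks the pages that may need recovery; only then can the freeze be lifted.

```python
# Illustrative only: invented data structures and page numbers.
from typing import Dict, List, Set

def remaster_grd(resources_of_failed: List[int], survivors: List[str]) -> Dict[int, str]:
    """Reassign mastership of the orphaned blocks across the surviving instances."""
    return {block: survivors[i % len(survivors)] for i, block in enumerate(resources_of_failed)}

def first_pass_lock(redo_log_of_failed: List[int]) -> Set[int]:
    """Scan the failed node's redo log and lock every page that may need recovery."""
    return set(redo_log_of_failed)

# I/O and new lock requests stay frozen until BOTH steps have completed.
new_masters = remaster_grd(resources_of_failed=[501, 502, 777], survivors=["inst2", "inst3"])
locked_pages = first_pass_lock(redo_log_of_failed=[501, 502])
print("new masters:", new_masters)
print("pages locked before the freeze is lifted:", locked_pages)
```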
DB2 pureScale – No Freeze at All: There is no I/O freeze. The CF always knows which changes are in flight, so it knows which rows on which pages had in-flight updates at the time of the failure. [Diagram: Member 1 fails; Members 2 and 3 continue with no I/O freeze; the CF acts as the central lock manager.]
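For contrast, the pureScale side of the same situation can be sketched in a few lines: because the CF's central lock manager already tracks which pages have in-flight updates for each member, there is nothing to remaster or scan before the other members carry on. Names and values below are illustrative only.

```python
# Illustrative only: the CF already holds the in-flight page set per member.
from typing import Dict, Set

cf_inflight_pages: Dict[str, Set[int]] = {
    "member1": {501, 502},
    "member2": set(),
    "member3": {903},
}

def pages_locked_after_failure(failed_member: str) -> Set[int]:
    # No GRD remastering and no log scan: the answer is already in CF memory.
    return cf_inflight_pages[failed_member]

print("locked during member1 recovery:", pages_locked_after_failure("member1"))
print("all other data stays readable and updatable on the surviving members")
```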
Agenda What Improves OLTP Performance? Comparing OLTP Performance Comparing Cluster Scalability Comparing Cluster Availability What Users are Saying
Sample of Feedback… “We compared DB2 with Oracle and found that DB2 was more reliable and has a better price-performance ratio.” ― Dr. Yuki Kitaoka, Kyoto National Hospital. “We chose DB2 over Oracle and Sybase for its proven performance, availability and cost-effectiveness.” ― Atakan Karaman, Anadolu Group. “DB2 Universal Database is the bank’s database of choice for its high availability, scalability and performance.” ― Patrick Stearn, Bayerische Landesbank. “In comparison tests with both Oracle and Microsoft, DB2 continually demonstrated a better price-to-performance ratio -- the quality of DB2 is quite astonishing.” ― Benjamin Simmen, Zurich Financial Services. “When making our decision, we discovered that a $250,000 DB2/Linux solution performed better than a $1.5 million Oracle/Solaris solution.” ― Carl Ansley, CTO, Clarity Payment Solutions, Inc.
Thank You! For more information, see  www.ibm.com/db2 Or contact me at: Conor O’Mahony Email: [email_address] Twitter: conor_omahony Blog: database-diary.com


Editor's Notes

  • #4 What are the most important features in an RDBMS for OLTP transactions? In order to deliver very high throughput levels for transactional systems, the RDBMS must be able to efficiently perform I/O operations without holding up the transaction. It must be able to utilize memory more efficiently and must also be able to effectively handle large numbers of users. These three critical areas enable a database server to deliver very high levels of performance. We will explore each of these three areas in detail on the following 3 slides.
  • #5 In very high volume transactional systems the logger can quickly become the bottleneck. DB2 (and other RDBMSs) can do most of the work a transaction requires completely in memory (updates occur in the buffer pool in memory, authorizations and access plans are all cached in memory, etc.). However, there is one thing that cannot happen in memory. Whenever a transaction performs a COMMIT, the information that tells the RDBMS how to redo that transaction (i.e. the log information) must be flushed to disk. If the committed transaction is not recorded in the log files on disk then it would be possible to lose committed transactions. Therefore all database servers write records to the log files whenever a user commits a transaction. In order to achieve very high concurrency and high throughput, it is essential that the logger be as efficient as possible, since these I/Os can quickly become the bottleneck (disk access is significantly slower than memory access). There is a very strong proof point that demonstrates DB2 has a much more efficient logger than our competitors. TPC-C is an industry standard transaction processing benchmark that all major database vendors participate in. Each vendor runs their own benchmarks to try to demonstrate they have the best RDBMS. DB2 comes out on top more often than any other database vendor (but more on that later). One interesting thing to note about TPC-C is that one of the requirements of the benchmark is that you also publish how much log space you consume during the benchmark run. By comparing the log space consumed (and knowing that this standard benchmark requires every vendor to run the exact same transactions over and over again) we can compare the efficiency of the three database vendors’ loggers. The most current TPC-C results (as of March 18, 2008) are shown on this chart. You can see that for each transaction (standard TPC-C transaction) DB2 produced 2.4 KB of log. Oracle’s top result, with 10gR2 (there was no 11g top result as of 3/18/2008), consumes twice as much log space, meaning that the DB2 logger is twice as efficient and can therefore deliver higher levels of throughput. Oracle ran a TPC-C benchmark with RAC and consumed 20x more log space than DB2; ask Oracle why RAC consumes so much log space for the same transactions! Microsoft’s SQL Server 2005 result was even worse than Oracle’s, consuming more than 2.5x that of DB2. These benchmarks are the most highly tuned database systems available (tuned by each vendor’s own benchmark experts). This reduction in logging is one of the reasons why DB2 delivers better OLTP performance compared to Oracle and Microsoft.
  • #6 Efficient use of memory is also critical for high volume transaction processing. Given a limited amount of physical memory, you want your database to utilize it to the fullest in order to improve the throughput of your system. DB2 has two unique advantages over Oracle in this area. The first is that DB2 allows for multiple buffer pools. In Oracle you can have only one buffer pool per page size (i.e. 1 4KB pool, 1 8KB pool, 1 16KB pool and 1 32KB pool). This can severely limit your ability to effectively utilize the memory on the server to tune the system for optimal performance. DB2, however, allows as many buffer pools of any page size as you like. For example, you can have 4 buffer pools of 4KB and another 5 buffer pools of 8KB, etc. You can choose the buffer pool configuration that best suits your transaction processing needs. As an example, on a server with 2TB of real memory in a TPC-C benchmark, DB2 allocated several buffer pools of different page sizes whose total size in aggregate was 1.9TB. Now, with the new threaded engine in DB2 9.5, there is even more advantage over Oracle. By using threads rather than processes for user connections, the amount of memory consumed per connection is significantly lower. This allows more user connections for a given amount of memory and leaves more memory available to other areas of DB2 (like the buffer pool). This better memory utilization again results in higher throughput and better performance. Later in this presentation we will talk about the Self Tuning Memory Manager (STMM), which will show that not only does DB2 better exploit memory for higher performance, but it does so with less administration required.
  • #7 The final area that is critical to high transaction performance is the ability to support large numbers of concurrent users. Both DB2 and Oracle have the ability to do connection concentration to reduce memory requirements on the server. However, only DB2 has the threaded engine mentioned on the previous slide. This enables DB2 to scale higher than Oracle on the same server with the same amount of memory and therefore deliver higher throughput.
  • #9 There are several transaction processing benchmarks that demonstrate DB2’s performance leadership over Oracle. The first is TPC-C, which is an industry standard transaction processing benchmark. SAP Sales and Distribution (SD) is also a widely used performance benchmark which simulates real world SAP transactions. The third transaction processing benchmark is called SPECjAppServer, which measures the performance of a web-based Java application on the database system. We will discuss each of these benchmarks on the following slides.
  • #10 Benchmarks are often a leapfrog game where, on any given day, one database vendor can be in front of the rest if they run on some newly announced hardware or the latest software versions. This chart represents days of leadership for TPC-C from Jan 1, 2003 through April 21, 2008. It measures how long each of the vendors has held the top spot in TPC-C. Over this 5-year period, DB2 has been in a leadership position almost 2x longer than Oracle and, in fact, has led longer than all other database vendors combined.
  • #11 It is not very often that you get an apples-to-apples comparison where two database vendors run their benchmarks on the exact same hardware. This result is slightly dated (using DB2 v8 against Oracle 10g); however, it shows that on exactly the same hardware, DB2 delivered 16% better performance than Oracle. In fact, you would need 10 CPUs of Oracle to match the performance of 8 CPUs of DB2 on this class of server.
  • #12 In fact, Oracle has rarely been able to challenge DB2 over the past 5 years on the SAP SD 3-tier benchmark. This chart represents days of leadership for SAP SD 3-tier since Jan 1, 2003. As you can see, over the last 5 years DB2 has held the lead 8 times longer than Oracle (the only other competitor to lead in this timeframe).
  • #13 This result shows the top SAP SD 3-tier benchmark results as of March 18, 2008. SAP Sales and Distribution 3-tier represents a configuration where the database software is running on its own server hardware and there are several SAP application servers in the middle tier. This is the configuration that most enterprise customers would run their SAP workloads on, and DB2 has demonstrated clear performance leadership in this area.
  • #14 On the SAP SD 2-tier benchmark DB2 leads Oracle by 18% using half the number of processor cores. On April 8, 2008 DB2 9.5 running on a 64 core IBM Power 595 with AIX 6.1 delivered 35,400 SD Users. Oracle’s top result is 30,000 SD users with 10g running on a 128 core HP Integrity Superdome with HP-UX.
  • #16 A server process that wants to access a data page, for example page 501, will first check to see if that page is in its local buffer pool (step 1). If this page is not found, the server process will send an inter-process communication (IPC) request to a GCS process in order to ask the master node for that data page (step 2). This results in the server process yielding the CPU and the CPU performing a context switch to potentially re-establish the GCS process on the CPU to process the interrupt. High levels of context switching can be very costly to perform. The GCS process then sends an IP request to the master node for the data block being requested (step 3). Because IP calls are processed in the operating system kernel, the GCS process has to copy the requested information into kernel memory and then execute expensive IP stack calls to push the request to the remote node. Even if an InfiniBand network is being used, Oracle still uses IP over InfiniBand or in some cases Reliable Datagram Sockets (RDS). Use of a socket protocol even over InfiniBand is costly due to processor interrupts, IP stack traversal, etc. Next, the remote master GCS process will receive an interrupt and will be scheduled on the CPU to process the request. It will check to see if any other members have the page in their buffer caches. In this example, no member has the page, so the GCS process will send an IP message back to the requester telling it to read the page from disk (step 4). The GCS process on the requesting node will be interrupted again to process the incoming IP request, and will in turn send an IPC interrupt (step 5) to the server process to inform it that no other node in the cluster has the page. The server process will then read the page from disk into its own buffer cache (step 6).
  • #17 This slide illustrates the advantage of DB2 pureScale for very efficient access to data. A comparison to Oracle RAC will follow in the next section. The steps listed above show you how DB2 pureScale communicates with the CF to declare its intent to access a data page. Steps 2 and 3 are the critical success factors to DB2 pureScale efficiency. That is, when there is a need to communicate with the centralized CF, that communication uses RDMA. Essentially, the process on member 1 writes directly into the memory of the CF with its request. This is done without going through the IP socket stack, without context switching and in many cases without having to yield the CPU (the round trip communication time between the two servers can be as little as 15 microseconds).
  • #20 To dive deeper into the “secret sauce”, let’s look at exactly how the member communicates with the CF. If an agent on Member 1 wants to read a page, that agent will write directly into the memory of the CF, telling the CF exactly what page it wants and even which slot in its buffer pool on Member 1 the page should go into. If the CF does not have the page, it writes a message right into the memory of Member 1 to indicate that it doesn’t have the page. If the CF does have the page, it writes the data page directly into memory on Member 1 without any context switching or IP stack calls.
  • #21 As previously mentioned, the critical success factor for scalability in an active-active cluster is to ensure that when a transaction requests a piece of data, it can get that data with the lowest amount of latency. With DB2 pureScale, by centralizing data that is of interest to more than one member in the cluster, and by accessing that data using RDMA in an interrupt free processing environment, you can see near linear scalability even out to dozens of nodes. More importantly, you do not need to design your application to be cluster aware. There is no need to route transactions that access the same data pages to a single node. In practice, this is not the case with Oracle RAC. There are many stories on the internet, and in published books on Oracle RAC that tell customers to avoid hot pages being passed between nodes by using one of the methods described in the last 4 bullets of the above slide. These methods require costly DBA and application developer interventions, as well as potential application rework as the size of the cluster changes.
  • #22 To demonstrate the scalability of DB2 pureScale, the lab set up a configuration composed of 128 members (note that for server consolidation environments it is possible to put multiple members on an SMP server). A workload was created where the read-to-write ratio is typically 90:10. As well, to prove the scalability of the architecture, the application has no cluster awareness. In fact, the application updates or selects a random row, and therefore every row in the database will be touched by all members in the cluster (we did this to show that locality of data is not as essential for scaling as with other shared disk architectures). The results of this 128-member test show that there is near-linear scaling even out to 128 members in the cluster. Up to 64 members in the cluster, the scalability (compared to the 1-member result) is still above 95%, and at 128 members the scalability was 84%. Note that this is a validation of the architecture and includes some capabilities under development that will not be in the December GA code.
  • #24 The second key feature of DB2 pureScale is the high availability it provides. Again, the secret to its success is the centralized locking and caching. When one member fails, all other members in the cluster can continue to process transactions. The only data that is unavailable is the actual pages that were being updated in flight when the member failed. And if those pages are hot, then they will be in CF memory, which means the recovery of pages needed by other members will be very fast.
  • #25 At a high level, there are 3 things that occur during an instance failure: failure detection; pulling the pages that need to be fixed directly from CF memory; and fixing those pages. In each of these steps, DB2 pureScale has been optimized with the goal of getting these pages fixed and accessible again in under 20 seconds (all the while, the rest of the data in the database is completely available).
  • #26 Failure detection was a large part of the investment that went into DB2 pureScale. Software failure in a DB2 pureScale environment has been architected to be caught in a fraction of a second and to begin driving recovery processing within that second. Hardware failure is a more difficult challenge, but thanks to some innovative techniques, DB2 pureScale has built in a set of algorithms that can detect node failures in as little as 3 seconds without false failovers. When we talk about having the rows available again within 20 seconds of a failure, we mean from the time the failure occurred, not the time the failure was detected. Other vendors may exclude this time to give better numbers, but from an end user’s perspective this time is critical, and so we include it.
  • #27 Here is a detailed walk-through of what happens when a node fails. Run this in slide show mode to see the steps. Note that we call this process “Online Failover” because transactions on other nodes are not impacted in any way (which is different from Oracle RAC, as you will see in the following slides). As well, the data that needs to be fixed will primarily be in memory on the CF, so recovery will be working at memory speeds. In the event of a hardware failure, we take the additional step of automatically fencing off storage access from the failed member to prevent split-brain issues.
  • #28 In Oracle there is a similar set of steps to recover from a failed instance; however, the middle two are where things are very different when compared to DB2 pureScale: node failure detection; global lock remastering; locking pages that need recovery; fixing those pages. The biggest difference is that with DB2 pureScale there is centralized locking, so there is no need to remaster global locks. Also, pureScale does not need to find the pages to lock (it is already aware of the pages that need to be fixed).
  • #29 In Oracle RAC, each data page (called a data block in Oracle) is mastered by one of the instances in the cluster. Oracle employs a distributed locking mechanism and therefore each instance in the cluster is responsible for managing and granting lock requests for the pages that it masters. In the event of a node failure, the data pages for the failed node become momentarily orphaned while RAC goes through a lock redistribution process to assign new ownership of these orphaned pages to the surviving nodes in the cluster. This is called Global Resource Directory (GRD) reconfiguration and while this is occurring, any request to read a page, as well as any request to lock a page, is momentarily frozen. Applications can continue to process on the surviving nodes, however, during this time, they cannot perform any I/O operations or request any new locks. This results in many applications experiencing a freeze as shown in this slide.
  • #30 The second step in the Oracle RAC node recovery process is to lock all the data pages that need recovery. This must be done before the GRD freeze described earlier is released. If an instance was allowed to read a page from disk before the appropriate page locks were acquired, the update from the failed instance could be lost. The recovery instance performs a first pass read of the redo log file from the failed instance and locks any pages that need recovery as shown in Figure 2. This may require a significant number of random I/O operations as the log file, and potentially the pages that need recovery, may not be in the memory of any of the surviving nodes. The GRD freeze is lifted and the stalled applications can continue processing only after all these I/O operations are performed by the recovery instance and the appropriate pages are locked. Depending on the amount of work that the failed node was doing at the time of the failure, this process can take from tens of seconds up to as much as a minute before it completes. This GRD freeze and the fact that I/O operations cannot be performed during this period or new lock requests granted, is documented in several published books on Oracle RAC.
  • #31 In comparison, DB2 pureScale environments require no global freeze in the cluster. The CF is aware at all times which pages would need recovery should any member fail. If a member fails, all other members in the cluster can continue to run transactions and perform I/O operations. Only requests to access pages that need recovery will be blocked while the recovery process cleans up from the failed member as shown on this slide (and the process is likely to happen from memory).