The IBM Ex5 Memory Advantage How Additional Memory Capacity On Ex5 Can Benefit SQL Server Data Warehouse Workloads
Up to 2TB of memory and enterprise-class availability features make eX5 ideal for DWOctober 2010 The IBM eX5 Memory Advantage: How Additional Memory Capacity on eX5 Can Benefit SQL Server Data Warehouse Workloads By Roland Mueller Vinay Kulkarni IBM Systems and Technology Group
eX5 Memory Advantage: How Add’l Capacity Can Benefit SQL Server DW WorkloadsPage 2Executive Overview A Data Warehouse (DW) is a type of database designed to archive data for the purposes of reporting and analysis. Data Warehouses are typically used as the back-end data storage for Decision Support Systems (DSS). A DW environment is different from an OLTP (online transaction processing) database in that it is often subjected to ad hoc queries rather than the predefined queries typical of the OLTP database. Queries run against DW databases often scan millions of rows, whereas typical OLTP queries generally scan a handful of records. The DSS I/O pattern is mostly large sequential reads, compared to small random reads with an OLTP workload. This technical report documents a series of tests that demonstrate the benefits of using large amounts of server memory for a large-scale Decision Support System workload. ® ® For this series of tests, a Microsoft SQL Server 2008 300GB data warehouse was used running on an IBM® System x3850 X5 32-core server connected via Fibre Channel to an IBM System Storage™ DS3400 storage controller. IBM System x3850 X5 The x3850 X5 is built on eX5, the fifth generation of IBM X-Architecture® chipset technology. This new system incorporates the latest in hardware acceleration and scalability providing resource configurations that push the boundaries of x86/x64-based systems while providing industry- leading flexibility. With the release of new multi-core processors from Intel®, the 4-socket IBM System x3850 X5 now offers up to 32 cores or 64 logical processors with Hyper-Threading Technology and up to 1TB of memory per chassis. The memory is highly available, incorporating such IBM OnForever™ features as Memory ProteXion™, Chipkill™ error correction, memory rank sparing, memory scrubbing, and memory mirroring. Other high-availability features, such as hot- swap redundant fans and power supplies (with 220V AC input), and hot-swap HDDs and solid- state drives (SSDs) deliver high levels of reliability and availability to maximize up-time. These kinds of features provide the reliability and high availability needed for the most demanding virtualization requirements. Compared to the previous-generation server, the x3850 X5 doubles the memory capacity to 64 DIMM slots/1TB of PC3-10600 double data rate 3 (DDR3) memory. (Using the optional 1U MAX5 memory expansion unit, the x3850 X5 has access to an additional 32 DIMM slots, for a total of 1.5TB of memory in only 5U.) This paper will demonstrate how leveraging the memory capabilities of the x3850 X5 can take the performance, efficiency and reliability of your Data Warehouse environment to the next level. http://ibm.com/systems/x/hardware/enterprise/x3850x5/index.html IBM System Storage DS3400 DSS workloads require large bulk reads, so it’s important to maximize throughput and minimize, or eliminate, potential bottlenecks that can occur in the storage system or the storage interconnects. The DS3400 was chosen as the appropriate storage platform because it met the requirements for large capacity, high throughput, and the capability to sustain a high IOPS (I/O operations per second) rate. The DS3400 features a 4Gbps Fibre Channel (FC) interface for either direct-attach storage (DAS) or SAN-ready solutions. The combination of 12 SAS or SATA 3.5" drives per enclosure provides the option to scale to 5.4TB of storage capacity with 450GB hot-swappable SAS disks or 12.0TB with 1.0TB hot- swappable SATA disks in the first enclosure. The DS3400 is expandable by attaching up to three additional EXP3000 storage enclosures for a total of 21.6TB of storage capacity with 450GB SAS or up to 48.0TB with 1.0TB SATA.
eX5 Memory Advantage: How Add’l Capacity Can Benefit SQL Server DW WorkloadsPage 3 For more information about the DS3400 System Storage controller, visit the following URL: http://ibm.com/systems/storage/disk/ds3000/ds3400/index.html Microsoft SQL Server 2008 SQL Server 2008 running on Microsoft Windows® Server 2008 enables organizations to build comprehensive, enterprise-scale analytic solutions that deliver intelligence where customers want it. SQL Server 2008 makes a major advance in Data Warehouse scalability. The database engine contains numerous enhancements designed to improve both absolute performance and multi- processor scaling on decision support workloads.Test Environment Decision support workloads are frequently very I/O intensive. The high I/O requirements of these tests were met using DS3400 storage systems connected to the IBM x3850 X5 server using multiple Fibre Channel inter-connects. Figure 1 illustrates the topology of the Data Warehouse workload test environment. The x3850 X5 server was connected to seven DS3400 storage controllers using fourteen 8Gbps Fibre Channel interconnects (two FC links per controller). Note: Although the host bus adapters (HBAs) are designed to operate at 8Gbps, they auto- negotiate to 4Gbps due to the 4Gbps FC interface of the DS3400 storage controllers. Figure 1. Test Configuration Server Configuration The x3850 X5 incorporated four Intel Xeon® X7560 8-core processors (32 cores). All seven PCIe slots were populated with Emulex LPE 12002-M8 8Gb HBAs. The tests were performed with 64GB, 128GB, 256GB, and 512GB of memory.
eX5 Memory Advantage: How Add’l Capacity Can Benefit SQL Server DW WorkloadsPage 4 Storage Configuration Figure 2 illustrates the storage configuration. Four RAID-1 arrays were created on each storage controller. Two logical drives were created on each of the RAID-1 arrays. The 300GB test database as well as SQL Server’s TempDB were spread across the 56 logical drives. The SQL log file was on a separate array. Figure 2. Storage Configuration showing 28 RAID-1 arrays, each with two logical drivesAnalysis Looking at Graph 3, there is only a slight reduction in query response times for most queries when moving from 64GB to 128GB of server memory. Only queries 3 and 6 show a fair amount of improvement. However, when moving from 128GB of memory to 256GB or 512GB, we start seeing the benefits of the additional memory in terms of much reduced response time on all the queries. Queries Used The following queries were executed against the 300GB database: Query 1 (Q1) - Shipping Priority Query This query retrieves the shipping priority and potential revenue of the top 10 orders having the largest revenue among those that had not been shipped as of a given date. Query 2 (Q2) - Local Supplier Volume Query This query lists the revenue volume that resulted from line-item transactions in which the customer ordering parts and the supplier filling them were both within that nation for a particular year.
eX5 Memory Advantage: How Add’l Capacity Can Benefit SQL Server DW WorkloadsPage 5 Query 3 (Q3) - National Market Share Query This query determines how the market share of a given nation within a given region has changed over two years for a given part type. Query 4 (Q4) - Returned Item Reporting Query This query identifies the top 20 customers who might be having problems with the parts that are shipped to them for a given quarter. Query 5 (Q5) - Small-Quantity-Order Revenue Query This query determines how much average yearly revenue would be lost if orders were no longer filled for small quantities of certain parts. This may reduce overhead expenses by concentrating sales on larger shipments. Query 6 (Q6) - Discounted Revenue Query This query reports the gross discounted revenue attributed to the sale of selected parts that were shipped by air or delivered in person.Test Results The following graph shows the comparison of query response times for each of the six queries with 64GB, 128GB, 256GB, and 512GB of server memory. Figure 3. Test results Note: The following results use 64GB as the baseline performance.
eX5 Memory Advantage: How Add’l Capacity Can Benefit SQL Server DW WorkloadsPage 6 Query 1 Results When doubling the memory from 64GB to 128GB we see a slight improvement in query response time. However, when executing the query with 256GB or 512GB, we get a 57% reduction in query response time for both runs. It appears that the dataset for this query was greater than 128GB and less than 256GB, which is why we see similar performance for the 64GB and 128GB runs and then again for the 256GB and 512GB runs. When executing the query with 256GB and 512GB, the entire dataset could fit into memory. In this case, no further reduction in query response time can be realized by increasing memory from 256GB to 512GB. Query 2 Results As in query 1, we see a slight improvement in query response time when doubling memory from 64GB to 128GB, which tells us the dataset was too large to fit into memory. When executing the query with 256GB, we get a 38% reduction in query response time. It appears much of the dataset was able to fit into memory when executing with 256GB. However, not everything fit into memory, because when executing with 512GB, we see a 71% reduction in query response time. Query 3 Results In query 3, we see a 26% reduction in query response time when moving from 64GB to 128GB. When we execute the query with 256GB and 512GB, we get 52% and 59% reductions, respectively. Query 4 Results Query 4 is almost identical to query 1. Query response time remains the same when executing with both 64GB and 128GB, and then drops 65% when memory is increased from 128GB to 256GB (no change with 512GB). Query 5 Results Query 5 displays very linear improvement when more memory is added. When executing the query with 128GB of memory, we see a 19% reduction in query response time. The 256GB run reduces the query response time by 56%. Finally when executed with 512GB of server memory, we see a 93% reduction in query response time. It appears that the dataset was larger than 256GB but smaller than 512GB. The query was able to execute very quickly on the 512GB test run because the entire dataset resided in memory. Query 6 Results Query 6 tells a similar story to query 3. We see a 39% reduction in query response time when executing with 128GB of memory. When the query is executed with 256GB and 512GB, we see a 92% and a 95% reduction, respectively.Conclusion I/O is the slowest component in a Data Warehouse configuration. When you run a query against a database, the server needs to process data. Data is stored on the storage, and reads are issued to gather the dataset. On a system with a small amount of memory, current data in memory needs to be evicted to make room for the dataset. If the evicted data is needed later, it must be retrieved again and again.
eX5 Memory Advantage: How Add’l Capacity Can Benefit SQL Server DW WorkloadsPage 7 On a system with a large amount of memory, more of the dataset can remain in memory, and future references to the dataset will be satisfied at very low latencies, reducing the query response times significantly. If achieving faster query response times is your goal, then one way to achieve that goal is to use enough memory to keep most or all of the dataset in memory. This will let you reduce the amount of storage needed for the same I/O bandwidth, which can offset the cost of adding memory. IBM offers multiple effective solutions with the 4-to-8-socket x3850 X5 that supports up to 1TB of memory natively, up to 1.5TB, using an x3850 X5 and a 1U MAX5 memory expansion unit, and up to 2TB using two x3950 X5 chassis; or the 2-socket x3690 X5 server, which supports up to 512GB natively or up to 1TB using a server and MAX5 unit; or the 2- to-4-socket HX5 blade server, which supports up to 128GB natively, up to 256GB using two HX5 servers; or up to 320GB using one HX5 and one MAX5. The results presented in this paper demonstrate the benefits of running your workload in these large memory configurations.