SAP HANA Performance with Intel Processors



Intel® Xeon® Processor E7-8800/4800/2800 Product Families provide the reference platform for SAP® HANA. SAP HANA benefits from generational platform improvements and new features.

Published in: Technology, Business


  1. SAP HANA® Performance. SAPPHIRE NOW 2013, Orlando. Dietrich O. Banschbach
  2. Legal Disclaimer - Notice. INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: processor numbers are not a measure of performance.
Processor numbers differentiate features within each processor family, not across different processor families: Go to:® AES-NI requires a computer system with an AES-NI enabled processor, as well as non-Intel software to execute the instructions in the correct sequence. AES-NI is available on select Intel® processors. For availability, consult your reseller or system manufacturer. For more information, see computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology (Intel® TXT) requires a computer with Intel® Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more information, visit® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, and virtual machine monitor (VMM). Functionality, performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all operating systems. Consult your PC manufacturer. For more information, visit a system with Intel® Turbo Boost Technology. Intel Turbo Boost Technology and Intel Turbo Boost Technology 2.0 are only available on select Intel® processors. Consult your PC manufacturer. Performance varies depending on hardware, software, and system configuration. For more information, visit product is manufactured on a lead-free process. Lead is below 1000 PPM per EU RoHS directive (2002/95/EC, Annex A). No exemptions required. Halogen-free: Applies only to halogenated flame retardants and PVC in components. Halogens are below 900 ppm bromine and 900 ppm chlorine. Copyright © 2013 Intel Corporation. All rights reserved. Intel, Intel Xeon, the Intel Xeon logo and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others. NOTE: Some configuration details are listed in the notes sections. Please use Notes Page under View to print or PDF.
  3. Legal Disclaimers - Performance. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Intel does not control or audit the design or implementation of third party benchmarks or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmarks are reported and confirm whether the referenced benchmarks are accurate and reflect performance of systems available for purchase. Relative performance is calculated by assigning a baseline value of 1.0 to one benchmark result, and then dividing the actual benchmark result for the baseline platform into each of the specific benchmark results of each of the other platforms, and assigning them a relative performance number that correlates with the performance improvements reported. SPEC, SPECint, SPECfp, and SPECrate are trademarks of the Standard Performance Evaluation Corporation. See for more information. SAP and SAP NetWeaver are the registered trademarks of SAP AG in Germany and in several other countries. See for more information.
  4. Legal Disclaimers - Optimization Notice. Intel® compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include Intel® Streaming SIMD Extensions 2 (Intel® SSE2), Intel® Streaming SIMD Extensions 3 (Intel® SSE3), and Supplemental Streaming SIMD Extensions 3 (Intel® SSSE3) instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804
  5. SAP HANA performance – platform benefits. Intel® Xeon® Processor E7-8800/4800/2800 Product Families provide the reference platform for SAP® HANA. SAP HANA benefits from generational platform improvements and new features. Examples:
       • Intel® Turbo Boost Technology: Increases performance by increasing processor frequency and enabling faster speeds when conditions allow
       • Intel® Hyper-Threading Technology: Increases performance for threaded applications, delivering greater throughput and responsiveness
       • Up to 10 cores and 20 threads; 30 MB of on-die cache
       • Up to 2 TB of DDR3 memory on a 4-socket system using 32 GB DIMMs
  6. Modular Platform Drives Innovation. Wide Range of Xeon® E7-8800/4800/2800 based Platforms Brought to Market. Huge variety of systems available for optimized choice.
       • Configurations shown: 2-socket; 2+2 (4S); 2+2+2+2 (8S); 4S (64 DIMMs); 4S (32 DIMMs); 4+4 (8S); additional configurations via OEM-specific scaling technology, 2+2+… (up to 256s)
       • Diagram legend: Xeon® CPU socket; memory; I/O hub; Intel QuickPath Interconnect; 3rd-party node controller (non-Intel); OEM interconnect
       • * Other names and brands may be claimed as the property of others.
  7. Intel Engineering Engagement. Co-Engineering since 2005 (TREX → BWA → HANA).
       • Intel engineers optimized HANA for Xeon E7
       • Adaptation of state-of-the-art microprocessor ISA extensions like SSE4.x
       • Decompression and search benefit from a 60-100% speedup
       • 7x faster hash function
       • 3.5x faster implementation of bit-vector operations
       • Intel Decimal Floating-Point Library
       • Pre-enabling of Haswell new instructions
       • Integrated Intel VTune™ APIs for deep code analysis
       • Scalability improvements for various usage scenarios
       • Great scalability on 8-socket glue-less reference design
       • Chart labels: Function; Bit-Vector Optimizations
       • Chart: SAP HANA - Scalability (speed-up vs. # of sockets, both axes 0 to 8, with a perfect-scaling reference line)
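To make the term "bit-vector operations" on this slide concrete, here is a minimal sketch in Python. It is not HANA code and says nothing about how the 3.5x speedup was achieved; the column names and data are hypothetical. It only shows the general idea: each filter predicate yields one bit per row, and predicates are combined with a single bitwise AND instead of a row-by-row comparison. SSE registers apply the same kind of operation to 128 bits at a time.

```python
def to_bitvector(flags):
    """Pack a list of booleans into an int, bit i = row i."""
    bits = 0
    for i, flag in enumerate(flags):
        if flag:
            bits |= 1 << i
    return bits


def matching_rows(bits):
    """Return the row indices whose bit is set."""
    rows = []
    index = 0
    while bits:
        if bits & 1:
            rows.append(index)
        bits >>= 1
        index += 1
    return rows


# Hypothetical predicate results for six rows of a table.
region_is_emea = to_bitvector([True, False, True, True, False, True])
year_is_2013 = to_bitvector([True, True, False, True, False, False])

# One AND combines both predicates for all rows at once.
both = region_is_emea & year_is_2013
hit_count = bin(both).count("1")  # population count of the result

print(matching_rows(both), hit_count)
```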
  8. Scanning at the Speed of Light
       • Optimized SSE routines can scan 2B symbols/s on 1 core
       • A 4-way Intel® Xeon® processor E7-4870 system can scan 50B symbols/s:
       • A fast typist can type 400 chars/min (world record: 542)
       • You need 7.4B typists to type as fast as scanning the data.
       • World population is 7.0B.
       • Assume a receipt with 5 mm per line
       • 1 system can scan 247,939 km/s of receipts.
       • Speed of light is 299,792 km/s.
       • Source: Intel internal measurements
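The arithmetic behind this slide can be checked with a short sketch. It rests on one assumption not stated on the slide: each scanned symbol corresponds to one receipt line. Under that assumption, the quoted 247,939 km/s implies an unrounded scan rate of about 49.6B symbols/s, which also reproduces the 7.4B typists figure; the rounded 50B symbols/s would give 7.5B.

```python
# Figures quoted on the slide.
ROUNDED_SCAN_RATE = 50e9       # symbols per second, 4-way E7-4870 system
TYPIST_CHARS_PER_MIN = 400     # fast typist
LINE_HEIGHT_MM = 5             # one receipt line
RECEIPT_KM_PER_S = 247_939     # receipt length scanned per second
LIGHT_KM_PER_S = 299_792       # speed of light

MM_PER_KM = 1_000_000


def typists_needed(symbols_per_second):
    """Typists required to match a scan rate at 400 chars/min each."""
    return symbols_per_second * 60 / TYPIST_CHARS_PER_MIN


# Assumption: one symbol per receipt line, so lines/s equals symbols/s.
implied_scan_rate = RECEIPT_KM_PER_S * MM_PER_KM / LINE_HEIGHT_MM

typists_rounded = typists_needed(ROUNDED_SCAN_RATE)
typists_implied = typists_needed(implied_scan_rate)
fraction_of_light_speed = RECEIPT_KM_PER_S / LIGHT_KM_PER_S

print(f"implied scan rate: {implied_scan_rate / 1e9:.2f}B symbols/s")
print(f"typists (rounded rate): {typists_rounded / 1e9:.2f}B")
print(f"typists (implied rate): {typists_implied / 1e9:.2f}B")
print(f"fraction of light speed: {fraction_of_light_speed:.3f}")
```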
  9. Beyond Performance. SAP HANA implements recovery routines that allow the database to survive uncorrectable memory errors in many cases. First Machine Check Architecture Recovery in Xeon®-based Systems. MCA needs to be supported by the OS and the application. Previously seen only in RISC, mainframe, and Itanium-based systems.
       • Diagram: memory subsystem (REG, DRAM, DDR3, SMB, SMI) with the error-handling flow below
       • Normal status with error prevention: HW correctable errors; error corrected
       • Error detected*: Patrol Scrubber scans memory for errors
       • Un-correctable error, error contained: bad memory location flagged so data will not be used by OS or applications
       • Error information passed to OS
       • System recovery with OS: system works in conjunction with OS to recover or restart processes and continue normal operation
       • Error information passed to IMDB
       • SAP HANA recovery: SAP HANA is enabled to analyze OS error signals and execute its own recovery routines
       • *Errors detected using Patrol Scrub or Explicit Write-back from cache
  10. What's next? Brickland-EX Platform & Intel® Xeon® processor E7-8800/4800/2800 v2 product families (codenamed 'Ivy Bridge-EX')
       • 22nm process technology
       • Die shrink of Sandy Bridge microarchitecture
       • Future microarchitecture after Boxboro-EX/Westmere-EX
       • Support of PCIe* 3.0
       • IVB-EX will include new advanced reliability features
       • MCA Recovery Execution Path, MCA I/O, Enhanced MCA Gen 1, PCIe* Live Error Recovery
       • IVB-EX will triple the memory capacity
       • Up to 6 TB in 4S system; up to 12 TB in 8S system
       • Support for up to 24 DIMMs per socket; DDR3 memory (RDIMM, LRDIMM) up to 64 GB (LRDIMMs) density
       • IVB-EX on track for production in Q4 2013
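The "triple the memory capacity" claim follows from the DIMM counts and sizes quoted in the deck, as this small sketch shows. The figure of 16 DIMMs per socket for the current generation is an assumption derived from the "4S (64 DIMMs)" configuration on slide 6; the function name is illustrative.

```python
GB_PER_TB = 1024


def capacity_tb(sockets, dimms_per_socket, dimm_gb):
    """Maximum memory in TB for a fully populated system."""
    return sockets * dimms_per_socket * dimm_gb / GB_PER_TB


# Current generation: 4 sockets, 64 DIMMs total (assumed 16 per socket),
# 32 GB DIMMs, as on slides 5 and 6.
current_4s = capacity_tb(sockets=4, dimms_per_socket=16, dimm_gb=32)

# IVB-EX: up to 24 DIMMs per socket, 64 GB LRDIMMs.
ivb_4s = capacity_tb(sockets=4, dimms_per_socket=24, dimm_gb=64)
ivb_8s = capacity_tb(sockets=8, dimms_per_socket=24, dimm_gb=64)

print(f"current 4S: {current_4s:.0f} TB")
print(f"IVB-EX 4S:  {ivb_4s:.0f} TB ({ivb_4s / current_4s:.0f}x)")
print(f"IVB-EX 8S:  {ivb_8s:.0f} TB")
```

The factor of three comes from 1.5x more DIMM slots per socket (24 vs. 16) combined with 2x larger DIMMs (64 GB vs. 32 GB).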
  11. Intel® Xeon® Processor E7-8800/4800/2800 Product Families: Key Performance Claims Backup. Performance claims as of 15 February 2011:
       1. Generational: Up to 40% generational compute-intensive throughput claim based on SPECint*_rate_base2006 benchmark comparing next generation Intel® Xeon® processor E7-4870 (30M cache, 2.40GHz, 6.40GT/s Intel® QPI, formerly codenamed Westmere-EX) scoring 1,010 (includes Intel Compiler XE2011 improvements accounting for about 11% of the performance boost) to X7560 (24M cache, 2.26GHz, 6.40GT/s Intel QPI, formerly codenamed Nehalem-EX) scoring 723 (Intel Compiler 11.1). Source: Intel SSG TR#1131.
       2. Scalability: Up to 2.8x scaling transaction improvement claim based on internal OLTP benchmark comparing next generation Intel® Xeon® processor E7-4870 (30M cache, 2.40GHz, 6.40GT/s Intel® QPI, formerly codenamed Westmere-EX) scoring 2.73M transactions (leading database vendor) to X5680 (12M cache, 3.33GHz, 6.40GT/s Intel QPI, formerly codenamed Westmere-EP) scoring 970K transactions. Source: Intel SSG TR#1120.
       3. Consolidation: "Up to 29:1 server consolidation performance with return on investment in less than one year" claim estimated based on comparison between 4S MP Intel® Xeon® processor 3.33GHz (single-core with Intel® Hyper-Threading Technology, 8M LLC cache, 3.33GHz, 800MHz FSB, formerly codenamed Potomac) and 4S Intel® Xeon® processor E7-4870 (30M cache, 2.40GHz, 6.4GT/s Intel® QPI, formerly codenamed Westmere-EX) based servers. "Up to 18:1 server consolidation performance with return on investment in about 14 months" claim estimated based on comparison between 4S MP Intel® Xeon® processor 7041 (dual-core with Intel® Hyper-Threading Technology, 4M cache, 3.00GHz, 800MHz FSB, formerly codenamed Paxville) and 4S Intel® Xeon® processor E7-4870 (30M cache, 2.40GHz, 6.4GT/s Intel® QPI, formerly codenamed Westmere-EX) based servers. Calculation includes analysis based on performance, power, cooling, electricity rates, operating system annual license costs and estimated server costs. This assumes 42U racks, $0.10 per kWh, cooling costs are 2x the server power consumption costs, operating system license cost of $900/year per server, per server cost of $41,523 based on averaged estimated list prices, and estimated server utilization rates. All dollar figures are approximate. Estimated SPECint*_rate_base2006 performance and power results are measured for Intel® Xeon® processor E7-4870 and estimated for Intel Xeon processor 3.33GHz single-core / 7041 dual-core based servers. Platform power was measured during the steady state window of the benchmark run and at idle. Performance gain compared to baseline was 29x for single-core and 18x for dual-core (truncated).
          * Baseline single-core platform (measured score of 34.1; idle = 480W; active = 780W): Intel server with four MP Intel® Xeon® processor 3.33GHz processors, 16GB memory (8x 2GB DDR2-400), 1 hard drive, 1 power supply, Microsoft Windows Server* 2008 Enterprise x64 Edition R2 operating system, Intel Compiler 11 built SPECcpu* 2006 November 2009 binaries. Estimated result.
          * Baseline dual-core platform (estimated score of 54.6; idle = 546W; active = 812W): Intel server with four Intel® Xeon® processor 7041 processors, 32GB memory (16x 2GB DDR2-400), 1 hard drive, 1 power supply, Microsoft Windows Server* 2008 Enterprise x64 Edition R2 operating system, Intel Compiler 11 built SPECcpu* 2006 November 2009 binaries. Estimated result.
          * New platform (measured score of 1,000; idle = 552W; active = 1053W): Intel internal reference server with four Intel® Xeon® processor E7-4870 (30M cache, 2.40GHz, 6.40GT/s Intel® QPI), 256GB memory (64 x Samsung 4GB 2Rx8 PC3L-10600R), 1 hard drive, 3 power supplies, using SUSE* Linux Enterprise Server 11 operating system, Intel C++ and Fortran Composer XE2011 built SPECcpu* 2006 January 2011 binaries. Source: Intel SSG TR#1131.
       4. Flexible Virtualization: Up to 25% better virtual machine performance claim based on SPECvirt_sc2010 benchmark comparing next generation Intel® Xeon® processor E7-4870 (30M cache, 2.40GHz, 6.40GT/s Intel® QPI, formerly codenamed Westmere-EX) scoring 2,540 @ 162VMs to X7560 (24M cache, 2.26GHz, 6.40GT/s Intel QPI, formerly codenamed Nehalem-EX) scoring 2,024 @ 126VMs. Source: Intel SSG TR#1118.
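The headline figures in these claims can be reproduced from the quoted scores using the relative-performance definition given on slide 3 (baseline set to 1.0, each result divided by the baseline result). The sketch below is only a check of that arithmetic; the function and variable names are illustrative, and it does not model the cost or power side of the consolidation estimate.

```python
import math


def relative_performance(result, baseline):
    """Result expressed as a multiple of the baseline (baseline = 1.0)."""
    return result / baseline


# Claim 1: SPECint*_rate_base2006, E7-4870 vs. X7560.
generational = relative_performance(1010, 723)

# Claim 2: internal OLTP benchmark, E7-4870 vs. X5680 (transactions).
scalability = relative_performance(2_730_000, 970_000)

# Claim 3: new platform score of 1,000 vs. the two baseline platforms.
# The slide states that these ratios are truncated.
consolidation_single_core = math.floor(relative_performance(1000, 34.1))
consolidation_dual_core = math.floor(relative_performance(1000, 54.6))

# Claim 4: SPECvirt_sc2010, E7-4870 vs. X7560.
virtualization = relative_performance(2540, 2024)

print(f"generational:   +{(generational - 1) * 100:.1f}%")
print(f"scalability:    {scalability:.2f}x")
print(f"consolidation:  {consolidation_single_core}:1 and "
      f"{consolidation_dual_core}:1")
print(f"virtualization: +{(virtualization - 1) * 100:.1f}%")
```

The generational ratio comes out just under 1.40 and the virtualization ratio just over 1.25, which matches the "up to 40%" and "up to 25%" wording.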
  12. Orders of magnitude