Solving io bottleneck


Published on

Solving IO Bottleneck
IMEX Research

Published in: Technology, Business
1 Like
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Solving io bottleneck

  1. 1. IMEX RESEARCH.COM Solving the IO Bottleneck in NextGen DataCenters & Cloud Computing Are SSDs Ready for Enterprise Storage Systems Anil Vasudeva, President & Chief Analyst, IMEX Research © 2007-11 IMEX Research All Rights Reserved Copying Prohibited Please write to IMEX for authorization Anil Vasudeva President & Chief Analyst IMEX RESEARCH.COM 408-268-0800© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  2. 2. Abstract IMEX RESEARCH.COM Solving the I/O Bottleneck in NextGen Data Centers & Cloud Computing Virtualization brings tremendous advantages to data centers of all sizes – large and small. But the rapid proliferation of virtual machines (VMs) per physical server, in its wake, creates a highly randomized I/O problem, raising performance bottlenecks. To address this I/O issue, the IT industry and storage administrators in particular, have begun the adoption of a slew of newer technology solutions - ranging from faster and larger memories in cache, SSDs for higher IOPS cost effectively, high bandwidth (10GbE) networks using NPIVs for virtualized networks between VMs and shared storage systems along with embedded intelligence software to optimize various VM workloads – all in order to meet various SLA metrics of performance, availability, cost etc.. Are you aware of the side-effects that get created from Server Virtualization and prepared to ask pertinent questions of your suppliers of IT Infrastructure Equipment, Storage Virtualization and Data Storage Management software as well as in their implementation to achieve targeted results in performance, availability, scalability, interoperability and data management in their virtualized data centers This presentation provides an illustrative view of the impact of Server Virtualization on existing storage I/O solutions and best practices. It delineates the roles, capabilities and cost effectiveness of emerging technologies in mitigating the I/O bottlenecks so the IT infrastructure implementers can achieve their targeted performance under various workloads, from their storage systems in virtualized data centers. 2© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  3. 3. Agenda IMEX RESEARCH.COM • Data Centers & Cloud Infrastructure • Cloud Computing Architecture • Performance Metrics by Workload • Anatomy of Data Access • Data Center Performance Bottlenecks • Improving Query Response Time in OLTP • Role of SSD in Improving I/O Perf. Gap • SCM: A New Storage Class Memory SSDs • Price Erosion & IOPS/GB • Choosing SSD vs. Memory to Improve TPS • New Storage Usage Hierarchy in NGDC & Clouds • IO Bottleneck Mitigation in Virtualized Servers • I/O Forensics for Auto Storage-Tiering • Apps Benefitting from Improved I/O • Key Takeaways • Acknowledgements 3© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  4. 4. Data Centers & Cloud Infrastructure IMEX RESEARCH.COM Public CloudCenter © Enterprise VZ Data Center On-Premise Cloud Vertical Clouds Servers VPN Switches: Layer 4-7, IaaS, PaaS Layer 2, 10GbE, FC Stg SaaS Supplier/Partners ISP ISP Internet ISP FC/ IPSANs ISP Core Optical ISP ISP Edge Caching, Proxy, Database Servers, Remote/Branch Office ISP FW, SSL, IDS, DNS, Middleware, Data LB, Web Servers Mgmt Tier-1 Application Servers Tier-3 Edge Apps HA, File/Print, ERP, Data Base Web 2.0 SCM, CRM Servers Servers Social Ntwks. Cellular Facebook, Tier-2 Apps Twitter, YouTube… Cable/DSL… Directory Security Policy Management Wireless Home Networks Middleware PlatformSource:: IMEX Research - Cloud Infrastructure Report © 2009-11 4© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  5. 5. IT Industry’s Journey - Roadmap IMEX RESEARCH.COM Cloudization SIVAC ®IMEX On-Premises > Private Clouds > Public Clouds DC to Cloud-Aware Infrast. & Apps. Cascade migration to SPs/Public Clouds. Automation Automatically Maintains Application SLAs (Self-Configuration, Self-Healing©IMEX, Self-Acctg. Charges etc) Virtualization Pools Resources. Provisions, Optimizes, Monitors Shuffles Resources to optimize Delivery of various Business Services Integration/Consolidation Integrate Physical Infrast./Blades to meet CAPSIMS ®IMEX Cost, Availability, Performance, Scalability, Inter-operability, Manageability & Security Standardization Standard IT Infrastructure- Volume Economics HW/Syst SW (Servers, Storage, Networking Devices, System Software (OS, MW & Data Mgmt SW)Source:: IMEX Research - Cloud Infrastructure Report © 2009-11 5© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  6. 6. Cloud Computing Architecture IMEX RESEARCH.COM Cloud Computing Private Cloud Hybrid Public Cloud Service Enterprise Cloud Providers SaaS Applications SaaS SLA SLA SLA SLA SLA App App App App SLA SLA App SLA SLA SLA App App App App App Platform Tools & Services Management Python Ruby .Net EJB PHP ….. ….. PaaS Operating Systems Virtualization IaaS Resources (Servers, Storage, Networks) Application’s SLA dictates the Resources Required to meet specific requirements of Availability, Performance, Cost, Security, Manageability etc.© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  7. 7. Performance Metrics by Workload IMEX RESEARCH.COM 1000 K OLTP Transaction Processing eCommerce 100 K Business IOPS* (*Latency-1) (RAID - 1, 5, 6) Data Intelligence (RAID - 0, 3) Warehousing 10K OLAP 1K Scientific Computing HPC Imaging TP 100 Audio Web 2.0 HPC Video 10 1 5 10 50 100 500 *IOPS for a required response time ( ms) MB/sec *=(#Channels*Latency-1)Source:: IMEX Research - Cloud Infrastructure Report © 2009-11© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  8. 8. Anatomy of Data Access IMEX RESEARCH.COM Performance For the time it takes to do each Disk Operation: r r I/O r ve sso Gap - Millions of CPU Operations can be done Se oce - Hundreds of Thousands of Memory Operations Pr can be accomplished /O Disk I 1980 1990 2000 2010 Anatomy of Data Access Time taken by CPU, Memory, Network, Disk for a typical I/O Operation during a Data Access A 7.2K/15k rpm HDD can do 100/140 IOPS© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  9. 9. Data Center Performance Bottlenecks IMEX RESEARCH.COM Clients Windows, User Bottlenecks Applications Linux, Unix Connectivity Timeouts, Excessive Locking Workload Surges Data Contention I/O Delays/Errors LAN Access Networks Network I/O Network Congestion Servers Server Bottlenecks Dropped packets Web Servers Lack of Srvr Power Application Servers Data Retransmissions Database Servers IO Wait & Queuing CPU Timeouts Overhead I/O Timeouts Component Failures Storage I/O Access Storage I/O Connect Lack of Bandwidth Storage Overloaded PCIe Connect Web, Application, Device Bottlenecks Storage Device Contention Database Device I/O Hotspots Cache Flush Lack of Storage Capacity© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  10. 10. Improving Query Response Time in OLTP IMEX RESEARCH.COM 8 Query Response Time (ms) HDDs HDDs Hybrid 6 14 Drives 112 Drives HDDs $$ w short 36 Drives stroking + SSDs SSDs 4 $$$$$$$$ $$$$ 12 Drives $$$ 2 Conceptual Only -Not to Scale 0 0 10,000 20,000 30,000 40,000 IOPS (or Number of Concurrent Users) • Improving Query Response Time • Cost effective way to improve Query response time for a given number of users or servicing an increased number of users at a given response time is best served with use of SSDs or Hybrid (SSD + HDDs) approach, particularly for Database and Online Transaction ApplicationsSource: IMEX Research SSD Industry Report 2011 © 10© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  11. 11. Role of SSD in Improving I/O Perf. Gap IMEX RESEARCH.COM Price CPU SDRAM $/GB DRAM getting NOR DRAM Faster (to feed faster CPUs) & Larger (to feed Multi-cores & Multi-VMs from Virtualization) NAND SCM PCIe SSD Servers HDD SSD segmenting into SATA PCIe SSD Cache Tape SSD - as backend to DRAM & SATA SSD Hybr. Storage - as front end to HDD HDD becoming Cheaper, not faster Source: IMEX Research SSD Industry Report 2011 © Performance I/O Access Latency 11© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  12. 12. SCM: A New Storage Class Memory IMEX RESEARCH.COM • SCM (Storage Class Memory) Solid State Memory filling the gap between DRAMs & HDDs Marketplace segmenting SCMs into SATA and PCIe based SSDs • Key Metrics Required of Storage Class Memories Device - Capacity (GB), Cost ($/GB), • Performance - Latency (Random/Block RW Access-ms); Bandwidth BW(R/W- GB/sec) • Data Integrity - BER (Better than 1 in 10^17) • Reliability - Write Endurance (30K PE Cycles No. of writes before death); • - Data Retention (5 Years); MTBF (2 millions of Hrs), • Environment - Power Consumption (Watts); • Volumetric Density (TB/; Power On/Off Time (sec), • Resistance - Shock/Vibration (g-force); Temp./Voltage Extremes 4-Corner (oC,V); Radiation (Rad) 12© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  13. 13. SSDs - Price Erosion & IOPS/GB IMEX RESEARCH.COM 600 8 HDD B SS Ds IO PS/G M) it s( D Un Units (Millions) SS IOPS/GB 300 4 Enter prise HDD Units (M) SSD IOPS/GB HDDs 0 0 2009 2010 2011 2012 2013 2014 Note: 2U storage rack, • 2.5” HDD max cap = 400GB / 24 HDDs, de-stroked to 20%, • 2.5” SSD max cap = 800GB / 36 SSDs Source: IMEX Research SSD Industry Report ©2011© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  14. 14. IMEXChoosing SSD vs. Memory to Improve TPS RESEARCH.COM 2500 2000 1500 TPS 4.0 x 2.5 x 1000 Ie SS D PC 500 A TA 0 4.0 x S SD S A ID 1 D R 2.5 x HD 0 0 2 4 6 8 10 12 14 16 18 20 Buffer Pool (Memory) GB© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  15. 15. Data Storage Usage Patterns –Data Access vs. Age of Data IMEX RESEARCH.COM 80% of IOPs 80% of TB Performance Scale Data Protection Cost Data Reduction Data Access Storage Growth SSDs 1 Day 1 Week 1 Month 2 Mo. 3 Mo. 6 Mo. 1 Year 2 Yrs Age of DataSource:: IMEX Research - Cloud Infrastructure Report © 2009-11© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  16. 16. New Storage Hierarchy in NGDC & Clouds IMEX RESEARCH.COM I/O Access Frequency vs. Percent of Corporate Data 95% 75% Cloud 65% FCoE/ Storage SAS SATA Arrays % of I/O Accesses • Back Up Data SSD • Tables • Archived Data • Logs • Indices • Offsite DataVault • Journals • Hot Data • Temp Tables • Hot Tables 1% 2% 10% 50% 100% % of Corporate DataSource:: IMEX Research - Cloud Infrastructure Report © 2009-11 16© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  17. 17. IO Bottleneck Mitigation in Virtualized Servers IMEX RESEARCH.COM vSphere ESX VM Client 1 VM Client 2 VM Client n XCL Mgr. I/O XCL I/O XCL I/O XCL Reg. Driver Reg. Driver Reg. Driver ESX Kernel XCL VLUN Disk Controller. SSD Driver Offloading IOPS from Primary Storage Primary SSD Drive Both Applications & Storage Storage Run Faster w/ESX Driver© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  18. 18. I/O Forensics for Auto Storage-Tiering IMEX RESEARCH.COM Storage-Tiered Virtualization Storage-Tiering at LBA/Sub-LUN Level Physical Storage Logical Volume SSDs Hot Data Arrays Cold Data HDDs Arrays LBA Monitoring and Tiered Placement • Every workload has unique I/O access signature • Historical performance data for a LUN can identify performance skews & hot data regions by LBAs Source: IBM & IMEX Research SSD Industry Report 2011 ©IMEX 2010-11 18© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  19. 19. Apps Benefitting from Improved I/O IMEX RESEARCH.COM Smart Mobile Commercial Bioinformatics Decision Support Entertainment- Devices Visualization & Diagnostics Bus. Intelligence VoD / U-Tube Data: IMEX Research & Panasas 19 Instant On Boot Ups Rendering (Texture & Polygons) Data Warehousing Most Accessed Videos Rugged, Low Power Very Read Intensive, Small Block I/O Random IO, High OLTPM Very Read Intensive 1GB/s, __ms 10 GB/s, __ms 1GB/s, __ms 4 GB/s, __ms© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.
  20. 20. Key Takeaways IMEX RESEARCH.COM • Solving I/O Problems • I/O Bottlenecks occur at multiple places in the Compute Stack, the largest being at Storage I/O • SSD comes out cheaper/IOP for IO Intensive Apps • To get of Reads – Improve Indexing, archive out old data • Minimize the impact of writes – Get rid of temp tables/filesorts on slow disks. • Compress big varchar/text/blobs • Data Forensics and Tiered Placement • Every workload has unique I/O access signature • Historical performance data for a LUN can identify performance skews & hot data regions by LBAs • Use Smart Tiering to identify hot LBA regions and non-disruptively migrate hot data from HDD to SSDs. • Typically 4-8% of data becomes a candidate and when migrated to SSDs can provide response time reduction of ~65% at peak loads© 2010‐11  IMEX Research, Copying prohibited. All rights reserved.