IMEX Research - Is Solid State Storage Ready for Enterprise & Cloud Computing Systems
    Presentation Transcript

    • IMEX RESEARCH.COM: Is Solid State Storage Ready for Enterprise & Cloud Computing Systems? (Are SSDs Ready for Enterprise Storage Systems?) Anil Vasudeva, President & Chief Analyst, IMEX Research. imex@imexresearch.com, 408-268-0800. © 2010-11 IMEX Research. All rights reserved. Copying prohibited; please write to IMEX for authorization.
    • Abstract: Are SSDs Ready for Enterprise Storage Systems? Computer architects dream of storage devices that provide their applications and workloads with very high IOPS at minimal cost (IOPS/$/GB) and fast access (low latency). "Enterprise-ready" SSDs have started to fulfill that promise as they segment into SATA- and PCIe-based storage products. A major factor in their quick adoption has been the advent of new controllers and firmware that transparently mitigate early issues with reliability, endurance, data retention, performance, ease of management, and interoperability with existing storage interfaces. But their real success in enterprise adoption comes from automated storage tiering: monitoring workload I/O access signatures and behavior over time, then non-disruptively migrating hot data to SSDs, resulting in over 200% improvement in IOPS and 80% improvement in response time at peak loads. The presentation provides an overview of SSD technology, its storage characteristics, and the applications that benefit most from its usage. It also covers techniques for workload optimization using automated smart tiering, system implementation in enterprise storage systems, and the economics of SSD usage in real life. It illustrates how an optimally selected hybrid of SSDs and HDDs can achieve 65% lower TCO and 165% lower footprint while delivering an 800% improvement in $/IOPS in SANs and other storage systems under different scenarios.
    • Agenda
      - IT DataCenter & Cloud Infrastructure Roadmap
      - Storage Usage Patterns: Issues & Requirements
      - NextGen SSDs for Enterprise Storage Systems
      - Enterprise SSD Market/Product Segments by Interfaces
      - SSDs vs. HDDs vs. Hybrids: Price/Performance/Availability
      - SLC vs. MLC SSDs: Technologies, Drivers & Challenges
      - New Intelligent Controllers: Key for SSD Adoption
      - AutoSmart Storage-Tiering Software: Usage & Impact
      - Applications Best Suited for SSDs
      - Key Takeaways
    • IT DataCenters & Cloud Infrastructure: topology diagram of the enterprise data center (on-premise cloud), public clouds (IaaS, PaaS, SaaS, vertical clouds), and the edge: servers, Layer 2 and Layer 4-7 switches, 10GbE/FC, FC/IP SANs, core/edge ISPs, caching/proxy, firewall/SSL/IDS/DNS, load balancers, and web servers; Tier-1 application servers (ERP, SCM, CRM, database), Tier-2 apps (HA, file/print, Web 2.0 and social networks such as Facebook, Twitter, YouTube), and Tier-3 edge apps for remote/branch offices, cellular, cable/DSL, and wireless home networks, with directory, security, policy, management, middleware, platform, and data layers underneath. Data Source: IMEX Research Cloud Infrastructure Report ©2009-11.
    • IT Industry's Journey: Roadmap (SIVAC ®IMEX)
      - Cloudization: on-premises > private clouds > public clouds; DC to cloud-aware infrastructure and apps; cascade migration to SPs/public clouds.
      - Automation: automatically maintains application SLAs (self-configuration, self-healing, self-accounting charges, etc.).
      - Virtualization: pools resources; provisions, optimizes, and monitors; shuffles resources to optimize delivery of various business services.
      - Integration/Consolidation: integrate physical infrastructure/blades to meet CAPSIMS ®IMEX (Cost, Availability, Performance, Scalability, Interoperability, Manageability & Security).
      - Standardization: standard IT infrastructure with volume economics in HW and system SW (servers, storage, networking devices, OS, middleware & data management SW).
      Data Source: IMEX Research Cloud Infrastructure Report ©2009-11
    • Market Segments by Applications: chart of IOPS* (10 to 1,000K) vs. MB/sec, mapping workloads: OLTP/transaction processing and eCommerce (RAID 1, 5, 6) at the high-IOPS end; business intelligence, data warehousing, and OLAP (RAID 0, 3) at high bandwidth; scientific computing, HPC imaging, audio/video, and Web 2.0 along the bandwidth axis. *IOPS for a required response time (ms); IOPS = #Channels × Latency⁻¹. Data Source: IMEX Research Cloud Infrastructure Report ©2009-11.
    • Corporate DataCenter Storage Usage: chart of I/O access frequency vs. percent of corporate data. The hottest 1-2% of data draws roughly 95% of I/O accesses (SSD and cache: hot data, hot tables, logs, journals, temp tables); the first ~10% draws ~75% (disk arrays: tables, indices); and the cold bulk sits on tape libraries (backup data, archived data, offsite data vault). Data Source: IMEX Research Cloud Infrastructure Report ©2009-11.
    • Cloud MegaDataCenter Storage Usage: the same I/O access frequency vs. percent-of-data curve for cloud storage. SSDs serve the hottest few percent (hot data, hot tables, logs, journals, temp tables); FCoE/SAS cloud storage arrays serve the next tier (tables, indices); and SATA arrays hold the cold bulk (backup data, archived data, offsite data vault). Data Source: IMEX Research Cloud Infrastructure Report ©2009-11.
    • Data Storage Usage: Access & Longevity: chart of data access vs. age of data (1 day to 2 years). Roughly 80% of IOPS land on young, hot data, where performance at scale matters and SSDs fit best; roughly 80% of TB is older data, where data protection, data reduction, and cost dominate as storage grows. Data Source: IMEX Research Cloud Infrastructure Report ©2009-11.
    • Fueling New Markets: Consumer & Enterprise: chart of NAND component density (Gb) and $/GB trends, 2002-2012, with SSDs in the 64-300 GB range opening consumer and enterprise markets. Source: Samsung & IMEX Research.
    • Enterprise SSD Trends: Cost (chart: NAND $/GB price erosion, 1990-2015)
      - Driven by an explosion in the use of cost-sensitive handheld mobile devices, MLC NAND has seen explosive growth.
      - On the enterprise side, clustered low-cost servers used in environments from databases to BI to HPC applications, along with demand from cloud service providers, are driving overall growth of 107% CAGR in computing SSD GB.
      - SSD units are forecast to grow at 86% CAGR during the 2010-14 time frame.
      Source: IBM Journal R&D
    • SSDs Filling Price/Performance Gaps in Storage (chart: price in $/GB vs. performance/I/O access latency, spanning CPU, SDRAM, DRAM, NOR, NAND, PCIe SSD, SATA SSD, HDD, and tape)
      - DRAM is getting faster (to feed faster CPUs) and larger (to feed multi-core CPUs and multiple VMs under virtualization).
      - Storage-class memory (SCM) SSDs are segmenting into PCIe SSDs as a back end to DRAM and SATA SSDs as a front end to HDDs, including SSD caches.
      - HDDs are becoming cheaper, not faster.
      Source: IMEX Research SSD Industry Report ©2011
    • SCM: A New Storage Class Memory
      - SCM (storage class memory) is solid-state memory filling the gap between DRAM and HDDs; the marketplace is segmenting SCMs into SATA- and PCIe-based SSDs.
      - Key metrics required of SCM devices:
        - Capacity (GB) and cost ($/GB)
        - Performance: latency (random/block R/W access, ms); bandwidth (R/W, GB/sec)
        - Data integrity: BER (better than 1 in 10^17)
        - Reliability: write endurance (number of writes before death); data retention (years); MTBF (millions of hours)
        - Environment: power consumption (W); volumetric density (TB/cu.in.); power on/off time (sec)
        - Resistance: shock/vibration (g-force); 4-corner temperature/voltage extremes (°C, V); radiation (rad)
    • Advantage: SSD vs. HDD in the Enterprise
      - Required enterprise specs: functional failures <= 3%, UBER <= 10^-16.
      - Reliability: MTBF 2.1M hours (SSD) vs. 1.0M hours (HDD); annual failure rate <<3% vs. ~4%; shock resistance 8x better and vibration resistance 16x better; spec'd at a 2x wider operating temperature range.
      - Performance: R/W speed 5x faster; IOPS 475% better; concurrent access 900% better; noise 30 dB lower.
      - Space and weight: for the same IOPS, fewer frames, switch ports, controllers, cables, and power supplies, saving office-space $/sq.ft; 50% less weight.
      - Power: 92% less power and 38% lower temperature. Idle: 0.5 W vs. 6.8 W (93% less), surface temp 85 vs. 136 (38% lower); load: 0.9 W vs. 10.1 W (91% less), surface temp 94 vs. 154 (39% lower).
      - Operating and maintenance: TCO roughly 50% of HDD; maintenance and operating time reduced for boot-up, virus scan, defrag, RAID build, patching, and data restoration.
      Source: IMEX Research SSD Industry Report ©2011
    • Advantage: SSDs vs. HDDs in the Enterprise
      - Storage density: 16 GB/in³ vs. 1.0 GB/in³ (1,600%)
      - Performance density: 1,250 IOPS/in³ vs. 4.2 IOPS/in³ (30,000%)
      - Power efficiency: 570 GB/W vs. 11.4 GB/W (5,000%)
      - Performance/power: 42,850 IOPS/W vs. 43.1 IOPS/W (100,000%)
      - Chart: enterprise SSD vs. HDD unit shipments (millions) and IOPS/GB, 2009-2014, with SSD units and IOPS/GB rising sharply while HDD IOPS/GB stays flat.
      Note: 2U storage rack; 2.5" HDD max capacity 400 GB / 24 HDDs, de-stroked to 20%; 2.5" SSD max capacity 800 GB / 36 SSDs. Source: IMEX Research SSD Industry Report ©2011
    • Drivers & Challenges: MLC vs. SLC SSDs
      - Reliability drivers: no moving parts; predictable wear-out; post-infant-mortality catastrophic device failures are rare.
      - Reliability challenges (raw media): higher MLC density increases the bit error rate, and the bit error rate rises further with wear; program disturb, read disturb, and partial-page programming must be prevented; data retention is poor at high temperature and wear.
      - Performance drivers: excellent performance vs. HDDs; high performance/watt (IOPS/W); low pin count with a shared command/data bus; good read/erase/modify/write balance.
      - Performance challenges (media): NAND is not really a random-access device; it is block-oriented with slow effective write and erase/transfer/program latency; R/W access speeds are imbalanced; NAND performance changes with wear; some controllers balance read/erase/modify/write well, while others use inefficient garbage collection.
      - Controller drivers: transparently converts NAND flash memory into a storage device; manages the high bit error rate; improves endurance to sustain a 5-year life cycle.
      - Controller challenges: interconnect; number of NAND flash chips (die) and of buses (real/pipelined); data protection (internal/external RAID, DIF, ECC) and write-mitigation techniques; effective block (LBA/sector) size and write amplification; garbage collection (GC) efficiency; buffer capacity and management, including metadata processing.
      Source: IMEX Research SSD Industry Report ©2011
    • MLC vs. SLC SSDs: Price Erosion: chart of relative $/GB price erosion, 2004-2013e, with MLC eroding substantially faster than SLC.
    • SSD Challenges & Solutions: Endurance/Wear-Out
      - Reason for the endurance limitation in SSDs: the anatomy of a P/E cycle (a round trip through the tunnel oxide), shown as memory-cell cross-sections in programmed and erased states.
      - Fundamentally, a NAND flash memory cell is a MOS transistor with a floating gate that can permanently store charge.
      - Programming puts electrons on the floating gate; erase takes them off. One program/erase (P/E) cycle is one round trip by the electrons.
      - The electrons pass through the cell's tunnel oxide, and the back-and-forth round trips gradually damage it over hundreds of thousands of trips (P/E cycles), resulting in limited endurance (wear-out by P/E cycles) in SSDs.
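The wear-out budget described above lends itself to quick arithmetic: capacity times rated P/E cycles gives the total NAND write budget, which write amplification then shrinks from the host's point of view. The capacity, P/E rating, and WAF below are hypothetical illustration values, not vendor specs.

```python
# Host-side endurance arithmetic sketch (illustrative numbers, not specs).

def host_tbw(capacity_tb, pe_cycles, waf):
    """Host terabytes writable before the P/E budget is exhausted.

    NAND write budget = capacity * rated P/E cycles (even wear leveling
    assumed); host writes are inflated by the write amplification
    factor (WAF) before they land on NAND.
    """
    return capacity_tb * pe_cycles / waf

def lifetime_years(tbw, host_tb_per_day):
    """Years of service at a steady host write rate."""
    return tbw / host_tb_per_day / 365

tbw = host_tbw(0.8, 3000, 1.2)             # hypothetical 800 GB MLC drive
print(round(tbw))                          # -> 2000 (i.e., 2 PB host writes)
print(round(lifetime_years(tbw, 1.1), 1))  # -> 5.0 years at 1.1 TB/day
```

With these invented inputs the sketch lands on the same order of magnitude as the 2 PB / 1.1 TB-per-day / 5-year example quoted on the next slide.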
    • SSD Challenges & Solutions: Endurance (Wear-Out)
      - Challenge: bad block management. The ability to erase slows down after a number of P/E cycles. If a NAND memory block fails to erase, the controller is notified and another block from the spare pool is used instead, so a failed NAND block causes no data loss. Eventually, though, the device runs out of spares; the point where the percentage of failing blocks exceeds the number of spares is the Basic Endurance Limit. (Chart: range of best/worst-in-industry % of blocks failing vs. P/E cycles, 1K-1000K.)
      - Solution: over-provisioning, i.e., increasing the spare percentage of blocks. This decreases user capacity but allows the SSD to complete random writes more efficiently, improving random-write endurance and performance. Implementation methods include setting the max LBA to limit visible drive capacity, creating smaller RAID logical drives, or creating smaller partitions.
      - Depending on workload, endurance can vary and needs to be managed properly; endurance should match the usage needs of the system to minimize costs.
      - Promise: an SSD used as cache for 10 HDDs, with 2 PB of useful-life writes, will support 1.1 TB of writes/day for 5 years.
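The Basic Endurance Limit described above can be sketched directly: the drive is worn out once the cumulative fraction of failed blocks exceeds the spare pool. The failure curve and spare fraction below are invented for illustration; real curves come from the kind of best/worst-in-industry measurements the slide's chart shows.

```python
# Basic Endurance Limit sketch: the drive is "worn out" once the
# fraction of failed blocks exceeds the spare (over-provisioned) pool.
# The failure curve below is an illustrative assumption, not measured data.

SPARE_FRACTION = 0.20   # 20% over-provisioning (hypothetical)

# (P/E cycles, cumulative fraction of blocks failing) - invented curve
failure_curve = [(1_000, 0.001), (10_000, 0.01), (50_000, 0.08),
                 (100_000, 0.25), (1_000_000, 0.60)]

def basic_endurance_limit(curve, spare_fraction):
    """First P/E point at which failed blocks outnumber the spares."""
    for pe, failing in curve:
        if failing > spare_fraction:
            return pe
    return None  # spares never exhausted within the measured range

print(basic_endurance_limit(failure_curve, SPARE_FRACTION))  # -> 100000
```

With this made-up curve, spares run out around the 100K-cycle mark; a larger spare fraction pushes the limit to the right, which is exactly the over-provisioning trade-off the slide describes.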
    • SSD Challenges & Solutions: Endurance (UBER)
      - Challenge: uncorrectable bit error rate (UBER) management. A small fraction of written bits gets flipped (as with HDDs); this is the flash media's raw bit error rate (RBER), starting around 1 in 10^8 (1 error per 100 million bits read). RBER gradually increases with P/E cycles. (Chart: range of best/worst-in-industry raw bit error rate vs. P/E cycles.)
      - Solution: ECC. ECC corrects and reduces the RBER; any bit error rate beyond the ECC correction capability is the uncorrectable bit error rate (UBER), left uncorrected at about 1 in 10^16 (1 error per 10,000 trillion bits read). Once the UBER domain is reached, user data can become corrupted.
      - UBER is kept low: the JEDEC spec is 1 in 10^16 errors, and with modern ECC-based controllers vendors are specifying 1 in 10^17. The point where UBER reaches the spec is another endurance limit.
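The RBER-to-UBER gap above is often modeled as a binomial tail: a sector is uncorrectable only when more bit errors land in one codeword than the ECC can fix. The sketch below assumes independent bit errors and a 55-bit-per-512-byte BCH code (the figure quoted on the controller slide later in the deck); it is a toy model, not any vendor's method, and the RBER values are illustrative.

```python
import math

def uncorrectable_prob(n_bits, t, rber, terms=5):
    """P(more than t bit errors in an n-bit codeword), i.e. roughly the
    per-sector uncorrectable probability. Sums the dominant terms of the
    binomial tail; assumes independent, identically distributed bit errors."""
    return sum(math.comb(n_bits, k) * rber**k * (1 - rber)**(n_bits - k)
               for k in range(t + 1, t + 1 + terms))

# 512-byte sector = 4096 data bits; ECC corrects up to 55 bits (assumption).
for rber in (1e-5, 1e-4, 1e-3):
    p = uncorrectable_prob(4096, 55, rber)
    print(f"RBER {rber:.0e} -> sector failure prob {p:.1e}")
```

The steep drop-off is the point of the slide: even as RBER climbs with P/E cycling, a modest ECC budget keeps the uncorrectable rate many orders of magnitude lower, until wear pushes RBER past what the code can absorb.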
    • SSD Challenges & Solutions: Data Retention
      - Challenge: after P/E cycling, RBER increases with retention time; ECC corrects bit flips, but only to a certain extent. (Chart: raw bit error rate vs. powered-on retention time in hours.)
      - Solution: data-retention firmware to allow higher retention, balancing SSD data retention against endurance; accepting lower data retention allows for higher endurance.
      - The industry therefore lives with a required UBER and a required retention time. These, in turn, determine the safe P/E cycles the device can be exercised to before reaching the UBER and retention limits: another endurance limit, set by retention.
    • SSD Challenges & Solutions: Functional Failure Defects
      - Challenge: electronic component defects. All ICs have defects that cause failures; in flash, early-life failures are caused by such defects, and defects can cause functional failures, not just data loss. Most NAND defect failures are triggered by P/E cycles, as high P/E voltages cause defects to short. Role of defects in SSD reliability: wafer process defects 61%, design related & test 10%, EOS/ESD 10%, handling 9%, process errors 5%, assembly & test 5%.
      - Solution: burn-ins and error-avoidance algorithms. Vigorous SSD burn-in and testing removes infant mortality; tuning NAND Tread improves read disturbs and tuning TPROG reduces program disturbs. SSD error-avoidance algorithms include ECC ASICs, wear leveling to avoid hot spots, and an efficient write amplification factor (WAF = data written to NAND / data written by host to the SSD), where WAF depends on (a) the SSD firmware algorithm built into the SSD, (b) the over-provisioning amount, and (c) the application workload.
      - The point where the percentage failing from defects reaches unacceptable limits is another boundary for endurance.
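The WAF definition above is simple enough to state directly in code, together with its effect on the host-visible P/E budget. The byte counters and P/E rating here are invented for illustration; real drives expose analogous counters through SMART-style attributes.

```python
# Write Amplification Factor, as defined on the slide:
#   WAF = data written to NAND / data written by host to the SSD.

def waf(nand_bytes, host_bytes):
    """Amplification of host writes by GC, metadata, and block padding."""
    return nand_bytes / host_bytes

def effective_pe_budget(rated_pe_cycles, waf_value):
    """Host-visible P/E budget shrinks by the amplification factor."""
    return rated_pe_cycles / waf_value

w = waf(1.5e12, 1.0e12)              # 1.5 TB on NAND for 1.0 TB host writes
print(w)                             # -> 1.5
print(effective_pe_budget(3000, w))  # -> 2000.0
```

This is why the slide lists firmware algorithms, over-provisioning, and workload as the three WAF levers: each one changes how many NAND writes a single host write ultimately costs.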
    • SSD Challenges & Solutions: Industry-Standard Testing
      - JEDEC manufacturer requirements:
        - Client: active use 8 hrs/day (40°C power-on); retention 1 yr (40°C power-off); FFR <= 3%; UBER < 10^-15.
        - Enterprise: active use 24 hrs/day (40°C power-on); retention 3 mo. (40°C power-off); FFR <= 3%; UBER < 10^-16.
      - JEDEC solution: specify endurance and verify the spec via EVT (Endurance Verification Test). The endurance spec is the maximum TB written to the SSD over which the device meets spec.
      - JEDEC supplies the workload; data is continuously read and verified. The SSD must meet <3% failures and UBER < 1 in 10^16. EVT requires high/low-temperature stressing and represents a lifetime's worth of stress testing, so it can be trusted. Both an accelerated test (high-temperature bake) and an unaccelerated room-temperature retention test are required.
      - The manufacturer provides a 'gauge' informing the user of the percentage of endurance life used up.
    • SSD Challenges & Solutions: Goals & Best Practices
      - Afraid of SSD adoption in your enterprise? Be aware of the tools and best practices, and you'll be OK. (Charts: % of drives failing (AFR %) vs. lifetime TBW and vs. years of use; extensive HDD test data, real-world results, and the "paranoia curve" bracket the JEDEC SSD standard of <=3%.)
      - All NAND has finite endurance limits imposed by uncorrectable bit error rates, functional failures (defects and wear-out), and data retention time.
      - Goals: embody technologies that improve life (years of use); push the endurance limit to the right, beyond the product life required of SSD products; push the defect rate down through burn-ins, error-avoidance algorithms, and practices, so that total defect and wear-out issues combined stay <=3%; target data errors below 1 in 10^16 for enterprise SSDs for both TBW and retention specs.
      - Best practices: leverage error-avoidance algorithms, verification testing, and best practices so the total functional failure rate is <=3% with defects and wear-out combined; in practice, endurance ratings are likely to be significantly higher than typical use, so data errors and failures will be even lower; capacity reduction can provide large increases in random performance and endurance; select SSDs based on confirmed EVT ratings; use MLC within the requirements of its endurance limits.
    • WW Enterprise SSD Market Opportunity: chart of the worldwide enterprise SSD 5-year (2010-14) cumulative market, $8.6B, broken out by interface (PCIe, SAS, SATA, FC) against 2010-14 CAGR. Source: IMEX Research SSD Industry Report ©2011.
    • PCIe-Based SSD Storage
      - Target market: server storage, with the SSD as back-end storage to DRAM as the front end.
      - 36 PCIe lanes available; 3/6 GB/s performance (PCIe Gen2/Gen3 x8); latency in microseconds; low cost (by eliminating HBA cost).
      - Diagram: multi-core CPUs and DRAM connecting through PCIe switches (or the northbridge) to arrays of PCIe SSDs.
    • PCIe-Based SSD Storage: Usage by Storage Metric
      - Performance ($/IOPS, latency): LBA cache, in both central storage and server-based caching. *1
      - Performance/capacity ($/IOPS/GB): hot application data, in both central storage and server-based caching. *2
      - Capacity ($/GB, W/GB): cold/lukewarm application data in central storage; lukewarm application data in server-based caching. *3
      *1: PCIe SSD performance enables a new storage-caching "IOPS tier" as application-managed caching.
      *2: PCIe SSDs of many flavors replace HDDs for high-performance storage in some apps (e.g., financial, databases).
      *3: HDDs remain best for data at rest as the $/GB storage leader.
      - PCIe SSD attributes of high IOPS, high bandwidth, low latency, and lower cost are a good match for caching. Source: IMEX Research SSD Industry Report ©2011
    • Hybrid SSD Storage
      - Hybrid storage (SAS or SATA SSD + HDD); target market: external storage systems.
      - Combines the best features of SSDs (outstanding read performance in latency and IOPS, and throughput in MB/s) with the extremely low cost of HDDs, giving rise to a new class of storage: hybrid storage devices.
      - SSD as front end to HDD: the controller emulates the SSD as an HDD, and adaptive memory sends high-IOPS requirements to the SSD while capacity-hungry apps are sent to the HDD.
      - A simple add-on to SATA HDD storage; SAS 6 Gb/s versions announced by multiple vendors.
    • Hybrid SSD Storage: Performance & TCO
      - Charts: SAN TCO ($K, broken into SSDs, SATA/FC HDDs, rack space, and power & cooling) for HDD-only vs. hybrid SSD/SATA-HDD configurations, and SAN performance improvements using SSDs.
      - Replacing an FC-HDD-only SAN with a hybrid SSD/SATA-HDD SAN yields a 475% IOPS improvement and an 800% $/IOPS improvement, with lower rack-space and power/cooling costs.
      Source: IMEX Research SSD Industry Report ©2011
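The $/IOPS-improvement metric quoted above is straightforward to reproduce. The cost and IOPS inputs below are illustrative placeholders, not IMEX figures; only the formula mirrors the slide's metric.

```python
# $/IOPS comparison sketch for an all-HDD SAN vs. a hybrid SSD+HDD SAN.
# The cost and IOPS inputs are invented placeholders, not IMEX data.

def dollars_per_iops(total_cost_usd, iops):
    """Capital cost per delivered IOPS."""
    return total_cost_usd / iops

def pct_improvement(old, new):
    """Improvement of 'new' over 'old'; e.g. 800% means 9x better."""
    return (old - new) / new * 100

hdd_only = dollars_per_iops(250_000, 20_000)   # $12.50 per IOPS
hybrid   = dollars_per_iops(160_000, 115_000)  # about $1.39 per IOPS
print(f"{pct_improvement(hdd_only, hybrid):.0f}%")  # -> 798% with these inputs
```

A hybrid array improves the metric from both directions at once: the SSD tier multiplies deliverable IOPS while the SATA tier pulls total cost down, which is how improvements on the order of the slide's 800% figure arise.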
    • New Intelligent Controllers: SSD Storage System Architecture
      - Diagram: host interface connector feeding an interface controller, flash controller, RAID controller, DRAM cache with DRAM controller, encryption engine, and power management (voltage regulator, big capacitor), driving multiple channels of flash/PCIe SSD arrays on the PCB/chassis.
      1. Interface controller <-> host: signaling management; interprets WR/RD/status commands; native command queuing; moves data.
      2. Flash controller: signaling management; formats and interprets WR/RD/status commands for the flash arrays; moves data; defect mapping/bad block management, wear leveling, physical<->logical translation, ECC.
      3. RAID controller: RAID type and RD/WR/parity manipulation.
      4. Channels: multiple channels to increase speed between the NAND flash arrays and the flash controller.
      5. DRAM: increases performance using a fast DRAM cache buffer.
      6. Power failure: power-failure protection using a big capacitor.
      7. Power management: power/performance balancing; sleep-mode management.
      8. Encryption: security-scheme implementation and manipulation.
    • New Intelligent Controllers: Managing NAND Media in NextGen SSDs
      - Leveraging a long history of managing HDDs' imperfect media and high error rates: characterizing the quality and capabilities of the media, and allocating data based on media quality.
      - HDD media (raw error rate ~10^-4, corrected to ~10^-16): adaptive signal processing for media read/write/erase; advanced bit detection and error correction codes (ECC, PRML); defect management.
      - Flash media (raw error rate ~10^-4, corrected to ~10^-17): adaptive signal conditioning for flash media; auto bit detection and error correction codes; defect management.
      - Endurance for a long life cycle; reliability through RAID of flash elements.
      - Adaptive digital signal processing technology: dynamically adjust the read/write characteristics of each chip, tune the adjustments over the life of the media, and deploy enhanced error correction codes.
      Source: IMEX Research SSD Industry Report ©2011
    • New Intelligent Controllers: Managing Enterprise Requirements
      - Always-on 24x7: reliability and performance supersede cost.
      - Fast I/O performance, required by business-critical applications.
      - 5-year life-cycle endurance, required by mission-critical applications.
      - Use state-of-the-art controllers and firmware technologies to run mission-critical applications in the enterprise: robust ECC, internal RAID, wear leveling (to reduce hot spots), spare capacity, write-amplification avoidance, garbage-collection efficiency, wear-out prediction management, etc.
      - Example 2nd-generation intelligent controller (source: SandForce): SATA 3 interface at 6 Gb/s with 32 NCQ; RS232, GPIO, I2C, and JTAG interfaces; optimized block management/wear leveling; garbage collection; read-disturb management; RAID without parity overhead; AES 256/128 encryption with TCG compliance; 55b/512B BCH ECC; NAND flash interface (Toggle, ONFI-2) supporting SLC/MLC/eMLC at 3x/2x nm, 8 channels/16 byte-lanes, 512 GB capable.
    • New Intelligent Controllers: Managing Endurance in NextGen SSDs
      - To overcome NAND's endurance shortfalls from the limited write/erase cycles per block, intelligent controllers manage NAND SSDs using:
      - ECC techniques: correct and guard against bit failures, as in HDDs.
      - Wear-leveling algorithms: distribute written data evenly over all available cells so that no block of cells is overused and fails early.
      - Over-provisioned capacity: extra spare raw blocks are designed in as headroom to replace blocks that get overused or go bad, and to give the wear-leveling algorithms enough room to enhance the reliability of the device over its life cycle. A typical SSD of a given specified capacity actually contains 20-25% extra raw capacity to meet these criteria.
    • New Intelligent Controllers: Managing Reliability in NextGen SSDs
      - Multiple techniques are used to improve reliability:
      - In-flight: protection against corruption in upstream disk controllers and in the SSD controller itself; flush at power loss using large capacitor elements.
      - At-rest: ECC; scanning and scrubbing; redundancy; metadata protection; error-correcting memory; data integrity field (DIF).
    • New Intelligent Controllers: Managing Performance in NextGen SSDs
      - Factors impacting performance and key metrics:
        - Hardware: CPU, interface, chipset...
        - System SW: OS, applications, drivers, caches, SSD-specific TRIM, Purge...
        - Device: flash generation, parallelism, caching strategy, wear leveling, garbage collection, warranty strategy...
        - Write history: TBW, spares...
        - Workload: random vs. sequential, R/W mix, queues, threads...
        - Pre-conditioning: random, sequential, amount...
        - Performance state: short "fresh out of box" (FOB) bursts vs. steady state after x P/E cycles.
      - Using interleaved memory banks, caching, and other techniques designed into modern controllers, the performance of MLC SSDs has started to match, and even outshine, that of some SLC SSDs.
    • AutoSmart Storage-Tiering Software: Storage Mapping
      - Automated storage-tiering principles: continuously monitor and analyze data access across the tiers; automatically promote hot data to "hot tiers" and demote cool data/volumes to lower tiers; allocate and relocate volumes on each tier based on use.
      - Reduces OPEX vs. managing SANs manually. All major computer-system manufacturers have adopted it, under names such as FAST, Easy Tier, Data Progression, Adaptive Optimization, Dynamic Tiering, and Smart Pools.
      - Traditional disk mapping: volumes have different characteristics, and applications need to place them on the correct storage tiers based on usage.
      - Smart storage mapping: all volumes appear logically homogeneous to applications, but data is placed at the right storage tier based on its usage, through smart data placement and migration.
      Source: IBM & IMEX Research SSD Industry Report ©2011
    • AutoSmart Storage-Tiering Software: Workload I/O Monitoring & Smart Migrations
      - Storage-tiered virtualization: tiering at the LBA/sub-LUN level, with hot data on SSD arrays and cold data on HDD arrays behind one logical volume; LBA monitoring drives tiered placement.
      - Every workload has a unique I/O access signature; historical performance data for a LUN can identify performance skews and hot data regions by LBA.
      - Smart tiering identifies hot LBA regions and non-disruptively migrates the hot data from HDDs to SSDs.
      - Typically 4-8% of data becomes a migration candidate, and moving it to SSDs can reduce response time by ~65% at peak loads.
      Source: IBM & IMEX Research SSD Industry Report ©2011
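The hot-region detection behind this kind of tiering can be sketched as a histogram over LBA regions: count accesses per region over a monitoring window, then take the regions that dominate the access distribution as SSD-migration candidates. The region size, coverage threshold, and workload trace below are invented for illustration; real products use their own monitoring and migration policies.

```python
# Hot-LBA-region detection sketch for automated storage tiering.
# Region size, threshold, and the toy trace are assumptions.

from collections import Counter

REGION_SIZE = 1024   # LBAs per monitored region (assumption)
HOT_SHARE   = 0.80   # regions covering 80% of I/O count as "hot"

def hot_regions(lba_trace, region_size=REGION_SIZE, share=HOT_SHARE):
    """Return region ids, hottest first, that together cover `share`
    of all accesses in the trace; these are the migration candidates."""
    counts = Counter(lba // region_size for lba in lba_trace)
    total = sum(counts.values())
    hot, covered = [], 0
    for region, n in counts.most_common():
        if covered >= share * total:
            break
        hot.append(region)
        covered += n
    return hot

# Skewed toy trace: region 0 absorbs 80% of the accesses.
trace = [100] * 80 + [5000, 9000, 13000, 70000] * 5
print(hot_regions(trace))  # -> [0]
```

The skew in the toy trace mirrors the slide's observation: when a few percent of the LBA space absorbs most of the I/O, migrating just those regions to SSD captures most of the response-time benefit.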
    • AutoSmart Storage-Tiering Software: Enhancing Database Throughput
      - DB throughput optimization: every workload has a unique I/O access signature and historical behavior; identify hot database objects and smartly place them in the right tier.
      - Scalable throughput improvement of up to 300% for I/O-intensive loads, with substantial response-time improvement of 45%-75% for I/O-bound transactions.
      - Productivity (response time) improvement: automated reallocation of hot-spot data (typically 5-10% of total data) to SSDs yields response-time reductions of around 70+% and throughput (IOPS) increases of 200%.
      - Benefits time-perishable online transactions such as airline reservations, Wall Street investment-banking stock transactions, financial-institution and hedge-fund trades, as well as latency-sensitive HPC clustered systems.
      Source: IBM & IMEX Research SSD Industry Report ©2011
    • Applications Best Suited for SSDs
      • Databases
        – Databases have key elements of commit files — logs, redo, undo, tempDB — that benefit most
      • Structured data
        – Structured data access is an excellent fit for SSD
        – Exception: large, growing table spaces
      • Unstructured data
        – Unstructured data access is a poor fit for SSD
        – Exception: small, non-growing, tagged files
      • OS images — boot-from-flash, page-to-DRAM

      Applications most benefitting from SSD use:
        Database/OLTP 43% · E-Mail/Collaboration 32% · HPC 31% · BI/DW 30% · ERP/SCM/CRM 25% · Web 2.0 23% · Office Apps 20%

      Typical Cases — Impact on Applications
      • Financial credit-card/ATM transactions
        – Improvements: batch window 22%, application response time 50%, application I/O rate 50%
      • Messaging applications
        – Cost savings: consolidating 200+ FC HDDs into only 16 SSDs
      Source: IMEX Research SSD Industry Report ©2011
    • Apps Best Suited for SSDs: OLTP — Improving Query Response Time
      [Figure (conceptual only, not to scale): Query response time (ms, 0-8) vs. IOPS or number of concurrent users (0-40,000) for four configurations — 112 HDDs ($$$$$$$$), 36 HDDs with short stroking ($$$$), a 14-drive hybrid of HDDs + SSDs ($$$), and 12 SSDs ($$).]
      • Improving Query Response Time
        – The most cost-effective way to improve query response time for a given number of users, or to service more users at a given response time, is to use SSDs or a hybrid (SSD + HDD) approach, particularly for database and online-transaction applications
      Source: IMEX Research SSD Industry Report ©2011
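The drive counts in the chart follow from simple IOPS arithmetic: with HDDs, random-I/O targets are met by adding spindles, not capacity. A back-of-envelope sketch — the per-device IOPS figures below are assumptions for illustration (the slide does not state its own), though an assumed ~180 random IOPS per 15K HDD does happen to reproduce the chart's 112-drive point for a 20,000-IOPS target:

```python
import math

def drives_needed(target_iops, iops_per_drive):
    """Smallest number of devices whose aggregate random IOPS meets the target."""
    return math.ceil(target_iops / iops_per_drive)

# Illustrative figures only (assumed, not measured):
#   ~180 random IOPS for a 15K FC/SAS HDD, ~20,000 for an enterprise SSD.
TARGET = 20_000
hdd_count = drives_needed(TARGET, 180)      # spindles bought purely for IOPS
ssd_count = drives_needed(TARGET, 20_000)   # IOPS-wise, very few SSDs suffice
```

In practice the SSD count (12 in the chart) is set by capacity and redundancy rather than IOPS, which is why the hybrid approach — a few SSDs for the hot data, HDDs for bulk capacity — lands at the lowest cost.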
    • Apps Best Suited for SSDs: DB in Memory for Data Warehouse/BI
      [Figure: Large-database size growth by market segment, 2009-2014 — DB size (TB, axis 0-1200) for OLTP vs. DW/BI, with scale-in/scale-up/scale-out dimensions.]
      [Figure: Storage usage vs. DB capacity — DB size, total storage size, and VTL (TB, axis 0-80) for database sizes of 1-2 TB, 2-5 TB, 5-10 TB, and >10 TB.]
      Data Source: IMEX Research Cloud Infrastructure Report ©2009-11
    • Apps Best Suited for SSDs: Collaboration Using Virtual Desktops
      • Mitigating Boot Storms
        – A boot storm is created by simultaneous logins by users at the start of the office day
        – Over-provisioning SAN capacity just for a short morning burst is expensive, since it sits almost idle the rest of the day
        – Three potential solutions, with pros and cons:
      • (1) New DAS storage
        – Pros: popular with desktop SW vendors; lowered cost
        – Cons: additional cost for dedicated storage; wasted existing SAN storage; data-protection & management challenges
      • (2) Virtual desktop images on host SSD
        – Pros: SSD ideal for this read-intensive app; instant-on/immediate boot response; images stored with high efficiency
        – Cons: uses the most expensive storage; high speed needed for just an hour; not a simple shoe-in with existing storage
      • (3) SSD cache to accelerate the SAN
        – Pros: possibly the best way to solve the problem; small SSD optimized for image store; no change to existing data protection
        – Cons: not feasible without an existing SAN; SSD-in-SAN integration still a challenge
      – A proper balance of access and storage is achieved by integrating SATA HDDs with SSDs and using automatic tiering solutions
      Source: IMEX Research SSD Industry Report ©2011
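The over-provisioning problem above is easy to quantify: the same boot I/O compressed into a short login window demands far more IOPS than it would spread over a workday. A back-of-envelope sketch with entirely hypothetical figures (desktop count, reads per boot, and window length are all assumptions for illustration):

```python
def boot_storm_iops(desktops, ios_per_boot, window_seconds):
    """Aggregate IOPS the storage must sustain if `desktops` VMs each issue
    `ios_per_boot` reads inside the same window. All inputs are assumptions."""
    return desktops * ios_per_boot / window_seconds

# Hypothetical morning: 500 desktops, ~30,000 boot reads each, 10-minute window.
peak = boot_storm_iops(500, 30_000, 600)          # burst the SAN must absorb
steady = boot_storm_iops(500, 30_000, 8 * 3600)   # same I/O over an 8-hour day
overprovision_factor = peak / steady              # how idle the burst capacity is
```

Under these assumptions the burst requirement is ~48× the steady-state rate — which is why a small read-optimized SSD tier or cache for the shared boot images beats buying HDD spindles for the whole peak.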
    • Apps Best Suited for SSDs: High-Performance Computing/Web 2.0
      [Figure: Workload classes and their I/O profiles — smart mobile devices (instant-on boot-ups; rugged, low power), commercial visualization (rendering of textures & polygons; very read-intensive), bioinformatics & diagnostics (small-block I/O), decision support/business intelligence (data warehousing; random I/O, high OLTP), and entertainment VoD/YouTube (most-accessed videos; very read-intensive); with per-class bandwidth/latency targets of 1 GB/s, 10 GB/s, 1 GB/s, and 4 GB/s at __ms.]
      Data: IMEX Research & Panasas
    • Key Takeaways
      • Optimize infrastructure to meet the needs of applications/SLAs
      • Solid-state storage is creating a paradigm shift in the storage industry
        – Leverage the opportunity to optimize your computing infrastructure with SSD adoption, after due diligence in vendor/product selection, industry testing, and interoperability
      • Enterprise SSD market segments: PCIe vs. SAS/SATA
        – 5-year cumulative market of $8.6B, segmented by revenue: 36% PCIe, 33% SAS, 24% SATA, 6% FC-based SSDs
      • Understand the drivers and challenges of SSDs for enterprise use
      • Intelligent controllers are key to the adoption & success of SSDs
        – They mitigate endurance, wear-out, and device-life issues
      • Optimize transactions for query response time vs. number of users
        – Improve query response time for a given number of users, or serve more users (IOPS) at a given query response time
      • Select automated storage-tiering software
    • Acknowledgements
      Many thanks to the following individuals for their industry vision and leadership in the solid-state storage industry, and for their help in the preparation of some slides in this presentation:
      • N. Mielke, Fellow & Director of Reliability Methods, Intel
      • Jim Elliott, VP Marketing, Samsung
      • John Scaramuzzo, SVP & GM, Smart Modular Technologies
      • Michael Raam, President & CEO, SandForce
      • Manouch Moshayedi, CEO, STEC
      • Dean Klein, VP Memory Systems Development
      • John White, Chairman SSD Initiatives, SNIA

      Author: Anil Vasudeva, President & Chief Analyst, IMEX Research