
HPC DAY 2017 | HPE Storage and Data Management for Big Data

HPC DAY 2017 - http://www.hpcday.eu/

HPE Storage and Data Management for Big Data

Volodymyr Saviak | CEE HPC & POD Sales Manager at HPE


  1. HPE Storage and Data Management for Big Data | Volodymyr Saviak, HPE HPC Sales Manager
  2. Agenda
     1. Different HPC storage requirements and solutions. Intro – 5 minutes
     2. HPC Lustre-based storage – 10 minutes
     3. NVMe low-latency storage – 5 minutes
     4. DMF as data management – 10 minutes
  3. Mission-Critical Storage and HPC Storage Differences
     Critical buying factors
     • Mission-critical storage: robustness, features
     • HPC storage: price / price-performance, throughput, capacity
     Willing to compromise on
     • Mission-critical storage: price / price-performance, throughput, capacity
     • HPC storage: robustness, features
     Summary: different markets need different products.
  4. Different kinds of solutions we deliver

     Feature                              Parallel Performance   Low-Latency Storage   Active Archive
     Capacity                             ++ (100 PB)            + (250 TB)            +++ (500 PB)
     Parallel file/object access (nodes)  +++ (many thousands)   ++ (128/512)          +
     High bandwidth                       +++                    ++                    ++
     Bandwidth per node                   +++                    +++                   +
     High IOPS                            ++                     +++                   +
     Low latency                          +                      ++                    -
     Disaster tolerance                   -                      -                     +++
     Heterogeneous access                 -/+                    -/+                   +++
     Multiprotocol access                 -                      -                     ++
  5. Recent requests from the field
     Typical technical requirements (extract):
     – An HPC customer wants to build 1 PB of storage for genome sequencing with 30 GB/s write performance
     – A bank wants to build 100 TB of storage with near-memory performance for an Oracle data warehouse
     – A telco wants to build 3 PB of storage with 80 GB/s write throughput
     – A government organization wants to build a 10 PB archive with 99.9999% data availability
     Scale-out storage lets capacity grow without pain.
  6. NVMe Scale-Out: the fastest storage available today
  7. Accelerate your workload with HPE NVMe SSDs
     • Reliability: continuous performance with less downtime
     • Efficiency: lower TCO
     • Performance: faster business results
  8. Difference between SATA and NVMe
     – NVMe is not really an interface but a protocol: a different "language" for talking to the SSD.
     – That language is optimized to reduce per-request overhead when making requests to the SSD (many deep submission queues and a short command path, versus SATA/AHCI's single shallow queue). The probe below shows how the difference can be measured.
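The protocol difference is measurable from user space. Below is a minimal Linux-only probe (a sketch, not a benchmark suite) that times 4 KiB random reads with O_DIRECT so the page cache does not mask device latency; the device path is an assumption, so point it at a block device or file you are allowed to read, typically as root.

```python
# Times 4 KiB random reads with O_DIRECT (Linux-only sketch).
import mmap, os, random, statistics, time

DEV = "/dev/nvme0n1"        # hypothetical device path; adjust to your system
BLOCK, SAMPLES = 4096, 1000

fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
size = os.lseek(fd, 0, os.SEEK_END)      # device/file size in bytes
buf = mmap.mmap(-1, BLOCK)               # page-aligned buffer, required by O_DIRECT

lat_ns = []
for _ in range(SAMPLES):
    off = random.randrange(size // BLOCK) * BLOCK   # block-aligned offset
    t0 = time.perf_counter_ns()
    os.preadv(fd, [buf], off)                       # one synchronous 4 KiB read
    lat_ns.append(time.perf_counter_ns() - t0)
os.close(fd)

print(f"median {statistics.median(lat_ns) / 1000:.1f} us, "
      f"p99 {sorted(lat_ns)[int(0.99 * SAMPLES)] / 1000:.1f} us")
```

Run against a SATA SSD and an NVMe SSD, the gap in the median largely reflects the shorter command path described above.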
  9. NVMe deployment challenges
     All-flash arrays – applications can access large pools of flash, but with limitations:
     • Array software that does not take full advantage of flash characteristics
     • Network and fabric latencies
     • I/O stack bottlenecks
     • Capacity challenges: provisioning optimization is difficult
     Server-side flash – applications get maximum flash performance, but without shared data:
     • Creates data-locality issues
     • No centralized management
     • Low utilization rates
  10. Local NVMe devices
     Benefits
     – Very low latency
     – High IOPS
     – High throughput
     – Commodity pricing
     The reality
     – Direct-attached storage (DAS) only
     – No logical volumes
     – No data protection
     – No high availability
     – No application movement
     – Excess (wasted) IOPS
  11. How NVMe scale-out storage works
     [Architecture diagram] Centralized management (GUI, RESTful HTTP) handles the control path only. On the effective data path, applications sit on top of an intelligent client block driver that talks over a high-speed network (RDMA NICs) to an NVMe target module fronting unmodified NVMe drives, bypassing the target's CPU. The sketch below makes the split concrete.
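A purely conceptual Python sketch of the control-path/data-path split (all names are invented for illustration; this is not the NVMesh API): the client fetches a volume layout once from central management, then resolves every I/O locally and goes straight to the owning target.

```python
# Conceptual sketch: an intelligent client maps a striped logical volume
# onto remote NVMe targets without touching the management plane per-I/O.
from dataclasses import dataclass

@dataclass
class Extent:
    target: str      # storage node owning this slice
    nvme_dev: str    # drive on that node
    offset: int      # base byte offset within the drive

STRIPE = 128 * 1024  # 128 KiB stripe width (arbitrary choice)

# Control path: layout fetched once from centralized management.
layout = [Extent("node-a", "nvme0", 0), Extent("node-b", "nvme1", 0)]

def locate(vol_offset: int) -> tuple[Extent, int]:
    """Data path: resolve a volume offset to (owning extent, drive offset)
    locally; a real driver would then issue an RDMA read/write directly."""
    stripe = vol_offset // STRIPE
    ext = layout[stripe % len(layout)]
    dev_off = ext.offset + (stripe // len(layout)) * STRIPE + vol_offset % STRIPE
    return ext, dev_off

ext, dev_off = locate(1 << 20)
print(f"1 MiB mark lives on {ext.target}:{ext.nvme_dev} @ byte {dev_off}")
```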
  12. Converged, disaggregated, or mixed
     Converged – local storage in the application server:
     • Storage is unified into one pool
     • NVMesh target module and intelligent client block driver run on all nodes
     • NVMesh bypasses the server CPU
     • Linearly scalable I/O
     Disaggregated – storage is centralized:
     • Storage is unified into one pool
     • NVMesh target module runs on storage nodes; intelligent client block driver runs on server nodes
     • Applications get the performance of local storage
  13. Performance
     Ubiquitous access, highly optimized:
     • Remote IOPS = local IOPS; remote bandwidth = local bandwidth; remote latency = local latency + ~5 µs
     • 2U server with 24 NVMe drives: >4.9M 4 KB IOPS, >24 GB/s
     Scalability:
     • 20 servers, shared data, >99% efficiency
     • 128 servers at NASA: >130 GB/s writing through a shared file system
     Converged-ready:
     • Using RDDA, 0% target CPU usage
  14. Customer success: pooling NVMe enables new science use cases at NASA
     Use case
     • Large-scale modeling, simulation, analysis, and visualization
     • Visualizes supercomputer simulation data on 128 monitors from a 128-node cluster
     Problem
     • Interactive work is generally small I/Os
     • Introducing high-performance local NVMe SSDs creates a data-locality problem
     Solution
     • NVMesh lets NASA create a petabyte-scale unified pool of distributed high-performance flash while retaining the speeds and latencies of directly attached media
  15. Lustre Storage: the world of fast and large HPC storage
  16. Data Management | Lustre Market Share in Key HPC Use Cases
  17. Data Management | Lustre Designed to Scale Out
     [Architecture diagram] Lustre clients reach storage over the data network (LNET: InfiniBand/Ethernet/Omni-Path), with a separate management network behind it. A Management Server (MGS) fronts the Management Target (MGT), Metadata Servers (MDS) front Metadata Targets (MDTs), and Object Storage Servers (OSS) front Object Storage Targets (OSTs); storage servers are grouped into failover pairs. Each added OSS pair adds bandwidth and each added MDS adds metadata ops, so capability scales linearly: the slide's example meets a 60 GB/s customer goal by stacking 17 GB/s building blocks (17 + 17 + 17 + 17 GB/s), as worked out in the sketch below.
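The scaling arithmetic from the slide's example, written out (only the 60 GB/s goal and the 17 GB/s per-building-block figure come from the slide):

```python
# Back-of-envelope sizing: how many OSS failover pairs reach the goal.
import math

goal_gbps = 60        # customer target, GB/s
per_pair_gbps = 17    # throughput of one OSS failover pair, per the slide

pairs = math.ceil(goal_gbps / per_pair_gbps)
print(f"{pairs} OSS pairs -> {pairs * per_pair_gbps} GB/s aggregate")
# 4 OSS pairs -> 68 GB/s aggregate, comfortably above the 60 GB/s goal
```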
  18. Data Management | Lustre Update: Shift to Community Lustre
     – Intel initiated a process to consolidate its Lustre efforts around a single version of Lustre, available from the community as open source
     – All proprietary elements of Intel Enterprise Edition for Lustre were contributed by Intel to the community
     – HPE will deliver an updated Apollo 4520 Lustre solution based on Community Lustre in late 2H2017

                                  ORIGINAL                                   NEW
     Editions                     Intel Enterprise / Foundation / Cloud      Community Lustre
     Integrated on HPE hardware?  Yes                                        Yes
     Lustre version               Intel Enterprise Edition for Lustre 3.1    Community Lustre 2.10
     L1 support                   HPE                                        HPE
     L2 support                   HPE                                        HPE
     L3 support                   Intel                                      Intel
  19. Data Management | Lustre Roadmap and Relevance
     Key features
     • Multi-Rail LNET for data-pipeline scalability
     • Progressive File Layouts for performance and more efficient, balanced file storage
     • Data on MDT for storing small files directly on the (flash) MDT
  20. Data Management | Lustre Multi-Rail LNET
     [Architecture diagram] The same scale-out layout as slide 17, but each server attaches to the data network (LNET) through multiple fabric adapters/connections, so a single node's bandwidth is no longer limited by one interface. The toy sketch below illustrates the idea.
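A toy illustration of the multi-rail idea (conceptual only, not LNET code; interface names are made up): spreading bulk transfers across two fabric interfaces so one node's bandwidth approaches the sum of its links.

```python
# Round-robin bulk transfers across two rails on one node.
from itertools import cycle

interfaces = cycle(["ib0", "ib1"])   # two fabric interfaces (assumed names)

def send_bulk(chunks: list[bytes]) -> None:
    for chunk in chunks:
        nic = next(interfaces)
        # a real implementation posts an RDMA write on `nic`; we just log it
        print(f"{len(chunk)} B -> {nic}")

send_bulk([b"x" * (1 << 20)] * 4)    # 4 MiB spread evenly across both rails
```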
  21. Data Management | Lustre Roadmap and Approach
     [Architecture diagram] In the current small-file I/O model, all client data I/O goes to the OSTs. In the new model (Data on MDT), small writes go to the MDT while large writes go to the OSTs, saving small files a round trip through the object storage layer; the sketch below shows the routing decision.
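A sketch of the placement decision Data on MDT makes (the 1 MiB cutoff is an assumed value; in real Lustre this behavior is configured through composite file layouts, not application code):

```python
# Route a file to flash-backed MDT or striped OSTs based on its size.
SMALL_FILE_LIMIT = 1 << 20   # 1 MiB cutoff (assumption, site-tunable)

def placement(size_bytes: int) -> str:
    if size_bytes <= SMALL_FILE_LIMIT:
        return "MDT (flash): served with metadata, no OST round trip"
    return "OSTs: striped across object storage for bandwidth"

for size in (4 << 10, 256 << 10, 16 << 20):
    print(f"{size:>10} B -> {placement(size)}")
```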
  22. HPE Apollo 4520 Scalable Storage with Lustre – designed for petabyte-scale data sets
     Density-optimized design for scale
     • Dense storage design translates to lower $/GB
     • Linear performance and capacity scaling
     ZFS for file protection and performance
     • The ZFS file system provides advanced data protection: RAID, snapshots, compression, and error correction
     High-performance storage solution
     • Meets demanding I/O requirements: up to 51 GB/s per rack using a balanced architecture based on the 4520 Lustre server with D6020 JBODs
     Services and support
     • Factory tested and validated; deployment services for installation
     • 24/7 support services
  23. HPE Apollo 4520 Scalable Storage for Lustre – easily optimized for a variety of needs
     Capacity optimized
     – Up to 5.5 PB of raw storage per rack using 12 TB drives; maximizes $/GB
     – Up to 6 JBODs per Apollo 4520, with efficient JBOD chaining
     – HA configured to handle cable failures
     – Minimal software license and support costs; support licensing is not based on capacity
     Bandwidth optimized
     – Up to 73 GB/s per rack using an all-Apollo 4520 configuration
     – Even faster performance can be obtained with an all-SSD configuration
     – Multi-Rail LNET in Lustre 2.10 eliminates potential fabric bottlenecks in all-SSD use cases
     [Rack diagram: an Apollo 4520 storage controller pair with chained JBOD enclosures]
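A rough reconciliation of the 5.5 PB/rack figure (hedged: the 70-drive D6020 JBOD and 46-drive Apollo 4520 bay counts are assumptions about those chassis, not figures taken from this deck):

```python
# Back-of-envelope raw capacity per rack for the capacity-optimized build.
jbod_drives = 6 * 70          # six D6020 JBODs, assumed 70 LFF bays each
controller_drives = 46        # Apollo 4520 internal bays (assumed)
raw_tb = (jbod_drives + controller_drives) * 12   # 12 TB drives, per the slide

print(f"{jbod_drives + controller_drives} drives -> {raw_tb / 1000:.2f} PB raw")
# 466 drives -> 5.59 PB raw, consistent with the "up to 5.5 PB" claim
```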
  24. HPE Scalable Storage with Intel Enterprise Edition for Lustre*
     Performance with a single Apollo 4520 and two D6020 JBODs (all HDDs):
     • Peak writes: 15 GB/s
     • Peak reads: 17 GB/s
  25. HPE DMF: tiered data management and protection
  26. Data Management | Lustre HSM Data Management Guidelines
     Data always lives longer than the hardware on which it is stored. Forward migration to new technology should never adversely impact the users.
  27. Data Management | HPC Storage Landscape: New Model
     Key takeaways:
     • Disaggregate the high-performance storage tier and scale it independently from the capacity tier
     • Co-locate the performance tier with compute and fabric
     • Implement tiered data management for capacity scaling and data protection
     Tiered data movement and management are a key requirement – and HPE Data Management Framework (DMF) meets that need.
  28. Data Management | DMF Core Concepts
  29. Data Management | DMF Advanced Tape Storage Integration
     • DMF is certified with libraries from Spectra Logic, Oracle (StorageTek), IBM, and the HPE portfolio of tape libraries
     • Support for the latest LTO and enterprise-class drive technology
     • Advanced feature support for accelerated retrieval and automated library management
     • A certification guide for libraries and drives is available and updated regularly
  30. Data Management | DMF Object Storage Support
     [Architecture diagram: a high-performance file system (XFS/CXFS/Lustre with NFS/CIFS access on RAID- or flash-based storage) feeds the DMF policy and migration engine, which moves data across cloud and object storage, offsite data replication, DMF Zero Watt Storage, onsite tape, and secure offsite tape.]
     Object storage systems in a DMF architecture:
     • Standards-based integration – the S3 interface enables compatibility with Scality, HGST Active Archive, Amazon S3, CEPH, NetApp StorageGRID, DDN WOS, and open-source alternatives
     • Accessibility – high resilience and data integrity for a variety of use cases
     • Scalability and throughput – scalable DMF connections to the object storage environment; DMF Parallel Data Mover architecture with high availability and failover
     • Flexibility – ability to blend object storage with alternative storage options, including Zero Watt Storage (performance) or tape (off-site disaster recovery)
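To illustrate the standards-based integration point: anything that speaks S3 can serve as the object tier. This standalone boto3 snippet is not DMF's interface, just a demonstration that one client call can target Amazon S3, Scality, CEPH RGW, and the rest by swapping `endpoint_url` (the endpoint, bucket, credentials, and paths below are placeholders).

```python
# Upload one file to an S3-compatible object store.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.com",  # hypothetical endpoint
    aws_access_key_id="...",                         # fill in real credentials
    aws_secret_access_key="...",
)
s3.upload_file("/scratch/run42/output.dat", "archive-tier", "run42/output.dat")
```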
  31. Data Management | DMF Zero Watt Storage
     High performance and density:
     • 70 x 3.5" SAS drives in a 5U enclosure
     • Supports >600 TB of usable storage per enclosure with 10 TB drives; 4+ PB of usable capacity per rack
     • High performance: >10 GB/s streaming retrieval per enclosure – an excellent DMF cache complement to tape, object, or cloud storage
     Advanced software features:
     • Open-standard access – no user application changes required
     • Flexible deployment – no interruption to the DMF production environment during ZWS deployment
     • Tunable data-movement policies – to maximize use of ZWS and other storage hardware
     • Granular drive management, including automated spin-down of inactive individual disks – maximum power savings and increased disk lifespan
     • Automated data recoverability – silent-data-corruption prevention and in-place data recovery
  32. Data Management | Lustre HSM with DMF Core Concepts
     Data migrates (e.g. by time, type) down the tiers and is recalled back on demand:
     • Primary storage (POSIX) – online, high-performance disk
     • Nearline fast-mount cache – high-capacity, low-cost, power-managed disk
     • Deep storage – object store, public cloud, tape
     How it works:
     • The entire namespace stays in the filesystem
     • File data migrates transparently (with invisible I/O); inodes remain
     • Recall happens on access or by schedule
     • The filesystem IS the metadata database
     • Transparency makes it easy – data catalog and access live in the same place
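A minimal sketch of the migrate half of this loop (not DMF code; the paths and the 90-day threshold are assumptions). It copies cold files to a capacity tier and truncates the primary copy, leaving the inode and namespace intact; a real HSM would also recall transparently on access rather than via an external script.

```python
# Policy-driven migration sketch: stub files untouched for 90 days.
import os, shutil, time

PRIMARY, ARCHIVE = "/mnt/lustre/project", "/mnt/archive/project"  # assumed paths
COLD_AFTER = 90 * 86400     # migrate files untouched for 90 days (assumed)

now = time.time()
for root, _dirs, files in os.walk(PRIMARY):
    for name in files:
        src = os.path.join(root, name)
        if now - os.stat(src).st_atime < COLD_AFTER:
            continue                          # still hot, leave in place
        dst = os.path.join(ARCHIVE, os.path.relpath(src, PRIMARY))
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)                # data lands on the capacity tier
        os.truncate(src, 0)                   # stub: inode stays, blocks freed
        print(f"migrated {src} -> {dst}")
```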
  33. Data Management | DMF Data Protection Strategy
     3 copies – performance copy, secure copy, disaster-recovery copy:
     • Optimized use of storage hardware
     • High availability
     • Elimination of backup
     2 media types – RAID, flash, disk, tape, object and ZWS, plus tape or cloud object:
     • Fast data access
     • Data retention
     • Archive resilience
     1 copy offsite – primary data center plus offsite or cloud storage:
     • Lower power consumption
     • Basis for compliance
     • Disaster recovery
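A toy check of the 3-2-1 rule this slide encodes (illustrative data structures only, not DMF policy syntax):

```python
# Verify a file's copy list satisfies: 3 copies, 2 media types, 1 offsite.
copies = [
    {"tier": "zws",  "media": "disk",   "offsite": False},  # performance copy
    {"tier": "tape", "media": "tape",   "offsite": False},  # secure copy
    {"tier": "s3",   "media": "object", "offsite": True},   # DR copy
]

ok = (len(copies) >= 3
      and len({c["media"] for c in copies}) >= 2
      and any(c["offsite"] for c in copies))
print("3-2-1 compliant" if ok else "policy violation")
```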
  34. Data Management | DMF Core Concepts
     DMF: a scalable data-management fabric, proven in production use for over 20 years.
     • All-in-one – data management, archive, integrated backup, validation, and repair
     • Transparent – all data appears online all the time
     • Policy-driven – policies leverage file attributes and define multiple copies on different media
     Across tape library storage, Zero Watt Storage™, and public/private cloud, DMF provides:
     • Policy-based data migration and HSM
     • Parallel architecture for high throughput
     • Active data validation and repair
     • Minimized storage-administrator workload
     Result: the lowest cost per GB with extremely high data durability, high-performance access with very low storage and operating costs, and high scalability and resilience for availability and disaster recovery.
  35. Data Management | DMF Customer Example: Space Agency
  36. Key Differentiators
     – High-performance data migrations: DMF Direct Archiving
     – MAID storage target: DMF Zero Watt Storage
     – Elegant archive storage migration over time: multi-petabyte data migrations with no user impact
     – Trusted data protection: over 25 years preserving data
     – Active user community: DMF User Group (Feb 2017), http://hpc.csiro.au/users/dmfug/
     Some names and brands may be claimed as the property of others.
  37. Conclusions
  38. Data Management | Summary
     • HPC presents unique storage challenges
     • HPE has a robust and flexible set of HPC file systems
     • DMF data management ensures long-term availability
     • The HPC Business Unit can assist with sizing and design
  39. 39. Thank you
