Software Defined storage, Big Data and Ceph - What is all the fuss about?

Dell and Red Hat joined forces at OpenStack Summit 2014 to talk about the very latest enhancements and capabilities delivered in Inktank Ceph Enterprise, including a new erasure coded storage back-end, support for tiering, and the introduction of user quotas. They also shared best practices, lessons learned and architecture considerations founded in real customer deployments of Dell and Inktank Ceph solutions. Learn more: http://del.ly/DellRedHat

Speaker notes:
  • R720XD configurations use 4 TB data drives, 2 x 300 GB OS drives, 2 x 10 GbE NICs, iDRAC 7 Enterprise, LSI 9207-8i/8e HBAs, and 2 x E5-2650 2 GHz processors. (*) The larger capacity figure applies where erasure coding is in use: to get redundancy equivalent to 2x replication, erasure coding needs a raw-to-usable factor of only about 1.2. Erasure coding is a feature of the Ceph Firefly release, which is in its final phase of development. Additional performance could be gained by adding either Intel CAS or Dell Fluid Cache for DAS caching software, but doing so would impose additional memory and processing overhead, plus more deployment and installation work (since it would have to be installed and configured).
  • https://dev.uabgrid.uab.edu/wiki/OpenStackPlusCeph The research computing system (RCS) is built on a collection of distinct hardware systems designed to provide specific services to applications. The RCS hardware includes dedicated compute fabrics that support high performance computing (HPC) applications where hundreds of compute cores can work together on a single application. These clusters of commodity compute hardware make it possible to do data analysis and modelling work in hours, work that would have taken months using a single computer. The clusters are connected with dedicated high-bandwidth, low-latency networks so that applications can efficiently coordinate their actions across many computers and access a shared high-speed storage system for working efficiently with terabytes of data. Our newest hardware fabric, acquired in 2012Q4, is designed to support emerging data-intensive scientific computing and virtualization paradigms. This hardware is very similar to the commodity computers used by our traditional HPC fabrics; however, in addition to having many compute cores and lots of RAM, each individual computer contains 36 TB of built-in disk storage. Taken together, this newest hardware fabric adds 192 cores, 1 TB RAM, and 420 TB of storage to the RCS. The built-in disk storage is designed to support applications running local to each computer. The data-intensive computing paradigm replaces the external storage networks of traditional HPC clusters with the native, very high speed system buses that provide access to local hard disks in each computer. Large datasets are distributed across these computers, and applications are then assigned to run on the specific computer that stores the portion of the dataset each has been assigned to analyze. The hardware requirements for data-intensive computing closely resemble the requirements for virtualization and can benefit tremendously from the configuration flexibility that a virtualization fabric offers. In order to enhance flexibility and further improve support for scaling research applications, we are engineering our latest hardware cluster to act as a virtualized storage and compute fabric. This enables support for a wide variety of storage and compute use cases, most prominently ample storage capacity for reliably housing large research data collections, and flexible application development and deployment capabilities that allow direct user control over all aspects of the application environment. In short, we are tooling this hardware to build a cloud computing environment. We are building this cloud using OpenStack for compute virtualization and Ceph for storage virtualization. Crowbar will provision the raw hardware fabric. This approach is very similar to the model we have been following with our traditional ROCKS-based HPC cluster environment. The new approach enhances our ability to automatically provision hardware and further improves the economics of large-scale computing. We are implementing this environment with Dell and Inktank. These vendors, and the upstream open source projects on which this platform is built, embrace the DevOps model for systems development. This will support further engineering collaboration with our vendors, enabling the UAB research community to continually enhance our fabric as needed and feed those enhancements upstream for inclusion in future support releases. This solution rounds out the feature set of the RCS core and will provide a general framework to scale future growth.
  • User base: 900+ researchers across campus. KVM-based; 2 Nova nodes, 4 primary storage nodes, 4 replication nodes, 2 control nodes; 12 x R720XD systems.

    1. 1. Software Defined storage, Big Data and Ceph. What is all the fuss about? Kamesh Pemmaraju, Sr. Product Mgr, Dell Neil Levine, Dir. of Product Mgmt, Red Hat OpenStack Summit Atlanta, May 2014
    2. 2. CEPH
    3. 3. CEPH UNIFIED STORAGE FILE SYSTEM BLOCK STORAGE OBJECT STORAGE Keystone Geo-Replication Native API 3 Multi-tenant S3 & Swift OpenStack Linux Kernel iSCSI Clones Snapshots CIFS/NFS HDFS Distributed Metadata Linux Kernel POSIX Copyright © 2013 by Inktank | Private and Confidential
    4. 4. ARCHITECTURE 4 Copyright © 2013 by Inktank | Private and Confidential APP HOST/VM CLIENT
    5. 5. COMPONENTS 5 S3/SWIFT HOST/HYPERVISOR iSCSI CIFS/NFS SDK INTERFACES STORAGE CLUSTERS MONITORS OBJECT STORAGE DAEMONS (OSD) BLOCK STORAGE FILE SYSTEM OBJECT STORAGE Copyright © 2014 by Inktank | Private and Confidential
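The native API referenced on the unified storage slide above is librados. A minimal sketch using the python-rados bindings; the pool name 'data' and the object name are assumptions, and a reachable cluster with /etc/ceph/ceph.conf and a client keyring is presumed:

```python
# Minimal librados (python-rados) example: store and read back one object
# directly against the Ceph object store. The pool name "data" is an assumption.
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
print("cluster FSID:", cluster.get_fsid())

ioctx = cluster.open_ioctx('data')          # I/O context bound to one pool
try:
    ioctx.write_full('greeting', b'hello from the native RADOS API')
    print(ioctx.read('greeting'))           # b'hello from the native RADOS API'
finally:
    ioctx.close()
    cluster.shutdown()
```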
    6. 6. THE PRODUCT
    7. 7. 7 INKTANK CEPH ENTERPRISE WHAT’S INSIDE? Ceph Object and Ceph Block Calamari Enterprise Plugins (2014) Support Services Copyright © 2013 by Inktank | Private and Confidential
    8. 8. Copyright © 2013 by Inktank | Private and Confidential USE CASE: OPENSTACK 9
    9. 9. Copyright © 2013 by Inktank | Private and Confidential USE CASE: OPENSTACK 10 Volumes Ephemeral Copy-on-Write Snapshots
    10. 10. Copyright © 2013 by Inktank | Private and Confidential USE CASE: OPENSTACK 11
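To make the Volumes and Copy-on-Write Snapshots bullets concrete, a small sketch against the python-rbd bindings; the pool name 'volumes' and the image/snapshot names are illustrative assumptions, and in an OpenStack deployment Cinder's RBD driver issues essentially the same librbd calls:

```python
# Sketch of the RBD (block) workflow behind Cinder volumes and snapshots:
# create an image, take a point-in-time snapshot, list snapshots.
# Pool, image, and snapshot names are illustrative assumptions.
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('volumes')
try:
    rbd.RBD().create(ioctx, 'demo-vol', 1 * 1024**3)    # 1 GiB block image
    image = rbd.Image(ioctx, 'demo-vol')
    try:
        image.create_snap('before-change')               # cheap, point-in-time snapshot
        print([s['name'] for s in image.list_snaps()])   # ['before-change']
        # Protecting a snapshot and cloning it (rbd.RBD().clone(...)) is what
        # provides the copy-on-write behaviour referenced on the slide.
    finally:
        image.close()
finally:
    ioctx.close()
    cluster.shutdown()
```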
    11. 11. Copyright © 2013 by Inktank | Private and Confidential USE CASE: CLOUD STORAGE 12 S3/Swift S3/Swift S3/Swift S3/Swift
    12. 12. Copyright © 2013 by Inktank | Private and Confidential USE CASE: WEBSCALE APPLICATIONS 13 Native Protocol Native Protocol Native Protocol Native Protocol
    13. 13. ROADMAP INKTANK CEPH ENTERPRISE 14 Copyright © 2013 by Inktank | Private and Confidential May 2014 Q4 2014 2015
    14. 14. Copyright © 2013 by Inktank | Private and Confidential USE CASE: PERFORMANCE BLOCK 15 CEPH STORAGE CLUSTER
    15. 15. Copyright © 2013 by Inktank | Private and Confidential USE CASE: PERFORMANCE BLOCK 16 CEPH STORAGE CLUSTER Read/Write Read/Write
    16. 16. Copyright © 2013 by Inktank | Private and Confidential USE CASE: PERFORMANCE BLOCK 17 CEPH STORAGE CLUSTER Write Write Read Read
    17. 17. Copyright © 2013 by Inktank | Private and Confidential USE CASE: ARCHIVE / COLD STORAGE 18 CEPH STORAGE CLUSTER
    18. 18. ROADMAP INKTANK CEPH ENTERPRISE 19 Copyright © 2013 by Inktank | Private and Confidential April 2014 September 2014 2015
    19. 19. Copyright © 2013 by Inktank | Private and Confidential USE CASE: DATABASES 20 Native Protocol Native Protocol Native Protocol Native Protocol
    20. 20. Copyright © 2013 by Inktank | Private and Confidential USE CASE: HADOOP 21 Native Protocol Native Protocol Native Protocol Native Protocol
    21. 21. 22 Training for Proof of Concept or Production Users Online Training for Cloud Builders and Storage Administrators Instructor led with virtual lab environment INKTANK UNIVERSITY Copyright © 2014 by Inktank | Private and Confidential VIRTUAL PUBLIC May 21 – 22 European Time-zone June 4 - 5 US Time-zone
    22. 22. Ceph Reference Architectures and case study
    23. 23. Outline • Planning your Ceph implementation • Choosing targets for Ceph deployments • Reference Architecture Considerations • Dell Reference Configurations • Customer Case Study
    24. 24. • Business Requirements – Budget considerations, organizational commitment – Replacing Enterprise SAN/NAS for cost savings – XaaS use cases for massive-scale, cost-effective storage – Avoid lock-in – use open source and industry standards – Steady-state vs. spike data usage • Sizing requirements – What is the initial storage capacity? – What is the expected growth rate? • Workload requirements – Does the workload need high performance, or is it more capacity focused? – What are the IOPS/throughput requirements? – What applications will be running on the Ceph cluster? – What type of data will be stored? Planning your Ceph Implementation
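The sizing questions above reduce to simple arithmetic; a back-of-the-envelope sketch in which the growth rate, protection factor, and fill target are illustrative assumptions:

```python
# Back-of-the-envelope cluster sizing from the planning questions above.
# All inputs are illustrative assumptions, not recommendations.
def raw_capacity_needed(initial_tb, annual_growth, years, protection_factor, fill_target=0.75):
    """Raw disk needed so that usable space covers projected data.

    protection_factor: 2.0 for 2x replication, ~1.2 for the erasure-code
    factor cited in the speaker notes; fill_target keeps headroom so the
    cluster is never run completely full.
    """
    projected = initial_tb * (1 + annual_growth) ** years
    return projected * protection_factor / fill_target

# Example: 100 TB today, 40% yearly growth, 3-year horizon, 2x replication.
print(round(raw_capacity_needed(100, 0.40, 3, 2.0)))   # ~732 TB raw
```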
    25. 25. How to Choose Targets Use Cases for Ceph Virtualization and Private Cloud (traditional SAN/NAS) High Performance (traditional SAN) PerformanceCapacity NAS & Object Content Store (traditional NAS) Cloud Applications Traditional IT XaaS Compute Cloud Open Source Block XaaS Content Store Open Source NAS/Object Ceph Target Ceph Target
    26. 26. • Tradeoff between cost and reliability (use-case dependent) • How many node failures can be tolerated? – SSD mirroring – Distributing nodes across failure zones • In a multi-rack scenario, should a whole rack failure be tolerated? • Is there a need for multi-site data replication? • Erasure coding (more usable capacity from the same raw disks, but more CPU load) • Plan for redundancy of the monitor nodes – distribute across fault zones? Architectural considerations – Redundancy and replication considerations
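The replication-versus-erasure-coding tradeoff can be made concrete with a small comparison; the profiles are illustrative, and the k=5, m=1 case reproduces the ~1.2x raw-to-usable factor quoted in the speaker notes above:

```python
# Storage overhead vs. failures tolerated, for the redundancy tradeoff above.
# Profiles are illustrative; real choices depend on failure domains and use case.
def replication(copies):
    # n-way replication stores every object `copies` times and survives copies-1 losses.
    return {"scheme": f"{copies}x replication", "overhead": float(copies), "failures": copies - 1}

def erasure(k, m):
    # k data chunks + m coding chunks: raw/usable overhead is (k+m)/k, survives m losses.
    return {"scheme": f"EC k={k}, m={m}", "overhead": (k + m) / k, "failures": m}

for p in (replication(2), replication(3), erasure(5, 1), erasure(4, 2)):
    print(f"{p['scheme']:<16} overhead {p['overhead']:.2f}x, tolerates {p['failures']} failure(s)")
# EC k=5, m=1 gives the ~1.2x raw-to-usable factor from the speaker notes, with
# the same single-failure tolerance as 2x replication but more CPU work.
```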
    27. 27. Server Considerations • Storage Node: – one OSD per HDD, 1–2 GB RAM, and roughly 1 GHz of one core per OSD – Dedicated SSD-only nodes for journaling and for using the tiering feature in Firefly – Erasure coding will increase usable capacity at the expense of additional compute load – SAS JBOD expanders for extra capacity (beware of extra latency and oversubscribed SAS lanes) • Monitor nodes (MON): odd number for quorum; services can be hosted on the storage nodes for smaller deployments, but will need dedicated nodes for larger installations • Dedicated RADOS Gateway nodes for large object store deployments
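A quick check of the per-OSD rules of thumb just listed (one OSD per HDD, 1-2 GB of RAM and roughly 1 GHz of a core per OSD); the example node assumes the R720XD build from the speaker notes and capacity table (10 data HDDs, two E5-2650 processors at 2 GHz, taken here as 8 cores each, 24 GB RAM):

```python
# Per-node resource check for the rules of thumb above:
# one OSD per data HDD, 1-2 GB RAM per OSD, ~1 GHz of CPU per OSD.
def storage_node_check(data_hdds, cores, core_ghz, ram_gb):
    osds = data_hdds                              # one OSD daemon per data drive
    ram_needed_gb = osds * 2                      # upper end of the 1-2 GB/OSD rule
    cpu_needed_ghz = osds * 1.0                   # ~1 GHz per OSD
    return {
        "osds": osds,
        "ram_ok": ram_gb >= ram_needed_gb,
        "cpu_ok": cores * core_ghz >= cpu_needed_ghz,
    }

# Example: R720XD-style node, 10 data HDDs, 2 x 8-core E5-2650 @ 2.0 GHz, 24 GB RAM.
print(storage_node_check(data_hdds=10, cores=16, core_ghz=2.0, ram_gb=24))
# {'osds': 10, 'ram_ok': True, 'cpu_ok': True}
```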
    28. 28. Networking Considerations • Dedicated or shared network – Be sure to involve the networking and security teams early when designing your networking options – Network redundancy considerations – Dedicated client and OSD networks – VLANs vs. dedicated switches • 1 GbE vs. 10 GbE fabric – Depends on cost, performance, and use case (data access patterns and IOPS requirements) • Networking design – Spine and leaf – Multi-rack – Core fabric connectivity
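One way to ground the 1 GbE versus 10 GbE decision is to compare a node's aggregate disk bandwidth with its network links; the ~100 MB/s per-HDD streaming rate and the 2x write amplification from replication are assumptions:

```python
# Rough check of whether a node's network can keep up with its disks,
# for the 1 GbE vs. 10 GbE question above. 100 MB/s per HDD is an assumed
# streaming rate; replication roughly doubles write traffic on the OSD network.
def node_bandwidth_gbit(data_hdds, mb_per_hdd=100, replication_write_amp=2):
    disk_mb_s = data_hdds * mb_per_hdd
    return {
        "reads_gbit":  disk_mb_s * 8 / 1000,
        "writes_gbit": disk_mb_s * replication_write_amp * 8 / 1000,
    }

print(node_bandwidth_gbit(10))
# {'reads_gbit': 8.0, 'writes_gbit': 16.0} -> a single 1 GbE link saturates
# immediately; 10 GbE, ideally with separate client and cluster networks, fits better.
```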
    29. 29. Ceph additions coming to the Dell Red Hat OpenStack solution Pilot configuration Components • Dell PowerEdge R620/R720/R720XD Servers • Dell Networking S4810/S55 Switches, 10GbE • Red Hat Enterprise Linux OpenStack Platform • Dell ProSupport • Dell Professional Services • Avail. w/wo High Availability Specs at a glance • Node 1: Red Hat OpenStack Manager • Node 2: OpenStack Controller (2 additional controllers for HA) • Nodes 3-8: OpenStack Nova Compute • Nodes 9-11: Ceph, 12 x 3 TB raw storage • Network Switches: Dell Networking S4810/S55 • Supports ~ 170-228 virtual machines Benefits • Rapid on-ramp to OpenStack cloud • Scale up, modular compute and storage blocks • Single point of contact for solution support • Enterprise-grade OpenStack software package Storage bundles
    30. 30. Example Ceph Dell Server Configurations
        • Performance (20 TB): R720XD, 24 GB DRAM, 10 x 4 TB HDD (data drives), 2 x 300 GB SSD (journal)
        • Capacity (44 TB / 105 TB*): R720XD, 64 GB DRAM, 10 x 4 TB HDD (data drives), 2 x 300 GB SSD (journal); MD1200 with 12 x 4 TB HDD (data drives)
        • Extra Capacity (144 TB / 240 TB*): R720XD, 128 GB DRAM, 12 x 4 TB HDD (data drives); MD3060e (JBOD) with 60 x 4 TB HDD (data drives)
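The usable-capacity figures in this table can be sanity-checked from the raw drive counts; the sketch assumes 2x replication for the first figure and the ~1.2x erasure-coding factor from the speaker notes for the asterisked figure:

```python
# Sanity check of the table's usable-capacity figures from raw drive counts.
# Assumes 2x replication for the base figure and a ~1.2x erasure-coding
# raw-to-usable factor (per the speaker notes) for the asterisked figure.
def usable_tb(raw_tb, replica_copies=2, ec_overhead=1.2):
    return {"replicated": raw_tb / replica_copies, "erasure_coded": raw_tb / ec_overhead}

performance_raw    = 10 * 4            # R720XD: 10 x 4 TB           = 40 TB raw
extra_capacity_raw = 12 * 4 + 60 * 4   # R720XD + MD3060e: 72 x 4 TB = 288 TB raw

print(usable_tb(performance_raw))      # {'replicated': 20.0, ...}                     -> "20 TB"
print(usable_tb(extra_capacity_raw))   # {'replicated': 144.0, 'erasure_coded': 240.0} -> "144 TB / 240 TB*"
```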
    31. 31. • Dell & Red Hat & Inktank have partnered to bring a complete Enterprise-grade storage solution for RHEL-OSP + Ceph • The joint solution provides: – Co-engineered and validated Reference Architecture – Pre-configured storage bundles optimized for performance or storage – Storage enhancements to existing OpenStack Bundles – Certification against RHEL-OSP – Professional Services, Support, and Training › Collaborative Support for Dell hardware customers › Deployment services & tools What Are We Doing To Enable?
    32. 32. UAB Case Study
    33. 33. Overcoming a data deluge Inconsistent data management across research teams hampers productivity • Growing data sets challenged available resources • Research data distributed across laptops, USB drives, local servers, HPC clusters • Transferring datasets to HPC clusters took too much time and clogged shared networks • Distributed data management reduced researcher productivity and put data at risk
    34. 34. Solution: a storage cloud Centralized storage cloud based on OpenStack and Ceph • Flexible, fully open-source infrastructure based on Dell reference design − OpenStack, Crowbar and Ceph − Standard PowerEdge servers and storage − 400+ TBs at less than 41¢ per gigabyte • Distributed scale-out storage provisions capacity from a massive common pool − Scalable to 5 petabytes • Data migration to and from HPC clusters via dedicated 10Gb Ethernet fabric • Easily extendable framework for developing and hosting additional services − Simplified backup service now enabled “We’ve made it possible for users to satisfy their own storage needs with the Dell private cloud, so that their research is not hampered by IT.” David L. Shealy, PhD Faculty Director, Research Computing Chairman, Dept. of Physics
    35. 35. Building a research cloud Project goals extend well beyond data management • Designed to support emerging data-intensive scientific computing paradigm – 12 x 16-core compute nodes – 1 TB RAM, 420 TBs storage – 36 TBs storage attached to each compute node • Virtual servers and virtual storage meet HPC − Direct user control over all aspects of the application environment − Ample capacity for large research data sets • Individually customized test/development/production environments − Rapid setup and teardown • Growing set of cloud-based tools & services − Easily integrate shareware, open source, and commercial software “We envision the OpenStack-based cloud to act as the gateway to our HPC resources, not only as the purveyor of services we provide, but also enabling users to build their own cloud-based services.” John-Paul Robinson, System Architect
    36. 36. Research Computing System (Next Gen) A cloud-based computing environment with high speed access to dedicated and dynamic compute resources. [Architecture diagram: twelve OpenStack nodes; a virtualized server and storage computing cloud based on OpenStack, Crowbar and Ceph; a cloud services layer; HPC clusters and HPC storage; DDR and QDR InfiniBand and 10Gb Ethernet interconnects; the UAB Research Network.]
    37. 37. THANK YOU!
    38. 38. Contact Information Reach Kamesh and Neil for additional information: Dell.com/OpenStack Dell.com/Crowbar Inktank.com/Dell Kamesh_Pemmaraju@Dell.com @kpemmaraju Neil.Levine@Inktank.com @neilwlevine Visit the Dell and Inktank booths in the OpenStack Summit Expo Hall
