The Importance of Fast, Scalable Storage for Today’s HPC

Today, data drives discovery, and discoveries are key to creating sustained advantages. The better your critical workflows can create and access data, the better you’ll be able to discover new, innovative solutions to important problems, or to create entirely new products. More than ever before, data-intensive applications need the sustained performance and virtually unlimited scalability that only parallel storage software delivers.

Designed for maximum performance and scale, storage solutions powered by Lustre software deliver the performance at scale to meet today’s storage requirements. As the most widely used parallel storage system for HPC, Lustre-powered storage is the ideal storage foundation.

But scalable, high-performance storage by itself solves only half the problem. Today’s users expect storage solutions that deliver sustained performance, scale upward to near-limitless capacities, and are simple to install and manage. Intel® Enterprise Edition for Lustre* software combines the straight-line speed and scale of Lustre with the bottom-line need for lower management complexity and cost.

As the recognized leader in the development and support of the Lustre file system, Intel has the expertise to make storage solutions for data-intensive applications faster, smarter and easier.
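
To make "parallel storage" concrete: Lustre can stripe a file across several storage targets in round-robin fashion, so different parts of one file are served by different servers at the same time. The following is a minimal sketch of that mapping and not Lustre code. The function names, the 1 MiB stripe size and the stripe count of 4 are illustrative assumptions, and real file layouts carry further parameters (for example, which target the file starts on).

```python
MIB = 1024 * 1024


def target_for_offset(offset: int, stripe_size: int, stripe_count: int) -> int:
    """Return the index (0 .. stripe_count-1) of the target holding this byte.

    Simplified round-robin model: stripe 0 goes to target 0, stripe 1 to
    target 1, and so on, wrapping around after stripe_count targets.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe size and count must be > 0")
    return (offset // stripe_size) % stripe_count


def bytes_per_target(file_size: int, stripe_size: int, stripe_count: int) -> list:
    """Return how many bytes of a file of file_size land on each target."""
    if file_size < 0:
        raise ValueError("file_size must be >= 0")
    totals = [0] * stripe_count
    offset = 0
    while offset < file_size:
        # The last stripe may be shorter than stripe_size.
        chunk = min(stripe_size, file_size - offset)
        totals[target_for_offset(offset, stripe_size, stripe_count)] += chunk
        offset += chunk
    return totals


if __name__ == "__main__":
    # Hypothetical example: a 10 MiB file, 1 MiB stripes, 4 targets.
    layout = bytes_per_target(10 * MIB, 1 * MIB, 4)
    print([n // MIB for n in layout])  # MiB stored on each target
```

Because consecutive stripes sit on different targets, a large sequential read or write can keep all of them busy at once, which is the intuition behind the throughput claims in this deck.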

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

  • Just like we saw in HPC 10 years ago. We’ve been dealing with it – and have solutions such as Lustre. Object storage comment: it changes the applications.
    Points to be made here:
    • Data is growing at exponential rates
    • Scale-out, parallel file systems are currently ahead of the game
    • Lustre has several sites at > 1GB/s and 55 PB formatted
    • Lustre is the market leader for parallel FS in HPC with 27.5% market share
    • As I said – it’s a mature and dependable technology for HPC
    • Big Data is often (incorrectly) equated to Hadoop
    • Hadoop relies on local disk
    • Hadoop throws away 75% of its resources
    • Lustre, optimized for Hadoop (IDH+IEEL), is the obvious answer for HPC sites
    • It may even be the answer for Enterprise Hadoop – the debate is just starting
    • Cloud is building momentum for HPC
    • Lustre on AWS
    • Cloud-bursting HPC jobs is possible
    • Bio-genomics sites reportedly buying 10 GigE uplinks into AWS
    • Next stop: OpenStack
  • More than ever before, today’s most important applications and workflows are used to create sustained competitive advantages for our customers – whether they’ve been long-time users of HPC, or new, often commercial enterprises seeking to exploit high performance storage to power critical commercial applications.
    Enterprises and institutions of all sizes and from around the world are investing in storage solutions that are faster, far more scalable, and highly flexible to meet an array of configurations. The file system software used with these new solutions plays a critical role: just as Intel led the shift toward parallel computing several years ago, future storage solutions must be parallel too. Purpose-built to meet the most demanding storage requirements – today and into the future – the Lustre file system powers the fastest computers in the world.
    • Most widely used parallel file system for high performance computing [1]
    • Powers +60% of the fastest 100 supercomputers worldwide [2]
    • And 7 of the top 10 supercomputers too…
    • Virtually unlimited performance and scalability
    • +2 TB/second throughput, many production sites between 500 and 750 GB/second
    • Shared, parallel and distributed storage – vast global namespace
    • Conforms to key industry standards (POSIX)
    • Flexible, open, highly efficient configurations preserve vendor choice
  • Compared with mainstream enterprise applications, compute-intensive, high-performance computing (HPC) places very different demands on storage systems.
    HPC applications tend to be used for compute and data intensive applications, including: computational analysis, data-intensive research, rich media, seismic processing, and large-scale simulation. Driven by CPU-intensive processing, such applications handle large volumes of data over short periods of time while also in some cases permitting simultaneous access from multiple servers.
  • ----- Meeting Notes (1/8/14 13:57) -----
    Course content split into 3 sections:
    1. Overview - Introduction to Lustre
    2. Products - Intel Enterprise Edition
    3. Resources to help you sell
  • ----- Meeting Notes (1/8/14 13:57) -----
    So, what is Lustre?
    ----- Meeting Notes (1/8/14 16:24) -----
    SUPPORTS POSIX SEMANTICS
    Applications designed to work with POSIX compliant file systems will work with Lustre. Very important to MANY end users.
  • ----- Meeting Notes (1/8/14 16:33) -----
    MENTION GEOSCIENCES AND ENERGY
  • ----- Meeting Notes (1/8/14 16:19) -----
    So, let's take a closer look at a typical Lustre solution…
  • ----- Meeting Notes (1/8/14 16:19) -----
    All Lustre solutions, regardless of size and performance, are built using the same components.
    There are 3 types of dedicated servers found in Lustre solutions.
    Servers devoted to MANAGEMENT TASKS are the first type, as seen here.
    Basically, METADATA SERVERS are responsible for managing file and directory information. There is 1 METADATA SERVER for EACH FILE SYSTEM.
    MANAGEMENT SERVERS hold cluster configuration information, including servers and clients. There is 1 MGMT SERVER for each INSTALLATION.
  • ----- Meeting Notes (1/8/14 16:19) -----
    OBJECT STORAGE SERVERS are the next type found in a Lustre configuration.
    OSS provide I/O services to CLIENTS. Lustre solutions often have MANY OBJECT STORAGE SERVERS… for virtually unlimited scale-out storage.
    Multiple servers are aggregated together to create a single, global namespace.
  • ----- Meeting Notes (1/8/14 16:19) -----
    Lustre clients are the third type of server. Think of these as DISKLESS COMPUTE nodes that run applications and use the storage services -- capacity and I/O -- provided by OSS.
    ALL SERVERS RUN LINUX - EITHER Red Hat OR CentOS
  • ----- Meeting Notes (1/8/14 16:19) -----
    Finally, Lustre solutions have 2 types of networks:
    1. Management network
    2. Data network, most often InfiniBand
  • ----- Meeting Notes (1/8/14 16:33) -----
    The result?
    Unmatched speed and scale
    • +2 TB/s
    • Upward of 80K IO operations per second
  • As storage solutions continue to grow in complexity, powerful yet easy-to-use software tools to install, configure, monitor, manage, and optimize Lustre-based solutions are essential.
    Intel® Manager for Lustre* (IML) has been purpose-built to simplify the installation, configuration and management of fast, scalable storage solutions based on Intel Enterprise Edition for Lustre software. IML is a core component of Intel Enterprise Edition for Lustre software.
    Key Features
    Following are key features afforded by Intel® Manager for Lustre* software (as a component of Intel® Enterprise Edition for Lustre* software):
    • GUI-based creation and management of Lustre* file systems: The Intel® Manager for Lustre* software provides a powerful, yet easy-to-use GUI that enables rapid creation of Lustre file systems. The GUI supports easy configuration for high availability and expansion, and enables performance monitoring and management of multiple Lustre file systems.
    • Graphical charts display real-time performance metrics: Fully configurable color charts display a variety of real-time performance metrics for single or multiple file systems, down to individual servers and targets, and reveal metrics such as OST balance, file system capacity, metadata operations, bandwidth, read/write operations, and various resource usage parameters, among others.
    • Auto-configured high-availability clustering for server pairs: Pacemaker and Corosync are configured automatically when the system design follows configuration guidance. This removes the need for manually installing HA configuration files on storage servers and simplifies high-availability configuration.
    • Power Distribution Unit configuration and server outlet assignments support automatic failover: The PDU tab lets you configure and manage power distribution units. At this tab you can add a detected PDU and assign specific PDU outlets to specific servers. When you associate PDU failover outlets with servers using this tool, STONITH is automatically configured.
    • IPMI and BMC configuration: As an alternative to PDU configuration, support for Intelligent Platform Management Interface (IPMI) and baseboard management controllers (BMCs) enables server monitoring, high-availability configuration, and failover.
    • Simplified ISO-less installation and automated deployment mechanism streamlines overall installation: The installation strategy removes the need to manually install the software on each server. Intel® Manager for Lustre* software is quickly installed on the manager server, and from there, required packages are automatically deployed to all storage servers. Storage servers and the manager server can run the same standard operating system as the rest of your estate. Additional software built for CentOS or Red Hat will also work on servers managed by Intel® Manager for Lustre* software.
    End-user benefits:
    • Reduced management complexity and costs, enabling storage administrators to exploit the performance and scalability of Lustre storage
    • Accelerated deployment of fast storage to support critical applications and workflows
    • (near?) Real-time storage monitoring lets you track Lustre file system usage, performance metrics, events, and errors at the file system layer.
    • (optional, provided by 3rd party) Storage plug-ins enable monitoring of hardware-level performance data, disk errors and faults, and other hardware-related information.
    • Pre-existing Lustre file systems that were not configured and created using IML can be monitored using the intuitive dashboard GUI
    The IML server (inside orange box) hosts the Intel® Manager for Lustre* software, and is the server from which all management tasks are performed, from installing software onto storage servers to configuring, monitoring, and managing Lustre file systems. The IML server runs Linux and communicates with other servers (metadata and storage) via the management network.
  • Defining the storage cluster and file system attributes is performed within the Configuration tab along the top of the menu bar. Working from left to right, it’s simple to define and create a new Lustre file system using Intel® Manager for Lustre:
    • Servers: This tab lets you add servers to the storage system, provides server status information, and lets you start, stop, and remove servers. SSH and HTTPS are used for secure communications with servers. At least 2 servers are required for highly available configurations.
    • Volumes: This tab provides features to configure primary and failover servers in file systems with servers configured for high availability. Each storage target corresponds to a single volume. If servers in the volume are physically connected and then configured for high availability (using this Volumes tab and the PDUs tab, next), then primary and failover servers can be specified for the storage target. A volume may be accessible on one or more servers via different device nodes, and it may be accessible via multiple device nodes on the same host.
    • Power Control: Configure PDUs and then assign specific PDU outlets to specific servers.
    • MGT: Configure and create metadata and storage server targets.
    • File Systems: Select management and storage targets, adjust tuning parameters and create Lustre file systems. The creation of the file system is started from this tab.
    • Storage: Charting and reporting of storage hardware using vendor-specific plug-ins.
    • Users: Creating and managing various administrative accounts.
    You can easily monitor one or more file systems at the Dashboard, Alerts, History, and Logs pages. The Dashboard page displays a set of charts that provide usage and performance data at several levels in the file systems being monitored, while the Alerts, History, and Logs pages keep you informed of file system activity relevant to current and past file system health and performance.
    Status indicator
    The Status indicator provides a quick glance at the status of all managed file systems.
    • A green light means all is normal.
    • A yellow light indicates that one or more warning messages have been received (events or alerts) and should be checked. The file system may be operating in a degraded mode (for example, a target has failed over), so performance may be degraded.
    • A red light indicates that one or more errors have occurred. The file system may be down or performance may be severely degraded.
  • Big Data problems that typically use Hadoop are getting larger, putting more demands on resources. They need:
    • more compute power
    • faster IO access (perhaps faster than what HDFS can provide)
    • more capacity (they are processing more data)
    Moreover, there is an increasing need to integrate MR/Hadoop applications with applications that may not be MR appropriate. Typically these applications use POSIX for IO, which means they cannot use HDFS.
    On the other hand, applications that are considered “HPC” are being written to use MR and to use Hadoop. Examples include:
    • Genomics: Crossbow (genomic pipeline)
    • Lots of financial applications (example: portfolio analysis)
    • Big Pharma: drug discovery, molecular docking
    • Chemistry: molecular dynamics
    • Astrophysics: image processing and analysis
    • Geospatial data (satellite data)
    Since Hadoop applications require lots of local storage (not common in HPC), dedicated hardware must be purchased and admins must be trained on the care and feeding of Hadoop.
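
The Status indicator rules in the Intel® Manager for Lustre* notes above (green when all is normal, yellow on one or more warnings, red on one or more errors) amount to "show the worst severity seen". The sketch below illustrates that rule only; it is not IML's implementation, and the `Severity` names and the `status_light` function are hypothetical.

```python
from enum import Enum


class Severity(Enum):
    """Hypothetical message severities, ordered from least to most serious."""
    INFO = 0
    WARNING = 1
    ERROR = 2


def status_light(severities) -> str:
    """Map the outstanding messages to a single indicator color.

    red    -> at least one error has occurred
    yellow -> no errors, but at least one warning
    green  -> nothing outstanding, or informational messages only
    """
    worst = max((s.value for s in severities), default=Severity.INFO.value)
    if worst >= Severity.ERROR.value:
        return "red"
    if worst == Severity.WARNING.value:
        return "yellow"
    return "green"


if __name__ == "__main__":
    print(status_light([]))
    print(status_light([Severity.INFO, Severity.WARNING]))
    print(status_light([Severity.WARNING, Severity.ERROR]))
```

Taking the maximum means a single error always outranks any number of warnings, which matches the description of the red light.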

The Importance of Fast, Scalable Storage for Today’s HPC: Presentation Transcript

  • The Importance of Fast, Scalable Storage for Today’s HPC. Bill Webster, High Performance Data Division, Intel Corporation. For more follow @IntelITCenter on Twitter
  • Some Data About Data…
    • 2.5 quadrillion bytes of data created daily [1]
    • >80% of today’s data is unstructured
    • ~90% of the world’s data has been created within the last 2 years…
    [1] Source: IBM
  • The Case for Fast, Scalable Storage
    • Solving important problems drives technology investments
    • Fast storage is critical for maximum application performance
    • Lustre software was created for performance at large scale
    • Storage fueled by Lustre* is stable, flexible and highly efficient
    • Lustre is the most widely used parallel storage for HPC [1]
    • Over 60% of the fastest 100 HPC sites worldwide rely on Lustre [2]
    [1] Source: IDC research
    [2] Intel analysis of www.top500.org rankings, December 2013
    * Some names and brands may be claimed as the property of others.
  • HPC Places Unique Demands on Storage
    • Workloads are diverse and dynamic, and applications are compute or data-intensive – often both
    • The value of HPC storage is measured by speed, scale & IOPS
    • To meet these requirements, HPC storage needs to:
      • Scale-out for increased I/O and capacity
      • Perform I/O in parallel for maximum throughput
      • Support virtually unlimited number of clients
    • Commercial “HPC” needs the same level of performance
    Lustre was architected for speed, scale and IOPS
  • HPC Storage Software: Introducing the Lustre file system
  • What is Lustre*?
    • Open source, distributed, parallel, clustered file system
    • Designed for maximum performance at massive scale
    • POSIX compliant – key for supporting applications
    • Global, shared name space – all clients can access all data
    • Very resource efficient and cost effective
  • What Makes Lustre* So Important?
    Purpose-built for speed and scale:
    • Speed: Unmatched performance
    • Openness: Choice of storage platforms
    • Efficiency: Achieves +90% utilization of storage resources
    • Affordable: Low CAPEX and OPEX
    • Scale-out: Independently scale storage capacity and bandwidth
    • Stable and reliable: Backed by Intel, the worldwide leader in Lustre support
  • Good Fit Applications for Lustre*…
    • Financial analysis – modeling risk exposure & portfolio valuation
    • Geosciences – weather forecasting and climate modeling
    • Bioinformatics – genomics, proteomics, drug discovery
    • Energy – exploration, reservoir modeling, wind energy
    • Engineering – CAE, CFD and FEA for aerospace, automotive
    [Graphic labels: Science, Analytics, Engineering]
  • What Does a Lustre* Solution Look Like?
    [Architecture diagram. Labels: Management Network; High Performance Data Network (InfiniBand, 10GbE); Metadata Servers; Object Storage Servers; Intel Manager for Lustre* (requires Enterprise Edition); Object Storage Targets (OSTs); Metadata Target (MDT); Management Target (MGT); Lustre Clients – diskless compute servers]
  • Management Servers (item 1 on the same architecture diagram as the previous slide)
  • Storage Servers (item 2 on the same architecture diagram)
  • Compute clients (item 3 on the same architecture diagram)
  • Interconnect fabric (item 4 on the same architecture diagram)
  • The Results? Fast, scalable storage & I/O
    [Architecture diagram repeated]
    • Over +2 TB/s achieved
    • 500-750 GB/s production
    • +80,000 IO/s
  • Intel® Lustre Solutions: Enterprise Edition for Lustre* software
  • Intel® Enterprise Edition for Lustre*: Intel® Manager for Lustre is the heart of all Intel EE for Lustre based solutions.
  • Intel® Manager for Lustre*
    • The ‘dashboard’ canvas displays a variety of charts that illustrate performance levels and resource utilization.
    • Visual system status indicator
    • Configure, create and optimize Lustre file systems
    • Intelligent, intuitive logging – understand how your storage is performing quickly and easily
  • A word about Big Data.
  • The Convergence of HPC and Big Data
    • Big Data problems are getting larger
    • More compute power. More files. More capacity and data throughput
    • MapReduce workloads are being added to HPC environments
    • 1 in 3 HPC sites have deployed Hadoop [1]
    • But MapReduce workloads run differently than typical HPC applications
    • Compute nodes are diskless – no local storage
    • By default, Hadoop expects local storage within each node
    • Lustre storage accelerates the value of Hadoop
    • Improves application performance
    • Boosts storage efficiency and lowers management complexity
    [1] Source: IDC research
  • Intel® Enterprise Edition for Lustre* software includes the Hadoop ‘adapter’ for Lustre
    • Replacement for HDFS
    • Shared, parallel storage optimizes performance
    • Lowers management complexity
    • Maximizes utilization of storage resources
  • Case Study: Sanger Wellcome Trust
    Challenge: Improved processes and lab equipment led to exponential increases in the volume of data being generated – but storage budgets were growing slowly. Large data sets are difficult to proactively manage, and can easily overwhelm storage resources. Un-optimized storage had a direct, negative impact on application performance – slowing the time for breakthrough results.
    Solution: Exploit the power and scale of HPC-class storage, powered by Lustre* software and supported by Intel.
    Benefits provided:
    • Openness – Broad array of storage vendors and products
    • Global namespace – all clients can access all data
    • Performance – Upwards of 1 TB/s
    • Capacity – Virtually unlimited file system and per file sizes
    • Confidence – Backed by Intel expertise with Lustre
    Also on the slide:
    • 10-15 TB of processed data weekly
    • Processed data is small fraction of overall storage capacity
    • Stored in iRODS data warehouse
    • BAM or FASTA format files
    • Use pattern matching algorithms like BWA and BLAST
    • Lustre offers immense scalable capacity
    • Now have 8 production Lustre file systems – and are planning to add more
    • Performance was main goal – but scale, flexibility, efficiency were critical
  • Thank You.
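
The scale-out idea that runs through the transcript (object storage servers are added to grow both capacity and bandwidth) can be sketched as a small model. This is an idealized illustration, not a sizing tool: the class and function names are hypothetical, the per-server figures of 5 GB/s and 100 TB are invented placeholders, and real throughput also depends on the network, the clients and the workload.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ObjectStorageServer:
    name: str
    bandwidth_gbs: float  # assumed sustained throughput of this server, GB/s
    capacity_tb: float    # assumed total capacity of its storage targets, TB


@dataclass
class LustreFileSystem:
    name: str
    metadata_server: str  # the speaker notes describe one per file system
    storage_servers: list = field(default_factory=list)

    def add_oss(self, oss: ObjectStorageServer) -> None:
        self.storage_servers.append(oss)

    def aggregate_bandwidth_gbs(self) -> float:
        # Idealized: servers work in parallel, so their bandwidth adds up.
        return sum(s.bandwidth_gbs for s in self.storage_servers)

    def capacity_tb(self) -> float:
        return sum(s.capacity_tb for s in self.storage_servers)


def servers_needed(target_gbs: float, per_server_gbs: float) -> int:
    """Smallest number of identical servers whose summed bandwidth meets the target."""
    if per_server_gbs <= 0:
        raise ValueError("per_server_gbs must be positive")
    return math.ceil(target_gbs / per_server_gbs)


if __name__ == "__main__":
    fs = LustreFileSystem(name="scratch", metadata_server="mds0")
    for i in range(4):
        fs.add_oss(ObjectStorageServer(f"oss{i}", bandwidth_gbs=5.0, capacity_tb=100.0))
    print(fs.aggregate_bandwidth_gbs(), "GB/s,", fs.capacity_tb(), "TB")
    # Servers required for the 500 GB/s production figure at the assumed 5 GB/s each.
    print(servers_needed(500.0, 5.0))
```

Under these assumptions, doubling the number of object storage servers doubles both totals, which is what "independently scale storage capacity and bandwidth" suggests in the best case.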