SlideShare a Scribd company logo
12th ANNUAL WORKSHOP 2016
HPC STORAGE AND IO TRENDS
AND WORKFLOWS
Gary Grider, Division Leader, HPC Division
April 4, 2014
Los Alamos National Laboratory
LA-UR-16-20184
OpenFabrics Alliance Workshop 2016
EIGHT DECADES OF PRODUCTION WEAPONS
COMPUTING TO KEEP THE NATION SAFE
2	
CM-2	IBM	Stretch	 CDC	 Cray	1	 Cray	X/Y	Maniac	
CM-5	 SGI	Blue	Mountain	 DEC/HP	Q	 IBM	Cell	Roadrunner	 Cray	XE	Cielo	
Cray	Intel	KNL	Trinity	 Ziggy	DWave	
Cross	Roads
OpenFabrics Alliance Workshop 2016
ECONOMICS HAVE SHAPED OUR WORLD
The beginning of storage layer proliferation circa 2009
§  Economic modeling for
large burst of data from
memory shows
bandwidth / capacity
better matched for solid
state storage near the
compute nodes
$0	
$5,000,000	
$10,000,000	
$15,000,000	
$20,000,000	
$25,000,000	
2012	
2013	
2014	
2015	
2016	
2017	
2018	
2019	
2020	
2021	
2022	
2023	
2024	
2025	
Hdwr/media	cost	3	mem/mo	10%	FS	
new	servers	
new	disk	
new	cartridges	
new	drives	
new	robots	
n  Economic	modeling	for	archive	
shows	bandwidth	/	capacity	be?er	
matched	for	disk
OpenFabrics Alliance Workshop 2016
THE HOOPLA PARADE CIRCA 2014
Data	Warp
OpenFabrics Alliance Workshop 2016
WHAT ARE ALL THESE STORAGE LAYERS?
WHY DO WE NEED ALL THESE STORAGE LAYERS?
§  Why
• BB: Economics (disk bw/iops too expensive)
• PFS: Maturity and BB capacity too small
• Campaign: Economics (tape bw too expensive)
• Archive: Maturity and we really do need a “forever”
Memory	
Burst	Buffer	
Parallel	File	System	
Campaign	Storage	
Archive	
Memory	
Parallel	File	System	
Archive	
HPC	Before	Trinity	
HPC	A]er	Trinity	
1-2	PB/sec																			
Residence	–	hours											
Overwri`en	–	conanuous	
4-6	TB/sec																											
Residence	–	hours					
Overwri`en	–	hours	
1-2	TB/sec																				
Residence	–	days/weeks	
Flushed	–	weeks	
100-300	GB/sec																
Residence	–	months-year	
Flushed	–	months-year	
10s	GB/sec	(parallel	tape	
Residence	–	forever	
HPSS	Parallel	
Tape	
Lustre	
Parallel	File	
System	
DRAM
OpenFabrics Alliance Workshop 2016
BURST BUFFERS ARE AMONG US
§  Many instances in the world now, some at multi-PB Multi-TB/s
scale
§  Uses
•  Checkpoint/Restart, Analysis, Out of core
§  Access
•  Largely POSIX like, with JCL for stage in, stage out based on job exit health, etc.
§  A little about Out of Core
•  Before HBM we were thinking DDR ~50-100 GB/sec and NVM 2-10 GB/sec (10X
penalty and durability penalty)
•  Only 10X penalty from working set speed to out of core speed
•  After HBM we have HBM 500-1000 GB/sec, DDR 50-100 GB/sec, and NVS 2-10
GB/sec (100X penalty from HBM to NVS)
•  Before HBM, out of core seemed like it might help some read mostly apps
•  After HBM, using DDR for working set seems limiting, but useful for some
•  Using NVS for read mostly out of core seems limiting too, but useful for some
OpenFabrics Alliance Workshop 2016
CAMPAIGN STORAGE SYSTEMS ARE AMONG US TOO – MARFS
MASSIVELY SCALABLE POSIX LIKE FILE SYSTEM NAME PACE
OVER CLOUD STYLE ERASURE PROTECTED OBJECTS
§  Background
•  Object Systems provide massive scaling and efficient erasure
•  Friendly to applications, not to people. People need a name space.
•  Huge Economic appeal (erasure enables use of inexpensive storage)
•  POSIX name space is powerful but has issues scaling
§  The challenges
•  Mismatch of POSIX an Object metadata, security, read/write size/semantics
•  No update in place with Objects
•  Scale to Trillions of files/directories and Billions of files in a directory
•  100’s of GB/sec but with years data longevity
§  Looked at
•  GPFS, Lustre, Panasas, OrangeFS, Cleversafe/Scality/EMC ViPR/Ceph/Swift, Glusterfs,
Nirvana/Storage Resource Broker/IRODS, Maginatics, Camlistore, Bridgestore, Avere, HDFS
§  Experiences
•  Pilot scaled to 3PB and 3 GB/sec,
•  First real deployment scaling to 30PB and 30 GB/sec
§  Next a demo of scale to trillions and billions
Current	Deployment	
Uses	N	GPFS’s	for	MDS	
and	Scality	So]ware	Only	
Erasure	Object	Store	
Be	nice	to	the	Object	system:	
pack	many	small	files	into	one	
object,	break	up	huge	files	
into	mulaple	objects
OpenFabrics Alliance Workshop 2016
MARFS
METADATA SCALING NXM
DATA SCALING BY X
Slide	8
OpenFabrics Alliance Workshop 2016
ISN’T THAT TOO MANY LAYERS JUST FOR
STORAGE?
§  If the Burst Buffer does its job very well (and indications are
capacity of in system NV will grow radically) and campaign
storage works out well (leveraging cloud), do we need a
parallel file system anymore, or an archive? Maybe just a
bw/iops tier and a capacity tier.
§  Too soon to say, seems feasible longer term
Memory	
Burst	Buffer	
Parallel	File	System	(PFS)	
Campaign	Storage	
Archive	
Memory	
IOPS/BW	Tier	
Parallel	File	System	(PFS)	
Capacity	Tier	
Archive	
Diagram	
courtesy	of	
John	Bent	
EMC	
Factoids	
(ames	are	
changing!)	
LANL	HPSS	=	53	PB	
and	543	M	files	
Trinity	2	PB	
memory,	4	PB	flash	
(11%	of	HPSS)	and	
80	PB	PFS	or	150%	
HPSS)	
Crossroads	may	
have	5-10	PB	
memory,	40	PB	solid	
state	or	100%	of	
HPSS	with	data	
residency	measured	
in	days	or	weeks		
We	would	have	never	
contemplated	more	in	
system	storage	than	our	
archive	a	few	years	ago
OpenFabrics Alliance Workshop 2016
BURST BUFFER -> PERF/IOPS TIER
WHAT WOULD NEED TO CHANGE?
§ Burst Buffers are designed for data durations in
hours-days. If in system solid state in system
storage is to be used for months duration many
things are missing.
• Protecting Burst Buffers with RAID/Erasure is NOT economical for
checkpoint and short term use because you can always go back to a lower
tier copy, but longer duration data requires protection. Likely you would
need a software RAID/Erasure that is distributed on the Supercomputer
over its fabric
• Long term protection of the name space is also needed
• QoS issues are also more acute
• With much more longer term data, locality based scheduling is also
perhaps more important
OpenFabrics Alliance Workshop 2016
CAMPAIGN -> CAPACITY TIER
WHAT WOULD NEED TO CHANGE CAMPAIGN?
§ Campaign data duration is targeted at a few years but
for a true capacity tier more flexible perhaps much
longer time periods may be required.
• Probably need power managed storage devices that match the
parallel BW needed to move around PB sized data sets
§ Scaling out to Exabytes, Trillions of files, etc.
• Much of this work is underway
§ Maturing of the solution space with multiple at least
partially vendor supported solutions is needed as
well.
OpenFabrics Alliance Workshop 2016
OTHER CONSIDERATIONS
§  New interfaces to storage that preserve/leverage structure/
semantics of data (much of this is being/was explored in the
DOE Storage FFwd)
•  DAOS like concepts with name spaces friendly to science application needs
•  Async/transactional/Versioning to match better with future async programming
models
§  The concept of a loadable storage stack (being worked in MarFS
and EMC)
•  It would be nice if the Perf/IOPS tier could be directed to “check out” a “problem” to
work on for days/weeks. Think PBs of data and billions of metadata entries of
various shapes. (EMC calls this dynamically loadable name spaces)
•  MarFS metadata demonstration of Trillions of files in a file system and Billions of files
in a directory will be an example of a “loadable” name space
•  CMU’s IndexFS->BatchFS->DeltaFS - stores POSIX metadata into thousands of
distributed KVS’s makes moving/restart with billions of metadata entries simply and
easily
•  Check out a billion metadata entries, make modifications, check back in as a new
version or a merge back into the Capacity Tier master name space
OpenFabrics Alliance Workshop 2016
BUT, THAT IS JUST ECONOMIC ARM WAVING.
How will the economics
combine with the apps/machine/
environmental
needs?
Enter Workflows
OpenFabrics Alliance Workshop 2016
WORKFLOWS TO THE RESCUE?
§  What did I learn from the workflow-fest circa 04/2015?
• There are 57 ways to interpret in situ J
• There are more workflow tools than requirements documents
• There is no common taxonomy that can be used to reason about
data flow/workflow for architects or programmers L
§  What did I learn from FY15 Apex Vendor meetings
• Where do you want your flash, how big, how fast, how durable
• Where do you want your SCM, how big, how fast
• Do you want it near the node or near the disk or in the network
• --- YOU REALLY DON’T WANT ME TO TELL YOU WHERE TO
PUT YOUR NAND/SCM ---
§ Can workflows help us beyond some automation tools?
OpenFabrics Alliance Workshop 2016
INSITU / POST / ACTIVE STORAGE / ACTIVE ARCHIVE
ANALYSIS WORK FLOWS IN WORK FLOWS
OpenFabrics Alliance Workshop 2016
WORKFLOWS: POTENTIAL TAXONOMY
Derived from Dave Montoya (Circa 05/15)
V2	
taxonomy	
a`empt
OpenFabrics Alliance Workshop 2016
WORKFLOWS CAN HELP US BEYOND SOME AUTOMATION TOOLS:
WORKFLOWS ENTER THE REALM OF RFP/PROCUREMENT
§  Trinity/Cori
• We didn’t specify flops, we specified running bigger app faster
• We wanted it to go forward 90% of the time
• We didn’t specify how much burst buffer, or speeds/feeds
• Vendors didn’t like this at first but realized it was degrees of freedom we
were giving them
§  Apex Crossroads/NERSC+1
• Still no flops J
• Still want it to go forward a large % of the time
• Vendors ask: where and how much flash/nvram/pixy dust do we put on
node, in network, in ionode, near storage, blah blah
• We don’t care we want to get the most of these work flows through the
system in 6 months
V3	Taxonomy	A`empt	
Next	slides	represents		work	
done	by	the	APEX	Workflow	
Team
OpenFabrics Alliance Workshop 2016
V3 TAXONOMY FROM APEX PROCUREMENT DOCS
A SIMULATION PIPELINE
OpenFabrics Alliance Workshop 2016
V3 TAXONOMY FROM APEX PROCUREMENT DOCS
A HIGH THROUGHPUT/UQ PIPELINE
OpenFabrics Alliance Workshop 2016
WORKFLOW DATA THAT GOES WITH THE
WORKFLOW DIAGRAMS
OpenFabrics Alliance Workshop 2016
SUMMARY
§  Economic modeling/analysis is a powerful tool for guiding our
next steps
§  Given the growing footprint of Data Mgmt/Movement in the cost
and pain in HPC, workflows may grow in importance and may be
more useful in planning for new machine architectures/
procurement/integration than ever.
§  Combining Economics and Workflows helps paint a picture of
the future for all of us.
12th ANNUAL WORKSHOP 2016
THANK YOU
AND
RIP PFS

More Related Content

What's hot

Disaster Recovery in the Hadoop Ecosystem: Preparing for the Improbable
Disaster Recovery in the Hadoop Ecosystem: Preparing for the ImprobableDisaster Recovery in the Hadoop Ecosystem: Preparing for the Improbable
Disaster Recovery in the Hadoop Ecosystem: Preparing for the Improbable
Stefan Kupstaitis-Dunkler
 
Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for Hadoop
Mike Pittaro
 
Configure h base hadoop and hbase client
Configure h base hadoop and hbase clientConfigure h base hadoop and hbase client
Configure h base hadoop and hbase client
Shashwat Shriparv
 
Architectural Overview of MapR's Apache Hadoop Distribution
Architectural Overview of MapR's Apache Hadoop DistributionArchitectural Overview of MapR's Apache Hadoop Distribution
Architectural Overview of MapR's Apache Hadoop Distribution
mcsrivas
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
UberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for BeginnersUberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for Beginners
hpcexperiment
 
HDFS Internals
HDFS InternalsHDFS Internals
HDFS Internals
Apache Apex
 
Apache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS FederationApache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS Federation
Adam Kawa
 
Introduction to apache hadoop
Introduction to apache hadoopIntroduction to apache hadoop
Introduction to apache hadoop
Shashwat Shriparv
 
Big Data
Big DataBig Data
Hadoop architecture meetup
Hadoop architecture meetupHadoop architecture meetup
Hadoop architecture meetup
vmoorthy
 
ORC 2015: Faster, Better, Smaller
ORC 2015: Faster, Better, SmallerORC 2015: Faster, Better, Smaller
ORC 2015: Faster, Better, Smaller
The Apache Software Foundation
 
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive WarehouseDisaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
Sankar H
 
Introduction to Hadoop part 2
Introduction to Hadoop part 2Introduction to Hadoop part 2
Introduction to Hadoop part 2
Giovanna Roda
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
DataWorks Summit/Hadoop Summit
 
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DMUpgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Yahoo!デベロッパーネットワーク
 
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingAdvanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Impetus Technologies
 
Hug syncsort etl hadoop big data
Hug syncsort etl hadoop big dataHug syncsort etl hadoop big data
Hug syncsort etl hadoop big data
Stéphane Heckel
 
Hadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native EraHadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native Era
DataWorks Summit
 
Hadoop - Lessons Learned
Hadoop - Lessons LearnedHadoop - Lessons Learned
Hadoop - Lessons Learned
tcurdt
 

What's hot (20)

Disaster Recovery in the Hadoop Ecosystem: Preparing for the Improbable
Disaster Recovery in the Hadoop Ecosystem: Preparing for the ImprobableDisaster Recovery in the Hadoop Ecosystem: Preparing for the Improbable
Disaster Recovery in the Hadoop Ecosystem: Preparing for the Improbable
 
Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for Hadoop
 
Configure h base hadoop and hbase client
Configure h base hadoop and hbase clientConfigure h base hadoop and hbase client
Configure h base hadoop and hbase client
 
Architectural Overview of MapR's Apache Hadoop Distribution
Architectural Overview of MapR's Apache Hadoop DistributionArchitectural Overview of MapR's Apache Hadoop Distribution
Architectural Overview of MapR's Apache Hadoop Distribution
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
UberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for BeginnersUberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for Beginners
 
HDFS Internals
HDFS InternalsHDFS Internals
HDFS Internals
 
Apache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS FederationApache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS Federation
 
Introduction to apache hadoop
Introduction to apache hadoopIntroduction to apache hadoop
Introduction to apache hadoop
 
Big Data
Big DataBig Data
Big Data
 
Hadoop architecture meetup
Hadoop architecture meetupHadoop architecture meetup
Hadoop architecture meetup
 
ORC 2015: Faster, Better, Smaller
ORC 2015: Faster, Better, SmallerORC 2015: Faster, Better, Smaller
ORC 2015: Faster, Better, Smaller
 
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive WarehouseDisaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
 
Introduction to Hadoop part 2
Introduction to Hadoop part 2Introduction to Hadoop part 2
Introduction to Hadoop part 2
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DMUpgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
 
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingAdvanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
 
Hug syncsort etl hadoop big data
Hug syncsort etl hadoop big dataHug syncsort etl hadoop big data
Hug syncsort etl hadoop big data
 
Hadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native EraHadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native Era
 
Hadoop - Lessons Learned
Hadoop - Lessons LearnedHadoop - Lessons Learned
Hadoop - Lessons Learned
 

Viewers also liked

Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
inside-BigData.com
 
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC ResourcesmyHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
Sriram Krishnan
 
Introduction to EasyBuild: Tutorial Part 1
Introduction to EasyBuild: Tutorial Part 1Introduction to EasyBuild: Tutorial Part 1
Introduction to EasyBuild: Tutorial Part 1
inside-BigData.com
 
Importing data in Oasis Montaj
Importing data in Oasis MontajImporting data in Oasis Montaj
Importing data in Oasis Montaj
Amin khalil
 
Dell Lustre Storage Architecture Presentation - MBUG 2016
Dell Lustre Storage Architecture Presentation - MBUG 2016Dell Lustre Storage Architecture Presentation - MBUG 2016
Dell Lustre Storage Architecture Presentation - MBUG 2016
Andrew Underwood
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Hortonworks
 
High–Performance Computing
High–Performance ComputingHigh–Performance Computing
High–Performance Computing
BRAC University Computer Club
 
Big Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS CloudBig Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS Cloud
Amazon Web Services
 
NSCC Training - Introductory Class
NSCC Training - Introductory ClassNSCC Training - Introductory Class
NSCC Training - Introductory Class
National Supercomputing Centre Singapore
 
Working With Big Data
Working With Big DataWorking With Big Data
Working With Big Data
Seth Familian
 
High Performance Computing and Big Data
High Performance Computing and Big Data High Performance Computing and Big Data
High Performance Computing and Big Data
Geoffrey Fox
 
Visual Design with Data
Visual Design with DataVisual Design with Data
Visual Design with Data
Seth Familian
 

Viewers also liked (12)

Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
 
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC ResourcesmyHadoop - Hadoop-on-Demand on Traditional HPC Resources
myHadoop - Hadoop-on-Demand on Traditional HPC Resources
 
Introduction to EasyBuild: Tutorial Part 1
Introduction to EasyBuild: Tutorial Part 1Introduction to EasyBuild: Tutorial Part 1
Introduction to EasyBuild: Tutorial Part 1
 
Importing data in Oasis Montaj
Importing data in Oasis MontajImporting data in Oasis Montaj
Importing data in Oasis Montaj
 
Dell Lustre Storage Architecture Presentation - MBUG 2016
Dell Lustre Storage Architecture Presentation - MBUG 2016Dell Lustre Storage Architecture Presentation - MBUG 2016
Dell Lustre Storage Architecture Presentation - MBUG 2016
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
 
High–Performance Computing
High–Performance ComputingHigh–Performance Computing
High–Performance Computing
 
Big Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS CloudBig Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS Cloud
 
NSCC Training - Introductory Class
NSCC Training - Introductory ClassNSCC Training - Introductory Class
NSCC Training - Introductory Class
 
Working With Big Data
Working With Big DataWorking With Big Data
Working With Big Data
 
High Performance Computing and Big Data
High Performance Computing and Big Data High Performance Computing and Big Data
High Performance Computing and Big Data
 
Visual Design with Data
Visual Design with DataVisual Design with Data
Visual Design with Data
 

Similar to HPC Storage and IO Trends and Workflows

Lecture-20.pptx
Lecture-20.pptxLecture-20.pptx
Lecture-20.pptx
mohaaalsa
 
HPE Storage KubeCon US 2018 Workshop
HPE Storage KubeCon US 2018 WorkshopHPE Storage KubeCon US 2018 Workshop
HPE Storage KubeCon US 2018 Workshop
Michael Mattsson
 
The Consequences of Infinite Storage Bandwidth: Allen Samuels, SanDisk
The Consequences of Infinite Storage Bandwidth: Allen Samuels, SanDiskThe Consequences of Infinite Storage Bandwidth: Allen Samuels, SanDisk
The Consequences of Infinite Storage Bandwidth: Allen Samuels, SanDisk
OpenStack
 
NVMe over Fibre Channel Introduction
NVMe over Fibre Channel IntroductionNVMe over Fibre Channel Introduction
NVMe over Fibre Channel Introduction
Calvin Zito
 
Lamp Stack Optimization
Lamp Stack OptimizationLamp Stack Optimization
Lamp Stack Optimization
Dave Ross
 
Webcast Q&A- Big Data Architectures Beyond Hadoop
Webcast Q&A- Big Data Architectures Beyond HadoopWebcast Q&A- Big Data Architectures Beyond Hadoop
Webcast Q&A- Big Data Architectures Beyond Hadoop
Impetus Technologies
 
Filesystems
FilesystemsFilesystems
Filesystems
royans
 
Beyond the File System - Designing Large Scale File Storage and Serving
Beyond the File System - Designing Large Scale File Storage and ServingBeyond the File System - Designing Large Scale File Storage and Serving
Beyond the File System - Designing Large Scale File Storage and Serving
mclee
 
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYCScalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Cal Henderson
 
Bhupeshbansal bigdata
Bhupeshbansal bigdata Bhupeshbansal bigdata
Bhupeshbansal bigdata
Bhupesh Bansal
 
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014 WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
Chris Almond
 
Policy-based Cloud Storage: Persisting Data in a Multi-Site, Multi-Cloud World
Policy-based Cloud Storage: Persisting Data in a Multi-Site, Multi-Cloud WorldPolicy-based Cloud Storage: Persisting Data in a Multi-Site, Multi-Cloud World
Policy-based Cloud Storage: Persisting Data in a Multi-Site, Multi-Cloud World
Apcera
 
Red Hat Storage 2014 - Product(s) Overview
Red Hat Storage 2014 - Product(s) OverviewRed Hat Storage 2014 - Product(s) Overview
Red Hat Storage 2014 - Product(s) Overview
Marcel Hergaarden
 
Ceph - High Performance Without High Costs
Ceph - High Performance Without High CostsCeph - High Performance Without High Costs
Ceph - High Performance Without High Costs
Jonathan Long
 
scale_perf_best_practices
scale_perf_best_practicesscale_perf_best_practices
scale_perf_best_practices
webuploader
 
Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)
Lviv Startup Club
 
HPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataHPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big Data
Lviv Startup Club
 
Petapath HP Cast 12 - Programming for High Performance Accelerated Systems
Petapath HP Cast 12 - Programming for High Performance Accelerated SystemsPetapath HP Cast 12 - Programming for High Performance Accelerated Systems
Petapath HP Cast 12 - Programming for High Performance Accelerated Systems
dairsie
 
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMFGestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
SUSE Italy
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4
Chris Nauroth
 

Similar to HPC Storage and IO Trends and Workflows (20)

Lecture-20.pptx
Lecture-20.pptxLecture-20.pptx
Lecture-20.pptx
 
HPE Storage KubeCon US 2018 Workshop
HPE Storage KubeCon US 2018 WorkshopHPE Storage KubeCon US 2018 Workshop
HPE Storage KubeCon US 2018 Workshop
 
The Consequences of Infinite Storage Bandwidth: Allen Samuels, SanDisk
The Consequences of Infinite Storage Bandwidth: Allen Samuels, SanDiskThe Consequences of Infinite Storage Bandwidth: Allen Samuels, SanDisk
The Consequences of Infinite Storage Bandwidth: Allen Samuels, SanDisk
 
NVMe over Fibre Channel Introduction
NVMe over Fibre Channel IntroductionNVMe over Fibre Channel Introduction
NVMe over Fibre Channel Introduction
 
Lamp Stack Optimization
Lamp Stack OptimizationLamp Stack Optimization
Lamp Stack Optimization
 
Webcast Q&A- Big Data Architectures Beyond Hadoop
Webcast Q&A- Big Data Architectures Beyond HadoopWebcast Q&A- Big Data Architectures Beyond Hadoop
Webcast Q&A- Big Data Architectures Beyond Hadoop
 
Filesystems
FilesystemsFilesystems
Filesystems
 
Beyond the File System - Designing Large Scale File Storage and Serving
Beyond the File System - Designing Large Scale File Storage and ServingBeyond the File System - Designing Large Scale File Storage and Serving
Beyond the File System - Designing Large Scale File Storage and Serving
 
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYCScalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
 
Bhupeshbansal bigdata
Bhupeshbansal bigdata Bhupeshbansal bigdata
Bhupeshbansal bigdata
 
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014 WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
 
Policy-based Cloud Storage: Persisting Data in a Multi-Site, Multi-Cloud World
Policy-based Cloud Storage: Persisting Data in a Multi-Site, Multi-Cloud WorldPolicy-based Cloud Storage: Persisting Data in a Multi-Site, Multi-Cloud World
Policy-based Cloud Storage: Persisting Data in a Multi-Site, Multi-Cloud World
 
Red Hat Storage 2014 - Product(s) Overview
Red Hat Storage 2014 - Product(s) OverviewRed Hat Storage 2014 - Product(s) Overview
Red Hat Storage 2014 - Product(s) Overview
 
Ceph - High Performance Without High Costs
Ceph - High Performance Without High CostsCeph - High Performance Without High Costs
Ceph - High Performance Without High Costs
 
scale_perf_best_practices
scale_perf_best_practicesscale_perf_best_practices
scale_perf_best_practices
 
Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)
 
HPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataHPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big Data
 
Petapath HP Cast 12 - Programming for High Performance Accelerated Systems
Petapath HP Cast 12 - Programming for High Performance Accelerated SystemsPetapath HP Cast 12 - Programming for High Performance Accelerated Systems
Petapath HP Cast 12 - Programming for High Performance Accelerated Systems
 
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMFGestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMF
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4
 

More from inside-BigData.com

Major Market Shifts in IT
Major Market Shifts in ITMajor Market Shifts in IT
Major Market Shifts in IT
inside-BigData.com
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...
inside-BigData.com
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networks
inside-BigData.com
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
inside-BigData.com
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
inside-BigData.com
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
inside-BigData.com
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
inside-BigData.com
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
inside-BigData.com
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecasts
inside-BigData.com
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Update
inside-BigData.com
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19
inside-BigData.com
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuning
inside-BigData.com
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
inside-BigData.com
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
inside-BigData.com
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Acceleration
inside-BigData.com
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
inside-BigData.com
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Era
inside-BigData.com
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
inside-BigData.com
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Cluster
inside-BigData.com
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnects
inside-BigData.com
 

More from inside-BigData.com (20)

Major Market Shifts in IT
Major Market Shifts in ITMajor Market Shifts in IT
Major Market Shifts in IT
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networks
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecasts
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Update
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuning
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Acceleration
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Era
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Cluster
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnects
 

Recently uploaded

Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
Dinusha Kumarasiri
 
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
Data Hops
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
HarisZaheer8
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
Hiike
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
Miro Wengner
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
Alex Pruden
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
Edge AI and Vision Alliance
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
Javier Junquera
 

Recently uploaded (20)

Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
 
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 

HPC Storage and IO Trends and Workflows

  • 1. 12th ANNUAL WORKSHOP 2016 HPC STORAGE AND IO TRENDS AND WORKFLOWS Gary Grider, Division Leader, HPC Division April 4, 2014 Los Alamos National Laboratory LA-UR-16-20184
  • 2. OpenFabrics Alliance Workshop 2016 EIGHT DECADES OF PRODUCTION WEAPONS COMPUTING TO KEEP THE NATION SAFE 2 CM-2 IBM Stretch CDC Cray 1 Cray X/Y Maniac CM-5 SGI Blue Mountain DEC/HP Q IBM Cell Roadrunner Cray XE Cielo Cray Intel KNL Trinity Ziggy DWave Cross Roads
  • 3. OpenFabrics Alliance Workshop 2016 ECONOMICS HAVE SHAPED OUR WORLD The beginning of storage layer proliferation circa 2009 §  Economic modeling for large burst of data from memory shows bandwidth / capacity better matched for solid state storage near the compute nodes $0 $5,000,000 $10,000,000 $15,000,000 $20,000,000 $25,000,000 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 Hdwr/media cost 3 mem/mo 10% FS new servers new disk new cartridges new drives new robots n  Economic modeling for archive shows bandwidth / capacity be?er matched for disk
  • 4. OpenFabrics Alliance Workshop 2016 THE HOOPLA PARADE CIRCA 2014 Data Warp
  • 5. OpenFabrics Alliance Workshop 2016 WHAT ARE ALL THESE STORAGE LAYERS? WHY DO WE NEED ALL THESE STORAGE LAYERS? §  Why • BB: Economics (disk bw/iops too expensive) • PFS: Maturity and BB capacity too small • Campaign: Economics (tape bw too expensive) • Archive: Maturity and we really do need a “forever” Memory Burst Buffer Parallel File System Campaign Storage Archive Memory Parallel File System Archive HPC Before Trinity HPC A]er Trinity 1-2 PB/sec Residence – hours Overwri`en – conanuous 4-6 TB/sec Residence – hours Overwri`en – hours 1-2 TB/sec Residence – days/weeks Flushed – weeks 100-300 GB/sec Residence – months-year Flushed – months-year 10s GB/sec (parallel tape Residence – forever HPSS Parallel Tape Lustre Parallel File System DRAM
  • 6. OpenFabrics Alliance Workshop 2016 BURST BUFFERS ARE AMONG US §  Many instances in the world now, some at multi-PB Multi-TB/s scale §  Uses •  Checkpoint/Restart, Analysis, Out of core §  Access •  Largely POSIX like, with JCL for stage in, stage out based on job exit health, etc. §  A little about Out of Core •  Before HBM we were thinking DDR ~50-100 GB/sec and NVM 2-10 GB/sec (10X penalty and durability penalty) •  Only 10X penalty from working set speed to out of core speed •  After HBM we have HBM 500-1000 GB/sec, DDR 50-100 GB/sec, and NVS 2-10 GB/sec (100X penalty from HBM to NVS) •  Before HBM, out of core seemed like it might help some read mostly apps •  After HBM, using DDR for working set seems limiting, but useful for some •  Using NVS for read mostly out of core seems limiting too, but useful for some
  • 7. OpenFabrics Alliance Workshop 2016 CAMPAIGN STORAGE SYSTEMS ARE AMONG US TOO – MARFS MASSIVELY SCALABLE POSIX LIKE FILE SYSTEM NAME PACE OVER CLOUD STYLE ERASURE PROTECTED OBJECTS §  Background •  Object Systems provide massive scaling and efficient erasure •  Friendly to applications, not to people. People need a name space. •  Huge Economic appeal (erasure enables use of inexpensive storage) •  POSIX name space is powerful but has issues scaling §  The challenges •  Mismatch of POSIX an Object metadata, security, read/write size/semantics •  No update in place with Objects •  Scale to Trillions of files/directories and Billions of files in a directory •  100’s of GB/sec but with years data longevity §  Looked at •  GPFS, Lustre, Panasas, OrangeFS, Cleversafe/Scality/EMC ViPR/Ceph/Swift, Glusterfs, Nirvana/Storage Resource Broker/IRODS, Maginatics, Camlistore, Bridgestore, Avere, HDFS §  Experiences •  Pilot scaled to 3PB and 3 GB/sec, •  First real deployment scaling to 30PB and 30 GB/sec §  Next a demo of scale to trillions and billions Current Deployment Uses N GPFS’s for MDS and Scality So]ware Only Erasure Object Store Be nice to the Object system: pack many small files into one object, break up huge files into mulaple objects
  • 8. OpenFabrics Alliance Workshop 2016 MARFS METADATA SCALING NXM DATA SCALING BY X Slide 8
  • 9. OpenFabrics Alliance Workshop 2016 ISN’T THAT TOO MANY LAYERS JUST FOR STORAGE? §  If the Burst Buffer does its job very well (and indications are capacity of in system NV will grow radically) and campaign storage works out well (leveraging cloud), do we need a parallel file system anymore, or an archive? Maybe just a bw/iops tier and a capacity tier. §  Too soon to say, seems feasible longer term Memory Burst Buffer Parallel File System (PFS) Campaign Storage Archive Memory IOPS/BW Tier Parallel File System (PFS) Capacity Tier Archive Diagram courtesy of John Bent EMC Factoids (ames are changing!) LANL HPSS = 53 PB and 543 M files Trinity 2 PB memory, 4 PB flash (11% of HPSS) and 80 PB PFS or 150% HPSS) Crossroads may have 5-10 PB memory, 40 PB solid state or 100% of HPSS with data residency measured in days or weeks We would have never contemplated more in system storage than our archive a few years ago
  • 10. OpenFabrics Alliance Workshop 2016 BURST BUFFER -> PERF/IOPS TIER WHAT WOULD NEED TO CHANGE? § Burst Buffers are designed for data durations in hours-days. If in system solid state in system storage is to be used for months duration many things are missing. • Protecting Burst Buffers with RAID/Erasure is NOT economical for checkpoint and short term use because you can always go back to a lower tier copy, but longer duration data requires protection. Likely you would need a software RAID/Erasure that is distributed on the Supercomputer over its fabric • Long term protection of the name space is also needed • QoS issues are also more acute • With much more longer term data, locality based scheduling is also perhaps more important
  • 11. OpenFabrics Alliance Workshop 2016 CAMPAIGN -> CAPACITY TIER WHAT WOULD NEED TO CHANGE CAMPAIGN? § Campaign data duration is targeted at a few years but for a true capacity tier more flexible perhaps much longer time periods may be required. • Probably need power managed storage devices that match the parallel BW needed to move around PB sized data sets § Scaling out to Exabytes, Trillions of files, etc. • Much of this work is underway § Maturing of the solution space with multiple at least partially vendor supported solutions is needed as well.
  • 12. OpenFabrics Alliance Workshop 2016 OTHER CONSIDERATIONS §  New interfaces to storage that preserve/leverage structure/ semantics of data (much of this is being/was explored in the DOE Storage FFwd) •  DAOS like concepts with name spaces friendly to science application needs •  Async/transactional/Versioning to match better with future async programming models §  The concept of a loadable storage stack (being worked in MarFS and EMC) •  It would be nice if the Perf/IOPS tier could be directed to “check out” a “problem” to work on for days/weeks. Think PBs of data and billions of metadata entries of various shapes. (EMC calls this dynamically loadable name spaces) •  MarFS metadata demonstration of Trillions of files in a file system and Billions of files in a directory will be an example of a “loadable” name space •  CMU’s IndexFS->BatchFS->DeltaFS - stores POSIX metadata into thousands of distributed KVS’s makes moving/restart with billions of metadata entries simply and easily •  Check out a billion metadata entries, make modifications, check back in as a new version or a merge back into the Capacity Tier master name space
  • 13. OpenFabrics Alliance Workshop 2016 BUT, THAT IS JUST ECONOMIC ARM WAVING. How will the economics combine with the apps/machine/ environmental needs? Enter Workflows
  • 14. OpenFabrics Alliance Workshop 2016 WORKFLOWS TO THE RESCUE? §  What did I learn from the workflow-fest circa 04/2015? • There are 57 ways to interpret in situ J • There are more workflow tools than requirements documents • There is no common taxonomy that can be used to reason about data flow/workflow for architects or programmers L §  What did I learn from FY15 Apex Vendor meetings • Where do you want your flash, how big, how fast, how durable • Where do you want your SCM, how big, how fast • Do you want it near the node or near the disk or in the network • --- YOU REALLY DON’T WANT ME TO TELL YOU WHERE TO PUT YOUR NAND/SCM --- § Can workflows help us beyond some automation tools?
  • 15. OpenFabrics Alliance Workshop 2016 INSITU / POST / ACTIVE STORAGE / ACTIVE ARCHIVE ANALYSIS WORK FLOWS IN WORK FLOWS
  • 16. OpenFabrics Alliance Workshop 2016 WORKFLOWS: POTENTIAL TAXONOMY Derived from Dave Montoya (Circa 05/15) V2 taxonomy a`empt
  • 17. OpenFabrics Alliance Workshop 2016 WORKFLOWS CAN HELP US BEYOND SOME AUTOMATION TOOLS: WORKFLOWS ENTER THE REALM OF RFP/PROCUREMENT §  Trinity/Cori • We didn’t specify flops, we specified running bigger app faster • We wanted it to go forward 90% of the time • We didn’t specify how much burst buffer, or speeds/feeds • Vendors didn’t like this at first but realized it was degrees of freedom we were giving them §  Apex Crossroads/NERSC+1 • Still no flops J • Still want it to go forward a large % of the time • Vendors ask: where and how much flash/nvram/pixy dust do we put on node, in network, in ionode, near storage, blah blah • We don’t care we want to get the most of these work flows through the system in 6 months V3 Taxonomy A`empt Next slides represents work done by the APEX Workflow Team
  • 18. OpenFabrics Alliance Workshop 2016 V3 TAXONOMY FROM APEX PROCUREMENT DOCS A SIMULATION PIPELINE
  • 19. OpenFabrics Alliance Workshop 2016 V3 TAXONOMY FROM APEX PROCUREMENT DOCS A HIGH THROUGHPUT/UQ PIPELINE
  • 20. OpenFabrics Alliance Workshop 2016 WORKFLOW DATA THAT GOES WITH THE WORKFLOW DIAGRAMS
  • 21. OpenFabrics Alliance Workshop 2016 SUMMARY §  Economic modeling/analysis is a powerful tool for guiding our next steps §  Given the growing footprint of Data Mgmt/Movement in the cost and pain in HPC, workflows may grow in importance and may be more useful in planning for new machine architectures/ procurement/integration than ever. §  Combining Economics and Workflows helps paint a picture of the future for all of us.
  • 22. 12th ANNUAL WORKSHOP 2016 THANK YOU AND RIP PFS