Scientific Computing on AWS:
NASA/JPL, ESA and CERN
Jamie Kinney
Principal Solutions Architect
World Wide Public Sector
jkinney@amazon.com
@jamiekinney
1
?
How do researchers use AWS today?
Can you run HPC on AWS?
Should everything run on the cloud?
How does AWS facilitate scientific collaboration?
2
Amazon Web Services
AWS Global Infrastructure
Application Services
Networking
Deployment & Administration
Database
Storage
Compute
3
Amazon EC2
4
ec2-run-instances
5
6
Programmable
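As a rough illustration of what "programmable" can mean here, the sketch below assembles the arguments for an instance launch in Python. It assumes the boto3 library's EC2 `run_instances` call as a stand-in for the `ec2-run-instances` command named on the earlier slide; the helper name, the AMI ID and the placement group name are hypothetical placeholders, and the actual API call is left commented out so the block runs without credentials.

```python
def build_run_instances_request(image_id, instance_type, count, placement_group=None):
    """Assemble keyword arguments for boto3's EC2 client run_instances call.

    Hypothetical helper: it only builds the request dictionary and does not
    contact AWS.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    params = {
        "ImageId": image_id,            # placeholder AMI ID supplied by the caller
        "InstanceType": instance_type,  # e.g. one of the types listed in this deck
        "MinCount": count,
        "MaxCount": count,
    }
    if placement_group is not None:
        # Optional: request that the instances join a named placement group.
        params["Placement"] = {"GroupName": placement_group}
    return params


if __name__ == "__main__":
    request = build_run_instances_request("ami-00000000", "cc2.8xlarge", 4, "hypothetical-pg")
    print(request)
    # With boto3 installed and credentials configured, the request could be sent as:
    # import boto3
    # boto3.client("ec2").run_instances(**request)
```

Keeping the request as plain data makes it easy to generate one launch or thousands from the same script.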
7
8
9
Elastic
10
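[Figure: capacity versus demand over time. Self hosting (rigid): capacity is sized to predicted demand, which produces waste when actual demand is lower and customer dissatisfaction when actual demand is higher. Elastic: capacity follows actual demand.]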
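The rigid-versus-elastic comparison can be made concrete with a small calculation. This is a minimal sketch with made-up demand numbers (they do not come from the talk): it counts capacity that sits idle ("waste") and demand that cannot be served ("customer dissatisfaction") for a fixed fleet and for a fleet that tracks demand exactly.

```python
def provisioning_gap(demand, capacity):
    """Return (wasted, unmet) capacity units summed over all periods."""
    if len(demand) != len(capacity):
        raise ValueError("demand and capacity must cover the same periods")
    wasted = sum(max(c - d, 0) for d, c in zip(demand, capacity))
    unmet = sum(max(d - c, 0) for d, c in zip(demand, capacity))
    return wasted, unmet


if __name__ == "__main__":
    demand = [10, 40, 25, 80, 30]   # hypothetical demand per period
    rigid = [50] * len(demand)      # fixed fleet sized to a prediction
    elastic = list(demand)          # idealized fleet that follows demand

    print("rigid   (wasted, unmet):", provisioning_gap(demand, rigid))
    print("elastic (wasted, unmet):", provisioning_gap(demand, elastic))
```

Real elastic capacity lags demand somewhat, so the zero-gap elastic case is an idealization.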
11
Go from one instance...
12
To Thousands
13
Instance Types
14
Standard (m1)
High Memory (m2,m3)
High CPU (c1)
15
Intel Nehalem (cc1.4xlarge)
Nvidia GPUs (cg1.4xlarge)
2TB of SSD 120,000 IOPS (hi1.4xlarge)
Intel Sandy Bridge E5-2670 (cc2.8xlarge)
Sandy Bridge, NUMA, 240GB RAM (cr1.4xlarge)
48 TB of ephemeral storage (hs1.8xlarge)
Cluster Compute
16
17
Placement Groups
18
[Figure: EC2 instances in a placement group, connected by 10 Gigabit Ethernet with full bisection bandwidth]
19
What is Scientific
Computing?
20
Use Cases
• Science-as-a-Service
• Large-scale HTC (100,000+ core clusters)
• Large-scale MapReduce (Hadoop/Spark/Shark) using EMR or EC2
• Small to medium-scale MPI clusters (hundreds of nodes)
• Many small MPI clusters working in parallel to explore parameter space
• GPGPU workloads
• Dev/test of MPI workloads prior to submitting to supercomputing centers
• Collaborative research environments
• On-demand academic training/lab environments
21
Large Input Data Sets
22
ESA Gaia Mission Overview
ESA’s Gaia is an ambitious mission to chart a three-dimensional map of the Milky Way Galaxy in order to reveal the composition, formation and evolution of our Galaxy.
Gaia will repeatedly analyze and record the positions and magnitudes of approximately one billion stars over the course of several years.
1 billion stars x 80 observations x 10 readouts = ~1 x 10^12 samples.
1 ms processing time/sample = more than 30 years of processing
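The Gaia estimate can be checked with a few lines of arithmetic. The sketch below uses only the numbers on the slide; the year length (a 365.25-day year) and the function names are assumptions made for the conversion. Note that the exact product is 8 x 10^11 samples, which gives roughly 25 years of serial processing, while the slide's rounded figure of 1 x 10^12 samples gives the quoted "more than 30 years".

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length for the conversion


def sample_count(stars, observations_per_star, readouts_per_observation):
    """Total number of samples to process."""
    return stars * observations_per_star * readouts_per_observation


def serial_years(samples, ms_per_sample=1):
    """Years needed to process all samples one after another on one core."""
    return samples * ms_per_sample / 1000 / SECONDS_PER_YEAR


if __name__ == "__main__":
    exact = sample_count(10**9, 80, 10)
    print(f"exact sample count: {exact:.1e}")
    print(f"serial time, exact count:   {serial_years(exact):.1f} years")
    print(f"serial time, rounded 1e12:  {serial_years(10**12):.1f} years")
```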
23
Gaia Solution Overview
Self-hosted infrastructure:
• Purchase at the beginning of the mission for the anticipated high-water mark
• Purchase additional systems for redundancy
• Large-scale data reprocessing is constrained to available infrastructure. No way to accelerate jobs without additional CapEx
• Performance constrained to processor/disk/memory available at time of procurement...for a multi-year mission
• Data transfer and security policies make it difficult to collaborate with researchers located elsewhere
AWS:
• Pay as you go: launch what you need, as you need it. Turn instances off when you’re done
• If an instance fails, turn it off and launch a replacement at no additional charge
• Need to reprocess the data within a few hours? Simply launch more instances. 100 machines running for 1 hour cost the same as 1 machine running for 100 hours
• AWS frequently launches new instance types running the latest hardware. Simply restart your instances on a newer instance type and stop paying for less-capable infrastructure
• Easily and securely collaborate with researchers around the world
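The "100 machines for 1 hour versus 1 machine for 100 hours" point can be sketched as follows. This is a simplified model under stated assumptions: the job is perfectly parallel, billing is linear in instance-hours with no rounding or startup overhead, and the hourly rate is a made-up number rather than an AWS price.

```python
def job_cost_and_duration(total_instance_hours, instances, hourly_rate):
    """Return (cost, wall_clock_hours) for a perfectly parallel job."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    wall_clock_hours = total_instance_hours / instances
    cost = instances * wall_clock_hours * hourly_rate
    return cost, wall_clock_hours


if __name__ == "__main__":
    rate = 0.5  # hypothetical price per instance-hour
    for n in (1, 10, 100):
        cost, hours = job_cost_and_duration(100, n, rate)
        print(f"{n:>3} instances: {hours:>5.1f} h wall clock, cost {cost:.2f}")
```

Under these assumptions the cost stays constant while the wall-clock time shrinks in proportion to the number of instances.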
24
Many Iterations With Varying Parameters
25
Linear Algebra Calculations
26
27
MSL Distributed Operations
• JPL: Pasadena, CA
• CDSCC: Canberra Deep Space Communication Complex
• MDSCC: Madrid Deep Space Communication Complex
• GDSCC: Goldstone Deep Space Communication Complex
• ARC (CheMin): Moffett Field, CA
• MSSS (MARDI, MAHLI, MastCam): San Diego, CA
• KSC
• IKI (DAN): Moscow, Russia
• INTA (REMS): Madrid, Spain
• LANL (ChemCam): Los Alamos, NM
• UofGuelph (APXS): Guelph, Ontario
• SwRI (RAD): Boulder, CO
• GSFC (SAM): Greenbelt, MD
• Plus hundreds of other sites around the world for Co-Is and Colleagues
28
Data Locality Challenges
Scientist 1 retrieves data from L.A.
Scientist 1 returns data to L.A.
Scientist 2 retrieves data from L.A.
Scientist 2 returns data to L.A.
29
AWS Global Infrastructure
9 regions
25 availability zones
38 edge locations
30
AWS Public Data Sets
AWS.amazon.com/datasets
31
Data Locality Challenges
Researcher in L.A. uploads data to the cloud
Scientist 1 uses cloud resources to process data
Scientist 2 retrieves data products from edge network
Scientist 2 uses cloud resources to process data
Global collaboration
32
33
On-Demand Pricing
34
Reserved Instances
35
Spot Instances
• Bid $X per hour
• If current price <= bid, instance starts
• If current price > bid, instance terminates
• Customers pay market rate, not bid
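The four Spot rules above can be sketched as a small simulation. This is a simplified model, not AWS's billing logic: prices are hypothetical values in cents sampled once per hour, the instance runs in any hour where the market price is at or below the bid, and each running hour is charged at the market price rather than the bid.

```python
def simulate_spot(bid_cents, hourly_prices_cents):
    """Return (hours_run, total_charge_cents) for a bid against hourly prices."""
    hours_run = 0
    total_charge_cents = 0
    for price in hourly_prices_cents:
        if price <= bid_cents:
            # Price is at or below the bid: the instance runs this hour
            # and is charged the market rate, not the bid.
            hours_run += 1
            total_charge_cents += price
        # Otherwise the price exceeds the bid and the instance is not running.
    return hours_run, total_charge_cents


if __name__ == "__main__":
    prices = [10, 12, 30, 11]  # hypothetical market prices in cents per hour
    hours, charge = simulate_spot(15, prices)
    print(f"ran {hours} of {len(prices)} hours, charged {charge} cents")
```

With a bid of 15 cents, the instance sits out the 30-cent hour and pays 33 cents for the other three, less than the 45 cents it would pay if charged at the bid.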
36
U. Wisc.: CMS Particle Detector
http://www.hep.wisc.edu/~dan/talks/EC2SpotForCMS.pdf
37
Integrated
Architectures
38
[Figure: integrated architecture. Labels: Amazon VPC; AWS Direct Connect; EC2 instances; locations Los Angeles, Singapore, Japan, London, São Paulo, New York, Sydney.]
39
40
Secured Uplink Planning
41
[Figure: Polyphony architecture built on Amazon SWF. Labels: JPL Data Center (decider, file transfer workers, data processing workers); Amazon SWF (decision tasks, file transfer tasks, data processing tasks); Create EC2 Instances; Upload and Download File Chunks; data processing workers on EC2 instances; S3.]
42
SWF
EC2
S3
SimpleDB
CloudWatch
IAM
ELB
5 Giga-pixels in 5 minutes!
43
[Figure: a NASA researcher with access to several pools of compute, including Ames and many EC2 instances. Workload labels: large, tightly-coupled MPI; large EP, smaller-scale tightly-coupled MPI, dev/test, burst capacity; small-scale MPI and EP.]
44
45
46
Zero to Internet-Scale in One Week!
47
ELBs on Steroids
48
Route53
49
CloudFormation
50
CloudFront
51
Regions and AZs
52
Mars Science Laboratory - Live Video Streaming Architecture
[Figure: components shown are Telestream Wirecast; Adobe Flash Media Server in Availability Zones us-east-1a and us-west-1b; CloudFront streaming for museum partners; and two CloudFormation stacks, each containing an Elastic Load Balancer, a Tier 2 Nginx cache and a Tier 1 Nginx cache.]
53
Battle Testing JPL’s Deployment
Benchmarking
54
Dynamic Traffic Scaling
US-East Cache Node Performance
11.4 Gbps
55
Dynamic Traffic Scaling
US-East Cache Node Performance
25.3 Gbps
56
Dynamic Traffic Scaling
US-East Cache Node Performance
10.1 Gbps
57
Dynamic Traffic Scaling
US-East Cache Node Performance
40.3 Gbps
58
Dynamic Traffic Scaling
US-East Cache Node Performance
26.6 Gbps
59
Only ~42 Mbps
Dynamic Traffic Scaling
Impact on US-East FMS Origin Servers
60
Only ~42 Mbps
Dynamic Traffic Scaling
Impact on US-East FMS Origin Servers
61
CloudFront Behaviors
Using ELBs for Dynamic Content
62
AWS Academic Grants
AWS.amazon.com/grants
63
Thank
You
64

Scientific Computing With Amazon Web Services