Introduction to Batch
Processing on AWS
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Pierre Steckmeyer, Solutions Architect
August 1, 2017
Agenda
• Batch processing – overview and challenges
• Why run batch workloads in the cloud
• Overview of AWS batch solutions
• Deep dive into AWS Batch and Amazon ECS
• Best practices review
Challenges of Running Batch Workloads
• Typically resource intensive
• Time constraint for completion
• Potential impact to concurrent batch jobs
• Scaling infrastructure resources
• Ensuring effective resource utilization and cost savings
• Fragile and unreliable
What Batch Workloads Need
• Reliability
• Easy Development
• Easy Deployment
• High Efficiency
• Low Ops Load
• Cost Effective
Why the cloud makes sense for batch workloads
• Reliable
• Scalable
• Pay as you go
• Infrastructure as code
Why containers make sense for batch workloads
• Simple to model
• Polyglot
• Image is the version
• Do one thing well
• You build it, you run it
• Black box
Options for batch workloads on AWS
• AWS Batch
• Amazon ECS
Introducing AWS Batch
• Fully managed batch primitives
• Focus on your applications (shell scripts,
Linux executables, Docker images) and
their resource requirements
• We take care of the rest!
Typical AWS Batch Job Architecture
[Diagram components:]
• Input files
• S3 events trigger a Lambda function that submits an AWS Batch job
• Job definition: job resource requirements and other parameters
• Application image
• IAM role for the AWS Batch job
• Queue of runnable jobs
• AWS Batch scheduler
• AWS Batch execution
• AWS Batch compute environments
• AWS Batch job output
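The S3-to-Lambda-to-Batch step in the architecture above could look roughly like the sketch below. It is a minimal illustration, not the presenter's code: the queue name, job definition name, and the `INPUT_BUCKET` / `INPUT_KEY` environment variables are hypothetical, and the Batch client is injectable so the logic can be exercised without AWS credentials.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical names: replace with your own job queue and job definition.
JOB_QUEUE = "example-job-queue"
JOB_DEFINITION = "example-job-definition"


def job_name_for(key):
    """Turn an S3 key into a valid AWS Batch job name.

    Job names may contain letters, numbers, hyphens, and underscores,
    must start with a letter or number, and are limited to 128 characters.
    """
    return ("job-" + re.sub(r"[^A-Za-z0-9_-]", "-", key))[:128]


def build_submit_job_requests(event):
    """Build one submit_job request per S3 object in the event."""
    requests = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        requests.append({
            "jobName": job_name_for(key),
            "jobQueue": JOB_QUEUE,
            "jobDefinition": JOB_DEFINITION,
            "containerOverrides": {
                "environment": [
                    {"name": "INPUT_BUCKET", "value": bucket},
                    {"name": "INPUT_KEY", "value": key},
                ]
            },
        })
    return requests


def handler(event, context, batch_client=None):
    """Lambda entry point; batch_client is injectable for local testing."""
    if batch_client is None:
        import boto3  # imported lazily so the module loads without boto3
        batch_client = boto3.client("batch")
    return [batch_client.submit_job(**request)["jobId"]
            for request in build_submit_job_requests(event)]
```

Passing the object location through environment variables keeps the job definition generic: the same container image can process any file, and the job itself reads and writes S3 using the IAM role attached to the job.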
Introducing Amazon ECS
Amazon EC2 Container Service (ECS) is a highly scalable, high-performance container management service that supports Docker containers and allows you to easily run applications on a managed cluster of Amazon EC2 instances.
Cluster Management Made Easy
• Nothing to run
• Complete state
• Control and monitoring
• Scale
Performance at Scale
Flexible Container Placement
• Applications
• Batch jobs
• Multiple schedulers
Designed for Use with Other AWS Services
• Elastic Load Balancing
• Amazon Elastic Block Store
• Amazon Virtual Private Cloud
• AWS Identity and Access Management
• AWS CloudTrail
• Spot Fleet
Security
Tasks run on your own EC2 instances in a VPC, with all of its security features, to provide a high level of isolation.
Amazon ECS
[Diagram labels: Internet; load balancers; EC2 instances, each with an ECS agent and tasks containing containers; agent communication service; Amazon ECS API; cluster management engine; key/value store.]
Running batch workloads on ECS
Basic batch workflow with ECS
• File put into S3 bucket
• Amazon Simple Queue Service (SQS)
• Queue load is communicated to ECS
• Amazon ECS provisions compute clusters and schedules tasks based on demand
• Batch worker task polls SQS for new jobs
• Containerized batch worker processes file
• Output to S3 bucket
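The "batch worker task polls SQS for new jobs" step might be sketched as follows. This is an assumption-laden illustration rather than the deck's implementation: `process_file` is a hypothetical placeholder for the real work, and the client is expected to look like `boto3.client("sqs")`, whose `receive_message` and `delete_message` calls are used here.

```python
def process_file(body):
    """Hypothetical placeholder for the real work; it only echoes the message."""
    return "processed " + body


def drain_queue(sqs_client, queue_url, max_polls=1):
    """Poll SQS up to max_polls times and process each received message.

    A message is deleted only after processing succeeds, so if the worker
    crashes mid-job the message becomes visible again and can be retried.
    """
    results = []
    for _ in range(max_polls):
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
            WaitTimeSeconds=20,      # long polling; 20 seconds is the maximum
        )
        for message in response.get("Messages", []):
            results.append(process_file(message["Body"]))
            sqs_client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
    return results
```

In a container, a loop like this would typically run until the queue stays empty and then exit, which lets the task stop and frees cluster capacity.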
Trigger Batch Processing with Lambda
[Diagram labels: Amazon S3 bucket (source) and S3 bucket object; AWS Lambda calling ecs:RunTask; Amazon ECS with container instances in an Auto Scaling group across two Availability Zones, running Task A; Amazon S3 bucket (target); Amazon CloudWatch; AWS CloudTrail.]
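The `ecs:RunTask` call made by the Lambda function could be assembled as in the sketch below. The cluster, task definition, container name, and environment variable names are hypothetical; only the request shape (`cluster`, `taskDefinition`, `count`, `overrides.containerOverrides`) follows the ECS RunTask API.

```python
from urllib.parse import unquote_plus


def build_run_task_request(record, cluster, task_definition,
                           container_name, target_bucket):
    """Build keyword arguments for ecs.run_task from one S3 event record."""
    source_bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(record["s3"]["object"]["key"])
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "count": 1,
        "overrides": {
            "containerOverrides": [{
                # Must match a container name in the task definition.
                "name": container_name,
                "environment": [
                    {"name": "SOURCE_BUCKET", "value": source_bucket},
                    {"name": "SOURCE_KEY", "value": key},
                    {"name": "TARGET_BUCKET", "value": target_bucket},
                ],
            }]
        },
    }
```

Inside a Lambda handler, the result would be passed as `boto3.client("ecs").run_task(**request)`, assuming the function's role is allowed to call `ecs:RunTask`.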
Fleet of Workers with ECS and SQS
[Diagram labels: Amazon S3, DynamoDB, and Amazon Kinesis; AWS Lambda calling ecs:RunTask; SQS queue; Amazon ECS with container instances in an Auto Scaling group across two Availability Zones, running Task A; Amazon CloudWatch; AWS CloudTrail.]
Long-running Batch Jobs
• Utilize Spot Instances
• EC2 Spot Blocks for Defined-Duration Workloads
• ECS event stream for CloudWatch Events
• Service Scaling and Monitoring
[Diagram labels: Amazon ECS with container instances in an Auto Scaling group across two Availability Zones, running Task A, Task B, and Task C; Amazon CloudWatch; AWS CloudTrail.]
Get the Best Value for EC2 Capacity – Spot Instances
• Since Spot Instances typically cost 50-90% less than On-Demand, you can increase your compute capacity by 2-10x within the same budget
• Or you could save 50-90% on your existing workload
• Either way, you should try it!
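The 2-10x figure follows directly from the discount: paying a fraction (1 - d) of the On-Demand price buys 1 / (1 - d) times the capacity for the same budget. A small sketch of that arithmetic (the function name is illustrative):

```python
def capacity_multiplier(discount):
    """Capacity bought for the same budget at a given Spot discount (0 to 1)."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be at least 0 and less than 1")
    return 1 / (1 - discount)


for discount in (0.5, 0.9):
    multiplier = capacity_multiplier(discount)
    print(f"{discount:.0%} cheaper -> {multiplier:.1f}x capacity")
# prints:
# 50% cheaper -> 2.0x capacity
# 90% cheaper -> 10.0x capacity
```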
Best Practices
• Store state, inputs, and outputs in S3 or another datastore
• Minimize dependencies between task definitions (they should be independent of each other)
• Use Spot Instances and Spot Fleets for long-running batch jobs
• Monitor cluster state with ECS APIs
• Share pools of resources
• Auto Scaling, VPC, IAM, Scheduled Reserved Instances
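As one way to read "monitor cluster state with ECS APIs", the sketch below summarizes a `DescribeClusters` response. The field names (`clusterName`, `registeredContainerInstancesCount`, `runningTasksCount`, `pendingTasksCount`) come from that API; the helper names and the idea of flagging clusters with pending tasks are assumptions for illustration.

```python
def summarize_clusters(response):
    """Reduce an ecs.describe_clusters response to per-cluster counters."""
    return {
        cluster["clusterName"]: {
            "instances": cluster["registeredContainerInstancesCount"],
            "running": cluster["runningTasksCount"],
            "pending": cluster["pendingTasksCount"],
        }
        for cluster in response.get("clusters", [])
    }


def clusters_with_backlog(response):
    """Names of clusters that have tasks waiting to be placed."""
    summary = summarize_clusters(response)
    return sorted(name for name, stats in summary.items() if stats["pending"] > 0)
```

With boto3, the input would come from `boto3.client("ecs").describe_clusters(clusters=[...])`; a persistent backlog of pending tasks is one possible signal that the cluster needs more container instances.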
Serving Maps at Scale on AWS
• Powers over 5,000 apps in categories ranging from social to mobility
• Reaches more than 200 million users each month and growing
[Diagram labels: C4, R3, and M4 instances; Map Service, Search Service, Directions Service.]
[Diagram labels: ECS Cluster; C4, R3, and M4 instances; Map Service, Search Service, Directions Service.]
Deploying with the Old Way
[Diagram labels: two groups of R3 instances, one labeled Git SHA 56fb514 and one labeled Git SHA 168f73e.]
Deploying on ECS
• Deploys take minutes instead of hours
• Can iterate and ship new features faster
• Rollbacks are faster
Reduce Waste with Better Instance Packing
• Map service: CPU 55%, Mem 5%
• Search service: CPU 25%, Mem 75%
• Combined services with ECS: CPU 80%, Mem 80%
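The packing numbers above are a per-dimension sum: two services with complementary CPU and memory profiles fit on one instance where each alone would waste most of one resource. A minimal sketch of that check, using the slide's percentages and an assumed capacity of 100% per dimension:

```python
def combined_utilization(services):
    """Sum CPU and memory percentages across co-located services."""
    return {
        dimension: sum(usage[dimension] for usage in services.values())
        for dimension in ("cpu", "mem")
    }


def fits_on_one_instance(services, capacity=100):
    """True if no resource dimension exceeds the instance capacity."""
    return all(total <= capacity
               for total in combined_utilization(services).values())


services = {
    "map": {"cpu": 55, "mem": 5},
    "search": {"cpu": 25, "mem": 75},
}
print(combined_utilization(services))  # {'cpu': 80, 'mem': 80}
print(fits_on_one_instance(services))  # True
```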
Stability on Spot with Instance Diversity
[Diagram labels: ECS Cluster; R3, M4, and C4 instances; Map Service, Search Service, Directions Service; Spot Fleet with C4 and R3 instances.]
• 25% fewer instances
• 80-90% savings per month on EC2
• 21 services
• 2000 tasks
• 1.3 billion requests per day
Time and Event-Based Task Scheduling
• Schedule on fixed time intervals (e.g., a number of minutes, hours, or days), or use cron expressions
• Set Amazon ECS as a CloudWatch Events target
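A scheduled ECS task is a CloudWatch Events rule with a schedule expression plus a target that points at the cluster. The sketch below builds the two requests; the rule name, target ID, and ARNs are hypothetical, while the `rate(...)` / `cron(...)` syntax and the `EcsParameters` target shape follow the CloudWatch Events API.

```python
def rate_expression(value, unit):
    """Build a rate() schedule expression such as 'rate(5 minutes)'.

    CloudWatch Events requires the singular unit when the value is 1.
    """
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be 'minute', 'hour', or 'day'")
    if value < 1:
        raise ValueError("value must be a positive integer")
    return f"rate({value} {unit}{'' if value == 1 else 's'})"


def build_schedule_requests(rule_name, schedule, cluster_arn,
                            task_definition_arn, role_arn):
    """Return (put_rule kwargs, put_targets kwargs) for a scheduled ECS task."""
    put_rule = {
        "Name": rule_name,
        "ScheduleExpression": schedule,
        "State": "ENABLED",
    }
    put_targets = {
        "Rule": rule_name,
        "Targets": [{
            "Id": rule_name + "-target",  # hypothetical target ID
            "Arn": cluster_arn,           # the target is the ECS cluster
            "RoleArn": role_arn,          # role CloudWatch Events assumes to run the task
            "EcsParameters": {
                "TaskDefinitionArn": task_definition_arn,
                "TaskCount": 1,
            },
        }],
    }
    return put_rule, put_targets
```

With boto3 these would be sent as `events.put_rule(**put_rule)` and `events.put_targets(**put_targets)`, where `events = boto3.client("events")`. A cron schedule uses the same field, for example `"cron(0 12 * * ? *)"` for 12:00 UTC every day.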
Summary
• Cloud and containers are a great way to run batch workloads
• Two options on AWS: Batch and ECS
• Why AWS Batch:
  • Managed batch processing environment
• Why ECS:
  • DIY batch processing
  • Very flexible time- and event-based task scheduling
Thank You!
Don’t Forget Evaluations!
