© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS Batch: Easy & Efficient Batch
Computing on Amazon Web Services
Michael Raposa
Head of Platform Engineering
AQR Capital
CMP372
Rey Wang
Senior Product Manager
AWS Batch and HPC
Agenda
Intro to AWS Batch
Summary of recent AWS Batch launches
Glimpse into our roadmap
Real world use case: how AQR Capital
leverages AWS Batch to identify new
investment signals
Q&A
AWS Batch
Fully managed: No software to install or servers to manage
Integrated with AWS: Natively integrated with the AWS platform
Cost-optimized resource provisioning: Automatically provisions compute resources tailored to the needs of your jobs using Amazon EC2 On-Demand and Spot pricing.
Pricing
There is no additional charge for AWS Batch
You only pay for the AWS resources (for example, Amazon Elastic
Compute Cloud [Amazon EC2] instances) you create to store and run
your batch jobs
AWS Batch regional expansion
AWS Batch is available in 15 regions:
us-east-1 (N. Virginia)
us-east-2 (Ohio)
us-west-1 (N. California)
us-west-2 (Oregon)
eu-west-1 (Ireland)
eu-west-2 (London)
eu-west-3 (Paris)
eu-central-1 (Frankfurt)
ap-south-1 (Mumbai)
ap-northeast-1 (Tokyo)
ap-northeast-2 (Seoul)
ap-southeast-1 (Singapore)
ap-southeast-2 (Sydney)
ca-central-1 (Canada Central)
sa-east-1 (São Paulo)
Manageability & performance improvements
• AWS Batch as an Amazon CloudWatch Events target: automate your
workflow; submit a job to AWS Batch in response to an event pattern
or on a schedule
• Job execution timeout: control cost by automatically terminating your
job once the job has been running for the specified duration
• AWS CloudTrail audit calls to AWS Batch APIs: ensure compliance with
internal policies and regulatory standards
• Scheduling & throughput enhancements
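As a minimal sketch of the job execution timeout bullet, the helper below builds the keyword arguments that boto3's `submit_job` accepts, including the `timeout` field. The job name, queue, and job definition values are hypothetical placeholders, and the sketch only constructs the request so it runs without AWS credentials.

```python
def build_submit_job_request(job_name, job_queue, job_definition, timeout_seconds):
    """Build keyword arguments for boto3's batch.submit_job with a job timeout."""
    if timeout_seconds < 60:
        # AWS Batch documents a minimum timeout of 60 seconds.
        raise ValueError("timeout_seconds must be at least 60")
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        # The job attempt is terminated once it has run this long.
        "timeout": {"attemptDurationSeconds": timeout_seconds},
    }


# Hypothetical names, for illustration only.
request = build_submit_job_request(
    "nightly-backtest", "research-queue", "backtest-job:1", 3600
)
# With credentials configured: boto3.client("batch").submit_job(**request)
```

The same request could be issued by a CloudWatch Events rule on a schedule instead of by hand; that wiring is not shown here.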
Improved managed compute environments
• Use Amazon EC2 launch templates in your compute environment
• Example use cases include:
• Increase size/encrypt container volume
• Support custom user-data: mount Amazon Elastic File System (Amazon EFS) on instance
launch, without needing to create a custom AMI
• Override Amazon Elastic Container Service (Amazon ECS) image cleanup
• New instance types
• Z1d
• R5
• R5d
• M5d
• C5d
• X1e
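The custom user-data use case can be sketched as follows. AWS Batch expects launch template user data in MIME multi-part archive format, so this example wraps a small shell script that mounts an EFS file system. The file system ID and mount point are hypothetical, and the script assumes an Amazon Linux based AMI where `amazon-efs-utils` can be installed with `yum`.

```python
import base64
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_efs_user_data(file_system_id, mount_point="/mnt/efs"):
    """Return MIME multi-part user data that mounts an EFS file system at boot."""
    script = "\n".join([
        "#!/bin/bash",
        "yum install -y amazon-efs-utils",
        f"mkdir -p {mount_point}",
        f"mount -t efs {file_system_id}:/ {mount_point}",
    ])
    archive = MIMEMultipart("mixed")
    # The text/x-shellscript part is run by cloud-init on instance launch.
    archive.attach(MIMEText(script, "x-shellscript"))
    return archive.as_string()


user_data = build_efs_user_data("fs-12345678")  # hypothetical file system ID
# Launch template user data is supplied base64-encoded.
encoded_user_data = base64.b64encode(user_data.encode("utf-8")).decode("ascii")
```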
Support for multi-node parallel jobs
• Run single jobs which require
multiple EC2 instances.
• Designed for distributed
computing needs.
• Tightly coupled High Performance
Computing (HPC) applications
• Distributed deep learning trainings
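A multi-node parallel job is described by a `nodeProperties` structure in the job definition. The sketch below builds one in which every node runs the same container; the image name, resource sizes, and command are hypothetical, and real workloads may need different settings per node range.

```python
def build_node_properties(num_nodes, image, vcpus, memory_mib, command):
    """Build nodeProperties for a job definition of type 'multinode'."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return {
        "numNodes": num_nodes,
        "mainNode": 0,  # index of the node that coordinates the job
        "nodeRangeProperties": [
            {
                # Inclusive range of node indexes that share this container spec.
                "targetNodes": f"0:{num_nodes - 1}",
                "container": {
                    "image": image,
                    "vcpus": vcpus,
                    "memory": memory_mib,
                    "command": command,
                },
            }
        ],
    }


node_properties = build_node_properties(
    4, "my-registry/mpi-app:latest", 8, 16384, ["mpirun", "./solver"]
)
# With credentials configured, this could be passed to
# boto3.client("batch").register_job_definition(
#     jobDefinitionName="mpi-solver", type="multinode",
#     nodeProperties=node_properties)
```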
What can you expect in the next 12 months?
• Support for Elastic Fabric Adapter
• Significant improvements to the AWS Batch console
• Batch to emit CloudWatch metrics for monitoring
• Better GPU support
• More scheduling and performance improvements
• New instance types
• Further regional expansion
How AQR Capital leverages AWS to
research new investment signals
Michael Raposa
November 28, 2018
Head of Platform Engineering, AQR
Not intended for the sale or marketing of financial products or services.
Disclosures
The information set forth herein has been obtained or derived from sources believed by AQR Capital Management, LLC (“AQR”) to be reliable. However, AQR does not make any representation or warranty, express or implied, as to the information’s
accuracy or completeness. This presentation does not represent a formal or official view of AQR. No services or applications referenced are specifically endorsed by AQR.
The information contained herein is only as current as of the date indicated, and may be superseded by subsequent market events or for other reasons. Charts and graphs provided herein are for illustrative purposes only. The information in this
presentation has been developed internally and/or obtained from sources believed to be reliable; however, neither AQR nor the speaker guarantees the accuracy, adequacy or completeness of such information.
Neither AQR nor the speaker assumes any duty to, nor undertakes to update forward looking statements. No representation or warranty, express or implied, is made or given by or on behalf of AQR, the speaker or any other person as to the accuracy
and completeness or fairness of the information contained in this presentation, and no responsibility or liability is accepted for any such information. By accepting this presentation in its entirety, the recipient acknowledges its understanding and
acceptance of the foregoing statement.
Agenda
• About AQR
• Business problem
• Solution summary
• Lessons learned
• Takeaways
Our Firm
AQR is a global investment management firm built at the intersection of financial theory and practical application. We strive to deliver superior, long-term results for our
clients by looking past market noise to identify and isolate what matters most, and by developing ideas that stand up to rigorous testing. Our focus on practical
insights and analysis has made us leaders in alternative and traditional strategies since 1998.
At a glance
• AQR takes a systematic, research-driven approach to managing alternative and traditional strategies
• We apply quantitative tools to process fundamental information and manage risk
• Our clients include institutional investors, such as pension funds, defined contribution plans, insurance companies, endowments, foundations, family offices, and
sovereign wealth funds, as well as RIAs, private banks, and financial advisors
• The firm has 36 principals and 1,025 employees; over half of employees hold advanced degrees
• AQR is based in Greenwich, Connecticut, with offices in Boston, Chicago, Hong Kong, London, Los Angeles, and Sydney
• Approximately $226 billion in assets under management as of September 30, 2018*
*Approximate as of 9/30/2018, includes assets managed by AQR and its advisory affiliates.
Problem statement
Background
• Quantitative asset manager
• Investment decisions based on numerical models and
systematic trading
• Researchers develop models and “back test” ideas
over many years
• Never-ending appetite for more data and non-obvious
data sets
Source: AQR. For illustrative purposes only.
Researcher workflow
Idea → Gather data → Build model → Back test → Analyze result
Problem statement
Problem
• On-premise compute grid can’t keep up with demand
• CAPEX locked into grid
• Researchers wait for grid resources
• Researchers need job results as quickly as possible
• New experimental use cases, such as GPU, require a
significant upfront investment of time and money
Design considerations
1. Scalable both in compute and memory
2. Fast without long queue times
3. No job scheduler to manage, for example, Condor or Sun Grid Engine
4. Easy to use
5. Secure
Solution summary: Leverage cloud
• Burstability
• Leverage building-block application services, for example, Amazon Simple Storage Service (Amazon S3), Amazon EC2, Amazon DynamoDB
• Reduce costs with Spot
Solution summary: Seamless interface
• Based on Sun Grid Engine
• Submit jobs via AWS Command Line Interface (AWS CLI) or API: unlimited compute at fingertips
• Short feedback loop: immediate results to researcher
• Backend grid matches researcher workstation: no “It works in DEV but not on the GRID”
Solution summary: Fast
• Automate everything
• “Unlimited” compute
• Short start times: no “queues”
Solution summary: Secure
• Encryption everywhere
• Leverage AWS security tools
• Security is a cloud engineering problem
Solution summary: Steps 1 to 4 (figures)
Solution summary: Job submission (AWS CLI), Steps 1 to 4 (figures)
Solution summary: DAGs
Client
  Backtest: 20,000 Equities 1998-2018
Jobs
  AAPL: 1998-2018
  MSFT: 1998-2018
Child jobs
  AAPL: 1998, AAPL: 1999, … AAPL: 2018
  MSFT: 1998, MSFT: 1999, … MSFT: 2018
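The fan-out in the DAG slide (one backtest, one job per ticker, one child job per year) could be expressed with AWS Batch array jobs. This is only a sketch under that assumption; the deck does not say how AQR implements it. The queue name, job definition, and the `TICKER` and `FIRST_YEAR` environment variable names are hypothetical.

```python
def plan_backtest(tickers, first_year, last_year, job_queue, job_definition):
    """Return one submit_job request per ticker, with one array child per year."""
    years = last_year - first_year + 1
    return [
        {
            "jobName": f"backtest-{ticker}",
            "jobQueue": job_queue,
            "jobDefinition": job_definition,
            # Each array child handles a single year for this ticker.
            "arrayProperties": {"size": years},
            "containerOverrides": {
                "environment": [
                    {"name": "TICKER", "value": ticker},
                    {"name": "FIRST_YEAR", "value": str(first_year)},
                ]
            },
        }
        for ticker in tickers
    ]


def year_for_child(first_year, array_index):
    """Map a child's index (AWS_BATCH_JOB_ARRAY_INDEX) to the year it processes."""
    return first_year + array_index


plan = plan_backtest(["AAPL", "MSFT"], 1998, 2018, "research-queue", "backtest-job:1")
```

Inside the container, a child would read `AWS_BATCH_JOB_ARRAY_INDEX` from its environment and call `year_for_child` to pick its slice of the data.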
Lessons learned (general): Drive the lowest cost, $15 per 1,000 vCPUs per hour
• Use Spot Instances
• Use as many instance types (instance families) as you can
• Use as many Availability Zones (AZs) as you can
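To make the cost figure concrete, the arithmetic below reads "$15/1000 vCPU/hour" as $15 for 1,000 vCPUs running for one hour, which is $0.015 per vCPU-hour. That reading, and the example workload sizes, are assumptions for illustration.

```python
RATE_PER_1000_VCPU_HOURS = 15.0  # dollars, taken from the slide


def estimate_cost(vcpus, hours, rate_per_1000=RATE_PER_1000_VCPU_HOURS):
    """Estimate the dollar cost of running `vcpus` vCPUs for `hours` hours."""
    return vcpus * hours * rate_per_1000 / 1000.0


one_hour_1000_vcpus = estimate_cost(1000, 1)   # 15.0
two_hours_20000_vcpus = estimate_cost(20000, 2)  # 600.0
```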
Lessons learned (general): Log everything!
• ECS logs
• Job output
• Host logs
• … more
Lessons learned (general): Monitor everything!
• Job runtime
• Job start-up times
• Job cost by user
• vCPU consumption by user
• High priority queue consumption by user
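Several of the metrics listed above can be derived from job timestamps. The sketch below assumes job records shaped like dictionaries with `createdAt`, `startedAt`, and `stoppedAt` in epoch milliseconds (as AWS Batch reports them), plus hypothetical `user` and `vcpus` fields that a submission wrapper would have to record.

```python
from collections import defaultdict

MS_PER_HOUR = 3_600_000


def startup_seconds(job):
    """Seconds between job submission and the start of execution."""
    return (job["startedAt"] - job["createdAt"]) / 1000


def vcpu_hours_by_user(jobs):
    """Sum vCPU-hours consumed per user across finished jobs."""
    totals = defaultdict(float)
    for job in jobs:
        runtime_hours = (job["stoppedAt"] - job["startedAt"]) / MS_PER_HOUR
        totals[job["user"]] += job["vcpus"] * runtime_hours
    return dict(totals)


# Hypothetical job records, for illustration only.
jobs = [
    {"user": "alice", "vcpus": 4, "createdAt": 0, "startedAt": 60_000, "stoppedAt": 3_660_000},
    {"user": "bob", "vcpus": 2, "createdAt": 0, "startedAt": 30_000, "stoppedAt": 1_830_000},
]
usage = vcpu_hours_by_user(jobs)
```

Values like these could then be published as custom CloudWatch metrics and placed on a dashboard.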
CloudWatch dashboard
Lessons learned
As we added more users …
1. TooManyRequestsException: DescribeJobs API calls failing
2. Providing governance “guard rails”
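One way to reduce the number of DescribeJobs calls is to batch job IDs, since a single call accepts up to 100 of them. This is a general mitigation sketched for illustration, not necessarily what AQR did; the job IDs below are made up.

```python
def chunk_job_ids(job_ids, size=100):
    """Split job IDs into lists of at most `size`, one list per DescribeJobs call."""
    if size < 1:
        raise ValueError("size must be positive")
    return [job_ids[i:i + size] for i in range(0, len(job_ids), size)]


job_ids = [f"job-{n}" for n in range(250)]  # hypothetical IDs
batches = chunk_job_ids(job_ids)
# With credentials configured, each batch becomes a single call:
# boto3.client("batch").describe_jobs(jobs=batch)
```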
Lessons learned—Event-based pipeline
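An event-based pipeline replaces polling with job state change events that AWS Batch emits to CloudWatch Events. The sketch below shows an event pattern and a handler that records each job's latest status. The in-memory dictionary stands in for whatever store is actually used, and the sample event contains only the fields the handler reads.

```python
# Event pattern for a CloudWatch Events rule matching AWS Batch job state changes.
EVENT_PATTERN = {
    "source": ["aws.batch"],
    "detail-type": ["Batch Job State Change"],
}

TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def handle_job_state_change(event, job_status_store):
    """Record the job's new status; return True when the job has finished."""
    detail = event["detail"]
    job_status_store[detail["jobId"]] = detail["status"]
    return detail["status"] in TERMINAL_STATES


# Trimmed, hypothetical event for illustration.
sample_event = {
    "source": "aws.batch",
    "detail-type": "Batch Job State Change",
    "detail": {"jobId": "example-job-id", "jobName": "backtest-AAPL", "status": "SUCCEEDED"},
}
statuses = {}
finished = handle_job_state_change(sample_event, statuses)
```

A function like this could run in AWS Lambda as the rule's target, so clients read status from the store rather than calling DescribeJobs.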
Lessons learned
As we added more compute …
1. TooManyRequestsException … API calls in the container
2. Job state storage woes
3. Job start times
4. Job costs
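Where an API call inside a container cannot be removed, retrying with exponential backoff and jitter is a common way to soften throttling. The sketch below is generic: `ThrottledError` is a hypothetical stand-in for whatever throttling exception the client library raises, and the delay values are arbitrary.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a 'too many requests' error from an API client."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call `operation`, retrying throttled calls with exponential backoff and full jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of retries; surface the error
            # Full jitter: wait a random time up to the capped exponential delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

Backoff only spreads the load out. With thousands of containers starting at once, removing the call is still the more robust fix.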
Lessons learned: API in containers
Diagram: an ECS container gets a secret from Parameter Store.
Diagram (at scale): many ECS containers call Parameter Store and receive “Exception: too many requests”.
Diagram (at scale): many ECS containers, Amazon S3.
What is shared job state?
• Shared memory across containers
• Job assignment & orchestration
• Job input and output
Lessons learned: Job state backend
Diagram 1: Amazon EFS
Diagram 2: Amazon EFS, Redis
Diagram 3: Amazon EFS, Amazon S3, Redis
Lessons learned—Job start times
Lessons learned—Set limits
Takeaways: Follow best practices
• Spot
• Multi-AZ
• Log everything
• Monitor everything
Takeaways: Watch for scale issues
• Eliminate API calls in your containers. Only use services that scale, such as DynamoDB.
• Switch from poll-based to event/message-based status.
• Choose a job state backend that fits your use case.
Takeaways: Make it easy and secure
• Write a lightweight wrapper around AWS Batch for non-technical users
• Have a “governator” function to ensure compliance
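A “governator” check might look like the sketch below, which validates a job submission request against guard rails before it reaches AWS Batch. The limits and the rule that every job must set a timeout are hypothetical; the deck does not describe AQR's actual policy.

```python
# Hypothetical guard rails, for illustration only.
GUARD_RAILS = {
    "max_vcpus": 64,
    "max_memory_mib": 262144,
    "max_timeout_seconds": 86400,
}


def check_job_request(request, limits=GUARD_RAILS):
    """Return a list of guard-rail violations for a submit_job style request."""
    violations = []
    overrides = request.get("containerOverrides", {})
    if overrides.get("vcpus", 0) > limits["max_vcpus"]:
        violations.append("vcpus exceeds limit")
    if overrides.get("memory", 0) > limits["max_memory_mib"]:
        violations.append("memory exceeds limit")
    timeout = request.get("timeout", {}).get("attemptDurationSeconds")
    if timeout is None:
        violations.append("timeout is required")
    elif timeout > limits["max_timeout_seconds"]:
        violations.append("timeout exceeds limit")
    return violations
```

A wrapper around job submission could call this first and reject any request for which the list is not empty.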
Takeaways: Reduce your start times
• Bake AMIs with Packer
• Pre-warm the cluster during active times of the day
• Give yourself an SLA: 75% of jobs start in 10 mins and 90% in 15 mins
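The start-time SLA above can be checked with a few lines of arithmetic over observed start-up times. The thresholds come from the slide; the sample measurements are invented for illustration.

```python
def fraction_started_within(startup_minutes, limit_minutes):
    """Fraction of jobs whose start-up time was at most `limit_minutes`."""
    if not startup_minutes:
        raise ValueError("no start-up times supplied")
    on_time = sum(1 for minutes in startup_minutes if minutes <= limit_minutes)
    return on_time / len(startup_minutes)


def meets_sla(startup_minutes):
    """True if 75% of jobs started within 10 minutes and 90% within 15 minutes."""
    return (fraction_started_within(startup_minutes, 10) >= 0.75
            and fraction_started_within(startup_minutes, 15) >= 0.90)


# Invented measurements, in minutes.
good_day = [1, 1, 1, 1, 1, 1, 1, 1, 12, 14]
bad_day = [2, 5, 8, 9, 12, 14, 9, 3, 7, 20]
```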
Takeaways: Control runaway costs
• Alarm for large long-running jobs -> $$$
• Set limits on your compute environment
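An alarm for large long-running jobs needs a rule for what counts as large and long. The sketch below flags running jobs above hypothetical vCPU and runtime thresholds; job records are assumed to carry `jobId`, `vcpus`, and `startedAt` in epoch milliseconds.

```python
MS_PER_HOUR = 3_600_000


def long_running_jobs(jobs, now_ms, max_hours, min_vcpus):
    """Return IDs of jobs with at least `min_vcpus` that have run longer than `max_hours`."""
    return [
        job["jobId"]
        for job in jobs
        if job["vcpus"] >= min_vcpus
        and (now_ms - job["startedAt"]) / MS_PER_HOUR > max_hours
    ]


# Invented records, for illustration only.
running = [
    {"jobId": "small-old", "vcpus": 2, "startedAt": 0},
    {"jobId": "large-old", "vcpus": 64, "startedAt": 0},
    {"jobId": "large-new", "vcpus": 64, "startedAt": 9 * MS_PER_HOUR},
]
flagged = long_running_jobs(running, now_ms=10 * MS_PER_HOUR, max_hours=6, min_vcpus=32)
```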
Thank you!
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Michael Raposa
Head of Platform Engineering
AQR Capital
Rey Wang
Senior Product Manager
AWS Batch and HPC
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.

More Related Content

What's hot

Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...Amazon Web Services
 
Accelerating Life Sciences with HPC on AWS - AWS Online Tech Talks
Accelerating Life Sciences with HPC on AWS - AWS Online Tech TalksAccelerating Life Sciences with HPC on AWS - AWS Online Tech Talks
Accelerating Life Sciences with HPC on AWS - AWS Online Tech TalksAmazon Web Services
 
Work Anywhere with Amazon Workspaces (Level: 200)
Work Anywhere with Amazon Workspaces (Level: 200)Work Anywhere with Amazon Workspaces (Level: 200)
Work Anywhere with Amazon Workspaces (Level: 200)Amazon Web Services
 
Building Serverless Applications with Amazon DynamoDB & AWS Lambda - Workshop...
Building Serverless Applications with Amazon DynamoDB & AWS Lambda - Workshop...Building Serverless Applications with Amazon DynamoDB & AWS Lambda - Workshop...
Building Serverless Applications with Amazon DynamoDB & AWS Lambda - Workshop...Amazon Web Services
 
High Performance Computing on AWS: Driving Innovation without Infrastructure ...
High Performance Computing on AWS: Driving Innovation without Infrastructure ...High Performance Computing on AWS: Driving Innovation without Infrastructure ...
High Performance Computing on AWS: Driving Innovation without Infrastructure ...Amazon Web Services
 
Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...
Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...
Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...Amazon Web Services
 
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018Amazon Web Services
 
Reserve Amazon EC2 On-Demand Capacity for Any Duration with On-Demand Capacit...
Reserve Amazon EC2 On-Demand Capacity for Any Duration with On-Demand Capacit...Reserve Amazon EC2 On-Demand Capacity for Any Duration with On-Demand Capacit...
Reserve Amazon EC2 On-Demand Capacity for Any Duration with On-Demand Capacit...Amazon Web Services
 
How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...
How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...
How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...Amazon Web Services
 
Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...
Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...
Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...Amazon Web Services
 
Hands-On: Deploy Remote Graphics Desktops for Content Production (CMP422) - A...
Hands-On: Deploy Remote Graphics Desktops for Content Production (CMP422) - A...Hands-On: Deploy Remote Graphics Desktops for Content Production (CMP422) - A...
Hands-On: Deploy Remote Graphics Desktops for Content Production (CMP422) - A...Amazon Web Services
 
MySQL High Availability & Disaster Recovery (DAT361) - AWS re:Invent 2018
MySQL High Availability & Disaster Recovery (DAT361) - AWS re:Invent 2018MySQL High Availability & Disaster Recovery (DAT361) - AWS re:Invent 2018
MySQL High Availability & Disaster Recovery (DAT361) - AWS re:Invent 2018Amazon Web Services
 
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)Amazon Web Services
 
[NEW LAUNCH!] Introducing Amazon EC2 A1 Instances Based on the Arm Architectu...
[NEW LAUNCH!] Introducing Amazon EC2 A1 Instances Based on the Arm Architectu...[NEW LAUNCH!] Introducing Amazon EC2 A1 Instances Based on the Arm Architectu...
[NEW LAUNCH!] Introducing Amazon EC2 A1 Instances Based on the Arm Architectu...Amazon Web Services
 
Beyond the Basics: Advanced Infrastructure as Code Programming on AWS (DEV327...
Beyond the Basics: Advanced Infrastructure as Code Programming on AWS (DEV327...Beyond the Basics: Advanced Infrastructure as Code Programming on AWS (DEV327...
Beyond the Basics: Advanced Infrastructure as Code Programming on AWS (DEV327...Amazon Web Services
 
Serverless Data Prep with AWS Glue (ANT313) - AWS re:Invent 2018
Serverless Data Prep with AWS Glue (ANT313) - AWS re:Invent 2018Serverless Data Prep with AWS Glue (ANT313) - AWS re:Invent 2018
Serverless Data Prep with AWS Glue (ANT313) - AWS re:Invent 2018Amazon Web Services
 
Bridge OLTP and Stream Processing with Amazon Kinesis, AWS Lambda, & MongoDB ...
Bridge OLTP and Stream Processing with Amazon Kinesis, AWS Lambda, & MongoDB ...Bridge OLTP and Stream Processing with Amazon Kinesis, AWS Lambda, & MongoDB ...
Bridge OLTP and Stream Processing with Amazon Kinesis, AWS Lambda, & MongoDB ...Amazon Web Services
 
Migrate from Netezza to Amazon Redshift: Best Practices with Financial Engine...
Migrate from Netezza to Amazon Redshift: Best Practices with Financial Engine...Migrate from Netezza to Amazon Redshift: Best Practices with Financial Engine...
Migrate from Netezza to Amazon Redshift: Best Practices with Financial Engine...Amazon Web Services
 

What's hot (20)

Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
 
Accelerating Life Sciences with HPC on AWS - AWS Online Tech Talks
Accelerating Life Sciences with HPC on AWS - AWS Online Tech TalksAccelerating Life Sciences with HPC on AWS - AWS Online Tech Talks
Accelerating Life Sciences with HPC on AWS - AWS Online Tech Talks
 
Work Anywhere with Amazon Workspaces (Level: 200)
Work Anywhere with Amazon Workspaces (Level: 200)Work Anywhere with Amazon Workspaces (Level: 200)
Work Anywhere with Amazon Workspaces (Level: 200)
 
Building Serverless Applications with Amazon DynamoDB & AWS Lambda - Workshop...
Building Serverless Applications with Amazon DynamoDB & AWS Lambda - Workshop...Building Serverless Applications with Amazon DynamoDB & AWS Lambda - Workshop...
Building Serverless Applications with Amazon DynamoDB & AWS Lambda - Workshop...
 
High Performance Computing on AWS: Driving Innovation without Infrastructure ...
High Performance Computing on AWS: Driving Innovation without Infrastructure ...High Performance Computing on AWS: Driving Innovation without Infrastructure ...
High Performance Computing on AWS: Driving Innovation without Infrastructure ...
 
Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...
Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...
Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...
 
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018
 
Reserve Amazon EC2 On-Demand Capacity for Any Duration with On-Demand Capacit...
Reserve Amazon EC2 On-Demand Capacity for Any Duration with On-Demand Capacit...Reserve Amazon EC2 On-Demand Capacity for Any Duration with On-Demand Capacit...
Reserve Amazon EC2 On-Demand Capacity for Any Duration with On-Demand Capacit...
 
How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...
How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...
How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...
 
Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...
Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...
Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...
 
Hands-On: Deploy Remote Graphics Desktops for Content Production (CMP422) - A...
Hands-On: Deploy Remote Graphics Desktops for Content Production (CMP422) - A...Hands-On: Deploy Remote Graphics Desktops for Content Production (CMP422) - A...
Hands-On: Deploy Remote Graphics Desktops for Content Production (CMP422) - A...
 
MySQL High Availability & Disaster Recovery (DAT361) - AWS re:Invent 2018
MySQL High Availability & Disaster Recovery (DAT361) - AWS re:Invent 2018MySQL High Availability & Disaster Recovery (DAT361) - AWS re:Invent 2018
MySQL High Availability & Disaster Recovery (DAT361) - AWS re:Invent 2018
 
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
 
[NEW LAUNCH!] Introducing Amazon EC2 A1 Instances Based on the Arm Architectu...
[NEW LAUNCH!] Introducing Amazon EC2 A1 Instances Based on the Arm Architectu...[NEW LAUNCH!] Introducing Amazon EC2 A1 Instances Based on the Arm Architectu...
[NEW LAUNCH!] Introducing Amazon EC2 A1 Instances Based on the Arm Architectu...
 
Beyond the Basics: Advanced Infrastructure as Code Programming on AWS (DEV327...
Beyond the Basics: Advanced Infrastructure as Code Programming on AWS (DEV327...Beyond the Basics: Advanced Infrastructure as Code Programming on AWS (DEV327...
Beyond the Basics: Advanced Infrastructure as Code Programming on AWS (DEV327...
 
Serverless Data Prep with AWS Glue (ANT313) - AWS re:Invent 2018
Serverless Data Prep with AWS Glue (ANT313) - AWS re:Invent 2018Serverless Data Prep with AWS Glue (ANT313) - AWS re:Invent 2018
Serverless Data Prep with AWS Glue (ANT313) - AWS re:Invent 2018
 
Amazon Aurora 深度探討
Amazon Aurora 深度探討Amazon Aurora 深度探討
Amazon Aurora 深度探討
 
SRV319 Amazon EC2 Foundations
SRV319 Amazon EC2 FoundationsSRV319 Amazon EC2 Foundations
SRV319 Amazon EC2 Foundations
 
Bridge OLTP and Stream Processing with Amazon Kinesis, AWS Lambda, & MongoDB ...
Bridge OLTP and Stream Processing with Amazon Kinesis, AWS Lambda, & MongoDB ...Bridge OLTP and Stream Processing with Amazon Kinesis, AWS Lambda, & MongoDB ...
Bridge OLTP and Stream Processing with Amazon Kinesis, AWS Lambda, & MongoDB ...
 
Migrate from Netezza to Amazon Redshift: Best Practices with Financial Engine...
Migrate from Netezza to Amazon Redshift: Best Practices with Financial Engine...Migrate from Netezza to Amazon Redshift: Best Practices with Financial Engine...
Migrate from Netezza to Amazon Redshift: Best Practices with Financial Engine...
 

Similar to Intro to AWS Batch & How AQR Capital leverages AWS to Identify New Investment Signals (CMP372) - AWS re:Invent 2018

How AQR Capital Uses AWS to Research New Investment Signals
How AQR Capital Uses AWS to Research New Investment Signals How AQR Capital Uses AWS to Research New Investment Signals
How AQR Capital Uses AWS to Research New Investment Signals Amazon Web Services
 
AWS Governance at Scale_AWSPSSummit_Singapore
AWS Governance at Scale_AWSPSSummit_SingaporeAWS Governance at Scale_AWSPSSummit_Singapore
AWS Governance at Scale_AWSPSSummit_SingaporeAmazon Web Services
 
Living the AWS Well Architected Framework
Living the AWS Well Architected FrameworkLiving the AWS Well Architected Framework
Living the AWS Well Architected FrameworkAdam Dillman
 
Hybrid Cloud Customer Use Cases on AWS
Hybrid Cloud Customer Use Cases on AWSHybrid Cloud Customer Use Cases on AWS
Hybrid Cloud Customer Use Cases on AWSTom Laszewski
 
Enabling Your Organization’s Amazon Redshift Adoption – Going from Zero to He...
Enabling Your Organization’s Amazon Redshift Adoption – Going from Zero to He...Enabling Your Organization’s Amazon Redshift Adoption – Going from Zero to He...
Enabling Your Organization’s Amazon Redshift Adoption – Going from Zero to He...Amazon Web Services
 
Distributed Solar Systems at EDF Renewables and AWS IoT: A Natural Fit (PUT30...
Distributed Solar Systems at EDF Renewables and AWS IoT: A Natural Fit (PUT30...Distributed Solar Systems at EDF Renewables and AWS IoT: A Natural Fit (PUT30...
Distributed Solar Systems at EDF Renewables and AWS IoT: A Natural Fit (PUT30...Amazon Web Services
 
AWSome Day Intro Stockholm 201509
AWSome Day Intro Stockholm 201509AWSome Day Intro Stockholm 201509
AWSome Day Intro Stockholm 201509Amazon Web Services
 
Mythbusting the Federal Cloud Journey
Mythbusting the Federal Cloud JourneyMythbusting the Federal Cloud Journey
Mythbusting the Federal Cloud JourneyAmazon Web Services
 
Migrate, Modernize, and Manage: Best Practices for a Cloud Migration
Migrate, Modernize, and Manage: Best Practices for a Cloud MigrationMigrate, Modernize, and Manage: Best Practices for a Cloud Migration
Migrate, Modernize, and Manage: Best Practices for a Cloud MigrationAmazon Web Services
 
“Cloud First” Helps Hub Intl Grow the Business with Splunk on AWS (ANT330-S) ...
“Cloud First” Helps Hub Intl Grow the Business with Splunk on AWS (ANT330-S) ...“Cloud First” Helps Hub Intl Grow the Business with Splunk on AWS (ANT330-S) ...
“Cloud First” Helps Hub Intl Grow the Business with Splunk on AWS (ANT330-S) ...Amazon Web Services
 
Cloud DevSecOps and compliance considerations leveraging AWS Marketplace sellers
Cloud DevSecOps and compliance considerations leveraging AWS Marketplace sellersCloud DevSecOps and compliance considerations leveraging AWS Marketplace sellers
Cloud DevSecOps and compliance considerations leveraging AWS Marketplace sellersAmazon Web Services
 
Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...
Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...
Proven Methodologies for Accelerating Your Cloud Journey (ENT308-S) - AWS re:...Amazon Web Services
 
Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017
Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017
Governance @ Scale: Compliance Automation in AWS | AWS Public Sector Summit 2017Amazon Web Services
 
Building Modern Applications on AWS.pptx
Building Modern Applications on AWS.pptxBuilding Modern Applications on AWS.pptx
Building Modern Applications on AWS.pptxNelson Kimathi
 
Cloud DevSecOps Considerations Leveraging AWS Marketplace Software
Cloud DevSecOps Considerations Leveraging AWS Marketplace SoftwareCloud DevSecOps Considerations Leveraging AWS Marketplace Software
Cloud DevSecOps Considerations Leveraging AWS Marketplace SoftwareAmazon Web Services
 
How Cardknox Migrated 1M+ Sensitive Records to AWS
 How Cardknox Migrated 1M+ Sensitive Records to AWS How Cardknox Migrated 1M+ Sensitive Records to AWS
How Cardknox Migrated 1M+ Sensitive Records to AWSAmazon Web Services
 
Security & Compliance in the Cloud
Security & Compliance in the CloudSecurity & Compliance in the Cloud
Security & Compliance in the CloudAmazon Web Services
 
Serverless State Management & Orchestration for Modern Apps (API302) - AWS re...
Serverless State Management & Orchestration for Modern Apps (API302) - AWS re...Serverless State Management & Orchestration for Modern Apps (API302) - AWS re...
Serverless State Management & Orchestration for Modern Apps (API302) - AWS re...Amazon Web Services
 

Similar to Intro to AWS Batch & How AQR Capital leverages AWS to Identify New Investment Signals (CMP372) - AWS re:Invent 2018 (20)

How AQR Capital Uses AWS to Research New Investment Signals
How AQR Capital Uses AWS to Research New Investment Signals How AQR Capital Uses AWS to Research New Investment Signals
How AQR Capital Uses AWS to Research New Investment Signals
 
AWS Governance at Scale_AWSPSSummit_Singapore
AWS Governance at Scale_AWSPSSummit_SingaporeAWS Governance at Scale_AWSPSSummit_Singapore
AWS Governance at Scale_AWSPSSummit_Singapore
 
Living the AWS Well Architected Framework
Living the AWS Well Architected FrameworkLiving the AWS Well Architected Framework
Living the AWS Well Architected Framework
 
Hybrid Cloud Customer Use Cases on AWS
Hybrid Cloud Customer Use Cases on AWSHybrid Cloud Customer Use Cases on AWS
Hybrid Cloud Customer Use Cases on AWS
 
Enabling Your Organization’s Amazon Redshift Adoption – Going from Zero to He...
Enabling Your Organization’s Amazon Redshift Adoption – Going from Zero to He...Enabling Your Organization’s Amazon Redshift Adoption – Going from Zero to He...
Enabling Your Organization’s Amazon Redshift Adoption – Going from Zero to He...
 
Distributed Solar Systems at EDF Renewables and AWS IoT: A Natural Fit (PUT30...
Intro to AWS Batch & How AQR Capital leverages AWS to Identify New Investment Signals (CMP372) - AWS re:Invent 2018

  • 2. AWS Batch: Easy & Efficient Batch Computing on Amazon Web Services (CMP372). Speakers: Michael Raposa, Head of Platform Engineering, AQR Capital; Rey Wang, Senior Product Manager, AWS Batch and HPC
  • 3. Agenda: Intro to AWS Batch; Summary of recent AWS Batch launches; Glimpse into our roadmap; Real-world use case: how AQR Capital leverages AWS Batch to identify new investment signals; Q&A
  • 4. AWS Batch. Fully managed: no software to install or servers to manage. Integrated with AWS: natively integrated with the AWS platform. Cost-optimized resource provisioning: automatically provisions compute resources tailored to the needs of your jobs using Amazon EC2 On-Demand and Spot pricing.
  • 5. Pricing. There is no additional charge for AWS Batch. You pay only for the AWS resources (for example, Amazon Elastic Compute Cloud [Amazon EC2] instances) you create to store and run your batch jobs.
  • 7. AWS Batch regional expansion. AWS Batch is available in 15 regions: us-east-1 (N. Virginia), us-east-2 (Ohio), us-west-1 (N. California), us-west-2 (Oregon), eu-west-1 (Ireland), eu-west-2 (London), eu-west-3 (Paris), eu-central-1 (Frankfurt), ap-south-1 (Mumbai), ap-northeast-1 (Tokyo), ap-northeast-2 (Seoul), ap-southeast-1 (Singapore), ap-southeast-2 (Sydney), ca-central-1 (Canada Central), sa-east-1 (São Paulo)
  • 8. Manageability & performance improvements • AWS Batch as an Amazon CloudWatch Events target: automate your workflow by submitting a job to AWS Batch in response to an event pattern or on a schedule • Job execution timeout: control cost by automatically terminating a job once it has run for the specified duration • AWS CloudTrail auditing of calls to AWS Batch APIs: ensure compliance with internal policies and regulatory standards • Scheduling & throughput enhancements
  • 9. Improved managed compute environments • Use Amazon EC2 launch templates in your compute environment. Example use cases: increase the size of, or encrypt, the container volume; supply custom user data, for example to mount Amazon Elastic File System (Amazon EFS) on instance launch without creating a custom AMI; override Amazon Elastic Container Service (Amazon ECS) image cleanup • New instance types: Z1d, R5, R5d, M5d, C5d, X1e
  • 10. [Diagram] Container: Container 1, Container 2, Container 3, Container 4
  • 11. Support for multi-node parallel jobs. [Diagram] Container 1, Container 2, Container 3, Container 4
  • 12. Support for multi-node parallel jobs • Run single jobs that require multiple EC2 instances • Designed for distributed computing needs: tightly coupled High Performance Computing (HPC) applications and distributed deep learning training
  • 14. What can you expect in the next 12 months? • Support for Elastic Fabric Adapter • Significant improvements to the AWS Batch console • Batch to emit CloudWatch metrics for monitoring • Better GPU support • More scheduling and performance improvements • New instance types • Further regional expansion
  • 15. How AQR Capital leverages AWS to research new investment signals. Michael Raposa, Head of Platform Engineering, AQR. November 28, 2018. Not intended for the sale or marketing of financial products or services.
  • 16. Disclosures. The information set forth herein has been obtained or derived from sources believed by AQR Capital Management, LLC (“AQR”) to be reliable. However, AQR does not make any representation or warranty, express or implied, as to the information’s accuracy or completeness. This presentation does not represent a formal or official view of AQR. Nor are the services or applications referenced specifically endorsed by AQR. The information contained herein is only as current as of the date indicated, and may be superseded by subsequent market events or for other reasons. Charts and graphs provided herein are for illustrative purposes only. The information in this presentation has been developed internally and/or obtained from sources believed to be reliable; however, neither AQR nor the speaker guarantees the accuracy, adequacy or completeness of such information. Neither AQR nor the speaker assumes any duty to, nor undertakes to, update forward-looking statements. No representation or warranty, express or implied, is made or given by or on behalf of AQR, the speaker or any other person as to the accuracy and completeness or fairness of the information contained in this presentation, and no responsibility or liability is accepted for any such information. By accepting this presentation in its entirety, the recipient acknowledges its understanding and acceptance of the foregoing statement.
  • 17. Agenda • About AQR • Business problem • Solution summary • Lessons learned • Takeaways
  • 18. Our Firm. AQR is a global investment management firm built at the intersection of financial theory and practical application. We strive to deliver superior, long-term results for our clients by looking past market noise to identify and isolate what matters most, and by developing ideas that stand up to rigorous testing. Our focus on practical insights and analysis has made us leaders in alternative and traditional strategies since 1998. At a glance: • AQR takes a systematic, research-driven approach to managing alternative and traditional strategies • We apply quantitative tools to process fundamental information and manage risk • Our clients include institutional investors, such as pension funds, defined contribution plans, insurance companies, endowments, foundations, family offices, and sovereign wealth funds, as well as RIAs, private banks, and financial advisors • The firm has 36 principals and 1,025 employees; over half of employees hold advanced degrees • AQR is based in Greenwich, Connecticut, with offices in Boston, Chicago, Hong Kong, London, Los Angeles, and Sydney • Approximately $226 billion in assets under management as of September 30, 2018* (*Approximate as of 9/30/2018; includes assets managed by AQR and its advisory affiliates.)
  • 19. Problem statement. Background • Quantitative asset manager • Investment decisions based on numerical models and systematic trading • Researchers develop models and “back test” ideas over many years • Never-ending appetite for more data and non-obvious data sets. Source: AQR. For illustrative purposes only.
  • 21. Problem statement. Background • Quantitative asset manager • Investment decisions based on numerical models and systematic trading • Researchers develop models and “back test” ideas over many years • Never-ending appetite for more data and non-obvious data sets. Problem • On-premises compute grid can’t keep up with demand • CAPEX locked into grid • Researchers wait for grid resources • Researchers need job results as quickly as possible • New experimental use cases, such as GPU, require a significant upfront investment of time and money. Source: AQR. For illustrative purposes only.
  • 22. Design considerations: 1. Scalable in both compute and memory 2. Fast, without long queue times 3. No job scheduler to manage (for example, Condor or Sun Grid Engine) 4. Easy to use 5. Secure
  • 23. Solution summary: Leverage cloud. Burstability. Leverage building-block application services, for example, Amazon Simple Storage Service (Amazon S3), EC2, Amazon DynamoDB. Reduce costs: Spot.
  • 24. Solution summary: Seamless interface. Based on Sun Grid Engine. Submit jobs via AWS Command Line Interface (AWS CLI) or API: unlimited compute at fingertips. Short feedback loop: immediate results to researcher. Backend grid matches researcher workstation: no “It works in DEV but not on the GRID”.
  • 25. Solution summary: Fast. Automate everything. “Unlimited” compute. Short start times: no “queues”.
  • 26. Solution summary: Secure. Encryption everywhere. Leverage AWS security tools. Security is a cloud engineering problem.
  • 36. Solution summary: Job submission (AWS CLI), Step 1
  • 37. Solution summary: Job submission (AWS CLI), Step 2
  • 38. Solution summary: Job submission (AWS CLI), Step 3
  • 39. Solution summary: Job submission (AWS CLI), Step 4
  • 40. Solution summary: DAGs. [Diagram] Client: Backtest of 20,000 equities, 1998-2018. Jobs: AAPL 1998-2018; MSFT 1998-2018. Child jobs: AAPL 1998, AAPL 1999, … AAPL 2018; MSFT 1998, MSFT 1999, … MSFT 2018
  • 41. Lessons learned: General. Drive the lowest cost ($15/1000 vCPU/hour): use Spot; use as many instance types as you can; use as many AZs as you can. [Icons: Spot Instance, Availability Zones, Instance Families]
  • 42. Lessons learned: General. Log everything! ECS logs, job output, host logs, and more.
  • 43. Lessons learned: General. Monitor everything! Job runtime; job start-up times; job cost by user; vCPU consumption by user; high-priority queue consumption by user.
  • 46. Lessons learned: As we added more users … 1. TooManyRequestsException: DescribeJobs API call failing 2. Providing governance “guard rails”
  • 51. Lessons learned: As we added more compute … 1. TooManyRequestsException: API calls in the container 2. Job state storage woes 3. Job start times 4. Job costs
  • 52. Lessons learned: API in containers. [Diagram: ECS container, “Get Secret”, Parameter Store]
  • 53. Lessons learned: API in containers, at scale. [Diagram: many ECS containers, Parameter Store, “Exception: too many requests”]
  • 54. Lessons learned: API in containers, at scale. [Diagram: many ECS containers, Amazon S3]
  • 55. What is shared job state? Shared memory across containers; job assignment & orchestration; job input and output
  • 56. Lessons learned: Job state backend. [Diagram: Amazon EFS]
  • 57. Lessons learned: Job state backend. [Diagram: Amazon EFS, Redis]
  • 58. Lessons learned: Job state backend. [Diagram: Amazon EFS, Amazon S3, Redis]
  • 61. Takeaways: Follow best practices. Spot; multi-AZ; log everything; monitor everything. [Icons: Spot Instance, Availability Zones]
  • 62. Takeaways: Watch for scale issues. Eliminate API calls in your containers. Only use services that scale, such as DynamoDB. Switch from poll-based to event/message-based status. Choose a job state backend that fits your use case.
  • 63. Takeaways: Make it easy and secure. Write a lightweight wrapper around AWS Batch for non-technical users. Have a “governator” function to ensure compliance.
  • 64. Takeaways: Reduce your start times. Bake AMIs with Packer. Pre-warm the cluster during active times of the day. Give yourself an SLA: 75% of jobs start within 10 minutes and 90% within 15 minutes.
  • 65. Takeaways: Control runaway costs. Alarm on large, long-running jobs ($$$). Set limits on your compute environment.
  • 68. Thank you! Michael Raposa, Head of Platform Engineering, AQR Capital; Rey Wang, Senior Product Manager, AWS Batch and HPC
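
The fan-out on slide 40 (one backtest split into per-ticker jobs, each split into per-year child jobs) can be sketched as a small dependency graph. This is a minimal illustration, not AQR's implementation: the names `JobNode`, `plan_backtest`, and `submission_order` are hypothetical, and the direction of the dependencies (roll-up jobs waiting on their per-year children) is an assumption the slides do not state.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class JobNode:
    """One job in the graph; depends_on lists jobs that must finish first."""
    name: str
    depends_on: List[str] = field(default_factory=list)


def plan_backtest(tickers: List[str], start_year: int, end_year: int) -> Dict[str, JobNode]:
    """Build per-year leaf jobs, one roll-up job per ticker, and one root job.

    Assumption: each roll-up waits on its children; the slides only show the tree.
    """
    nodes: Dict[str, JobNode] = {}
    ticker_jobs: List[str] = []
    for ticker in tickers:
        year_jobs: List[str] = []
        for year in range(start_year, end_year + 1):
            leaf = f"{ticker}-{year}"
            nodes[leaf] = JobNode(leaf)
            year_jobs.append(leaf)
        rollup = f"{ticker}-{start_year}-{end_year}"
        nodes[rollup] = JobNode(rollup, depends_on=year_jobs)
        ticker_jobs.append(rollup)
    root = f"backtest-{start_year}-{end_year}"
    nodes[root] = JobNode(root, depends_on=ticker_jobs)
    return nodes


def submission_order(nodes: Dict[str, JobNode]) -> List[str]:
    """Return job names so that every job comes after the jobs it depends on."""
    ordered: List[str] = []
    seen = set()

    def visit(name: str) -> None:
        if name in seen:
            return
        seen.add(name)
        for dep in nodes[name].depends_on:
            visit(dep)
        ordered.append(name)

    for name in nodes:
        visit(name)
    return ordered
```

Submitting in this order means each job's prerequisites already have job IDs by the time it is submitted, which is what the `dependsOn` parameter of the AWS Batch SubmitJob API expects.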
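
The start-time SLA on slide 64 (75% of jobs start within 10 minutes and 90% within 15 minutes) is simple to check against recorded start delays. The sketch below assumes delays are already collected in minutes; the function names and the sample data are hypothetical, and how AQR measures start times is not described in the slides.

```python
from typing import Sequence, Tuple


def fraction_started_within(start_delays_min: Sequence[float], limit_min: float) -> float:
    """Share of jobs whose delay from submission to start is at most limit_min."""
    if not start_delays_min:
        raise ValueError("need at least one job to evaluate the SLA")
    on_time = sum(1 for delay in start_delays_min if delay <= limit_min)
    return on_time / len(start_delays_min)


def meets_start_sla(
    start_delays_min: Sequence[float],
    targets: Tuple[Tuple[float, float], ...] = ((10, 0.75), (15, 0.90)),
) -> bool:
    """True when every (limit in minutes, required share) target is satisfied."""
    return all(
        fraction_started_within(start_delays_min, limit) >= share
        for limit, share in targets
    )


# Hypothetical sample: 8 of 10 jobs start within 10 minutes, all within 15.
sample_delays = [1, 2, 3, 5, 7, 8, 9, 10, 12, 14]
sla_ok = meets_start_sla(sample_delays)
```

Keeping the targets as data rather than hard-coding them makes it easy to tighten the SLA as start times improve.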