SlideShare a Scribd company logo
Jesse Anderson
EC2 PERFORMANCE, SPOT INSTANCE ROI AND
EMR SCALABILITY
AMAZON WEB SERVICES (AWS)

   Elastic Cloud Compute (EC2)
     Virtual   Machine in Cloud
   Simple Storage Service (S3)
     Network    Share in Cloud
   Elastic MapReduce (EMR)
     Cluster   of EC2 instances for Hadoop cluster
EC2 PRICE TYPES

   Spot Instances
     Systemfor bidding on unused instances
     Same Performance

     Go away (abruptly) if outbid

   On Demand
     Ad   Hoc starting
   Reserved
     Not   Covered
SPOT INSTANCE SAVINGS
MILLION MONKEYS PROJECT

 Randomly recreated Shakespeare
 Open source

 Good metric for CPU and memory
EC2 SPECIFICATIONS
Instance Name    Memory   EC2 Compute         Platform I/O
                          Units/Cores                  Performance
Small            1.7 GB   1 EC2 on 1 Core     32-bit   Moderate
Large            7.5 GB   4 EC2 on 2 Cores    64-bit   High
Extra Large      15 GB    8 EC2 on 8 Cores    64-bit   High
High-CPU         1.7 GB   5 EC2 on 2 Cores    32-bit   Moderate
Medium
High-CPU Large   7 GB     20 EC2 on 8 Cores   64-bit   High
Quad XL          23 GB    33.5 on 8 Cores     64-bit   Very High
 EC2 Compute Unit (ECU) – One EC2 Compute Unit (ECU) provides
 the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or
 2007 Xeon processor.
EC2 PERFORMANCE




   My Core 2 Duo 2.66 GHZ did 50,000,000,000 character groups
EC2 COST PER HOUR ON DEMAND/SPOT
PRICE PER UNIT
EMR (HADOOP) CLUSTERING

 Tests of 1, 2, 3, 4, 5, 10, 20 node clusters
 Price

 Scalability
EMR COST
PRICE PER UNIT IN A CLUSTER
CLUSTERED CHARACTER GROUPS
EMR/HADOOP SCALABILITY PERCENTAGE
EMR/HADOOP SCALABILITY ABSOLUTE
BREAKDOWNS

   Original project would have run in 3 days 9
    hours
     Took   1.5 months before
 20 node cluster costs $45.44 per day
 5 day run cost $317

 11 day run cost $528
ENGINEERING FOR THE CLOUD

 Establish if a good fit
 Test the EC2 performance

 Figure out a unit or widget

 Find the most cost efficient EC2 performer
  with price per unit/widget
 Engineer with Spot Instances in mind
CONCLUSIONS

   Spot Instance Saves
     From $2.20 to $1.30 per hour
     Saved $1,000 in one run

   Hadoop/EMR Scalability
     95% efficiency at 2-5 nodes
     87% efficiency at 10 nodes

     84% efficiency at 20 nodes
MORE INFORMATION

 http://www.jesse-anderson.com/2012/02/ec2-
  performance-spot-instance-roi-and-emr-
  scalability/
 @jessetanderson

More Related Content

What's hot

Ralph Rebske: AWS Pricing and Billing
Ralph Rebske: AWS Pricing and BillingRalph Rebske: AWS Pricing and Billing
Ralph Rebske: AWS Pricing and BillingSymposia Media
 
Reducing Cost & Maximizing Efficiency: Tightening the Belt on AWS (CPN211) | ...
Reducing Cost & Maximizing Efficiency: Tightening the Belt on AWS (CPN211) | ...Reducing Cost & Maximizing Efficiency: Tightening the Belt on AWS (CPN211) | ...
Reducing Cost & Maximizing Efficiency: Tightening the Belt on AWS (CPN211) | ...Amazon Web Services
 
AWS Cloud Kata | Bangkok - Getting to Scale on AWS
AWS Cloud Kata | Bangkok - Getting to Scale on AWSAWS Cloud Kata | Bangkok - Getting to Scale on AWS
AWS Cloud Kata | Bangkok - Getting to Scale on AWSAmazon Web Services
 
Optimising TCO with AWS at Websummit Dublin
Optimising TCO with AWS at Websummit DublinOptimising TCO with AWS at Websummit Dublin
Optimising TCO with AWS at Websummit DublinAmazon Web Services
 
EC2 Pricing Model (deck 0307 of the InfiniteSkills AWS course at http://bit.l...
EC2 Pricing Model (deck 0307 of the InfiniteSkills AWS course at http://bit.l...EC2 Pricing Model (deck 0307 of the InfiniteSkills AWS course at http://bit.l...
EC2 Pricing Model (deck 0307 of the InfiniteSkills AWS course at http://bit.l...rICh morrow
 
Keeping Your Infrastructure Costs Low - AWS Startup Day Boston 2018.pdf
Keeping Your Infrastructure Costs Low - AWS Startup Day Boston 2018.pdfKeeping Your Infrastructure Costs Low - AWS Startup Day Boston 2018.pdf
Keeping Your Infrastructure Costs Low - AWS Startup Day Boston 2018.pdfAmazon Web Services
 
Best Practices for Managing Hadoop Framework Based Workloads (on Amazon EMR) ...
Best Practices for Managing Hadoop Framework Based Workloads (on Amazon EMR) ...Best Practices for Managing Hadoop Framework Based Workloads (on Amazon EMR) ...
Best Practices for Managing Hadoop Framework Based Workloads (on Amazon EMR) ...Amazon Web Services
 
Hadoop in the cloud with AWS' EMR
Hadoop in the cloud with AWS' EMRHadoop in the cloud with AWS' EMR
Hadoop in the cloud with AWS' EMRrICh morrow
 
Scaling your analytics with Amazon EMR
Scaling your analytics with Amazon EMRScaling your analytics with Amazon EMR
Scaling your analytics with Amazon EMRIsrael AWS User Group
 
AWS re:Invent re:Cap - 비용 최적화 - 모범사례와 아키텍처 설계 심화편 - 이원일
AWS re:Invent re:Cap - 비용 최적화 - 모범사례와 아키텍처 설계 심화편 - 이원일AWS re:Invent re:Cap - 비용 최적화 - 모범사례와 아키텍처 설계 심화편 - 이원일
AWS re:Invent re:Cap - 비용 최적화 - 모범사례와 아키텍처 설계 심화편 - 이원일Amazon Web Services Korea
 
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)Amazon Web Services
 
AWS Storage Services - AWS Presentation - AWS Cloud Storage for the Enterpris...
AWS Storage Services - AWS Presentation - AWS Cloud Storage for the Enterpris...AWS Storage Services - AWS Presentation - AWS Cloud Storage for the Enterpris...
AWS Storage Services - AWS Presentation - AWS Cloud Storage for the Enterpris...Amazon Web Services
 
Cloud Economics, from Genesis to Scale
Cloud Economics, from Genesis to ScaleCloud Economics, from Genesis to Scale
Cloud Economics, from Genesis to ScaleAmazon Web Services
 
AWS18_StartupDayToronto_KeepingYourInfraCostsLow
AWS18_StartupDayToronto_KeepingYourInfraCostsLowAWS18_StartupDayToronto_KeepingYourInfraCostsLow
AWS18_StartupDayToronto_KeepingYourInfraCostsLowAmazon Web Services
 
AWS Webcast - Total Cost of (Non) Ownership
AWS Webcast - Total Cost of (Non) Ownership  AWS Webcast - Total Cost of (Non) Ownership
AWS Webcast - Total Cost of (Non) Ownership Amazon Web Services
 
Best Practices for Running Amazon EC2 Spot Instances with Amazon EMR - AWS On...
Best Practices for Running Amazon EC2 Spot Instances with Amazon EMR - AWS On...Best Practices for Running Amazon EC2 Spot Instances with Amazon EMR - AWS On...
Best Practices for Running Amazon EC2 Spot Instances with Amazon EMR - AWS On...Amazon Web Services
 
AWS Summit London 2014 | Customer Stories | Just Eat
AWS Summit London 2014 | Customer Stories | Just EatAWS Summit London 2014 | Customer Stories | Just Eat
AWS Summit London 2014 | Customer Stories | Just EatAmazon Web Services
 
Cloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBMCloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBMRightScale
 
Scaling your Analytics with Amazon Elastic MapReduce (BDT301) | AWS re:Invent...
Scaling your Analytics with Amazon Elastic MapReduce (BDT301) | AWS re:Invent...Scaling your Analytics with Amazon Elastic MapReduce (BDT301) | AWS re:Invent...
Scaling your Analytics with Amazon Elastic MapReduce (BDT301) | AWS re:Invent...Amazon Web Services
 
getting started with amazon aurora
getting started with amazon auroragetting started with amazon aurora
getting started with amazon auroraAmazon Web Services
 

What's hot (20)

Ralph Rebske: AWS Pricing and Billing
Ralph Rebske: AWS Pricing and BillingRalph Rebske: AWS Pricing and Billing
Ralph Rebske: AWS Pricing and Billing
 
Reducing Cost & Maximizing Efficiency: Tightening the Belt on AWS (CPN211) | ...
Reducing Cost & Maximizing Efficiency: Tightening the Belt on AWS (CPN211) | ...Reducing Cost & Maximizing Efficiency: Tightening the Belt on AWS (CPN211) | ...
Reducing Cost & Maximizing Efficiency: Tightening the Belt on AWS (CPN211) | ...
 
AWS Cloud Kata | Bangkok - Getting to Scale on AWS
AWS Cloud Kata | Bangkok - Getting to Scale on AWSAWS Cloud Kata | Bangkok - Getting to Scale on AWS
AWS Cloud Kata | Bangkok - Getting to Scale on AWS
 
Optimising TCO with AWS at Websummit Dublin
Optimising TCO with AWS at Websummit DublinOptimising TCO with AWS at Websummit Dublin
Optimising TCO with AWS at Websummit Dublin
 
EC2 Pricing Model (deck 0307 of the InfiniteSkills AWS course at http://bit.l...
EC2 Pricing Model (deck 0307 of the InfiniteSkills AWS course at http://bit.l...EC2 Pricing Model (deck 0307 of the InfiniteSkills AWS course at http://bit.l...
EC2 Pricing Model (deck 0307 of the InfiniteSkills AWS course at http://bit.l...
 
Keeping Your Infrastructure Costs Low - AWS Startup Day Boston 2018.pdf
Keeping Your Infrastructure Costs Low - AWS Startup Day Boston 2018.pdfKeeping Your Infrastructure Costs Low - AWS Startup Day Boston 2018.pdf
Keeping Your Infrastructure Costs Low - AWS Startup Day Boston 2018.pdf
 
Best Practices for Managing Hadoop Framework Based Workloads (on Amazon EMR) ...
Best Practices for Managing Hadoop Framework Based Workloads (on Amazon EMR) ...Best Practices for Managing Hadoop Framework Based Workloads (on Amazon EMR) ...
Best Practices for Managing Hadoop Framework Based Workloads (on Amazon EMR) ...
 
Hadoop in the cloud with AWS' EMR
Hadoop in the cloud with AWS' EMRHadoop in the cloud with AWS' EMR
Hadoop in the cloud with AWS' EMR
 
Scaling your analytics with Amazon EMR
Scaling your analytics with Amazon EMRScaling your analytics with Amazon EMR
Scaling your analytics with Amazon EMR
 
AWS re:Invent re:Cap - 비용 최적화 - 모범사례와 아키텍처 설계 심화편 - 이원일
AWS re:Invent re:Cap - 비용 최적화 - 모범사례와 아키텍처 설계 심화편 - 이원일AWS re:Invent re:Cap - 비용 최적화 - 모범사례와 아키텍처 설계 심화편 - 이원일
AWS re:Invent re:Cap - 비용 최적화 - 모범사례와 아키텍처 설계 심화편 - 이원일
 
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)
 
AWS Storage Services - AWS Presentation - AWS Cloud Storage for the Enterpris...
AWS Storage Services - AWS Presentation - AWS Cloud Storage for the Enterpris...AWS Storage Services - AWS Presentation - AWS Cloud Storage for the Enterpris...
AWS Storage Services - AWS Presentation - AWS Cloud Storage for the Enterpris...
 
Cloud Economics, from Genesis to Scale
Cloud Economics, from Genesis to ScaleCloud Economics, from Genesis to Scale
Cloud Economics, from Genesis to Scale
 
AWS18_StartupDayToronto_KeepingYourInfraCostsLow
AWS18_StartupDayToronto_KeepingYourInfraCostsLowAWS18_StartupDayToronto_KeepingYourInfraCostsLow
AWS18_StartupDayToronto_KeepingYourInfraCostsLow
 
AWS Webcast - Total Cost of (Non) Ownership
AWS Webcast - Total Cost of (Non) Ownership  AWS Webcast - Total Cost of (Non) Ownership
AWS Webcast - Total Cost of (Non) Ownership
 
Best Practices for Running Amazon EC2 Spot Instances with Amazon EMR - AWS On...
Best Practices for Running Amazon EC2 Spot Instances with Amazon EMR - AWS On...Best Practices for Running Amazon EC2 Spot Instances with Amazon EMR - AWS On...
Best Practices for Running Amazon EC2 Spot Instances with Amazon EMR - AWS On...
 
AWS Summit London 2014 | Customer Stories | Just Eat
AWS Summit London 2014 | Customer Stories | Just EatAWS Summit London 2014 | Customer Stories | Just Eat
AWS Summit London 2014 | Customer Stories | Just Eat
 
Cloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBMCloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBM
 
Scaling your Analytics with Amazon Elastic MapReduce (BDT301) | AWS re:Invent...
Scaling your Analytics with Amazon Elastic MapReduce (BDT301) | AWS re:Invent...Scaling your Analytics with Amazon Elastic MapReduce (BDT301) | AWS re:Invent...
Scaling your Analytics with Amazon Elastic MapReduce (BDT301) | AWS re:Invent...
 
getting started with amazon aurora
getting started with amazon auroragetting started with amazon aurora
getting started with amazon aurora
 

Similar to EC2 Performance, Spot Instance ROI and EMR Scalability

Netflix Moving To Cloud
Netflix Moving To CloudNetflix Moving To Cloud
Netflix Moving To CloudHien Luu
 
Amazon Web Services (cloud: is it good for anything?)
Amazon Web Services (cloud: is it good for anything?)Amazon Web Services (cloud: is it good for anything?)
Amazon Web Services (cloud: is it good for anything?)Maciej Pasternacki
 
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014Amazon Web Services
 
PAC 2019 virtual Stefano Doni
PAC 2019 virtual Stefano Doni   PAC 2019 virtual Stefano Doni
PAC 2019 virtual Stefano Doni Neotys
 
Amazon EC2 Instances, Featuring Performance Optimisation Best Practices
Amazon EC2 Instances, Featuring Performance Optimisation Best PracticesAmazon EC2 Instances, Featuring Performance Optimisation Best Practices
Amazon EC2 Instances, Featuring Performance Optimisation Best PracticesAmazon Web Services
 
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech TalksDeep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech TalksAmazon Web Services
 
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot Instances
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot InstancesWKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot Instances
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot InstancesAmazon Web Services
 
Scaling an invoicing SaaS from zero to over 350k customers
Scaling an invoicing SaaS from zero to over 350k customersScaling an invoicing SaaS from zero to over 350k customers
Scaling an invoicing SaaS from zero to over 350k customersSpeck&Tech
 
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot Instances
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot InstancesWKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot Instances
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot InstancesAmazon Web Services
 
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-Final
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-FinalSizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-Final
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-FinalVigyan Jain
 
AWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWSAWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWSAmazon Web Services
 
The iot academy_awstraining_part1_aws_introduction
The iot academy_awstraining_part1_aws_introductionThe iot academy_awstraining_part1_aws_introduction
The iot academy_awstraining_part1_aws_introductionThe IOT Academy
 
Amazon web services
Amazon web servicesAmazon web services
Amazon web servicestsaiscorpio
 
AWS Fargate in practice. How to run containers without managing EC2 instances
AWS Fargate in practice. How to run containers without managing EC2 instancesAWS Fargate in practice. How to run containers without managing EC2 instances
AWS Fargate in practice. How to run containers without managing EC2 instancesMax Borysov
 
Cs264 intro-to-cloud-computing
Cs264 intro-to-cloud-computingCs264 intro-to-cloud-computing
Cs264 intro-to-cloud-computingkartiko edhi
 
[Harvard CS264] 08a - Cloud Computing, Amazon EC2, MIT StarCluster (Justin Ri...
[Harvard CS264] 08a - Cloud Computing, Amazon EC2, MIT StarCluster (Justin Ri...[Harvard CS264] 08a - Cloud Computing, Amazon EC2, MIT StarCluster (Justin Ri...
[Harvard CS264] 08a - Cloud Computing, Amazon EC2, MIT StarCluster (Justin Ri...npinto
 
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach ShoolmanRedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach ShoolmanRedis Labs
 

Similar to EC2 Performance, Spot Instance ROI and EMR Scalability (20)

Netflix Moving To Cloud
Netflix Moving To CloudNetflix Moving To Cloud
Netflix Moving To Cloud
 
Amazon Web Services (cloud: is it good for anything?)
Amazon Web Services (cloud: is it good for anything?)Amazon Web Services (cloud: is it good for anything?)
Amazon Web Services (cloud: is it good for anything?)
 
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014
 
PAC 2019 virtual Stefano Doni
PAC 2019 virtual Stefano Doni   PAC 2019 virtual Stefano Doni
PAC 2019 virtual Stefano Doni
 
Amazon EC2
Amazon EC2Amazon EC2
Amazon EC2
 
Amazon EC2 Instances, Featuring Performance Optimisation Best Practices
Amazon EC2 Instances, Featuring Performance Optimisation Best PracticesAmazon EC2 Instances, Featuring Performance Optimisation Best Practices
Amazon EC2 Instances, Featuring Performance Optimisation Best Practices
 
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech TalksDeep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
 
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot Instances
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot InstancesWKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot Instances
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot Instances
 
Scaling an invoicing SaaS from zero to over 350k customers
Scaling an invoicing SaaS from zero to over 350k customersScaling an invoicing SaaS from zero to over 350k customers
Scaling an invoicing SaaS from zero to over 350k customers
 
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot Instances
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot InstancesWKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot Instances
WKS401 Deploy a Deep Learning Framework on Amazon ECS and EC2 Spot Instances
 
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-Final
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-FinalSizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-Final
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-Final
 
AWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWSAWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWS
 
The iot academy_awstraining_part1_aws_introduction
The iot academy_awstraining_part1_aws_introductionThe iot academy_awstraining_part1_aws_introduction
The iot academy_awstraining_part1_aws_introduction
 
Amazon web services
Amazon web servicesAmazon web services
Amazon web services
 
Amazon EC2
Amazon EC2Amazon EC2
Amazon EC2
 
AWS Fargate in practice. How to run containers without managing EC2 instances
AWS Fargate in practice. How to run containers without managing EC2 instancesAWS Fargate in practice. How to run containers without managing EC2 instances
AWS Fargate in practice. How to run containers without managing EC2 instances
 
Cs264 intro-to-cloud-computing
Cs264 intro-to-cloud-computingCs264 intro-to-cloud-computing
Cs264 intro-to-cloud-computing
 
[Harvard CS264] 08a - Cloud Computing, Amazon EC2, MIT StarCluster (Justin Ri...
[Harvard CS264] 08a - Cloud Computing, Amazon EC2, MIT StarCluster (Justin Ri...[Harvard CS264] 08a - Cloud Computing, Amazon EC2, MIT StarCluster (Justin Ri...
[Harvard CS264] 08a - Cloud Computing, Amazon EC2, MIT StarCluster (Justin Ri...
 
Deep Dive on Amazon EC2
Deep Dive on Amazon EC2Deep Dive on Amazon EC2
Deep Dive on Amazon EC2
 
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach ShoolmanRedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
 

More from Jesse Anderson

Managing Real-Time Data Teams
Managing Real-Time Data TeamsManaging Real-Time Data Teams
Managing Real-Time Data TeamsJesse Anderson
 
Pulsar for Kafka People
Pulsar for Kafka PeoplePulsar for Kafka People
Pulsar for Kafka PeopleJesse Anderson
 
Big Data and Analytics in the COVID-19 Era
Big Data and Analytics in the COVID-19 EraBig Data and Analytics in the COVID-19 Era
Big Data and Analytics in the COVID-19 EraJesse Anderson
 
Working Together As Data Teams V1
Working Together As Data Teams V1Working Together As Data Teams V1
Working Together As Data Teams V1Jesse Anderson
 
What Does an Exec Need to About Architecture and Why
What Does an Exec Need to About Architecture and WhyWhat Does an Exec Need to About Architecture and Why
What Does an Exec Need to About Architecture and WhyJesse Anderson
 
The Five Dysfunctions of a Data Engineering Team
The Five Dysfunctions of a Data Engineering TeamThe Five Dysfunctions of a Data Engineering Team
The Five Dysfunctions of a Data Engineering TeamJesse Anderson
 
HBaseCon 2014-Just the Basics
HBaseCon 2014-Just the BasicsHBaseCon 2014-Just the Basics
HBaseCon 2014-Just the BasicsJesse Anderson
 
Million Monkeys User Group
Million Monkeys User GroupMillion Monkeys User Group
Million Monkeys User GroupJesse Anderson
 
Strata 2012 Million Monkeys
Strata 2012 Million MonkeysStrata 2012 Million Monkeys
Strata 2012 Million MonkeysJesse Anderson
 
Introduction to Regular Expressions
Introduction to Regular ExpressionsIntroduction to Regular Expressions
Introduction to Regular ExpressionsJesse Anderson
 
Introduction to Android
Introduction to AndroidIntroduction to Android
Introduction to AndroidJesse Anderson
 

More from Jesse Anderson (13)

Managing Real-Time Data Teams
Managing Real-Time Data TeamsManaging Real-Time Data Teams
Managing Real-Time Data Teams
 
Pulsar for Kafka People
Pulsar for Kafka PeoplePulsar for Kafka People
Pulsar for Kafka People
 
Big Data and Analytics in the COVID-19 Era
Big Data and Analytics in the COVID-19 EraBig Data and Analytics in the COVID-19 Era
Big Data and Analytics in the COVID-19 Era
 
Working Together As Data Teams V1
Working Together As Data Teams V1Working Together As Data Teams V1
Working Together As Data Teams V1
 
What Does an Exec Need to About Architecture and Why
What Does an Exec Need to About Architecture and WhyWhat Does an Exec Need to About Architecture and Why
What Does an Exec Need to About Architecture and Why
 
The Five Dysfunctions of a Data Engineering Team
The Five Dysfunctions of a Data Engineering TeamThe Five Dysfunctions of a Data Engineering Team
The Five Dysfunctions of a Data Engineering Team
 
HBaseCon 2014-Just the Basics
HBaseCon 2014-Just the BasicsHBaseCon 2014-Just the Basics
HBaseCon 2014-Just the Basics
 
Million Monkeys User Group
Million Monkeys User GroupMillion Monkeys User Group
Million Monkeys User Group
 
Strata 2012 Million Monkeys
Strata 2012 Million MonkeysStrata 2012 Million Monkeys
Strata 2012 Million Monkeys
 
Introduction to Regular Expressions
Introduction to Regular ExpressionsIntroduction to Regular Expressions
Introduction to Regular Expressions
 
Why Use MVC?
Why Use MVC?Why Use MVC?
Why Use MVC?
 
How to Use MVC
How to Use MVCHow to Use MVC
How to Use MVC
 
Introduction to Android
Introduction to AndroidIntroduction to Android
Introduction to Android
 

Recently uploaded

Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
 
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...CzechDreamin
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...Product School
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Product School
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesThousandEyes
 
In-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsIn-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsExpeed Software
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIES VE
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Thierry Lestable
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...Product School
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaRTTS
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlPeter Udo Diehl
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Product School
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityScyllaDB
 
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka DoktorováCzechDreamin
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Product School
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2DianaGray10
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
 

Recently uploaded (20)

Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
In-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsIn-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT Professionals
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 

EC2 Performance, Spot Instance ROI and EMR Scalability

  • 1. Jesse Anderson EC2 PERFORMANCE, SPOT INSTANCE ROI AND EMR SCALABILITY
  • 2. AMAZON WEB SERVICES (AWS)  Elastic Cloud Compute (EC2)  Virtual Machine in Cloud  Simple Storage Service (S3)  Network Share in Cloud  Elastic MapReduce (EMR)  Cluster of EC2 instances for Hadoop cluster
  • 3. EC2 PRICE TYPES  Spot Instances  Systemfor bidding on unused instances  Same Performance  Go away (abruptly) if outbid  On Demand  Ad Hoc starting  Reserved  Not Covered
  • 5. MILLION MONKEYS PROJECT  Randomly recreated Shakespeare  Open source  Good metric for CPU and memory
  • 6. EC2 SPECIFICATIONS Instance Name Memory EC2 Compute Platform I/O Units/Cores Performance Small 1.7 GB 1 EC2 on 1 Core 32-bit Moderate Large 7.5 GB 4 EC2 on 2 Cores 64-bit High Extra Large 15 GB 8 EC2 on 8 Cores 64-bit High High-CPU 1.7 GB 5 EC2 on 2 Cores 32-bit Moderate Medium High-CPU Large 7 GB 20 EC2 on 8 Cores 64-bit High Quad XL 23 GB 33.5 on 8 Cores 64-bit Very High EC2 Compute Unit (ECU) – One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
  • 7. EC2 PERFORMANCE My Core 2 Duo 2.66 GHZ did 50,000,000,000 character groups
  • 8. EC2 COST PER HOUR ON DEMAND/SPOT
  • 10. EMR (HADOOP) CLUSTERING  Tests of 1, 2, 3, 4, 5, 10, 20 node clusters  Price  Scalability
  • 12. PRICE PER UNIT IN A CLUSTER
  • 16. BREAKDOWNS  Original project would have run in 3 days 9 hours  Took 1.5 months before  20 node cluster costs $45.44 per day  5 day run cost $317  11 day run cost $528
  • 17. ENGINEERING FOR THE CLOUD  Establish if a good fit  Test the EC2 performance  Figure out a unit or widget  Find the most cost efficient EC2 performer with price per unit/widget  Engineer with Spot Instances in mind
  • 18. CONCLUSIONS  Spot Instance Saves  From $2.20 to $1.30 per hour  Saved $1,000 in one run  Hadoop/EMR Scalability  95% efficiency at 2-5 nodes  87% efficiency at 10 nodes  84% efficiency at 20 nodes
  • 19. MORE INFORMATION  http://www.jesse-anderson.com/2012/02/ec2- performance-spot-instance-roi-and-emr- scalability/  @jessetanderson