SlideShare a Scribd company logo
1 of 23
Big data in the cloud
welcome to cost oriented design
Arnon Rotem-Gal-Oz
Chief Data Officer
Cost oriented design Cheap
(but hopefully cheaper)
Caveat Emptor
security
performanceavailability
correctness
scalability
solution cost
“Traditional” design for cost
POC
(choose technologies,
understand
constraints)
Size needed hardware purchase HW + SW Optimize
Amdocs TeraScale
Source: https://www.youtube.com/watch?v=0N2KhuMKAxM
Optimize &
maximize utilization
Big data in the cloud
Big Data
Big Data
Analytic Database
Vertica
vs.
RedShift
security
performanceavailability
correctness
scalability
solution initial cost
security
performanceavailability
correctness
scalability
Initial cost
Variable running
costs
Dedicated Servers vs. S3 with EC2
Spot instances
security
performanceavailability
correctness
scalability
Initial cost
Continuous cost
optimization
variable cost
Analytics database
(take 2)
Redshift, TokuMX
vs.
Druid
Google BigQuery
What do users actually need?“You can’t always get
what you want,
But if you try,
sometimes, you might find
you get what you need”
The rolling stones
Initial cost
variable cost
continuous cost
optimization
“COST ORIENTATION”
YOU KEEP USING THESE WORDS
I DON’T THINK THEY MEAN
WHAT YOU THINK THEY MEAN
We are Hiring!
https:// www.appsflyer.com/ jobs/

More Related Content

What's hot

Cost Optimization as Major Architectural Consideration for Cloud Application
Cost Optimization as Major Architectural Consideration for Cloud ApplicationCost Optimization as Major Architectural Consideration for Cloud Application
Cost Optimization as Major Architectural Consideration for Cloud Application
Udayan Banerjee
 
Big Data on EC2: Mashing Technology in the Cloud
Big Data on EC2: Mashing Technology in the CloudBig Data on EC2: Mashing Technology in the Cloud
Big Data on EC2: Mashing Technology in the Cloud
George Ang
 
Cloud Computing with Steve Greutter
Cloud Computing with Steve GreutterCloud Computing with Steve Greutter
Cloud Computing with Steve Greutter
decindublin
 

What's hot (20)

Architecture Best Practices to Master + Pitfalls to Avoid
Architecture Best Practices to Master + Pitfalls to AvoidArchitecture Best Practices to Master + Pitfalls to Avoid
Architecture Best Practices to Master + Pitfalls to Avoid
 
AWS Customer Presenatation - SlingMedia uses AWS
AWS Customer Presenatation - SlingMedia uses AWSAWS Customer Presenatation - SlingMedia uses AWS
AWS Customer Presenatation - SlingMedia uses AWS
 
Driving the On-Demand Economy with Predictive Analytics
Driving the On-Demand Economy with Predictive AnalyticsDriving the On-Demand Economy with Predictive Analytics
Driving the On-Demand Economy with Predictive Analytics
 
Cost Optimization as Major Architectural Consideration for Cloud Application
Cost Optimization as Major Architectural Consideration for Cloud ApplicationCost Optimization as Major Architectural Consideration for Cloud Application
Cost Optimization as Major Architectural Consideration for Cloud Application
 
New Capabilities Cloud Computing
New Capabilities Cloud ComputingNew Capabilities Cloud Computing
New Capabilities Cloud Computing
 
Cloud Costing Services
Cloud Costing Services Cloud Costing Services
Cloud Costing Services
 
Cloud Overview
Cloud OverviewCloud Overview
Cloud Overview
 
Operationalizing Machine Learning at Scale at Starbucks
Operationalizing Machine Learning at Scale at StarbucksOperationalizing Machine Learning at Scale at Starbucks
Operationalizing Machine Learning at Scale at Starbucks
 
Big Data on EC2: Mashing Technology in the Cloud
Big Data on EC2: Mashing Technology in the CloudBig Data on EC2: Mashing Technology in the Cloud
Big Data on EC2: Mashing Technology in the Cloud
 
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSpark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
 
Business Continuity with Microservices-Based Apps and DevOps: Learnings from ...
Business Continuity with Microservices-Based Apps and DevOps: Learnings from ...Business Continuity with Microservices-Based Apps and DevOps: Learnings from ...
Business Continuity with Microservices-Based Apps and DevOps: Learnings from ...
 
Changing the Way Viacom Looks at Video Performance with Mark Cohen and Michae...
Changing the Way Viacom Looks at Video Performance with Mark Cohen and Michae...Changing the Way Viacom Looks at Video Performance with Mark Cohen and Michae...
Changing the Way Viacom Looks at Video Performance with Mark Cohen and Michae...
 
Cloud Computing with Steve Greutter
Cloud Computing with Steve GreutterCloud Computing with Steve Greutter
Cloud Computing with Steve Greutter
 
Cloud computing: cost reduction
Cloud computing: cost reductionCloud computing: cost reduction
Cloud computing: cost reduction
 
Big Data Day LA 2015 - Lessons learned from scaling Big Data in the Cloud by...
Big Data Day LA 2015 -  Lessons learned from scaling Big Data in the Cloud by...Big Data Day LA 2015 -  Lessons learned from scaling Big Data in the Cloud by...
Big Data Day LA 2015 - Lessons learned from scaling Big Data in the Cloud by...
 
AWS Spot infographic
AWS Spot infographicAWS Spot infographic
AWS Spot infographic
 
Using AML Python SDK
Using AML Python SDKUsing AML Python SDK
Using AML Python SDK
 
IBM Edge 2015 Infographic
IBM Edge 2015 InfographicIBM Edge 2015 Infographic
IBM Edge 2015 Infographic
 
UberCloud Webinar ansys azure
UberCloud Webinar ansys azure UberCloud Webinar ansys azure
UberCloud Webinar ansys azure
 
AWS Customer Presentation - Zynga
AWS Customer Presentation - ZyngaAWS Customer Presentation - Zynga
AWS Customer Presentation - Zynga
 

Similar to Big data in the cloud - welcome to cost oriented design

Improve your TCO and Optimise your Cloud Spend
Improve your TCO and Optimise your Cloud SpendImprove your TCO and Optimise your Cloud Spend
Improve your TCO and Optimise your Cloud Spend
Amazon Web Services
 
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
Amazon Web Services Korea
 
AWS Cloud Kata | Manila - Getting to Profitability on AWS
AWS Cloud Kata | Manila - Getting to Profitability on AWSAWS Cloud Kata | Manila - Getting to Profitability on AWS
AWS Cloud Kata | Manila - Getting to Profitability on AWS
Amazon Web Services
 

Similar to Big data in the cloud - welcome to cost oriented design (20)

Effective use of cloud resources for Data Engineering - Johnson Darkwah
Effective use of cloud resources for Data Engineering - Johnson DarkwahEffective use of cloud resources for Data Engineering - Johnson Darkwah
Effective use of cloud resources for Data Engineering - Johnson Darkwah
 
AWS Summit London 2014 | Optimising TCO for the AWS Cloud (100)
AWS Summit London 2014 | Optimising TCO for the AWS Cloud (100)AWS Summit London 2014 | Optimising TCO for the AWS Cloud (100)
AWS Summit London 2014 | Optimising TCO for the AWS Cloud (100)
 
Improve your TCO and Optimise your Cloud Spend
Improve your TCO and Optimise your Cloud SpendImprove your TCO and Optimise your Cloud Spend
Improve your TCO and Optimise your Cloud Spend
 
Analytics on AWS - IP Expo 2013
Analytics on AWS - IP Expo 2013Analytics on AWS - IP Expo 2013
Analytics on AWS - IP Expo 2013
 
Intro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS CloudIntro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS Cloud
 
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
 
High Performance Computing on AWS: Accelerating Innovation with virtually unl...
High Performance Computing on AWS: Accelerating Innovation with virtually unl...High Performance Computing on AWS: Accelerating Innovation with virtually unl...
High Performance Computing on AWS: Accelerating Innovation with virtually unl...
 
AWS Cloud Kata | Manila - Getting to Profitability on AWS
AWS Cloud Kata | Manila - Getting to Profitability on AWSAWS Cloud Kata | Manila - Getting to Profitability on AWS
AWS Cloud Kata | Manila - Getting to Profitability on AWS
 
Amf304 optimizing-design-and-e-660cc73d-5c4c-4331-8f59-48cccdc1b7f4-135588426...
Amf304 optimizing-design-and-e-660cc73d-5c4c-4331-8f59-48cccdc1b7f4-135588426...Amf304 optimizing-design-and-e-660cc73d-5c4c-4331-8f59-48cccdc1b7f4-135588426...
Amf304 optimizing-design-and-e-660cc73d-5c4c-4331-8f59-48cccdc1b7f4-135588426...
 
AMF304-Optimizing Design and Engineering Performance in the Cloud for Manufac...
AMF304-Optimizing Design and Engineering Performance in the Cloud for Manufac...AMF304-Optimizing Design and Engineering Performance in the Cloud for Manufac...
AMF304-Optimizing Design and Engineering Performance in the Cloud for Manufac...
 
AWS Cloud for HPC and Big Data
AWS Cloud for HPC and Big DataAWS Cloud for HPC and Big Data
AWS Cloud for HPC and Big Data
 
AWS APAC Webinar Series: How to Reduce Your Spend on AWS
AWS APAC Webinar Series: How to Reduce Your Spend on AWSAWS APAC Webinar Series: How to Reduce Your Spend on AWS
AWS APAC Webinar Series: How to Reduce Your Spend on AWS
 
How to Reduce your Spend on AWS
How to Reduce your Spend on AWSHow to Reduce your Spend on AWS
How to Reduce your Spend on AWS
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics Platform
 
AWS Summit Stockholm 2014 – B5 – The TCO of cloud applications
AWS Summit Stockholm 2014 – B5 – The TCO of cloud applicationsAWS Summit Stockholm 2014 – B5 – The TCO of cloud applications
AWS Summit Stockholm 2014 – B5 – The TCO of cloud applications
 
High Performance Computing with AWS
High Performance Computing with AWSHigh Performance Computing with AWS
High Performance Computing with AWS
 
AWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data AnalyticsAWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data Analytics
 
AWS Summit Stockholm 2014 – T3 – disaster recovery on AWS
AWS Summit Stockholm 2014 – T3 – disaster recovery on AWSAWS Summit Stockholm 2014 – T3 – disaster recovery on AWS
AWS Summit Stockholm 2014 – T3 – disaster recovery on AWS
 
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & AlluxioUltra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
 

More from Arnon Rotem-Gal-Oz

Things to think about while architecting azure solutions
Things to think about while architecting azure solutionsThings to think about while architecting azure solutions
Things to think about while architecting azure solutions
Arnon Rotem-Gal-Oz
 

More from Arnon Rotem-Gal-Oz (20)

Taking ML to production - a journey
Taking ML to production - a journeyTaking ML to production - a journey
Taking ML to production - a journey
 
Apache spark
Apache sparkApache spark
Apache spark
 
Fallacies of Distributed Computing
Fallacies of Distributed Computing Fallacies of Distributed Computing
Fallacies of Distributed Computing
 
Docker & Kubernetes intro
Docker & Kubernetes introDocker & Kubernetes intro
Docker & Kubernetes intro
 
Docker Intro
Docker IntroDocker Intro
Docker Intro
 
Data security @ the personal level
Data security @ the personal levelData security @ the personal level
Data security @ the personal level
 
Microservices - it's déjà vu all over again
Microservices  - it's déjà vu all over againMicroservices  - it's déjà vu all over again
Microservices - it's déjà vu all over again
 
Distilling insights @ AppsFlyer
Distilling insights @ AppsFlyerDistilling insights @ AppsFlyer
Distilling insights @ AppsFlyer
 
Distilling Insights @ Appsflyer (Data Architecture)
Distilling Insights @ Appsflyer (Data Architecture)Distilling Insights @ Appsflyer (Data Architecture)
Distilling Insights @ Appsflyer (Data Architecture)
 
Big data Overview
Big data OverviewBig data Overview
Big data Overview
 
Hadoop YARN overview
Hadoop YARN overviewHadoop YARN overview
Hadoop YARN overview
 
SAF
SAFSAF
SAF
 
REST presentation
REST presentationREST presentation
REST presentation
 
SOA & Big Data
SOA & Big DataSOA & Big Data
SOA & Big Data
 
Why the JVM?
Why the JVM?Why the JVM?
Why the JVM?
 
Building reliable systems from unreliable components
Building reliable systems from unreliable componentsBuilding reliable systems from unreliable components
Building reliable systems from unreliable components
 
Azure migration
Azure migrationAzure migration
Azure migration
 
Things to think about while architecting azure solutions
Things to think about while architecting azure solutionsThings to think about while architecting azure solutions
Things to think about while architecting azure solutions
 
Soa
Soa Soa
Soa
 
Rest
RestRest
Rest
 

Recently uploaded

Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Peter Udo Diehl
 

Recently uploaded (20)

IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024
 
Strategic AI Integration in Engineering Teams
Strategic AI Integration in Engineering TeamsStrategic AI Integration in Engineering Teams
Strategic AI Integration in Engineering Teams
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
 
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
Choosing the Right FDO Deployment Model for Your Application _ Geoffrey at In...
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
 
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi IbrahimzadeFree and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at Comcast
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoft
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. Startups
 

Big data in the cloud - welcome to cost oriented design

Editor's Notes

  1. Head of devops found it amusing * So Caveat emptor – buyers beware
  2. I met Alon Fliess CTO CodeValue who had this neat idea Pay as you go Unfortunetly for cloudoscope the main (by a long shot) cost factor is VM up time - so we abandoned it But some of the concepts and ideas we had when we’ve built that stuck in my mine
  3. There are many concerns “quality attributes” – the solution cost is one of them we will see is how big data changes this and makes cost orientation something to “worry” about I’ll present how cost orientation in the cloud is more complex I will demonstrate that using several examples from the day to day life First lets see how does it work with ”traditional”, on-premise
  4. ~7500 MB/Sec on performance POC Actual workloads are more taxing
  5. There’s a difference between the syntetic loads in POC to the ones in a real system But you get more experienced with the technology and you can squeeze it for more So that’s big data and cost on-premist…
  6. Let’s meet AppsFlyer
  7. The system handles very high loads Generating even more messages that it gets Data sinks in in two areas
  8. 5+ Billion sessions/day (5 out of 7 B events) 30+ million installs a day
  9. The ball grows bigger…
  10. The bill gets bigger as you get more traffic – maybe it is time to reconsider
  11. Redshift challenges concurrency in the face of more reports 1 week activity = 3 months retention (@ same granularity level)
  12. Cost oriented partitioning in Google BigQuery move from default by month by event To by account by day
  13. Need to understand what our customers actually need – the technical solution might be way simpler Offloading from on-demand to scheduled reports Get all clients data together (one pass over data) – then break it Store for long term on S3 (clients can take data when they need it) Auto expire (don’t let it grow too much)
  14. Cost orientation not equal cheap – just (hopefully) cheaper Small cost * big number = big bucks Understand the cost structure of the services you use Fixed ? Relative? Understand what users are actually looking for Evolve the solution