SlideShare a Scribd company logo
Mesos
A Platform for Fine-Grained Resource Sharing in the Data Center
@PapersWeLoveSEA
@ankurcha - Ankur Chauhan
Benjamin Hindman, Andy konwinski, Matei Zaharia,
Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, Ion Stoica
Background
Rapid innovation in distributed/cluster
computing
● Hadoop, Spark, Flink, Tez, HDFS, Dryad etc
● Micro-services / web-services (long running)
● <Your custom framework> …
… There is a lot of wasted effort.
Problem
● Rapid innovation in cluster/dist. frameworks
● No single framework is optimal for all
workloads
● What do we want?
○ Run multiple frameworks on a single cluster (big)
■ Maximize utilisation of resources. (elastic)
■ Share data between frameworks.
Goal
Solution
Mesos Goals
● High utilization of resources
● Support diverse frameworks
● Scalability to 10,000s of nodes
● Reliability in face of failures
● Efficient with minimal overhead
● Highly available via leader election of
master.
Other benefits
● Run multiple instances of a framework
○ Isolate production and experimental tasks
○ Run multiple version of a framework
● Build special framework targeting a
particular domain
○ eg: spark
Mesos Architecture
● Resource Offers
○ Offer available resources to frameworks
● 2-level scheduler
○ Mesos takes the decision of which framework gets
the offer.
○ Frameworks decide whether to accept/reject offer.
● Fine grained sharing
○ Improved utilization, responsiveness, data locality
● Pluggable allocation module
Mesos Architecture
High level architecture
Design components
● Mesos Master
○ Runs the allocation module
● Mesos Slave
○ Offers resources
○ task/executor isolation
● Scheduler
○ Accepts/rejects resource offers
● Executor
○ Runs tasks
● Task
○ Unit of work
Mesos Architecture
● Making resource offers scalable and robust
○ Let schedulers define filters
○ Offered resources counted as allocations
○ Rescind offers if not accepted within timeout
● Fault tolerance
○ Zookeeper for leader election (hot-standby)
○ Mesos reports slave and executor failures to
scheduler
○ Minimal internal state in master
Resource offers
● Mesos decides how many
resources to offer to each
framework, based on
allocation policy (fair share),
while framework decide which
resource offers to accept and
which tasks to run on them.
● When a framework accepts an
offer, it passes Mesos a
description of the task (and
executor) to launch.
Analysis
● Resource offers work well when:
○ Frameworks can scale elastically.
○ Task durations are homogeneous.
○ Frameworks have many preferred nodes.
● Conditions hold true in many frameworks
(hadoop, dryad, spark … )
○ Work is divided into short tasks
○ Data is replicated across multiple nodes
Limitations of distributed scheduling
● Fragmentation
○ With heterogeneous resource demands, distributed
collection of frameworks may not be able to bin pack
optimally.
● Independent framework constraints
○ Esoteric dependency scenarios can only be satisfied
with a centralized scheduler
● Framework complexity
○ Need to deal with resource offers ( not onerous )
Implementation
● ~10,000 lines of C++
● libprocess - actor-based programming (akka
for c++)
● Cgroups / Docker for task isolation
● Zookeeper for leader election (HA)
● Ports / plugins
○ Hadoop - map reduce
○ Torque / MPI
○ Spark Framework - iterative jobs
Evaluation
Dynamic Resource
sharing
Evaluation
Data locality with resource offers
○ 16 Hadoop instances using 93 EC2 instances
○ 1.7x speedup with mesos
○ 97% data locality with 5sec delay scheduling
Evaluation
Mesos scalability
● 99 EC2 instances
● Scaled to 50k slaves, 200 frameworks, 100k tasks
Evaluation
● Failure recovery
○ Mean time to recovery was between 4-8 seconds
with 95% confidence of 3secs on either side.
● Performance isolation
○ Linux containers are not perfect at isolation
○ 30% increase in request latency vs 550% increase
during tests
■ apache server + cpu hog process
The end

More Related Content

What's hot

20220314 Amazon Linux2022 をさわってみた
20220314 Amazon Linux2022 をさわってみた20220314 Amazon Linux2022 をさわってみた
20220314 Amazon Linux2022 をさわってみた
Masaru Ogura
 
KAFKA 3.1.0.pdf
KAFKA 3.1.0.pdfKAFKA 3.1.0.pdf
KAFKA 3.1.0.pdf
wonyong hwang
 
Kubernetes Basics
Kubernetes BasicsKubernetes Basics
Kubernetes Basics
Eueung Mulyana
 
DevOps for database
DevOps for databaseDevOps for database
DevOps for database
Osama Mustafa
 
Docker & Kubernetes基礎
Docker & Kubernetes基礎Docker & Kubernetes基礎
Docker & Kubernetes基礎
Daisuke Hiraoka
 
What is Cloud Native Explained?
What is Cloud Native Explained?What is Cloud Native Explained?
What is Cloud Native Explained?
jeetendra mandal
 
Introducing github.com/open-cluster-management – How to deliver apps across c...
Introducing github.com/open-cluster-management – How to deliver apps across c...Introducing github.com/open-cluster-management – How to deliver apps across c...
Introducing github.com/open-cluster-management – How to deliver apps across c...
Michael Elder
 
Terraform modules and (some of) best practices
Terraform modules and (some of) best practicesTerraform modules and (some of) best practices
Terraform modules and (some of) best practices
Anton Babenko
 
Scrum and agile principles
Scrum and agile principles Scrum and agile principles
Scrum and agile principles
Ruben Canlas
 
Monolith to Microservices - O’Reilly Oscon
Monolith to Microservices - O’Reilly OsconMonolith to Microservices - O’Reilly Oscon
Monolith to Microservices - O’Reilly Oscon
Christopher Grant
 
Configuration Management in Ansible
Configuration Management in Ansible Configuration Management in Ansible
Configuration Management in Ansible
Bangladesh Network Operators Group
 
私たちはRESTCONFでネットワーク自動化的に何が嬉しくなるのか考えてみた
私たちはRESTCONFでネットワーク自動化的に何が嬉しくなるのか考えてみた私たちはRESTCONFでネットワーク自動化的に何が嬉しくなるのか考えてみた
私たちはRESTCONFでネットワーク自動化的に何が嬉しくなるのか考えてみた
akira6592
 
NGINX Back to Basics: Ingress Controller (Japanese Webinar)
NGINX Back to Basics: Ingress Controller (Japanese Webinar)NGINX Back to Basics: Ingress Controller (Japanese Webinar)
NGINX Back to Basics: Ingress Controller (Japanese Webinar)
NGINX, Inc.
 
Deploy 22 microservices from scratch in 30 mins with GitOps
Deploy 22 microservices from scratch in 30 mins with GitOpsDeploy 22 microservices from scratch in 30 mins with GitOps
Deploy 22 microservices from scratch in 30 mins with GitOps
Opsta
 
Formation Agile Scrum
Formation Agile ScrumFormation Agile Scrum
Formation Agile Scrum
Mohamed IBN ELAZZOUZI
 
Case Study: Migrating Hyperic from EJB to Spring from JBoss to Apache Tomcat
Case Study: Migrating Hyperic from EJB to Spring from JBoss to Apache TomcatCase Study: Migrating Hyperic from EJB to Spring from JBoss to Apache Tomcat
Case Study: Migrating Hyperic from EJB to Spring from JBoss to Apache Tomcat
VMware Hyperic
 
Microservices, Containers, Kubernetes, Kafka, Kanban
Microservices, Containers, Kubernetes, Kafka, KanbanMicroservices, Containers, Kubernetes, Kafka, Kanban
Microservices, Containers, Kubernetes, Kafka, Kanban
Araf Karsh Hamid
 
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー
NGINX, Inc.
 
Machine configoperatorのちょっとイイかもしれない話
Machine configoperatorのちょっとイイかもしれない話 Machine configoperatorのちょっとイイかもしれない話
Machine configoperatorのちょっとイイかもしれない話
Toshihiro Araki
 
Microservices Design Patterns
Microservices Design PatternsMicroservices Design Patterns
Microservices Design Patterns
Haim Michael
 

What's hot (20)

20220314 Amazon Linux2022 をさわってみた
20220314 Amazon Linux2022 をさわってみた20220314 Amazon Linux2022 をさわってみた
20220314 Amazon Linux2022 をさわってみた
 
KAFKA 3.1.0.pdf
KAFKA 3.1.0.pdfKAFKA 3.1.0.pdf
KAFKA 3.1.0.pdf
 
Kubernetes Basics
Kubernetes BasicsKubernetes Basics
Kubernetes Basics
 
DevOps for database
DevOps for databaseDevOps for database
DevOps for database
 
Docker & Kubernetes基礎
Docker & Kubernetes基礎Docker & Kubernetes基礎
Docker & Kubernetes基礎
 
What is Cloud Native Explained?
What is Cloud Native Explained?What is Cloud Native Explained?
What is Cloud Native Explained?
 
Introducing github.com/open-cluster-management – How to deliver apps across c...
Introducing github.com/open-cluster-management – How to deliver apps across c...Introducing github.com/open-cluster-management – How to deliver apps across c...
Introducing github.com/open-cluster-management – How to deliver apps across c...
 
Terraform modules and (some of) best practices
Terraform modules and (some of) best practicesTerraform modules and (some of) best practices
Terraform modules and (some of) best practices
 
Scrum and agile principles
Scrum and agile principles Scrum and agile principles
Scrum and agile principles
 
Monolith to Microservices - O’Reilly Oscon
Monolith to Microservices - O’Reilly OsconMonolith to Microservices - O’Reilly Oscon
Monolith to Microservices - O’Reilly Oscon
 
Configuration Management in Ansible
Configuration Management in Ansible Configuration Management in Ansible
Configuration Management in Ansible
 
私たちはRESTCONFでネットワーク自動化的に何が嬉しくなるのか考えてみた
私たちはRESTCONFでネットワーク自動化的に何が嬉しくなるのか考えてみた私たちはRESTCONFでネットワーク自動化的に何が嬉しくなるのか考えてみた
私たちはRESTCONFでネットワーク自動化的に何が嬉しくなるのか考えてみた
 
NGINX Back to Basics: Ingress Controller (Japanese Webinar)
NGINX Back to Basics: Ingress Controller (Japanese Webinar)NGINX Back to Basics: Ingress Controller (Japanese Webinar)
NGINX Back to Basics: Ingress Controller (Japanese Webinar)
 
Deploy 22 microservices from scratch in 30 mins with GitOps
Deploy 22 microservices from scratch in 30 mins with GitOpsDeploy 22 microservices from scratch in 30 mins with GitOps
Deploy 22 microservices from scratch in 30 mins with GitOps
 
Formation Agile Scrum
Formation Agile ScrumFormation Agile Scrum
Formation Agile Scrum
 
Case Study: Migrating Hyperic from EJB to Spring from JBoss to Apache Tomcat
Case Study: Migrating Hyperic from EJB to Spring from JBoss to Apache TomcatCase Study: Migrating Hyperic from EJB to Spring from JBoss to Apache Tomcat
Case Study: Migrating Hyperic from EJB to Spring from JBoss to Apache Tomcat
 
Microservices, Containers, Kubernetes, Kafka, Kanban
Microservices, Containers, Kubernetes, Kafka, KanbanMicroservices, Containers, Kubernetes, Kafka, Kanban
Microservices, Containers, Kubernetes, Kafka, Kanban
 
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー
講演資料: コスト最適なプライベートCDNを「NGINX」で実現するWeb最適化セミナー
 
Machine configoperatorのちょっとイイかもしれない話
Machine configoperatorのちょっとイイかもしれない話 Machine configoperatorのちょっとイイかもしれない話
Machine configoperatorのちょっとイイかもしれない話
 
Microservices Design Patterns
Microservices Design PatternsMicroservices Design Patterns
Microservices Design Patterns
 

Similar to Mesos - A Platform for Fine-Grained Resource Sharing in the Data Center

Hadoop on-mesos
Hadoop on-mesosHadoop on-mesos
Hadoop on-mesos
Henry Cai 蔡明航
 
Ceph Day Santa Clara: Keynote: Building Tomorrow's Ceph
Ceph Day Santa Clara: Keynote: Building Tomorrow's Ceph Ceph Day Santa Clara: Keynote: Building Tomorrow's Ceph
Ceph Day Santa Clara: Keynote: Building Tomorrow's Ceph
Ceph Community
 
Ceph Day NYC: Building Tomorrow's Ceph
Ceph Day NYC: Building Tomorrow's CephCeph Day NYC: Building Tomorrow's Ceph
Ceph Day NYC: Building Tomorrow's Ceph
Ceph Community
 
Distributed Resource Scheduling Frameworks
Distributed Resource Scheduling FrameworksDistributed Resource Scheduling Frameworks
Distributed Resource Scheduling Frameworks
VARUN SAXENA
 
Distributed Resource Scheduling Frameworks, Is there a clear Winner ?
Distributed Resource Scheduling Frameworks, Is there a clear Winner ?Distributed Resource Scheduling Frameworks, Is there a clear Winner ?
Distributed Resource Scheduling Frameworks, Is there a clear Winner ?
Naganarasimha Garla
 
Introduction to mesos
Introduction to mesosIntroduction to mesos
Introduction to mesos
Omid Vahdaty
 
Apache Mesos Overview and Integration
Apache Mesos Overview and IntegrationApache Mesos Overview and Integration
Apache Mesos Overview and Integration
Alex Baretto
 
London Ceph Day Keynote: Building Tomorrow's Ceph
London Ceph Day Keynote: Building Tomorrow's Ceph London Ceph Day Keynote: Building Tomorrow's Ceph
London Ceph Day Keynote: Building Tomorrow's Ceph
Ceph Community
 
Distributed Databases - Concepts & Architectures
Distributed Databases - Concepts & ArchitecturesDistributed Databases - Concepts & Architectures
Distributed Databases - Concepts & Architectures
Daniel Marcous
 
Apache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-AriApache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-Ari
Demi Ben-Ari
 
Design of Experiments on Federator Polystore Architecture
Design of Experiments on Federator Polystore ArchitectureDesign of Experiments on Federator Polystore Architecture
Design of Experiments on Federator Polystore Architecture
Luiz Henrique Zambom Santana
 
Red Hat Storage Day Seattle: Stretching A Gluster Cluster for Resilient Messa...
Red Hat Storage Day Seattle: Stretching A Gluster Cluster for Resilient Messa...Red Hat Storage Day Seattle: Stretching A Gluster Cluster for Resilient Messa...
Red Hat Storage Day Seattle: Stretching A Gluster Cluster for Resilient Messa...
Red_Hat_Storage
 
Mesos study report 03v1.2
Mesos study report  03v1.2Mesos study report  03v1.2
Mesos study report 03v1.2
Stefanie Zhao
 
Infinitely Scalable Clusters - Grid Computing on Public Cloud - London
Infinitely Scalable Clusters - Grid Computing on Public Cloud - LondonInfinitely Scalable Clusters - Grid Computing on Public Cloud - London
Infinitely Scalable Clusters - Grid Computing on Public Cloud - London
Hentsū
 
Musings on Mesos: Docker, Kubernetes, and Beyond.
Musings on Mesos: Docker, Kubernetes, and Beyond.Musings on Mesos: Docker, Kubernetes, and Beyond.
Musings on Mesos: Docker, Kubernetes, and Beyond.
Timothy St. Clair
 
Introduction into Ceph storage for OpenStack
Introduction into Ceph storage for OpenStackIntroduction into Ceph storage for OpenStack
Introduction into Ceph storage for OpenStack
OpenStack_Online
 
Big data talk barcelona - jsr - jc
Big data talk   barcelona - jsr - jcBig data talk   barcelona - jsr - jc
Big data talk barcelona - jsr - jc
James Saint-Rossy
 
Introduction to Apache Mesos
Introduction to Apache MesosIntroduction to Apache Mesos
Introduction to Apache Mesos
Morteza Zakeri
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservices
Bigstep
 
Java scalability considerations yogesh deshpande
Java scalability considerations   yogesh deshpandeJava scalability considerations   yogesh deshpande
Java scalability considerations yogesh deshpande
IndicThreads
 

Similar to Mesos - A Platform for Fine-Grained Resource Sharing in the Data Center (20)

Hadoop on-mesos
Hadoop on-mesosHadoop on-mesos
Hadoop on-mesos
 
Ceph Day Santa Clara: Keynote: Building Tomorrow's Ceph
Ceph Day Santa Clara: Keynote: Building Tomorrow's Ceph Ceph Day Santa Clara: Keynote: Building Tomorrow's Ceph
Ceph Day Santa Clara: Keynote: Building Tomorrow's Ceph
 
Ceph Day NYC: Building Tomorrow's Ceph
Ceph Day NYC: Building Tomorrow's CephCeph Day NYC: Building Tomorrow's Ceph
Ceph Day NYC: Building Tomorrow's Ceph
 
Distributed Resource Scheduling Frameworks
Distributed Resource Scheduling FrameworksDistributed Resource Scheduling Frameworks
Distributed Resource Scheduling Frameworks
 
Distributed Resource Scheduling Frameworks, Is there a clear Winner ?
Distributed Resource Scheduling Frameworks, Is there a clear Winner ?Distributed Resource Scheduling Frameworks, Is there a clear Winner ?
Distributed Resource Scheduling Frameworks, Is there a clear Winner ?
 
Introduction to mesos
Introduction to mesosIntroduction to mesos
Introduction to mesos
 
Apache Mesos Overview and Integration
Apache Mesos Overview and IntegrationApache Mesos Overview and Integration
Apache Mesos Overview and Integration
 
London Ceph Day Keynote: Building Tomorrow's Ceph
London Ceph Day Keynote: Building Tomorrow's Ceph London Ceph Day Keynote: Building Tomorrow's Ceph
London Ceph Day Keynote: Building Tomorrow's Ceph
 
Distributed Databases - Concepts & Architectures
Distributed Databases - Concepts & ArchitecturesDistributed Databases - Concepts & Architectures
Distributed Databases - Concepts & Architectures
 
Apache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-AriApache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-Ari
 
Design of Experiments on Federator Polystore Architecture
Design of Experiments on Federator Polystore ArchitectureDesign of Experiments on Federator Polystore Architecture
Design of Experiments on Federator Polystore Architecture
 
Red Hat Storage Day Seattle: Stretching A Gluster Cluster for Resilient Messa...
Red Hat Storage Day Seattle: Stretching A Gluster Cluster for Resilient Messa...Red Hat Storage Day Seattle: Stretching A Gluster Cluster for Resilient Messa...
Red Hat Storage Day Seattle: Stretching A Gluster Cluster for Resilient Messa...
 
Mesos study report 03v1.2
Mesos study report  03v1.2Mesos study report  03v1.2
Mesos study report 03v1.2
 
Infinitely Scalable Clusters - Grid Computing on Public Cloud - London
Infinitely Scalable Clusters - Grid Computing on Public Cloud - LondonInfinitely Scalable Clusters - Grid Computing on Public Cloud - London
Infinitely Scalable Clusters - Grid Computing on Public Cloud - London
 
Musings on Mesos: Docker, Kubernetes, and Beyond.
Musings on Mesos: Docker, Kubernetes, and Beyond.Musings on Mesos: Docker, Kubernetes, and Beyond.
Musings on Mesos: Docker, Kubernetes, and Beyond.
 
Introduction into Ceph storage for OpenStack
Introduction into Ceph storage for OpenStackIntroduction into Ceph storage for OpenStack
Introduction into Ceph storage for OpenStack
 
Big data talk barcelona - jsr - jc
Big data talk   barcelona - jsr - jcBig data talk   barcelona - jsr - jc
Big data talk barcelona - jsr - jc
 
Introduction to Apache Mesos
Introduction to Apache MesosIntroduction to Apache Mesos
Introduction to Apache Mesos
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservices
 
Java scalability considerations yogesh deshpande
Java scalability considerations   yogesh deshpandeJava scalability considerations   yogesh deshpande
Java scalability considerations yogesh deshpande
 

Recently uploaded

International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
gerogepatton
 
New techniques for characterising damage in rock slopes.pdf
New techniques for characterising damage in rock slopes.pdfNew techniques for characterising damage in rock slopes.pdf
New techniques for characterising damage in rock slopes.pdf
wisnuprabawa3
 
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdfIron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
RadiNasr
 
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
IJECEIAES
 
basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
NidhalKahouli2
 
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have oneISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
Las Vegas Warehouse
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
kandramariana6
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
jpsjournal1
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Sinan KOZAK
 
Understanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine LearningUnderstanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine Learning
SUTEJAS
 
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball playEric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
enizeyimana36
 
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
Yasser Mahgoub
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
MDSABBIROJJAMANPAYEL
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
IJECEIAES
 
22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt
KrishnaveniKrishnara1
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
Victor Morales
 
Recycled Concrete Aggregate in Construction Part II
Recycled Concrete Aggregate in Construction Part IIRecycled Concrete Aggregate in Construction Part II
Recycled Concrete Aggregate in Construction Part II
Aditya Rajan Patra
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
IJECEIAES
 
Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
Hitesh Mohapatra
 
Heat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation pptHeat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation ppt
mamunhossenbd75
 

Recently uploaded (20)

International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
 
New techniques for characterising damage in rock slopes.pdf
New techniques for characterising damage in rock slopes.pdfNew techniques for characterising damage in rock slopes.pdf
New techniques for characterising damage in rock slopes.pdf
 
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdfIron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
 
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
 
basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
 
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have oneISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
 
Understanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine LearningUnderstanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine Learning
 
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball playEric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
 
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
 
22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
 
Recycled Concrete Aggregate in Construction Part II
Recycled Concrete Aggregate in Construction Part IIRecycled Concrete Aggregate in Construction Part II
Recycled Concrete Aggregate in Construction Part II
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
 
Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
 
Heat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation pptHeat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation ppt
 

Mesos - A Platform for Fine-Grained Resource Sharing in the Data Center

  • 1. Mesos A Platform for Fine-Grained Resource Sharing in the Data Center @PapersWeLoveSEA @ankurcha - Ankur Chauhan Benjamin Hindman, Andy konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, Ion Stoica
  • 2. Background Rapid innovation in distributed/cluster computing ● Hadoop, Spark, Flink, Tez, HDFS, Dryad etc ● Micro-services / web-services (long running) ● <Your custom framework> … … There is a lot of wasted effort.
  • 3. Problem ● Rapid innovation in cluster/dist. frameworks ● No single framework is optimal for all workloads ● What do we want? ○ Run multiple frameworks on a single cluster (big) ■ Maximize utilisation of resources. (elastic) ■ Share data between frameworks.
  • 6. Mesos Goals ● High utilization of resources ● Support diverse frameworks ● Scalability to 10,000s of nodes ● Reliability in face of failures ● Efficient with minimal overhead ● Highly available via leader election of master.
  • 7. Other benefits ● Run multiple instances of a framework ○ Isolate production and experimental tasks ○ Run multiple version of a framework ● Build special framework targeting a particular domain ○ eg: spark
  • 8. Mesos Architecture ● Resource Offers ○ Offer available resources to frameworks ● 2-level scheduler ○ Mesos takes the decision of which framework gets the offer. ○ Frameworks decide whether to accept/reject offer. ● Fine grained sharing ○ Improved utilization, responsiveness, data locality ● Pluggable allocation module
  • 9. Mesos Architecture High level architecture Design components ● Mesos Master ○ Runs the allocation module ● Mesos Slave ○ Offers resources ○ task/executor isolation ● Scheduler ○ Accepts/rejects resource offers ● Executor ○ Runs tasks ● Task ○ Unit of work
  • 10. Mesos Architecture ● Making resource offers scalable and robust ○ Let schedulers define filters ○ Offered resources counted as allocations ○ Rescind offers if not accepted within timeout ● Fault tolerance ○ Zookeeper for leader election (hot-standby) ○ Mesos reports slave and executor failures to scheduler ○ Minimal internal state in master
  • 11. Resource offers ● Mesos decides how many resources to offer to each framework, based on allocation policy (fair share), while framework decide which resource offers to accept and which tasks to run on them. ● When a framework accepts an offer, it passes Mesos a description of the task (and executor) to launch.
  • 12. Analysis ● Resource offers work well when: ○ Frameworks can scale elastically. ○ Task durations are homogeneous. ○ Frameworks have many preferred nodes. ● Conditions hold true in many frameworks (hadoop, dryad, spark … ) ○ Work is divided into short tasks ○ Data is replicated across multiple nodes
  • 13. Limitations of distributed scheduling ● Fragmentation ○ With heterogeneous resource demands, distributed collection of frameworks may not be able to bin pack optimally. ● Independent framework constraints ○ Esoteric dependency scenarios can only be satisfied with a centralized scheduler ● Framework complexity ○ Need to deal with resource offers ( not onerous )
  • 14. Implementation ● ~10,000 lines of C++ ● libprocess - actor-based programming (akka for c++) ● Cgroups / Docker for task isolation ● Zookeeper for leader election (HA) ● Ports / plugins ○ Hadoop - map reduce ○ Torque / MPI ○ Spark Framework - iterative jobs
  • 16. Evaluation Data locality with resource offers ○ 16 Hadoop instances using 93 EC2 instances ○ 1.7x speedup with mesos ○ 97% data locality with 5sec delay scheduling
  • 17. Evaluation Mesos scalability ● 99 EC2 instances ● Scaled to 50k slaves, 200 frameworks, 100k tasks
  • 18. Evaluation ● Failure recovery ○ Mean time to recovery was between 4-8 seconds with 95% confidence of 3secs on either side. ● Performance isolation ○ Linux containers are not perfect at isolation ○ 30% increase in request latency vs 550% increase during tests ■ apache server + cpu hog process