Developers are moving away from host-based patterns and adopting a new mindset: the datacenter is the computer. It's quickly becoming a mainstream model to view a warehouse full of servers as a single computer (with terabytes of memory and tens of thousands of cores). The key missing piece is an operating system for the datacenter (DCOS), which would provide across thousands of machines the same functionality and core abstractions that an OS provides on a single machine today. In this session, we will discuss:
How the abstraction of an OS has evolved over time and can cleanly scale to span thousands of machines in a datacenter.
How open source technologies like the Apache Mesos distributed systems kernel provide the key underpinnings for a DCOS.
How developers can layer core system services on top of a distributed systems kernel, including an init system (Marathon), cron (Chronos), service discovery (DNS), and storage (HDFS).
What would the interface to the DCOS look like? How would you use it?
How you would install and operate datacenter services, including Apache Spark, Apache Cassandra, Apache Kafka, Apache Hadoop, Apache YARN, Apache HDFS, and Google's Kubernetes.
How will developers build datacenter-scale apps, programmed against the datacenter OS like it's a single machine?
Strata SC 2014: Apache Mesos as an SDK for Building Distributed Frameworks (Paco Nathan)
O'Reilly Media - Strata SC 2014
Apache Mesos is an open source cluster manager that provides efficient resource isolation for distributed frameworks—similar to Google’s “Borg” and “Omega” projects for warehouse scale computing. It is based on isolation features in the modern kernel: “cgroups” in Linux, “zones” in Solaris.
Google’s “Omega” research paper shows that while 80% of the jobs on a given cluster may be batch (e.g., MapReduce), 55-60% of cluster resources go toward services. The batch jobs on a cluster are the easy part; services are much more complex to schedule efficiently. However, by mixing workloads, overall resource scheduling can be greatly improved.
Given the use of Mesos as the kernel for a “data center OS”, two additional open source components, Chronos (like Unix “cron”) and Marathon (like Unix “init.d”), serve as the building blocks for creating distributed, fault-tolerant, highly-available apps at scale.
This talk will examine case studies of Mesos use in production at scale, ranging from Twitter (100% on prem) to Airbnb (100% cloud), plus MediaCrossing, Categorize, HubSpot, etc. How have these organizations leveraged Mesos to build better, more scalable and efficient distributed apps? Lessons from the Mesos developer community show that one can port an existing framework with a wrapper in approximately 100 lines of code. Moreover, an important lesson from Spark is that, based on “data center OS” building blocks, one can rewrite a distributed system much like Hadoop to be 100x faster with a relatively small amount of source code.
These case studies illustrate the obvious benefits over prior approaches based on virtualization: scalability, elasticity, fault-tolerance, high availability, improved utilization rates, etc. Less obvious outcomes also include: reduced time for engineers to ramp-up new services at scale; reduced latency between batch and services, enabling new high-ROI use cases; and enabling dev/test apps to run on a production cluster without disrupting operations.
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* … (DataStax)
Traditionally, machines were statically partitioned across the different services at Uber. In an effort to increase machine utilization, Uber has recently started transitioning most of its services, including storage services, to run on top of Mesos. This presentation will describe the initial experience building and operating a framework for running Cassandra on top of Mesos across multiple datacenters at Uber. The framework automates several Cassandra operations, such as node repairs, addition of new nodes, and backup/restore. It improves efficiency by co-locating CPU-intensive services as well as multiple Cassandra nodes on the same Mesos agent. It handles failure and restart of Mesos agents by using persistent volumes and dynamic reservations. This talk includes statistics about the number of Cassandra clusters in production; the time taken to start a new cluster, add a new node, and detect a node failure; and the observed Cassandra query throughput and latency.
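As a hedged sketch of the persistent-volume mechanism mentioned above: a Mesos framework reserves disk for its role and creates a persistent volume by accepting an offer with RESERVE and CREATE operations. The role, principal, sizes, and ids below are illustrative, not Uber's actual configuration:

import java.util.Arrays;
import org.apache.mesos.SchedulerDriver;
import org.apache.mesos.Protos.Filters;
import org.apache.mesos.Protos.Offer;
import org.apache.mesos.Protos.Offer.Operation;
import org.apache.mesos.Protos.Resource;
import org.apache.mesos.Protos.Value;
import org.apache.mesos.Protos.Volume;

public class PersistentVolumes {
  // Reserve 1 GB of disk for the "cassandra" role and create a persistent
  // volume on it, so data survives agent restarts and task failures.
  static void reserveAndCreate(SchedulerDriver driver, Offer offer) {
    Resource disk = Resource.newBuilder()
        .setName("disk")
        .setType(Value.Type.SCALAR)
        .setScalar(Value.Scalar.newBuilder().setValue(1024))
        .setRole("cassandra")                                   // illustrative role
        .setReservation(Resource.ReservationInfo.newBuilder()
            .setPrincipal("cassandra-framework"))               // illustrative principal
        .build();
    Resource volume = disk.toBuilder()
        .setDisk(Resource.DiskInfo.newBuilder()
            .setPersistence(Resource.DiskInfo.Persistence.newBuilder()
                .setId("cassandra-vol-1"))                      // illustrative volume id
            .setVolume(Volume.newBuilder()
                .setContainerPath("data")
                .setMode(Volume.Mode.RW)))
        .build();
    // Accept the offer with two operations: RESERVE the disk, then CREATE the volume.
    driver.acceptOffers(
        Arrays.asList(offer.getId()),
        Arrays.asList(
            Operation.newBuilder().setType(Operation.Type.RESERVE)
                .setReserve(Operation.Reserve.newBuilder().addResources(disk)).build(),
            Operation.newBuilder().setType(Operation.Type.CREATE)
                .setCreate(Operation.Create.newBuilder().addVolumes(volume)).build()),
        Filters.getDefaultInstance());
  }
}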
About the Speaker
Abhishek Verma Software Engineer, Uber
Dr. Abhishek Verma is currently working on running Cassandra on top of Mesos at Uber. Prior to this, he worked on BorgMaster at Google and was the first author of the Borg paper published in EuroSys 2015. He received an MS in 2010 and a PhD in 2012 in Computer Science from the University of Illinois at Urbana-Champaign, during which he authored more than 20 publications in conferences, journals, and books and gave dozens of talks.
What is Apache Mesos and how to use it: a short introduction to distributed fault-tolerant systems using ZooKeeper and Mesos. #installfest Prague 2014
Supporting bioinformatics applications with hybrid multi-cloud services (Ahmed Abdullah)
ElasticHPC supports the creation and management of cloud computing resources over multiple public cloud providers, including Amazon, Azure, Google, and clouds supporting OpenStack.
A brief introduction to the Hadoop distributed file system: how a file is broken into blocks, written, and replicated on HDFS; how missing replicas are taken care of; how a job is launched and its status checked; and some advantages and disadvantages of HDFS 1.x.
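A hedged sketch of the client-side view described above, using the standard Hadoop FileSystem API (the namenode address and paths are placeholders): the client reads and writes a plain byte stream while HDFS handles block splitting and replication behind the scenes.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://namenode:8020");  // placeholder namenode address
    FileSystem fs = FileSystem.get(conf);
    Path path = new Path("/user/demo/hello.txt");      // placeholder path
    // Write: HDFS splits the stream into blocks and replicates each block
    // across datanodes transparently.
    try (FSDataOutputStream out = fs.create(path)) {
      out.writeUTF("hello, HDFS");
    }
    // Read: the client fetches blocks from whichever replicas are available.
    try (FSDataInputStream in = fs.open(path)) {
      System.out.println(in.readUTF());
    }
  }
}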
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS (Arun Prasath)
Achieving QoS in multi-tenant cloud platforms is still a difficult task, and many companies follow different approaches to solve this problem. In this document I architect a simple solution for achieving different QoS for different tenants in a multi-tenant cloud environment, based on my experiments with containers, Docker, and cgroups on OpenStack.
One Billion Black Friday Shoppers on a Distributed Data Store (Fahd Siddiqui, …) (DataStax)
EmoDB is an open source RESTful data store built on top of Cassandra that stores JSON documents and, most notably, offers a databus that allows subscribers to watch for changes to those documents in real time. It features massive non-blocking global writes, asynchronous cross-data-center communication, and schema-less JSON content.
For non-blocking global writes, we created a “JSON delta” specification that defines incremental updates to any JSON document. Each Cassandra row is thus a sequence of deltas that serves as a conflict-free replicated data type (CRDT) for EmoDB's system of record. We introduce the concept of “distributed compactions” to frequently compact these deltas for efficient reads.
Finally, the databus forms a crucial piece of our data infrastructure and offers a change queue to real-time streaming applications.
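The row-as-a-sequence-of-deltas idea can be sketched in a few lines of plain Java. This is an illustrative model only, not EmoDB's actual delta format or API:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeltaRow {
  // Hypothetical model: a delta is a partial map of field updates.
  private final List<Map<String, Object>> deltas = new ArrayList<>();

  // Non-blocking write: just append a delta, never read-modify-write.
  void append(Map<String, Object> delta) { deltas.add(delta); }

  // Resolve: apply deltas in order; last writer wins per field.
  Map<String, Object> resolve() {
    Map<String, Object> doc = new HashMap<>();
    for (Map<String, Object> delta : deltas) doc.putAll(delta);
    return doc;
  }

  // Compaction: collapse the delta history into one delta for cheap reads.
  void compact() {
    Map<String, Object> merged = resolve();
    deltas.clear();
    deltas.add(merged);
  }
}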
About the Speaker
Fahd Siddiqui Lead Software Engineer, Bazaarvoice
Fahd Siddiqui is a Lead Software Engineer at Bazaarvoice on the data infrastructure team. His interests include highly scalable and distributed data systems. He holds a Master's degree in Computer Engineering from the University of Texas at Austin, and frequently speaks at the Austin C* User Group. About Bazaarvoice: Bazaarvoice is a network that connects brands and retailers to the authentic voices of people where they shop. More at www.bazaarvoice.com
Distributed Storage and Compute With Ceph's librados (Vault 2015) (Sage Weil)
The Ceph distributed storage system sports object, block, and file interfaces to a single storage cluster. These interfaces are built on a distributed object storage and compute platform called RADOS, which exports a conceptually simple yet powerful interface for storing and processing large amounts of data and is well-suited for backing web-scale applications and data analytics. It features a rich object model, efficient key/value storage, atomic transactions (including efficient compare-and-swap semantics), object cloning and other primitives for supporting snapshots, simple inter-client communication and coordination (à la ZooKeeper), and the ability to extend the object interface using arbitrary code executed on the storage node. This talk will focus on the librados API, how it is used, the security model, and some examples of RADOS classes implementing interesting functionality.
BlueStore: a new, faster storage backend for Ceph (Sage Weil)
Traditionally, Ceph has made use of local file systems like XFS or btrfs to store its data. However, the mismatch between the OSD's requirements and the POSIX interface provided by kernel file systems has a huge performance cost and requires a lot of complexity. BlueStore, an entirely new OSD storage backend, utilizes block devices directly, doubling performance for most workloads. This talk will cover the motivation for a new backend, the design and implementation, the improved performance on HDDs, SSDs, and NVMe, and some of the thornier issues we had to overcome when replacing tried-and-true kernel file systems with entirely new code running in userspace.
How was SQL Server ported to Linux? The presentation goes through some of the concepts: SQLOS, Drawbridge, and containers. It shows the role of SQLPAL as a platform abstraction layer.
Building Web Scale Apps with Docker and Mesos by Alex Rukletsov (Mesosphere) (Docker, Inc.)
Operating apps at web scale has become the new normal, but has been out of reach for most companies. Join us as we show you how to deploy and manage your Docker containers at scale. See how easy it is to build highly-available, fault-tolerant web scale apps using Docker with the Mesos cluster scheduler. Docker plus Mesos is a new way to scale applications. Together they give you capabilities similar to Google’s Borg, the Googleplex’s secret weapon of scalability and fault tolerance.
SMACK Stack 1.0 has been Spark, Mesos, Akka, Cassandra, and Kafka working together as cohesive systems delivering solutions for different use cases. Haven't heard about it before? Oh man! Where have you been? https://www.google.com/search?q=smack+stack+1.0
With SMACK Stack 1.1 we go a step further: Streaming, Mesos, Analytics, Cassandra, and Kafka. Joe Stein will walk through in detail some of the viable options for streaming and analytics with Mesos, Kafka, and Cassandra.
OSDC 2016 - Mesos and the Architecture of the New Datacenter by Jörg Schad (NETWAYS)
Apache Mesos has the ability to run on any private or cloud instance, anywhere. In this talk, Jörg Schad (Software Developer at Mesosphere) will explain the momentum behind the “single computer” abstraction that has put Mesos at the center of one of the most exciting architecture shifts in recent information technology history. He will explain how Mesos is enabling application developers and devops to redefine their responsibilities and shorten the amount of time it takes to write and ship production code. Jörg will outline how Mesos is empowering the new class of “datacenter developers” to program directly against datacenter resources, and draw parallels to how Linux revolutionized the server industry.
There have been many changes in the use of container technology over the last year. Data from a recent survey demonstrates how those changes are manifesting themselves in terms of the tools and vendors being used to manage containers. In addition, details are provided about the products being used for storage, networking and containers as a service.
Information technology has led us into an era where the production, sharing, and use of information are part of everyday life, often without our being aware of it: it is now almost impossible not to leave a digital trail of many of the actions we perform every day, for example through digital content such as photos, videos, and blog posts, and everything that revolves around social networks (Facebook and Twitter in particular). Added to this, with the "internet of things" we see an increase in devices such as watches, bracelets, thermostats, and many other items able to connect to the network and therefore generate large data streams. This explosion of data justifies the birth of the term Big Data: data produced in large quantities, with remarkable speed, and in different formats, which requires processing technologies and resources that go far beyond conventional data management and storage systems. It is immediately clear that 1) data storage models based on the relational model, and 2) processing systems based on stored procedures and computations on grids, are not applicable in these contexts. Regarding point 1, RDBMSs, widely used for a great variety of applications, run into problems when the amount of data grows beyond certain limits. Scalability and implementation cost are only part of the disadvantages: very often, when faced with managing big data, variability, i.e. the lack of a fixed structure, also represents a significant problem. This has given a boost to the development of NoSQL databases. The website NoSQL Databases defines them as "Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open source and horizontally scalable." These databases are distributed, open source, horizontally scalable, without a predetermined schema (key-value, column-oriented, document-based, and graph-based), easily replicable, free of ACID constraints, and able to handle large amounts of data. They are integrated with processing tools based on the MapReduce paradigm proposed by Google in 2004. MapReduce, together with the open source Hadoop framework, represents the new model for distributed processing of large amounts of data, supplanting techniques based on stored procedures and computational grids (point 2). The relational model taught in basic database design courses has many limitations compared to the demands posed by new applications based on Big Data, which use NoSQL databases to store data and MapReduce to process large amounts of data.
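To make the MapReduce paradigm concrete, here is a minimal word-count sketch against the Hadoop MapReduce Java API; a standard illustration rather than course material, with input and output paths supplied as arguments.

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
  // Map phase: emit (word, 1) for every token in the input split.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts emitted for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) sum += val.get();
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);  // local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}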
Course Website http://pbdmng.datatoknowledge.it/
Contact me to download the slides
Over the last several years, we’ve seen the emergence of a new application architecture – dubbed “cloud native” – that is highly distributed, elastic, and composable, with the container as the modular compute abstraction. With that, a new breed of tools has emerged to help deploy, manage, and scale these applications. Cluster management, service discovery, scheduling, etc. – terms that previously were unknown or, at best, reserved for the realm of high-performance computing – are now becoming part of every IT organization’s lexicon. As the pace of innovation continues at breakneck speed, a taxonomy to help understand the elements of this new stack is helpful.
The “Cloud-Native” Ecosystem presentation is the result of many conversations with developers, CIOs, and founders who are playing a critical role in shaping this new application paradigm. It attempts to define the discrete components of the cloud-native stack and calls out the vendors, products, and projects that comprise the ecosystem.
Note, this is an ever-evolving document that’s meant to be collaborative and is by no means a hardened or exhaustive industry landscape. If you have suggestions, edits and/or know of products or companies that should be included please don’t hesitate to get in touch.
Choosing PaaS: Cisco and Open Source Options: an overviewCisco DevNet
A session in the DevNet Zone at Cisco Live, Berlin. Confused by all the open source PaaS options out there? What criteria should you use to evaluate them? We seek to answer these questions in a systematic manner and will explore top technologies such as Mesos, Apprenda, Cloud Foundry and Kubernetes along with Cisco's Project Shipped and open source Mantl. The aim of this session will be to shed light on which platforms add value to your needs, applications and workloads.
This session shows an overview of the features and architecture of SQL Server on Linux and Containers. It covers install, config, performance, security, HADR, Docker containers, and tools. Find the demos on http://aka.ms/bobwardms
A brief intro to using Apache Cassandra with Kubernetes. We also cover Docker, Docker Swarm, DC/OS, and some open source tools to help you get started.
OSDC 2015: Bernd Mathiske | Why the Datacenter Needs an Operating System (slide transcript)
3. A Slice of Google Tech Transfer History
2005: MapReduce -> Hadoop (Yahoo)
2007: Linux cgroups for lightweight isolation (Google)
2009: BigTable -> MongoDB
2009: “The Datacenter as a Computer” - Barroso, Hölzle (Google)
2009: Mesos - a distributed operating system kernel (UC Berkeley)
2010: Large scale production Mesos deployment (Twitter)
since 2010: Many more frameworks and quite a few meta-frameworks
5. Cluster Operating Systems (Hardware Clustering)
Researched since the 1980s
Trying to provide (the illusion of) a single system image
Aiming at HA, load balancing, location transparency (e.g. for storage)
Many systems: Amoeba, ChorusOS, GLUnix, Hurricane, MOSIX, Plan9, RHCS, Spring, Sprite, Sumo, QNX, Solaris MC, UnixWare, VAXclusters, …
Relatively low scale (up to 100s of nodes)
Complicated to manage, less dynamic than software clustering
6. From HPC Grid to Enterprise Cloud
Condor, LSF, Maui, Moab, Quartz, SLURM, …
Typically for batch jobs
Also cover services => SOA => more job schedulers
=> grid computing => grid middleware … => cloud stacks
7. From Server Virtualization to App Aggregation
[Diagram: In the client-server era (small apps, big servers), server virtualization packs many apps onto one big server; in the cloud era (big apps, small servers), app aggregation spreads one big app across many small servers.]
8. Cloud Computing
SaaS: Salesforce demonstrated success, then many followed
PaaS: Deis, Dotcloud, OpenShift, Heroku, Pivotal, Stackato, …
IaaS: AWS, Azure, DigitalOcean, GCE…
Private cloud stacks including IaaS: Eucalyptus, CloudStack, Joyent, OpenStack, SmartCloud, vSphere, …
9. Datacenter
✴ A facility used to house computer systems and associated components (e.g. networking, storage, cooling, sensors)
✴ In this talk we focus on how to manage and use a single production cluster of networked computers in a datacenter
✴ Such clusters range in size from 10s to 10000s of nodes
✴ Why should we and how can we end up with just one production cluster?
10. Datacenter Services
✴ LAMP (Linux, Apache, MySQL, PHP) or similar
✴ MEAN (MongoDB, Express.js, Angular.js, Node.js) or similar
✴ Cassandra, ElasticSearch, Exelixi, Hadoop, Hypertable, Jenkins, Kafka, MPI, Spark, Storm, SSSP, Torque, …
✴ Private PaaS: Deis, …
✴ …
14. Available Open Source Components
✴ 2-level scheduler: Apache Mesos
✴ Meta-frameworks / schedulers: Aurora, Chronos, Marathon, Kubernetes, Swarm, …
✴ Service discovery: Consul, HAProxy, Mesos DNS, …
✴ Highly available configuration: zk, etcd, …
✴ Storage: HDFS, Ceph, …
✴ Node OSs: lots of Linux variants
✴ Lots of app frameworks: Spark, Storm, Cassandra, Kafka, …
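As a hedged illustration of the service-discovery entry above: Mesos-DNS names tasks following the app.framework.mesos convention, so locating a service is an ordinary DNS lookup. The app name and domain below are placeholders.

import java.net.InetAddress;

public class ServiceLookup {
  public static void main(String[] args) throws Exception {
    // Mesos-DNS exposes <task>.<framework>.mesos, e.g. an app "webapp"
    // launched via Marathon; one A record per running task instance.
    for (InetAddress addr : InetAddress.getAllByName("webapp.marathon.mesos")) {
      System.out.println(addr.getHostAddress());
    }
  }
}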
15. 2-Level Scheduling
Scale: from 1 node to at least 10000s of nodes
Optimizing resource management
End-to-end principle: “application-specific functions ought to reside in the end nodes of a network rather than in intermediary nodes”
-> Requirement for general multi-tenancy
-> Requirement for having only one production cluster
17. Ways to Run an Application
1. Vanilla job
• Employ meta-framework for invocation: Chronos, Aurora, Kubernetes, …
2. Application of an adapted framework
• Hadoop, Spark, Storm, ElasticSearch, Cassandra, Kafka, many more…
3. Non-adapted services
• Employ meta-framework for invocation: Marathon, Aurora, Kubernetes, …
• Provide (select) a service discovery solution
4. Program your own scheduler (and executor)
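For case 3 above (non-adapted services), a minimal sketch of handing an app to Marathon via its /v2/apps REST endpoint, assuming a Marathon instance on localhost:8080; the app definition is a placeholder.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class MarathonLaunch {
  public static void main(String[] args) throws Exception {
    // A trivial Marathon app definition: run a shell loop on 2 instances.
    String app = "{"
        + "\"id\": \"/hello\","
        + "\"cmd\": \"while true; do echo hello; sleep 10; done\","
        + "\"cpus\": 0.1, \"mem\": 32, \"instances\": 2"
        + "}";
    HttpURLConnection conn = (HttpURLConnection)
        new URL("http://localhost:8080/v2/apps").openConnection();
    conn.setRequestMethod("POST");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    try (OutputStream out = conn.getOutputStream()) {
      out.write(app.getBytes("UTF-8"));
    }
    System.out.println("Marathon responded: " + conn.getResponseCode());  // 201 on success
  }
}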
18. The Mesos Framework API
✴ Currently like internal Mesos communication:
• protobuf messages over HTTP
✴ Soon:
• JSON messages over HTTP (stream)
=> no need to link against the binary Mesos library, and less to reimplement
ca. a dozen programming languages => any language
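A sketch of what the JSON-over-HTTP interaction looks like, assuming the /api/v1/scheduler endpoint this work led to; the master address and framework details are placeholders. SUBSCRIBE is the first call a framework makes, and the response is a long-lived stream of events (offers, status updates, …).

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class HttpSubscribe {
  public static void main(String[] args) throws Exception {
    String subscribe = "{"
        + "\"type\": \"SUBSCRIBE\","
        + "\"subscribe\": {\"framework_info\": {\"user\": \"foo\", \"name\": \"my-framework\"}}"
        + "}";
    HttpURLConnection conn = (HttpURLConnection)
        new URL("http://master:5050/api/v1/scheduler").openConnection();
    conn.setRequestMethod("POST");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    try (OutputStream out = conn.getOutputStream()) {
      out.write(subscribe.getBytes("UTF-8"));
    }
    System.out.println("HTTP " + conn.getResponseCode());  // 200 opens the event stream
  }
}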
19. How to implement a framework
✴ Scheduler interface: 1 half of 2-level scheduling
• The framework knows best when to do what with what kind of resources
• About a dozen callbacks, main functionality in 2 of them:
- receive resource offers
- receive task status updates
✴ Executor interface: task life-cycle management and monitoring
• Command line executor included in Mesos
• Docker executor included in Mesos
• Custom executors often not needed
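To complement the scheduler example on slide 21, a minimal custom-executor sketch against the Mesos Java bindings; task handling is schematic, and a real executor would do its work asynchronously.

import org.apache.mesos.Executor;
import org.apache.mesos.ExecutorDriver;
import org.apache.mesos.MesosExecutorDriver;
import org.apache.mesos.Protos.ExecutorInfo;
import org.apache.mesos.Protos.FrameworkInfo;
import org.apache.mesos.Protos.SlaveInfo;
import org.apache.mesos.Protos.TaskID;
import org.apache.mesos.Protos.TaskInfo;
import org.apache.mesos.Protos.TaskState;
import org.apache.mesos.Protos.TaskStatus;

public class MyExecutor implements Executor {
  public void launchTask(ExecutorDriver driver, TaskInfo task) {
    // Report RUNNING, do the (trivial) work, then report FINISHED.
    driver.sendStatusUpdate(TaskStatus.newBuilder()
        .setTaskId(task.getTaskId()).setState(TaskState.TASK_RUNNING).build());
    System.out.println("running " + task.getName());
    driver.sendStatusUpdate(TaskStatus.newBuilder()
        .setTaskId(task.getTaskId()).setState(TaskState.TASK_FINISHED).build());
  }
  // Remaining life-cycle callbacks left empty for brevity.
  public void registered(ExecutorDriver d, ExecutorInfo e, FrameworkInfo f, SlaveInfo s) {}
  public void reregistered(ExecutorDriver d, SlaveInfo s) {}
  public void disconnected(ExecutorDriver d) {}
  public void killTask(ExecutorDriver d, TaskID t) {}
  public void frameworkMessage(ExecutorDriver d, byte[] data) {}
  public void shutdown(ExecutorDriver d) {}
  public void error(ExecutorDriver d, String message) {}

  public static void main(String[] args) {
    new MesosExecutorDriver(new MyExecutor()).run();
  }
}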
21. Minimal Scheduler Implementation

import java.util.List;
import org.apache.mesos.Scheduler;
import org.apache.mesos.SchedulerDriver;
import org.apache.mesos.Protos.Filters;
import org.apache.mesos.Protos.Offer;
import org.apache.mesos.Protos.TaskInfo;
import org.apache.mesos.Protos.TaskStatus;

class MyFrameworkScheduler implements Scheduler {
  // … (other Scheduler callbacks elided)
  private TaskGenerator _taskGen;
  private Filters _filters;

  public void resourceOffers(SchedulerDriver driver, List<Offer> offers) {
    if (_taskGen.doneCreatingTasks()) {
      // Nothing left to launch: return every offer to the allocator.
      for (Offer offer : offers) {
        driver.declineOffer(offer.getId());
      }
    } else {
      // Turn each offer's resources into tasks and launch them.
      for (Offer offer : offers) {
        List<TaskInfo> taskInfos = _taskGen.generateTaskInfos(offer);
        driver.launchTasks(offer.getId(), taskInfos, _filters);
      }
    }
  }

  public void statusUpdate(SchedulerDriver driver, TaskStatus status) {
    _taskGen.observeTaskStatusUpdate(status);
    if (_taskGen.done()) {
      driver.stop();  // all tasks done: shut the framework down
    }
  }
  // …
}
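Wiring this scheduler into a running cluster is then a few lines with MesosSchedulerDriver; the ZooKeeper address is a placeholder.

import org.apache.mesos.MesosSchedulerDriver;
import org.apache.mesos.Protos.FrameworkInfo;
import org.apache.mesos.Protos.Status;

public class Main {
  public static void main(String[] args) {
    FrameworkInfo framework = FrameworkInfo.newBuilder()
        .setUser("")               // empty user: Mesos fills in the current user
        .setName("my-framework")
        .build();
    // Connect to the masters via ZooKeeper and hand over control.
    MesosSchedulerDriver driver = new MesosSchedulerDriver(
        new MyFrameworkScheduler(), framework, "zk://localhost:2181/mesos");
    System.exit(driver.run() == Status.DRIVER_STOPPED ? 0 : 1);
  }
}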
22. The Developer’s Perspective
✴ Focus on application logic, not datacenter structure
✴ Avoid networking-related code
✴ Reuse of built-in fault-tolerance and high availability
✴ Reuse distributed (infrastructure) frameworks (e.g., storage)
=> API, SDK for datacenter services
23. The Operations Engineer’s Perspective
✴ Ease of deployment/management
✴ Uniformity of deployment/management
✴ Hardware utilization rate
✴ Scaling up as business grows
✴ Scaling out sporadically
✴ Cost and time for moving to a different datacenter
✴ High availability and fault-tolerance of system services
✴ Monitoring
✴ Troubleshooting
24. Necessary Multi-Tenancy Features
Task containerization
Resource isolation
Resource and task attributes
Static and dynamic resource reservations
Reservation levels
Meta-frameworks
Dynamic scheduler update and reconfiguration
Security
26. Using Docker Containers in Mesos
[Diagram: process trees on the master and slave servers when a user requests a container.]

Mesos Master Server:
init
+ mesos-master
+ marathon

Mesos Slave Server:
init
+ docker
|  + lxc
|     + (user task, under container init system)
+ mesos-slave
   + /var/lib/mesos/executors/docker
      + docker run …

The container image is pulled from a Docker Registry; when a user requests a container, Mesos, LXC, and Docker are tied together for launch.
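At the framework level, asking Mesos to launch a task in a Docker container amounts to attaching a ContainerInfo to the TaskInfo. A hedged sketch using the Mesos protobuf builders; the image is a placeholder, and taking the whole offer is deliberately naive.

import org.apache.mesos.Protos.CommandInfo;
import org.apache.mesos.Protos.ContainerInfo;
import org.apache.mesos.Protos.ContainerInfo.DockerInfo;
import org.apache.mesos.Protos.Offer;
import org.apache.mesos.Protos.TaskID;
import org.apache.mesos.Protos.TaskInfo;

public class DockerTasks {
  // Build a TaskInfo that runs the given Docker image on the offered slave.
  static TaskInfo dockerTask(Offer offer, String image, String taskId) {
    return TaskInfo.newBuilder()
        .setName("docker-" + taskId)
        .setTaskId(TaskID.newBuilder().setValue(taskId))
        .setSlaveId(offer.getSlaveId())
        .setContainer(ContainerInfo.newBuilder()
            .setType(ContainerInfo.Type.DOCKER)
            .setDocker(DockerInfo.newBuilder().setImage(image)))
        .setCommand(CommandInfo.newBuilder().setShell(false))  // use the image's entrypoint
        .addAllResources(offer.getResourcesList())             // naive: consume the whole offer
        .build();
  }
}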
27. Other Schedulers as Meta-Frameworks in a 2-level Scheduler
YARN => https://github.com/mesos/myriad
Kubernetes => https://github.com/mesosphere/kubernetes-mesos
Swarm => Swarm on Mesos (new project)
=> run everything in one cluster
28. Myriad: Virtual YARN Clusters on Mesos
◦ POST /api/clusters: Registers a new YARN cluster
◦ GET /api/clusters: Lists all registered clusters
◦ GET /api/clusters/{clusterId}: Lists the cluster with {clusterId}
◦ PUT /api/clusters/{clusterId}/flexup: Expands the size of the cluster with {clusterId}
◦ PUT /api/clusters/{clusterId}/flexdown: Shrinks the size of the cluster with {clusterId}
◦ DELETE /api/clusters/{clusterId}: Unregisters the YARN cluster with {clusterId} and kills all its nodes
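A sketch of calling the flexup endpoint above from Java. The host, port, cluster id, and request body are assumptions for illustration; the slide does not show the payload schema.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class MyriadFlexUp {
  public static void main(String[] args) throws Exception {
    // Hypothetical payload: ask the cluster for one more NodeManager.
    String body = "{\"instances\": 1}";
    HttpURLConnection conn = (HttpURLConnection)
        new URL("http://myriad-host:8192/api/clusters/cluster-1/flexup").openConnection();
    conn.setRequestMethod("PUT");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body.getBytes("UTF-8"));
    }
    System.out.println("HTTP " + conn.getResponseCode());
  }
}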
[Diagram: the Myriad Scheduler runs inside the YARN ResourceManager (RM) on a Mesos master node; on each Mesos slave, a Myriad Executor launches a YARN NodeManager (NM) with resources such as 2.5 CPU / 2.5 GB (callout “1. Launch NodeManager”); a flexUp adds further NodeManager capacity, e.g. 2.0 CPU / 2.0 GB for containers C1 and C2.]
C2
31. The Application User’s Perspective
✴ Focus on apps, services, parameters, results
✴ Avoid dealing with datacenter operations/management
✴ Avoid adjusting system settings
✴ High availability
✴ Throughput
✴ Responsiveness
✴ Predictability
✴ Run everything I need
✴ Return on and safety of investment
32. The Datacenter is the new form factor
✴ 2-level scheduler => single production cluster
✴ scalability and portability => avoiding hardware/cloud lock-in
✴ built-in container support => running containers at scale
✴ automation => operator efficiency
✴ repositories => apps/services readily available
✴ API and SDK => productive/quick app/service development