© 2017 Mesosphere, Inc. All Rights Reserved. 1
@joerg_schad
Nightmares of a Container Orchestration System
Jörg Schad, Technical Lead Community Projects, @joerg_schad
Jan Repnak, Support Engineer / Solution Architect, @jrx
3AM…
Scope of Jan’s nightmares…
• Modern App Components and Big Data + Analytics Engines: microservices (in containers), functions & logic, streaming, batch, machine learning, analytics, search, time series, SQL / NoSQL databases
• Datacenter Operating System (DC/OS)
• Distributed Systems Kernel (Apache Mesos)
• Any Infrastructure (Physical, Virtual, Cloud)
Container Orchestration
We deploy all our containers with our own scripts…
• Use an existing container scheduler
Containers (without Orchestration)
[Diagram: three machines on shared infrastructure, each with its own OS and container runtime hosting apps and services]
Container Orchestration
CONTAINER SCHEDULING
- Placement
- Replication/Scaling
- Resurrection
- Rescheduling
- Rolling Deployment
- Upgrades
- Downgrades
- Collocation
RESOURCE MANAGEMENT
- Memory
- CPU
- GPU
- Volumes
- Ports
- IPs
- Images/Artifacts
SERVICE MANAGEMENT
- Labels
- Groups/Namespaces
- Dependencies
- Load Balancing
- Readiness Checking
Container Orchestration
[Diagram: web apps & services run on scheduling, resource management, and service management layers, which sit on the container runtimes of each machine & OS, on top of the machine infrastructure]
Backup State
Why should we have a backup?
• Backup state
• Services
• Cluster
Monitoring
I check the cluster state every morning…
• Monitoring is key
• Detect problems early
• OOM
• Metrics
• Application
• Container
• Cluster
Networking
Networking is easy…
• Consider Networking Challenges
• Connectivity
• Load-Balancing
• Layer 4 vs 7
• Service Discovery
• Isolation
Immutable Container Images
• Use tagged container images
• Keep tagged images immutable!
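A minimal sketch of how a deployment script could reject unpinned image references before they reach the cluster. The function name `is_pinned` and the example registry host are hypothetical and not part of the talk; the check only looks at the reference string and does not contact a registry.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the reference carries a digest or an explicit tag other than 'latest'."""
    if "@" in image_ref:
        # A digest (repo/app@sha256:...) identifies exact image content.
        return True
    # Only inspect the last path component, so a registry port such as
    # "registry.example.com:5000" is not mistaken for a tag.
    last_component = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_component:
        # Without a tag, container runtimes fall back to 'latest'.
        return False
    tag = last_component.split(":", 1)[1]
    return tag not in ("", "latest")


# Hypothetical references, for illustration only.
candidates = [
    "registry.example.com:5000/shop/api:1.4.2",
    "registry.example.com:5000/shop/api",
    "nginx:latest",
]
unpinned = [ref for ref in candidates if not is_pinned(ref)]
```

Such a check only enforces that a tag is present; keeping the tag immutable still depends on registry policy and build discipline.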
Private Container Registries
Docker Hub works great for our test cluster…
• Use tagged container images
• Keep tagged images immutable!
• Use a private container registry!
Repeatable Container Builds
• Use repeatable builds for images
• Including FROM clause
• Keep images minimal
• Multistage build
• From scratch
`docker commit` is great*…
UI Deployments
• Use (Marathon) endpoints for deployments!
• Version (and track) your app definitions!
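As a sketch of deploying through the Marathon endpoint instead of the UI: the app definition would normally be loaded from a file under version control and submitted with a PUT to `/v2/apps/<app id>`. The Marathon URL and the helper name `build_deploy_request` are assumptions for illustration; the request is only built here, not sent.

```python
import json
import urllib.request


def build_deploy_request(marathon_url: str, app: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT request that creates or updates a Marathon app."""
    app_id = app["id"].lstrip("/")
    return urllib.request.Request(
        url=f"{marathon_url.rstrip('/')}/v2/apps/{app_id}",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Inline here for brevity; in practice this dict would come from a tracked JSON file.
app_definition = {"id": "/springboot-demo", "instances": 1}

# Hypothetical Marathon address.
request = build_deploy_request("http://marathon.example.com:8080", app_definition)

# Submitting is left commented out so the sketch has no side effects:
# urllib.request.urlopen(request)
```

Because the definition lives in a file, every change to the deployed app can be reviewed and rolled back like any other commit.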
CI/CD Deployments
[Diagram: Continuous Delivery Pipeline triggered by git push, with Version Control System, Continuous Integration, Artifact Repo & Container Registry, Container Orchestrator, Load Balancer, and Production Environment]
Disk Usage
All our disks are full…
• Docker and logs are great at filling up disk space!
• Images
• Containers
• Clean up Docker instances and images!
• docker system prune
• https://github.com/spotify/docker-gc
• Monitor available disk space!
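A minimal sketch of a disk-space check that a monitoring job could run on each node. The 80% threshold and the function name `disk_usage_alert` are assumptions; on a real agent the path would typically be the Docker data directory or the log volume rather than the current directory.

```python
import shutil


def disk_usage_alert(path: str, max_used_fraction: float = 0.8):
    """Return (used_fraction, alert) for the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total
    return used_fraction, used_fraction > max_used_fraction


# "." keeps the sketch runnable anywhere; point it at the volume you care about.
fraction, alert = disk_usage_alert(".")
if alert:
    print(f"Disk usage at {fraction:.0%}, time to clean up images and logs")
```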
Resource Constraints
32 MB is sufficient for a Docker container…
• Memory constraints are hard limits
• Consider overhead (e.g., Java)
• Difficult to approximate
• Monitor
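To illustrate why the overhead matters, here is a rough sketch of sizing a memory limit for a JVM container. The breakdown (heap, metaspace, thread stacks, other native memory) and all numbers are assumptions for illustration, not measurements; real values have to come from monitoring the running application.

```python
def estimate_container_memory_mb(
    heap_mb: int,
    metaspace_mb: int,
    threads: int,
    stack_mb_per_thread: int = 1,   # assumed stack size per thread
    native_overhead_mb: int = 64,   # assumed allowance for code cache, buffers, etc.
) -> int:
    """Rough lower bound for the memory limit of a JVM container, in MB."""
    return heap_mb + metaspace_mb + threads * stack_mb_per_thread + native_overhead_mb


# Hypothetical application: 512 MB heap, 128 MB metaspace, 50 threads.
limit_mb = estimate_container_memory_mb(heap_mb=512, metaspace_mb=128, threads=50)
print(limit_mb)  # 754
```

Setting the container limit equal to the heap size alone would leave no room for the remaining terms, and the hard limit would kill the container.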
ZooKeeper Cluster Size
Our ZooKeeper cluster has 4 nodes, that is better than 3, right?
• ZooKeeper ensemble size (i.e., #Masters) should be odd!
• In production, 5 is optimal!
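The arithmetic behind the odd-size rule can be sketched as follows, assuming the usual majority quorum: a cluster of n nodes stays available only while more than half of its nodes are up. The function names are illustrative.

```python
def quorum_size(nodes: int) -> int:
    """Smallest majority of a cluster with `nodes` members."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return nodes - quorum_size(nodes)


for nodes in (3, 4, 5):
    print(nodes, quorum_size(nodes), tolerated_failures(nodes))
# 3 2 1
# 4 3 1
# 5 3 2
```

Four nodes tolerate exactly one failure, the same as three, while adding one more machine that can fail. Five nodes tolerate two failures, which also allows taking one node down for maintenance without losing fault tolerance.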
Health Checks
What are health checks?
• Specify Health checks carefully
• Different options
• Mesos vs Marathon
• Command vs HTTP
• Impacts Load-Balancers and restarts
• Readiness checks
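A hedged sketch of the two health check styles as they might appear in a Marathon app definition. The `/health` path, the `curl` command, and all timing values are hypothetical choices; the right values depend on how long the application takes to start and how quickly failures should trigger a restart.

```python
import json

# HTTP check executed by Mesos on the agent (protocol MESOS_HTTP).
http_health_check = {
    "protocol": "MESOS_HTTP",
    "path": "/health",              # hypothetical endpoint of the app
    "portIndex": 0,
    "gracePeriodSeconds": 300,      # ignore failures while the app starts
    "intervalSeconds": 30,
    "timeoutSeconds": 10,
    "maxConsecutiveFailures": 3,    # restart after three failed checks
}

# Command check executed inside the task's environment.
command_health_check = {
    "protocol": "COMMAND",
    "command": {"value": "curl -f http://localhost:$PORT0/health"},
    "gracePeriodSeconds": 300,
    "intervalSeconds": 30,
    "timeoutSeconds": 10,
    "maxConsecutiveFailures": 3,
}


def with_health_checks(app: dict, checks: list) -> dict:
    """Return a copy of the app definition with the given health checks attached."""
    updated = dict(app)
    updated["healthChecks"] = list(checks)
    return updated


app = with_health_checks({"id": "/springboot-demo"}, [http_health_check, command_health_check])
print(json.dumps(app, indent=2))
```

A grace period that is too short for a slow-starting application leads to restart loops, which is one reason to specify these values carefully.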
NoSQL Datastores
We replaced our Postgres instance with Cassandra, and now we get stale results
• Consider the semantics of your datastore!
• ACID vs BASE
• Model your data and queries accordingly!
Removing Stateful Frameworks
• Follow the uninstall instructions!
• Reservations and Zookeeper state!
• state.json
Container vs VMs
We just replaced all our VM instances with containers*…
• Be aware of different isolation semantics!
Container Definition…
What is the difference between a container image and a container instance?
• container runtime* != container image != container instance
Container vs Container
{
  "id": "/springboot-demo",
  "cmd": "$JAVA_HOME/bin/java -jar MyApp.jar",
  "instances": 1,
  "fetch": [
    { "uri": "http://…/MyApp.jar" },
    { "uri": "https://.../jre-8u121-linux-x64.tar.gz" }
  ]
}
Write Once, Run Anywhere
The (Java) container was running fine in testing…
• Java (<9) is not cgroups aware
• # threads for GC
• …
• Set default values carefully
a.k.a. Understand Container Semantics
• Java (<9) is not cgroups aware
• # threads for GC
• …
• Set default values carefully
Java meets Container
• Development
• Java App packaged in a container
• Production
• 10 JVM containers on a 32-core box
– 10 * (32 cores are seen by each JRE)
– 10 * (32 threads set by default for ForkJoinPool)
– 10 * (32 threads ….)
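The arithmetic on this slide can be sketched as follows. It follows the slide's simplification that each JVM sizes a default thread pool to one thread per visible core; the even split of cores across containers in the second calculation is an assumption for comparison.

```python
HOST_CORES = 32
CONTAINERS = 10

# Each JVM sees all host cores, so every default pool is sized for 32.
threads_when_host_cores_visible = CONTAINERS * HOST_CORES

# If each JVM instead sized its pool to an even share of the cores
# (an assumed fair split, rounded down), the total stays near the core count.
cores_per_container = HOST_CORES // CONTAINERS
threads_when_share_visible = CONTAINERS * cores_per_container

print(threads_when_host_cores_visible)  # 320
print(threads_when_share_visible)       # 30
```

With 320 runnable threads per pool type competing for 32 cores, the containers spend much of their time context switching, which is why explicit pool and GC thread settings matter.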
Mesos Modules
To solve this problem, our team quickly developed this really cool Mesos Module…
• Mesos Modules can be tricky!
• Monitoring and Debugging…
Linux Distributions
We are using *obscure Linux distribution* for our (DC/OS) cluster
• If possible use tested distributions!
• Especially for DC/OS!
Services on the same node…
We are running *distributed Database* outside Mesos on the same cluster
• Be careful when running services outside Mesos but on the same cluster!
• Adjust resources accordingly!
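A minimal sketch of adjusting the resources an agent advertises when other services share the node. It subtracts what the outside service needs from the machine totals and formats the result in the `name:value;name:value` form used by the Mesos agent `--resources` flag. The function name and all numbers are hypothetical.

```python
def mesos_resources(
    total_cpus: int, total_mem_mb: int, total_disk_mb: int,
    reserved_cpus: int, reserved_mem_mb: int, reserved_disk_mb: int,
) -> str:
    """Format the resources left for Mesos after reserving capacity for other services."""
    cpus = total_cpus - reserved_cpus
    mem = total_mem_mb - reserved_mem_mb
    disk = total_disk_mb - reserved_disk_mb
    if min(cpus, mem, disk) < 0:
        raise ValueError("reserved capacity exceeds what the machine has")
    return f"cpus:{cpus};mem:{mem};disk:{disk}"


# Hypothetical 32-core, 128 GB node that also runs a database outside Mesos.
resources = mesos_resources(
    total_cpus=32, total_mem_mb=131072, total_disk_mb=1000000,
    reserved_cpus=8, reserved_mem_mb=32768, reserved_disk_mb=200000,
)
print(resources)  # cpus:24;mem:98304;disk:800000
```

Without such an adjustment the agent offers resources the outside service is already using, and tasks get scheduled onto capacity that does not exist.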
Spreading out Master Nodes
We are running our cluster across different AWS regions…
• Be careful when distributing Master nodes across high-latency links!
• Different AWS AZs are OK, different regions probably not!
Agent Attributes
We changed the agent attributes for a running cluster…
• Set agent attributes when starting an agent!
• Do not change them for running agents!
--attributes='rack:abc;zone:west;os:centos5;level:10;keys:[1000-1500]'
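As an illustration of guarding against accidental attribute changes, the sketch below parses an attribute string of the form shown above and compares it with the value an agent was originally started with. The helper names are hypothetical, values are kept as raw strings rather than typed, and the modified string is a made-up example.

```python
def parse_attributes(spec: str) -> dict:
    """Split 'name:value;name:value' into a dict, keeping values as raw strings."""
    attributes = {}
    for pair in spec.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        name, _, value = pair.partition(":")
        attributes[name] = value
    return attributes


def changed_attributes(old_spec: str, new_spec: str) -> set:
    """Names whose value differs, or that exist in only one of the two specs."""
    old, new = parse_attributes(old_spec), parse_attributes(new_spec)
    return {name for name in old.keys() | new.keys() if old.get(name) != new.get(name)}


started_with = "rack:abc;zone:west;os:centos5;level:10;keys:[1000-1500]"
about_to_use = "rack:abc;zone:east;os:centos5;level:10;keys:[1000-1500]"

print(changed_attributes(started_with, about_to_use))  # {'zone'}
```

A configuration-management step could refuse to restart an agent when this set is non-empty, instead of discovering the problem during task recovery.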
Cluster Upgrades
We upgraded our cluster and…*
• Check cluster state before upgrading
• Follow upgrade instructions!
• Automation
• Remember Backup!
UPGRADE PROCEDURE
[Diagram: a framework scheduler, two agents each running an executor with a task, three masters (one leader, two standby), and three ZK nodes]
Before upgrading
1. Make sure cluster is healthy!
2. Perform backup
   a. ZK
   b. Replicated logs
   c. Other state
3. Review release notes
4. Generate install bundle
   a. Validate versions
UPGRADE PROCEDURE
1. Master rolling upgrade
   a. Start with standby
   b. Uninstall DC/OS
   c. Install new DC/OS
2. Agent rolling upgrade
3. Framework upgrades
UPGRADE PROCEDURE
1. Master rolling upgrade
2. Agent rolling upgrade
   a. Uninstall DC/OS
   b. Install new DC/OS
3. Framework upgrades
UPGRADE PROCEDURE
1. Master rolling upgrade
2. Agent rolling upgrade
3. Framework upgrades
   a. Orthogonal to DC/OS
   b. Ensure changes don’t affect existing apps
Software Upgrades
We have automatic updates enabled for Docker…
• Follow upgrade instructions!
• Backup!
• Explicit control of versions!
Day 2 Operations
Our POC app is deployed in our production environment, time for vacation…
• Day 2 operations are the actually challenging part!
Keep it running!
Day 2 Operations
● Configuration Updates (ex: Scaling, reconfiguration)
● Binary Upgrades
● Cluster Maintenance (ex: Backup, Restore, Restart)
● Monitor progress of operations
● Debug any runtime blockages
METRICS
• Measurements captured to determine health and performance of cluster
- How utilized is the cluster?
- Are resources being optimally used?
- Is the system performing better or worse over time?
- Are there bottlenecks in the system?
- What is the response time of applications?
DC/OS METRIC SOURCES
[Diagram: apps running in containers, on top of Mesos, on top of the OS]
• Mesos metrics
– Resource, frameworks, masters, agents, tasks, system, events
• Container Metrics
– CPU, mem, disk, network
• Application Metrics
– QPS, latency, response time, hits, active users, errors
Production Checklist
MESOS CHECKLIST
❏ Monitor both Masters and Agents for flapping (i.e., continuously restarting). This can be accomplished by using the `uptime` metric.
❏ Monitor the rate of changes in terminal task states, including TASK_FAILED, TASK_LOST, and TASK_KILLED
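Both checks can be sketched on top of periodically polled metric values. The sketch assumes the samples were already collected (for example from an uptime gauge and from the task-state counters exposed by Mesos); the function names, sample values, and polling interval are illustrative.

```python
def count_restarts(uptime_samples: list) -> int:
    """Count restarts: the uptime value drops whenever the process was restarted."""
    return sum(
        1
        for previous, current in zip(uptime_samples, uptime_samples[1:])
        if current < previous
    )


def rate_per_minute(counter_samples: list, interval_seconds: float) -> list:
    """Per-minute increase between consecutive readings of a growing counter."""
    return [
        max(current - previous, 0) * 60 / interval_seconds
        for previous, current in zip(counter_samples, counter_samples[1:])
    ]


# Hypothetical uptime readings in seconds, polled once a minute.
uptime = [100, 160, 20, 80, 5]
print(count_restarts(uptime))  # 2

# Hypothetical readings of a failed-task counter, polled every 30 seconds.
failed_tasks = [10, 12, 18]
print(rate_per_minute(failed_tasks, interval_seconds=30))  # [4.0, 12.0]
```

An alert would then fire when the restart count within a window, or the failure rate, exceeds a threshold chosen for the cluster.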
MESOS MASTER CHECKLIST
❏ Use five master instances in production. Three is sufficient for HA in staging/test
❏ Place masters on separate racks, if possible
❏ Secure the teardown endpoints to prevent accidental framework removal.
MESOS AGENT CHECKLIST
❏ Set agent attributes before you run anything on the cluster. Once an agent is started, changing the attributes may break recovery of running tasks in the event of a restart. See also https://issues.apache.org/jira/browse/MESOS-1739.
❏ Explicitly set the resources on the nodes to leave capacity for other services running there outside of Mesos control. For example, HDFS processes running alongside Mesos.
ZOOKEEPER CHECKLIST
❏ Run with security and ACLs; see the `--zk=` and `--master=` flags on the master and agents respectively. If you do enable ACLs, they must be enabled before nodes are created in ZK.
❏ Back up ZooKeeper snapshots and logs at regular intervals.
❏ Guano or zkConfig.py (Want Snapshots + Transaction Log)
❏ Marathon, Chronos, and other frameworks store state in ZK. The first Marathon should store state in the same ZK as Mesos master.
❏ Userland apps should NOT store state in the ZK cluster shared by Mesos and Marathon. Examples of userland apps include Storm, service discovery tools, and additional instances of Marathon and Chronos.
ZOOKEEPER CHECKLIST
❏ Monitor ZK's JVM metrics, such as heap usage, GC pause times, and full-collection frequency.
❏ Monitor ZK for: number of client connections, total number of znodes, size of znodes (min, max, avg, 99th percentile), and read/write performance metrics
CODEMOTION MILAN
November 10/11th, 2017
See you in Amsterdam, 8./9. May 2018
https://amsterdam2018.codemotionworld.com/
ANY QUESTIONS?
@joerg_schad

Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019Codemotion
 
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019Codemotion
 

Webinar - Nightmares of a Container Orchestration System - Jorg Schad

  • 1. © 2017 Mesosphere, Inc. All Rights Reserved. 1 @joerg_schad Nightmares of a Container Orchestration System
  • 2. © 2018 Mesosphere, Inc. All Rights Reserved. 2 Jörg Schad Technical Lead Community Projects @joerg_schad Jan Repnak Support Engineer/ Solution Architect @jrx
  • 3. 3AM…
  • 4. Scope of Jan's nightmares… [Diagram, top to bottom: Modern App Components (Big Data + Analytics Engines; Microservices in containers; Streaming, Batch, Machine Learning, Analytics, Functions & Logic, Search, Time Series, SQL / NoSQL Databases) | Datacenter Operating System (DC/OS) | Distributed Systems Kernel (Apache Mesos) | Any Infrastructure (Physical, Virtual, Cloud)]
  • 5. Container Orchestration: "We deploy all our containers with our own scripts…" • Use an existing container scheduler
  • 6. Containers (without Orchestration) [Diagram: machines on the infrastructure, each with an OS and a container runtime hosting apps and services]
  • 7. Container Orchestration: CONTAINER SCHEDULING: Placement, Replication/Scaling, Resurrection, Rescheduling, Rolling Deployment, Upgrades, Downgrades, Collocation. RESOURCE MANAGEMENT: Memory, CPU, GPU, Volumes, Ports, IPs, Images/Artifacts. SERVICE MANAGEMENT: Labels, Groups/Namespaces, Dependencies, Load Balancing, Readiness Checking.
  • 8. Container Orchestration [Diagram: Web Apps & Services on top of Scheduling, Resource Management, and Service Management, which span the container runtimes running on each machine and OS of the machine infrastructure]
  • 9. Backup State: "Why should we have a backup?" • Back up state • Services • Cluster
  • 10. Monitoring: "I check the cluster state every morning…" • Monitoring is key • Detect problems early • OOM • Metrics • Application • Container • Cluster
  • 11. Networking: "Networking is easy…" • Consider networking challenges • Connectivity • Load balancing • Layer 4 vs 7 • Service discovery • Isolation
  • 12. Immutable Container Images: • Use tagged container images • Keep tagged images immutable!
  • 13. Private Container Registries: "Docker Hub works great for our test cluster…" • Use tagged container images • Keep tagged images immutable! • Use a private container registry!
  • 14. Repeatable Container Builds: "`docker commit` is great*…" • Use repeatable builds for images • Including FROM clause • Keep images minimal • Multistage build • From scratch
  • 15. UI Deployments: • Use (Marathon) endpoints for deployments! • Version (and track) your app definitions!
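To illustrate deploying through the Marathon endpoint instead of the UI, here is a minimal Python sketch. It only builds a `PUT /v2/apps/<id>` request from an app definition and does not send it. The Marathon address and the helper name `build_deploy_request` are assumptions made for this example; the app definition would normally be read from a file kept in version control.

```python
import json
import urllib.request

# Hypothetical Marathon address; replace with your own.
MARATHON_URL = "http://marathon.example.com:8080"


def build_deploy_request(app_definition, marathon_url=MARATHON_URL):
    """Build (but do not send) a PUT /v2/apps/<id> request for an app definition."""
    app_id = app_definition["id"].lstrip("/")
    return urllib.request.Request(
        url=f"{marathon_url}/v2/apps/{app_id}",
        data=json.dumps(app_definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Stand-in for an app definition loaded from a version-controlled file.
app = {"id": "/springboot-demo", "instances": 1}
request = build_deploy_request(app)
# urllib.request.urlopen(request) would submit it to a reachable Marathon.
```

Because the request is built from a tracked file, every deployment can be traced back to a specific revision of the app definition.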
  • 16. CI/CD Deployments [Diagram: Continuous Delivery Pipeline triggered by git push, with Version Control System, Continuous Integration, Artifact Repo & Container Registry, Container Orchestrator, Load Balancer, and Production Environment]
  • 17. Disk Usage: "All our disks are full…" • Docker and logs are great at filling up disk space! • Images • Containers • Clean up Docker instances and images! • docker system prune • https://github.com/spotify/docker-gc • Monitor available disk space!
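The "monitor available disk space" advice can be sketched with the Python standard library. This is only an illustration: the 80% threshold and the function names are assumptions, and a real setup would feed the value into whatever monitoring system the cluster already uses.

```python
import shutil


def disk_usage_fraction(path="/"):
    """Return the used fraction (0.0 to 1.0) of the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def disk_alert(used_fraction, threshold=0.8):
    """Return True when usage has reached the (assumed) alert threshold."""
    return used_fraction >= threshold


# On a Docker host one would typically check the filesystem holding
# Docker's data directory and the log directory, not only "/".
if disk_alert(disk_usage_fraction("/")):
    print("disk usage above threshold, time to clean up images and logs")
```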
  • 18. Resource Constraints: "32MB are sufficient for a Docker container…" • Memory constraints are hard limits • Consider overhead (e.g., Java) • Difficult to approximate • Monitor
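As a rough illustration of "consider overhead", the sketch below estimates a container memory limit from a JVM heap size. The 25% relative overhead and the 64 MB fixed overhead are made-up placeholder values, not recommendations; the real overhead has to be measured, which is why the slide ends with "Monitor".

```python
import math


def container_memory_mb(heap_mb, overhead_fraction=0.25, fixed_overhead_mb=64):
    """Estimate a memory limit: heap plus assumed non-heap overhead.

    overhead_fraction and fixed_overhead_mb are illustrative assumptions
    (metaspace, thread stacks, native buffers), not measured values.
    """
    return math.ceil(heap_mb * (1 + overhead_fraction)) + fixed_overhead_mb


# A 512 MB heap needs noticeably more than 512 MB as a hard limit.
limit = container_memory_mb(512)
print(limit)
```

Since the memory constraint is a hard limit, setting it to the heap size alone would get the container killed as soon as non-heap memory is used.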
  • 19. ZooKeeper Cluster Size: "Our ZooKeeper cluster has 4 nodes, that is better than 3, right?" • The ZooKeeper quorum (i.e., #Masters) should be odd! • In production, 5 is optimal!
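The reason an odd ensemble size is recommended follows from majority quorums, which a few lines of Python make concrete. The function names are chosen for this example.

```python
def quorum_size(nodes):
    """Smallest strict majority of an ensemble with the given node count."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """Number of nodes that can fail while a majority is still available."""
    return nodes - quorum_size(nodes)


for nodes in (3, 4, 5):
    print(nodes, quorum_size(nodes), tolerated_failures(nodes))
```

A 4-node ensemble tolerates one failure, exactly like a 3-node ensemble, so the fourth node adds cost without adding fault tolerance. Five nodes tolerate two failures.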
  • 20. Health Checks: "What are health checks?" • Specify health checks carefully • Different options • Mesos vs Marathon • Command vs HTTP • Impacts load balancers and restarts • Readiness checks
  • 21. NoSQL Datastores: "We replaced our Postgres instance with Cassandra, and now we get stale results" • Consider the semantics of your datastore! • ACID vs BASE • Model your data and queries accordingly!
  • 22. Removing Stateful Frameworks: • Follow the uninstall instructions! • Reservations and ZooKeeper state! • state.json
  • 23. Containers vs VMs: "We just replaced all our VM instances with containers*…" • Be aware of different isolation semantics!
  • 24. Container Definition…: "What is the difference between container image and instance?" • container runtime* • != container image • != container instance
  • 25. Container vs Container: { "id": "/springboot-demo", "cmd": "$JAVA_HOME/bin/java -jar MyApp.jar", "instances": 1, "fetch": [ { "uri": "http://…/MyApp.jar" }, { "uri": "https://.../jre-8u121-linux-x64.tar.gz" } ],
  • 26. Write Once Run Anywhere: "The (Java) container was running fine in testing…" • Java (<9) not cgroups aware • # threads for GC • … • Set default values carefully
  • 27. a.k.a. Understand Container Semantics: • Java (<9) not cgroups aware • # threads for GC • … • Set default values carefully
  • 28. Java meets Container: • Development • Java app packaged in a container • Production • 10 JVM containers on a 32-core box – 10 * (32 cores are seen by each JRE) – 10 * (32 threads set by default for ForkJoinPool) – 10 * (32 threads ….)
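The arithmetic on the slide above can be sketched in Python. A runtime that ignores cgroups sizes its thread pools from the host core count, while a container-aware one would derive the count from the CFS quota (`cpu.cfs_quota_us` divided by `cpu.cfs_period_us` under cgroup v1, where a quota of -1 means no limit). The function name and the example quota are assumptions for illustration.

```python
import math


def effective_cpus(cfs_quota_us, cfs_period_us, host_cpus):
    """CPU count a container-aware runtime could assume from a CFS quota."""
    if cfs_quota_us <= 0:  # -1 means "no limit": fall back to the host count
        return host_cpus
    return min(host_cpus, max(1, math.ceil(cfs_quota_us / cfs_period_us)))


HOST_CPUS = 32
CONTAINERS = 10

# Not cgroups aware: every JVM sizes its default pools for all 32 cores.
naive_threads = CONTAINERS * HOST_CPUS

# Assumed example: each container is limited to 3.2 CPUs (320 ms per 100 ms period).
aware_threads = CONTAINERS * effective_cpus(320_000, 100_000, HOST_CPUS)

print(naive_threads, aware_threads)
```

With ten containers, the naive sizing asks for 320 worker threads per pool type on a 32-core box, which is why default values need to be set explicitly.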
  • 29. Mesos Modules: "To solve this problem, our team quickly developed this really cool Mesos Module…" • Mesos Modules can be tricky! • Monitoring and debugging…
  • 30. Linux Distributions: "We are using *obscure Linux distribution* for our (DC/OS) cluster" • If possible, use tested distributions! • Especially for DC/OS!
  • 31. Services on the same node…: "We are running *distributed database* outside Mesos on the same cluster" • Be careful when running services outside Mesos but on the same cluster! • Adjust resources accordingly!
  • 32. Spreading out Master Nodes: "We are running our cluster across different AWS regions…" • Be careful when distributing master nodes across high-latency links! • Different AWS AZs are OK, different regions probably not!
  • 33. Agent Attributes: "We changed the agent attributes for a running cluster…" • Set agent attributes when starting an agent! • Do not change them for running agents! --attributes='rack:abc;zone:west;os:centos5;level:10;keys:[1000-1500]'
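A small guard against accidentally changing attributes on a running agent could look like the following Python sketch. It is a simplified parser written for this example, not Mesos's own parsing logic: it splits on `;` and the first `:` and keeps every value as a plain string.

```python
def parse_attributes(spec):
    """Parse a 'name:value;name:value' attribute string into a dict of strings."""
    attributes = {}
    for pair in spec.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        name, _, value = pair.partition(":")
        attributes[name.strip()] = value.strip()
    return attributes


def changed_attributes(old_spec, new_spec):
    """Return the sorted names whose values differ between two attribute strings."""
    old, new = parse_attributes(old_spec), parse_attributes(new_spec)
    return sorted(name for name in old.keys() | new.keys() if old.get(name) != new.get(name))


running = "rack:abc;zone:west;os:centos5;level:10;keys:[1000-1500]"
proposed = "rack:abc;zone:east;os:centos5;level:10;keys:[1000-1500]"
print(changed_attributes(running, proposed))
```

A deployment script could refuse to restart an agent whenever this list is non-empty.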
  • 34. Cluster Upgrades: "We upgraded our cluster and…*" • Check state before • Follow upgrade instructions! • Automation • Remember backup!
  • 35. UPGRADE PROCEDURE [Diagram: framework scheduler; agents running executors and tasks; one leader and two standby masters; three ZK nodes] Before upgrading: 1. Make sure the cluster is healthy! 2. Perform backup: a. ZK b. Replicated logs c. Other state 3. Review release notes 4. Generate install bundle: a. Validate versions
  • 36. UPGRADE PROCEDURE [Same diagram]: 1. Master rolling upgrade: a. Start with standby b. Uninstall DC/OS c. Install new DC/OS 2. Agent rolling upgrade 3. Framework upgrades
  • 37. UPGRADE PROCEDURE [Same diagram]: 1. Master rolling upgrade 2. Agent rolling upgrade: a. Uninstall DC/OS b. Install new DC/OS 3. Framework upgrades
  • 38. UPGRADE PROCEDURE [Same diagram]: 1. Master rolling upgrade 2. Agent rolling upgrade 3. Framework upgrades: a. Orthogonal to DC/OS b. Ensure changes don't affect existing apps
  • 39. Software Upgrades: "We have automatic updates enabled for Docker…" • Follow upgrade instructions! • Backup! • Explicit control of versions!
  • 40. Day 2 Operations: "Our POC app is deployed in our production environment, time for vacation…" • Day 2 operations are the actually challenging part! Keep it running!
  • 41. Day 2 Operations: ● Configuration updates (e.g., scaling, reconfiguration) ● Binary upgrades ● Cluster maintenance (e.g., backup, restore, restart) ● Monitor progress of operations ● Debug any runtime blockages
  • 42. METRICS: • Measurements captured to determine health and performance of the cluster - How utilized is the cluster? - Are resources being optimally used? - Is the system performing better or worse over time? - Are there bottlenecks in the system? - What is the response time of applications?
  • 43. DC/OS METRIC SOURCES [Diagram: apps in containers, running on Mesos, running on the OS] • Mesos metrics – Resources, frameworks, masters, agents, tasks, system, events • Container metrics – CPU, mem, disk, network • Application metrics – QPS, latency, response time, hits, active users, errors
  • 44. Production Checklist
  • 45. MESOS CHECKLIST: ❏ Monitor both Masters and Agents for flapping (i.e., continuously restarting). This can be accomplished by using the `uptime` metric. ❏ Monitor the rate of changes in terminal task states, including TASK_FAILED, TASK_LOST, and TASK_KILLED.
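Both checklist items above can be sketched on top of periodically sampled metric values. The Python below assumes the samples have already been collected from the metrics endpoint at a fixed interval. The function names, the restart threshold, and the sample data are illustrative assumptions.

```python
def count_restarts(uptime_samples):
    """Count restarts: uptime dropping between consecutive samples means a restart."""
    return sum(1 for prev, cur in zip(uptime_samples, uptime_samples[1:]) if cur < prev)


def is_flapping(uptime_samples, max_restarts=2):
    """Flag a master or agent that restarted more often than the assumed threshold."""
    return count_restarts(uptime_samples) > max_restarts


def rate_per_minute(counter_samples, interval_seconds):
    """Average increase per minute of a monotonically increasing counter."""
    if len(counter_samples) < 2:
        return 0.0
    elapsed = interval_seconds * (len(counter_samples) - 1)
    return (counter_samples[-1] - counter_samples[0]) * 60.0 / elapsed


# Made-up samples taken every 60 seconds.
uptime = [10, 70, 130, 5, 65, 3, 4, 1]
failed_tasks = [0, 2, 5]

print(count_restarts(uptime), is_flapping(uptime))
print(rate_per_minute(failed_tasks, 60))
```

The same rate function can be applied separately to the counters for failed, lost, and killed tasks, with an alert on a sudden increase.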
  • 46. MESOS MASTER CHECKLIST: ❏ Use five master instances in production. Three is sufficient for HA in staging/test. ❏ Place masters on separate racks, if possible. ❏ Secure the teardown endpoints to prevent accidental framework removal.
  • 47. MESOS AGENT CHECKLIST: ❏ Set agent attributes before you run anything on the cluster. Once an agent is started, changing the attributes may break recovery of running tasks in the event of a restart. See also https://issues.apache.org/jira/browse/MESOS-1739. ❏ Explicitly set the resources on the nodes to leave capacity for other services running there outside of Mesos control, for example HDFS processes running alongside Mesos.
  • 48. ZOOKEEPER CHECKLIST: ❏ Run with security and ACLs; see the `--zk=` and `--master=` flags on the master and slaves respectively. If you do enable ACLs, they must be enabled before nodes are created in ZK. ❏ Back up ZooKeeper snapshots and logs at regular intervals. ❏ Guano or zkConfig.py (want snapshots + transaction log) ❏ Marathon, Chronos, and other frameworks store state in ZK. The first Marathon should store state in the same ZK as the Mesos master. ❏ Userland apps should NOT store state in the ZK cluster shared by Mesos and Marathon. Examples of userland apps include Storm, service discovery tools, and additional instances of Marathon and Chronos.
  • 49. ZOOKEEPER CHECKLIST: ❏ Monitor ZK's JVM metrics, such as heap usage, GC pause times, and full-collection frequency. ❏ Monitor ZK for: number of client connections, total number of znodes, size of znodes (min, max, avg, 99th percentile), and read/write performance metrics.
  • 51. ANY QUESTIONS? @joerg_schad