SlideShare a Scribd company logo
1 of 26
Download to read offline
© 2017 Mesosphere, Inc. All Rights Reserved. 1
FlinkForward 2017 - San Francisco
Flink meet DC/OS
Deploying Apache Flink at Scale
Elizabeth K. Joseph, @pleia2
Ravi Yadav, @RaaveYadav
© 2017 Mesosphere, Inc. All Rights Reserved. 2
Part 1
Introduction to Apache
Mesos, Marathon, and
DC/OS
Part 2
Demonstration of demo
data pipeline + Installing
Flink on DC/OS
Part 3
DC/OS 1.9 key features for
data services and beyond
Talk Outline
© 2017 Mesosphere, Inc. All Rights Reserved. 3
Apache Mesos:
The datacenter kernel
http://mesos.apache.org/
© 2017 Mesosphere, Inc. All Rights Reserved. 4
● Mesos can’t run applications on its
own.
● A Mesos framework is a distributed
system that has a scheduler.
● Schedulers like Marathon start and
keep your applications running. A bit
like a distributed init system.
● Mesos mechanics are fair and HA
● Learn more at
https://mesosphere.github.io/marat
hon/
Marathon
© 2017 Mesosphere, Inc. All Rights Reserved. 5
Introducing DC/OS
Solves common problems
● Resource management
● Task scheduling
● Container orchestration
● Self-healing infrastructure
● Logging and metrics
● Network management
● “Universe” of pre-configured apps (including Flink, Kafka…)
● Learn more and contribute at https://dcos.io/
© 2017 Mesosphere, Inc. All Rights Reserved. 6
DC/OS
Architecture
Overview
Security &
Governance
Container Orchestration Monitoring & Operations User Interface & Command Line
HDFS Jenkins Marathon Cassandra Flink
Spark Docker Kafka MongoDB +30 more...
DC/OS
Services & Containers
ANY INFRASTRUCTURE
© 2017 Mesosphere, Inc. All Rights Reserved. 7
Web-based GUI
https://dcos.io/docs/lates
t/usage/webinterface/
Interact with DC/OS (1/2)
© 2017 Mesosphere, Inc. All Rights Reserved. 8
Universe
© 2017 Mesosphere, Inc. All Rights Reserved. 9
CLI tool
https://dcos.io/docs/latest/usage/cli/
API
https://dcos.io/docs/latest/api/
Interact with DC/OS (2/2)
© 2017 Mesosphere, Inc. All Rights Reserved. 10
According to the December 2016 data Artisans-organized Apache Flink user survey
just under 30% of respondents were running Flink on Apache Mesos
https://dcos.io/blog/2017/apache-flink-on-dc-os-and-apache-mesos/
You may already be using Apache Mesos!
Version 1.2 of Flink includes support for Apache Mesos and DC/OS, “it is now possible
to run an highly available Flink cluster on Mesos”
https://flink.apache.org/news/2017/02/06/release-1.2.0.html#run-flink-with-apache-
mesos & https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/mesos.html
Flink on Apache Mesos and DC/OS
© 2017 Mesosphere, Inc. All Rights Reserved. 11
DEMOSDemo data pipeline + Installing Flink on DC/OS
© 2017 Mesosphere, Inc. All Rights Reserved.
DC/OS 1.9 - Data Services Ecosystem
WORKLOADS
● Pods
● GPU based
scheduling
DATA SERVICES
ECOSYSTEM
OPERATIONS
● Remote
Container Shell
● Unified Metrics
● Deployment
failure analyzer
● Alluxio
● Couchbase
● Datastax DSE
● Elastic (ELK)
● Redis
● Apache Flink
© 2017 Mesosphere, Inc. All Rights Reserved.
DC/OS 1.9 - Operations
WORKLOADS
● Pods
● GPU based
scheduling
DATA SERVICES
ECOSYSTEM
OPERATIONS
● Remote
Container Shell
● Unified Metrics
● Unified Logging
● Deployment
Failure
Debugging
● Upgrades &
Configuration
updates
© 2017 Mesosphere, Inc. All Rights Reserved. 14
● Open encrypted, interactive, remote session
to your containers
● Remotely execute commands for real time
app troubleshooting
● Provide developers access to their own
applications, not the entire host or cluster
my-laptop$ dcos task exec my-task /bin/bash
Starting /bin/bash in my-task ...
Connecting to remote my-task …
REMOTE CONTAINER SHELL
DC/OS: OPERATIONS
DC/OS 1.9
© 2017 Mesosphere, Inc. All Rights Reserved. 15
UNIFIED LOGGING
DC/OS: OPERATIONS
● Access application, DC/OS and OS logs
● Easily troubleshoot applications with critical
metadata such as container id and app id
● Integrate easily with existing logging systems
DC/OS 1.9
© 2017 Mesosphere, Inc. All Rights Reserved. 16
UNIFIED METRICS
DC/OS: OPERATIONS
● Single API for system, container and
application metrics
● Metadata such as host id and container id
are automatically added to assist in
debugging
● Integrate easily with existing metrics systems
DC/OS 1.9
Container
© 2017 Mesosphere, Inc. All Rights Reserved. 17
DEPLOYMENT FAILURE DEBUGGING
DC/OS: OPERATIONS
● Understand why your application is not
deploying
● Understand which nodes in the cluster can
accommodate the role, constraints, cpu,
mem, disk and port requirements for your
app
DC/OS 1.9
© 2017 Mesosphere, Inc. All Rights Reserved.
UPGRADES AND CONFIG UPDATES
DC/OS: OPERATIONS
DC/OS 1.9
18
● Generate new config for cluster nodes
$ dcos_generate_config.sh --generate-node-upgrade-script
<installed_cluster_version>
● Single command upgrade script for individual nodes
$ curl -O <Node upgrade script URL>
$ sudo bash ./dcos_node_upgrade.sh
© 2017 Mesosphere, Inc. All Rights Reserved.
DC/OS 1.9 - Workloads
WORKLOADS
● Pods
● GPU based
scheduling
DATA SERVICES
ECOSYSTEM
OPERATIONS
● Remote
Container Shell
● Unified Logging
● Unified Metrics
● Deployment
failure analyzer
© 2017 Mesosphere, Inc. All Rights Reserved. 20
● Schedule, deploy and scale multiple
containers on the same host(s) while sharing
IP address and storage volumes
● All containers in a pod instance run as if they
are running on a single host in pre-container
world
● Useful for migrating legacy applications or
building advanced micro services (side car
containers)
PODS
DC/OS: WORKLOADS
DC/OS 1.9
© 2017 Mesosphere, Inc. All Rights Reserved. 21
● Traditional monolithic apps on VMs
usually have support services such as log
shipper, message queuing clients
● Many support services assume
col-location on same host, and
local-host access to networking and
storage
● Pods simplify moving legacy monolithic
apps to containers, reducing risk and
accelerating migrations
DC/OS 1.9
PODS: MIGRATING LEGACY APPS TO CONTAINERS
DC/OS: WORKLOADS
© 2017 Mesosphere, Inc. All Rights Reserved. 22
● Advanced Micro Services patterns require
colocating containers together
● Support services include for example:
○ Logging or monitoring agents,
○ Backup tooling & Proxies
○ Data change watchers & Event publishers
● Pods simplify the building and maintenance of
complex such microservices
DC/OS 1.9
PODS: SUPPORT SERVICES (SIDE-CAR CONTAINERS)
DC/OS: WORKLOADS
© 2017 Mesosphere, Inc. All Rights Reserved.
GPU: WHY GPU?
DC/OS: WORKLOADS
DC/OS 1.9
● GPUs are needed for many machine
learning and deep learning applications
● GPUs are essential for real-time or near
real time machine learning models
● GPUs deliver from 10X to 100X
performance for some applications,
resulting lower $$$/IOPS and more
productivity to data science teams
● GPU applications include real time fraud
detection, genome sequencing, cohort
analysis and many others
© 2017 Mesosphere, Inc. All Rights Reserved.
GPU BASED SCHEDULING
DC/OS: WORKLOADS
DC/OS 1.9
24
● Test Locally with Nvidia-Docker, deploy to
production with DC/OS
● Isolate GPU instances and schedule workloads
just like CPU and memory, guaranteeing
performance
● Efficiently Share GPU resources across data
science team
● Simplify migrating machine learning models
across from dev to production, and across
clouds
© 2017 Mesosphere, Inc. All Rights Reserved.
OTHER IMPROVEMENTS
DC/OS
DC/OS 1.9
25
● Mesos 1.2
● Marathon 1.4
● Docker 1.12 and 1.13 (17.03-ce) support
● Centos 7.3 and CoreOS 1235.12.0 support
● Performance improvements across all networking
features.
● CNI support for 3rd party CNI plugins.
● 100s of additional bugfixes and tests
Flink forward sf 17

More Related Content

What's hot

Cloud Surfing: Kubernetes on Mesos
Cloud Surfing: Kubernetes on MesosCloud Surfing: Kubernetes on Mesos
Cloud Surfing: Kubernetes on MesosKarl Isenberg
 
Deep Learning and Gene Computing Acceleration with Alluxio in Kubernetes
Deep Learning and Gene Computing Acceleration with Alluxio in KubernetesDeep Learning and Gene Computing Acceleration with Alluxio in Kubernetes
Deep Learning and Gene Computing Acceleration with Alluxio in KubernetesAlluxio, Inc.
 
Cloud Foundry Compared With Other PaaSes (Cloud Foundry Summit 2014)
Cloud Foundry Compared With Other PaaSes (Cloud Foundry Summit 2014)Cloud Foundry Compared With Other PaaSes (Cloud Foundry Summit 2014)
Cloud Foundry Compared With Other PaaSes (Cloud Foundry Summit 2014)VMware Tanzu
 
Mesos vs kubernetes comparison
Mesos vs kubernetes comparisonMesos vs kubernetes comparison
Mesos vs kubernetes comparisonKrishna-Kumar
 
How to Launch a Public PaaS with OpenSource: The GetUpCloud & OpenShift Orgin...
How to Launch a Public PaaS with OpenSource: The GetUpCloud & OpenShift Orgin...How to Launch a Public PaaS with OpenSource: The GetUpCloud & OpenShift Orgin...
How to Launch a Public PaaS with OpenSource: The GetUpCloud & OpenShift Orgin...OpenShift Origin
 
Alluxio data orchestration for machine learning
Alluxio data orchestration for machine learningAlluxio data orchestration for machine learning
Alluxio data orchestration for machine learningAlluxio, Inc.
 
Deploying Alluxio in the Cloud for Machine Learning
Deploying Alluxio in the Cloud for Machine LearningDeploying Alluxio in the Cloud for Machine Learning
Deploying Alluxio in the Cloud for Machine LearningAlluxio, Inc.
 
How Cloudify uses Chef as a Foundation for PaaS
How Cloudify uses Chef as a Foundation for PaaSHow Cloudify uses Chef as a Foundation for PaaS
How Cloudify uses Chef as a Foundation for PaaSNati Shalom
 
Aerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike
 
OpenShift Origin Internals
OpenShift Origin Internals OpenShift Origin Internals
OpenShift Origin Internals OpenShift Origin
 
Best Practice in Accelerating Data Applications with Spark+Alluxio
Best Practice in Accelerating Data Applications with Spark+AlluxioBest Practice in Accelerating Data Applications with Spark+Alluxio
Best Practice in Accelerating Data Applications with Spark+AlluxioAlluxio, Inc.
 
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)VMware Tanzu
 
整合Cloud Foundry 和 Kubernetes 技術打造企業級雲應用平台解決方案
整合Cloud Foundry 和 Kubernetes 技術打造企業級雲應用平台解決方案整合Cloud Foundry 和 Kubernetes 技術打造企業級雲應用平台解決方案
整合Cloud Foundry 和 Kubernetes 技術打造企業級雲應用平台解決方案inwin stack
 
Introduction into Cloud Foundry and Bosh | anynines
Introduction into Cloud Foundry and Bosh | anyninesIntroduction into Cloud Foundry and Bosh | anynines
Introduction into Cloud Foundry and Bosh | anyninesanynines GmbH
 
Putting Private Clouds to Work with PaaS Interop Vegas 2013 presentation by D...
Putting Private Clouds to Work with PaaS Interop Vegas 2013 presentation by D...Putting Private Clouds to Work with PaaS Interop Vegas 2013 presentation by D...
Putting Private Clouds to Work with PaaS Interop Vegas 2013 presentation by D...Diane Mueller
 
Setting up monitoring system for Alluxio with Prometheus and Grafana in 10 mi...
Setting up monitoring system for Alluxio with Prometheus and Grafana in 10 mi...Setting up monitoring system for Alluxio with Prometheus and Grafana in 10 mi...
Setting up monitoring system for Alluxio with Prometheus and Grafana in 10 mi...Alluxio, Inc.
 
Project Solum - OpenStack's Native PaaS
Project Solum - OpenStack's Native PaaSProject Solum - OpenStack's Native PaaS
Project Solum - OpenStack's Native PaaSAlex Baretto
 
DevNexus 2017 - Building and Deploying 12 Factor Apps in Scala, Java, Ruby, a...
DevNexus 2017 - Building and Deploying 12 Factor Apps in Scala, Java, Ruby, a...DevNexus 2017 - Building and Deploying 12 Factor Apps in Scala, Java, Ruby, a...
DevNexus 2017 - Building and Deploying 12 Factor Apps in Scala, Java, Ruby, a...Neil Shannon
 

What's hot (20)

Cloud Surfing: Kubernetes on Mesos
Cloud Surfing: Kubernetes on MesosCloud Surfing: Kubernetes on Mesos
Cloud Surfing: Kubernetes on Mesos
 
Deep Learning and Gene Computing Acceleration with Alluxio in Kubernetes
Deep Learning and Gene Computing Acceleration with Alluxio in KubernetesDeep Learning and Gene Computing Acceleration with Alluxio in Kubernetes
Deep Learning and Gene Computing Acceleration with Alluxio in Kubernetes
 
Cloud Foundry Compared With Other PaaSes (Cloud Foundry Summit 2014)
Cloud Foundry Compared With Other PaaSes (Cloud Foundry Summit 2014)Cloud Foundry Compared With Other PaaSes (Cloud Foundry Summit 2014)
Cloud Foundry Compared With Other PaaSes (Cloud Foundry Summit 2014)
 
Mesos vs kubernetes comparison
Mesos vs kubernetes comparisonMesos vs kubernetes comparison
Mesos vs kubernetes comparison
 
How to Launch a Public PaaS with OpenSource: The GetUpCloud & OpenShift Orgin...
How to Launch a Public PaaS with OpenSource: The GetUpCloud & OpenShift Orgin...How to Launch a Public PaaS with OpenSource: The GetUpCloud & OpenShift Orgin...
How to Launch a Public PaaS with OpenSource: The GetUpCloud & OpenShift Orgin...
 
Alluxio data orchestration for machine learning
Alluxio data orchestration for machine learningAlluxio data orchestration for machine learning
Alluxio data orchestration for machine learning
 
Deploying Alluxio in the Cloud for Machine Learning
Deploying Alluxio in the Cloud for Machine LearningDeploying Alluxio in the Cloud for Machine Learning
Deploying Alluxio in the Cloud for Machine Learning
 
How Cloudify uses Chef as a Foundation for PaaS
How Cloudify uses Chef as a Foundation for PaaSHow Cloudify uses Chef as a Foundation for PaaS
How Cloudify uses Chef as a Foundation for PaaS
 
QN Blue Lava
QN Blue LavaQN Blue Lava
QN Blue Lava
 
Aerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower Manhattan
 
OpenShift Origin Internals
OpenShift Origin Internals OpenShift Origin Internals
OpenShift Origin Internals
 
Sysadmin vs. dev ops
Sysadmin vs. dev opsSysadmin vs. dev ops
Sysadmin vs. dev ops
 
Best Practice in Accelerating Data Applications with Spark+Alluxio
Best Practice in Accelerating Data Applications with Spark+AlluxioBest Practice in Accelerating Data Applications with Spark+Alluxio
Best Practice in Accelerating Data Applications with Spark+Alluxio
 
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)
Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)
 
整合Cloud Foundry 和 Kubernetes 技術打造企業級雲應用平台解決方案
整合Cloud Foundry 和 Kubernetes 技術打造企業級雲應用平台解決方案整合Cloud Foundry 和 Kubernetes 技術打造企業級雲應用平台解決方案
整合Cloud Foundry 和 Kubernetes 技術打造企業級雲應用平台解決方案
 
Introduction into Cloud Foundry and Bosh | anynines
Introduction into Cloud Foundry and Bosh | anyninesIntroduction into Cloud Foundry and Bosh | anynines
Introduction into Cloud Foundry and Bosh | anynines
 
Putting Private Clouds to Work with PaaS Interop Vegas 2013 presentation by D...
Putting Private Clouds to Work with PaaS Interop Vegas 2013 presentation by D...Putting Private Clouds to Work with PaaS Interop Vegas 2013 presentation by D...
Putting Private Clouds to Work with PaaS Interop Vegas 2013 presentation by D...
 
Setting up monitoring system for Alluxio with Prometheus and Grafana in 10 mi...
Setting up monitoring system for Alluxio with Prometheus and Grafana in 10 mi...Setting up monitoring system for Alluxio with Prometheus and Grafana in 10 mi...
Setting up monitoring system for Alluxio with Prometheus and Grafana in 10 mi...
 
Project Solum - OpenStack's Native PaaS
Project Solum - OpenStack's Native PaaSProject Solum - OpenStack's Native PaaS
Project Solum - OpenStack's Native PaaS
 
DevNexus 2017 - Building and Deploying 12 Factor Apps in Scala, Java, Ruby, a...
DevNexus 2017 - Building and Deploying 12 Factor Apps in Scala, Java, Ruby, a...DevNexus 2017 - Building and Deploying 12 Factor Apps in Scala, Java, Ruby, a...
DevNexus 2017 - Building and Deploying 12 Factor Apps in Scala, Java, Ruby, a...
 

Similar to Flink forward sf 17

Using DC/OS for Continuous Delivery - DevPulseCon 2017
Using DC/OS for Continuous Delivery - DevPulseCon 2017Using DC/OS for Continuous Delivery - DevPulseCon 2017
Using DC/OS for Continuous Delivery - DevPulseCon 2017pleia2
 
Introduction to DC/OS
Introduction to DC/OSIntroduction to DC/OS
Introduction to DC/OSAmita Ekbote
 
SMACK stack and beyond
SMACK stack and beyondSMACK stack and beyond
SMACK stack and beyondMatt Jarvis
 
Webinar: End-to-End CI/CD with GitLab and DC/OS
Webinar: End-to-End CI/CD with GitLab and DC/OSWebinar: End-to-End CI/CD with GitLab and DC/OS
Webinar: End-to-End CI/CD with GitLab and DC/OSMesosphere Inc.
 
Downtime is not an option - day 2 operations - Jörg Schad
Downtime is not an option - day 2 operations -  Jörg SchadDowntime is not an option - day 2 operations -  Jörg Schad
Downtime is not an option - day 2 operations - Jörg SchadCodemotion
 
Introduction to DC/OS
Introduction to DC/OSIntroduction to DC/OS
Introduction to DC/OSMatt Jarvis
 
Introduction to DC/OS
Introduction to DC/OSIntroduction to DC/OS
Introduction to DC/OSMatt Jarvis
 
Kubernetes One-Click Deployment: Hands-on Workshop (Munich)
Kubernetes One-Click Deployment: Hands-on Workshop (Munich)Kubernetes One-Click Deployment: Hands-on Workshop (Munich)
Kubernetes One-Click Deployment: Hands-on Workshop (Munich)QAware GmbH
 
OSDC 2018 | From batch to pipelines – why Apache Mesos and DC/OS are a soluti...
OSDC 2018 | From batch to pipelines – why Apache Mesos and DC/OS are a soluti...OSDC 2018 | From batch to pipelines – why Apache Mesos and DC/OS are a soluti...
OSDC 2018 | From batch to pipelines – why Apache Mesos and DC/OS are a soluti...NETWAYS
 
Operating Flink on Mesos at Scale
Operating Flink on Mesos at ScaleOperating Flink on Mesos at Scale
Operating Flink on Mesos at ScaleBiswajit Das
 
Alluxio Mesos Meetup - SMACK to SMAACK
Alluxio Mesos Meetup - SMACK to SMAACKAlluxio Mesos Meetup - SMACK to SMAACK
Alluxio Mesos Meetup - SMACK to SMAACKAlluxio, Inc.
 
DevOps vs. Site Reliability Engineering (SRE) in Age of Kubernetes
DevOps vs. Site Reliability Engineering (SRE) in Age of KubernetesDevOps vs. Site Reliability Engineering (SRE) in Age of Kubernetes
DevOps vs. Site Reliability Engineering (SRE) in Age of KubernetesDevOps.com
 
Doing Dropbox the Native Cloud Native Way
Doing Dropbox the Native Cloud Native WayDoing Dropbox the Native Cloud Native Way
Doing Dropbox the Native Cloud Native WayMinio
 
End User Computing with NetApp
End User Computing with NetAppEnd User Computing with NetApp
End User Computing with NetAppNetApp
 
Mesosphere and the Enterprise: Run Your Applications on Apache Mesos - Steve ...
Mesosphere and the Enterprise: Run Your Applications on Apache Mesos - Steve ...Mesosphere and the Enterprise: Run Your Applications on Apache Mesos - Steve ...
Mesosphere and the Enterprise: Run Your Applications on Apache Mesos - Steve ...{code} by Dell EMC
 
Hyperscale Computing, Enterprise Agility with Mesosphere
Hyperscale Computing, Enterprise Agility with MesosphereHyperscale Computing, Enterprise Agility with Mesosphere
Hyperscale Computing, Enterprise Agility with MesosphereMarkus Eisele
 
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...HBaseCon
 
Episode 2: Deploying Kubernetes at Scale
Episode 2: Deploying Kubernetes at ScaleEpisode 2: Deploying Kubernetes at Scale
Episode 2: Deploying Kubernetes at ScaleMesosphere Inc.
 
Episode 4: Operating Kubernetes at Scale with DC/OS
Episode 4: Operating Kubernetes at Scale with DC/OSEpisode 4: Operating Kubernetes at Scale with DC/OS
Episode 4: Operating Kubernetes at Scale with DC/OSMesosphere Inc.
 
Operating Kubernetes at Scale (Australia Presentation)
Operating Kubernetes at Scale (Australia Presentation)Operating Kubernetes at Scale (Australia Presentation)
Operating Kubernetes at Scale (Australia Presentation)Mesosphere Inc.
 

Similar to Flink forward sf 17 (20)

Using DC/OS for Continuous Delivery - DevPulseCon 2017
Using DC/OS for Continuous Delivery - DevPulseCon 2017Using DC/OS for Continuous Delivery - DevPulseCon 2017
Using DC/OS for Continuous Delivery - DevPulseCon 2017
 
Introduction to DC/OS
Introduction to DC/OSIntroduction to DC/OS
Introduction to DC/OS
 
SMACK stack and beyond
SMACK stack and beyondSMACK stack and beyond
SMACK stack and beyond
 
Webinar: End-to-End CI/CD with GitLab and DC/OS
Webinar: End-to-End CI/CD with GitLab and DC/OSWebinar: End-to-End CI/CD with GitLab and DC/OS
Webinar: End-to-End CI/CD with GitLab and DC/OS
 
Downtime is not an option - day 2 operations - Jörg Schad
Downtime is not an option - day 2 operations -  Jörg SchadDowntime is not an option - day 2 operations -  Jörg Schad
Downtime is not an option - day 2 operations - Jörg Schad
 
Introduction to DC/OS
Introduction to DC/OSIntroduction to DC/OS
Introduction to DC/OS
 
Introduction to DC/OS
Introduction to DC/OSIntroduction to DC/OS
Introduction to DC/OS
 
Kubernetes One-Click Deployment: Hands-on Workshop (Munich)
Kubernetes One-Click Deployment: Hands-on Workshop (Munich)Kubernetes One-Click Deployment: Hands-on Workshop (Munich)
Kubernetes One-Click Deployment: Hands-on Workshop (Munich)
 
OSDC 2018 | From batch to pipelines – why Apache Mesos and DC/OS are a soluti...
OSDC 2018 | From batch to pipelines – why Apache Mesos and DC/OS are a soluti...OSDC 2018 | From batch to pipelines – why Apache Mesos and DC/OS are a soluti...
OSDC 2018 | From batch to pipelines – why Apache Mesos and DC/OS are a soluti...
 
Operating Flink on Mesos at Scale
Operating Flink on Mesos at ScaleOperating Flink on Mesos at Scale
Operating Flink on Mesos at Scale
 
Alluxio Mesos Meetup - SMACK to SMAACK
Alluxio Mesos Meetup - SMACK to SMAACKAlluxio Mesos Meetup - SMACK to SMAACK
Alluxio Mesos Meetup - SMACK to SMAACK
 
DevOps vs. Site Reliability Engineering (SRE) in Age of Kubernetes
DevOps vs. Site Reliability Engineering (SRE) in Age of KubernetesDevOps vs. Site Reliability Engineering (SRE) in Age of Kubernetes
DevOps vs. Site Reliability Engineering (SRE) in Age of Kubernetes
 
Doing Dropbox the Native Cloud Native Way
Doing Dropbox the Native Cloud Native WayDoing Dropbox the Native Cloud Native Way
Doing Dropbox the Native Cloud Native Way
 
End User Computing with NetApp
End User Computing with NetAppEnd User Computing with NetApp
End User Computing with NetApp
 
Mesosphere and the Enterprise: Run Your Applications on Apache Mesos - Steve ...
Mesosphere and the Enterprise: Run Your Applications on Apache Mesos - Steve ...Mesosphere and the Enterprise: Run Your Applications on Apache Mesos - Steve ...
Mesosphere and the Enterprise: Run Your Applications on Apache Mesos - Steve ...
 
Hyperscale Computing, Enterprise Agility with Mesosphere
Hyperscale Computing, Enterprise Agility with MesosphereHyperscale Computing, Enterprise Agility with Mesosphere
Hyperscale Computing, Enterprise Agility with Mesosphere
 
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...
HBaseCon2017 Splice Machine as a Service: Multi-tenant HBase using DCOS (Meso...
 
Episode 2: Deploying Kubernetes at Scale
Episode 2: Deploying Kubernetes at ScaleEpisode 2: Deploying Kubernetes at Scale
Episode 2: Deploying Kubernetes at Scale
 
Episode 4: Operating Kubernetes at Scale with DC/OS
Episode 4: Operating Kubernetes at Scale with DC/OSEpisode 4: Operating Kubernetes at Scale with DC/OS
Episode 4: Operating Kubernetes at Scale with DC/OS
 
Operating Kubernetes at Scale (Australia Presentation)
Operating Kubernetes at Scale (Australia Presentation)Operating Kubernetes at Scale (Australia Presentation)
Operating Kubernetes at Scale (Australia Presentation)
 

Recently uploaded

#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 

Flink forward sf 17

  • 1. © 2017 Mesosphere, Inc. All Rights Reserved. 1 FlinkForward 2017 - San Francisco Flink meet DC/OS Deploying Apache Flink at Scale Elizabeth K. Joseph, @pleia2 Ravi Yadav, @RaaveYadav
  • 2. © 2017 Mesosphere, Inc. All Rights Reserved. 2 Part 1 Introduction to Apache Mesos, Marathon, and DC/OS Part 2 Demonstration of demo data pipeline + Installing Flink on DC/OS Part 3 DC/OS 1.9 key features for data services and beyond Talk Outline
  • 3. © 2017 Mesosphere, Inc. All Rights Reserved. 3 Apache Mesos: The datacenter kernel http://mesos.apache.org/
  • 4. © 2017 Mesosphere, Inc. All Rights Reserved. 4 ● Mesos can’t run applications on its own. ● A Mesos framework is a distributed system that has a scheduler. ● Schedulers like Marathon start and keep your applications running. A bit like a distributed init system. ● Mesos mechanics are fair and HA ● Learn more at https://mesosphere.github.io/marat hon/ Marathon
  • 5. © 2017 Mesosphere, Inc. All Rights Reserved. 5 Introducing DC/OS Solves common problems ● Resource management ● Task scheduling ● Container orchestration ● Self-healing infrastructure ● Logging and metrics ● Network management ● “Universe” of pre-configured apps (including Flink, Kafka…) ● Learn more and contribute at https://dcos.io/
  • 6. © 2017 Mesosphere, Inc. All Rights Reserved. 6 DC/OS Architecture Overview Security & Governance Container Orchestration Monitoring & Operations User Interface & Command Line HDFS Jenkins Marathon Cassandra Flink Spark Docker Kafka MongoDB +30 more... DC/OS Services & Containers ANY INFRASTRUCTURE
  • 7. © 2017 Mesosphere, Inc. All Rights Reserved. 7 Web-based GUI https://dcos.io/docs/lates t/usage/webinterface/ Interact with DC/OS (1/2)
  • 8. © 2017 Mesosphere, Inc. All Rights Reserved. 8 Universe
  • 9. © 2017 Mesosphere, Inc. All Rights Reserved. 9 CLI tool https://dcos.io/docs/latest/usage/cli/ API https://dcos.io/docs/latest/api/ Interact with DC/OS (2/2)
  • 10. © 2017 Mesosphere, Inc. All Rights Reserved. 10 According to the December 2016 data Artisans-organized Apache Flink user survey just under 30% of respondents were running Flink on Apache Mesos https://dcos.io/blog/2017/apache-flink-on-dc-os-and-apache-mesos/ You may already be using Apache Mesos! Version 1.2 of Flink includes support for Apache Mesos and DC/OS, “it is now possible to run an highly available Flink cluster on Mesos” https://flink.apache.org/news/2017/02/06/release-1.2.0.html#run-flink-with-apache- mesos & https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/mesos.html Flink on Apache Mesos and DC/OS
  • 11. © 2017 Mesosphere, Inc. All Rights Reserved. 11 DEMOSDemo data pipeline + Installing Flink on DC/OS
  • 12. © 2017 Mesosphere, Inc. All Rights Reserved. DC/OS 1.9 - Data Services Ecosystem WORKLOADS ● Pods ● GPU based scheduling DATA SERVICES ECOSYSTEM OPERATIONS ● Remote Container Shell ● Unified Metrics ● Deployment failure analyzer ● Alluxio ● Couchbase ● Datastax DSE ● Elastic (ELK) ● Redis ● Apache Flink
  • 13. © 2017 Mesosphere, Inc. All Rights Reserved. DC/OS 1.9 - Operations WORKLOADS ● Pods ● GPU based scheduling DATA SERVICES ECOSYSTEM OPERATIONS ● Remote Container Shell ● Unified Metrics ● Unified Logging ● Deployment Failure Debugging ● Upgrades & Configuration updates
  • 14. © 2017 Mesosphere, Inc. All Rights Reserved. 14 ● Open encrypted, interactive, remote session to your containers ● Remotely execute commands for real time app troubleshooting ● Provide developers access to their own applications, not the entire host or cluster my-laptop$ dcos task exec my-task /bin/bash Starting /bin/bash in my-task ... Connecting to remote my-task … REMOTE CONTAINER SHELL DC/OS: OPERATIONS DC/OS 1.9
  • 15. © 2017 Mesosphere, Inc. All Rights Reserved. 15 UNIFIED LOGGING DC/OS: OPERATIONS ● Access application, DC/OS and OS logs ● Easily troubleshoot applications with critical metadata such as container id and app id ● Integrate easily with existing logging systems DC/OS 1.9
  • 16. © 2017 Mesosphere, Inc. All Rights Reserved. 16 UNIFIED METRICS DC/OS: OPERATIONS ● Single API for system, container and application metrics ● Metadata such as host id and container id are automatically added to assist in debugging ● Integrate easily with existing metrics systems DC/OS 1.9 Container
  • 17. © 2017 Mesosphere, Inc. All Rights Reserved. 17 DEPLOYMENT FAILURE DEBUGGING DC/OS: OPERATIONS ● Understand why your application is not deploying ● Understand which nodes in the cluster can accommodate the role, constraints, cpu, mem, disk and port requirements for your app DC/OS 1.9
  • 18. © 2017 Mesosphere, Inc. All Rights Reserved. UPGRADES AND CONFIG UPDATES DC/OS: OPERATIONS DC/OS 1.9 18 ● Generate new config for cluster nodes $ dcos_generate_config.sh --generate-node-upgrade-script <installed_cluster_version> ● Single command upgrade script for individual nodes $ curl -O <Node upgrade script URL> $ sudo bash ./dcos_node_upgrade.sh
  • 19. © 2017 Mesosphere, Inc. All Rights Reserved. DC/OS 1.9 - Workloads WORKLOADS ● Pods ● GPU based scheduling DATA SERVICES ECOSYSTEM OPERATIONS ● Remote Container Shell ● Unified Logging ● Unified Metrics ● Deployment failure analyzer
  • 20. © 2017 Mesosphere, Inc. All Rights Reserved. 20 ● Schedule, deploy and scale multiple containers on the same host(s) while sharing IP address and storage volumes ● All containers in a pod instance run as if they are running on a single host in pre-container world ● Useful for migrating legacy applications or building advanced micro services (side car containers) PODS DC/OS: WORKLOADS DC/OS 1.9
  • 21. © 2017 Mesosphere, Inc. All Rights Reserved. 21 ● Traditional monolithic apps on VMs usually have support services such as log shipper, message queuing clients ● Many support services assume col-location on same host, and local-host access to networking and storage ● Pods simplify moving legacy monolithic apps to containers, reducing risk and accelerating migrations DC/OS 1.9 PODS: MIGRATING LEGACY APPS TO CONTAINERS DC/OS: WORKLOADS
  • 22. © 2017 Mesosphere, Inc. All Rights Reserved. 22 ● Advanced Micro Services patterns require colocating containers together ● Support services include for example: ○ Logging or monitoring agents, ○ Backup tooling & Proxies ○ Data change watchers & Event publishers ● Pods simplify the building and maintenance of complex such microservices DC/OS 1.9 PODS: SUPPORT SERVICES (SIDE-CAR CONTAINERS) DC/OS: WORKLOADS
  • 23. © 2017 Mesosphere, Inc. All Rights Reserved. GPU: WHY GPU? DC/OS: WORKLOADS DC/OS 1.9 ● GPUs are needed for many machine learning and deep learning applications ● GPUs are essential for real-time or near real time machine learning models ● GPUs deliver from 10X to 100X performance for some applications, resulting lower $$$/IOPS and more productivity to data science teams ● GPU applications include real time fraud detection, genome sequencing, cohort analysis and many others
  • 24. © 2017 Mesosphere, Inc. All Rights Reserved. GPU BASED SCHEDULING DC/OS: WORKLOADS DC/OS 1.9 24 ● Test Locally with Nvidia-Docker, deploy to production with DC/OS ● Isolate GPU instances and schedule workloads just like CPU and memory, guaranteeing performance ● Efficiently Share GPU resources across data science team ● Simplify migrating machine learning models across from dev to production, and across clouds
  • 25. © 2017 Mesosphere, Inc. All Rights Reserved. OTHER IMPROVEMENTS DC/OS DC/OS 1.9 25 ● Mesos 1.2 ● Marathon 1.4 ● Docker 1.12 and 1.13 (17.03-ce) support ● Centos 7.3 and CoreOS 1235.12.0 support ● Performance improvements across all networking features. ● CNI support for 3rd party CNI plugins. ● 100s of additional bugfixes and tests