Flink Forward San Francisco 2018: Jörg Schad and Biswajit Das - "Operating Flink on Mesos at Scale"
2. © 2018 Mesosphere, Inc. All Rights Reserved.
Jörg Schad
Tech Lead Community @Mesosphere
@joerg_schad
@joerg.mesosphere
Biswajit Das
Chief Architect @Branch
biswajit@branch.io
3. Apache Mesos in a Nutshell
● Resource manager
○ Dynamic resource allocation
○ Running multiple applications
○ 2-level scheduling
● Fault-tolerant, battle-tested
● Scalable to 10,000+ nodes
● Created by Mesosphere founder @ UC Berkeley; used in production by 100+ web-scale companies [1]
[1] http://mesos.apache.org/documentation/latest/powered-by-mesos/
4. Why Flink & Mesos
● Mesos offers full functionality to implement fault-tolerant and elastic distributed applications
● 30% of survey respondents were running Flink on Mesos (September 2016, prior to proper Mesos support*)
● Other deployment models
○ Standalone
○ YARN
○ Kubernetes
*Kudos to Eron Wright for this work
5. Why Mesos?
Typical datacenter: siloed, over-provisioned servers, low utilization
[Figure: one silo of servers each for Kafka, Kubernetes, HDFS, Flink, and Flink Test]
7. Datacenter
Typical datacenter: siloed, over-provisioned servers, low utilization
Mesos / DC/OS: automated schedulers, workload multiplexing onto the same machines
[Figure: HDFS, Kubernetes, Kafka, Flink, and Flink 2 multiplexed onto the same machines]
9. Two-level Scheduling
1. Agents advertise resources to Master
2. Master offers resources to Framework
3. Framework rejects / uses resources
4. Agent reports task status to Master
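The four steps above can be sketched as a toy simulation. This is only an illustration of the offer cycle, not the Mesos API: the `Master`, `Framework`, and `Offer` classes, the agent names, and the resource figures are all made up for the example. Only the `TASK_RUNNING` state name is borrowed from Mesos.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Framework:
    """Hypothetical scheduler that accepts an offer if it fits one pending task."""
    name: str
    task_cpus: float
    task_mem_mb: int
    pending_tasks: int
    launched: list = field(default_factory=list)

    def on_offer(self, offer: Offer) -> bool:
        # Step 3: the framework, not the master, decides whether to use the offer.
        fits = offer.cpus >= self.task_cpus and offer.mem_mb >= self.task_mem_mb
        if self.pending_tasks > 0 and fits:
            self.pending_tasks -= 1
            self.launched.append(offer.agent)
            return True
        return False


class Master:
    def __init__(self):
        self.free = {}    # agent -> (cpus, mem_mb), filled in step 1
        self.status = {}  # (framework, agent) -> task state, filled in step 4

    def advertise(self, agent, cpus, mem_mb):
        # Step 1: an agent advertises its free resources to the master.
        self.free[agent] = (cpus, mem_mb)

    def offer_round(self, framework):
        # Step 2: the master offers each agent's free resources to the framework.
        for agent, (cpus, mem_mb) in list(self.free.items()):
            if framework.on_offer(Offer(agent, cpus, mem_mb)):
                self.free[agent] = (cpus - framework.task_cpus,
                                    mem_mb - framework.task_mem_mb)
                # Step 4: the agent reports the task status back to the master.
                self.status[(framework.name, agent)] = "TASK_RUNNING"


master = Master()
master.advertise("agent-1", cpus=4.0, mem_mb=8192)
master.advertise("agent-2", cpus=0.5, mem_mb=1024)

flink = Framework("flink", task_cpus=1.0, task_mem_mb=2048, pending_tasks=2)
master.offer_round(flink)
print(flink.launched, master.free)
```

In this run the framework uses the offer from `agent-1` and rejects the one from `agent-2`, which is too small for a task. One task therefore stays pending until a later offer round.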
MESOS ARCHITECTURE
[Figure: three Mesos Masters; framework schedulers for Flink, Spark, and Kafka; two Mesos Agents, one running Cassandra and Spark executors and tasks, the other running Docker and CDB executors with Docker and Spark tasks]
12. Flink Mesos Integration (old/simplified)
[Figure: Apache Flink framework (Mesos App Master containing the Flink Mesos ResourceManager and the JobManager), the Mesos Master, and two Mesos tasks each running a TaskManager. Arrows: Allocate Resources, Launch Mesos tasks, Register, Execute Job]
13. Flink Mesos Integration
[Figure: Client, Marathon, Flink Mesos Dispatcher, Flink Master Process (Flink Mesos ResourceManager and JobManager), Mesos Master, and Mesos tasks running TaskManagers inside the Mesos cluster]
(1) Marathon: start and monitor dispatcher
(2) Client: HTTP POST JobGraph/Jars
(3) Allocate container for Flink master
(4) Start process (and supervise)
(5) Request slots
(6) Allocate containers for TaskManagers
(7) Register
(8) Deploy tasks
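Step (1), Marathon starting and supervising the Flink entry point, might look roughly like the app definition built below. This is a sketch under assumptions: it relies on the `mesos-appmaster.sh` script shipped in Flink's `bin/` directory and on the `mesos.master` configuration key. The app id, install path, master address, and resource sizes are placeholders, not values from the talk.

```python
import json


def flink_master_app(app_id, flink_home, mesos_master,
                     cpus=1.0, mem_mb=1024, overrides=None):
    """Build a Marathon app definition that launches the Flink Mesos entry point.

    All argument values are supplied by the caller; nothing here is a
    recommended default for production.
    """
    opts = {"mesos.master": mesos_master}
    opts.update(overrides or {})
    # Flink configuration keys are passed to the script as -Dkey=value flags.
    flags = " ".join(f"-D{key}={value}" for key, value in sorted(opts.items()))
    return {
        "id": app_id,
        "cmd": f"{flink_home}/bin/mesos-appmaster.sh {flags}",
        "cpus": cpus,
        "mem": mem_mb,
        "instances": 1,  # Marathon restarts the process if it dies
    }


app = flink_master_app(
    "/flink",                # hypothetical app id
    "/opt/flink",            # hypothetical install path
    "leader.mesos:5050",     # assumed Mesos master address
    overrides={"taskmanager.numberOfTaskSlots": 2},
)
print(json.dumps(app, indent=2))
```

The resulting JSON would be posted to Marathon, which then covers the "start and monitor" part of the diagram.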
19. ➢ 50 streaming jobs
➢ Stream RPS: 120k
➢ 10B+ events/day
➢ 2.5 TB/day
➢ 200+ node Mesos cluster
➢ Marathon on Marathon
➢ Autoscaling with the custom tool x-scale and ASG
➢ Custom monitoring platform with Prometheus and ELK
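As a quick sanity check on these figures, the daily totals can be converted to per-second averages. The sketch assumes that the 2.5 TB/day and the 10B events/day describe the same stream and that TB means decimal terabytes. Neither is stated on the slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60

events_per_day = 10_000_000_000  # "10B+ events/day", lower bound
bytes_per_day = 2.5e12           # "2.5 TB/day", decimal terabytes assumed

avg_rps = events_per_day / SECONDS_PER_DAY
avg_event_bytes = bytes_per_day / events_per_day

print(f"average rate: {avg_rps:,.0f} events/sec")
print(f"average event size: {avg_event_bytes:.0f} bytes")
```

The average comes out a little under the quoted 120k per second. That is consistent if the event count is a lower bound or if 120k is closer to a peak rate than to a daily mean.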
21. Deployments
● Versioned app definition/job
● Immutable Docker tags
● Private Docker registry
● CI/CD
● No manual deployments to prod
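One way a CI/CD job could enforce versioned definitions and immutable tags is to generate the Marathon app definition from the commit being deployed and refuse mutable tags such as `latest`. The sketch below is illustrative only: the registry host, image name, and the rule "a tag must be a Git SHA" are assumptions, not the speakers' pipeline.

```python
import re

# Assumed convention: an immutable tag is a 7 to 40 character hex Git SHA.
_SHA_RE = re.compile(r"^[0-9a-f]{7,40}$")


def marathon_docker_app(app_id, registry, image, git_sha, cpus=1.0, mem_mb=1024):
    """Return a Marathon app definition pinned to one image build."""
    if not _SHA_RE.match(git_sha):
        raise ValueError(f"refusing mutable or malformed tag: {git_sha!r}")
    return {
        "id": app_id,
        "cpus": cpus,
        "mem": mem_mb,
        "instances": 1,
        "container": {
            "type": "DOCKER",
            "docker": {"image": f"{registry}/{image}:{git_sha}"},
        },
        # The label records which version of the definition is running.
        "labels": {"version": git_sha},
    }


app = marathon_docker_app(
    "/streaming/my-job",            # hypothetical app id
    "registry.example.com:5000",    # hypothetical private registry
    "my-flink-job",                 # hypothetical image name
    "3f2a9c1",
)
print(app["container"]["docker"]["image"])
```

Because the tag is derived from the commit, rolling back means redeploying an older definition instead of re-pushing an image under the same tag.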
22. HA Setup
● Use HDFS for HA setup
○ dcos package install hdfs
○ dcos hdfs endpoints
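With HDFS available, the HA-related entries in `flink-conf.yaml` might be rendered as below. This is a sketch: the configuration keys are Flink's `high-availability.*` options, which in Flink of this era also require a ZooKeeper quorum. The ZooKeeper address, name node address, and paths are placeholders to be replaced with the values reported by `dcos hdfs endpoints` and your cluster.

```python
def flink_ha_config(zk_quorum, hdfs_namenode, zk_root="/flink"):
    """Return HA settings that keep JobManager metadata on HDFS."""
    return {
        "high-availability": "zookeeper",
        "high-availability.zookeeper.quorum": zk_quorum,
        "high-availability.zookeeper.path.root": zk_root,
        # Recovery metadata lives on HDFS; ZooKeeper only stores pointers to it.
        "high-availability.storageDir": f"hdfs://{hdfs_namenode}/flink/ha",
    }


def render(conf):
    """Render the settings as flink-conf.yaml lines."""
    return "\n".join(f"{key}: {value}" for key, value in conf.items())


conf = flink_ha_config(
    zk_quorum="master.mesos:2181",          # assumed ZooKeeper address
    hdfs_namenode="namenode.example:9001",  # hypothetical name node endpoint
)
print(render(conf))
```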
23. Containerization
● Which container runtime?
○ UCR vs Docker
○ With UCR, no need to build Docker images
{
  "id": "/flink-app",
  "cmd": "$JAVA_HOME/bin/java -jar MyApp.jar",
  "instances": 1,
  "fetch": [
    { "uri": "http://…/MyApp.jar" },
    { "uri": "https://.../jre-8u121-linux-x64.tar.gz" }
  ]
}
24. Containerization
● JVM and containers
○ Older JVMs are not aware of cgroups
○ Much better with JDK 9 & 10
○ Override JVM default values
https://cloakable.irdeto.com/2017/08/24/java-is-a-first-class-citizen-in-a-docker-ecosystem-now/
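A sketch of what overriding the JVM defaults can mean in practice is to pick memory flags by JDK version. It assumes the experimental `-XX:+UseCGroupMemoryLimitForHeap` flag for JDK 9 and `-XX:MaxRAMPercentage` for JDK 10 and later, where container support is on by default. For older JVMs it falls back to an explicit `-Xmx`. The helper name and the 75% heap fraction are illustrative choices, not recommendations from the talk.

```python
def jvm_memory_flags(jdk_major, container_mem_mb, heap_fraction=0.75):
    """Suggest JVM memory flags for a container with the given memory limit."""
    if jdk_major >= 10:
        # The JVM reads the cgroup limit itself; only set the share used for heap.
        return [f"-XX:MaxRAMPercentage={heap_fraction * 100:.1f}"]
    if jdk_major == 9:
        # Experimental flag that derives the max heap from the cgroup limit.
        return ["-XX:+UnlockExperimentalVMOptions",
                "-XX:+UseCGroupMemoryLimitForHeap"]
    # Older JVMs size the heap from host memory, so set it explicitly.
    heap_mb = int(container_mem_mb * heap_fraction)
    return [f"-Xmx{heap_mb}m"]


for version in (8, 9, 10):
    print(version, " ".join(jvm_memory_flags(version, container_mem_mb=2048)))
```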
25. Resource Allocation
● Depends on the job you are running :)
○ Monitor usage/allocation
● Memory
○ Account for overhead on top of the heap
● Flexibility thanks to FLIP-6
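The "overhead on top of the heap" point can be made concrete with a small calculation. The sketch below is loosely modelled on Flink's `containerized.heap-cutoff-ratio` and `containerized.heap-cutoff-min` settings. The default values used here (25% and 600 MB) are assumptions to verify against the Flink version in use.

```python
def heap_for_container(container_mem_mb, cutoff_ratio=0.25, cutoff_min_mb=600):
    """Return the JVM heap size left after reserving non-heap overhead."""
    # Reserve the larger of a fixed minimum and a fraction of the container.
    cutoff_mb = max(cutoff_min_mb, int(container_mem_mb * cutoff_ratio))
    if cutoff_mb >= container_mem_mb:
        raise ValueError("container too small to leave any heap after the cutoff")
    return container_mem_mb - cutoff_mb


for size_mb in (1024, 4096):
    print(f"{size_mb} MB container -> {heap_for_container(size_mb)} MB heap")
```

For small containers the fixed minimum dominates, so a 1024 MB container leaves well under half of its memory for heap. Monitoring actual usage is the way to tell whether the reserve is too generous or too tight for a given job.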
26. Multi-User: Quota
● Share resources between multiple frameworks/jobs
● Without static partitioning
● One role per job/entity
● Use quota per role
● Min and max resource allocation
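A per-role quota request might be built as follows. The JSON shape is assumed to follow the Mesos `/quota` operator endpoint of that period, which takes a role plus a list of guaranteed scalar resources. Check it against the documentation for your Mesos version. The role name and the amounts are made up for the example.

```python
import json


def quota_request(role, cpus, mem_mb):
    """Build the JSON body for a quota guarantee on one role."""
    return {
        "role": role,
        "guarantee": [
            {"name": "cpus", "type": "SCALAR", "scalar": {"value": cpus}},
            {"name": "mem", "type": "SCALAR", "scalar": {"value": mem_mb}},
        ],
    }


body = quota_request("flink-job-a", cpus=12, mem_mb=6144)  # hypothetical role
print(json.dumps(body, indent=2))
```

Giving each job or team its own role keeps the accounting separate while all roles still share the same pool of agents.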
28. Configuration Changes and Updates
Currently manual changes and redeploy
● Checkpoints
● Parallel deployments
29. Demo
[Figure: pipeline from Generator to Display]
1. Financial data created by generator
2. Written to Kafka topics
3. Kafka topics consumed by Flink
4. Results written back into Kafka stream (another topic)
7. Results displayed
30. Special Thanks to All Collaborators
Till Rohrmann
Eron Wright
Robin Oh
Mischa Krüger
...
● Contribute!
○ Flink
○ Flink/Mesos
○ DC/OS package
○ Documentation
○ ...