This talk focuses on how to package, distribute and deploy Flink Jobs by leveraging existing Docker technology. Previously, deploying Flink Jobs was a manual and therefore error-prone task. We present an approach that works well in a CI/CD environment by automating most steps, from the code of a Flink Job in a repository to a running Job on a YARN cluster.
Flink Forward Berlin 2017: Dominik Bruhn - Deploying Flink Jobs as Docker Containers
1. Deploying Flink Jobs as Docker Containers
Dominik Bruhn - Director of Platform Engineering Relayr
Flink Forward Sep 12-14 2017 Berlin
2. Who am I?
● Director of Platform Engineering
● At Relayr
○ Industrial IoT Platform
○ Real Time Sensor Data is processed, stored, analyzed and presented
● Contributor to Apache Flink
4. What are the Requirements?
● Streaming Jobs
● Endless Jobs
● Deployed in Multiple Environments (Configuration Files)
● Single Deployed Artefact
● Deploy to YARN (EMR) cluster
● Repeatable Builds
5. What is the Idea?
“If it behaves like a service, package it like a service”
● Docker Container contains the Job + Flink + some scripting
● Docker Container submits and monitors the Flink Job in the YARN cluster
● Actual computation happens in the YARN cluster
● Docker Container stays attached to the job
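A minimal sketch of what "stays attached" could look like, assuming the entrypoint is written in Python (the talk only mentions an entrypoint shell script). The helper `run_attached` and the JAR path are hypothetical. The idea is that the submit command runs in the foreground and its exit code becomes the exit code of the container.

```python
import subprocess
import sys


def run_attached(submit_cmd):
    """Run the submit command in the foreground and return its exit code."""
    # Blocks until the child process ends, i.e. for as long as the
    # attached client keeps running.
    completed = subprocess.run(submit_cmd)
    return completed.returncode


def main():
    # Hypothetical submit command: the JAR path is a placeholder and
    # further flags depend on the Flink version and cluster setup.
    cmd = ["flink", "run", "-m", "yarn-cluster", "/opt/job/job.jar"]
    # A non-zero exit code makes the container itself fail.
    sys.exit(run_attached(cmd))
```

With this shape, whatever already restarts or alerts on a failed container can treat the Flink Job like any other service.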
6. Steps Necessary I - Building
[Diagram: build pipeline. Artefacts: Source, Flink Job, Docker Container. Steps: Compile, Test, Package, Upload]
7. Packaging
● Result from compilation: Job Fat JAR
○ Exclude Flink from the Fat JAR
● Package into a Docker Container:
○ Java
○ Apache Flink Distribution
○ Job Fat JAR
○ Configuration File (Templates)
○ If needed: Utilities for Configuration Fetching
○ Entrypoint Shell Script
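The configuration file templates could be filled in at container start roughly as follows. This is only a sketch: the template keys (`ENVIRONMENT`, `CHECKPOINT_DIR`), the values and the helper `render_config` are hypothetical, and the talk does not say which templating mechanism is used. Python's standard `string.Template` is assumed here.

```python
from string import Template


def render_config(template_text, values):
    """Fill ${NAME} placeholders; raises KeyError if a value is missing."""
    return Template(template_text).substitute(values)


# Hypothetical job configuration template baked into the image.
JOB_TEMPLATE = "environment=${ENVIRONMENT}\ncheckpoint.dir=${CHECKPOINT_DIR}\n"

# Hypothetical values, e.g. fetched per environment at container start.
rendered = render_config(
    JOB_TEMPLATE,
    {"ENVIRONMENT": "staging", "CHECKPOINT_DIR": "hdfs:///checkpoints/my-job"},
)
print(rendered, end="")
```

Failing on a missing value, rather than silently leaving a placeholder in the file, keeps a misconfigured environment from starting the Job at all.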
9. What does the Container do?
1. Fetch Configuration Values + YARN Credentials
2. Get YARN Configuration
3. Update Configuration File of Job + Flink
4. List YARN Jobs, find an old running instance of the Job
a. If found, kill it
5. Find out the last savepoint on HDFS
6. Start job on YARN from savepoint
7. Stay attached to the Job
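Steps 4 to 6 can be sketched as three small helpers. This is an illustration under assumptions, not the speaker's script: it assumes the entrypoint captures the text output of `yarn application -list` (tab-separated, application ID first, name second) and of `hdfs dfs -ls` on the savepoint directory (date, time and path as the last three fields). All function names, job names and paths are hypothetical.

```python
from datetime import datetime


def find_app_ids(yarn_list_output, job_name):
    """Return IDs of listed YARN applications whose name equals job_name."""
    app_ids = []
    for line in yarn_list_output.splitlines():
        columns = [c.strip() for c in line.split("\t")]
        # Data rows start with an ID such as application_1500000000000_0001.
        if len(columns) >= 2 and columns[0].startswith("application_"):
            if columns[1] == job_name:
                app_ids.append(columns[0])
    return app_ids


def latest_savepoint(hdfs_ls_output):
    """Return the most recently modified path, or None if nothing is listed."""
    newest = None
    for line in hdfs_ls_output.splitlines():
        fields = line.split()
        # Assumed fields: perms, replication, owner, group, size, date, time, path.
        if len(fields) < 8:
            continue  # e.g. the "Found N items" header line
        modified = datetime.strptime(fields[5] + " " + fields[6], "%Y-%m-%d %H:%M")
        if newest is None or modified > newest[0]:
            newest = (modified, fields[7])
    return None if newest is None else newest[1]


def build_run_command(jar_path, savepoint_path=None):
    """Build the submit command; -s resumes from a savepoint if one exists."""
    cmd = ["flink", "run", "-m", "yarn-cluster"]
    if savepoint_path is not None:
        cmd += ["-s", savepoint_path]
    return cmd + [jar_path]
```

Each ID returned by `find_app_ids` would then be passed to `yarn application -kill` before the command from `build_run_command` is started. Further `flink run` flags depend on the Flink version and cluster setup.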
10. What do we get from this?
● Flink Job is deployed like all other services
● Adapts to different environments
● Is monitored and fails like other services
11. What was left out?
● Packaging as a standalone Docker container (e.g. for testing)
What could be done in the future?
● Approach independent of YARN on a standalone Flink cluster
● Use other resource management tools instead of YARN, e.g. Kubernetes