Presented By: Piyush Rana
Technology Lead @ Knoldus Inc,
Canada
Keeping your business future-ready with extremely well-engineered systems through the
unwavering pursuit of high-quality engineers, processes, and practices
REACTIVE
PRODUCTS
Microservices & API
● Scala
● Functional Java
● Rust
● Elixir
● Haskell
ENTERPRISE
DATA PROGRAM
Data Lake
● Delta
ARTIFICIAL
INTELLIGENCE
Machine Learning
Data Science
Deep Learning
● Knime
● Tensorflow
● PyTorch
● Scikit
BLOCKCHAIN
● Hyperledger
● DAML
● ScoreX
Fast Data
● Spark
● Flink
● Hadoop
Agile
Transformation
Reactive UI/UX
Test Automation
Practice
Reactive DevOps
Product
Engineering
Agenda
Anatomy Of Flink Cluster
Flink Application Execution
Using Google Operator - Knoldus Contribution
Demo On Kubernetes
Task Slots and Resources
[1]
JOB MANAGER (Leader)
ResourceManager
Dispatcher
JobMaster
TASK MANAGER (Worker)
Why a Client?
It is used to send a dataflow to the JobManager.
[1]
Task Slots - To control how many tasks a TaskManager accepts.
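As a rough sizing sketch: assuming Flink's default slot sharing, where a job needs as many slots as its highest operator parallelism, the number of TaskManagers can be estimated from the slots each one offers. The function name and the example numbers are illustrative and not from the talk.

```python
import math


def task_managers_needed(parallelism: int, slots_per_task_manager: int) -> int:
    """Smallest number of TaskManagers whose slots cover the job's parallelism.

    Assumption: default slot sharing is on, so the job needs `parallelism`
    slots in total, regardless of how many operators it has.
    """
    if parallelism < 1 or slots_per_task_manager < 1:
        raise ValueError("parallelism and slots_per_task_manager must be positive")
    return math.ceil(parallelism / slots_per_task_manager)


if __name__ == "__main__":
    # A job with parallelism 8 on TaskManagers offering 3 slots each
    # needs 3 TaskManagers (9 slots, one left idle).
    print(task_managers_needed(parallelism=8, slots_per_task_manager=3))
```

Raising the slots per TaskManager packs more tasks into one JVM and reduces the TaskManager count, at the cost of weaker isolation between those tasks.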
SESSION CLUSTER JOB CLUSTER APPLICATION CLUSTER
FLINK APPLICATION EXECUTION
Flink Session Cluster
● Cluster Lifecycle: A pre-existing, long-running cluster that can accept multiple job submissions.
● Resource Isolation: Jobs compete for cluster resources, such as network bandwidth in the submit-job phase.
● Other Considerations: Suited to jobs with a short execution time that need a low startup time.
Flink Job Cluster
● Cluster Lifecycle: A new cluster is started for each submitted job and is available to that job only.
● Resource Isolation: Dedicated resources are assigned to each job, and a fatal error in the JobManager affects only the one job running in that Flink Job Cluster.
● Other Considerations: More suited to large, long-running jobs.
Flink Application Cluster
● Cluster Lifecycle: A dedicated Flink cluster that executes jobs from only one Flink Application, and where the main() method runs on the cluster rather than on the client.
● Resource Isolation: The ResourceManager and Dispatcher are scoped to a single Flink Application, which provides a better separation of concerns than the Flink Session Cluster.
● Other Considerations: A Flink Job Cluster can be seen as a "run-on-client" alternative to Flink Application Clusters.
Flink Operator
Advantages
Custom Resource Definition
Kubernetes Operator
Extend Kubernetes Control Plane
# Install google-flink-operator
`kubectl apply -f flink-operator-crd.yaml`
OR
`helm install flink-op charts/flink-operator`
Components
CRD FlinkCluster
Controller
[2]
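To make the FlinkCluster custom resource more concrete, here is a minimal sketch that builds a job-cluster manifest in Python and prints it as JSON. The field names follow the operator's v1beta1 sample manifests as best recalled and should be checked against the installed CRD; the cluster name, image tag, and JAR path are placeholders, and the helper function itself is hypothetical.

```python
import json


def flink_job_cluster(name, image, jar_file, parallelism, task_managers, slots_per_tm):
    """Return a FlinkCluster manifest as a plain dict.

    Assumption: apiVersion and field names match the operator's v1beta1
    samples; verify them against the CRD installed in your cluster.
    """
    return {
        "apiVersion": "flinkoperator.k8s.io/v1beta1",
        "kind": "FlinkCluster",
        "metadata": {"name": name},
        "spec": {
            "image": {"name": image},
            "jobManager": {"ports": {"ui": 8081}},
            "taskManager": {"replicas": task_managers},
            # Including a job section asks for a job cluster rather than
            # a session cluster.
            "job": {"jarFile": jar_file, "parallelism": parallelism},
            # Flink configuration values are passed as strings.
            "flinkProperties": {
                "taskmanager.numberOfTaskSlots": str(slots_per_tm)
            },
        },
    }


if __name__ == "__main__":
    manifest = flink_job_cluster(
        name="wordcount-sample",              # placeholder name
        image="flink:1.11",                   # placeholder image tag
        jar_file="./examples/streaming/WordCount.jar",
        parallelism=2,
        task_managers=2,
        slots_per_tm=1,
    )
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be piped into `kubectl apply -f -` once the operator and its CRD are installed.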
Contributor: AYUSH SINGHAL
Add Image Pull Secret And Node Selector
Add Resources In Job Template
Knolx - Knowledge Distribution - Knoldus Tech Hub
Demo - https://techhub.knoldus.com/dashboard/projects/flink
[1] Apache Flink Documentation - https://ci.apache.org/projects/flink/flink-docs-release-1.11/concepts/flink-architecture.html
[2] CNCF Webinar - https://www.cncf.io/wp-content/uploads/2020/08/CNCF-Webinar_-Apache-Flink-on-Kubernetes-Operator-1.pdf
Knoldus Website - https://www.knoldus.com/home
Stay in Touch

Flink Jobs Deployment On Kubernetes
