Apache Flink Crash Course
Slim Baltagi & Srini Palthepu
with some materials from data-artisans.com
Chicago Apache Flink Meetup
August 4th, 2015
2
“One week of trials and errors can
save you up to half an hour of
reading the documentation.”
Anonymous
3
For an overview of Apache Flink, see our slides at http://goo.gl/gVOSp8
[Flink stack diagram]
• APIs & Libraries: Gelly, Table, ML, SAMOA; DataSet (Java/Scala/Python) for batch processing; DataStream (Java/Scala) for stream processing; Hadoop M/R compatibility; Google Dataflow (WiP), MRQL, Cascading (WiP), Zeppelin
• System: Runtime (distributed streaming dataflow), Batch Optimizer, Stream Builder
• Deploy: Local (single JVM, embedded, Docker), Cluster (standalone; YARN, Tez, Mesos (WIP)), Cloud (Google's GCE, Amazon's EC2, IBM Docker Cloud, …)
• Storage: Files (local, HDFS, S3, Tachyon), Databases (MongoDB, HBase, SQL, …), Streams (Flume, Kafka, RabbitMQ, …)
4
In this talk, we will cover practical steps for:
 Setup and configuration of your Apache
Flink environment
 Using Flink tools
 Learning Flink’s APIs & Domain Specific
Libraries through
 Some Apache Flink program
examples
 Free Training from Data Artisans
in Java and Scala
 Writing, testing, debugging, deploying and
tuning your Flink applications
5
Agenda
1. How to setup and configure your Apache Flink
environment?
2. How to use Apache Flink tools?
3. How to learn Apache Flink’s APIs and its domain
specific libraries?
4. How to set up your IDE (IntelliJ IDEA or Eclipse) for
Apache Flink?
5. How to write, test and debug your Apache
Flink program in an IDE?
6. How to deploy your Apache Flink application in
local, in a cluster or in the cloud?
7. How to tune your Apache Flink application?
6
1. How to setup and configure your Apache
Flink environment?
1.1 Local (on a single machine)
1.2 VM image (on a single machine)
1.3 Docker
1.4 Standalone Cluster
1.5 YARN Cluster
1.6 Cloud
7
1.1 Local (on a single machine)
Flink runs on Linux, OS X and Windows.
In order to execute a program on a running Flink
instance (and not from within your IDE) you need
to install Flink on your machine.
The following steps will be detailed for both Unix-
like (Linux, OS X) and Windows
environments:
1.1.1 Verify requirements
1.1.2 Download
1.1.3 Unpack
1.1.4 Check the unpacked archive
1.1.5 Start a local Flink instance
1.1.6 Validate Flink is running
1.1.7 Run a Flink example
1.1.8 Stop the local Flink instance
8
1.1 Local (on a single machine)
1.1.1 Verify requirements
The machine that Flink will run on must have Java
1.6.x or higher installed.
In a Unix-like environment, the $JAVA_HOME
environment variable must be set. Check the correct
installation of Java by issuing the command:
java -version, and check that $JAVA_HOME is set
by issuing: echo $JAVA_HOME. If
needed, follow the instructions for installing Java
and setting JAVA_HOME here:
http://docs.oracle.com/cd/E19182-01/820-7851/inst_cli_jd
9
1.1 Local (on a single machine)
In a Windows environment, check the correct
installation of Java by issuing the command:
java -version. Also, the bin folder of
your Java Runtime Environment must be included in
Windows' %PATH% variable. If needed, follow
this guide to add Java to the path variable:
http://www.java.com/en/download/help/path.xml
1.1.2 Download the latest stable release of Apache
Flink from http://flink.apache.org/downloads.html
For example, in a Linux-like environment, run the
following command:
wget https://www.apache.org/dist/flink/flink-0.9.0/flink-0.9.0-bin-hadoop2.tgz
10
1.1 Local (on a single machine)
1.1.3 Unpack the downloaded .tgz archive
Example:
$ cd ~/Downloads # Go to download directory
$ tar -xvzf flink-*.tgz # Unpack the downloaded archive
1.1.4. Check the unpacked archive
$ cd flink-0.9.0
The resulting folder contains a Flink setup that can
be locally executed without any further
configuration.
flink-conf.yaml under flink-0.9.0/conf contains the
default configuration parameters that allow Flink to
run out-of-the-box in single node setups.
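For orientation, a single-node conf/flink-conf.yaml typically contains entries like the following (the values shown are illustrative defaults for the 0.9 line; check your own file for the exact numbers):

```yaml
jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
jobmanager.heap.mb: 256
taskmanager.heap.mb: 512
taskmanager.numberOfTaskSlots: 1
parallelism.default: 1
jobmanager.web.port: 8081
```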
11
1.1 Local (on a single machine)
12
1.1 Local (on a single machine)
1.1.5. Start a local Flink instance:
• Given that you have a local Flink installation,
you can start a Flink instance that runs a master
and a worker process on your local machine in a
single JVM. This execution mode is useful for
local testing.
• On UNIX-Like system you can start a Flink instance as
follows:
 cd /to/your/flink/installation
 ./bin/start-local.sh
13
1.1 Local (on a single machine)
1.1.5. Start a local Flink instance:
On Windows you can either start with:
• Windows Batch Files by running the following
commands
 cd C:\to\your\flink\installation
 .\bin\start-local.bat
• or with Cygwin and Unix Scripts: start the Cygwin
terminal, navigate to your Flink directory and run
the start-local.sh script
 $ cd /cygdrive/c
 cd flink
 $ bin/start-local.sh
14
1.1 Local (on a single machine)
The JobManager (the master of the distributed system)
automatically starts a web interface to observe program
execution. It runs on port 8081 by default (configured
in conf/flink-conf.yaml): http://localhost:8081/
1.1.6 Validate that Flink is running
You can validate that a local Flink instance is running
by:
• Issuing the following command: $jps
jps: java virtual machine process status tool
• Looking at the log files in ./log/
$tail log/flink-*-jobmanager-*.log
• Opening the JobManager’s web interface at
http://localhost:8081
15
1.1 Local (on a single machine)
1.1.7 Run a Flink example
• On UNIX-Like system you can run a Flink example as follows:
 cd /to/your/flink/installation
 ./bin/flink run ./examples/flink-java-examples-0.9.0-
WordCount.jar
• With Windows batch files, open a second terminal and run the
following commands:
 cd C:\to\your\flink\installation
 .\bin\flink.bat run .\examples\flink-java-examples-0.9.0-WordCount.jar
1.1.8 Stop local Flink instance
•On UNIX you call ./bin/stop-local.sh
•On Windows you quit the running process with Ctrl+C
16
1.2 VM image (on a single machine)
Download Flink Virtual Machine from:
https://docs.google.com/uc?id=0B-oU5Z27sz1hZ0VtaW5idFViNU0&export=download
The password is: flink
This version works with VMware Fusion on
OS X since there is no VMware player for
OSX.
https://www.vmware.com/products/fusion/fusion-evaluation.html
17
1.3 Docker
 Apache Flink cluster deployment on Docker using
Docker-Compose, by Romeo Kienzler. Talk at the
Apache Flink Meetup Berlin planned for August 26,
2015: http://www.meetup.com/Apache-Flink-Meetup/events/2239133/
The talk will:
• Introduce the basic concepts on container
isolation exemplified on Docker
• Explain how Apache Flink is made elastic using
Docker-Compose.
• Show how to push the cluster to the cloud
exemplified on the IBM Docker Cloud.
18
1.4 Standalone Cluster
 See the quick start cluster setup:
https://ci.apache.org/projects/flink/flink-docs-release-0.9/quickstart/setup_quickstart.html#setup
See instructions on how to run Flink in a fully
distributed fashion on a cluster. This involves
two steps:
• Installing and configuring Flink
• Installing and configuring the Hadoop
Distributed File System (HDFS)
https://ci.apache.org/projects/flink/flink-docs-master/setup/cluster_setup.ht
19
1.5 YARN Cluster
You can easily deploy Flink on your
existing YARN cluster.
Download the Flink Hadoop2 package: Flink
with Hadoop 2
http://www.apache.org/dyn/closer.cgi/flink/
Make sure
your HADOOP_HOME (or YARN_CONF_DIR or
HADOOP_CONF_DIR) environment
variable is set to read your YARN and HDFS
configuration.
20
1.5 YARN Cluster
Run the YARN client with:
./bin/yarn-session.sh
You can run the client with options -n 10 -tm
8192 to allocate:
 10 TaskManagers
with 8GB of memory each.
For more detailed instructions, check out the
documentation: https://ci.apache.org/projects/flink/flink-docs-master/se
21
1.6 Cloud
1.6.1 Google Compute Engine (GCE)
1.6.2 Amazon EMR
22
1.6 Cloud
1.6.1 Google Compute Engine
Free trial for Google Compute Engine:
https://cloud.google.com/free-trial/
Enjoy your $300 in GCE for 60 days!
Now, how do you set up Flink with Hadoop 1 or
Hadoop 2 on top of a Google Compute Engine
cluster? Google's bdutil starts a cluster and
deploys Flink with Hadoop.
To get started, just follow the steps here:
https://ci.apache.org/projects/flink/flink-docs-master/setup/gce_setup.html
23
1.6 Cloud
1.6.2 Amazon EMR
Amazon Elastic MapReduce (Amazon EMR) is
a web service providing a managed Hadoop
framework.
• http://aws.amazon.com/elasticmapreduce/
• http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-what-is
emr.html
•Example: Use Stratosphere with Amazon
Elastic MapReduce, February 18, 2014, by
Robert Metzger: https://flink.apache.org/news/2014/02/18/amazon-elastic-mapreduce-cloud-yarn.html
24
1.3 Docker
Docker can be used for local development.
Resource requirements on data-processing
clusters often exhibit high variation;
elastic deployments reduce TCO (total cost
of ownership).
Container-based virtualization is lightweight
and portable ("build once, run anywhere"),
eases packaging of applications, and is
automated, scripted, and isolated.
Apache Flink cluster deployment on Docker
using Docker-Compose
https://github.com/streamnsight/docker-flink
25
2. How to use Apache Flink tools?
2.1 Command-Line Interface (CLI)
2.2 Job Client Web Interface
2.3 Job Manager Web Interface
2.4 Interactive Scala Shell
2.5 Zeppelin Notebook
26
2.1 Command-Line Interface (CLI)
 Example:
./bin/flink run ./examples/flink-java-examples-0.9.0-WordCount.jar
 bin/flink has 4 major actions:
• run # runs a program
• info # displays information about a program
• list # lists running (-r) and scheduled (-s) programs,
e.g. ./bin/flink list -r -s
• cancel # cancels a running program, given its job ID
 See more examples:
https://ci.apache.org/projects/flink/flink-docs-master/apis/cli.html
27
2.2 Job Client Web Interface
Flink provides a web interface to:
• Upload jobs
• Inspect their execution plans
• Execute them
• Showcase programs
• Debug execution plans
• Demonstrate the system as a whole
The web interface runs on port 8080 by default.
To specify a custom port, set
the webclient.port property in the
./conf/flink-conf.yaml configuration file.
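For instance, to move the web client off its default port, one could set (the port value here is purely illustrative):

```yaml
webclient.port: 8090
```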
28
2.2 Job Client Web Interface
Start the web interface by executing:
./bin/start-webclient.sh
Stop the web interface by executing:
./bin/stop-webclient.sh
• Jobs are submitted to the JobManager
specified
by jobmanager.rpc.address and jobmanager.rpc.port
• For more details and further configuration
options, please consult this webpage:
https://ci.apache.org/projects/flink/flink-docs-release-0.9/setup/config.html#webclient
29
2.3   Job Manager Web Interface
 The JobManager (the master of the 
distributed system) starts a web interface 
to observe program execution. 
 It runs on port 8081 by default (configured
in conf/flink-conf.yaml).
 Open the JobManager’s web interface 
at 
 http://localhost:8081
• jobmanager.rpc.port   6123 
• jobmanager.web.port 8081 
30
2.3   Job Manager Web Interface
Overall system status
Job execution details
Task Manager resource
utilization
31
2.3 Job Manager Web Interface
The JobManager web frontend allows you to:
• Track the progress of a Flink program, as
all status changes are also logged to the
JobManager's log file.
• Figure out why a program failed: it
displays the exceptions of failed tasks
and helps identify which parallel task
failed first and caused the other tasks to
cancel the execution.
32
2.4   Interactive Scala Shell
Flink comes with an Interactive Scala Shell - REPL 
( Read Evaluate Print Loop ) :
 ./bin/start-scala-shell.sh
Interactive queries
Lets you explore data quickly
Complete Scala API available
It can be used in a local setup as well as in a 
cluster setup. 
The Flink Shell comes with command history and 
auto completion.
So far only batch mode is supported. There are plans
to add streaming support in the future:
https://ci.apache.org/projects/flink/flink-docs-master/scala_shell.html
 
33
2.4   Interactive Scala Shell
bin/start-scala-shell.sh --host localhost --port 6123
34
2.4   Interactive Scala Shell
Example 1: 
Scala-Flink> val input = env.fromElements(1,2,3,4)
Scala-Flink> val doubleInput = input.map(_ *2)
Scala-Flink> doubleInput.print()
Example 2: 
Scala-Flink> val text = env.fromElements( "To be, or not
to be,--that is the question:--", "Whether 'tis nobler in the
mind to suffer", "The slings and arrows of outrageous
fortune", "Or to take arms against a sea of troubles,")
Scala-Flink> val counts = text.flatMap { _.toLowerCase.split("\\W+") }.map { (_, 1) }.groupBy(0).sum(1)
Scala-Flink> counts.print()
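The tokenize-and-count logic in Example 2 can be checked without a Flink cluster. The following plain-Java sketch mirrors it (lowercase, split on non-word characters with "\\W+", sum per token); it illustrates the logic only and is not Flink API code:

```java
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {

    // Mirrors the shell example: lowercase, split on non-word
    // characters ("\\W+"), and sum a count per token.
    public static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String token : line.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(new String[] {
            "To be, or not to be,--that is the question:--"
        });
        System.out.println(counts.get("be")); // 2
        System.out.println(counts.get("to")); // 2
    }
}
```

Note why the regex must be "\\W+" in source code: the backslash has to be escaped inside a Java/Scala string literal.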
35
2.4   Interactive Scala Shell
Problems with the Interactive Scala Shell:
No visualization
No saving  
No replaying of written code
No assistance as in an IDE
36
2.5   Zeppelin Notebook
Web-based interactive computation 
environment 
Combines rich text, execution code, plots 
and rich media 
Exploratory data science
Storytelling 
37
2.5   Zeppelin Notebook
http://localhost:8080/
38
3. How to learn Flink’s APIs and libraries?
3.1 How to run the examples in the Apache 
Flink bundle?
3.2 How to learn Flink Programming APIs? 
3.3 How to learn Apache Flink Libraries?
39
3.1 How to run the examples in the Apache 
Flink bundle?
3.1.1 Where are the examples?
3.1.2 Where are the related source 
codes?
3.1.3 How to re-build these examples?
3.1.4 How to run these examples?
40
3.1 How to run the examples in the Apache 
Flink bundle?
3.1.1    Where are the examples?
41
3.1 How to run the examples in the Apache 
Flink bundle?
The examples provided in the Flink bundle 
showcase different applications of Flink from 
simple word counting to graph algorithms.
They illustrate the use of Flink’s API.
 They are a very good way to learn how to 
write Flink jobs. 
A good starting point would be to modify 
them!
Now, where are the related source codes!?
42
3.1 How to run the examples in the Apache 
Flink bundle?
3.1.2    Where are the related source codes?
You can find the source code of these
Flink examples in the flink-java-examples or
the flink-scala-examples folder of the
flink-examples module of the source release of
Flink.
You can also access the source (and 
hence the examples) through GitHub: 
https://github.com/apache/flink/tree/master/flink-examples
43
3.1 How to run the examples in the Apache 
Flink bundle?
3.1.2    Where are the related source codes?
If you don't want to import the whole Flink project 
just for playing around with the examples, you can:
• Create an empty maven project. This script will 
automatically set everything up for you: $ curl 
http://flink.apache.org/q/quickstart.sh | bash
• Import the "quickstart" project into Eclipse or 
IntelliJ. It will download all dependencies and 
package everything correctly. 
• If you want to use an example there, just copy the 
Java file into the "quickstart" project. 
44
3.1 How to run the examples in the Apache 
Flink bundle?
3.1.3    How to re-build these examples?
To build the examples, you can run:
"mvn clean package -DskipTests"
in the "flink-examples/flink-java-examples"
directory.
This will re-build them.
45
3.1 How to run the examples in the Apache 
Flink bundle?
3.1.4 How to run these examples?
How to display the command line arguments?
 ./bin/flink info ./examples/flink-java-examples-0.9.0-WordCount.jar
Example of running an example:
./bin/flink run ./examples/flink-java-examples-0.9.0-WordCount.jar
More on the bundled examples:
https://ci.apache.org/projects/flink/flink-docs-master/apis/examples.html#running-an-example
46
3.2 How to learn Flink Programming APIs? 
3.2.1 DataSet API
3.2.2 DataStream API
3.2.3 Table API - Relational Queries
47
3.2 How to learn Flink Programming APIs? 
3.2.1 DataSet API
https://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html
https://ci.apache.org/projects/flink/flink-docs-master/api/java/
FREE Apache Flink Training by Data Artisans:
DataSet API Basics
• Lecture:
Slides: http://dataartisans.github.io/flink-training/dataSetBasics/slides.html
Video: https://www.youtube.com/watch?v=1yWKZ26NQeU
• Exercise: http://dataartisans.github.io/flink-training/dataSetBasics/handsOn.html
48
3.2 How to learn Flink Programming APIs? 
3.2.1 DataSet API
DataSet API Advanced
• Lecture:
Slides: http://dataartisans.github.io/flink-training/dataSetAdvanced/slides.html
Video: https://www.youtube.com/watch?v=1yWKZ26NQeU
• Exercise: http://dataartisans.github.io/flink-training/dataSetAdvanced/handsOn.html
49
3.2 How to learn Flink Programming APIs? 
3.2.2 DataStream API
https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html
https://ci.apache.org/projects/flink/flink-docs-master/api/java/
Example 1: Event pattern detection with Apache Flink
This is a Flink streaming demo given by Data Artisans
on July 17, 2015, titled 'Apache Flink: Unifying batch
and streaming modern data analysis' at the Bay Area
Apache Flink Meetup:
• Related code: https://github.com/StephanEwen/flink-demos/tree/master/streaming-state-machine
• Related slides: http://www.slideshare.net/KostasTzoumas/first-flink-bay-area-meetup
• Related video recording: https://www.youtube.com/watch?v=BJjGD8ijJcg
50
3.2 How to learn Flink Programming APIs? 
3.2.2 DataStream API
Example 2: Fault-Tolerant Streaming with Flink
Slides 16-23 http://www.slideshare.net/AljoschaKrettek/flink-010-upcoming-features
Code https://github.com/aljoscha/flink-fault-tolerant-stream-example
This is a demo to show how Flink can deal with stateful 
streaming jobs and fault-tolerance. 
Example 3: Flink-Storm compatibility examples:
https://github.com/apache/flink/tree/master/flink-contrib/flink-storm-compatibility/flink-storm-compatibili
examples
51
3.2 How to learn Flink Programming APIs? 
3.2.2 DataStream API
Example 4: Data Stream Analytics with Flink
http://net.t-labs.tu-berlin.de/~nsemmler/blog//flink/2015/03/02/Data-Stream-Analysis-with-flink.html
Example 5: Introducing Flink Streaming
http://flink.apache.org/news/2015/02/09/streaming-example.html
Examples from the code base: flink-streaming-examples
https://github.com/apache/flink/tree/master/flink-staging/flink-streaming/flink-streaming-examples/s
52
3.2 How to learn Flink Programming APIs? 
3.2.3 Table API - Relational Queries
https://ci.apache.org/projects/flink/flink-docs-master/libs/table.html
To use the Table API in a project:
• First set up a Flink program:
https://ci.apache.org/projects/flink/flink-docs-master/apis
• Add this to the dependencies section of your
pom.xml:
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table</artifactId>
  <version>0.10-SNAPSHOT</version>
</dependency>
Table is not currently part of the binary distribution. You need to
link it for cluster execution:
https://ci.apache.org/projects/flink/flink-docs-master/apis/cluster_execution.html#linking-with-modules
53
3.2 How to learn Flink Programming APIs? 
3.2.3 Table API - Relational Queries           
FREE Apache Flink Training by Data Artisans – Table 
API 
• Lecture: http://www.slideshare.net/dataArtisans/flink-table
• Exercise: http://dataartisans.github.io/flink-training/tableApi/handsOn.html
See also the example on Log Analysis in slides 36-43:
http://www.grid.ucy.ac.cy/file/Talks/talks/DeepAnalysiswithApacheFlink_2nd_cloud_workshop.pdf
54
3.3 Apache Flink Domain Specific Libraries
3.3.1 FlinkML - Machine Learning for Flink
3.3.2 Gelly - Graph Analytics for Flink
55
3.3 Apache Flink Libraries
3.3.1 FlinkML - Machine Learning for Flink
https://ci.apache.org/projects/flink/flink-docs-master/libs/ml/
FlinkML – Quickstart Guide:
https://ci.apache.org/projects/flink/flink-docs-master/libs/ml/quickstart.html
To use FlinkML in a project:
• First set up a Flink program:
https://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#linking-with-f
• Add this to the dependencies section of your pom.xml:
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-ml</artifactId>
  <version>0.10-SNAPSHOT</version>
</dependency>
56
3.3 Apache Flink Libraries
3.3.1 FlinkML - Machine Learning for  Flink
Quick Start: Run K-Means Example:
https://ci.apache.org/projects/flink/flink-docs-master/quickstart/run_example_quickstart.html
Computing Recommendations at Extreme Scale with
Apache Flink:
http://data-artisans.com/computing-recommendations-at-extreme-scale-with-apache-flink/ and
related code:
https://github.com/tillrohrmann/flink-perf/blob/ALSJoinBlockingUnified/flink-jobs/src/main/scala/com/g
ALSJoinBlocking.scala
Naive Bayes on Apache Flink:
http://www.itshared.org/2015/03/naive-bayes-on-apache-flink.html
FlinkML is not currently part of the binary distribution. 
You need to link it for cluster execution: 
https://ci.apache.org/projects/flink/flink-docs-master/apis/cluster_execution.html#linking-with-modules
57
3.3 Apache Flink Libraries
3.3.2 Gelly: Flink Graph API
https://ci.apache.org/projects/flink/flink-docs-master/libs/gelly_guide.html
To use Gelly in a project:
• First set up a Flink program:
https://ci.apache.org/projects/flink/flink-docs-master/api
flink
• Add this to the dependencies section of your
pom.xml
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-gelly</artifactId>
  <version>0.10-SNAPSHOT</version>
</dependency>
58
3.3 Apache Flink Libraries
Gelly Examples:
https://github.com/apache/flink/tree/master/flink-staging/flink-gelly/src/main/java/org/apache/flink/g
example
Gelly exercise & solution
Gelly API - PageRank on Reply Graph
http://dataartisans.github.io/flink-training/exercises/replyGraphGelly.html
Gelly is not currently part of the binary distribution.
You need to link it for cluster execution:
https://ci.apache.org/projects/flink/flink-docs-master/apis/cluster_execution.html#linking-with-mod
59
4. How to set up your IDE (IntelliJ IDEA or
Eclipse) for Apache Flink?
4.1 How to set up your IDE (IntelliJ IDEA)?
4.2 How to set up your IDE (Eclipse)?
Flink uses mixed Scala/Java projects, which
pose a challenge to some IDEs
Minimal requirements for an IDE are:
• Support for Java and Scala (also mixed projects)
• Support for Maven with Java and Scala
60
4.1 How to set up your IDE (IntelliJ IDEA)?
IntelliJ IDEA supports Maven out of the box
and offers a plugin for Scala development.
IntelliJ IDEA Download: https://www.jetbrains.com/idea/download/
IntelliJ Scala Plugin: http://plugins.jetbrains.com/plugin/?id=1347
Check out the Setting up IntelliJ IDEA guide for details:
https://github.com/apache/flink/blob/master/docs/internals/ide_setup.md#intellij-idea
Screencast: Run Apache Flink WordCount
from IntelliJ https://www.youtube.com/watch?v=JIV_rX-OIQM
61
4.2 How to set up your IDE (Eclipse)?
• For Eclipse users, Apache Flink committers
recommend using Scala IDE 3.0.3, based on
Eclipse Kepler.
• While this is a slightly older version, they
found it to be the version that works most
robustly for a complex project like Flink. One
restriction is, though, that it works only with
Java 7, not with Java 8.
• Check out the Eclipse setup docs:
https://github.com/apache/flink/blob/master/docs/internals/ide_setup.md#eclipse
62
5. How to write, test and debug your
Apache Flink program in an IDE?
5.1 How to write a Flink program?
5.1.1 How to generate a Flink project with
Maven?
5.1.2 How to import the Flink Maven project
into IDE
5.1.3 How to use logging?
5.1.4 FAQs and best practices related to
coding
5.2 How to test your Flink program?
5.3 How to debug your Flink program?
63
5.1 How to write a Flink program in an
IDE?
The easiest way to get a working setup to
develop (and locally execute) Flink programs
is to follow the Quick Start guide:
https://ci.apache.org/projects/flink/flink-docs-master/quickstart/java_api_quickstart.html
https://ci.apache.org/projects/flink/flink-docs-master/quickstart/scala_api_quickstart.html
It uses a Maven archetype to configure and
generate a Flink Maven project.
This will save you time dealing with transitive
dependencies!
This Maven project can be imported into your
IDE.
64
5.1 How to write a Flink program in an IDE?
5.1.1 How to generate a skeleton Flink project
with Maven?
Generate a skeleton project with Maven to get
started:
mvn archetype:generate \
  -DarchetypeGroupId=org.apache.flink \
  -DarchetypeArtifactId=flink-quickstart-java \
  -DarchetypeVersion=0.9.0
(You can also put "flink-quickstart-scala" as the
artifact ID, and "0.10-SNAPSHOT" as the version.)
 No need for manually downloading any .tgz
or .jar files for now.
65
5.1 How to write a Flink program in an
IDE?
5.1.1 How to generate a skeleton Flink project
with Maven?
The generated projects are located in a folder
called flink-java-project or flink-scala-project.
In order to test the generated projects and to download
all required dependencies run the following commands
(change flink-java-project to flink-scala-project for Scala
projects)
• cd flink-java-project
• mvn clean package
Maven will now start to download all required
dependencies and build the Flink quickstart project.
66
5.1 How to write a Flink program in an IDE?
5.1.2 How to import the Flink Maven project into an IDE
The generated Maven project needs to be imported into
your IDE:
IntelliJ:
• Select “File” -> “Import Project”
• Select root folder of your project
• Select “Import project from external model”,
select “Maven”
• Leave default options and finish the import
Eclipse:
• Select “File” -> “Import” -> “Maven” -> “Existing Maven
Project”
• Follow the import instructions
67
5.1 How to write a Flink program in an
IDE?
5.1.3 How to use logging?
The logging in Flink is implemented using the slf4j
logging interface, with log4j as the underlying logging
framework.
Log4j is controlled using a property file usually
called log4j.properties. You can pass the
filename and location of this file to the JVM using
the -Dlog4j.configuration= parameter.
The loggers using slf4j are created by calling:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
Logger LOG = LoggerFactory.getLogger(Foobar.class);
You can also use logback instead of log4j.
https://ci.apache.org/projects/flink/flink-docs-release-0.9/internals/logging.html
68
5.1 How to write a Flink program?
5.1.4 FAQs & best practices related to coding
Errors
http://flink.apache.org/faq.html#errors
Usage
http://flink.apache.org/faq.html#usage
Best Practices
https://ci.apache.org/projects/flink/flink-docs-
master/apis/best_practices.html
69
5.2 How to test your Flink program in an
IDE?
Start Flink in your IDE for local development &
debugging.
final ExecutionEnvironment env =
ExecutionEnvironment.createLocalEnvironment();
 Use Flink's testing framework:
@RunWith(Parameterized.class)
class YourTest extends MultipleProgramsTestBase {
    @Test
    public void testRunWithConfiguration() {
        expectedResult = "1 11\n";
    }
}
70
5.3 How to debug your Flink program in an
IDE?
Flink programs can be executed and debugged from
within an IDE.
This significantly eases the development process and
gives a programming experience similar to working
on a regular Java application.
Starting a Flink program in your IDE is as easy as
starting its main() method.
Under the hood, the ExecutionEnvironment will start a
local Flink instance within the execution process.
Hence it is also possible to put breakpoints
everywhere in your code and debug it.
71
5.3 How to debug your Flink program in an
IDE?
• Assuming you have an IDE with a Flink quickstart
project imported, you can execute and debug the
example WordCount program which is included in the
quickstart project as follows:
• Open the org.apache.flink.quickstart.WordCount class
in your IDE
• Place a breakpoint somewhere in
the flatMap() method of the LineSplitter class, which is
defined inline in the WordCount class.
• Execute or debug the main() method of
the WordCount class using your IDE.
72
5.3 How to debug your Flink program in an
IDE?
When you start a program locally with
the LocalExecutor, you can place breakpoints
in your functions and debug them like normal
Java/Scala programs.
The Accumulators are very helpful in tracking
the behavior of the parallel execution. They
allow you to gather information inside the
program’s operations and show them after the
program execution.
73
Debugging with the IDE
74
Debugging on a cluster
Good old System.out debugging
• Get a logger and start logging:
private static final Logger LOG =
    LoggerFactory.getLogger(YourJob.class);
LOG.info("elementCount = {}", elementCount);
• You can also use System.out.println().
75
Getting logs on a cluster
• Non-YARN (=bare metal installation)
–The logs are located in each TaskManager’s
log/ directory.
–ssh there and read the logs.
• YARN
–Make sure YARN log aggregation is enabled
–Retrieve logs from YARN (once app is finished)
$ yarn logs -applicationId <application ID>
76
Flink Logs
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - --------------------------------------------------------------------------------
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager (Version: 0.9-SNAPSHOT, Rev:2e515fc, Date:27.05.2015 @ 11:24:23 CEST)
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - Current user: robert
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.7/24.75-b04
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - Maximum heap size: 736 MiBytes
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - JAVA_HOME: (not set)
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - JVM Options:
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - -XX:MaxPermSize=256m
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - -Xms768m
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - -Xmx768m
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - -Dlog.file=/home/robert/incubator-flink/build-target/bin/../log/flink-robert-jobmanager-robert-da.log
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - -Dlog4j.configuration=file:/home/robert/incubator-flink/build-target/bin/../conf/log4j.properties
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - -Dlogback.configurationFile=file:/home/robert/incubator-flink/build-target/bin/../conf/logback.xml
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - Program Arguments:
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - --configDir
11:42:39,233 INFO org.apache.flink.runtime.jobmanager.JobManager - /home/robert/incubator-flink/build-target/bin/../conf
11:42:39,234 INFO org.apache.flink.runtime.jobmanager.JobManager - --executionMode
11:42:39,234 INFO org.apache.flink.runtime.jobmanager.JobManager - local
11:42:39,234 INFO org.apache.flink.runtime.jobmanager.JobManager - --streamingMode
11:42:39,234 INFO org.apache.flink.runtime.jobmanager.JobManager - batch
11:42:39,234 INFO org.apache.flink.runtime.jobmanager.JobManager - --------------------------------------------------------------------------------
11:42:39,469 INFO org.apache.flink.runtime.jobmanager.JobManager - Loading configuration from /home/robert/incubator-flink/build-target/bin/../conf
11:42:39,525 INFO org.apache.flink.runtime.jobmanager.JobManager - Security is not enabled. Starting non-authenticated JobManager.
11:42:39,525 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager
11:42:39,527 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager actor system at localhost:6123.
11:42:40,189 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
11:42:40,316 INFO Remoting - Starting remoting
11:42:40,569 INFO Remoting - Remoting started; listening on addresses :[akka.tcp://flink@127.0.0.1:6123]
11:42:40,573 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager actor
11:42:40,580 INFO org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /tmp/blobStore-50f75dc9-3001-4c1b-bc2a-6658ac21322b
11:42:40,581 INFO org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:51194 - max concurrent requests: 50 - max backlog: 1000
11:42:40,613 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting embedded TaskManager for JobManager's LOCAL execution mode
11:42:40,615 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager at akka://flink/user/jobmanager#205521910.
11:42:40,663 INFO org.apache.flink.runtime.taskmanager.TaskManager - Messages between TaskManager and JobManager have a max timeout of 100000 milliseconds
11:42:40,666 INFO org.apache.flink.runtime.taskmanager.TaskManager - Temporary file directory '/tmp': total 7 GB, usable 7 GB (100.00% usable)
11:42:41,092 INFO org.apache.flink.runtime.io.network.buffer.NetworkBufferPool - Allocated 64 MB for network buffer pool (number of memory segments: 2048, bytes per segment: 32768).
11:42:41,511 INFO org.apache.flink.runtime.taskmanager.TaskManager - Using 0.7 of the currently free heap space for Flink managed memory (461 MB).
11:42:42,520 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager - I/O manager uses directory /tmp/flink-io-4c6f4364-1975-48b7-99d9-a74e4edb7103 for spill files.
11:42:42,523 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManger web frontend
Build Information
JVM details
Init messages
77
Get logs of a running YARN application
78
Debugging on a cluster - Accumulators
Useful to verify your assumptions about the data
class Tokenizer extends RichFlatMapFunction<String, String> {
  @Override
  public void flatMap(String value, Collector<String> out) {
    getRuntimeContext()
      .getLongCounter("elementCount").add(1L);
    // do more stuff.
  }
}
Use “Rich*Functions” to get the RuntimeContext
79
Debugging on a cluster - Accumulators
Where can I get the accumulator results?
• returned by env.execute()
• displayed when executed with /bin/flink
• in the JobManager web frontend
JobExecutionResult result = env.execute("WordCount");
long ec = result.getAccumulatorResult("elementCount");
80
Live Monitoring with Accumulators
In versions prior to Flink 0.10:
• Accumulators only available after the job finishes
In Flink 0.10:
• Accumulators updated while the job is running
• System accumulators (number of bytes/records
processed, …)
81
In Flink 0.10, the JobManager web interface
displays accumulator values live while the job
is running
82
Excursion: RichFunctions
The default functions are SAM (Single
Abstract Method) interfaces: interfaces with one
method, usable as Java 8 lambdas
There is a “Rich” variant for each function.
• RichFlatMapFunction, …
• Methods
 open(Configuration c) & close()
 getRuntimeContext()
83
Excursion: RichFunctions &
RuntimeContext
The RuntimeContext provides some useful
methods
getIndexOfThisSubtask() /
getNumberOfParallelSubtasks() – who am I, and
how many of me are there?
getExecutionConfig()
Accumulators
DistributedCache
84
Attaching a remote debugger to Flink in a
Cluster
85
Attaching a debugger to Flink in a cluster
Add JVM start option in flink-conf.yaml
env.java.opts: “-agentlib:jdwp=….”
Open an SSH tunnel to the machine:
ssh -f -N -L 5005:127.0.0.1:5005 user@host
Use your IDE to start a remote debugging
session
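The agent string in flink-conf.yaml is elided on the slide; a commonly used JDWP setting (an example of ours, not taken from the deck) that matches the port 5005 used in the SSH tunnel above would look like this:

```yaml
# flink-conf.yaml — example only; with suspend=y the JVM would wait
# for the debugger to attach before starting
env.java.opts: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"
```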
86
6. How to deploy your Apache Flink
application in local, in a cluster or in the
cloud?
6.1 Deploy in Local
6.2 Deploy in Cluster
6.3 Deploy in Cloud
87
6. How to deploy your Apache Flink
application in local, in a cluster or in the
cloud?
6.1 Deploy in Local
Package your job in a jar and submit it:
• /bin/flink (Command Line Interface)
• RemoteExecutionEnvironment (From a
local java app)
• Web Frontend (GUI)
• Scala Shell
88
Flink Web Submission Client
Select jobs and preview plan
Understand Optimizer choices
89
6.2 Deploy in Cluster
• You can start a cluster locally
$ tar xzf flink-*.tgz
$ cd flink
$ bin/start-cluster.sh
Starting Job Manager
Starting task manager on host
$ jps
5158 JobManager
5262 TaskManager
90
6.3 Deploy in Cloud
 Google Compute Engine (GCE)
Free trial for Google Cloud Engine:
https://cloud.google.com/free-trial/
Enjoy your $300 in GCE for 60 days!
http://ci.apache.org/projects/flink/flink-docs-master/setup/gce_setup.html
./bdutil -e extensions/flink/flink_env.sh deploy
91
6.3 Deploy in Cloud
 Amazon EMR or any other cloud provider with
preinstalled Hadoop YARN
http://ci.apache.org/projects/flink/flink-docs-master/setup/yarn_setup.html
wget http://stratosphere-bin.amazonaws.com/flink-0.9-SNAPSHOT-bin-hadoop2.tgz
tar xvzf flink-0.9-SNAPSHOT-bin-hadoop2.tgz
cd flink-0.9-SNAPSHOT/
./bin/yarn-session.sh -n 4 -jm 1024 -tm 4096
 Install Flink yourself on the machines
92
7. How to tune your Apache Flink application
7.1 Tuning CPU
7.2 Tuning memory
7.3 Tuning I/O
7.4 Optimizer hints
93
7. How to tune your Apache Flink
application (CPU, Memory, I/O)?
7.1 Tuning CPU
 Processing slots, threads, …
https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-taskmanager-processing-slots
94
Tell Flink how many CPUs you have
taskmanager.numberOfTaskSlots in flink-conf.yaml:
• number of parallel job instances
• number of pipelines per TaskManager
recommended: number of available CPU cores
(Diagram: parallel Map → Reduce pipelines, one per slot)
95
Configuring TaskManager processing slots
3 machines, each with 4 CPU cores, give us a
total of 12 processing slots
(Diagram: Task Managers 1–3, each with Slots 1–4;
Task Managers: 3, total number of processing slots: 12)
flink-conf.yaml:
taskmanager.numberOfTaskSlots: 4
or
/bin/yarn-session.sh –slots 4 –n 4
(Recommended value: number of CPU cores)
96
When no arguments are given,
parallelism.default from flink-conf.yaml is used.
Default value = 1
Example 1: WordCount with parallelism = 1
(Diagram: 3 Task Managers with 4 slots each; a single
Source -> flatMap, Reduce, Sink pipeline occupies one
slot on Task Manager 1; the remaining slots are idle)
97
Example 2: WordCount with parallelism = 2
(Diagram: two parallel Source -> flatMap, Reduce, Sink
pipelines, each in its own slot on Task Manager 1; the
remaining slots are idle)
Places to set parallelism for a job
 flink-conf.yaml parallelism.default: 2
 Flink Client: ./bin/flink -p 2
 ExecutionEnvironment: env.setParallelism(2)
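The settings above override each other: an operator-level setting wins over the ExecutionEnvironment, which wins over the client's -p flag, which wins over parallelism.default. The sketch below just encodes that precedence; the class and method names are ours, not Flink API:

```java
// Illustrative precedence of parallelism settings in Flink
// (operator > environment > client > config default).
public class ParallelismPrecedence {

    // null means "not set at that level"
    static int effectiveParallelism(Integer operatorLevel, Integer envLevel,
                                    Integer clientLevel, int configDefault) {
        if (operatorLevel != null) return operatorLevel; // setParallelism() on the operator
        if (envLevel != null)      return envLevel;      // env.setParallelism(...)
        if (clientLevel != null)   return clientLevel;   // ./bin/flink -p ...
        return configDefault;                            // parallelism.default in flink-conf.yaml
    }
}
```

For example, with parallelism.default = 1 and env.setParallelism(2), the job runs with 2 parallel pipelines unless an operator overrides it.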
98
Example 3: WordCount with parallelism = 12
(using all resources)
(Diagram: every one of the 12 slots across the 3 Task
Managers runs its own Source -> flatMap, Reduce, Sink
pipeline)
99
Example 4: WordCount with parallelism = 12
and sink parallelism = 1
The parallelism of each operator can be set individually
in the APIs:
counts.writeAsCsv(outputPath,
"\n", " ").setParallelism(1);
(Diagram: all 12 slots run Source -> flatMap and Reduce;
a single Sink instance receives the data streamed to it
from all the other slots on the other TaskManagers)
100
7. How to tune your Apache Flink
application (CPU, Memory, I/O)?
7.2 Tuning Memory
How to adjust memory usage on the
TaskManager?
101
Memory in Flink - Theory
Memory Management (Batch API)
https://cwiki.apache.org/confluence/pages/viewpage.action?
pageId=53741525
102
Memory in Flink - Configuration
Network buffers: taskmanager.network.numberOfBuffers
Managed memory:
relative: taskmanager.memory.fraction
absolute: taskmanager.memory.size
Heap size: taskmanager.heap.mb
or the „-tm“ argument for bin/yarn-session.sh
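As a rough mental model (a simplification of ours, not Flink's exact accounting), the relative setting takes taskmanager.memory.fraction of the heap that remains after the network buffers are allocated — consistent with the startup log above ("Using 0.7 of the currently free heap space for Flink managed memory"):

```java
// Simplified managed-memory estimate; numbers and method name are
// illustrative, not Flink internals.
public class ManagedMemoryEstimate {

    static long managedMemoryMb(long heapMb, long networkBuffersMb, double fraction) {
        // taskmanager.memory.fraction (default 0.7) of the heap left
        // over once the network buffer pool has been allocated
        return (long) (fraction * (heapMb - networkBuffersMb));
    }
}
```

With a 512 MB heap and 64 MB of network buffers, the default fraction of 0.7 yields roughly 313 MB of managed memory.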
103
Memory in Flink - OOM
2015-02-20 11:22:54 INFO JobClient:345 - java.lang.OutOfMemoryError: Java heap
space
at org.apache.flink.runtime.io.network.serialization.DataOutputSerializer.resize(DataOutputSerializer.java:249)
at org.apache.flink.runtime.io.network.serialization.DataOutputSerializer.write(DataOutputSerializer.java:93)
at org.apache.flink.api.java.typeutils.runtime.DataOutputViewStream.write(DataOutputViewStream.java:39)
at com.esotericsoftware.kryo.io.Output.flush(Output.java:163)
at com.esotericsoftware.kryo.io.Output.require(Output.java:142)
at com.esotericsoftware.kryo.io.Output.writeBoolean(Output.java:613)
at com.twitter.chill.java.BitSetSerializer.write(BitSetSerializer.java:42)
at com.twitter.chill.java.BitSetSerializer.write(BitSetSerializer.java:29)
at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599)
at org.apache.flink.api.java.typeutils.runtime.KryoSerializer.serialize(KryoSerializer.java:155)
at org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:91)
at org.apache.flink.api.scala.typeutils.CaseClassSerializer.serialize(CaseClassSerializer.scala:30)
at org.apache.flink.runtime.plugable.SerializationDelegate.write(SerializationDelegate.java:51)
at
org.apache.flink.runtime.io.network.serialization.SpanningRecordSerializer.addRecord(SpanningRecordSerializer.j
ava:76)
at org.apache.flink.runtime.io.network.api.RecordWriter.emit(RecordWriter.java:82)
at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:88)
at org.apache.flink.api.scala.GroupedDataSet$$anon$2.reduce(GroupedDataSet.scala:262)
at org.apache.flink.runtime.operators.GroupReduceDriver.run(GroupReduceDriver.java:124)
at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493)
at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257)
at java.lang.Thread.run(Thread.java:745)
Memory is missing here
Reduce managed memory:
reduce taskmanager.memory.fraction
104
Memory in Flink – Network buffers
Memory is missing here
Managed memory will shrink automatically
Error: java.lang.Exception: Failed to deploy the task CHAIN
Reduce(org.okkam.flink.maintenance.deduplication.blocking.RemoveDuplicateReduceGroupFunction) ->
Combine(org.apache.flink.api.java.operators.DistinctOperator$DistinctFunction)
(15/28) - execution #0 to slot SubSlot 5
(cab978f80c0cb7071136cd755e971be9 (5) - ALLOCATED/ALIVE):
org.apache.flink.runtime.io.network.InsufficientResourcesException:
okkam-nano-2.okkam.it has not enough buffers to safely execute CHAIN
Reduce(org.okkam.flink.maintenance.deduplication.blocking.RemoveDuplicateReduceGroupFunction) ->
Combine(org.apache.flink.api.java.operators.DistinctOperator$DistinctFunction)
(36 buffers missing)
Increase taskmanager.network.numberOfBuffers
105
What are these buffers needed for?
A small Flink cluster with 4 processing slots
(on 2 Task Managers) running a simple
MapReduce job in Flink
(Diagram: TaskManager 1 and TaskManager 2,
each with Slot 1 and Slot 2)
106
What are these buffers needed for?
Map Reduce job with a parallelism of 2 and
2 processing slots per machine
(Diagram: on each TaskManager, the Map tasks write
through the network buffers to the Reduce tasks:
8 buffers for outgoing data, 8 buffers for incoming data)
107
What are these buffers needed for?
Map Reduce job with a parallelism of 2 and 2
processing slots per machine
(Diagram: each mapper has a logical connection to
a reducer)
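Because every slot may exchange data with every parallel pipeline on every machine, the Flink documentation of that era suggested sizing taskmanager.network.numberOfBuffers at roughly slots-per-TaskManager² × number of TaskManagers × 4. The helper below (names are ours) just encodes that rule of thumb:

```java
// Rule-of-thumb sizing for taskmanager.network.numberOfBuffers,
// following the guideline slots-per-TM^2 * #TMs * 4 from the Flink
// configuration docs of this period. Method name is illustrative.
public class NetworkBufferEstimate {

    static int recommendedBuffers(int slotsPerTaskManager, int taskManagers) {
        // each slot can ship to / receive from every slot-parallel
        // pipeline on every TaskManager, in both directions
        return slotsPerTaskManager * slotsPerTaskManager * taskManagers * 4;
    }
}
```

For the 3-machine, 4-slot setup used in the earlier examples this gives 4 × 4 × 3 × 4 = 192 buffers.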
108
7. How to tune your Apache Flink
application (CPU, Memory, I/O)?
7.3 Tuning I/O
Specifying temporary directories for spilling
109
Disk I/O
Sometimes your data doesn’t fit into main
memory, so Flink has to spill to disk:
taskmanager.tmp.dirs:
/mnt/disk1,/mnt/disk2
Use real local disks only (no tmpfs or NAS)
(Diagram: inside the Task Manager, a dedicated reader
thread and writer thread serve each configured disk,
e.g. Disk 1 and Disk 2)
110
7. How to tune your Apache Flink
application
7.4 Optimizer hints
 Examples:
DataSet.join(DataSet other,
JoinHint.BROADCAST_HASH_SECOND)
DataSet.join(DataSet other,
JoinHint.BROADCAST_HASH_FIRST)
http://stackoverflow.xluat.com/questions/31484856/the-difference-and-benefit-of-joinwithtiny-joinwithhuge-and-joinhint
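One way to see why these hints matter is a rough network-cost model (purely illustrative, ours, not Flink's optimizer): broadcasting ships the small side to every parallel instance, while repartitioning ships both sides over the network once, so broadcasting wins when one input is much smaller than the other:

```java
// Back-of-the-envelope network cost of the two join strategies.
// Class and method names are illustrative only.
public class JoinHintCost {

    // BROADCAST_HASH_*: the small input is replicated to every
    // parallel task instance
    static long broadcastCost(long smallSideBytes, int parallelism) {
        return smallSideBytes * parallelism;
    }

    // REPARTITION_*: both inputs are shuffled over the network once
    static long repartitionCost(long smallSideBytes, long largeSideBytes) {
        return smallSideBytes + largeSideBytes;
    }
}
```

With a 1 KB side and a 1 MB side at parallelism 12, broadcasting the small side costs ~12 KB of traffic versus ~1 MB for a full repartition.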
111
 Consider attending the first dedicated Apache
Flink conference on October 12-13, 2015 in Berlin,
Germany! http://flink-forward.org/
Two parallel tracks:
Talks: Presentations and use cases
Trainings: 2 days of hands-on training
workshops by the Flink committers
Big Data at CME Group: Challenges and Opportunities Slim Baltagi
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopSlim Baltagi
 
Transitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkTransitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkSlim Baltagi
 

More from Slim Baltagi (20)

How to select a modern data warehouse and get the most out of it?
How to select a modern data warehouse and get the most out of it?How to select a modern data warehouse and get the most out of it?
How to select a modern data warehouse and get the most out of it?
 
Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-Baltagi
Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-BaltagiModern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-Baltagi
Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-Baltagi
 
Modern big data and machine learning in the era of cloud, docker and kubernetes
Modern big data and machine learning in the era of cloud, docker and kubernetesModern big data and machine learning in the era of cloud, docker and kubernetes
Modern big data and machine learning in the era of cloud, docker and kubernetes
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache Kafka
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiasts
 
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision TreeApache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
 
Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summitAnalysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
 
Apache Fink 1.0: A New Era for Real-World Streaming Analytics
Apache Fink 1.0: A New Era  for Real-World Streaming AnalyticsApache Fink 1.0: A New Era  for Real-World Streaming Analytics
Apache Fink 1.0: A New Era for Real-World Streaming Analytics
 
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics FrameworksOverview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
 
Apache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsApache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming Analytics
 
Apache Flink community Update for March 2016 - Slim Baltagi
Apache Flink community Update for March 2016 - Slim BaltagiApache Flink community Update for March 2016 - Slim Baltagi
Apache Flink community Update for March 2016 - Slim Baltagi
 
Apache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
Apache-Flink-What-How-Why-Who-Where-by-Slim-BaltagiApache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
Apache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
 
Flink vs. Spark
Flink vs. SparkFlink vs. Spark
Flink vs. Spark
 
Unified Batch and Real-Time Stream Processing Using Apache Flink
Unified Batch and Real-Time Stream Processing Using Apache FlinkUnified Batch and Real-Time Stream Processing Using Apache Flink
Unified Batch and Real-Time Stream Processing Using Apache Flink
 
Why apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksWhy apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics Frameworks
 
Overview of Apache Flink: Next-Gen Big Data Analytics Framework
Overview of Apache Flink: Next-Gen Big Data Analytics FrameworkOverview of Apache Flink: Next-Gen Big Data Analytics Framework
Overview of Apache Flink: Next-Gen Big Data Analytics Framework
 
Hadoop or Spark: is it an either-or proposition? By Slim Baltagi
Hadoop or Spark: is it an either-or proposition? By Slim BaltagiHadoop or Spark: is it an either-or proposition? By Slim Baltagi
Hadoop or Spark: is it an either-or proposition? By Slim Baltagi
 
Big Data at CME Group: Challenges and Opportunities
Big Data at CME Group: Challenges and Opportunities Big Data at CME Group: Challenges and Opportunities
Big Data at CME Group: Challenges and Opportunities
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise Hadoop
 
Transitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkTransitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to Spark
 

Recently uploaded

From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...shivangimorya083
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...Suhani Kapoor
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 

Recently uploaded (20)

From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 

Apache Flink Crash Course by Slim Baltagi and Srini Palthepu

Editor's Notes

  1. The following steps assume a UNIX-like environment. For Windows, see Flink on Windows: https://ci.apache.org/projects/flink/flink-docs-master/setup/local_setup.html#flink-on-windows
  3. For Windows, see Flink on Windows: https://ci.apache.org/projects/flink/flink-docs-master/setup/local_setup.html#flink-on-windows
  4. This is Slide 5 of http://www.slideshare.net/robertmetzger1/apache-flink-hands-on
  5. We pass the filename and location of this file using the -Dlog4j.configuration= parameter to the JVM.
  7. bin/start-cluster.sh is also the option for those with Flink “on premise”
  8. An entire slide about cloud computing without having “cloud” on it
  10. This way you can also start multiple threads per disk.
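The notes above mention passing a log4j configuration to the JVM via `-Dlog4j.configuration=` and starting a standalone cluster with `bin/start-cluster.sh`. A minimal sketch of how those two pieces fit together is shown below; the install path `/opt/flink`, the log4j file name, and the `FLINK_ENV_JAVA_OPTS` environment variable are assumptions for illustration, not taken from the deck.

```shell
# Sketch only: paths, file names and the env variable are assumptions.

# Point the JVMs started by Flink's scripts at a custom log4j
# configuration (the -Dlog4j.configuration= parameter from the notes):
export FLINK_ENV_JAVA_OPTS="-Dlog4j.configuration=file:///opt/flink/conf/log4j.properties"

# Start a standalone cluster (also the option for an "on premise" setup):
/opt/flink/bin/start-cluster.sh

# Stop it again when done:
/opt/flink/bin/stop-cluster.sh
```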