Apache Flink is a community-driven open source and memory-centric Big Data analytics framework. It provides the only hybrid (Real-Time Streaming + Batch) open source distributed data processing engine supporting many use cases.
Flink uses a mixture of Scala and Java internally, has very good Scala APIs and some of its libraries are basically pure Scala (FlinkML and Table).
At its core, it is a streaming dataflow execution engine and it also provides several APIs for batch processing (DataSet API), real-time streaming (DataStream API) and relational queries (Table API) and also domain-specific libraries for machine learning (FlinkML) and graph processing (Gelly).
In this talk, you will learn in more details about:
What is Apache Flink, how it fits into the Big Data ecosystem and why it is the 4G (4th Generation) of Big Data Analytics frameworks?
How Apache Flink integrates with Apache Hadoop and other open source tools for data input and output as well as deployment?
Why Apache Flink is an alternative to Apache Hadoop MapReduce, Apache Storm and Apache Spark? What are the benchmarking results between Apache Flink and those other Big Data analytics frameworks?