Better, Faster, Stronger Streaming: Your First Dive into Flink SQL


For the most flexible, powerful stream processing engines, it seems like the barrier to entry has never been higher than it is now. If you’ve tried, or have been interested in leveraging the strengths of real-time data processing - maybe for machine learning, IoT, anomaly detection or data analysis - but you’ve been held back: I’ve been there, and it’s frustrating. And that’s why this talk is for you.

That being said, this talk is also for you if you ARE experienced with stream processing but want an easy (and, if I say so myself, pretty fun) way to add some of the newest, bleeding-edge features to your toolbelt.

This session will be about getting started with Flink SQL. Apache Flink’s high-level SQL language has the familiarity of the SQL you know and love (or at least, know…), but with some powerful new functionality and, of course, the benefit of working with both Flink and PyFlink.

More specifically, this will be a pragmatic entry into creating data pipelines with Flink SQL, as well as a sneak peek into some of its newest and most interesting features.



  1. © 2020 Ververica
  2-5. Introduction ● Caito Scherr ● Developer Advocate ● Ververica GmbH ● Portland, OR, USA
  6-7. Introduction
  8-10. Agenda ● Quick Flink Intro ● Why Flink SQL? ● Getting Started
  11-12. Flink Intro >> High Level ● Stateful, distributed ● Stream processing engine ● Unified batch & streaming ● Scalable ● Highly configurable ● “Deploy anywhere”
  13. Flink Intro >> Flink’s APIs
  14. Flink Intro >> Flink: Unified Processing ● Reuse code and logic ● Consistent semantics ● Mix historic and real-time ● Simplify operations
  15. Flink Intro >> Architecture
  16-20. Flink Intro >> Basic Example
  21. Why Flink SQL?
      Flink Runtime: Stateful Computations over Data Streams
      ● Stateful Stream Processing: Streams, State, Time
      ● Event-Driven Applications: Stateful Functions
      ● Streaming Analytics & ML: SQL, PyFlink, Tables
      Image Credit: Marta Paes
  22. Why Flink SQL? >> Benefits / Use Cases ● Unified, distributed ● Less implementation overhead ● Developer autonomy
  23. Why Flink SQL? >> Compare: Regular SQL Engine
      Query: SELECT user_id, COUNT(url) AS cnt FROM clicks GROUP BY user_id;
      Input table (clicks):
        user  cTime     url
        Mary  12:00:00  https://…
        Bob   12:00:00  https://…
        Mary  12:00:02  https://…
        Liz   12:00:03  https://…
      Result:
        user  cnt
        Mary  2
        Bob   1
      ● Take a snapshot when the query starts
      ● A row that was added after the query was started is not considered
      ● A final result is produced
      ● The query terminates
      Image Credit: Marta Paes
  24. Why Flink SQL? >> A Streaming SQL Engine
      Query: SELECT user_id, COUNT(url) AS cnt FROM clicks GROUP BY user_id;
      Input table (clicks):
        user  cTime     url
        Mary  12:00:00  https://…
        Bob   12:00:00  https://…
        Mary  12:00:02  https://…
        Liz   12:00:03  https://…
      Result (as shown on the slide):
        user  cnt
        Bob   1
        Liz   1
        Mary  1
        Mary  2
      ● Ingest all changes as they happen
      ● Continuously update the result
      ● The result is identical to the one-time query (at this point)
      Image Credit: Marta Paes
  25. Why Flink SQL? >> Simpler with SQL
  26. Why Flink SQL? >> Flink SQL In A Nutshell
      ● Standard SQL syntax and semantics (i.e. not a “SQL-flavor”)
      ● Unified APIs for batch and streaming
      ● Support for advanced time handling and operations (e.g. CDC, pattern matching)
      UDF Support: Python, Java, Scala
      Execution: Batch, Streaming, TPC-DS Coverage
      Formats: Debezium, +
      Native Connectors: Apache Kafka, Elasticsearch, FileSystems, JDBC, HBase, Kinesis, +
      Data Catalogs: Metastore, Postgres (JDBC)
      Image Credit: Marta Paes
  27. Getting Started >> Simple Pre-Requisites
      1. Check Java version
      2. Download Flink snapshot
      3. Un-tar it
  28-35. What Next? >> Flink SQL Cookbook (Image Credit: Marta Paes)
  36. Find Me Here
      ● Twitter: Caito_200_OK
      ● Content: https://medium.com/@caito and http://caito-200-ok.com/
      ● Email: Caito@ververica.com
  37. Credits + Resources
      ● Flink Ahead: What Comes After Batch & Streaming: https://youtu.be/h5OYmy9Yx7Y
      ● Flink Table API & SQL: https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#operations
      ● Flink SQL Cookbook: https://github.com/ververica/flink-sql-cookbook
      ● Flink logos: https://wints.github.io/flink-web//community.html
      ● Powered by Flink: https://flink.apache.org/poweredby.html
      ● Flink SQL Connectors: https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/formats/overview/
  38. Thank You!
      ● Nina & the Berlin Buzzwords Conference staff!!
      ● Marta Paes
      Scan here for links & resources
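
The contrast between the regular SQL engine and the streaming SQL engine (slides 23 and 24) can be sketched in plain Python. This is only a simulation of the semantics described on the slides, not Flink or PyFlink code: the function names `snapshot_query` and `continuous_query` are hypothetical, and the click data is copied from the slides with the elided URLs left out.

```python
from collections import Counter

# Click events from the slides as (user, click time); URLs are elided in the source.
CLICKS = [
    ("Mary", "12:00:00"),
    ("Bob", "12:00:00"),
    ("Mary", "12:00:02"),
    ("Liz", "12:00:03"),
]


def snapshot_query(rows):
    """One-time query: count clicks per user over a fixed snapshot, then stop."""
    return dict(Counter(user for user, _ in rows))


def continuous_query(rows):
    """Continuous query: emit an updated (user, cnt) row for every incoming click."""
    counts = Counter()
    for user, _ in rows:
        counts[user] += 1
        yield user, counts[user]


if __name__ == "__main__":
    # The snapshot is taken after the first three clicks, so Liz's later click is missed.
    print("one-time result:", snapshot_query(CLICKS[:3]))

    # The continuous query sees every click and keeps revising its result.
    for user, cnt in continuous_query(CLICKS):
        print("update:", user, cnt)
```

The one-time query returns a final result and terminates, while the continuous version keeps emitting revised counts for as long as clicks arrive. A real streaming engine also has to manage state, time, and how updates are delivered downstream, none of which this sketch attempts.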
