
Flink Forward SF 2017: Jamie Grier - Apache Flink - The latest and greatest

Jamie Grier outlines the latest important features in Apache Flink and walks you through building a working demo to show these features off. Topics may include queryable state, dynamic scaling, streaming SQL, very large state support, and whatever is the latest and greatest in March 2017.

Published in: Data & Analytics

  1. 1. Apache Flink: The Latest and Greatest ▪ Jamie Grier ▪ @jamiegrier ▪ data-artisans.com
  2. 2. Original creators of Apache Flink® ▪ Providers of the dA Platform, a supported Flink distribution
  3. 3. The Latest Features ▪ ProcessFunction API ▪ Queryable State API ▪ Excellent support for advanced applications that are: ▪ Flexible ▪ Stateful ▪ Event Driven ▪ Time Driven
  4. 4. The Latest Features - Quick Overview ▪ Rescalable State ▪ Async I/O Support ▪ Flexible Deployment Options ▪ Enhanced Security
  5. 5. Rescalable State ▪ Separates state parallelism from task parallelism ▪ Enables autoscaling integrations while maintaining stateful computations ▪ Handled efficiently via key groups
  6. 6. Rescalable State [Diagram: Source, Map, Filter, Window, Sink; Map, Filter, and Window each hold State; one parallel instance] State is partitioned by key
  7. 7. Rescalable State [Diagram: the same pipeline with 2 parallel instances of Map, Filter, and Window, each with its own State] State is partitioned by key
  8. 8. Rescalable State [Diagram: the same pipeline with 4 parallel instances of Map, Filter, and Window, each with its own State] State is partitioned by key
  9. 9. Rescalable State [Diagram: the same pipeline with 8 parallel instances of Map, Filter, and Window, each with its own State] State is partitioned by key
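
The key groups mentioned on slide 5 can be illustrated with a small sketch. This is plain Scala, not Flink code: `KeyGroupSketch`, `keyGroupFor`, and `operatorIndexFor` are hypothetical names, and the plain hash code used here is a simplification (Flink additionally scrambles the key's hash before taking the modulus). The idea it shows is that a key always maps to the same key group, and only the assignment of whole key groups to parallel instances changes when the job is rescaled.

```scala
object KeyGroupSketch {
  // A key's group depends only on the key and the (fixed) number of key
  // groups, so it never changes when the job's parallelism changes.
  // Assumption: plain hash code; Flink applies an extra hash step here.
  def keyGroupFor(key: Any, maxParallelism: Int): Int =
    Math.floorMod(key.##, maxParallelism)

  // Each parallel instance owns a contiguous range of key groups.
  // Rescaling only changes which instance owns which range.
  def operatorIndexFor(keyGroup: Int, maxParallelism: Int, parallelism: Int): Int =
    keyGroup * parallelism / maxParallelism

  def main(args: Array[String]): Unit = {
    val maxParallelism = 128
    val group = keyGroupFor("AAPL", maxParallelism)
    for (parallelism <- Seq(1, 2, 4, 8)) {
      val owner = operatorIndexFor(group, maxParallelism, parallelism)
      println(s"parallelism=$parallelism: key group $group -> instance $owner")
    }
  }
}
```

Because state is stored per key group, moving from 2 to 4 to 8 instances (as in the diagrams above) means handing over whole key groups rather than re-hashing every key.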
  10. 10. Flexible Deployment Options ▪ YARN ▪ Mesos ▪ Docker Swarm ▪ Kubernetes
  11. 11. Flexible Deployment Options ▪ DC/OS ▪ Amazon EMR ▪ Google Dataproc
  12. 12. Asynchronous I/O Support ▪ Make asynchronous calls to external services from a streaming job ▪ Efficiently keeps a configurable number of asynchronous calls in flight ▪ Correctly handles failure scenarios - restarts failed async calls, etc.
  13. 13. Asynchronous I/O Support [Diagram labels: Async I/O, User Code]
  14. 14. Asynchronous I/O Support Little’s Law: throughput = occupancy / latency
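
A worked instance of Little's Law may help here. The occupancy of 100 matches the number of concurrent requests used in the code on slide 17; the 10 ms lookup latency is an assumed figure chosen only for illustration.

```latex
\text{throughput} = \frac{\text{occupancy}}{\text{latency}}
  = \frac{100\ \text{requests}}{0.010\ \text{s}}
  = 10{,}000\ \text{requests/s}
\qquad\text{versus}\qquad
\frac{1\ \text{request}}{0.010\ \text{s}} = 100\ \text{requests/s}
```

Under these assumptions, a synchronous operator instance (occupancy 1) tops out at 100 requests per second, while keeping 100 requests in flight raises the ceiling a hundredfold without adding parallel instances.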
  15. 15. Asynchronous I/O Support [Diagram: Sync. I/O versus Async. I/O against a database, for requests a, b, c, d. Sync: sendRequest(x), wait, receiveResponse(x), one request at a time. Async: several requests in flight at once]
  16. 16. Asynchronous I/O Support
  17. 17. Asynchronous I/O Support
        // create the original stream
        val stream: DataStream[String] = ...

        // apply the async I/O transformation
        val resultStream: DataStream[(String, String)] =
          AsyncDataStream.unorderedWait(
            input = stream,
            asyncFunction = new AsyncDatabaseRequest(),
            timeout = 1000,
            timeUnit = TimeUnit.MILLISECONDS,
            concurrentRequests = 100)
  18. 18. Asynchronous I/O Support
        class AsyncDatabaseRequest extends AsyncFunction[String, (String, String)] {

          override def asyncInvoke(str: String, asyncCollector: AsyncCollector[(String, String)]): Unit = {
            // issue the asynchronous request, receive a future for the result
            val resultFuture: Future[String] = client.query(str)

            // set the callback to be executed once the request by the client is complete
            // the callback simply forwards the result to the collector
            resultFuture.onSuccess {
              case result: String => asyncCollector.collect(Iterable((str, result)))
            }
          }
        }
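
To make "keeps a configurable number of asynchronous calls in flight" concrete outside of Flink, here is a minimal plain-Scala sketch. It is not how Flink implements async I/O: `BoundedAsyncSketch`, `query`, and `enrich` are hypothetical names, `query` merely simulates a database client, and results are returned in input order (closer to an ordered wait than to the `unorderedWait` used on slide 17).

```scala
import java.util.concurrent.Semaphore
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BoundedAsyncSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-in for an asynchronous database client.
  def query(key: String): Future[String] = Future {
    Thread.sleep(10) // simulated lookup latency
    key.toUpperCase
  }

  // Issues one request per input, keeping at most `capacity` in flight.
  def enrich(inputs: Seq[String], capacity: Int): Seq[(String, String)] = {
    val permits = new Semaphore(capacity)
    val pending = inputs.map { in =>
      permits.acquire()                    // blocks the caller when all slots are taken
      val f = query(in).map(out => (in, out))
      f.onComplete(_ => permits.release()) // free the slot on success or failure
      f
    }
    Await.result(Future.sequence(pending), 10.seconds)
  }

  def main(args: Array[String]): Unit =
    enrich(Seq("a", "b", "c", "d"), capacity = 2).foreach(println)
}
```

Blocking on the semaphore plays the role of backpressure here: the producer cannot issue a new request until an earlier one completes.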
  19. 19. Enhanced Security ▪ SSL ▪ Kerberos ▪ Kafka ▪ Zookeeper ▪ Hadoop
  20. 20. Advanced Event-Driven Applications ▪ ProcessFunction API ▪ Queryable State API ▪ Excellent support for advanced applications that are: ▪ Flexible ▪ Stateful ▪ Event Driven ▪ Time Driven
  21. 21. Example: FlinkTrade ▪ Overall Requirements: ▪ Consume “starting position” and “quote” streams ▪ Process complex, time-oriented, trading rules ▪ Trade out of positions to our advantage if possible ▪ Provide a dashboard of currently held positions to traders and asset managers ▪ Complex Rules: ▪ We only make trades where the Bid Price is above our current Ask Price ▪ When a trade is made we increase our Ask Price — looking to optimize our profits ▪ Positions have a set time-to-live until we try to trade out of them more aggressively by decreasing the Ask Price over time
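
The trading rules above can be sketched as two pure state transitions, one driven by events (quotes) and one driven by time. This is an illustrative model only, not the code from the demo: every name, the price step sizes, and the choice to execute at the bid price are assumptions. In a Flink job, logic of this shape would typically sit in a ProcessFunction, with the quote handler run per element and the time handler run from a timer callback.

```scala
object TradingRulesSketch {
  // Hypothetical data model; field names are illustrative.
  final case class Position(symbol: String, shares: Long, askPrice: BigDecimal, expiresAt: Long)
  final case class Quote(symbol: String, bidPrice: BigDecimal, shares: Long)
  final case class Trade(symbol: String, shares: Long, price: BigDecimal)

  // Assumed step sizes; the slides do not specify them.
  val AskIncrease: BigDecimal = BigDecimal("0.10")
  val AskDecrease: BigDecimal = BigDecimal("0.05")

  // Event-driven rule: trade only when the bid is above our ask,
  // then raise the ask for the remaining shares.
  def onQuote(pos: Position, q: Quote): (Position, Option[Trade]) =
    if (q.symbol == pos.symbol && pos.shares > 0 && q.bidPrice > pos.askPrice) {
      val sold = math.min(pos.shares, q.shares)
      val trade = Trade(pos.symbol, sold, q.bidPrice) // assumption: fill at the bid
      (pos.copy(shares = pos.shares - sold, askPrice = pos.askPrice + AskIncrease), Some(trade))
    } else (pos, None)

  // Time-driven rule: once the time-to-live has passed, lower the ask
  // on each timer firing to trade out more aggressively.
  def onTimer(pos: Position, now: Long): Position =
    if (now >= pos.expiresAt) pos.copy(askPrice = pos.askPrice - AskDecrease) else pos
}
```

For example, a 10,000-share AAPL position asking 140.50 that sees a 500-share bid at 140.60 would sell 500 shares and raise its ask to 140.60 under these assumptions.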
  22. 22. Example: FlinkTrade ● Event Driven Processing ● Complex Trading Rules ● Time Based Logic
        [Diagram labels: Quotes, Starting Positions, Trading Engine, Trades, Positions, Trader Dashboard]
        SYMBOL  SHARES  BUY PRICE  ASK PRICE  LAST TRADE PRICE  Profit
        AAPL    10,000  140.40     140.50     140.40            $10,921.00
        GOOG    20,000  846.81     846.91     846.81            $12,021.00
        TWTR     8,000   15.12      15.22      15.12             $4,032.00
  23. 23. Example: FlinkTrade
        [Diagram labels: Quotes, Starting Positions, Trades; the Trading Engine and Positions pair is repeated for multiple parallel instances]
        SYMBOL  SHARES  BUY PRICE  ASK PRICE  LAST TRADE PRICE  Profit
        AAPL    10,000  140.40     140.50     140.40            $10,921.00
        GOOG    20,000  846.81     846.91     846.81            $12,021.00
        TWTR     8,000   15.12      15.22      15.12             $4,032.00
  24. 24. Example: FlinkTrade Let’s look at the code
  25. 25. We are hiring! data-artisans.com/careers @jamiegrier @ApacheFlink @dataArtisans
