Stream Processing in the Cloud - Athens Kubernetes Meetup 16.07.2019

Talk I gave at the Athens Kubernetes Meetup.

Published in: Technology

  1. 1. Stream Processing in the Cloud. Rafał Leszko (@RafalLeszko), Cloud Software Engineer at Hazelcast
  2. 2. Hands Up
  5. 5. Hands Up. Raise your hand if…
      ● ...you know what Stream Processing is?
      ● ...you have ever used Stream Processing?
      ● ...you have ever used Hazelcast Jet?
  6. 6. Agenda
      ● Part 1: Stream Processing Basics
          ○ What is Stream Processing and Hazelcast Jet?
          ○ Example: Word Count
      ● Part 2: Jet Under the Hood
          ○ How does it work?
          ○ Infinite Streams
          ○ Example: Twitter Cryptocurrency Analysis
      ● Part 3: Jet in the Cloud
          ○ Cloud (Kubernetes) integration
          ○ Example: Stock Trade Aggregator
      ● Part 4: Jet Features & Use Cases
          ○ Why would I need it?
          ○ Example: Web Crawler
  7. 7. Part 1: Stream Processing Basics
  8. 8. What is Hazelcast?
  9. 9. What is Hazelcast? Products:
  11. 11. What is Hazelcast Jet?
  12. 12. What is Hazelcast Jet? DAG - Directed Acyclic Graph
  15. 15. Example 1: Word Count
      Problem: Count the number of occurrences of each word in the given text.
      Sample Input: Lorem ipsum dolor, dolor.
      Sample Output: lorem=1 ipsum=1 dolor=2
  16. 16. Example 1: Word Count (Pure Java)
      // Split on runs of non-word characters; the backslash in \W must be escaped in Java source.
      Pattern delimiter = Pattern.compile("\\W+");
      return lines.entrySet().stream()
          .map(e -> e.getValue().toLowerCase())
          .flatMap(t -> Arrays.stream(delimiter.split(t)))
          .filter(word -> !word.isEmpty())
          .collect(groupingBy(identity(), counting()));
  22. 22. Example 1: Word Count (Hazelcast Jet)
      // Same steps as the pure Java version, expressed as a Jet pipeline from one map to another.
      Pattern delimiter = Pattern.compile("\\W+");
      Pipeline pipeline = Pipeline.create();
      pipeline.drawFrom(Sources.<Long, String>map(LINES))
          .map(e -> e.getValue().toLowerCase())
          .flatMap(t -> traverseArray(delimiter.split(t)))
          .filter(word -> !word.isEmpty())
          .groupingKey(wholeItem())
          .aggregate(counting())
          .drainTo(Sinks.map(COUNTS));
      return pipeline;
  28. 28. Example 1: Word Count Demo: https://github.com/hazelcast/hazelcast-jet-code-samples
  29. 29. Part 2: Jet Under the Hood
  30. 30. How does it work?
  33. 33. How does it work? Under the Hood:
      ● Generate DAG representation from Pipeline
      ● Serialize DAG
      ● Send DAG to every Node
      ● Deserialize DAG
      ● Execute DAG on each Node
  34. 34. Infinite Streams
  35. 35. Infinite Streams. Examples:
      ● Currency Exchange Rates
      ● Tweets from Twitter
      ● Events in some Event-Based system
      ● ...
  36. 36. Windowing
      pipeline.drawFrom(...)
          .withNativeTimestamps(0)           // use the source's own timestamps, allowed lag 0 ms
          .window(sliding(30_000, 10_000))   // 30-second window that slides by 10 seconds
  37. 37. Example 2: Twitter Cryptocurrency Analysis
      Problem: Present sentiment about cryptocurrencies in real time.
      Input: Tweets are streamed from Twitter and categorized by coin type (BTC, ETC, XRP, etc.).
      Output: Tweet sentiment (last 30 seconds, last minute, last 5 minutes).
  38. 38. Example 2: Twitter Cryptocurrency Analysis Demo: https://jet.hazelcast.org/demos/
  39. 39. Part 3: Jet in the Cloud
  40. 40. Jet in the Cloud: discovery plugins
  46. 46. Jet in the Cloud: deploying on Kubernetes
  47. 47. Jet in the Cloud: deploying on Kubernetes $ helm install stable/hazelcast-jet
  48. 48. Jet in the Cloud: deploying on Kubernetes $ kubectl scale <name> --replicas=6
  49. 49. Example 3: Stock Trade Aggregator
      Problem: Present the aggregated trade price of stocks in real time.
      Input: Stock trades with name and price.
      Output: Sum of prices per stock name.
  50. 50. Example 3: Stock Trade Aggregator Demo: https://github.com/hazelcast/hazelcast-jet-code-samples/tree/master/integration/kubernetes
  51. 51. Part 4: Jet Features & Use Cases
  52. 52. Jet Features. Categories of Features:
      ● Easy to Use
      ● Performance
  53. 53. Jet Features: Performance
  55. 55. Jet Features: other features
  57. 57. Why would I need it? ● Big Data Projects ● Speed up Everything
  59. 59. Example 4: Web Crawler
      Problem: Parse all blog posts from the webpage.
      Input: URL of Blog Trips
      Output: All the content from the Blog
  60. 60. Example 4: Web Crawler Demo: https://github.com/leszko/geodump
  61. 61. Thank You!
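
Sketch for slide 36 (Windowing): a minimal plain-Java illustration of what a sliding window of 30,000 ms that slides by 10,000 ms computes. It assumes windows end on multiples of the slide step and cover the interval [end - size, end). The class and method names are hypothetical, and this is not the Jet API or how Jet implements windowing; it only counts events per window over a finite array of timestamps.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper, not part of Hazelcast Jet.
final class SlidingWindowSketch {

    /**
     * Counts events per sliding window.
     * Returns a map from window end (ms) to the number of events whose
     * timestamp falls in [end - windowSizeMs, end).
     */
    static SortedMap<Long, Long> countPerWindow(long[] timestampsMs, long windowSizeMs, long slideByMs) {
        SortedMap<Long, Long> countsByWindowEnd = new TreeMap<>();
        for (long ts : timestampsMs) {
            // First window end strictly after the event timestamp.
            long firstEnd = (Math.floorDiv(ts, slideByMs) + 1) * slideByMs;
            // The event belongs to every window whose end is in (ts, ts + windowSizeMs].
            for (long end = firstEnd; end <= ts + windowSizeMs; end += slideByMs) {
                countsByWindowEnd.merge(end, 1L, Long::sum);
            }
        }
        return countsByWindowEnd;
    }

    public static void main(String[] args) {
        long[] timestamps = {1_000, 12_000, 25_000, 41_000};
        // Each event lands in three overlapping windows (30_000 / 10_000).
        System.out.println(countPerWindow(timestamps, 30_000, 10_000));
    }
}
```

With these sample timestamps, the window ending at 30,000 ms contains the first three events, which is the overlap that distinguishes a sliding window from a tumbling one.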

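
Sketch for slide 33 (Under the Hood): the serialize, send, deserialize, execute steps can be mimicked in plain Java. Everything here is an assumption made for illustration: the `Dag` class is a hypothetical stand-in (not Jet's DAG type), Java serialization replaces whatever wire format Jet uses, "sending" is just handing the same byte array to each simulated node, and "executing" is reduced to listing vertices in topological order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DagLifecycleSketch {

    /** Hypothetical minimal DAG: vertex names plus directed edges. */
    static final class Dag implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<String> vertices = new ArrayList<>();
        final Map<String, List<String>> edges = new HashMap<>();

        Dag vertex(String name) { vertices.add(name); return this; }

        Dag edge(String from, String to) {
            edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
            return this;
        }
    }

    /** Step "Serialize DAG": turn the graph into bytes. */
    static byte[] serialize(Dag dag) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(dag);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Step "Deserialize DAG": rebuild an independent copy on the receiving node. */
    static Dag deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Dag) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stand-in for "Execute DAG": vertices in topological order (Kahn's algorithm). */
    static List<String> executionOrder(Dag dag) {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String v : dag.vertices) inDegree.put(v, 0);
        for (List<String> targets : dag.edges.values()) {
            for (String t : targets) inDegree.merge(t, 1, Integer::sum);
        }
        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((v, degree) -> { if (degree == 0) ready.add(v); });
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String t : dag.edges.getOrDefault(v, Collections.<String>emptyList())) {
                if (inDegree.merge(t, -1, Integer::sum) == 0) ready.add(t);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Dag dag = new Dag().vertex("source").vertex("tokenize").vertex("count").vertex("sink")
                .edge("source", "tokenize").edge("tokenize", "count").edge("count", "sink");
        byte[] wire = serialize(dag);
        // "Send DAG to every Node": each simulated node deserializes its own copy.
        for (int node = 0; node < 3; node++) {
            System.out.println("node " + node + " runs " + executionOrder(deserialize(wire)));
        }
    }
}
```

The point of the round trip is that every node works from its own copy of the same graph, so the pipeline definition only has to be written once on the client.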