
Kafka Streams: What it is, and how to use it?

Speaker: Matthias J. Sax | Software Engineer, Confluent | Apache Kafka PMC member



  1. Kafka Streams: What it is, and how to use it? Matthias J. Sax | Software Engineer | @MatthiasJSax | Apache Kafka PMC member
  2. (image-only slide, no text)
  3. Apache Kafka®: A Distributed Streaming Platform
  4. (image-only slide, no text)
  5. Confluent Platform: the event streaming platform built by the original creators of Apache Kafka®. On top of Apache Kafka it adds development & stream processing plus operations and security features, backed by support, services, training & partners. Positioning: mission-critical reliability, a complete event streaming platform, and freedom of choice between datacenter and public cloud, i.e., self-managed software (Confluent Platform) or a fully-managed service (Confluent Cloud).
  6. Apache Kafka release timeline, from 2012 to 2019: 0.7 cluster mirroring → 0.8 intra-cluster replication → 0.9 data integration (Connect API) → 0.10 data processing (Streams API) → 0.11 exactly-once semantics → 1.0 enterprise ready → 2.3 (current release).
  7. Kafka Streams (aka Streams API)
  8. Streams API: a Java client library for building distributed apps; part of the Apache Kafka project; implements the read-process-write pattern; handles state, fault tolerance, and scaling.
  9. (image-only slide, no text)
  10. Why should I use it? The flexibility/simplicity spectrum: Consumer/Producer (subscribe(), poll(), send(), flush()) give the most flexibility; Kafka Streams (filter(), join(), aggregate()) sits in the middle; KSQL (SELECT … FROM …, JOIN … WHERE …, GROUP BY …) gives the most simplicity.
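  To make the flexible end of that spectrum concrete, here is a minimal sketch (not from the deck) of the read-process-write pattern implemented directly with the Consumer and Producer clients; the topic names, serdes, and the lower-casing step are illustrative assumptions. Kafka Streams reduces this to a few DSL calls, as the later slides show.

      import java.time.Duration;
      import java.util.Collections;
      import java.util.Properties;
      import org.apache.kafka.clients.consumer.ConsumerRecord;
      import org.apache.kafka.clients.consumer.KafkaConsumer;
      import org.apache.kafka.clients.producer.KafkaProducer;
      import org.apache.kafka.clients.producer.ProducerRecord;

      public class PlainClientApp {
        public static void main(String[] args) {
          Properties consumerProps = new Properties();
          consumerProps.put("bootstrap.servers", "localhost:9092");
          consumerProps.put("group.id", "plain-client-app");
          consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.LongDeserializer");
          consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

          Properties producerProps = new Properties();
          producerProps.put("bootstrap.servers", "localhost:9092");
          producerProps.put("key.serializer", "org.apache.kafka.common.serialization.LongSerializer");
          producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

          try (KafkaConsumer<Long, String> consumer = new KafkaConsumer<>(consumerProps);
               KafkaProducer<Long, String> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(Collections.singletonList("input-topic"));
            while (true) {
              // read-process-write managed entirely by the application:
              // state handling, fault tolerance, and scaling are our problem here.
              for (ConsumerRecord<Long, String> record : consumer.poll(Duration.ofMillis(100))) {
                producer.send(new ProducerRecord<>("output-topic", record.key(), record.value().toLowerCase()));
              }
              producer.flush();
            }
          }
        }
      }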
  11. How can I use it?
      <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-streams</artifactId>
        <version>2.3.0</version>
      </dependency>
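  Beyond the dependency, a Kafka Streams application is just a normal Java program. The following minimal skeleton is a sketch assuming an illustrative application id, topic names, and default serdes (none of these appear on the slide); the DSL calls it wires up are covered on the following slides.

      import java.util.Properties;
      import org.apache.kafka.common.serialization.Serdes;
      import org.apache.kafka.streams.KafkaStreams;
      import org.apache.kafka.streams.StreamsBuilder;
      import org.apache.kafka.streams.StreamsConfig;

      public class MyStreamsApp {
        public static void main(String[] args) {
          // Minimal required configuration: an application id and the brokers to
          // bootstrap from; default serdes are set for convenience.
          Properties config = new Properties();
          config.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
          config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
          config.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.Long().getClass());
          config.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

          // Define the processing logic with the DSL (details on the following slides);
          // here the topology simply copies one topic to another.
          StreamsBuilder builder = new StreamsBuilder();
          builder.stream("input-topic").to("output-topic");

          // Start the application; it runs until closed.
          KafkaStreams streams = new KafkaStreams(builder.build(), config);
          streams.start();
          Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
      }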
  12. How do I deploy my application?
  13. Scaling Your Application
  14. Scaling Your Application (cont.)
  15. Scaling Your Application (cont.)
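  Slides 13 to 15 are diagrams. What they illustrate, a documented property of Kafka Streams rather than something spelled out in the transcript text, is that you scale by running more threads or more instances of the same application: all instances sharing one application.id form a group, and the partition-based tasks are redistributed across them automatically. A hedged configuration sketch, reusing the illustrative names from the skeleton above:

      // Same application.id on every instance: start a second copy of the app on
      // another machine and Kafka Streams rebalances the tasks across both.
      config.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
      // Scale up within a single instance by running more stream threads.
      config.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 4);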
  16. Data Model
  17. Transformations (map, filter, etc.):
      StreamsBuilder builder = new StreamsBuilder();
      KStream<Long,String> inputStream = builder.stream("input-topic");
      KStream<Long,String> outputStream = inputStream.mapValues(value -> value.toLowerCase());
      outputStream.to("output-topic");
  18. Stream Aggregations:
      KTable<Long,Long> countPerKey = stream.groupByKey().count();
  19. Data Model (cont.)
  20. Streams: facts/events; infinite/unbounded; append-only; immutable.
  21. Tables: current state; finite/bounded; insert, update, delete; mutable.
  22. Stream-Table Duality
  23. Stream Aggregations: The Changelog:
      KTable<Long,Long> countPerKey = stream.groupByKey().count();
      countPerKey.toStream().to("topic");
  24. Creating Tables:
      KTable<Long,String> inputTable = builder.table("changelog-topic");
  25. Transforming Tables:
      // The slide calls filter() without arguments; a predicate is required,
      // so the null check below is an illustrative placeholder.
      KTable<Long,String> outputTable = inputTable.filter((key, value) -> value != null);
  26. Joining Streams and Tables:
      KStream<Long,String> enrichedStream = inputStream.join(inputTable, …);
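  The ellipsis on the slide stands for the join logic. A sketch with an explicit ValueJoiner, assuming both value types are String and a purely illustrative enrichment:

      KStream<Long, String> enrichedStream = inputStream.join(
          inputTable,
          // ValueJoiner: combine each stream record with the current table value for its key.
          (streamValue, tableValue) -> streamValue + " (enriched with: " + tableValue + ")");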
  27. DSL operations: groupBy, filter, aggregate, windowedBy (tumbling, hopping, session), mapValues, flatMap, foreach, join (stream-stream, stream-table, table-table).
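  For the windowing operations listed above, a sketch of a tumbling-window count; the window size and variable names are illustrative assumptions, and it uses TimeWindows and Windowed from org.apache.kafka.streams.kstream plus java.time.Duration:

      // Count records per key in 5-minute tumbling windows; the result is a
      // table keyed by (key, window).
      KTable<Windowed<Long>, Long> countPerKeyPerWindow =
          stream.groupByKey()
                .windowedBy(TimeWindows.of(Duration.ofMinutes(5)))
                .count();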
  28. Processor API:
      Topology topology = new Topology();
      topology.addSource(…);
      topology.addProcessor(…);
      topology.addSink(…);
      topology.addStateStore(…);
      // The DSL compiles down to a topology:
      // StreamsBuilder builder = …
      // Topology topology = builder.build();
      KafkaStreams instance = new KafkaStreams(topology, config);
      instance.start();
  29. The Processor Interface: full flexibility; emit zero or more output records; look up or modify data in your state stores; access metadata (e.g., timestamp, headers) via the context; schedule punctuations via context.schedule(…), based on event time or wall-clock time, as regular callbacks.
      public class MyProcessor<K,V> implements Processor<K,V> {
        private ProcessorContext context;
        private KeyValueStore<K,V> store;

        public void init(ProcessorContext c) {
          context = c;
          store = (KeyValueStore<K,V>) context.getStateStore(…);
        }

        public void process(K key, V value) {
          // business logic
          store.put(key, value);
          context.forward(key, value);
        }

        public void close() {}
      }
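  To connect MyProcessor with the state store it fetches in init(), the topology from slide 28 has to register both and link them by name. A sketch with illustrative node, store, and topic names, using Stores from org.apache.kafka.streams.state and Serdes from org.apache.kafka.common.serialization:

      Topology topology = new Topology();
      topology.addSource("Source", "input-topic");
      topology.addProcessor("MyProcessorNode", MyProcessor::new, "Source");
      topology.addStateStore(
          Stores.keyValueStoreBuilder(
              Stores.persistentKeyValueStore("my-store"),  // must match the name the processor
              Serdes.Long(),                               // passes to context.getStateStore(...)
              Serdes.String()),
          "MyProcessorNode");
      topology.addSink("Sink", "output-topic", "MyProcessorNode");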
  30. Recap of the flexibility/simplicity spectrum: Consumer/Producer (subscribe(), poll(), send(), flush()) at the flexible end; Kafka Streams (filter(), join(), aggregate()) in the middle; KSQL (SELECT … FROM …, JOIN … WHERE …, GROUP BY …) at the simple end.
  31. The same spectrum with the Streams API split in two, ordered from most flexible to most simple: Consumer/Producer (subscribe(), poll(), send(), flush()); Streams PAPI (source/processor/sink, state store, process(), punctuations); Streams DSL (filter(), join(), aggregate()); KSQL (SELECT … FROM …, JOIN … WHERE …, GROUP BY …).
  32. How to get started:
      • https://docs.confluent.io/current/streams/index.html
      • https://github.com/confluentinc/kafka-streams-examples
      • https://www.confluent.io/resources/ (Kafka Summit talk recordings, white papers, books)
      • Kafka Streams in Action by Bill Bejeck: https://www.manning.com/books/kafka-streams-in-action
      Getting help:
      • User mailing list: https://kafka.apache.org/contact
      • Confluent Google Group: https://groups.google.com/forum/#!topic/confluent-platform
      • Confluent Community Slack: https://launchpass.com/confluentcommunity
      • StackOverflow
  33. Try Confluent Cloud for free (bit.ly/TryConfluentCloud): until the end of 2019, Confluent is giving new users $50 of free usage per month for their first 3 months. Sign up for a Confluent Cloud account; you will be required to enter credit card information, but you will not be charged unless you exceed the $50 of usage in any of the first 3 months or you fail to cancel your subscription before the end of the promotion. Keep an eye on your account to make sure you have enough free credits remaining for the rest of each subscription month. If you do not want to continue past the promotion, cancel before the 3 months end; otherwise you will start being charged full price. To cancel, stop all streaming and storing of data in Confluent Cloud and email cloud-support@confluent.io.
  34. We are hiring! @MatthiasJSax | matthias@confluent.io | mjsax@apache.org
