
Building Microservices with Apache Kafka


Building distributed systems is challenging. Luckily, Apache Kafka provides a powerful toolkit for putting together big services as a set of scalable, decoupled components. In this talk, I'll describe some of the design tradeoffs when building microservices, and how Kafka's powerful abstractions can help. I'll also talk a little bit about what the community has been up to with Kafka Streams, Kafka Connect, and exactly-once semantics.

Presentation by Colin McCabe, Confluent, Big Data Day LA

Published in: Software

Building Microservices with Apache Kafka

  1. Building Microservices with Apache Kafka™, by Colin McCabe
  2. About Me
  3. Roadmap ● Example network service • Why microservices? • Why Kafka? ● Apache Kafka background ● How Kafka helps scale microservices ● Kafka APIs • Kafka Connect API • Kafka Streams API ● Wrap up ● New Kafka features and improvements
  4. Newsfeed Application
  5. First Try: Monolithic Service (a single process)
  6. Second Try: Microservices with REST (diagram: Frontend, Emailer, HDFS Connector, Metrics Connector)
  7. Third Try: Microservices with Kafka (diagram: Frontend)
  8. Themes ● Improving decoupling • Everything in one big app: no decoupling • Microservices with REST: multiple services • Microservices with Kafka: decoupled services sharing data ● Improving scalability • Everything in one big app: single node • Microservices with REST: one node per service • Microservices with Kafka: scalable microservices
  9. Apache Kafka ● A distributed streaming platform ● https://kafka.apache.org/intro ● Kafka was built at LinkedIn around 2010 ● Multi-platform: clients in Java, Scala, C, C++, Python, Go, C#, …
  10. Kafka Adoption
  11. Kafka Concepts: the 10,000-foot view ● 4 APIs • Producer • Consumer • Connector • Stream Processor
  12. Producers and Consumers ● Producers write messages; consumers read messages ● Each message has a key and a value
  13. Topics ● The frontend writes records such as { ‘story’: ‘my news story’, ‘user’: ‘foo’, ‘timestamp’: <time> } to the ‘views’ topic, and the backend reads them
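As a sketch of the ‘views’ record above, the snippet below builds the JSON value for one page view; the class and method names are illustrative, not from the talk, and the comment shows how the value would be published with the Java kafka-clients producer.

```java
public class ViewsProducerSketch {
    // Build the JSON value for one page-view event, keyed by user.
    static String viewsValue(String story, String user, long timestamp) {
        return String.format(
            "{\"story\": \"%s\", \"user\": \"%s\", \"timestamp\": %d}",
            story, user, timestamp);
    }

    public static void main(String[] args) {
        String value = viewsValue("my news story", "foo", 1500000000000L);
        // With kafka-clients on the classpath, this would be published as:
        //   producer.send(new ProducerRecord<>("views", "foo", value));
        System.out.println(value);
    }
}
```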
  14. Kafka is Durable ● Data is replicated to multiple servers and persisted to disk ● Configurable log retention ● Consumers can read from any part of the log
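The durability and retention points above correspond to broker settings like these; the fragment is illustrative (the values are examples, not from the talk):

```properties
# server.properties fragment (example values)
log.retention.hours=168        # keep log segments for 7 days
default.replication.factor=3   # replicate each new topic's partitions to 3 brokers
min.insync.replicas=2          # require 2 in-sync replicas to ack an acks=all write
```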
  15. Scaling with Kafka ● Can have multiple producers writing to a topic ● Can have multiple consumers reading from a topic ● Can add new microservices to consume data easily • Example: add more microservices processing views • Organize microservices around data, rather than APIs ● Can add more Kafka brokers to handle more messages and topics • Horizontal scalability
  16. Scaling a Topic with Multiple Partitions (diagram: the frontend produces to partitions of the ‘events’ topic, each consumed by a backend)
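The routing behind partitioning can be sketched in plain Java. This is a simplified stand-in (the real default partitioner hashes the serialized key with murmur2), but it illustrates the property that matters: the same key always lands on the same partition.

```java
public class PartitionerSketch {
    // Simplified key-to-partition mapping: mask off the sign bit and mod
    // by the partition count, so the result is stable and in range.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int p1 = partitionFor("user-foo", 3);
        int p2 = partitionFor("user-foo", 3);
        System.out.println(p1 == p2);          // same key -> same partition
        System.out.println(p1 >= 0 && p1 < 3); // always a valid partition
    }
}
```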
  17. Load Balancing with Multiple Consumers (diagram: the ‘story_emails’ topic shared across the ‘emailer’ consumer group)
  18. Partition Reassignment (diagram: partitions of the ‘story_emails’ topic rebalanced across the ‘emailer’ consumer group)
  19. Connecting to External Services (diagram: the frontend reaches external systems through the Kafka Connect API)
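The load balancing above can be sketched as a range-style assignment, in the spirit of Kafka's RangeAssignor: partitions are split into contiguous chunks, and the first (partitions mod consumers) members each take one extra. This is an illustration of the idea, not the broker's actual code.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeAssignSketch {
    // Assign numPartitions partitions (0..n-1) to numConsumers group
    // members as contiguous ranges of near-equal size.
    static List<List<Integer>> assign(int numPartitions, int numConsumers) {
        List<List<Integer>> out = new ArrayList<>();
        int base = numPartitions / numConsumers;
        int extra = numPartitions % numConsumers;
        int next = 0;
        for (int c = 0; c < numConsumers; c++) {
            int count = base + (c < extra ? 1 : 0);
            List<Integer> mine = new ArrayList<>();
            for (int i = 0; i < count; i++) mine.add(next++);
            out.add(mine);
        }
        return out;
    }

    public static void main(String[] args) {
        // 5 partitions of 'story_emails' shared by 2 emailer consumers
        System.out.println(assign(5, 2)); // [[0, 1, 2], [3, 4]]
    }
}
```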
  20. Kafka Connect API docs.confluent.io/current/connect/ ● A connector instance is responsible for copying data between Kafka and an external system ● A connector instance splits its work across connector tasks ● Connectors are packaged and deployed as connector plugins
  21. Kafka Streams API kafka.apache.org/documentation/streams ● Process streams of data ● Fault-tolerant and scalable
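A connector instance is defined by a small configuration. The fragment below is modeled on the FileStreamSink connector that ships with Kafka; the connector name, file path, and topic are example values:

```properties
name=views-file-sink
connector.class=FileStreamSink
tasks.max=1
file=/tmp/views.out
topics=views
```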
  22. Calculating News Reader Metrics ● Joining the clicks stream (Alice 13, Bob 4, Chao 25, Bob 19, Dave 55, …) with the locations table (Alice europe, Bob us, Chao asia, Dave europe, …) yields clicks per location (europe 68, us 23, asia 25, …)
  23. Kafka Streams API ● Inputs and outputs are Kafka streams ● Fault tolerance, rebalancing, and scalability provided by Kafka ● KStream: a stream of records ● KTable: a changelog stream viewed as a table
  24. Joining the Clicks and Location Streams in KStreams
      KStream<String, Long> userClicksStream = builder.stream(..., "user-clicks-topic");
      KTable<String, String> userRegionsTable = builder.table(..., "user-regions-topic");
      KTable<String, Long> clicksPerRegion = userClicksStream
          .leftJoin(userRegionsTable,
              (c, r) -> new RegionWithClicks(r == null ? "UNKNOWN" : r, c))
          .map((user, regionWithClicks) -> new KeyValue<>(regionWithClicks.getRegion(),
                                                          regionWithClicks.getClicks()))
          .groupByKey(...)
          .reduce((c1, c2) -> c1 + c2, ...);
      clicksPerRegion.to("clicks-per-region-topic", ...);
  25. Wrap-Up ● Kafka provides load balancing and scalability ● Kafka, with the Connect and Streams APIs, decouples the front end from the back end
  26. New Kafka Features and Improvements ● Exactly-once semantics in Kafka 0.11 • https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/ ● Consumer and producer performance improvements • Up to +20% producer throughput • Up to +50% consumer throughput ● Better classpath isolation for Kafka Connect connectors
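As a sketch of what opting into the 0.11 exactly-once machinery looks like on the producer side, the settings below enable idempotence and transactions; the broker address and transactional id are placeholders, and with kafka-clients these Properties would be passed to new KafkaProducer<>(props).

```java
import java.util.Properties;

public class ExactlyOnceConfig {
    // Producer settings for exactly-once semantics in Kafka 0.11+.
    static Properties exactlyOnceProducerConfig(String transactionalId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("enable.idempotence", "true");  // deduplicate broker-side on retry
        props.put("acks", "all");                 // required by idempotence
        props.put("transactional.id", transactionalId); // enables transactions
        return props;
    }

    public static void main(String[] args) {
        Properties p = exactlyOnceProducerConfig("newsfeed-writer-1");
        System.out.println(p.getProperty("enable.idempotence")); // true
    }
}
```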
  27. Conclusion ● The loose coupling, deployability, and testability of microservices make them a great way to scale ● Apache Kafka is an incredibly useful building block for many different microservices ● Kafka is reliable and does the heavy lifting ● Kafka Connect is a great API for connecting with external databases, Hadoop clusters, and other external systems ● Kafka Streams can process data in real time ● https://www.confluent.io/solutions/microservices/
  28. Thank You! https://www.confluent.io/download https://www.confluent.io/careers
