
Building Microservices with Apache Kafka



Building distributed systems is challenging. Luckily, Apache Kafka provides a powerful toolkit for putting together big services as a set of scalable, decoupled components. In this talk, I'll describe some of the design tradeoffs when building microservices, and how Kafka's powerful abstractions can help. I'll also talk a little bit about what the community has been up to with Kafka Streams, Kafka Connect, and exactly-once semantics.

Presentation by Colin McCabe, Confluent, Big Data Day LA

Published in: Software

Building Microservices with Apache Kafka

  1. Building Microservices with Apache Kafka™, by Colin McCabe
  2. About Me
  3. Roadmap
     ● Example network service
       • Why microservices?
       • Why Kafka?
     ● Apache Kafka background
     ● How Kafka helps scale microservices
     ● Kafka APIs
       • Kafka Connect API
       • Kafka Streams API
     ● Wrap up
     ● New Kafka features and improvements
  4. Newsfeed Application
  5. First Try: Monolithic Service (diagram: Single Process)
  6. Second Try: Microservices with REST (diagram: Frontend, Emailer, HDFS Connector, Metrics Connector)
  7. Third Try: Microservices with Kafka (diagram: Frontend)
  8. Themes
     ● Improving Decoupling
       • Everything in one big app: no decoupling
       • Microservices with REST: multiple services
       • Microservices with Kafka: decoupled services sharing data
     ● Improving Scalability
       • Everything in one big app: single node
       • Microservices with REST: one node per service
       • Microservices with Kafka: scalable microservices
  9. Apache Kafka
     ● A distributed streaming platform
     ● https://kafka.apache.org/intro
     ● Kafka was built at LinkedIn around 2010
     ● Multi-platform: clients in Java, Scala, C, C++, Python, Go, C#, …
  10. Kafka Adoption
  11. Kafka Concepts: the 10,000 foot view
      ● 4 APIs
        • Producer
        • Consumer
        • Connector
        • Stream Processor
  12. Producers and Consumers (diagram: producers write messages; consumers read messages)
      message
      ● key
      ● value
  13. Topics (diagram: Frontend, ‘views’ topic, Backend)
      {
        ‘story’: ‘my news story’,
        ‘user’: ‘foo’,
        ‘timestamp’: <time>
      }
  14. Kafka is Durable (diagram: Frontend, ‘views’ topic)
      ● Data is replicated to multiple servers and persisted to disk.
      ● Configurable log retention.
      ● Consumers can read from any part of the log.
  15. Scaling with Kafka
      ● Can have multiple producers writing to a topic
      ● Can have multiple consumers reading from a topic
      ● Can add new microservices to consume data easily
        • Example: add more microservices processing views
        • Organize microservices around data, rather than APIs
      ● Can add more Kafka brokers to handle more messages and topics
        • Horizontal scalability
  16. Scaling a Topic with Multiple Partitions (diagram: Frontend, events topic, Backend, Backend, Backend)
  17. Load Balancing with Multiple Consumers (diagram: Frontend, story_emails topic, emailer consumer group)
  18. Partition Reassignment (diagram: Frontend, story_emails topic, emailer consumer group)
  19. Connecting to External Services (diagram: Frontend, Kafka Connect API)
  20. Kafka Connect API (docs.confluent.io/current/connect/)
      Connector Instance
      ● Responsible for copying data between Kafka and an external system
      Connector Task
      Connector Plugin
  21. Kafka Streams API (kafka.apache.org/documentation/streams)
      ● Process streams of data.
      ● Fault-tolerant and scalable.
  22. Calculating News Reader Metrics
      clicks: Alice 13, Bob 4, Chao 25, Bob 19, Dave 55, ...
      locations: Alice europe, Bob us, Chao asia, Bob us, Dave europe, ...
      clicks per location (clicks + locations): europe 68, us 23, asia 25, ...
  23. Kafka Streams API
      ● Inputs and outputs are Kafka streams
      ● Fault-tolerance, rebalancing, scalability provided by Kafka
      ● KStream
      ● KTable
  24. Joining the Clicks and Location Streams in KStreams
      // Stream of (user, clicks) events.
      KStream<String, Long> userClicksStream =
          builder.stream(..., "user-clicks-topic");
      // Table holding the latest region for each user.
      KTable<String, String> userRegionsTable =
          builder.table(..., "user-regions-topic");
      // Join each click with the user's region, re-key by region, and sum the clicks.
      KTable<String, Long> clicksPerRegion = userClicksStream
          .leftJoin(userRegionsTable,
              (c, r) -> new RegionWithClicks(r == null ? "UNKNOWN" : r, c))
          .map((user, regionWithClicks) ->
              new KeyValue<>(regionWithClicks.getRegion(), regionWithClicks.getClicks()))
          .reduceByKey((c1, c2) -> c1 + c2, ...);
      clicksPerRegion.to("clicks-per-region-topic", ...);
  25. Wrap-Up (diagram: Frontend, Kafka Connect, Kafka Streams)
      ● load balancing & scalability
      ● decouple front-end and back-end
  26. New Kafka Features and Improvements
      ● Exactly once semantics in Kafka 0.11
        • https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/
      ● Consumer and producer performance improvements
        • Up to +20% producer throughput
        • Up to +50% consumer throughput
      ● Better CLASSPATH isolation for Kafka Connect connectors
  27. Conclusion
      ● The loose coupling, deployability, and testability of microservices make them a great way to scale.
      ● Apache Kafka is an incredibly useful building block for many different microservices.
      ● Kafka is reliable and does the heavy lifting.
      ● Kafka Connect is a great API for connecting with external databases, Hadoop clusters, and other external systems.
      ● Kafka Streams can process data in real time.
      ● https://www.confluent.io/solutions/microservices/
  28. Thank You!
      https://www.confluent.io/download
      https://www.confluent.io/careers
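
Slides 16 to 18 (multiple partitions, load balancing across a consumer group, and partition reassignment) are diagrams only, so the mechanics can be hard to picture from the transcript. The following is a minimal sketch in plain Java, with no Kafka client dependency, of the two ideas involved: a key is mapped to a partition, and partitions are spread over the members of a consumer group. The class name PartitionSketch, both method names, and the consumer names are hypothetical. Kafka's default partitioner hashes the serialized key with murmur2; String.hashCode() stands in here only to keep the sketch self-contained, and the round-robin assignment is an illustration, not Kafka's actual assignor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionSketch {

    // Hypothetical partitioner: the same key always maps to the same partition.
    // Assumption: String.hashCode() replaces the murmur2 hash Kafka uses by default.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Hypothetical round-robin assignment of partitions to consumer group members.
    // Each partition is owned by exactly one member of the group.
    static Map<String, List<Integer>> assign(int numPartitions, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            String owner = consumers.get(p % consumers.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Messages with the same key land in the same partition.
        System.out.println("partition for user 'foo': " + partitionFor("foo", 6));

        // Three emailer instances share six partitions of the story_emails topic.
        List<String> group = Arrays.asList("emailer-1", "emailer-2", "emailer-3");
        System.out.println(assign(6, group));

        // If one instance leaves, recomputing the assignment moves its
        // partitions to the remaining members (partition reassignment).
        System.out.println(assign(6, Arrays.asList("emailer-1", "emailer-2")));
    }
}
```

Calling assign again with a shorter member list is the sketch's stand-in for a rebalance: every partition still has exactly one owner, so no messages are left unread when a consumer goes away.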
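
The clicks-per-location calculation on slides 22 and 24 can also be checked by hand. The following is a sketch in plain Java collections, not the Kafka Streams API, that applies the same logic as the slide's leftJoin, map, and reduce steps to the sample rows from slide 22. The class ClicksPerRegionSketch, the Click type, and the method name are hypothetical; users with no known region fall back to "UNKNOWN", as in the slide's code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ClicksPerRegionSketch {

    // One record from the clicks stream: a user and a click count.
    static final class Click {
        final String user;
        final long count;

        Click(String user, long count) {
            this.user = user;
            this.count = count;
        }
    }

    // Left-join each click with the user's region, re-key by region,
    // and sum the click counts per region.
    static Map<String, Long> clicksPerRegion(List<Click> clicks,
                                             Map<String, String> userRegions) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Click click : clicks) {
            // Left join: keep the click even when the user has no region.
            String region = userRegions.getOrDefault(click.user, "UNKNOWN");
            totals.merge(region, click.count, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Sample rows taken from slide 22.
        List<Click> clicks = Arrays.asList(
            new Click("Alice", 13), new Click("Bob", 4), new Click("Chao", 25),
            new Click("Bob", 19), new Click("Dave", 55));

        Map<String, String> userRegions = new LinkedHashMap<>();
        userRegions.put("Alice", "europe");
        userRegions.put("Bob", "us");
        userRegions.put("Chao", "asia");
        userRegions.put("Dave", "europe");

        // prints {europe=68, us=23, asia=25}
        System.out.println(clicksPerRegion(clicks, userRegions));
    }
}
```

The totals match the slide: europe is 13 + 55, us is 4 + 19, and asia is 25. In Kafka Streams the same result is maintained continuously as new records arrive, whereas this sketch computes it once over a fixed list.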
