Kafka short

Published in: Software

  1. Kafka for BigData Processing. Yanai Franchi, Tikal
  2. Find “Hot” Places
  3.
  4. Let's Develop “Gogobot Checkins Heat-Map”. Diagram labels: Gogobot checkin, Heat Map Service
  5. Key Notes
     ● Collector Service: collects checkins as text addresses
       – We need to use a GeoLocation Service
     ● Upon each elapsed interval, the list of the latest locations is displayed as a Heat-Map in the GUI.
     ● Web-scale service: tens of thousands of checkins per second all over the world (imaginary, but let's do it for the exercise).
  6. Heat-Map Context. Diagram labels: Gogobot System (three Gogobot Micro Services); Text-Address Checkins; Heat-Map Service; Geo Location Service, Get-GeoCode(Address); Last Interval Locations; Heat-Map
  7. Tons of Addresses Arriving Every Second
  8. First Reaction...
  9. Diagram labels: Checkin HTTP Reactor; Checkins Topic; Storm Heat-Map Topology; Hotzones Topic; Web App; Push via WebSocket; Publish Checkins; HDFS; Checkin HTTP Firehose
  10.
  11. They Are All Good, But Not for All Use-Cases
  12. Kafka: A Little Introduction
  13.
  14. Why?
  15. LinkedIn Original Architecture
  16.
  17. What LinkedIn Wants...
  18. Looks Familiar: Use Messaging (e.g. JMS, RabbitMQ)
  19.
  20.
  21.
  22.
  23. It Didn't Scale...
  24. Paradigm Change: Do NOT track message consumption
  25.
  26.
  27.
  28. Stateless Broker & Doesn't Fear the File System
  29. Topics
      ● Logical collections of partitions (the physical files).
      ● A broker contains some of the partitions for a topic
  30. Within Each Consumer Group, a Partition Is Consumed by Exactly One Consumer
  31. Distributed & Fault-Tolerant
  32. Diagram labels: Broker 1, Broker 2, Broker 3; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2
  33. Diagram labels: Broker 1, Broker 2, Broker 3, Broker 4; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2
  34. Diagram labels: Broker 1, Broker 2, Broker 3, Broker 4; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2
  35. Diagram labels: Broker 1, Broker 2, Broker 3, Broker 4; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2
  36. Diagram labels: Broker 1, Broker 2, Broker 3, Broker 4; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2
  37. Diagram labels: Broker 1, Broker 2, Broker 3, Broker 4; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2
  38. Diagram labels: Broker 1, Broker 2, Broker 3, Broker 4; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2
  39. Diagram labels: Broker 1, Broker 2, Broker 3; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2
  40. Diagram labels: Broker 1, Broker 2, Broker 3; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2
  41. Diagram labels: Broker 1, Broker 2, Broker 3; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2
  42. Diagram labels: Broker 1, Broker 2, Broker 3; ZooKeeper; Consumer 1; Producer 1, Producer 2
  43. Diagram labels: Broker 1, Broker 2, Broker 3; ZooKeeper; Consumer 1; Producer 1, Producer 2
  44. Diagram labels: Broker 1, Broker 2, Broker 3; ZooKeeper; Consumer 1; Producer 1, Producer 2
  45. Performance Benchmark: 1 Broker, 1 Producer, 1 Consumer
  46.
  47.
  48. LinkedIn Kafka Performance (2012)
      ● 8 nodes per datacenter
        – ~20 GB RAM available for Kafka
        – 6 TB storage, RAID 10, basic SATA drives
      ● 10 billion messages/day
      ● Sustained peak:
        – 172,000 messages/second written
        – 950,000 messages/second read
      ● 367 topics
      ● 40 real-time consumers
      ● Many ad hoc consumers
      ● 9.5 TB log retained (~6 days)
      ● End-to-end delivery time: a few seconds
  49. Thanks
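
A note on slides 24 and 28 to 30: the ideas there (the broker does not track message consumption, a topic is a logical collection of partitions, and each partition is read by exactly one consumer within a group) can be illustrated with a small in-memory sketch. This is not Kafka's API. The names Topic, Consumer and assign_partitions, the byte-sum partitioner and the round-robin assignment are all assumptions made for illustration only.

```python
from collections import defaultdict


class Topic:
    """A topic modelled as a logical collection of append-only partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Hypothetical partitioner: byte sum of the key modulo partition count.
        p = sum(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append(value)
        return p

    def read(self, partition, offset, max_messages=10):
        # The "broker" side keeps no per-consumer state: the caller supplies
        # the offset, and reading never deletes anything from the log.
        return self.partitions[partition][offset:offset + max_messages]


def assign_partitions(num_partitions, consumers):
    """Round-robin: every partition goes to exactly one consumer of the group."""
    assignment = defaultdict(list)
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return dict(assignment)


class Consumer:
    """Tracks its own position (offset) in each partition it owns."""

    def __init__(self, topic, partitions):
        self.topic = topic
        self.offsets = {p: 0 for p in partitions}

    def poll(self):
        messages = []
        for p, offset in list(self.offsets.items()):
            batch = self.topic.read(p, offset)
            self.offsets[p] = offset + len(batch)  # consumer-side bookkeeping
            messages.extend(batch)
        return messages


if __name__ == "__main__":
    topic = Topic("checkins", num_partitions=4)
    for i in range(8):
        topic.append("user-%d" % i, "checkin-%d" % i)
    group = assign_partitions(4, ["consumer-1", "consumer-2"])
    for name, parts in group.items():
        print(name, parts, Consumer(topic, parts).poll())
```

Because the position lives in the consumer, resetting an entry of Consumer.offsets to 0 replays that partition from the start without any change on the Topic side.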

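A note on slide 48: the figures there can be cross-checked with some quick arithmetic. The sketch below uses only the numbers from the slide. The average message size is a rough estimate that assumes 1 TB = 10^12 bytes and that the retained log holds each message exactly once for 6 days (no replication, compression or index overhead), none of which the slide states.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures taken from slide 48.
messages_per_day = 10_000_000_000
peak_write_per_sec = 172_000
peak_read_per_sec = 950_000
retention_days = 6

# Assumption: 1 TB = 10**12 bytes.
retained_bytes = 9.5e12

# Average write rate implied by the daily volume.
avg_write_per_sec = messages_per_day / SECONDS_PER_DAY

# How far the sustained peak sits above that average.
peak_to_avg = peak_write_per_sec / avg_write_per_sec

# Messages read per message written at peak.
read_fanout = peak_read_per_sec / peak_write_per_sec

# Rough average message size under the assumptions stated above.
avg_bytes_per_message = retained_bytes / (messages_per_day * retention_days)

print("average writes/second: %.0f" % avg_write_per_sec)
print("peak / average write rate: %.2f" % peak_to_avg)
print("reads per write at peak: %.2f" % read_fanout)
print("approx. bytes per message: %.0f" % avg_bytes_per_message)
```

Under these assumptions the daily volume averages out to roughly 116,000 writes per second, so the 172,000 sustained peak is about one and a half times the average, and each written message is read about five and a half times at peak.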