
Scaling MQTT With Apache Kafka



My slides for ApacheCon North America 2014.

Published in: Technology
  • Can we use Kafka over MQTT?
  • Yes. This is exactly what this slideshare is about. Kafka is a queue. You can substitute any other scalable queue instead of Kafka.
  • Tim, it looks like the "firehose" problem you are referring to could have been solved using the "queue" concept available in JMS, STOMP, etc. Basically, the published messages are published again to a queue, from which a group of subscribers can consume. Any JMS or STOMP broker would load balance the messages across the available subscribers. Did you consider this? If so, why didn't you use it? How is this different from your solution?


  1. SCALING MQTT WITH KAFKA Tim Kellogg April 7, 2014 @kellogh
  2. • MQTT broker • Protocol onboarding • Cloud environment (we’re a startup)
  3. • Standard • Lightweight o >= 2 byte overhead per message • Easy to parse o Length prefixed strings • Requires very little resources on client side o Broker keeps track of state
  4. • Reliable o QoS 1 & 2 o Last Will & Testament messages • Secure o Username + Password o Tunnel over TLS
  5. Publish / Subscribe [diagram labels: Pub ×3, Broker, Sub ×3, Topic/A, Topic/B, Topic/C]
  6. Topics • foo/bar/baz • com.example/device/17/thermo • Patterns • com.example/device/+/thermo • com.example/device/#
  7. Scaling Goals • More than 2 Million connected publishers • More than 65,000 msg/s • Single subscriber
  8. Scaling Goals • Amazon’s EC2 • Horizontal scaling o Reduce cost o Plan for the future o Less impact from downtime
  9. Problems with Scaling MQTT
  10. Load Balancing • Which broker to connect to? o DNS load balancing • HAProxy • QoS 1-2 messages stored in Cassandra o Consistent hash ring
  11. Single Subscriber [diagram labels: Pub ×3, Broker, Sub, Topic/A, Topic/B, Topic/C, Topic/#]
  12. Single Subscriber [diagram labels: Pub ×3, Broker ×3, Sub, Topic/A, Topic/B, Topic/C, Topic/#, Load Balancing]
  13. Single Subscriber [diagram labels: Pub ×3, Broker ×3, Sub, Topic/A, Topic/B, Topic/C, Topic/#, Load Balancing ×2]
  14. Single Subscriber [diagram labels: Broker ×3, Subscriber, Topic/#]
  15. Using HTTP
  16. POST From The Broker [diagram labels: Pub ×3, Topic/A, Topic/B, Topic/C, Broker ×3, HTTP POST ×3, Server ×3, Load Balancing ×2]
  17. Benefits • Easy to load balance • Well known & well supported
  18. Drawbacks • HTTP is heavy • Headers • Creating & destroying TCP connections • Subscriber servers must be available • Retry logic to guarantee delivery
  19. Apache Kafka
  20. • Distributed log aggregation framework • Server to server • “Smart” clients • Apache ZooKeeper
  21. • Append-only files per topic o Client keeps track of what messages it’s processed • No topic wildcards • Key is used for out of band data • device/42/thermo → topic: device-thermo key: 42
  22. Subscriber Group [diagram labels: Pub ×3, Broker]
  23. Subscriber Group [diagram labels: Pub ×3, Broker ×3, Load Balancing, Kafka]
  24. Results
  25. • Linear scaling for fire hose subscriber • At least 2 million clients • At least 65,000 msg/s
  26. Wish List • Security • Configuration
  27. Open Source IoT
  28. The Book: Mastering The Internet of Things Questions? @kellogh
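
The mapping on the Apache Kafka slide (device/42/thermo → topic: device-thermo, key: 42) can be sketched as below. This is only an illustration, not the code used in the talk: the function name `mqtt_to_kafka` and the `id_level` parameter are hypothetical, and it assumes the device ID always sits at one fixed level of the MQTT topic.

```python
from typing import Tuple


def mqtt_to_kafka(mqtt_topic: str, id_level: int = 1) -> Tuple[str, str]:
    """Split an MQTT topic into a Kafka topic name and a message key.

    Assumption (not from the slides): the level at index `id_level`
    holds the device ID. That level becomes the key; the remaining
    levels are joined with "-" to form the Kafka topic, because Kafka
    has no topic wildcards and the ID must not multiply topic names.
    """
    levels = mqtt_topic.split("/")
    if not 0 <= id_level < len(levels):
        raise ValueError(f"topic {mqtt_topic!r} has no level {id_level}")
    key = levels[id_level]
    kafka_topic = "-".join(levels[:id_level] + levels[id_level + 1:])
    return kafka_topic, key


# The example from the slide.
print(mqtt_to_kafka("device/42/thermo"))
```

With this shape, every thermo reading lands in one Kafka topic regardless of device, and the key carries the device ID as out-of-band data.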

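The topic patterns on the Topics slide follow the MQTT wildcard rules: `+` matches exactly one level, and `#` matches the rest of the topic and has to be the last level of the pattern. A minimal sketch of that matching, assuming a hypothetical helper named `topic_matches` and ignoring the special handling of topics that start with `$`:

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """Return True if an MQTT topic matches a subscription pattern."""
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for i, p in enumerate(p_levels):
        if p == "#":
            # Multi-level wildcard: only valid as the final pattern level;
            # it matches everything from here on, including nothing.
            return i == len(p_levels) - 1
        if i >= len(t_levels):
            # Pattern is longer than the topic.
            return False
        if p != "+" and p != t_levels[i]:
            # Literal level that differs ("+" accepts any single level).
            return False
    # No "#" seen, so both must have the same number of levels.
    return len(p_levels) == len(t_levels)


# Patterns and topics taken from the slide.
print(topic_matches("com.example/device/+/thermo", "com.example/device/17/thermo"))
print(topic_matches("com.example/device/#", "com.example/device/17/thermo"))
```

A single firehose subscriber on `Topic/#` relies on exactly this rule, which is why Kafka's lack of topic wildcards calls for a topic remapping on the way in.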