Kafka Meetup Seattle 2019: Mirus, reliable, high performance replication for Apache Kafka


Kafka Mirror Maker

Published in: Engineering

  1. Mirus: Reliable, high performance replication for Apache Kafka
     Paul Davidson, Principal Engineer
     Seattle Apache Kafka Meetup, April 18, 2019
  2. Kafka Data Replication at Salesforce
     Must send data from multiple global data centres to aggregate clusters
     • Securely
     • No data loss
     • Minimal latency
     • Minimal data duplication
     Challenges:
     • Rapidly increasing load, multiple DCs
     • Dynamic environment: topics frequently added and removed
     • WAN connections
  3. Kafka Data Replication at Salesforce: How was it done?
     Apache Mirror Maker
     • Simple tool provided with Apache Kafka
     • A Kafka Consumer sending data to a Kafka Producer
     • Static configuration
     • Regex whitelist for topics
  4. Mirror Maker
  5. Mirror Maker at Scale
     ● Mirror Maker works well for small static clusters, but ...
     ● At large scale:
       ○ Re-balance loops
         ■ Fix: Careful configuration
           ● Increase,
       ○ Unhandled exceptions (especially in Kafka < 0.11)
         ■ Fix: Apply upstream patches and stay up-to-date!
       ○ Poor handling of missing / offline destination partitions
         ■ Fix: Custom patch for the Consumer Group Coordinator
           ● Filter any topics currently unavailable in the destination
           ● Reliable, but definitely a work-around
  6. Mirror Maker at Scale
     ● Poor control of partition assignment impacts performance
       ○ Stops consuming while committing a batch
       ○ Large batches required for throughput across WAN
       ○ Must run many instances per node to achieve good throughput
     ● Not suitable for API-driven cluster management
       ○ Static configuration: release and restart to update!
       ○ Limited control of topic white-list
     There must be a better way ...
  7. Introducing... Mirus: Dynamic replication service based on Kafka Connect
  8. Mirus? Latin root of Mirror: “wonderful, marvelous”
  9. Aside: Kafka Connect
     ● Built-in Kafka framework
       ○ For reliably streaming data to and from Kafka
     ● Dynamic configuration and status with REST API
     ● Handles cluster management
       ○ “Distributed Herder” built on Kafka cluster management
       ○ Backed by compacted Kafka topics
     ● Continuous ingestion
       ○ Keeps consuming while committing offsets
       ○ Supports multiple consumers and producers
  10. Introducing Mirus
      ● Based on the Kafka Connect framework
      ● Dynamic configuration
        ○ REST API for configuration updates
        ○ Precise control of replication
      ● Configurable parallelism
      ● Support for task-level consumer and producer monitoring
      ● Improved resiliency: automated restart on Kafka Connect task failure
  11. Kafka Connect Overview
      ● Kafka Connect cluster
        ○ A distributed set of Worker processes
        ○ One or more Workers per host
        ○ Managed by Distributed Herder
      ● Worker Processes
        ○ Tasks
        ○ Connectors
          ■ Source - read from X, write to Kafka
          ■ Sink - read from Kafka, write to X
  12. Mirus Internals
      ● Mirus includes:
        ○ Custom Source Connector implementation
        ○ Custom Source Task implementation
        ○ Customized Worker entry point
        ○ Custom monitor threads
  13. Mirus “Kafka Monitor”
      ● The heart of Mirus
        ○ Thread managed by the Mirus Source Connector
        ○ Monitors the state of source and destination Kafka clusters
      ● Handles partition assignment
        ○ Applies white-list to current state
        ○ Missing destination topics are counted but not mirrored
        ○ Triggers rebalances when partition assignments change:
          ■ Source configuration changes
          ■ Source partition changes
          ■ Destination partition creation / deletion
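The partition-selection step described on the Kafka Monitor slide can be sketched roughly as follows. This is an illustrative Python sketch, not the Mirus implementation (Mirus itself is built on Kafka Connect): the function name `partitions_to_mirror` and the sample topic names are hypothetical, and cluster state is passed in as plain sets rather than fetched from Kafka.

```python
import re


def partitions_to_mirror(source_partitions, destination_topics, whitelist_regex):
    """Split white-listed source partitions into mirrored and missing ones.

    source_partitions: iterable of (topic, partition) tuples seen in the source.
    destination_topics: set of topic names that exist in the destination.
    Hypothetical helper; the real Kafka Monitor reads this state from Kafka.
    """
    pattern = re.compile(whitelist_regex)
    assignable, missing = [], []
    for topic, partition in sorted(source_partitions):
        if not pattern.match(topic):
            continue  # topic is not on the white-list
        if topic in destination_topics:
            assignable.append((topic, partition))
        else:
            # Missing destination topics are counted but not mirrored.
            missing.append((topic, partition))
    return assignable, missing


source = {("topic-name-a", 0), ("topic-name-a", 1), ("topic-name-b", 0), ("other", 0)}
destination = {"topic-name-a"}
assignable, missing = partitions_to_mirror(source, destination, r"^topic-name.*$")

# A rebalance would be triggered only when the assignable list changes.
previous_assignment = []
needs_rebalance = assignable != previous_assignment
print(assignable, len(missing), needs_rebalance)
```

Comparing the new assignable list with the previous one is one simple way to decide when a rebalance is needed; it covers source configuration changes, source partition changes, and destination partition creation or deletion with a single check.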
  14. Worker Process
  15. Dynamic Configuration
      ● REST API to POST connector configuration updates
        ○ Connector config dynamic, worker config static
      ● Config updates trigger rebalance
      ● Configuration topic location is configurable: can be in Source or Destination Kafka Cluster
        ○ Use the cluster closest to the Mirus Workers
  16. Partition and Task Assignment
      ● Happens on every rebalance
      ● Mirus partition algorithm pluggable
        ○ Round-robin by default
        ○ Could use metadata, e.g. high-throughput.
      ● Task assignment
        ○ KC framework: round-robin only
        ○ Not pluggable (we could patch this).
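As a rough illustration of the default round-robin partition assignment mentioned above, the sketch below distributes topic partitions across a fixed number of tasks. It is a minimal Python sketch under stated assumptions, not the Mirus code: `round_robin_assign` is a hypothetical name, and partitions are sorted first only to make the result deterministic.

```python
def round_robin_assign(partitions, num_tasks):
    """Assign (topic, partition) tuples to tasks in round-robin order.

    Hypothetical helper for illustration; returns one list per task.
    """
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")
    tasks = [[] for _ in range(num_tasks)]
    for index, topic_partition in enumerate(sorted(partitions)):
        tasks[index % num_tasks].append(topic_partition)
    return tasks


partitions = [("a", 0), ("a", 1), ("a", 2), ("b", 0), ("b", 1)]
tasks = round_robin_assign(partitions, 2)
for task_id, assigned in enumerate(tasks):
    print(task_id, assigned)
```

Round-robin balances partition counts but not load; a metadata-aware algorithm, as the slide suggests, could instead spread known high-throughput partitions across tasks.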
  17. Open Sourced: Released September 2018, BSD License
  18. What’s Next?
      ● Integration with our Kafka orchestration tooling
        ○ Rapidly provision and mirror new topics across multiple clusters
      ● Topic creation, topic metadata replication
        ○ Has been requested by users
      ● Mirus Sink Connector
        ○ Improved support for multiple destination clusters in push configurations
  19. Q&A
  20. Appendix
  21. REST API Example
      ● PUT connector configuration update
        ○ E.g. increase number of tasks.
      ● Handler writes to configuration topic
      ● Distributed Herder triggers rebalance
      Increase the number of parallel tasks:
      bash-4.1$ curl localhost:8093/connectors/source-name/config -X PUT --data '{
        "connector.class": "com.salesforce.mirus.kafka.connect.MirusSourceConnector",
        "consumer.bootstrap.servers": "source-hostname:9093",
        "destination.bootstrap.servers": "dest-hostname:9093",
        "name": "source-name",
        "tasks.max": "90",
        "topics.regex": "^topic-name.*$",
        ...
      }'
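The same configuration update could be prepared from a script. The Python sketch below only builds the PUT request and does not send it; `build_config_update` is a hypothetical helper, the host and port are taken from the slide, and the configuration holds only the keys shown there (the slide elides the rest, so a real update would likely need more). The `Content-Type: application/json` header is an assumption about what the Kafka Connect REST endpoint expects; it does not appear in the slide's curl command.

```python
import json
import urllib.request


def build_config_update(base_url, connector_name, config):
    """Build (but do not send) a PUT request for a connector config update."""
    url = f"{base_url}/connectors/{connector_name}/config"
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        # Assumption: the REST endpoint expects a JSON content type.
        headers={"Content-Type": "application/json"},
    )


# Only the keys visible on the slide; the remaining keys are elided there.
config = {
    "connector.class": "com.salesforce.mirus.kafka.connect.MirusSourceConnector",
    "consumer.bootstrap.servers": "source-hostname:9093",
    "destination.bootstrap.servers": "dest-hostname:9093",
    "name": "source-name",
    "tasks.max": "90",
    "topics.regex": "^topic-name.*$",
}

request = build_config_update("http://localhost:8093", "source-name", config)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the update easy to inspect or log before it triggers a rebalance.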
  22. Push Mode Replication [diagram: Leaf DCs (Kafka Cluster, Mirus Cluster), WAN, Aggregate DCs (Kafka Cluster)]
  23. Pull Mode Replication [diagram: Leaf DCs (Kafka Cluster), WAN, Aggregate DCs (Kafka Cluster, Mirus Cluster)]
  24. Mirus Clusters