Kafka Evaluation - High Throughput Message Queue

  1. Kafka Evaluation for Data Pipeline and ETL. Shafaq Abdullah @ GREE. Date: 10/21/2013
  2. Outline: High-Level Architecture; Performance; Scalability; Fault-Tolerance/Error Recovery; Operations and Monitoring; Summary
  3. Intro: Kafka is a distributed pub/sub messaging system. In production at LinkedIn, Foursquare, Twitter, and GREE (almost). Usage: log aggregation (user activity streams), real-time events, monitoring, queuing.
  4. Point-to-Point Data Pipelines: [diagram: sources (game server logs, user tracking, ...) each wired directly to sinks (Hadoop, search, security, social graph, rules/recommendation engine, Vertica, ops data warehouse)]
  5. Central Data Channel: [diagram: the same sources and sinks, now connected through a single message queue instead of point-to-point links]
  6. Kafka Jargon: Topic: a string identifying a message stream. Partition: a logical division of a topic within a broker, to which producer-generated logs are written (e.g. topic kafkaTopic has partitions kafkaTopic1, kafkaTopic2). Replica set: the copies of a partition across brokers, with a leader (handles writes) and followers (handle reads), balanced via hash-key modulo.
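A minimal sketch of how these terms surface in the 0.8-era Java producer API; the broker address, topic name, key, and payload are placeholders, and the default partitioner routes same-keyed messages to the same partition via hash(key) % numPartitions:

```java
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class JargonExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092"); // placeholder address
        props.put("serializer.class", "kafka.serializer.StringEncoder");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));

        // Topic "kafkaTopic" names the message stream; the key "user-42"
        // selects a partition via hash(key) % numPartitions, and the
        // leader replica of that partition takes the write.
        producer.send(new KeyedMessage<String, String>(
                "kafkaTopic", "user-42", "event payload"));
        producer.close();
    }
}
```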
  7. Log - Message Queue [diagram]
  8. Log - Message Queue [diagram, continued]
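Slides 7-8 illustrate Kafka's core abstraction: the message queue is an append-only log. As a rough conceptual sketch (plain Java, not Kafka code), a partition behaves like a log that only grows, with each consumer tracking its own read offset:

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual model only: a partition is an append-only log, and each
// consumer independently tracks how far into it it has read.
class PartitionLog {
    private final List<byte[]> messages = new ArrayList<byte[]>();

    // Producer side: append returns the offset of the new message.
    synchronized long append(byte[] message) {
        messages.add(message);
        return messages.size() - 1;
    }

    // Consumer side: reading deletes nothing, so many consumers can
    // replay the same log independently from different offsets.
    synchronized byte[] read(long offset) {
        return messages.get((int) offset);
    }
}
```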
  9. Hardware Spec of Broker Cluster: AWS m1.large instance, dual-core Intel Xeon 64-bit @ 2.27 GHz, 7.5 GB RAM, 2 x 420 GB hard disk.
  10. Performance Results:
      Producer threads | Transactions | Processing time (s) | Throughput (transactions/s)
      1                | 488396       | 64.129              | 7616
      2                | 1195748      | 110.868             | 10785
      4                | 1874713      | 140.375             | 13355
      10               | 4726941      | 338.094             | 13981
      17               | 7987317      | 568.028             | 14061
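A sketch of how numbers like these can be reproduced with the 0.8 Java producer; the broker address, topic name, message size, and message count are all assumptions:

```java
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class ProducerBench {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092"); // assumed address
        props.put("serializer.class", "kafka.serializer.StringEncoder");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));

        int n = 500000;                                          // messages to send
        String payload = new String(new char[200]).replace('\0', 'x'); // ~200 bytes

        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            producer.send(new KeyedMessage<String, String>("benchTopic", payload));
        }
        long elapsedMs = (System.nanoTime() - start) / 1000000;

        // Throughput = transactions / processing time, as in the table above.
        System.out.printf("%d msgs in %d ms -> %.0f msg/s%n",
                n, elapsedMs, n * 1000.0 / elapsedMs);
        producer.close();
    }
}
```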
  11. Latency vs Durability:
      Ack setting  | Time to publish (ms) | Tradeoff
      No ack       | 0.7                  | Greater data loss
      Wait for ack | 1.5                  | Lesser data loss
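This tradeoff maps to a single producer property in 0.8, request.required.acks; a minimal sketch, with the broker address as a placeholder:

```java
import java.util.Properties;
import kafka.producer.ProducerConfig;

public class AckConfig {
    static ProducerConfig config(String acks) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092"); // placeholder
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // "0"  -> no ack (fire-and-forget): fastest publish, greater data loss
        // "1"  -> wait for the leader's ack: slower, lesser data loss
        // "-1" -> wait for all in-sync replicas: slowest, least data loss
        props.put("request.required.acks", acks);
        return new ProducerConfig(props);
    }
}
```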
  12. Integration Plan: 1. Create a Kafka-Replica Sink. 2. Feed data to the Copier via the Kafka-Sink (a consumer sketch for this step appears below). 3. Benchmark the Kafka-Copier-Vertica pipeline. 4. Improve/refactor for performance.
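A sketch of step 2 under stated assumptions: the high-level consumer API is the real 0.8 interface, but the ZooKeeper address, group id, topic name, and the copierWrite() helper (a stand-in for GREE's internal Copier component) are all hypothetical:

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class KafkaSinkToCopier {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "zk1:2181"); // assumed ZK address
        props.put("group.id", "copier-group");      // hypothetical group id

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // One stream for the topic; the high-level consumer balances
        // partitions across all consumers in the group via ZooKeeper.
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("kafkaTopic", 1));

        ConsumerIterator<byte[], byte[]> it =
                streams.get("kafkaTopic").get(0).iterator();
        while (it.hasNext()) {
            copierWrite(it.next().message()); // blocks until a message arrives
        }
    }

    // Hypothetical stand-in for the internal Copier component that
    // batches rows and bulk-loads them into Vertica.
    static void copierWrite(byte[] message) { /* ... */ }
}
```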
  13. CA-P theorem: C - Consistency (producer sync mode); A - Availability (replication); P - Partition tolerance (the cluster sits on one network, so no network delay is assumed). Result: strongly consistent and highly available (CA, trading away P).
  14. Failure Recovery: conventional quorum replication needs 2f+1 replicas to tolerate f failures (e.g. ZooKeeper); Kafka replication needs only f+1 replicas to tolerate f failures. For example, tolerating 2 failures takes 5 replicas under quorum replication but only 3 under Kafka's scheme.
  15. Error Handling: Follower Failure. Leader: messages are propagated to followers, and the commit offset is checkpointed to disk. Follower failure and recovery: the failed follower is kicked out of the ISR; after restart it truncates its log to the last commit, then catches up with the leader and rejoins the ISR.
  16. Error Handling: Leader Failure. The embedded controller detects leader failure via ZooKeeper; a new leader is elected from the ISR; committed messages are not lost.
  17. Scalability: horizontally scalable - add partitions in the broker cluster as higher throughput is needed (~10 MB/s per server); load balancing for producers and consumers. (A sizing sketch follows below.)
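A back-of-the-envelope sizing sketch based on the slide's ~10 MB/s per server figure; the aggregate throughput target is an assumption:

```java
public class BrokerSizing {
    public static void main(String[] args) {
        double perServerMBps = 10.0; // rough per-server throughput from the slide
        double targetMBps = 80.0;    // assumed aggregate target
        int servers = (int) Math.ceil(targetMBps / perServerMBps);
        System.out.println("Brokers needed: " + servers); // -> 8
    }
}
```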
  18. Monitoring via JMX: number of messages the consumer lags behind the producer; max lag in messages between follower and leader replicas; unclean leader election rate; whether the controller is active on the broker.
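These metrics are exposed as JMX MBeans; a minimal polling sketch, assuming a JMX port of 9999 and a 0.8-style MBean name (both assumptions should be verified against the actual deployment, e.g. with jconsole):

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class LagMonitor {
    public static void main(String[] args) throws Exception {
        // Assumed JMX endpoint; the process must be started with JMX
        // enabled (e.g. JMX_PORT=9999) for this URL to resolve.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker1:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        MBeanServerConnection mbs = connector.getMBeanServerConnection();

        // Assumed 0.8-era MBean name for consumer fetch lag; verify the
        // exact name on your deployment before relying on it.
        ObjectName maxLag = new ObjectName(
                "kafka.consumer:type=ConsumerFetcherManager,name=MaxLag");
        System.out.println("MaxLag = " + mbs.getAttribute(maxLag, "Value"));
        connector.close();
    }
}
```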
  19. Operation Costs: a 3-node cluster sustains ~25 MB/s at <20 ms end-to-end latency, with a replication factor of 3 and 6 consumer groups. Monthly operations: $500 + $250 (3-node ZooKeeper) + 1 man-month.
  20. Nothing is perfect: no callback in the producer's send(); balancing is not fully automatic until a script is run manually; the topic-balancing layer goes through ZooKeeper.
  21. Summary: near-infinite horizontal scaling at ~10 MB/s per server with <10 ms latency; costs <$1000/month plus 1 man-month; no impact on the current pipeline; the 0.8 final release is due in days and will be used in production.