Kafka Evaluation - High Throughout Message Queue
Upcoming SlideShare
Loading in...5

Kafka Evaluation - High Throughout Message Queue






Total Views
Views on SlideShare
Embed Views



2 Embeds 16

http://www.slideee.com 14
http://dschool.co 2



Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
Post Comment
Edit your comment

Kafka Evaluation - High Throughout Message Queue Kafka Evaluation - High Throughout Message Queue Presentation Transcript

  • Kafka Evaluation for Data Pipeline and ETL Shafaq Abdullah @ GREE Date: 10/21/2013
  • High-Level Architecture Performance Scalability Fault-Tolerance/Error Recovery Operations and Monitoring Summary Outline
  • Kafka - A distributed messaging pub/sub system In Production: Linkedin, FourSquare, Twitter, GREE (almost) Usage: - Log aggregation (user activity stream) - Real-time events - Monitoring - Queuing Intro
  • Point-to-Point Data Pipelines Game Server 1 Logs Hadoop ... User Tracking SecuritySearch Social Graph Rules/ Recomm endation/ Engine VerticaOps DataWare House
  • Message- Central Data Channel Game Server 1 Logs Hadoop ... User Tracking SecuritySearch Social Graph Rules/ Recomm endation/ Engine VerticaOps Data Warehouse Message Queue
  • Topic- A String representing Message Stream Id Partition- Logical Division per topic-level within Broker, for writing logs generated by producer. e.g: kafkaTopic - kafkaTopic1, kafkaTopic2 Replica-Sets- Replica within Broker for a certain partition, with Leader(writes) and follower (read) balanced using hash- key modulu. Kafka Jargon
  • Log- Message Queue
  • Log- Message Queue
  • AWS m1.Large instance Dual Core Intel Xeon 64- bit@2.27GHz 7.5 GB RAM 2 x 420 GB hardisk Hardware Spec of Broker Cluster
  • Performance Results Producer thread Transactions Processing time (s) Throughput (transaction/sec) 1 488396 64.129 7616 2 1195748 110.868 10785 4 1874713 140.375 13355 10 47269410 338.094 13981 17 7987317 568.028 14061
  • Latency vs Durability Ack Status TIme to publish (ms) Tradeoff No Ack 0.7 Greater data loss Wait for ack 1.5 Lesser data loss
  • 1. Create a Kafka-Replica Sink 2. Feed data to Copier via Kafka-Sink 3. Benchmark Kakfa-Copier-Vertica Pipeline 4. Improve/Refactor for Performance Integration Plan
  • C- Consistency (Producer Sync Mode) A- Availability (Replication) P- Partition Tolerance (Cluster in same network- No Network delay) Strongly Consistent and Highly Available CA-P theorem
  • Conventional Quoram Replication: 2f+1 replicas → f failures (e.g. ZooKeeper) Kafka Replication: f+1 replicas → f failures Failure Recovery
  • Leader : ➢  Message is propagated to follower ➢  Commit offset is checkpointed to disk Follower failure and Recovery: ➢  Kicked out of ISR ➢  After restart, truncates log to last commit ➢  Catches up with leader → ISR Error Handling: Follower Failure
  • ➢ Embedded Controller via ZK detects leader failure ➢ Leader election from ISR ➢ Committed message not lost Error Handling: Leader Failure
  • - Horizontally scalable Add partitions in Broker Cluster as higher throughput needed (~10Mb/s /server) - Balancing for producer and consumer Scalability
  • •  Number of messages the consumer lags behind the producer by •  Max lag in messages btw follower and leader replica •  Unclean leader election rate •  Is controller active on broke Monitoring via JMX
  • 3 Node cluster ~ 25 MB/s, < 20 ms latency (end2end) having replication factor of 3 with 6 consumer group Monthly operations $500 + $250 (zookeeper 3 node) + 1 mm Operation Costs
  • - No callback in producer’s send() - Fully automatic balancing till script is manually run- Balancing layer of topics via ZK Nothing is perfect
  • • Infinite scaling with 10Mb/s /server with < 10 ms latency • Costs <$1000 /month + 1 man-month • No impact on current pipeline • 0.8 final release due in days to be used in production Summary