Kafka Evaluation for
Data Pipeline and ETL
Shafaq Abdullah @ GREE
Date: 10/21/2013
High-Level Architecture
Performance
Scalability
Fault-Tolerance/Error Recovery
Operations and Monitoring
Summary
Outline
Kafka -
A distributed messaging pub/sub system
In Production: LinkedIn, Foursquare, Twitter, GREE (almost)
Usage:
- Log aggregation (user activity stream)
- Real-time events
- Monitoring
- Queuing
Intro
Point-to-Point Data Pipelines
[Diagram: point-to-point connections between the individual systems (Game Server 1, Logs, Hadoop, User Tracking, Security, Search, Social Graph, Rules/Recommendation Engine, Vertica, Ops, Data Warehouse), each pipeline wired directly to its destination.]
Message Queue - Central Data Channel
[Diagram: the same systems (Game Server 1, Logs, Hadoop, User Tracking, Security, Search, Social Graph, Rules/Recommendation Engine, Vertica, Ops, Data Warehouse) connected through a single central message queue instead of point-to-point links.]
Topic -
A string identifying a message stream.
Partition -
A logical division of a topic within a broker, to which the logs generated by the producer are written.
e.g.: topic kafkaTopic → partitions kafkaTopic1, kafkaTopic2
Replica Set -
The replicas of a partition within the broker cluster, with a leader (writes) and followers (reads), balanced using hash-key modulo (see the producer sketch below).
Kafka Jargon
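To make the jargon concrete, below is a minimal producer sketch against the Kafka 0.8-era Scala/Java producer API. The broker list, topic name, and key are placeholder assumptions, and property names can differ between Kafka versions; the default partitioner hashes the message key modulo the number of partitions, which is the "hash-key modulo" balancing mentioned above.

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class KafkaTopicExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholder broker list; point this at the real broker cluster.
            props.put("metadata.broker.list", "broker1:9092,broker2:9092");
            props.put("serializer.class", "kafka.serializer.StringEncoder");

            Producer<String, String> producer =
                    new Producer<String, String>(new ProducerConfig(props));

            // Messages with the same key land in the same partition of the topic
            // (default partitioner: hash(key) % numPartitions).
            producer.send(new KeyedMessage<String, String>(
                    "kafkaTopic", "user-42", "login event payload"));

            producer.close();
        }
    }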
Log - Message Queue
[diagram slides]
AWS m1.large instance
Dual-core Intel Xeon 64-bit @ 2.27 GHz
7.5 GB RAM
2 x 420 GB hard disk
Hardware Spec of Broker Cluster
Performance Results

Producer threads | Transactions | Processing time (s) | Throughput (transactions/sec)
1                | 488,396      | 64.129              | 7,616
2                | 1,195,748    | 110.868             | 10,785
4                | 1,874,713    | 140.375             | 13,355
10               | 47,269,410   | 338.094             | 13,981
17               | 7,987,317    | 568.028             | 14,061
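The throughput column is simply transactions divided by processing time. As a hedged illustration of how such a multi-threaded producer benchmark could be driven (the thread count, message count, topic, and broker list here are assumptions, not the harness actually used for these numbers):

    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class ProducerBenchmark {
        public static void main(String[] args) throws InterruptedException {
            final int threads = 4;                // producer threads (first column)
            final int messagesPerThread = 500000; // assumed workload size

            Properties props = new Properties();
            props.put("metadata.broker.list", "broker1:9092"); // placeholder
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            final ProducerConfig config = new ProducerConfig(props);

            ExecutorService pool = Executors.newFixedThreadPool(threads);
            long start = System.currentTimeMillis();

            for (int t = 0; t < threads; t++) {
                pool.submit(new Runnable() {
                    public void run() {
                        Producer<String, String> producer =
                                new Producer<String, String>(config);
                        for (int i = 0; i < messagesPerThread; i++) {
                            producer.send(new KeyedMessage<String, String>(
                                    "benchTopic", "payload-" + i));
                        }
                        producer.close();
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);

            double seconds = (System.currentTimeMillis() - start) / 1000.0;
            long total = (long) threads * messagesPerThread;
            System.out.printf("%d msgs in %.3f s -> %.0f msg/s%n",
                    total, seconds, total / seconds);
        }
    }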
Latency vs Durability

Ack status   | Time to publish (ms) | Tradeoff
No ack       | 0.7                  | Greater data loss
Wait for ack | 1.5                  | Less data loss
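A minimal sketch of the two ack modes above, using the 0.8-era producer configuration (the broker address is a placeholder, and property names may vary across Kafka versions): request.required.acks=0 does not wait for any broker acknowledgement, while request.required.acks=1 with a synchronous producer waits for the partition leader to acknowledge each write.

    import java.util.Properties;

    import kafka.producer.ProducerConfig;

    public class AckModes {
        // "No ack": fire-and-forget, lowest publish latency, greater data loss on failure.
        static ProducerConfig noAckConfig() {
            Properties props = new Properties();
            props.put("metadata.broker.list", "broker1:9092"); // placeholder
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("request.required.acks", "0");
            return new ProducerConfig(props);
        }

        // "Wait for ack": the leader acknowledges each write, higher latency, less data loss.
        static ProducerConfig waitForAckConfig() {
            Properties props = new Properties();
            props.put("metadata.broker.list", "broker1:9092"); // placeholder
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("request.required.acks", "1");
            props.put("producer.type", "sync"); // synchronous send
            return new ProducerConfig(props);
        }
    }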
1. Create a Kafka-Replica Sink
2. Feed data to Copier via Kafka-Sink
3. Benchmark Kafka-Copier-Vertica Pipeline
4. Improve/Refactor for Performance
Integration Plan
C - Consistency (producer sync mode)
A - Availability (replication)
P - Partition tolerance (cluster on the same network, so no network delay)
Strongly consistent and highly available
CAP Theorem
Conventional quorum replication:
2f+1 replicas tolerate f failures (e.g. ZooKeeper)
Kafka replication:
f+1 replicas tolerate f failures
Failure Recovery
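For example (my arithmetic, not from the slides): to tolerate f = 2 simultaneous failures, quorum replication needs 2f+1 = 5 replicas, while Kafka's ISR-based replication needs only f+1 = 3.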
Leader:
➢ Message is propagated to the followers
➢ Commit offset is checkpointed to disk
Follower Failure and Recovery:
➢ Kicked out of the ISR
➢ After restart, truncates its log to the last commit
➢ Catches up with the leader → rejoins the ISR
Error Handling: Follower Failure
➢ Embedded controller (via ZK) detects leader failure
➢ A new leader is elected from the ISR
➢ Committed messages are not lost
Error Handling: Leader Failure
- Horizontally scalable:
add partitions to the broker cluster as higher throughput is needed (~10 MB/s per server)
- Load balancing for producers and consumers
Scalability
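As a rough sizing check based on the numbers in this deck: at ~10 MB/s per server, reaching the ~25 MB/s end-to-end throughput quoted later needs ceil(25 / 10) = 3 brokers, which matches the 3-node cluster on the operation-costs slide.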
• Number of messages the consumer lags behind the producer
• Max lag in messages between the follower and leader replicas
• Unclean leader election rate
• Whether the controller is active on the broker
Monitoring via JMX
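A minimal sketch for pulling these metrics over JMX, assuming the broker JVM was started with remote JMX enabled on port 9999 (the host, port, and the decision to simply list every Kafka MBean are assumptions; exact MBean names differ between Kafka versions):

    import java.util.Set;

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class KafkaJmxDump {
        public static void main(String[] args) throws Exception {
            // Assumes the broker JVM exposes JMX on this host/port (placeholder).
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://broker1:9999/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();

            // List all Kafka MBeans (lag, unclean leader elections, controller state, ...).
            Set<ObjectName> names = mbsc.queryNames(new ObjectName("kafka*:*"), null);
            for (ObjectName name : names) {
                System.out.println(name);
            }
            connector.close();
        }
    }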
3-node cluster: ~25 MB/s, < 20 ms latency (end-to-end),
with a replication factor of 3 and 6 consumer groups
Monthly operations:
$500 + $250 (3-node ZooKeeper cluster) + 1 man-month
Operation Costs
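In other words, roughly $500 + $250 = $750 per month of infrastructure plus about one man-month of operator time, i.e. the "< $1000/month + 1 man-month" figure in the summary.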
- No callback in the producer's send()
- Balancing is not fully automatic until a script is manually run
  (balancing layer of topics via ZK)
Nothing is perfect
• Effectively unlimited horizontal scaling at ~10 MB/s per server with < 10 ms latency
• Costs < $1000/month + 1 man-month
• No impact on the current pipeline
• 0.8 final release due in days; to be used in production
Summary
