Experience with Kafka & Storm

Experience with Kafka & Storm by Otto Mok

Published in: Technology, Design

Transcript of "Experience with Kafka & Storm"

  1. Target and Connect Intelligently
     Experience with Kafka & Storm
     Otto Mok, Solution Architect, AcuityAds
     April 30, 2014 – Toronto Hadoop User Group
  2. Agenda
     • Background
       – What does AcuityAds do?
     • Use case
       – What are we trying to do?
     • High-level System Architecture
       – How does the data flow?
     • Kafka & Storm
       – What did we do wrong?
  3. Background
     Source: https://www.google.ca/search?q=banner+ads&tbm=isch&tbo=u
  4. Background
     • Digital Advertising
       – Website banner, pre-roll video, free mobile app
     • Buy ad impressions in ‘real time’
       – Response within 50ms for auction
     • Find best match between people and ads
       – Show ads that you care about
     • Use machine learning algo to ‘learn’
       – Data, data, data
  5. Use case
     • 10+ billion daily impressions
     • 30,000+ new sites daily
     • How many daily impressions by site?
     • How are the impressions distributed?
       – Country, Province, Gender, Age Range, etc.
  6. High-level System Architecture
     • 10+ billion daily bid requests
     • Make up to 4 billion daily bids
     • Serve millions of daily impressions
     • 10+ TB of messages daily
     • 300k+ messages / second
     Diagram components: Bidder, Adserver, Kafka, HBase/Hadoop, Storm
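
To put the daily totals on this slide next to the per-second figure, here is a minimal back-of-the-envelope sketch. It assumes decimal units (1 TB = 10^12 bytes) and treats the "10+" figures as exactly 10, so the results are lower bounds on the averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(daily_total):
    """Average per-second rate implied by a daily total."""
    return daily_total / SECONDS_PER_DAY


bid_requests_per_sec = per_second(10e9)      # 10 billion bid requests per day
megabytes_per_sec = per_second(10e12) / 1e6  # 10 TB per day, decimal units

print(f"{bid_requests_per_sec:,.0f} bid requests/s on average")
print(f"{megabytes_per_sec:.1f} MB/s on average")
```

The quoted 300k+ messages per second is higher than the bid-request average computed here. That would be consistent with peak traffic or with other message types sharing the pipeline; the slide does not say which.
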
  7. Kafka
     Source: http://kafka.apache.org/documentation.html
  8. Kafka - Spec
     • Kafka v0.8.0
     • Servers – 10 x 2U (10 x 3TB) JBOD
     • Total storage – 300 TB
     • Replication – 3x
     • Unique data – 100 TB
     • Capacity – a few days
     • Producer acknowledgment – never waits
     • Topic – BIDREQUEST
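
The storage figures on this slide follow from simple arithmetic. A minimal sketch, assuming the 10+ TB of daily messages from the architecture slide is the ingest rate and ignoring filesystem and index overhead (the function name and parameters are illustrative, not from the talk):

```python
def kafka_capacity(servers, disks_per_server, disk_tb, replication, daily_ingest_tb):
    """Return (raw TB, unique TB, days of retention) for a JBOD Kafka cluster."""
    raw_tb = servers * disks_per_server * disk_tb
    unique_tb = raw_tb / replication          # each message is stored `replication` times
    retention_days = unique_tb / daily_ingest_tb
    return raw_tb, unique_tb, retention_days


raw, unique, days = kafka_capacity(
    servers=10, disks_per_server=10, disk_tb=3, replication=3, daily_ingest_tb=10
)
print(f"raw={raw} TB, unique={unique:.0f} TB, retention at most {days:.0f} days")
```

Because the daily volume is "10+" TB, the computed retention is an upper bound, which fits the slide's "a few days".
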
  9. Kafka - Monitoring
     • Nagios
       – Ping, CPU, memory, network I/O, disk space
     • Producer-Consumer group message counting
       – Hourly consumption rate check

     Topic       Consumer Group ID        Producer Count  Consumer Count  Error            Ratio
     BIDREQUEST  InventoryTopology        122,450,812     122,444,294     None             1.00
     BIDREQUEST  SearchTargetingTopology  122,450,812     107,755,295     Ratio below 98%  0.88
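
The hourly check in the table can be sketched as a ratio of consumed to produced messages compared against a threshold. This is an illustration only: the 98% threshold and the counts come from the slide, while the function name and the way the counts are collected are assumptions.

```python
def check_consumption(producer_count, consumer_count, threshold=0.98):
    """Compare hourly consumer and producer message counts for one consumer group."""
    ratio = consumer_count / producer_count if producer_count else 1.0
    error = None if ratio >= threshold else f"Ratio below {threshold:.0%}"
    return round(ratio, 2), error


rows = [
    ("BIDREQUEST", "InventoryTopology", 122_450_812, 122_444_294),
    ("BIDREQUEST", "SearchTargetingTopology", 122_450_812, 107_755_295),
]
for topic, group, produced, consumed in rows:
    ratio, error = check_consumption(produced, consumed)
    print(topic, group, f"{ratio:.2f}", error or "None")
```

A consumer group that falls below the threshold is lagging behind the producers for that hour, which is the signal the table's Error column reports.
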
  10. Kafka - Monitoring
      • Kafka Web Console
        – Partition offset for each consumer group
  11. Kafka - Issues
      • Issue 1 - Partitions
        – 10 partitions
        – Each partition > 1 TB a day
        – 100 TB / 1 TB – no problem!
      • Each partition is stored in a directory
        – /disk05/kafka-logs/BIDREQUEST-09
        – /disk09/kafka-logs/BIDREQUEST-03
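
One reading of this slide, which the slide itself does not spell out, is that cluster-wide capacity is misleading when each partition replica lives in a single directory on a single 3 TB disk. The sketch below works through that reading, assuming at most one replica directory per disk and exactly 1 TB per partition per day (the slide says "more than"):

```python
def partition_layout(partitions, replication, total_disks, disk_tb, partition_tb_per_day):
    """Estimate how many disks hold data and how fast one of them fills up."""
    replica_dirs = partitions * replication
    disks_in_use = min(replica_dirs, total_disks)     # assumes one replica directory per disk
    days_until_full = disk_tb / partition_tb_per_day  # a replica cannot span disks
    usable_tb = disks_in_use * disk_tb
    return disks_in_use, days_until_full, usable_tb


disks_in_use, days_until_full, usable_tb = partition_layout(
    partitions=10, replication=3, total_disks=100, disk_tb=3, partition_tb_per_day=1
)
print(f"{disks_in_use} of 100 disks used, each full in {days_until_full:.0f} days, "
      f"{usable_tb} TB of 300 TB reachable")
```

Under these assumptions only 30 of the 100 disks ever receive data, and each fills in about three days. More partitions would spread the load over more disks.
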
  12. Kafka - Issues
      • Issue 2 – Unbalanced partition distribution
        – Some servers running out of space
        – Some servers are not “leader” for any partition
      • A network glitch causes a server to drop out of the cluster; it is no longer leader after rejoining
      • auto.leader.rebalance.enable=true
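
To illustrate what "not leader for any partition" looks like, the sketch below counts leaders per broker and lists partitions whose leader is not the preferred replica (in Kafka, the first broker in a partition's replica list). The assignment data is hypothetical; in practice it would come from the cluster's topic metadata.

```python
from collections import Counter


def leader_report(assignments, brokers):
    """assignments maps partition name -> {"leader": broker_id, "replicas": [broker_ids]}."""
    leaders = Counter(a["leader"] for a in assignments.values())
    idle_brokers = sorted(b for b in brokers if leaders[b] == 0)
    not_preferred = sorted(
        p for p, a in assignments.items() if a["leader"] != a["replicas"][0]
    )
    return idle_brokers, not_preferred


# Hypothetical state after broker 2 dropped out and rejoined the cluster.
assignments = {
    "BIDREQUEST-0": {"leader": 1, "replicas": [1, 2, 3]},
    "BIDREQUEST-1": {"leader": 3, "replicas": [2, 3, 1]},
    "BIDREQUEST-2": {"leader": 3, "replicas": [3, 1, 2]},
}
idle_brokers, not_preferred = leader_report(assignments, brokers=[1, 2, 3])
print("brokers leading nothing:", idle_brokers)
print("partitions off their preferred leader:", not_preferred)
```

Setting auto.leader.rebalance.enable=true asks the cluster to move leadership back to the preferred replicas automatically, which is what removes entries from the second list.
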
  13. Lots of data – now what?
      Source: http://bookriotcom.c.presscdn.com/wp-content/uploads/2013/03/server-farm-shot.jpg
  14. Use case - again
      • 10+ billion daily impressions
      • 30,000+ new sites daily
      • How many daily impressions by site?
      • How are the impressions distributed?
        – Country, Province, Gender, Age Range, etc.
  15. Storm
      Source: http://storm.incubator.apache.org/documentation/Tutorial.html
  16. Storm - Spec
      • Storm v0.8.2
      • Servers – 13 x Dual Quad Core Xeon 36G RAM
      • 4 worker slots per server
      • Total logical CPUs – 208
      • Total memory – 468 G
      • Total slots – 52 worker slots (JVMs)
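
The totals on this slide can be reproduced as follows. The 208 logical CPUs only work out if each core exposes two hardware threads, so Hyper-Threading is assumed here; the slide does not state it.

```python
SERVERS = 13
SOCKETS_PER_SERVER = 2   # "Dual"
CORES_PER_SOCKET = 4     # "Quad Core"
THREADS_PER_CORE = 2     # assumption: Hyper-Threading enabled
RAM_GB_PER_SERVER = 36
SLOTS_PER_SERVER = 4

logical_cpus = SERVERS * SOCKETS_PER_SERVER * CORES_PER_SOCKET * THREADS_PER_CORE
total_ram_gb = SERVERS * RAM_GB_PER_SERVER
worker_slots = SERVERS * SLOTS_PER_SERVER
ram_per_slot_gb = RAM_GB_PER_SERVER / SLOTS_PER_SERVER  # upper bound, before OS overhead

print(logical_cpus, total_ram_gb, worker_slots, ram_per_slot_gb)
```
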
  17. Storm - Monitor
  18. Storm - Topology
      • Spout reads each BidRequest from the Kafka topic
      • Determines whether the inventory is new or existing, and emits tuples to different “streams”
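
The routing step can be sketched in plain Python, without the Storm API. The stream names NewInventory and ExistingInventory and the fields sourceId, domainName and inventoryId come from the topology slides; the lookup table of known inventory and the function name are assumptions made for illustration.

```python
def route(bid_request, known_inventory):
    """Return (stream name, tuple values) for one bid request."""
    key = (bid_request["sourceId"], bid_request["domainName"])
    inventory_id = known_inventory.get(key)
    if inventory_id is None:
        # Unseen site: emit the identifying fields so it can be inserted.
        return "NewInventory", key
    # Known site: downstream bolts only need its id.
    return "ExistingInventory", (inventory_id,)


known_inventory = {(7, "example.com"): 1001}  # hypothetical lookup table
seen = route({"sourceId": 7, "domainName": "example.com"}, known_inventory)
unseen = route({"sourceId": 7, "domainName": "example.org"}, known_inventory)
print(seen)
print(unseen)
```
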
  19. Storm - Topology
      • InsertInventoryBolt
        – Process tuples from NewInventory stream
        – Field grouping on sourceId, domainName
        – Tick tuple every 1 second
      • UpdateInventoryBolt
        – Process tuples from ExistingInventory stream
        – Field grouping on inventoryId
        – Tick tuple every 1 second
  20. Storm - Topology
      • LogInventoryBolt
        – Process tuples from ExistingInventory stream
        – Field grouping on inventoryId
        – Tick tuple every 10 seconds
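
The two mechanisms these bolts rely on, field grouping and tick tuples, can be simulated outside Storm. The sketch below is not Storm's implementation: it uses a CRC32 hash as a stand-in for Storm's own hashing, and what the bolt does on a tick (flush buffered counts) is an assumption, since the slides only give the tick interval.

```python
import zlib
from collections import Counter


def fields_grouping(values, num_tasks):
    """Pick a task index from the grouping field values (stable hash mod task count)."""
    key = "|".join(str(v) for v in values).encode("utf-8")
    return zlib.crc32(key) % num_tasks


class LogInventorySketch:
    """Buffers per-inventory counts and flushes them when a tick tuple arrives."""

    def __init__(self):
        self.pending = Counter()
        self.flushed = []

    def execute(self, tup):
        if tup.get("tick"):
            if self.pending:
                self.flushed.append(dict(self.pending))
                self.pending.clear()
        else:
            self.pending[tup["inventoryId"]] += 1


bolt = LogInventorySketch()
stream = [{"inventoryId": "inv-42"}, {"inventoryId": "inv-7"},
          {"inventoryId": "inv-42"}, {"tick": True}]
for tup in stream:
    bolt.execute(tup)
print(bolt.flushed)
```

Because the same field values always hash to the same task, all tuples for one inventoryId reach the same bolt instance, so the per-key counts can be kept in local memory and written out in batches on each tick.
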
  21. Storm - Issues
      • Issue – Low uptime
        – 10 workers, 100 executors
        – Not processing many tuples
        – Process latency < 10ms
      • Bolts restart due to uncaught exceptions
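
The slide names the cause but not the fix. One common mitigation, sketched here in plain Python rather than the Storm API, is to catch exceptions around per-tuple processing so that one malformed tuple is logged and skipped instead of taking the bolt down. The helper names are hypothetical.

```python
import logging

log = logging.getLogger("bolt")


def safe_execute(process, tup):
    """Run per-tuple logic; log and skip the tuple on any exception."""
    try:
        return process(tup)
    except Exception:
        log.exception("Skipping tuple that could not be processed: %r", tup)
        return None


def parse_domain(tup):
    # Raises KeyError when the field is missing.
    return tup["domainName"].lower()


results = [safe_execute(parse_domain, t)
           for t in ({"domainName": "Example.com"}, {"sourceId": 7})]
print(results)
```

In a real bolt one would also decide whether to ack or fail the skipped tuple, since that determines whether it is replayed.
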
  22. Conclusion
      • Cost
        – Bleeding-edge technology → bugs
        – Support → mailing lists
        – Monitoring → roll your own
        – Operation → dedicated personnel
      • Benefit
        – Near real-time data on site impression volume & distribution by geo, demo, etc.
  23. Forward Looking
      • Kafka v0.8.1.1
        – Allows specifying broker hostname for producer & consumer
        – Change # of partitions of a topic online
      • Storm v0.9.1
        – Faster pure Java Netty transport
        – View logs from each server from Storm UI
        – Tick tuple using floating point seconds
        – Storm on Hadoop (HDP 2.1)
  24. Thank you
      Otto Mok
      otto.mok@acuityads.com
      Source: http://jamesgieordano.files.wordpress.com/2011/05/babyelephant.jpg