Online Media Data Stream Processing with Kafka
This talk was held at the third meeting of the Swiss Big Data User Group on September 17 at ETH Zürich.

Published in: Technology, Business

Transcript

  • 1. CC 2.0 by William Brawley | http://flic.kr/p/7PdUP3
  • 2. Overview (18 September 2012)
      • What is Streaming Data?
      • Why Kafka?
      • Kafka Architecture
      • Use Case: Prospective Search
  • 3. About Sentric
      • Spin-off of MeMo News AG, the leading provider for Social Media Monitoring & Analytics in Switzerland
      • Big Data expert, focused on Hadoop, HBase and Solr
      • Objective: Transforming data into insights
  • 4. CC 2.0 by audreyjm529 | http://flic.kr/p/mNMtL
  • 5. What is Streaming Data? Data Streams
      • Website Activity Data
          • User activity
          • Server activity
      • Social Media Data
      • News Data
      • …
      • How to Analyze in Real-Time?
  • 6. What is Streaming Data? Offline vs. Online [Figure: time axis t with a "now" marker; Offline (Hadoop/MR) and Online (Kafka)]
  • 7. CC 2.0 by Tom Hilton | http://flic.kr/p/54KSXy  
  • 8. Why Kafka? Streaming Systems
      • Message Queues (RabbitMQ, ActiveMQ)
          • do not scale / have no persistence
      • Flume / Scribe
          • Log-Aggregation only, high throughput and scalable, push model
          • Focus on offline consumption
      • Kafka
          • High throughput and scalable, pull model
          • Different consumption profiles
  • 9. Why Kafka? Consumer Performance [Figure] Source: http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
  • 10. CC 2.0 by Presidente | http://flic.kr/p/2ptSZ  
  • 11. Kafka Architecture: Key Concepts
      • Messaging System
      • Publish-Subscribe
      • Persistent
      • High-Throughput
  • 12. Kafka Architecture: Messaging [Diagram: Producers push messages to the Broker; Consumers pull messages from the Broker; ZooKeeper]
  • 13. Kafka Architecture: Publish-Subscribe [Diagram: Topics (logs, …, page-views); messages (Msg) read by several Consumers]
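
The push/pull and publish-subscribe ideas on slides 12 and 13 can be illustrated with a small in-memory model. This is only a sketch, not Kafka's client API: the `Broker` and `Consumer` classes, their method names, and the sample messages are hypothetical, and positions here are simple list indices rather than byte offsets.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory broker (hypothetical): one append-only list per topic."""

    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, message):
        # Producers push: the broker only appends to the topic.
        self.topics[topic].append(message)

    def pull(self, topic, position, max_messages=10):
        # Consumers pull: they name the position they want to read from.
        batch = self.topics[topic][position:position + max_messages]
        return batch, position + len(batch)


class Consumer:
    """Each consumer remembers its own read position for one topic."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.position = 0

    def poll(self, max_messages=10):
        batch, self.position = self.broker.pull(
            self.topic, self.position, max_messages
        )
        return batch


broker = Broker()
broker.publish("logs", "GET /index.html 200")
broker.publish("page-views", "user=1 page=/home")
broker.publish("page-views", "user=2 page=/news")

fast = Consumer(broker, "page-views")
slow = Consumer(broker, "page-views")
fast_batch = fast.poll()                # reads everything available
slow_batch = slow.poll(max_messages=1)  # reads at its own pace
```

Because every consumer pulls from its own position, a slow subscriber does not hold back a fast one, and both see every message published to the topic.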
  • 14. Kafka Architecture: Persistent
      • Persists messages to disk
          • Topic is base abstraction
          • Binary write-ahead log
          • No message ID
          • Message offset ID (byte position)
      • Messages retained for a specific time
          • Default is 7 days
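
Slide 14 says a message is identified by its byte position in the log rather than by a separate ID, and that messages are dropped after a retention period. A minimal sketch of that idea follows. The `ByteLog` class and the 4-byte length prefix are assumptions made for illustration and are not Kafka's on-disk format; expiry here works per message, which is a simplification.

```python
import struct


class ByteLog:
    """Toy append-only log (hypothetical); a message's ID is its byte offset."""

    HEADER = struct.Struct(">I")  # assumed framing: 4-byte big-endian length

    def __init__(self, retention_seconds=7 * 24 * 3600):  # 7 days, as on the slide
        self.retention_seconds = retention_seconds
        self.base_offset = 0   # logical offset of the first retained byte
        self.data = bytearray()
        self.index = []        # (offset, append_time) per retained message

    def append(self, payload, now):
        offset = self.base_offset + len(self.data)
        self.data += self.HEADER.pack(len(payload)) + payload
        self.index.append((offset, now))
        return offset

    def read(self, offset):
        """Return (payload, offset_of_next_message)."""
        pos = offset - self.base_offset
        if pos < 0:
            raise KeyError("offset has expired")
        (length,) = self.HEADER.unpack_from(self.data, pos)
        start = pos + self.HEADER.size
        return bytes(self.data[start:start + length]), offset + self.HEADER.size + length

    def expire(self, now):
        """Drop expired messages; assumes append times never decrease."""
        kept = [(off, ts) for off, ts in self.index
                if now - ts < self.retention_seconds]
        new_base = kept[0][0] if kept else self.base_offset + len(self.data)
        del self.data[: new_base - self.base_offset]
        self.index = kept
        self.base_offset = new_base


log = ByteLog(retention_seconds=10)
first = log.append(b"a", now=0)      # offset 0
second = log.append(b"bcd", now=8)   # offset 5 = 4-byte header + 1-byte payload
payload, next_offset = log.read(first)
log.expire(now=12)                   # the first message is now older than 10 s
```

The offsets of surviving messages do not change when older data is removed, so a consumer's stored position stays valid as long as it has not fallen behind the retention window.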
  • 15. Kafka Architecture: High-Throughput
      • API Simplicity
          • Append message
          • Fetch message from given byte position
      • Batching
      • Stateless Broker
      • O(1) disk access (no seeks)
      • Use of operating system features
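
The two-call API on slide 15 (append, fetch from a byte position), combined with batching and a stateless broker, might look roughly like the file-backed sketch below. The function names `append_batch` and `fetch`, the newline framing, and the file name are assumptions for illustration only; this is not Kafka's wire or storage format.

```python
import os
import tempfile


def append_batch(path, messages):
    """Append a whole batch with one sequential write; return its start offset."""
    start = os.path.getsize(path) if os.path.exists(path) else 0
    blob = b"".join(m + b"\n" for m in messages)  # assumed newline framing
    with open(path, "ab") as f:
        f.write(blob)
    return start


def fetch(path, offset, max_bytes=1024):
    """Read complete messages starting at a byte offset.

    The broker side keeps no per-consumer state: the caller supplies the
    offset and stores the returned next offset itself.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read(max_bytes)
    end = chunk.rfind(b"\n") + 1  # ignore a trailing partial message
    messages = chunk[:end].split(b"\n")[:-1] if end else []
    return messages, offset + end


log_path = os.path.join(tempfile.mkdtemp(), "page-views.log")
first_start = append_batch(log_path, [b"a", b"bb"])
second_start = append_batch(log_path, [b"ccc"])

# A consumer with a small fetch size only gets the messages that fit.
small_batch, position = fetch(log_path, 0, max_bytes=4)
rest_batch, position = fetch(log_path, position)
```

Keeping the position on the consumer side is what lets the broker stay stateless: re-reading old data is just another fetch with a smaller offset.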
  • 16. CC 2.0 by nolifebeforecoffee | http://flic.kr/p/c1UTf
  • 17. Prospective Search: Solution Architecture [Diagram: n News Agents, Kafka, REST, RT Alerts, Web-UI, HBase, MySQL, Solr. Icons by http://dryicons.com]
  • 18. Prospective Search: Prospective Search with Kafka [Diagram: Kafka, Pull (Batch), Consumer, Processing, Prospective Search, RT Alerts. Icons by http://dryicons.com]
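
Prospective search reverses ordinary search: queries are stored ahead of time and each incoming document is matched against them, which is how slide 18 gets from pulled batches to real-time alerts. The slides do not show the matching logic, so the sketch below is an assumption: a query is a set of terms that must all occur in the document, and the class name, function name, query IDs, and sample documents are hypothetical.

```python
class ProspectiveSearch:
    """Toy matcher (hypothetical): stored queries are sets of required terms."""

    def __init__(self):
        self.queries = {}

    def register(self, query_id, terms):
        self.queries[query_id] = {t.lower() for t in terms}

    def match(self, document):
        # A query matches when all of its terms occur in the document.
        words = set(document.lower().split())
        return sorted(qid for qid, terms in self.queries.items() if terms <= words)


def process_batch(index, batch):
    """Turn one pulled batch of documents into (query_id, document) alerts."""
    alerts = []
    for document in batch:
        for query_id in index.match(document):
            alerts.append((query_id, document))
    return alerts


index = ProspectiveSearch()
index.register("kafka-news", ["kafka"])
index.register("zurich-data", ["zurich", "data"])

# Stand-in for one batch pulled by a consumer.
batch = [
    "Kafka talk announced",
    "Big data meetup in Zurich",
    "Weather update",
]
alerts = process_batch(index, batch)
```

A production setup would likely use a real search engine for the matching step; the architecture on slide 17 lists Solr, but how it is used there is not stated in the transcript.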
  • 19. Resources to get started
      • http://incubator.apache.org/kafka/
      • http://sites.computer.org/debull/A12june/A12JUN-CD.pdf
  • 20. Thank you! Questions?
      • Christian Gügi, christian.guegi@sentric.ch
      • Swiss Big Data User Group
