Storm: Distributed and fault tolerant realtime computation

Transcript

  • 1. Storm: Distributed and fault-tolerant realtime computation Ferran Galí i Reniu @ferrangali 19/06/2014
  • 2. Ferran Galí i Reniu ● UPC - FIB ● Trovit ○ Hadoop ○ Lucene/Solr ○ Storm
  • 3. Big Data ● Too much data ○ Store ○ Compute ○ Analyse ● Distributed systems ○ Provide horizontal scalability
  • 4. Distributed Systems ● Hadoop (diagram: a File split across three HDFS nodes)
  • 5. Distributed Systems ● Hadoop (diagram: a File split across three HDFS nodes, each paired with MapReduce)
  • 6. Distributed Systems ● Hadoop ○ Huge files ○ Useful for batch ○ High latency ○ No real time
  • 7. Storm “Storm is a distributed realtime computation system. Storm provides a set of general primitives for doing realtime computation. Storm is simple, can be used with any programming language, is used by many companies, and is a lot of fun to use!” http://storm.incubator.apache.org/
  • 8. Storm ● Who’s using it?
  • 9. Storm ● Tuple ○ Ordered list of elements ○ Any type: String, Integer, Serialized Object, ...
  • 10. Storm ● Stream ○ Unbounded sequence of tuples
  • 11. Storm ● Spout ○ Source of streams ○ From data sources: Queues, API...
  • 12. Storm ● Bolt ○ Consumes streams ○ Does some processing (transform, join, ...) ○ Emits streams
  • 13. Storm ● Topology ○ Graph of spouts & bolts ○ Runs forever
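The primitives on slides 9 to 13 can be modelled without Storm itself. The following is a minimal Python sketch, not Storm's API: tuples are plain Python tuples, a spout is a generator that never ends, a bolt is a function that consumes one stream and emits another, and the topology is the chain that connects them. The names `sentence_spout`, `split_bolt` and `length_bolt`, and the sample sentences, are assumptions made for illustration.

```python
from itertools import cycle, islice

def sentence_spout():
    """Spout: an unbounded source of one-field tuples (hypothetical data)."""
    for line in cycle(["storm is fun", "storm runs forever"]):
        yield (line,)

def split_bolt(stream):
    """Bolt: consumes line tuples and emits one tuple per word."""
    for (line,) in stream:
        for word in line.split():
            yield (word,)

def length_bolt(stream):
    """Bolt: transforms each word tuple into a (word, length) tuple."""
    for (word,) in stream:
        yield (word, len(word))

# Topology: a graph (here a simple chain) of one spout and two bolts.
topology = length_bolt(split_bolt(sentence_spout()))

# The stream never ends, so only a finite prefix is taken here.
first_four = list(islice(topology, 4))
print(first_four)
```

Because the spout is unbounded, the topology would run forever if it were consumed fully; `islice` only takes a finite prefix for inspection.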
  • 14. Architecture ● Master: Nimbus ● Coordinator: Zookeeper (three nodes) ● Workers: two Supervisors, four Slots each
  • 15. Architecture ● Supervisor: four Slots ● Worker process: single JVM ● Tasks: threads
  • 16. Topology with parallelism hints 4, 1, 2, 2, 3 and 4 ● Two Supervisors, four Slots each ● Worker processes = 8
  • 17. Topology with parallelism hints 4, 1, 2, 2, 3 and 4 ● Two Supervisors ● Worker processes = 8 ● Combined parallelism = 4 + 1 + 2 + 2 + 3 + 4 = 16 ● Tasks per worker = 16 / 8 = 2
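The arithmetic on slide 17 can be checked with a few lines of Python. This is only a sketch of the calculation. The helper name `spread_evenly` is hypothetical, and the remainder handling (giving some workers one extra task when the division is not exact) is an assumption, since the slide only covers the exact case.

```python
def spread_evenly(parallelism_hints, worker_processes):
    """Return the number of tasks each worker gets under an even spread."""
    combined = sum(parallelism_hints)
    base, extra = divmod(combined, worker_processes)
    # Assumption: when the division is not exact, the first `extra`
    # workers take one additional task each.
    return [base + 1 if i < extra else base for i in range(worker_processes)]

hints = [4, 1, 2, 2, 3, 4]      # parallelism hints from the slide
per_worker = spread_evenly(hints, 8)
print(sum(hints), per_worker)   # combined parallelism and tasks per worker
```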
  • 18. Example: Word Count ● File → FileSpout (parallelism hint = 2) → line → SplitterBolt (parallelism hint = 3) → word → CounterBolt (parallelism hint = 2)
  • 19. Example: Word Count ● FileSpout → SplitterBolt → CounterBolt ● Input file: “Storm is a distributed realtime computation system. Storm provides a set of general primitives for doing realtime computation. Storm is simple, can be used with any programming language, is used by many companies, and is a lot of fun to use!”
  • 20. Example: Word Count ● FileSpout emits the line “Storm is a distributed”
  • 21. Example: Word Count ● FileSpout emits the line “realtime computation system. Storm provides a”
  • 22. Example: Word Count ● Lines travel from FileSpout to SplitterBolt with shuffle grouping
  • 23. Example: Word Count ● SplitterBolt emits the words: Storm, is, a, distributed, realtime, computation, system, Storm, provides, a
  • 24. Example: Word Count ● CounterBolt shows each of the ten words with a count of x1
  • 25. Example: Word Count ● Lines reach SplitterBolt with shuffle grouping; words reach CounterBolt with fields grouping ● Counts: Storm x2, a x2, is x1, distributed x1, realtime x1, computation x1, system x1, provides x1
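The effect of the grouping on the word count can be simulated in plain Python. This sketch is not Storm code. It assumes two counter tasks, a random choice for shuffle grouping, and a stable hash of the word modulo the task count as a stand-in for fields grouping. All function names are hypothetical.

```python
import random
import zlib
from collections import Counter

LINES = ["Storm is a distributed",
         "realtime computation system. Storm provides a"]

def split_words(line):
    """Splitter step: break a line into words, dropping punctuation."""
    return [w.strip(".,!") for w in line.split()]

def fields_task(word, num_tasks):
    """Fields grouping stand-in: the same word always maps to the same task."""
    return zlib.crc32(word.encode("utf-8")) % num_tasks

def run_word_count(lines, num_tasks, choose_task):
    """Route every word to one counter task and count it there."""
    counters = [Counter() for _ in range(num_tasks)]
    for line in lines:
        for word in split_words(line):
            counters[choose_task(word)][word] += 1
    return counters

rng = random.Random(0)
# Shuffle grouping: a word may land on any task, so its count can be split.
shuffled = run_word_count(LINES, 2, lambda w: rng.randrange(2))
# Fields grouping: each word is counted by exactly one task.
by_field = run_word_count(LINES, 2, lambda w: fields_task(w, 2))
print(shuffled)
print(by_field)
```

With fields grouping, the task that owns a word sees every occurrence of it, which is what makes the x2 counts for repeated words possible.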
  • 26. Groupings ● Shuffle grouping ● Fields grouping ● All grouping ● Global grouping ● Direct grouping ● Local or shuffle grouping
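Each grouping is a rule for choosing which tasks of the consuming bolt receive a tuple. The sketch below expresses the six rules as Python functions that return a list of target task ids. It illustrates the routing idea under simplifying assumptions (a hash modulo the task count for fields grouping, the lowest task id for global grouping) and is not Storm's implementation. All names and signatures are hypothetical.

```python
import random
import zlib

def shuffle_grouping(values, tasks, rng=random):
    """Send the tuple to one randomly chosen task."""
    return [rng.choice(tasks)]

def fields_grouping(values, tasks, field_index=0):
    """Send tuples with the same field value to the same task."""
    key = repr(values[field_index]).encode("utf-8")
    return [tasks[zlib.crc32(key) % len(tasks)]]

def all_grouping(values, tasks):
    """Replicate the tuple to every task."""
    return list(tasks)

def global_grouping(values, tasks):
    """Send the whole stream to a single task (here: the lowest id)."""
    return [min(tasks)]

def direct_grouping(values, tasks, chosen_task):
    """The producer decides which task receives the tuple."""
    if chosen_task not in tasks:
        raise ValueError("unknown task id")
    return [chosen_task]

def local_or_shuffle_grouping(values, tasks, local_tasks, rng=random):
    """Prefer tasks in the same worker process, else shuffle over all."""
    candidates = [t for t in tasks if t in local_tasks] or list(tasks)
    return [rng.choice(candidates)]

tasks = [3, 4, 5]
print(all_grouping(("signal",), tasks), global_grouping(("x",), tasks))
```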
  • 27. Fault-tolerance (diagram: Nimbus, three Zookeeper nodes, two Supervisors)
  • 28. Fault-tolerance ● Worker dies ○ Supervisor will restart it ● Worker dies too many times ○ Nimbus will reassign it to another node ● Node dies ○ Nimbus will reassign its tasks to another node ● Nimbus is not a SPOF ● Nimbus & Supervisors are fail-fast
  • 29. Guaranteeing message processing ● Through API ○ ack ○ fail ● Manual tuple replay ○ e.g. the Spout emits the message with a specific id again
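Manual replay can be sketched as a spout that remembers every message it has emitted until that message is acked. The Python class below is a hypothetical model, not the Storm API. The class name `ReplayingSpout`, the in-memory queue and the integer message ids are assumptions for illustration.

```python
from collections import deque

class ReplayingSpout:
    """Emits messages with ids and re-queues any message that fails."""

    def __init__(self, messages):
        # Each message gets a specific id so it can be replayed later.
        self.queue = deque(enumerate(messages))
        self.pending = {}  # msg_id -> payload, emitted but not yet acked

    def next_tuple(self):
        """Emit the next message, or None when nothing is queued."""
        if not self.queue:
            return None
        msg_id, payload = self.queue.popleft()
        self.pending[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        """The message was processed: forget it."""
        self.pending.pop(msg_id, None)

    def fail(self, msg_id):
        """The message failed: queue it again under the same id."""
        if msg_id in self.pending:
            self.queue.append((msg_id, self.pending.pop(msg_id)))

spout = ReplayingSpout(["first", "second"])
msg_id, _ = spout.next_tuple()
spout.fail(msg_id)           # "first" goes back to the end of the queue
print(list(spout.queue))
```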
  • 30. Guaranteeing message processing ● When is a message “fully processed”? ○ Example: the line “Storm is a distributed” is split into four word tuples; three end in Ok and one in Fail ● Solutions ○ Transactional Topologies ○ Trident framework
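One way to read the question on this slide: a spout message is fully processed only when every tuple derived from it has been acked, and a single failure fails the whole message. The Python sketch below tracks this with a set of open tuples per root message. It is an illustrative model with hypothetical names. Storm's own tracking is more compact and is not reproduced here.

```python
class TupleTreeTracker:
    """Tracks the tuples derived from each root message."""

    def __init__(self):
        self.open_tuples = {}  # root id -> ids of tuples not yet acked
        self.failed = set()    # root ids with at least one failed tuple

    def emit(self, root_id, tuple_id):
        """Register a tuple derived from the root message."""
        self.open_tuples.setdefault(root_id, set()).add(tuple_id)

    def ack(self, root_id, tuple_id):
        """Mark one derived tuple as successfully processed."""
        self.open_tuples.get(root_id, set()).discard(tuple_id)

    def fail(self, root_id, tuple_id):
        """A single failed tuple fails the whole root message."""
        self.failed.add(root_id)

    def status(self, root_id):
        if root_id in self.failed:
            return "failed"
        if self.open_tuples.get(root_id):
            return "pending"
        return "fully processed"

tracker = TupleTreeTracker()
for word in ["Storm", "is", "a", "distributed"]:
    tracker.emit("line-1", word)
tracker.ack("line-1", "Storm")
tracker.fail("line-1", "is")
tracker.ack("line-1", "a")
tracker.ack("line-1", "distributed")
print(tracker.status("line-1"))
```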
  • 31. Yet another example ● Components: TwitterSpout, SplitterBolt, CounterBolt, CommitBolt, DB ● Streams: tweet, word, signal ● Groupings: shuffle grouping, fields grouping, all grouping ● https://github.com/ferrangali/betabeers-storm
  • 32. Batch + Real time ● Lambda architecture ○ New data → Batch layer → Serving ● Batch layer ○ High latency ○ Reprocesses all data
  • 33. Batch + Real time ● Lambda architecture ○ New data → Batch layer and Speed layer → Serving ● Speed layer ○ Low latency ○ Fast & incremental algorithms ○ Eventually overridden by batch layer ● Batch layer ○ High latency ○ Reprocesses all data
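The interplay of the two layers can be sketched with a simple counting query. In this hypothetical Python model, the batch layer recomputes its view from all data, the speed layer updates its view incrementally per event, and the serving step merges both. The counting query, the function names and the in-memory lists are assumptions for illustration only.

```python
def batch_recompute(all_data):
    """Batch layer: high latency, reprocesses all data from scratch."""
    view = {}
    for key in all_data:
        view[key] = view.get(key, 0) + 1
    return view

def speed_update(speed_view, key):
    """Speed layer: low latency, incremental update for one new event."""
    speed_view[key] = speed_view.get(key, 0) + 1

def serve(key, batch_view, speed_view):
    """Serving: merge the batch view with the recent speed view."""
    return batch_view.get(key, 0) + speed_view.get(key, 0)

master_data = ["a", "b", "a"]
batch_view = batch_recompute(master_data)
speed_view = {}

# New data goes to both layers; only the speed layer reflects it at once.
master_data.append("a")
speed_update(speed_view, "a")
before = serve("a", batch_view, speed_view)

# The next batch run covers the new event, so the speed view is discarded.
batch_view = batch_recompute(master_data)
speed_view = {}
after = serve("a", batch_view, speed_view)
print(before, after)
```

The answer is the same before and after the batch run. What changes is which layer supplies the most recent event.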
  • 34. Storm ● Who’s using it?
  • 35. Trovit ● 40 countries ● 5 verticals ● Hundreds of millions of ads
  • 36. Trovit ● Batch layer: ○ MapReduce pipeline over HDFS ○ Inputs: kafka, xml ○ Stages: Filter → Enrich → Dedup → Index
  • 37. Trovit ● Speed layer ○ Storm topology ○ Inputs: kafka, xml ○ Feeds Spout and Kafka Spout emit ads ○ Processor Bolt turns each ad into a rich ad ○ Indexer Bolt: group by index, commit in batch every 5 minutes
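The last two notes on the speed layer, grouping by index and committing in batch every 5 minutes, can be sketched as a buffer that collects ads per index and flushes them once the interval has elapsed. This Python class is a hypothetical illustration, not Trovit's code. The class name, the `commit` callback and the explicit `now` argument (used instead of a real clock so the behaviour is easy to follow) are assumptions.

```python
from collections import defaultdict

class BatchingIndexer:
    """Buffers rich ads per index and commits them in batches."""

    def __init__(self, commit, interval_seconds=300):
        self.commit = commit              # callback: (index_name, ads)
        self.interval = interval_seconds  # 300 s = 5 minutes
        self.buffers = defaultdict(list)  # index name -> buffered ads
        self.last_commit = None

    def add(self, index_name, rich_ad, now):
        """Buffer one ad; flush everything if the interval has elapsed."""
        if self.last_commit is None:
            self.last_commit = now
        self.buffers[index_name].append(rich_ad)
        if now - self.last_commit >= self.interval:
            self.flush(now)

    def flush(self, now):
        """Commit every non-empty buffer as one batch per index."""
        for index_name, ads in self.buffers.items():
            if ads:
                self.commit(index_name, list(ads))
        self.buffers.clear()
        self.last_commit = now

commits = []
indexer = BatchingIndexer(lambda name, ads: commits.append((name, ads)))
indexer.add("es", "ad1", now=0)
indexer.add("uk", "ad2", now=100)
indexer.add("es", "ad3", now=300)  # 5 minutes elapsed: both buffers commit
print(commits)
```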
  • 38. Trovit ● Batch layer (HDFS: Filter → Enrich → Dedup → Index) and speed layer (ad → rich ad) shown together, fed by kafka and xml ● HBase ● Zookeeper
  • 39. Questions? Ferran Galí i Reniu @ferrangali 19/06/2014