Lecture 17-1
Computer Science 425
Distributed Systems
CS 425 / ECE 428
Fall 2013
Hilfi Alkaff
November 5, 2013
Lecture 21
Stream Processing
© 2013, H. Alkaff, I. Gupta
Lecture 17-2
Motivation
• Many important applications must process large streams of live data and provide results in near-real-time
– Social network trends
– Website statistics
– Intrusion detection systems
– etc.
• Require large clusters to handle workloads
• Require latencies of a few seconds
Lecture 17-3
Traditional Solutions (Map-Reduce)
• No partial answers
– Have to wait for the entire batch to finish
• Hard to configure when we want to add more nodes
[Diagram: Mapper 1 and Mapper 2 feeding Reducer 1 and Reducer 2, with a New Mapper being added]
Lecture 17-4
The Need for a New Framework
• Scalable
• Second-scale latencies
• Easy to program
• “Just” works
– Adding/removing nodes should be transparent
Lecture 17-5
Enter Storm
• Open-sourced at http://storm-project.net/
• Most watched JVM project
• APIs supported in multiple languages
– Python, Ruby, Fancy
• Used by over 30 companies, including
– Twitter: for personalization, search
– Flipboard: for generating custom feeds
Lecture 17-6
Storm’s Terminology
• Tuples
• Streams
• Spouts
• Bolts
• Topologies
Lecture 17-7
Tuples
• Ordered list of elements
– E.g. [“Illinois”, “somebody@illinois.edu”]
Lecture 17-8
Streams
Unbounded sequence of tuples
[Figure: a stream drawn as a sequence of tuples]
Lecture 17-9
Spouts
Source of streams (e.g., Web logs)
[Figure: a spout emitting streams of tuples]
Lecture 17-10
Bolts
Process tuples and create new streams
[Figure: a bolt consuming input streams of tuples and emitting a new stream]
Lecture 17-11
Bolts (Continued)
• Operations that can be performed (see the bolt sketch below)
– Filter
– Join
– Apply/transform functions
– Access DBs
– etc.
• Can specify the amount of parallelism for each bolt
– How do we decide which of a bolt’s tasks a given subset of the stream should be routed to?
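As a concrete illustration of the filter/transform operations above, here is a minimal bolt sketch, assuming the pre-Apache backtype.storm Java API of that era; the class name, tuple fields, and the filtering rule are made up for illustration:

import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

// Hypothetical bolt that keeps only tuples whose "email" field is an
// @illinois.edu address and passes those downstream unchanged.
public class IllinoisFilterBolt extends BaseBasicBolt {
  @Override
  public void execute(Tuple input, BasicOutputCollector collector) {
    String email = input.getStringByField("email");
    if (email != null && email.endsWith("@illinois.edu")) {
      collector.emit(new Values(input.getStringByField("name"), email));
    }
    // Tuples that fail the filter are simply not re-emitted.
  }

  @Override
  public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declare(new Fields("name", "email"));
  }
}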
Lecture 17-12
Streams Grouping in Bolts
• Shuffle Grouping
– Tuples are randomly distributed across the bolt’s tasks
• Fields Grouping
– Lets you group a stream by a subset of its fields (Example?)
• All Grouping
– Stream is replicated across all the bolt’s tasks (Example?)
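The grouping is chosen when the topology is wired together. A fragment of the word-count wiring (sketched more fully under the Word Count Example slides below; component and class names are hypothetical):

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("sentences", new RandomSentenceSpout(), 5);

// Shuffle grouping: sentences are spread randomly across the splitter's tasks.
builder.setBolt("split", new SplitSentenceBolt(), 8)
       .shuffleGrouping("sentences");

// Fields grouping: every tuple with the same "word" value reaches the same counter
// task, so that task can keep an accurate running count for that word.
builder.setBolt("count", new WordCountBolt(), 12)
       .fieldsGrouping("split", new Fields("word"));

// All grouping: every task of this bolt sees every tuple (e.g., a control signal).
builder.setBolt("signals", new SignalBolt())
       .allGrouping("sentences");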
Lecture 17-13
Topologies
A directed graph of spouts and bolts
[Figure: a topology drawn as spouts and bolts connected by streams of tuples]
Lecture 17-14
Word Count Example (Main)
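The code on the original slide is not reproduced here; a main program would look roughly like the following sketch, modeled on the storm-starter word-count topology and the parallelism hints in the lecture notes (5 spout tasks, 8 splitter tasks). Class names are placeholders:

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Fields;
import backtype.storm.utils.Utils;

public class WordCountTopology {
  public static void main(String[] args) {
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("spout", new RandomSentenceSpout(), 5);   // 5 spout tasks
    builder.setBolt("split", new SplitSentenceBolt(), 8)       // 8 splitter tasks
           .shuffleGrouping("spout");
    builder.setBolt("count", new WordCountBolt(), 12)
           .fieldsGrouping("split", new Fields("word"));

    Config conf = new Config();
    conf.setDebug(true);

    // Run inside the JVM for testing; a real deployment would submit to Nimbus instead.
    LocalCluster cluster = new LocalCluster();
    cluster.submitTopology("word-count", conf, builder.createTopology());

    Utils.sleep(10000);      // let the topology run for a while
    cluster.shutdown();
  }
}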
Lecture 17-15
Word Count Example (Spout)
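A corresponding spout sketch (again hedged: modeled on storm-starter's random-sentence spout rather than the exact code on the slide):

import java.util.Map;
import java.util.Random;

import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
import backtype.storm.utils.Utils;

public class RandomSentenceSpout extends BaseRichSpout {
  private SpoutOutputCollector collector;
  private Random random;
  private static final String[] SENTENCES = {
      "the cow jumped over the moon",
      "four score and seven years ago"
  };

  @Override
  public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
    this.collector = collector;
    this.random = new Random();
  }

  @Override
  public void nextTuple() {
    Utils.sleep(100);  // throttle emission a little
    String sentence = SENTENCES[random.nextInt(SENTENCES.length)];
    collector.emit(new Values(sentence));  // one-field tuple: the sentence
  }

  @Override
  public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declare(new Fields("sentence"));
  }
}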
Lecture 17-16
Word Count Example (Bolt)
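And the two bolts, sketched along the same lines: the split bolt feeds the count bolt, which relies on fields grouping by "word" to keep correct counts.

import java.util.HashMap;
import java.util.Map;

import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

// Splits each sentence tuple into one tuple per word.
public class SplitSentenceBolt extends BaseBasicBolt {
  @Override
  public void execute(Tuple input, BasicOutputCollector collector) {
    for (String word : input.getStringByField("sentence").split(" ")) {
      collector.emit(new Values(word));
    }
  }

  @Override
  public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declare(new Fields("word"));
  }
}

// Keeps a running count per word; correct only because fieldsGrouping("word")
// sends every occurrence of a given word to the same task.
class WordCountBolt extends BaseBasicBolt {
  private final Map<String, Integer> counts = new HashMap<String, Integer>();

  @Override
  public void execute(Tuple input, BasicOutputCollector collector) {
    String word = input.getStringByField("word");
    Integer count = counts.get(word);
    count = (count == null) ? 1 : count + 1;
    counts.put(word, count);
    collector.emit(new Values(word, count));
  }

  @Override
  public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declare(new Fields("word", "count"));
  }
}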
Lecture 17-17
Word Count (Running)
Lecture 17-18
Storm Cluster
• Master node
– Runs a daemon called Nimbus
– Responsible for distributing code around the cluster, assigning tasks to machines, and monitoring for failures
• Worker node
– Runs a daemon called Supervisor
– Listens for work assigned to its machine
• Zookeeper
– Coordinates communication between Nimbus and the Supervisors
– All state of the Supervisors and Nimbus is kept here
Lecture 17-19
Storm Cluster
[Diagram: Nimbus connected to a three-node Zookeeper ensemble, which coordinates four Supervisor nodes]
Lecture 17-20
Guaranteeing Message Processing
• When the spout produces a sentence tuple, the following tuples are created downstream:
– A tuple for each word in the sentence
– A tuple for the updated count of each word
• A tuple is considered failed when its tree of messages fails to be fully processed within a specified timeout
Lecture 17-21
Anchoring
• When a word tuple is anchored, the spout tuple at the root of the tree will be replayed later if the word tuple fails to be processed downstream
– You might not necessarily want this feature
• An output tuple can be anchored to more than one input tuple
– Failure of one tuple then causes multiple tuples to be replayed
Lecture 17-22
Miscellaneous for Storm’s Reliability API
• Emit(tuple, output)
– The output tuple is anchored if you specify the input tuple as the first argument to emit
• Ack(tuple)
– Acknowledge that you have finished processing a tuple
• Fail(tuple)
– Immediately fail the spout tuple at the root of the tuple tree, e.g., if there is an exception from the database
• Must remember to ack/fail each tuple (see the sketch below)
– Each tuple consumes memory; failing to do so results in memory leaks
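A minimal sketch of how these calls are used from a bolt, assuming the pre-Apache backtype.storm Java API; the class name and tuple fields are hypothetical:

import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

// Splitter bolt that manages reliability explicitly instead of relying on BaseBasicBolt.
public class ReliableSplitSentenceBolt extends BaseRichBolt {
  private OutputCollector collector;

  @Override
  public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
    this.collector = collector;
  }

  @Override
  public void execute(Tuple input) {
    try {
      for (String word : input.getStringByField("sentence").split(" ")) {
        // Passing `input` as the first argument anchors each word tuple to it, so a
        // downstream failure of any word causes the spout tuple to be replayed.
        collector.emit(input, new Values(word));
      }
      collector.ack(input);   // done: lets Storm free the tuple's tracking state
    } catch (Exception e) {
      collector.fail(input);  // fail fast so the spout replays the tuple immediately
    }
  }

  @Override
  public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declare(new Fields("word"));
  }
}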
Lecture 17-23
What’s wrong with Storm?
• Mutable state can be lost due to failure!
– May update mutable state twice!
– Processes each record at least once
• Slow nodes?
– No notion of time slices
Lecture 17-24
Enter Spark Streaming
• Research project from the AMPLab group at UC Berkeley
• A follow-up to the Spark research project
– Idea: cache datasets in memory, dubbed Resilient Distributed Datasets (RDDs), for repeated querying (see the sketch below)
– Can run up to 100 times faster than Hadoop MapReduce for iterative machine learning applications
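The caching idea in one small example, using the Java Spark API as a hedged sketch; the application name and HDFS path are placeholders:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class CachedLogQueries {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("CachedLogQueries").setMaster("local[*]");
    JavaSparkContext sc = new JavaSparkContext(conf);

    // Load the dataset once and cache it in cluster memory; the second query below
    // then reads from memory instead of re-scanning HDFS.
    JavaRDD<String> logs = sc.textFile("hdfs:///data/app-logs").cache();

    long errors = logs.filter(line -> line.contains("ERROR")).count();
    long warnings = logs.filter(line -> line.contains("WARN")).count();

    System.out.println("errors=" + errors + ", warnings=" + warnings);
    sc.stop();
  }
}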
Lecture 17-25
Discretized Stream Processing
• Run a streaming computation as a series of very small, deterministic batch jobs
• Chop up the live stream into batches of X seconds
• Spark treats each batch of data as an RDD and processes it using RDD operations
• Finally, the processed results of the RDD operations are returned in batches
Lecture 17-26
Discretized Stream Processing (Continued)
(Repeats the bullets of the previous slide.)
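To make the model concrete, here is a minimal word-count sketch against the current Spark Streaming Java API (not the 2013-era API the lecture used); the class name, socket source, and port are illustrative:

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import scala.Tuple2;

public class StreamingWordCount {
  public static void main(String[] args) throws InterruptedException {
    SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("StreamingWordCount");
    // Batch interval of 1 second: the live stream is chopped into 1-second RDDs.
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));

    JavaReceiverInputDStream<String> lines = jssc.socketTextStream("localhost", 9999);
    JavaDStream<String> words = lines.flatMap(line -> Arrays.asList(line.split(" ")).iterator());
    JavaPairDStream<String, Integer> counts =
        words.mapToPair(w -> new Tuple2<>(w, 1))
             .reduceByKey((a, b) -> a + b);

    counts.print();          // print the counts computed in each 1-second batch
    jssc.start();            // start receiving and processing
    jssc.awaitTermination(); // run until killed
  }
}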
Lecture 17-27
Effects of Discretization
• (+) Easy to maintain consistency
– Each time interval is processed independently
» Especially useful when you have stragglers (slow nodes)
» Not handled in Storm
• (+) Handles out-of-order records
– Can choose to wait for a limited “slack time”
– Incrementally correct late records at the application level
• (+) Parallel recovery
– Exposes parallelism across both partitions of an operator and time
• (-) Raises minimum latency
Lecture 17-28
Fault-tolerance in Spark Streaming
• Each RDD remembers the sequence of operations that created it from the original fault-tolerant input data (dubbed its “lineage graph”)
• Batches of input data are replicated in the memory of multiple worker nodes, and are therefore fault-tolerant
• Data lost due to a worker failure can be recomputed from the input data
Lecture 17-29
Fault-tolerant Stateful Processing
• State data is not lost even if a worker node dies
– Does not change the value of your result
• Exactly-once semantics for all transformations
– No double counting!
• Recovers from faults/stragglers within 1 sec
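Continuing the Spark Streaming word-count sketch above, per-key state across batches is typically kept with updateStateByKey. A hedged fragment (the checkpoint path is a placeholder; it additionally needs java.util.List, org.apache.spark.api.java.Optional, and org.apache.spark.api.java.function.Function2 imports):

// Stateful operations require a checkpoint directory for fault tolerance.
jssc.checkpoint("hdfs:///tmp/streaming-checkpoints");

// Merge this batch's counts for a word into the running total stored as state.
Function2<List<Integer>, Optional<Integer>, Optional<Integer>> updateFunction =
    (newValues, state) -> {
      int sum = state.isPresent() ? state.get() : 0;  // previous total, if any
      for (int v : newValues) {
        sum += v;                                     // add this batch's occurrences
      }
      return Optional.of(sum);                        // becomes the new state
    };

JavaPairDStream<String, Integer> runningCounts =
    words.mapToPair(w -> new Tuple2<>(w, 1))
         .updateStateByKey(updateFunction);

runningCounts.print();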
Lecture 17-30
Comparison with Storm
• Higher throughput than Storm
– Spark Streaming: 670k records/second/node
– Storm: 115k records/second/node
Lecture 17-31
More information
• Storm
– https://github.com/nathanmarz/storm/wiki/Tutorial
– https://github.com/nathanmarz/storm-starter
• Spark Streaming
– http://spark.incubator.apache.org/docs/latest/streaming-programming-guide.html
• Other Streaming Frameworks
– Yahoo S4
– LinkedIn Samza


Editor's Notes

  1. WordCount, probabilistic computation
  2. Construct the topology: define a spout with a parallelism of 5 tasks; split sentences into words with a parallelism of 8 tasks
  3. Each worker executes a subset of the topology; unlike MapReduce, it does not stop until you kill it
  4. Mentions the similarity to the JobTracker and TaskTracker in Hadoop
  5. Fail fast -> restart fast