SlideShare a Scribd company logo
1 of 19
From Gust To Tempest: Scaling Storm
P R E S E N T E D B Y B o b b y E v a n s
Hi I’m Bobby Evans
bobby@apache.org
2
 Low Latency Data Processing Architect @ Yahoo
 Apache Storm
 Apache Spark
 Apache Kafka
 Committer and PMC member for
 Apache Storm
 Apache Hadoop
 Apache Spark
 Apache TEZ
Agenda
3
 Apache Storm Architecture
 What was Done Already
 Current/Future Work
background: https://www.flickr.com/photos/gsfc/15072362777
Storm Concepts
1. Streams
 Unbounded sequence of tuples
2. Spout
 Source of Stream
 E.g. Read from Twitter streaming API
3. Bolts
 Processes input streams and produces new
streams
 E.g. Functions, Filters, Aggregation, Joins
4. Topologies
 Network of spouts and bolts
Routing of tuples
 Shuffle grouping: pick a random task
(but with load balancing)
 Fields grouping: consistent hashing on
a subset of tuple fields
 All grouping: send to all tasks
 Global grouping: pick task with lowest
id
 Shuffle or Local grouping: If there is a
local bolt (in the same worker process)
use it otherwise use shuffle
 Partial Key grouping: Fields grouping
but with 2 choices for load balancing.
Storm Architecture
Master
Node
Cluster
Coordination
Worker
processes
Worker
Nimbus
Zookeeper
Zookeeper
Zookeeper
Supervisor
Supervisor
Supervisor
Supervisor Worker
Worker
Worker
Launches
workers
Worker
Task
(Spout A-1)
Task
(Spout A-5)
Task
(Spout A-9)
Task
(Bolt B-3)
Other
Workers
Task
(Acker)
Routing
Current State
w hat w as done alr eady
background: https://www.flickr.com/photos/maf04/14392794749
Largest Topology Growth at Yahoo
9
2013 2014 2015
Executors 100 3000 4000
Workers 40 400 1500
0
500
1000
1500
2000
2500
3000
3500
4000
4500
background: https://www.flickr.com/photos/68942208@N02/16242761551
Cluster Growth at Yahoo
10
0
500
1000
1500
2000
2500
Jun-12
Aug-12
Oct-12
Dec-12
Feb-13
Apr-13
Jun-13
Aug-13
Oct-13
Dec-13
Feb-14
Apr-14
Jun-14
Aug-14
Oct-14
Dec-14
Feb-15
Apr-15
Jun-15
Jun-12 Jan-13 Jan-14 Jan-15 Jun-15
Total Nodes 40 170 600 1100 2300
Largest Cluster 20 60 120 250 300
background: http://bit.ly/1KypnCN
In the Beginning…
11
 Mid 2011:
 Storm is released as open source
 Early 2012:
 Yahoo evaluation begins
 https://github.com/yahoo/storm-perf-test
 Mid 2012:
 Purpose built clusters 10+ nodes
 Early 2013:
 60-node cluster, largest topology 40 workers, 100 executors
 ZooKeeper config -Djute.maxbuffer=4194304
 May 2013:
 Netty messaging layer
 http://yahooeng.tumblr.com/post/64758709722/making-storm-fly-with-netty
 Oct 2013:
 ZooKeeper heartbeat timeout checks
background: https://www.flickr.com/photos/gedas/3618792161
So Far…
 Late 2013:
 ZooKeeper config -Dzookeeper.forceSync=no
 Storm enters Apache Incubator
 Early 2014:
 250-node cluster, largest topology 400 workers, 3000 executors
 June 2014:
 STORM-376 – Compress ZooKeeper data
 STORM-375 – Check for changes before reading data from ZooKeeper
 Sep 2014
 Storm becomes an Apache Top Level Project
 Early 2015:
 STORM-632 Better grouping for data skew
 STORM-634 Thrift serialization for ZooKeeper data.
 300-node cluster (Tested 400 nodes, 1,200 theoretical maximum)
 Largest topology 1500 workers, 4000 executors
background: http://s0.geograph.org.uk/geophotos/02/27/03/2270317_7653a833.jpg
We still have a ways to go
13
Hadoop 5400
Storm 300
Nodes
Largest Cluster Size
We want to get to a
4000-node Storm
cluster.
Hadoop 41000
Storm 2300
Nodes
Total Nodes
background: https://www.flickr.com/photos/68397968@N07/14600216228
Future and Current Work
how w e ar e going to get to 4000
background: https://www.flickr.com/photos/12567713@N00/2859921414
Why Can’t Storm Scale?
It’s all about the data.
State Storage (ZooKeeper):
 Limited to disk write speed (80MB/sec typically)
 Scheduling
O(num_execs * resched_rate)
 Supervisor
O(num_supervisors * hb_rate)
 Topology Metrics (worst case)
O(num_execs * num_comps * num_streams * hb_rate)
On one 240-node Yahoo Storm cluster, ZK writes 16 MB/sec, about
99.2% of that is worker heartbeats
Theoretical Limit:
80 MB/sec / 16 MB/sec * 240 nodes = 1,200 nodes
background: http://cnx.org/resources/8ab472b9b2bc2e90bb15a2a7b2182ca45a883e0f/Figure_45_07_02.jpg
Pacemaker
heartbeat server
Simple Secure In-Memory Store for Worker Heartbeats.
 Removes Disk Limitation
 Writes Scale Linearly
(but nimbus still needs to read it all, ideally in 10 sec or less)
240 node cluster’s complete HB state is 48MB, Gigabit is about 125 MB/s
10 s / (48 MB / 125 MB/s) * 240 nodes = 6250 nodes
1200
6250
Theoretical Maximum Cluster Size
Zookeeper PaceMaker Gigabit
Highly-connected
topologies dominate data
volume.
10 GigE to the rescue
Why Can’t Storm Scale?
It’s all about the data.
All raw data serialized, transferred to UI, de-serialized and aggregated
per page load
Our largest topology uses about 400 MB in memory
Aggregate stats for UI/REST in Nimbus
 10+ min page load to 7 seconds
DDOS on Nimbus for jar download
Distributed Cache/Blob Store (STORM-411)
 Pluggable backend with HDFS support
background: https://www.flickr.com/photos/oregondot/15799498927
Why Can’t Storm Scale?
It’s all about the data.
Storm round-robin scheduling
 R-1/R % of traffic will be off rack where R is
the number of racks
 N-1/N % of traffic will be off node where N is
the number of nodes
 Does not know when resources are full (i.e.
network)
Resource & Network Topography Aware Scheduling
One slow node slows the entire topology.
Load Aware Routing (STORM-162)
Need lot more intelligent routing.
Questions?
https://www.flickr.com/photos/51029297@N00/5275403364
bobby@apache.org

More Related Content

What's hot

Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
Developing Frameworks for Apache Mesos
Developing Frameworks  for Apache MesosDeveloping Frameworks  for Apache Mesos
Developing Frameworks for Apache MesosJoe Stein
 
Improved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleImproved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleDataWorks Summit/Hadoop Summit
 
Real Time Analytics - Stream Processing (Colombo big data meetup 18/05/2017)
Real Time Analytics - Stream Processing (Colombo big data meetup 18/05/2017)Real Time Analytics - Stream Processing (Colombo big data meetup 18/05/2017)
Real Time Analytics - Stream Processing (Colombo big data meetup 18/05/2017)mahesh madushanka
 
Zookeeper In Action
Zookeeper In ActionZookeeper In Action
Zookeeper In Actionjuvenxu
 
Real-time Big Data Processing with Storm
Real-time Big Data Processing with StormReal-time Big Data Processing with Storm
Real-time Big Data Processing with Stormviirya
 
Get started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache MesosGet started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache MesosJoe Stein
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersSematext Group, Inc.
 
Storm Real Time Computation
Storm Real Time ComputationStorm Real Time Computation
Storm Real Time ComputationSonal Raj
 
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...Shirshanka Das
 
Solr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for YouSolr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for YouSematext Group, Inc.
 
PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.DECK36
 

What's hot (19)

Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Developing Frameworks for Apache Mesos
Developing Frameworks  for Apache MesosDeveloping Frameworks  for Apache Mesos
Developing Frameworks for Apache Mesos
 
Apache Storm Internals
Apache Storm InternalsApache Storm Internals
Apache Storm Internals
 
Storm
StormStorm
Storm
 
Tuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for LogsTuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for Logs
 
Tuning Solr for Logs
Tuning Solr for LogsTuning Solr for Logs
Tuning Solr for Logs
 
Improved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleImproved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as example
 
Real Time Analytics - Stream Processing (Colombo big data meetup 18/05/2017)
Real Time Analytics - Stream Processing (Colombo big data meetup 18/05/2017)Real Time Analytics - Stream Processing (Colombo big data meetup 18/05/2017)
Real Time Analytics - Stream Processing (Colombo big data meetup 18/05/2017)
 
Zookeeper In Action
Zookeeper In ActionZookeeper In Action
Zookeeper In Action
 
Real-time Big Data Processing with Storm
Real-time Big Data Processing with StormReal-time Big Data Processing with Storm
Real-time Big Data Processing with Storm
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
 
Get started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache MesosGet started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache Mesos
 
Storm
StormStorm
Storm
 
vBACD - Introduction to Opscode Chef - 2/29
vBACD - Introduction to Opscode Chef - 2/29vBACD - Introduction to Opscode Chef - 2/29
vBACD - Introduction to Opscode Chef - 2/29
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
 
Storm Real Time Computation
Storm Real Time ComputationStorm Real Time Computation
Storm Real Time Computation
 
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
 
Solr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for YouSolr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for You
 
PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.
 

Viewers also liked

Historical occupational classification and stratification schemes (lecture)
Historical occupational classification and stratification schemes (lecture)Historical occupational classification and stratification schemes (lecture)
Historical occupational classification and stratification schemes (lecture)Richard Zijdeman
 
Railroad Modeling at Hadoop Scale
Railroad Modeling at Hadoop ScaleRailroad Modeling at Hadoop Scale
Railroad Modeling at Hadoop ScaleDataWorks Summit
 
Revista Académica de la Universidad La Salle
Revista Académica de la Universidad La SalleRevista Académica de la Universidad La Salle
Revista Académica de la Universidad La SallePérez Esquer
 
Harry Verwayen, The More You Give The More You Get
Harry Verwayen, The More You Give The More You GetHarry Verwayen, The More You Give The More You Get
Harry Verwayen, The More You Give The More You GetEUscreen
 
An Exploratory Study on the Links between Individual Upcycling, Product Attac...
An Exploratory Study on the Links between Individual Upcycling, Product Attac...An Exploratory Study on the Links between Individual Upcycling, Product Attac...
An Exploratory Study on the Links between Individual Upcycling, Product Attac...Kyungeun Sung
 
Polka Dot Teapots
Polka Dot TeapotsPolka Dot Teapots
Polka Dot Teapotscinty45
 
Hadoop: Making it work for the Business Unit
Hadoop: Making it work for the Business UnitHadoop: Making it work for the Business Unit
Hadoop: Making it work for the Business UnitDataWorks Summit
 
Marketing de Experiencias Online
Marketing de Experiencias OnlineMarketing de Experiencias Online
Marketing de Experiencias OnlineRicardo Valdés
 
Dealing with an Upside Down Internet With High Performance Time Series Database
Dealing with an Upside Down Internet  With High Performance Time Series DatabaseDealing with an Upside Down Internet  With High Performance Time Series Database
Dealing with an Upside Down Internet With High Performance Time Series DatabaseDataWorks Summit
 
Andreas Fickers: Transmedia Storytelling and Media History
Andreas Fickers: Transmedia Storytelling and Media HistoryAndreas Fickers: Transmedia Storytelling and Media History
Andreas Fickers: Transmedia Storytelling and Media HistoryEUscreen
 
White eagle, by robert
White eagle, by robertWhite eagle, by robert
White eagle, by robertbeatusest2
 
Computación para todos (primaria) 5to grado
Computación para todos (primaria)  5to gradoComputación para todos (primaria)  5to grado
Computación para todos (primaria) 5to gradoMarcos Torres
 
Curso gestión estratégica del tiempo 2016
Curso gestión estratégica del tiempo 2016Curso gestión estratégica del tiempo 2016
Curso gestión estratégica del tiempo 2016juansalas
 
陰陽編 太極から八卦へ
陰陽編 太極から八卦へ陰陽編 太極から八卦へ
陰陽編 太極から八卦へreigan_s
 

Viewers also liked (20)

Historical occupational classification and stratification schemes (lecture)
Historical occupational classification and stratification schemes (lecture)Historical occupational classification and stratification schemes (lecture)
Historical occupational classification and stratification schemes (lecture)
 
Railroad Modeling at Hadoop Scale
Railroad Modeling at Hadoop ScaleRailroad Modeling at Hadoop Scale
Railroad Modeling at Hadoop Scale
 
Examen de-química-ii
Examen de-química-iiExamen de-química-ii
Examen de-química-ii
 
Examen quimica diciembre
Examen quimica diciembreExamen quimica diciembre
Examen quimica diciembre
 
Assignment.econ 260
Assignment.econ 260Assignment.econ 260
Assignment.econ 260
 
Revista Académica de la Universidad La Salle
Revista Académica de la Universidad La SalleRevista Académica de la Universidad La Salle
Revista Académica de la Universidad La Salle
 
Harry Verwayen, The More You Give The More You Get
Harry Verwayen, The More You Give The More You GetHarry Verwayen, The More You Give The More You Get
Harry Verwayen, The More You Give The More You Get
 
An Exploratory Study on the Links between Individual Upcycling, Product Attac...
An Exploratory Study on the Links between Individual Upcycling, Product Attac...An Exploratory Study on the Links between Individual Upcycling, Product Attac...
An Exploratory Study on the Links between Individual Upcycling, Product Attac...
 
Polka Dot Teapots
Polka Dot TeapotsPolka Dot Teapots
Polka Dot Teapots
 
Hadoop: Making it work for the Business Unit
Hadoop: Making it work for the Business UnitHadoop: Making it work for the Business Unit
Hadoop: Making it work for the Business Unit
 
Marketing de Experiencias Online
Marketing de Experiencias OnlineMarketing de Experiencias Online
Marketing de Experiencias Online
 
Dealing with an Upside Down Internet With High Performance Time Series Database
Dealing with an Upside Down Internet  With High Performance Time Series DatabaseDealing with an Upside Down Internet  With High Performance Time Series Database
Dealing with an Upside Down Internet With High Performance Time Series Database
 
Andreas Fickers: Transmedia Storytelling and Media History
Andreas Fickers: Transmedia Storytelling and Media HistoryAndreas Fickers: Transmedia Storytelling and Media History
Andreas Fickers: Transmedia Storytelling and Media History
 
White eagle, by robert
White eagle, by robertWhite eagle, by robert
White eagle, by robert
 
Examen de-química-ii-1
Examen de-química-ii-1Examen de-química-ii-1
Examen de-química-ii-1
 
Homework 3
Homework 3Homework 3
Homework 3
 
Computación para todos (primaria) 5to grado
Computación para todos (primaria)  5to gradoComputación para todos (primaria)  5to grado
Computación para todos (primaria) 5to grado
 
Curso gestión estratégica del tiempo 2016
Curso gestión estratégica del tiempo 2016Curso gestión estratégica del tiempo 2016
Curso gestión estratégica del tiempo 2016
 
陰陽編 太極から八卦へ
陰陽編 太極から八卦へ陰陽編 太極から八卦へ
陰陽編 太極から八卦へ
 
Ngss poster
Ngss posterNgss poster
Ngss poster
 

Similar to From Gust To Tempest: Scaling Storm

Everything you wanted to know about writing async, concurrent http apps in java
Everything you wanted to know about writing async, concurrent http apps in java Everything you wanted to know about writing async, concurrent http apps in java
Everything you wanted to know about writing async, concurrent http apps in java Baruch Sadogursky
 
Zookeeper Introduce
Zookeeper IntroduceZookeeper Introduce
Zookeeper Introducejhao niu
 
Streaming ML on Spark: Deprecated, experimental and internal ap is galore!
Streaming ML on Spark: Deprecated, experimental and internal ap is galore!Streaming ML on Spark: Deprecated, experimental and internal ap is galore!
Streaming ML on Spark: Deprecated, experimental and internal ap is galore!Holden Karau
 
Can we run the Whole Web on Apache Sling?
Can we run the Whole Web on Apache Sling?Can we run the Whole Web on Apache Sling?
Can we run the Whole Web on Apache Sling?Bertrand Delacretaz
 
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]Kyle Hailey
 
Capacity Management for Web Operations
Capacity Management for Web OperationsCapacity Management for Web Operations
Capacity Management for Web OperationsJohn Allspaw
 
DrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performanceDrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performanceAshok Modi
 
User-space Network Processing
User-space Network ProcessingUser-space Network Processing
User-space Network ProcessingRyousei Takano
 
1. Scaling PHP/MySQL...Presentation from Flickr
	
1.	
Scaling PHP/MySQL...Presentation from Flickr	
1.	
Scaling PHP/MySQL...Presentation from Flickr
1. Scaling PHP/MySQL...Presentation from Flickrakshat
 
Riak add presentation
Riak add presentationRiak add presentation
Riak add presentationIlya Bogunov
 
Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017Dave Holland
 
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...Spark Summit
 
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.Puppet
 
Strata Stinger Talk October 2013
Strata Stinger Talk October 2013Strata Stinger Talk October 2013
Strata Stinger Talk October 2013alanfgates
 
HBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon
 
MySQL HA with PaceMaker
MySQL HA with  PaceMakerMySQL HA with  PaceMaker
MySQL HA with PaceMakerKris Buytaert
 

Similar to From Gust To Tempest: Scaling Storm (20)

Everything you wanted to know about writing async, concurrent http apps in java
Everything you wanted to know about writing async, concurrent http apps in java Everything you wanted to know about writing async, concurrent http apps in java
Everything you wanted to know about writing async, concurrent http apps in java
 
Zookeeper Introduce
Zookeeper IntroduceZookeeper Introduce
Zookeeper Introduce
 
SQL Server On SANs
SQL Server On SANsSQL Server On SANs
SQL Server On SANs
 
Streaming ML on Spark: Deprecated, experimental and internal ap is galore!
Streaming ML on Spark: Deprecated, experimental and internal ap is galore!Streaming ML on Spark: Deprecated, experimental and internal ap is galore!
Streaming ML on Spark: Deprecated, experimental and internal ap is galore!
 
Can we run the Whole Web on Apache Sling?
Can we run the Whole Web on Apache Sling?Can we run the Whole Web on Apache Sling?
Can we run the Whole Web on Apache Sling?
 
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
 
Capacity Management for Web Operations
Capacity Management for Web OperationsCapacity Management for Web Operations
Capacity Management for Web Operations
 
DrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performanceDrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performance
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
User-space Network Processing
User-space Network ProcessingUser-space Network Processing
User-space Network Processing
 
1. Scaling PHP/MySQL...Presentation from Flickr
	
1.	
Scaling PHP/MySQL...Presentation from Flickr	
1.	
Scaling PHP/MySQL...Presentation from Flickr
1. Scaling PHP/MySQL...Presentation from Flickr
 
Riak add presentation
Riak add presentationRiak add presentation
Riak add presentation
 
Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017
 
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
Drizzle—Low Latency Execution for Apache Spark: Spark Summit East talk by Shi...
 
Introduction to Galera Cluster
Introduction to Galera ClusterIntroduction to Galera Cluster
Introduction to Galera Cluster
 
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.
PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.
 
Strata Stinger Talk October 2013
Strata Stinger Talk October 2013Strata Stinger Talk October 2013
Strata Stinger Talk October 2013
 
HBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond Panel
 
Sparkstreaming
SparkstreamingSparkstreaming
Sparkstreaming
 
MySQL HA with PaceMaker
MySQL HA with  PaceMakerMySQL HA with  PaceMaker
MySQL HA with PaceMaker
 

More from DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 

Recently uploaded (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 

From Gust To Tempest: Scaling Storm

  • 1. From Gust To Tempest: Scaling Storm P R E S E N T E D B Y B o b b y E v a n s
  • 2. Hi I’m Bobby Evans bobby@apache.org 2  Low Latency Data Processing Architect @ Yahoo  Apache Storm  Apache Spark  Apache Kafka  Committer and PMC member for  Apache Storm  Apache Hadoop  Apache Spark  Apache TEZ
  • 3. Agenda 3  Apache Storm Architecture  What was Done Already  Current/Future Work background: https://www.flickr.com/photos/gsfc/15072362777
  • 4. Storm Concepts 1. Streams  Unbounded sequence of tuples 2. Spout  Source of Stream  E.g. Read from Twitter streaming API 3. Bolts  Processes input streams and produces new streams  E.g. Functions, Filters, Aggregation, Joins 4. Topologies  Network of spouts and bolts
  • 5. Routing of tuples  Shuffle grouping: pick a random task (but with load balancing)  Fields grouping: consistent hashing on a subset of tuple fields  All grouping: send to all tasks  Global grouping: pick task with lowest id  Shuffle or Local grouping: If there is a local bolt (in the same worker process) use it otherwise use shuffle  Partial Key grouping: Fields grouping but with 2 choices for load balancing.
  • 7. Worker Task (Spout A-1) Task (Spout A-5) Task (Spout A-9) Task (Bolt B-3) Other Workers Task (Acker) Routing
  • 8. Current State w hat w as done alr eady background: https://www.flickr.com/photos/maf04/14392794749
  • 9. Largest Topology Growth at Yahoo 9 2013 2014 2015 Executors 100 3000 4000 Workers 40 400 1500 0 500 1000 1500 2000 2500 3000 3500 4000 4500 background: https://www.flickr.com/photos/68942208@N02/16242761551
  • 10. Cluster Growth at Yahoo 10 0 500 1000 1500 2000 2500 Jun-12 Aug-12 Oct-12 Dec-12 Feb-13 Apr-13 Jun-13 Aug-13 Oct-13 Dec-13 Feb-14 Apr-14 Jun-14 Aug-14 Oct-14 Dec-14 Feb-15 Apr-15 Jun-15 Jun-12 Jan-13 Jan-14 Jan-15 Jun-15 Total Nodes 40 170 600 1100 2300 Largest Cluster 20 60 120 250 300 background: http://bit.ly/1KypnCN
  • 11. In the Beginning… 11  Mid 2011:  Storm is released as open source  Early 2012:  Yahoo evaluation begins  https://github.com/yahoo/storm-perf-test  Mid 2012:  Purpose built clusters 10+ nodes  Early 2013:  60-node cluster, largest topology 40 workers, 100 executors  ZooKeeper config -Djute.maxbuffer=4194304  May 2013:  Netty messaging layer  http://yahooeng.tumblr.com/post/64758709722/making-storm-fly-with-netty  Oct 2013:  ZooKeeper heartbeat timeout checks background: https://www.flickr.com/photos/gedas/3618792161
  • 12. So Far…  Late 2013:  ZooKeeper config -Dzookeeper.forceSync=no  Storm enters Apache Incubator  Early 2014:  250-node cluster, largest topology 400 workers, 3000 executors  June 2014:  STORM-376 – Compress ZooKeeper data  STORM-375 – Check for changes before reading data from ZooKeeper  Sep 2014  Storm becomes an Apache Top Level Project  Early 2015:  STORM-632 Better grouping for data skew  STORM-634 Thrift serialization for ZooKeeper data.  300-node cluster (Tested 400 nodes, 1,200 theoretical maximum)  Largest topology 1500 workers, 4000 executors background: http://s0.geograph.org.uk/geophotos/02/27/03/2270317_7653a833.jpg
  • 13. We still have a ways to go 13 Hadoop 5400 Storm 300 Nodes Largest Cluster Size We want to get to a 4000-node Storm cluster. Hadoop 41000 Storm 2300 Nodes Total Nodes background: https://www.flickr.com/photos/68397968@N07/14600216228
  • 14. Future and Current Work how w e ar e going to get to 4000 background: https://www.flickr.com/photos/12567713@N00/2859921414
  • 15. Why Can’t Storm Scale? It’s all about the data. State Storage (ZooKeeper):  Limited to disk write speed (80MB/sec typically)  Scheduling O(num_execs * resched_rate)  Supervisor O(num_supervisors * hb_rate)  Topology Metrics (worst case) O(num_execs * num_comps * num_streams * hb_rate) On one 240-node Yahoo Storm cluster, ZK writes 16 MB/sec, about 99.2% of that is worker heartbeats Theoretical Limit: 80 MB/sec / 16 MB/sec * 240 nodes = 1,200 nodes background: http://cnx.org/resources/8ab472b9b2bc2e90bb15a2a7b2182ca45a883e0f/Figure_45_07_02.jpg
  • 16. Pacemaker heartbeat server Simple Secure In-Memory Store for Worker Heartbeats.  Removes Disk Limitation  Writes Scale Linearly (but nimbus still needs to read it all, ideally in 10 sec or less) 240 node cluster’s complete HB state is 48MB, Gigabit is about 125 MB/s 10 s / (48 MB / 125 MB/s) * 240 nodes = 6250 nodes 1200 6250 Theoretical Maximum Cluster Size Zookeeper PaceMaker Gigabit Highly-connected topologies dominate data volume. 10 GigE to the rescue
  • 17. Why Can’t Storm Scale? It’s all about the data. All raw data serialized, transferred to UI, de-serialized and aggregated per page load Our largest topology uses about 400 MB in memory Aggregate stats for UI/REST in Nimbus  10+ min page load to 7 seconds DDOS on Nimbus for jar download Distributed Cache/Blob Store (STORM-411)  Pluggable backend with HDFS support background: https://www.flickr.com/photos/oregondot/15799498927
  • 18. Why Can’t Storm Scale? It’s all about the data. Storm round-robin scheduling  R-1/R % of traffic will be off rack where R is the number of racks  N-1/N % of traffic will be off node where N is the number of nodes  Does not know when resources are full (i.e. network) Resource & Network Topography Aware Scheduling One slow node slows the entire topology. Load Aware Routing (STORM-162) Need lot more intelligent routing.