Future of Apache Storm

Future of Apache Storm
Hadoop Summit 2016, Melbourne

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
About us
Arun Iyer, Hortonworks
Apache Storm Committer/PMC
arunm@apache.org
Satish Duggana, Hortonworks
Apache Storm Committer/PMC
satishd@apache.org

Apache Storm 1.0
Maturity and Improved Performance
Release Date: April 12, 2016

Distributed Cache API

 Topology resources
– Dictionaries, ML Models, Geolocation Data, etc.
 Typically packaged in topology jar
– Fine for small files
– Large files negatively impact topology startup time
– Immutable: Changes require repackaging and deployment

 Allows sharing of files (BLOBs) among topologies
 Files can be updated from the command line
 Allows for files from several KB to several GB in size
 Files can change over the lifetime of the topology
 Allows for compression (e.g. zip, tar, gzip)

 Two implementations: LocalFsBlobStore and HdfsBlobStore
 Local implementation supports Replication Factor (not needed for HDFS-backed
implementation)
 Both support ACLs

Creating a blob:
– storm blobstore create --file dict.txt --acl o::rwa --repl-fctr 2 key1
Making it available to a topology:
– storm jar topo.jar my.topo.Class test_topo -c
topology.blobstore.map=‘{"key1":{"localname":"dict.txt",
"uncompress":"false"}}'

Highly Available Nimbus

Nimbus
Nimbus HA
ZooKeeper
Supervisor
Worker*
Supervisor
Worker*
Supervisor
Worker*

Nimbus
Leader
Nimbus
Nimbus
Nimbus HA
ZooKeeper
Supervisor
Worker*
Supervisor
Worker*
Supervisor
Worker*

Nimbus
Leader
Nimbus
Nimbus
Nimbus HA
ZooKeeper
Supervisor
Worker*
Supervisor
Worker*
Supervisor
Worker*
Leader election

Nimbus
Nimbus
Leader
Nimbus
Nimbus HA
ZooKeeper
Supervisor
Worker*
Supervisor
Worker*
Supervisor
Worker*

Nimbus HA
 Increase overall availability of Nimbus
 Nimbus hosts can join/leave at any time
 Leverages Distributed Cache API
 Topology JAR, Config, and Serialized Topology uploaded to Distributed Cache
 Replication guarantees availability of all files

Pacemaker
Heartbeat Server

Pacemaker
 Replaces Zookeeper for Heartbeats
 In-Memory key-value store
 Allows Scaling to 2k-3k+ Nodes
 Secure: Kerberos/Digest Authentication

Pacemaker
 Compared to Zookeeper
– Less Memory/CPU
– No Disk
– No overhead of maintaining consistency

Automatic Back Pressure

 In previous Storm versions, the only way to throttle topologies was
to enable ACKing and set topology.spout.max.pending.
 If you don’t require at-least-once guarantees, this imposed a
significant performance penalty.

 High/Low Watermarks (expressed as % of task’s buffer size)
 Back Pressure thread monitors task’s buffers
 If High Watermark reached, slow down Spouts
 If Low Watermark reached, stop throttling
 All Spouts Supported

Performance

Up to 16x faster throughput
Realistically 3x -- Highly dependent on use case and fault tolerance settings

> 60% Latency Reduction

Resource Aware Scheduler
(RAS)

 Specify the resource requirements (Memory/CPU) for individual
topology components (Spouts/Bolts)
 Memory: On-Heap / Off-Heap (if off-heap is used)
 CPU: Point system based on number of cores
 Resource requirements are per component instance (parallelism
matters)

 CPU and Memory availability described in storm.yaml on each
supervisor node. E.g.:
supervisor.memory.capacity.mb: 3072.0
supervisor.cpu.capacity: 400.0
 Convention for CPU capacity is to use 100 for each CPU core

SpoutDeclarer spout = builder.setSpout("sp1", new TestSpout(), 10);
//set cpu requirement
spout.setCPULoad(20);
//set onheap and offheap memory requirement
spout.setMemoryLoad(64, 16);
BoltDeclarer bolt1 = builder.setBolt("b1", new MyBolt(), 3).shuffleGrouping("sp1");
//sets cpu requirement. Not neccessary to set both CPU and memory.
//For requirements not set, a default value will be used
bolt1.setCPULoad(15);
BoltDeclarer bolt2 = builder.setBolt("b2", new MyBolt(), 2).shuffleGrouping("b1");
bolt2.setMemoryLoad(100);

Native windowing support

 Fundamental abstraction for processing continuous streams of events
 Breaks the stream of events into sub-sets
 The sub-sets can then be aggregated, joined or analyzed.
Why is it important?

 Trending items
– The top trending search queries in the last day
– Top tweets in the last 1 hour
 Complex event processing
– Windowing is fundamental to gain meaningful insights from a stream of events
– E.g. If same credit card is used from geographically distributed locations within the last one hour, it
could be a fraud.
Some use cases

Sliding Window
time0 5 10 15
time0 5 10 15
Tumbling Window
Windows do not overlap
Windows can overlap

public class WindowMaxBolt extends BaseWindowedBolt {
private OutputCollector collector;
public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
this.collector = collector;
}
public void execute(TupleWindow inputWindow) {
int max = Integer.MIN_VALUE;
for (Tuple tuple: inputWindow.get()) {
max = Math.Max(max, tuple.getIntegerByField(”val"));
}
collector.emit(new Values(max));
}
public static void main(String[] args) {
// ...
builder.setBolt("slidingMax", new WindowMaxBolt().withWindow(Duration.minutes(10), Duration.minutes(2)));
// submit topology
StormSubmitter.submitTopologyWithProgressBar(args[0], conf, builder.createTopology());
}
}

}
}
}
// ...
// submit topology
}
}
triggered when the window activates
based on the window configuration

}
}
}
// ...
// submit topology
}
}
the window configuration

Trident
topology.newStream("spout", spout)
.window(
SlidingDurationWindow.of(Duration.minutes(10), Duration.minutes(2)), // window configuration
new InMemoryWindowsStoreFactory(), // window store factory
new Fields("val"), // input field
new Min("val"), // aggregate function
new Fields("max") // output field
);

 Processing time
 Event time
– e.g. Processing events out of a log file based on when the event actually happened
– Implemented based on watermarks
 Out of order and late events
– e.g. Mobile gaming data where the user goes offline and data arrives after sometime
Windowing considerations
6:026:01 6:05 6:08
7:017:00
6:10 6:12
7:02
6:15
Processing time
6:09 6:126:08
watermark watermark
6:10
late event

State Management
Stateful Bolts with Automatic Checkpointing

State Management
 Allows storm bolts to store and retrieve the state of its computation
 The framework automatically and periodically snapshots the state of the bolts
 In-memory and Redis based state stores
 Implementation based on
– Asynchronous Barrier Snapshotting (ABS) algorithm [1]
– Chandy-Lamport Algorithm [2]
1. http://arxiv.org/abs/1506.08603
2. http://research.microsoft.com/en-us/um/people/lamport/pubs/chandy.pdf

State Management
public class WordCountBolt extends BaseStatefulBolt<KeyValueState> {
private KeyValueState wordCounts;
}
public void initState(KeyValueState state) {
this.wordCounts = state;
}
public void execute(Tuple tuple) {
String word = tuple.getString(0);
Integer count = (Integer) wordCounts.get(word, 0);
count++;
wordCounts.put(word, count);
collector.emit(new Values(word, count));
}
}

State Management
}
}
count++;
}
}
Initialize state

State Management
}
}
count++;
}
}
Update state

State Management
 Checkpointing/Snapshotting: What you see
Spout Stateful Bolt1 Stateful Bolt2Bolt1
execute/update state execute/update stateexecute

State Management
 Checkpointing/Snapshotting: What you get
Spout
Checkpoint Spout
Stateful Bolt1 Stateful Bolt2Bolt1
State Store
ACKER
ACK
ACKACK
chkpt chkpt
chkpt

Streaming SQL
(currently Beta/WIP)

Streaming SQL
 SQL is a well adopted standard
– available in several projects like Drill, Hive, Phoenix, Spark
 Faster development cycle
 Opportunity to unify batch and stream data analytics
Benefits
SELECT
word, COUNT(*) AS total
FROM
words
GROUP BY word

Streaming SQL
 Leverages Apache calcite
 Compiles SQL statements into Trident topologies
 Usage:
– $ bin/storm sql <sql-file> <topo-name>
Storm implementation

Streaming SQL
 Example
CREATE EXTERNAL TABLE ORDERS
(ID INT PRIMARY KEY, UNIT_PRICE INT, QUANTITY INT)
LOCATION 'kafka://localhost:2181/brokers?topic=orders’
CREATE EXTERNAL TABLE LARGE_ORDERS
(ID INT PRIMARY KEY, TOTAL INT)
LOCATION 'kafka://localhost:2181/brokers?topic=large_orders’
INSERT INTO LARGE_ORDERS
SELECT ID, UNIT_PRICE * QUANTITY AS TOTAL
FROM ORDERS
WHERE UNIT_PRICE * QUANTITY > 50

Streaming SQL
 Storm SQL workflow

Storm Usability Improvements
Enhanced Debugging and Monitoring of Topologies

Storm usability improvements
./bin/storm set_log_level [topology name]-l [logger name]=[LEVEL]:[TIMEOUT]
Dynamic Log Levels
Storm CLI:
Storm UI:

 No more debug bolts or Trident functions
 Click on the “Events” link to view the sample log.
Tuple Sampling

 Distributed Log search
– Search across all topology logs, even archived (ZIP) logs.
 Dynamic worker profiling
– Heap dumps, JStack output, Profiler recordings
 Restart workers from UI
 Supervisor health checks

Integrations
New
 Cassandra
 Solr
 Elastic Search
 MQTT
Improvements
 Kafka
 HDFS spout
 Hbase
 Hive

What’s next for Storm?
2.0 and Beyond

Clojure to Java
Broadening the contributor base
Alibaba JStorm Contribution

Apache Beam (incubating) Integration
 Unified API for dealing with
bounded/unbounded data sources (i.e.
batch/streaming)
 One API. Multiple implementations
(execution engines). Called “Runners” in
Beamspeak.

Apache Storm 2.0 and beyond
 Windowing enhancements
 Unified Stream API
 Revamped UI
 Metrics improvements
and many more…

Q&A
Thank You

Future of Apache Storm

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (15)

Similar to Future of Apache Storm

Similar to Future of Apache Storm (20)

More from DataWorks Summit/Hadoop Summit

More from DataWorks Summit/Hadoop Summit (20)

Recently uploaded

Recently uploaded (20)

Future of Apache Storm

Editor's Notes