SlideShare a Scribd company logo
1 of 57
Download to read offline
PREDICTIVE
DATACENTER ANALYTICS
WITH STRYMON
Vasia Kalavri
kalavriv@inf.ethz.ch
Support:
QCon San Francisco
14 November 2017
ABOUT ME
▸ Postdoc at ETH Zürich
▸ Systems Group: https://www.systems.ethz.ch/
▸ PMC member of Apache Flink
▸ Research interests
▸ Large-scale graph processing
▸ Streaming dataflow engines
▸ Current project
▸ Predictive datacenter analytics and
management
2
@vkalavri
DATACENTER
MANAGEMENT
https://code.facebook.com/posts/1499322996995183/solving-the-mystery-of-link-imbalance-a-metastable-failure-state-at-scale/
4
5
https://www.theregister.co.uk/2017/09/28/amadeus_booking_software_outages_lead_to_global_delayed_flights/
DATACENTER MANAGEMENT
66
Workloadfluctuations
Network failures
Configuration
updates
Software updatesResource scaling
Service
deployment
DATACENTER MANAGEMENT
6
Can we predict the effect of changes?
6
Workloadfluctuations
Network failures
Configuration
updates
Software updatesResource scaling
Service
deployment
DATACENTER MANAGEMENT
6
Can we predict the effect of changes?
Can we prevent catastrophic faults?
6
Workloadfluctuations
Network failures
Configuration
updates
Software updatesResource scaling
Service
deployment
Predicting outcomes under hypothetical conditions
What-if Analysis:
7
Predicting outcomes under hypothetical conditions
‣ How will response time change if we migrate a large service?
‣ What will happen to link utilization if we change the routing protocol costs?
‣ Will SLOs will be violated if a certain switch fails?
‣ Which services will be affected if we change load balancing strategy?
What-if Analysis:
7
Test deployment?
Data Center Test Cluster
▸ expensive to operate and maintain
▸ some errors only occur in a large scale!
a physical small-scale cluster
to try out configuration changes and what-ifs
8
Analytical model?
Data Center
infer workload distribution from samples and analytically
model system components (e.g. disk failure rate)
▸ hard to develop, large design space to explore
▸ often inaccurate
9
MODERN ENTERPRISE
DATACENTERS ARE ALREADY
HEAVILY INSTRUMENTED
10
Trace-driven online simulation
Use existing instrumentation to build a datacenter model
▸ construct DC state from real events
▸ simulate the state we cannot directly observe
11
Data Center
current DC
state
forked what-if
DC state
forked what-if
DC state
forked what-if
DC state
1212
traces, configuration,
topology updates, …
Datacenter
Strymon
queries, complex analytics,
simulations, …
policy enforcement,
what-if scenarios, …
STRYMON: ONLINE DATACENTER ANALYTICS AND MANAGEMENT
Datacenter
state
event streams
strymon.systems.ethz.ch
TIMELY DATAFLOW
ingress egress
feedback
STRYMON’S OPERATIONAL REQUIREMENTS
‣ Low latency: react quickly to network failures
‣ High throughput: keep up with high stream rates
‣ Iterative computation: complex graph analytics on the
network topology
‣ Incremental computation: reuse already computed results
when possible
‣ e.g. do not recompute forwarding rules after a link update
14
TIMELY DATAFLOW: STREAM PROCESSING IN RUST
15
▸ Data-parallel computations
▸ Arbitrary cyclic dataflows
▸ Logical timestamps (epochs)
▸ Asynchronous execution
▸ Low latency
D. Murray, F. McSherry, M. Isard, R. Isaacs, P. Barham, M. Abadi.
Naiad: A Timely Dataflow System. In SOSP, 2013.
15
https://github.com/frankmcsherry/timely-dataflow
WORDCOUNT IN TIMELY
fn main() {

timely::execute_from_args(std::env::args(), |worker| {



let mut input = InputHandle::new();

let mut probe = ProbeHandle::new();

let index = worker.index();



worker.dataflow(|scope| {

input.to_stream(scope)

.flat_map(|text: String|

text.split_whitespace()

.map(move |word| (word.to_owned(), 1))

.collect::<Vec<_>>()

)

.aggregate(

|_key, val, agg| { *agg += val; },

|key, agg: i64| (key, agg),

|key| hash_str(key)

)

.inspect(|data| println!("seen {:?}", data))

.probe_with(&mut probe);

});

//feed data
…

}).unwrap();

}
16
WORDCOUNT IN TIMELY
fn main() {

timely::execute_from_args(std::env::args(), |worker| {



let mut input = InputHandle::new();

let mut probe = ProbeHandle::new();

let index = worker.index();



worker.dataflow(|scope| {

input.to_stream(scope)

.flat_map(|text: String|

text.split_whitespace()

.map(move |word| (word.to_owned(), 1))

.collect::<Vec<_>>()

)

.aggregate(

|_key, val, agg| { *agg += val; },

|key, agg: i64| (key, agg),

|key| hash_str(key)

)

.inspect(|data| println!("seen {:?}", data))

.probe_with(&mut probe);

});

//feed data
…

}).unwrap();

}
16
initialize and run
a timely job
WORDCOUNT IN TIMELY
fn main() {

timely::execute_from_args(std::env::args(), |worker| {



let mut input = InputHandle::new();

let mut probe = ProbeHandle::new();

let index = worker.index();



worker.dataflow(|scope| {

input.to_stream(scope)

.flat_map(|text: String|

text.split_whitespace()

.map(move |word| (word.to_owned(), 1))

.collect::<Vec<_>>()

)

.aggregate(

|_key, val, agg| { *agg += val; },

|key, agg: i64| (key, agg),

|key| hash_str(key)

)

.inspect(|data| println!("seen {:?}", data))

.probe_with(&mut probe);

});

//feed data
…

}).unwrap();

}
16
create input and
progress handles
WORDCOUNT IN TIMELY
fn main() {

timely::execute_from_args(std::env::args(), |worker| {



let mut input = InputHandle::new();

let mut probe = ProbeHandle::new();

let index = worker.index();



worker.dataflow(|scope| {

input.to_stream(scope)

.flat_map(|text: String|

text.split_whitespace()

.map(move |word| (word.to_owned(), 1))

.collect::<Vec<_>>()

)

.aggregate(

|_key, val, agg| { *agg += val; },

|key, agg: i64| (key, agg),

|key| hash_str(key)

)

.inspect(|data| println!("seen {:?}", data))

.probe_with(&mut probe);

});

//feed data
…

}).unwrap();

}
16
define the dataflow
and its operators
WORDCOUNT IN TIMELY
fn main() {

timely::execute_from_args(std::env::args(), |worker| {



let mut input = InputHandle::new();

let mut probe = ProbeHandle::new();

let index = worker.index();



worker.dataflow(|scope| {

input.to_stream(scope)

.flat_map(|text: String|

text.split_whitespace()

.map(move |word| (word.to_owned(), 1))

.collect::<Vec<_>>()

)

.aggregate(

|_key, val, agg| { *agg += val; },

|key, agg: i64| (key, agg),

|key| hash_str(key)

)

.inspect(|data| println!("seen {:?}", data))

.probe_with(&mut probe);

});

//feed data
…

}).unwrap();

}
16
watch for
progress
WORDCOUNT IN TIMELY
fn main() {

timely::execute_from_args(std::env::args(), |worker| {



let mut input = InputHandle::new();

let mut probe = ProbeHandle::new();

let index = worker.index();



worker.dataflow(|scope| {

input.to_stream(scope)

.flat_map(|text: String|

text.split_whitespace()

.map(move |word| (word.to_owned(), 1))

.collect::<Vec<_>>()

)

.aggregate(

|_key, val, agg| { *agg += val; },

|key, agg: i64| (key, agg),

|key| hash_str(key)

)

.inspect(|data| println!("seen {:?}", data))

.probe_with(&mut probe);

});

//feed data
…

}).unwrap();

}
16
a few Rust
peculiarities to
get used to :-)
PROGRESS TRACKING
▸ All tuples bear a logical timestamp (think event time)
▸ To send a timestamped tuple, an operator must hold a
capability for it
▸ Workers broadcast progress changes to other workers
▸ Each worker independently determines progress
17
A distributed protocol that allows operators reason
about the possibility of receiving data
MAKING PROGRESS
fn main() {

timely::execute_from_args(std::env::args(), |worker| {

…

…
…
for round in 0..10 {

input.send(("round".to_owned(), 1));

input.advance_to(round + 1);

while probe.less_than(input.time()) {

worker.step();

}
}

}).unwrap();

}
18
MAKING PROGRESS
fn main() {

timely::execute_from_args(std::env::args(), |worker| {

…

…
…
for round in 0..10 {

input.send(("round".to_owned(), 1));

input.advance_to(round + 1);

while probe.less_than(input.time()) {

worker.step();

}
}

}).unwrap();

}
18
push data to the
input stream
MAKING PROGRESS
fn main() {

timely::execute_from_args(std::env::args(), |worker| {

…

…
…
for round in 0..10 {

input.send(("round".to_owned(), 1));

input.advance_to(round + 1);

while probe.less_than(input.time()) {

worker.step();

}
}

}).unwrap();

}
18
advance the
input epoch
MAKING PROGRESS
fn main() {

timely::execute_from_args(std::env::args(), |worker| {

…

…
…
for round in 0..10 {

input.send(("round".to_owned(), 1));

input.advance_to(round + 1);

while probe.less_than(input.time()) {

worker.step();

}
}

}).unwrap();

}
18
do work while there’s
still data for this epoch
TIMELY ITERATIONS
timely::example(|scope| {



let (handle, stream) = scope.loop_variable(100, 1);

(0..10).to_stream(scope)

.concat(&stream)

.inspect(|x| println!("seen: {:?}", x))

.connect_loop(handle);

});
19
TIMELY ITERATIONS
timely::example(|scope| {



let (handle, stream) = scope.loop_variable(100, 1);

(0..10).to_stream(scope)

.concat(&stream)

.inspect(|x| println!("seen: {:?}", x))

.connect_loop(handle);

});
19
loop 100 times
at most
advance
timestamps by 1
in each iteration
create the
feedback loop
TIMELY ITERATIONS
timely::example(|scope| {



let (handle, stream) = scope.loop_variable(100, 1);

(0..10).to_stream(scope)

.concat(&stream)

.inspect(|x| println!("seen: {:?}", x))

.connect_loop(handle);

});
19
loop 100 times
at most
advance
timestamps by 1
in each iteration
create the
feedback loop
t
(t, l1)
(t, (l1, l2))
TIMELY & I
20
TIMELY & I
20
Relationship status:
it’s complicated
TIMELY & I
20
Relationship status:
it’s complicated
APIs &
libraries
performance
ecosystem
fault-tolerance
deployment
debugging
expressiveness
incremental
computation
performance
STRYMON
USE-CASES
What-if analysis
Real-time
datacenter analytics
Incremental
network routing
Online critical
path analysis
Strymon
22
‣ Query and analyze state online
‣ Control and enforce configuration
‣ Simulate what-if scenarios
‣ Understand performance
What-if analysis
Real-time
datacenter analytics
Incremental
network routing
Online critical
path analysis
Strymon
23
‣ Query and analyze state online
‣ Control and enforce configuration
‣ Simulate what-if scenarios
‣ Understand performance
RECONSTRUCTING USER SESSIONS
24
Application A
Application B
A.1
A.2
A.3
B.1
B.2
B.3
B.4
Time: 2015/09/01 10:03:38.599859
Session ID: XKSHSKCBA53U088FXGE7LD8
Transaction ID: 26-3-11-5-1
PERFORMANCE RESULTS
25
Log event
A
B
1 2
1-1 1-2
1
1-3
1-1
Client Time
1
1-1 1-2
2
Inactivity
Transaction IDn-2-8Transaction
‣ Logs from 1263
streams and 42 servers
‣ 1.3 million events/s at
424.3 MB/s
‣ 26ms per epoch vs. 2.1s
per epoch with Flink
Zaheer Chothia, John Liagouris, Desislava Dimitrova, and Timothy Roscoe.
Online Reconstruction of Structural Information from Datacenter Logs. (EuroSys '17).
More Results:
What-if analysis
Real-time
datacenter analytics
Incremental
network routing
Online critical
path analysis
Strymon
26
‣ Query and analyze state online
‣ Control and enforce configuration
‣ Simulate what-if scenarios
‣ Understand performance
ROUTING AS A STREAMING COMPUTATION
Network
changes
G R
delta-join
iteration
Updated rules
Forwarding 

rules
27
‣ Compute APSP and keep forwarding rules as operator state
‣ Flow requests translate to lookups on that state
‣ Network updates cause re-computation of affected rules only
REACTION TO LINK FAILURES
▸ Fail random link
▸ 500 individual runs
▸ 32 threads
28
Desislava C. Dimitrova et. al.
Quick Incremental Routing Logic for Dynamic Network Graphs. SIGCOMM Posters and Demos '17
More Results:
What-if analysis
Real-time
datacenter analytics
Incremental
network routing
Online critical
path analysis
Strymon
29
‣ Query and analyze state online
‣ Control and enforce configuration
‣ Simulate what-if scenarios
‣ Understand performance
ONLINE CRITICAL PATH ANALYSIS
30
input stream output stream
periodic
snapshot
trace snapshot
stream
analyzer
performance
summaries
Apache Flink,
Apache Spark,
TensorFlow,
Heron,
Timely Dataflow
TASK SCHEDULING IN APACHE SPARK
31
DRIVER
W1
W2
W3
Venkataraman, Shivaram, et al. "Drizzle: Fast and adaptable stream processing at scale." Spark Summit (2016).
SCHEDULING BOTTLENECK IN APACHE SPARK
32
0 5 10 15
Snapshot
0.0
0.2
0.4
0.6
0.8
CP
0 5 10 15
Snapshot
%weight
Processing Scheduling
Conventional ProfilingStrymon Profiling
Apache Spark: Yahoo! Streaming Benchmark, 16 workers, 8s snapshots
SCHEDULING BOTTLENECK IN APACHE SPARK
32
0 5 10 15
Snapshot
0.0
0.2
0.4
0.6
0.8
CP
0 5 10 15
Snapshot
%weight
Processing Scheduling
Conventional ProfilingStrymon Profiling
Apache Spark: Yahoo! Streaming Benchmark, 16 workers, 8s snapshots
What-if analysis
Real-time
datacenter analytics
Incremental
network routing
Online critical
path analysis
Strymon
33
‣ Query and analyze state online
‣ Control and enforce configuration
‣ Simulate what-if scenarios
‣ Understand performance
EVALUATING LOAD BALANCING STRATEGIES
▸ 13k services
▸ ~100K user requests/s
▸ OSPF routing
▸ Weighted Round-Robin
load balancing
34
Traffic matrix simulation
under topology and load
balancing changes
WHAT-IF DATAFLOW
35
Logging Servers
Topology Discovery
Configuration DBs
RPC logs
(text)
Topology changes
(graph)
Device specs,

application placement
(relational)
ETL Load Balancing Routing
LB View
Construction
Routing View
Construction
Traffic Matrix
Construction
DC Model
construction
STRYMON DEMO
Strymon 0.1.0 has been released!
https://strymon-system.github.io
https://github.com/strymon-system
strymon-users@lists.inf.ethz.ch
Try it out:
Send us feedback:
37
THE STRYMON TEAM & FRIENDS
38
Vasiliki Kalavri
Zaheer Chothia
Andrea Lattuada
Prof. Timothy Roscoe
Sebastian Wicki
Moritz HoffmannDesislava Dimitrova
John Liagouris
Frank McSherry
THE STRYMON TEAM & FRIENDS
38
Vasiliki Kalavri
Zaheer Chothia
Andrea Lattuada
Prof. Timothy Roscoe
Sebastian Wicki
Moritz HoffmannDesislava Dimitrova
John Liagouris
? ?
Frank McSherry
THE STRYMON TEAM & FRIENDS
38
Vasiliki Kalavri
Zaheer Chothia
Andrea Lattuada
Prof. Timothy Roscoe
Sebastian Wicki
Moritz HoffmannDesislava Dimitrova
John Liagouris
? ?
Frank McSherry
IT COULD BE YOU!
strymon.systems.ethz.ch
PREDICTIVE
DATACENTER ANALYTICS
WITH STRYMON
Vasia Kalavri
kalavriv@inf.ethz.ch
Support:
QCon San Francisco
14 November 2017

More Related Content

What's hot

Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Stephan Ewen
 
Deep Stream Dynamic Graph Analytics with Grapharis - Massimo Perini
Deep Stream Dynamic Graph Analytics with Grapharis -  Massimo PeriniDeep Stream Dynamic Graph Analytics with Grapharis -  Massimo Perini
Deep Stream Dynamic Graph Analytics with Grapharis - Massimo PeriniFlink Forward
 
Apache Flink Internals: Stream & Batch Processing in One System – Apache Flin...
Apache Flink Internals: Stream & Batch Processing in One System – Apache Flin...Apache Flink Internals: Stream & Batch Processing in One System – Apache Flin...
Apache Flink Internals: Stream & Batch Processing in One System – Apache Flin...ucelebi
 
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeChris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeFlink Forward
 
Data Stream Analytics - Why they are important
Data Stream Analytics - Why they are importantData Stream Analytics - Why they are important
Data Stream Analytics - Why they are importantParis Carbone
 
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015Till Rohrmann
 
Flink Gelly - Karlsruhe - June 2015
Flink Gelly - Karlsruhe - June 2015Flink Gelly - Karlsruhe - June 2015
Flink Gelly - Karlsruhe - June 2015Andra Lungu
 
Self-managed and automatically reconfigurable stream processing
Self-managed and automatically reconfigurable stream processingSelf-managed and automatically reconfigurable stream processing
Self-managed and automatically reconfigurable stream processingVasia Kalavri
 
Machine Learning with Apache Flink at Stockholm Machine Learning Group
Machine Learning with Apache Flink at Stockholm Machine Learning GroupMachine Learning with Apache Flink at Stockholm Machine Learning Group
Machine Learning with Apache Flink at Stockholm Machine Learning GroupTill Rohrmann
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataDataWorks Summit/Hadoop Summit
 
Michael Häusler – Everyday flink
Michael Häusler – Everyday flinkMichael Häusler – Everyday flink
Michael Häusler – Everyday flinkFlink Forward
 
Kenneth Knowles - Apache Beam - A Unified Model for Batch and Streaming Data...
Kenneth Knowles -  Apache Beam - A Unified Model for Batch and Streaming Data...Kenneth Knowles -  Apache Beam - A Unified Model for Batch and Streaming Data...
Kenneth Knowles - Apache Beam - A Unified Model for Batch and Streaming Data...Flink Forward
 
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Robert Metzger
 
Apache Flink Training: System Overview
Apache Flink Training: System OverviewApache Flink Training: System Overview
Apache Flink Training: System OverviewFlink Forward
 
Apache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World LondonApache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World LondonStephan Ewen
 
Block Sampling: Efficient Accurate Online Aggregation in MapReduce
Block Sampling: Efficient Accurate Online Aggregation in MapReduceBlock Sampling: Efficient Accurate Online Aggregation in MapReduce
Block Sampling: Efficient Accurate Online Aggregation in MapReduceVasia Kalavri
 
Virtual Flink Forward 2020: Cogynt: Flink without code - Samantha Chan, Aslam...
Virtual Flink Forward 2020: Cogynt: Flink without code - Samantha Chan, Aslam...Virtual Flink Forward 2020: Cogynt: Flink without code - Samantha Chan, Aslam...
Virtual Flink Forward 2020: Cogynt: Flink without code - Samantha Chan, Aslam...Flink Forward
 
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkArbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkDatabricks
 
Flink Forward Berlin 2017: Pramod Bhatotia, Do Le Quoc - StreamApprox: Approx...
Flink Forward Berlin 2017: Pramod Bhatotia, Do Le Quoc - StreamApprox: Approx...Flink Forward Berlin 2017: Pramod Bhatotia, Do Le Quoc - StreamApprox: Approx...
Flink Forward Berlin 2017: Pramod Bhatotia, Do Le Quoc - StreamApprox: Approx...Flink Forward
 
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Spark Summit
 

What's hot (20)

Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)
 
Deep Stream Dynamic Graph Analytics with Grapharis - Massimo Perini
Deep Stream Dynamic Graph Analytics with Grapharis -  Massimo PeriniDeep Stream Dynamic Graph Analytics with Grapharis -  Massimo Perini
Deep Stream Dynamic Graph Analytics with Grapharis - Massimo Perini
 
Apache Flink Internals: Stream & Batch Processing in One System – Apache Flin...
Apache Flink Internals: Stream & Batch Processing in One System – Apache Flin...Apache Flink Internals: Stream & Batch Processing in One System – Apache Flin...
Apache Flink Internals: Stream & Batch Processing in One System – Apache Flin...
 
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeChris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
 
Data Stream Analytics - Why they are important
Data Stream Analytics - Why they are importantData Stream Analytics - Why they are important
Data Stream Analytics - Why they are important
 
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
Streaming Data Flow with Apache Flink @ Paris Flink Meetup 2015
 
Flink Gelly - Karlsruhe - June 2015
Flink Gelly - Karlsruhe - June 2015Flink Gelly - Karlsruhe - June 2015
Flink Gelly - Karlsruhe - June 2015
 
Self-managed and automatically reconfigurable stream processing
Self-managed and automatically reconfigurable stream processingSelf-managed and automatically reconfigurable stream processing
Self-managed and automatically reconfigurable stream processing
 
Machine Learning with Apache Flink at Stockholm Machine Learning Group
Machine Learning with Apache Flink at Stockholm Machine Learning GroupMachine Learning with Apache Flink at Stockholm Machine Learning Group
Machine Learning with Apache Flink at Stockholm Machine Learning Group
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
 
Michael Häusler – Everyday flink
Michael Häusler – Everyday flinkMichael Häusler – Everyday flink
Michael Häusler – Everyday flink
 
Kenneth Knowles - Apache Beam - A Unified Model for Batch and Streaming Data...
Kenneth Knowles -  Apache Beam - A Unified Model for Batch and Streaming Data...Kenneth Knowles -  Apache Beam - A Unified Model for Batch and Streaming Data...
Kenneth Knowles - Apache Beam - A Unified Model for Batch and Streaming Data...
 
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
 
Apache Flink Training: System Overview
Apache Flink Training: System OverviewApache Flink Training: System Overview
Apache Flink Training: System Overview
 
Apache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World LondonApache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World London
 
Block Sampling: Efficient Accurate Online Aggregation in MapReduce
Block Sampling: Efficient Accurate Online Aggregation in MapReduceBlock Sampling: Efficient Accurate Online Aggregation in MapReduce
Block Sampling: Efficient Accurate Online Aggregation in MapReduce
 
Virtual Flink Forward 2020: Cogynt: Flink without code - Samantha Chan, Aslam...
Virtual Flink Forward 2020: Cogynt: Flink without code - Samantha Chan, Aslam...Virtual Flink Forward 2020: Cogynt: Flink without code - Samantha Chan, Aslam...
Virtual Flink Forward 2020: Cogynt: Flink without code - Samantha Chan, Aslam...
 
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkArbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
 
Flink Forward Berlin 2017: Pramod Bhatotia, Do Le Quoc - StreamApprox: Approx...
Flink Forward Berlin 2017: Pramod Bhatotia, Do Le Quoc - StreamApprox: Approx...Flink Forward Berlin 2017: Pramod Bhatotia, Do Le Quoc - StreamApprox: Approx...
Flink Forward Berlin 2017: Pramod Bhatotia, Do Le Quoc - StreamApprox: Approx...
 
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
 

Similar to Predictive Datacenter Analytics with Strymon

Hadoop Introduction
Hadoop IntroductionHadoop Introduction
Hadoop IntroductionSNEHAL MASNE
 
RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]Igor Lozynskyi
 
scalable machine learning
scalable machine learningscalable machine learning
scalable machine learningSamir Bessalah
 
Testing multi outputformat based mapreduce
Testing multi outputformat based mapreduceTesting multi outputformat based mapreduce
Testing multi outputformat based mapreduceAshok Agarwal
 
Cascading Through Hadoop for the Boulder JUG
Cascading Through Hadoop for the Boulder JUGCascading Through Hadoop for the Boulder JUG
Cascading Through Hadoop for the Boulder JUGMatthew McCullough
 
Apache Spark: What? Why? When?
Apache Spark: What? Why? When?Apache Spark: What? Why? When?
Apache Spark: What? Why? When?Massimo Schenone
 
Advance Map reduce - Apache hadoop Bigdata training by Design Pathshala
Advance Map reduce - Apache hadoop Bigdata training by Design PathshalaAdvance Map reduce - Apache hadoop Bigdata training by Design Pathshala
Advance Map reduce - Apache hadoop Bigdata training by Design PathshalaDesing Pathshala
 
Dataservices: Processing (Big) Data the Microservice Way
Dataservices: Processing (Big) Data the Microservice WayDataservices: Processing (Big) Data the Microservice Way
Dataservices: Processing (Big) Data the Microservice WayQAware GmbH
 
Microservices in GO - Massimiliano Dessì - Codemotion Rome 2017
Microservices in GO - Massimiliano Dessì - Codemotion Rome 2017Microservices in GO - Massimiliano Dessì - Codemotion Rome 2017
Microservices in GO - Massimiliano Dessì - Codemotion Rome 2017Codemotion
 
Continuous Application with Structured Streaming 2.0
Continuous Application with Structured Streaming 2.0Continuous Application with Structured Streaming 2.0
Continuous Application with Structured Streaming 2.0Anyscale
 
Functional Reactive Programming in JavaScript
Functional Reactive Programming in JavaScriptFunctional Reactive Programming in JavaScript
Functional Reactive Programming in JavaScriptzupzup.org
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flinkmxmxm
 
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014NoSQLmatters
 
Apache Flink Overview at SF Spark and Friends
Apache Flink Overview at SF Spark and FriendsApache Flink Overview at SF Spark and Friends
Apache Flink Overview at SF Spark and FriendsStephan Ewen
 
PostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetPostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetLucian Oprea
 
Swift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowSwift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowDaniel S. Katz
 
Spark what's new what's coming
Spark what's new what's comingSpark what's new what's coming
Spark what's new what's comingDatabricks
 
Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Apache Apex
 
Let Spark Fly: Advantages and Use Cases for Spark on Hadoop
 Let Spark Fly: Advantages and Use Cases for Spark on Hadoop Let Spark Fly: Advantages and Use Cases for Spark on Hadoop
Let Spark Fly: Advantages and Use Cases for Spark on HadoopMapR Technologies
 

Similar to Predictive Datacenter Analytics with Strymon (20)

Hadoop Introduction
Hadoop IntroductionHadoop Introduction
Hadoop Introduction
 
RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]
 
scalable machine learning
scalable machine learningscalable machine learning
scalable machine learning
 
Testing multi outputformat based mapreduce
Testing multi outputformat based mapreduceTesting multi outputformat based mapreduce
Testing multi outputformat based mapreduce
 
Cascading Through Hadoop for the Boulder JUG
Cascading Through Hadoop for the Boulder JUGCascading Through Hadoop for the Boulder JUG
Cascading Through Hadoop for the Boulder JUG
 
Apache Spark: What? Why? When?
Apache Spark: What? Why? When?Apache Spark: What? Why? When?
Apache Spark: What? Why? When?
 
Advance Map reduce - Apache hadoop Bigdata training by Design Pathshala
Advance Map reduce - Apache hadoop Bigdata training by Design PathshalaAdvance Map reduce - Apache hadoop Bigdata training by Design Pathshala
Advance Map reduce - Apache hadoop Bigdata training by Design Pathshala
 
Dataservices: Processing (Big) Data the Microservice Way
Dataservices: Processing (Big) Data the Microservice WayDataservices: Processing (Big) Data the Microservice Way
Dataservices: Processing (Big) Data the Microservice Way
 
Microservices in GO - Massimiliano Dessì - Codemotion Rome 2017
Microservices in GO - Massimiliano Dessì - Codemotion Rome 2017Microservices in GO - Massimiliano Dessì - Codemotion Rome 2017
Microservices in GO - Massimiliano Dessì - Codemotion Rome 2017
 
Continuous Application with Structured Streaming 2.0
Continuous Application with Structured Streaming 2.0Continuous Application with Structured Streaming 2.0
Continuous Application with Structured Streaming 2.0
 
Functional Reactive Programming in JavaScript
Functional Reactive Programming in JavaScriptFunctional Reactive Programming in JavaScript
Functional Reactive Programming in JavaScript
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flink
 
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
 
Apache Flink Overview at SF Spark and Friends
Apache Flink Overview at SF Spark and FriendsApache Flink Overview at SF Spark and Friends
Apache Flink Overview at SF Spark and Friends
 
Amazon elastic map reduce
Amazon elastic map reduceAmazon elastic map reduce
Amazon elastic map reduce
 
PostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_CheatsheetPostgreSQL High_Performance_Cheatsheet
PostgreSQL High_Performance_Cheatsheet
 
Swift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowSwift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance Workflow
 
Spark what's new what's coming
Spark what's new what's comingSpark what's new what's coming
Spark what's new what's coming
 
Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)
 
Let Spark Fly: Advantages and Use Cases for Spark on Hadoop
 Let Spark Fly: Advantages and Use Cases for Spark on Hadoop Let Spark Fly: Advantages and Use Cases for Spark on Hadoop
Let Spark Fly: Advantages and Use Cases for Spark on Hadoop
 

More from Vasia Kalavri

From data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondFrom data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondVasia Kalavri
 
Online performance analysis of distributed dataflow systems (O'Reilly Velocit...
Online performance analysis of distributed dataflow systems (O'Reilly Velocit...Online performance analysis of distributed dataflow systems (O'Reilly Velocit...
Online performance analysis of distributed dataflow systems (O'Reilly Velocit...Vasia Kalavri
 
The shortest path is not always a straight line
The shortest path is not always a straight lineThe shortest path is not always a straight line
The shortest path is not always a straight lineVasia Kalavri
 
Like a Pack of Wolves: Community Structure of Web Trackers
Like a Pack of Wolves: Community Structure of Web TrackersLike a Pack of Wolves: Community Structure of Web Trackers
Like a Pack of Wolves: Community Structure of Web TrackersVasia Kalavri
 
Big data processing systems research
Big data processing systems researchBig data processing systems research
Big data processing systems researchVasia Kalavri
 
m2r2: A Framework for Results Materialization and Reuse
m2r2: A Framework for Results Materialization and Reusem2r2: A Framework for Results Materialization and Reuse
m2r2: A Framework for Results Materialization and ReuseVasia Kalavri
 
MapReduce: Optimizations, Limitations, and Open Issues
MapReduce: Optimizations, Limitations, and Open IssuesMapReduce: Optimizations, Limitations, and Open Issues
MapReduce: Optimizations, Limitations, and Open IssuesVasia Kalavri
 
A Skype case study (2011)
A Skype case study (2011)A Skype case study (2011)
A Skype case study (2011)Vasia Kalavri
 
Apache Flink Deep Dive
Apache Flink Deep DiveApache Flink Deep Dive
Apache Flink Deep DiveVasia Kalavri
 
Large-scale graph processing with Apache Flink @GraphDevroom FOSDEM'15
Large-scale graph processing with Apache Flink @GraphDevroom FOSDEM'15Large-scale graph processing with Apache Flink @GraphDevroom FOSDEM'15
Large-scale graph processing with Apache Flink @GraphDevroom FOSDEM'15Vasia Kalavri
 

More from Vasia Kalavri (10)

From data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyondFrom data stream management to distributed dataflows and beyond
From data stream management to distributed dataflows and beyond
 
Online performance analysis of distributed dataflow systems (O'Reilly Velocit...
Online performance analysis of distributed dataflow systems (O'Reilly Velocit...Online performance analysis of distributed dataflow systems (O'Reilly Velocit...
Online performance analysis of distributed dataflow systems (O'Reilly Velocit...
 
The shortest path is not always a straight line
The shortest path is not always a straight lineThe shortest path is not always a straight line
The shortest path is not always a straight line
 
Like a Pack of Wolves: Community Structure of Web Trackers
Like a Pack of Wolves: Community Structure of Web TrackersLike a Pack of Wolves: Community Structure of Web Trackers
Like a Pack of Wolves: Community Structure of Web Trackers
 
Big data processing systems research
Big data processing systems researchBig data processing systems research
Big data processing systems research
 
m2r2: A Framework for Results Materialization and Reuse
m2r2: A Framework for Results Materialization and Reusem2r2: A Framework for Results Materialization and Reuse
m2r2: A Framework for Results Materialization and Reuse
 
MapReduce: Optimizations, Limitations, and Open Issues
MapReduce: Optimizations, Limitations, and Open IssuesMapReduce: Optimizations, Limitations, and Open Issues
MapReduce: Optimizations, Limitations, and Open Issues
 
A Skype case study (2011)
A Skype case study (2011)A Skype case study (2011)
A Skype case study (2011)
 
Apache Flink Deep Dive
Apache Flink Deep DiveApache Flink Deep Dive
Apache Flink Deep Dive
 
Large-scale graph processing with Apache Flink @GraphDevroom FOSDEM'15
Large-scale graph processing with Apache Flink @GraphDevroom FOSDEM'15Large-scale graph processing with Apache Flink @GraphDevroom FOSDEM'15
Large-scale graph processing with Apache Flink @GraphDevroom FOSDEM'15
 

Recently uploaded

Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...shivangimorya083
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service LucknowAminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknowmakika9823
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
Digi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptxDigi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptxTanveerAhmed817946
 

Recently uploaded (20)

Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service LucknowAminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Digi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptxDigi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptx
 

Predictive Datacenter Analytics with Strymon

  • 1. PREDICTIVE DATACENTER ANALYTICS WITH STRYMON Vasia Kalavri kalavriv@inf.ethz.ch Support: QCon San Francisco 14 November 2017
  • 2. ABOUT ME ▸ Postdoc at ETH Zürich ▸ Systems Group: https://www.systems.ethz.ch/ ▸ PMC member of Apache Flink ▸ Research interests ▸ Large-scale graph processing ▸ Streaming dataflow engines ▸ Current project ▸ Predictive datacenter analytics and management 2 @vkalavri
  • 7. DATACENTER MANAGEMENT 6 Can we predict the effect of changes? 6 Workloadfluctuations Network failures Configuration updates Software updatesResource scaling Service deployment
  • 8. DATACENTER MANAGEMENT 6 Can we predict the effect of changes? Can we prevent catastrophic faults? 6 Workloadfluctuations Network failures Configuration updates Software updatesResource scaling Service deployment
  • 9. Predicting outcomes under hypothetical conditions What-if Analysis: 7
  • 10. Predicting outcomes under hypothetical conditions ‣ How will response time change if we migrate a large service? ‣ What will happen to link utilization if we change the routing protocol costs? ‣ Will SLOs will be violated if a certain switch fails? ‣ Which services will be affected if we change load balancing strategy? What-if Analysis: 7
  • 11. Test deployment? Data Center Test Cluster ▸ expensive to operate and maintain ▸ some errors only occur in a large scale! a physical small-scale cluster to try out configuration changes and what-ifs 8
  • 12. Analytical model? Data Center infer workload distribution from samples and analytically model system components (e.g. disk failure rate) ▸ hard to develop, large design space to explore ▸ often inaccurate 9
  • 13. MODERN ENTERPRISE DATACENTERS ARE ALREADY HEAVILY INSTRUMENTED 10
  • 14. Trace-driven online simulation Use existing instrumentation to build a datacenter model ▸ construct DC state from real events ▸ simulate the state we cannot directly observe 11 Data Center current DC state forked what-if DC state forked what-if DC state forked what-if DC state
  • 15. 1212 traces, configuration, topology updates, … Datacenter Strymon queries, complex analytics, simulations, … policy enforcement, what-if scenarios, … STRYMON: ONLINE DATACENTER ANALYTICS AND MANAGEMENT Datacenter state event streams strymon.systems.ethz.ch
  • 17. STRYMON’S OPERATIONAL REQUIREMENTS ‣ Low latency: react quickly to network failures ‣ High throughput: keep up with high stream rates ‣ Iterative computation: complex graph analytics on the network topology ‣ Incremental computation: reuse already computed results when possible ‣ e.g. do not recompute forwarding rules after a link update 14
  • 18. TIMELY DATAFLOW: STREAM PROCESSING IN RUST 15 ▸ Data-parallel computations ▸ Arbitrary cyclic dataflows ▸ Logical timestamps (epochs) ▸ Asynchronous execution ▸ Low latency D. Murray, F. McSherry, M. Isard, R. Isaacs, P. Barham, M. Abadi. Naiad: A Timely Dataflow System. In SOSP, 2013. 15 https://github.com/frankmcsherry/timely-dataflow
  • 19. WORDCOUNT IN TIMELY fn main() {
 timely::execute_from_args(std::env::args(), |worker| {
 
 let mut input = InputHandle::new();
 let mut probe = ProbeHandle::new();
 let index = worker.index();
 
 worker.dataflow(|scope| {
 input.to_stream(scope)
 .flat_map(|text: String|
 text.split_whitespace()
 .map(move |word| (word.to_owned(), 1))
 .collect::<Vec<_>>()
 )
 .aggregate(
 |_key, val, agg| { *agg += val; },
 |key, agg: i64| (key, agg),
 |key| hash_str(key)
 )
 .inspect(|data| println!("seen {:?}", data))
 .probe_with(&mut probe);
 });
 //feed data …
 }).unwrap();
 } 16
  • 20. WORDCOUNT IN TIMELY fn main() {
 timely::execute_from_args(std::env::args(), |worker| {
 
 let mut input = InputHandle::new();
 let mut probe = ProbeHandle::new();
 let index = worker.index();
 
 worker.dataflow(|scope| {
 input.to_stream(scope)
 .flat_map(|text: String|
 text.split_whitespace()
 .map(move |word| (word.to_owned(), 1))
 .collect::<Vec<_>>()
 )
 .aggregate(
 |_key, val, agg| { *agg += val; },
 |key, agg: i64| (key, agg),
 |key| hash_str(key)
 )
 .inspect(|data| println!("seen {:?}", data))
 .probe_with(&mut probe);
 });
 //feed data …
 }).unwrap();
 } 16 initialize and run a timely job
  • 21. WORDCOUNT IN TIMELY fn main() {
 timely::execute_from_args(std::env::args(), |worker| {
 
 let mut input = InputHandle::new();
 let mut probe = ProbeHandle::new();
 let index = worker.index();
 
 worker.dataflow(|scope| {
 input.to_stream(scope)
 .flat_map(|text: String|
 text.split_whitespace()
 .map(move |word| (word.to_owned(), 1))
 .collect::<Vec<_>>()
 )
 .aggregate(
 |_key, val, agg| { *agg += val; },
 |key, agg: i64| (key, agg),
 |key| hash_str(key)
 )
 .inspect(|data| println!("seen {:?}", data))
 .probe_with(&mut probe);
 });
 //feed data …
 }).unwrap();
 } 16 create input and progress handles
  • 22. WORDCOUNT IN TIMELY fn main() {
 timely::execute_from_args(std::env::args(), |worker| {
 
 let mut input = InputHandle::new();
 let mut probe = ProbeHandle::new();
 let index = worker.index();
 
 worker.dataflow(|scope| {
 input.to_stream(scope)
 .flat_map(|text: String|
 text.split_whitespace()
 .map(move |word| (word.to_owned(), 1))
 .collect::<Vec<_>>()
 )
 .aggregate(
 |_key, val, agg| { *agg += val; },
 |key, agg: i64| (key, agg),
 |key| hash_str(key)
 )
 .inspect(|data| println!("seen {:?}", data))
 .probe_with(&mut probe);
 });
 //feed data …
 }).unwrap();
 } 16 define the dataflow and its operators
  • 23. WORDCOUNT IN TIMELY fn main() {
 timely::execute_from_args(std::env::args(), |worker| {
 
 let mut input = InputHandle::new();
 let mut probe = ProbeHandle::new();
 let index = worker.index();
 
 worker.dataflow(|scope| {
 input.to_stream(scope)
 .flat_map(|text: String|
 text.split_whitespace()
 .map(move |word| (word.to_owned(), 1))
 .collect::<Vec<_>>()
 )
 .aggregate(
 |_key, val, agg| { *agg += val; },
 |key, agg: i64| (key, agg),
 |key| hash_str(key)
 )
 .inspect(|data| println!("seen {:?}", data))
 .probe_with(&mut probe);
 });
 //feed data …
 }).unwrap();
 } 16 watch for progress
  • 24. WORDCOUNT IN TIMELY fn main() {
 timely::execute_from_args(std::env::args(), |worker| {
 
 let mut input = InputHandle::new();
 let mut probe = ProbeHandle::new();
 let index = worker.index();
 
 worker.dataflow(|scope| {
 input.to_stream(scope)
 .flat_map(|text: String|
 text.split_whitespace()
 .map(move |word| (word.to_owned(), 1))
 .collect::<Vec<_>>()
 )
 .aggregate(
 |_key, val, agg| { *agg += val; },
 |key, agg: i64| (key, agg),
 |key| hash_str(key)
 )
 .inspect(|data| println!("seen {:?}", data))
 .probe_with(&mut probe);
 });
 //feed data …
 }).unwrap();
 } 16 a few Rust peculiarities to get used to :-)
  • 25. PROGRESS TRACKING ▸ All tuples bear a logical timestamp (think event time) ▸ To send a timestamped tuple, an operator must hold a capability for it ▸ Workers broadcast progress changes to other workers ▸ Each worker independently determines progress 17 A distributed protocol that allows operators reason about the possibility of receiving data
  • 26. MAKING PROGRESS fn main() {
 timely::execute_from_args(std::env::args(), |worker| {
 …
 … … for round in 0..10 {
 input.send(("round".to_owned(), 1));
 input.advance_to(round + 1);
 while probe.less_than(input.time()) {
 worker.step();
 } }
 }).unwrap();
 } 18
  • 27. MAKING PROGRESS fn main() {
 timely::execute_from_args(std::env::args(), |worker| {
 …
 … … for round in 0..10 {
 input.send(("round".to_owned(), 1));
 input.advance_to(round + 1);
 while probe.less_than(input.time()) {
 worker.step();
 } }
 }).unwrap();
 } 18 push data to the input stream
  • 28. MAKING PROGRESS fn main() {
 timely::execute_from_args(std::env::args(), |worker| {
 …
 … … for round in 0..10 {
 input.send(("round".to_owned(), 1));
 input.advance_to(round + 1);
 while probe.less_than(input.time()) {
 worker.step();
 } }
 }).unwrap();
 } 18 advance the input epoch
  • 29. MAKING PROGRESS fn main() {
 timely::execute_from_args(std::env::args(), |worker| {
 …
 … … for round in 0..10 {
 input.send(("round".to_owned(), 1));
 input.advance_to(round + 1);
 while probe.less_than(input.time()) {
 worker.step();
 } }
 }).unwrap();
 } 18 do work while there’s still data for this epoch
  • 30. TIMELY ITERATIONS timely::example(|scope| {
 
 let (handle, stream) = scope.loop_variable(100, 1);
 (0..10).to_stream(scope)
 .concat(&stream)
 .inspect(|x| println!("seen: {:?}", x))
 .connect_loop(handle);
 }); 19
  • 31. TIMELY ITERATIONS timely::example(|scope| {
 
 let (handle, stream) = scope.loop_variable(100, 1);
 (0..10).to_stream(scope)
 .concat(&stream)
 .inspect(|x| println!("seen: {:?}", x))
 .connect_loop(handle);
 }); 19 loop 100 times at most advance timestamps by 1 in each iteration create the feedback loop
  • 32. TIMELY ITERATIONS timely::example(|scope| {
 
 let (handle, stream) = scope.loop_variable(100, 1);
 (0..10).to_stream(scope)
 .concat(&stream)
 .inspect(|x| println!("seen: {:?}", x))
 .connect_loop(handle);
 }); 19 loop 100 times at most advance timestamps by 1 in each iteration create the feedback loop t (t, l1) (t, (l1, l2))
  • 34. TIMELY & I 20 Relationship status: it’s complicated
  • 35. TIMELY & I 20 Relationship status: it’s complicated APIs & libraries performance ecosystem fault-tolerance deployment debugging expressiveness incremental computation performance
  • 37. What-if analysis Real-time datacenter analytics Incremental network routing Online critical path analysis Strymon 22 ‣ Query and analyze state online ‣ Control and enforce configuration ‣ Simulate what-if scenarios ‣ Understand performance
  • 38. What-if analysis Real-time datacenter analytics Incremental network routing Online critical path analysis Strymon 23 ‣ Query and analyze state online ‣ Control and enforce configuration ‣ Simulate what-if scenarios ‣ Understand performance
  • 39. RECONSTRUCTING USER SESSIONS 24 Application A Application B A.1 A.2 A.3 B.1 B.2 B.3 B.4 Time: 2015/09/01 10:03:38.599859 Session ID: XKSHSKCBA53U088FXGE7LD8 Transaction ID: 26-3-11-5-1
  • 40. PERFORMANCE RESULTS 25 Log event A B 1 2 1-1 1-2 1 1-3 1-1 Client Time 1 1-1 1-2 2 Inactivity Transaction IDn-2-8Transaction ‣ Logs from 1263 streams and 42 servers ‣ 1.3 million events/s at 424.3 MB/s ‣ 26ms per epoch vs. 2.1s per epoch with Flink Zaheer Chothia, John Liagouris, Desislava Dimitrova, and Timothy Roscoe. Online Reconstruction of Structural Information from Datacenter Logs. (EuroSys '17). More Results:
  • 41. What-if analysis Real-time datacenter analytics Incremental network routing Online critical path analysis Strymon 26 ‣ Query and analyze state online ‣ Control and enforce configuration ‣ Simulate what-if scenarios ‣ Understand performance
  • 42. ROUTING AS A STREAMING COMPUTATION Network changes G R delta-join iteration Updated rules Forwarding 
 rules 27 ‣ Compute APSP and keep forwarding rules as operator state ‣ Flow requests translate to lookups on that state ‣ Network updates cause re-computation of affected rules only
  • 43. REACTION TO LINK FAILURES ▸ Fail random link ▸ 500 individual runs ▸ 32 threads 28 Desislava C. Dimitrova et. al. Quick Incremental Routing Logic for Dynamic Network Graphs. SIGCOMM Posters and Demos '17 More Results:
  • 44. What-if analysis Real-time datacenter analytics Incremental network routing Online critical path analysis Strymon 29 ‣ Query and analyze state online ‣ Control and enforce configuration ‣ Simulate what-if scenarios ‣ Understand performance
  • 45. ONLINE CRITICAL PATH ANALYSIS 30 input stream output stream periodic snapshot trace snapshot stream analyzer performance summaries Apache Flink, Apache Spark, TensorFlow, Heron, Timely Dataflow
  • 46. TASK SCHEDULING IN APACHE SPARK 31 DRIVER W1 W2 W3 Venkataraman, Shivaram, et al. "Drizzle: Fast and adaptable stream processing at scale." Spark Summit (2016).
  • 47. SCHEDULING BOTTLENECK IN APACHE SPARK 32 0 5 10 15 Snapshot 0.0 0.2 0.4 0.6 0.8 CP 0 5 10 15 Snapshot %weight Processing Scheduling Conventional ProfilingStrymon Profiling Apache Spark: Yahoo! Streaming Benchmark, 16 workers, 8s snapshots
  • 48. SCHEDULING BOTTLENECK IN APACHE SPARK 32 0 5 10 15 Snapshot 0.0 0.2 0.4 0.6 0.8 CP 0 5 10 15 Snapshot %weight Processing Scheduling Conventional ProfilingStrymon Profiling Apache Spark: Yahoo! Streaming Benchmark, 16 workers, 8s snapshots
  • 49. What-if analysis Real-time datacenter analytics Incremental network routing Online critical path analysis Strymon 33 ‣ Query and analyze state online ‣ Control and enforce configuration ‣ Simulate what-if scenarios ‣ Understand performance
  • 50. EVALUATING LOAD BALANCING STRATEGIES ▸ 13k services ▸ ~100K user requests/s ▸ OSPF routing ▸ Weighted Round-Robin load balancing 34 Traffic matrix simulation under topology and load balancing changes
  • 51. WHAT-IF DATAFLOW 35 Logging Servers Topology Discovery Configuration DBs RPC logs (text) Topology changes (graph) Device specs,
 application placement (relational) ETL Load Balancing Routing LB View Construction Routing View Construction Traffic Matrix Construction DC Model construction
  • 53. Strymon 0.1.0 has been released! https://strymon-system.github.io https://github.com/strymon-system strymon-users@lists.inf.ethz.ch Try it out: Send us feedback: 37
  • 54. THE STRYMON TEAM & FRIENDS 38 Vasiliki Kalavri Zaheer Chothia Andrea Lattuada Prof. Timothy Roscoe Sebastian Wicki Moritz HoffmannDesislava Dimitrova John Liagouris Frank McSherry
  • 55. THE STRYMON TEAM & FRIENDS 38 Vasiliki Kalavri Zaheer Chothia Andrea Lattuada Prof. Timothy Roscoe Sebastian Wicki Moritz HoffmannDesislava Dimitrova John Liagouris ? ? Frank McSherry
  • 56. THE STRYMON TEAM & FRIENDS 38 Vasiliki Kalavri Zaheer Chothia Andrea Lattuada Prof. Timothy Roscoe Sebastian Wicki Moritz HoffmannDesislava Dimitrova John Liagouris ? ? Frank McSherry IT COULD BE YOU! strymon.systems.ethz.ch
  • 57. PREDICTIVE DATACENTER ANALYTICS WITH STRYMON Vasia Kalavri kalavriv@inf.ethz.ch Support: QCon San Francisco 14 November 2017