A Multicore Parallelization of Continuous Skyline Queries on Data Streams
Tiziano De Matteis, Salvatore Di Girolamo, Gabriele Mencagli
University of Pisa, Italy
Euro-Par 2015 - Vienna
INTRODUCTION
Skyline queries retrieve the interesting points from a large
dataset according to multiple criteria (the Pareto-optimal points).
Example: “Find cheap hotels near the City Center”
(Figure: hotels plotted by distance and price.)
Traditionally used in static DBMSs, they have become commonplace
in real-time applications that process input data on the fly, such as
financial applications, social network analysis and sensor networks.
INTRODUCTION
(Skyline) queries over data streams are challenging:
○ no control over how elements arrive;
○ because the input is unbounded, the query is evaluated on windows that
contain the most recent tuples;
○ performance requirements in terms of throughput and latency.
Parallelism is unavoidable
Goal: parallelization of continuous skyline queries on multicores:
○ map-reduce pattern implementation;
○ taking into account optimizations such as asynchronous reduce
and load balancing.
PRELIMINARIES
Each point p is represented as a tuple of $d \geq 1$ attributes $\{p_1, p_2, \ldots, p_d\}$
Given two points p and r, we say that p dominates r ($p \prec r$) iff:
$\forall i \in [1,d]: p_i \leq r_i$ and $\exists j : p_j < r_j$
A sliding window is used to maintain the most recent tuples. Its length
is expressed by the user in $T_w$ time units (e.g. seconds, minutes):
○ the skyline at time t is computed over all points arrived in $[t - T_w, t]$;
○ a point p arrived at time $t_p^{arr}$ expires at time $t_p^{exp} = t_p^{arr} + T_w$
Given a set of points, its skyline is the
subset of all the points not dominated by any
other point in the set.
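The dominance test and the skyline definition above can be illustrated with a few lines of Python. This is a minimal sketch under the assumption that smaller attribute values are better (as in the hotel example); the names `dominates`, `skyline` and the sample `hotels` data are hypothetical and not taken from the slides, and the quadratic scan is only meant to mirror the definition, not to be efficient.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], r: Sequence[float]) -> bool:
    """p dominates r: p_i <= r_i for every i and p_j < r_j for some j."""
    return all(a <= b for a, b in zip(p, r)) and any(a < b for a, b in zip(p, r))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Points that are not dominated by any other point of the set."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (distance, price) pairs.
hotels = [(1.0, 120.0), (2.0, 80.0), (3.0, 90.0), (0.5, 200.0), (2.5, 80.0)]
print(skyline(hotels))
```

Here (3.0, 90.0) and (2.5, 80.0) are both dominated by (2.0, 80.0), so they are left out of the result.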
CONTINUOUS SKYLINE OPERATOR
The Skyline Operator has to maintain the skyline set of the points contained in
the current window (i.e. received in the last $T_w$ time units)
(Figure: operator OP with input stream ...v, s, r... and output stream ...(act, p, t)...)
○ the input is a stream of points;
○ the output is a stream of skyline updates, each
indicating whether a point p enters (ADD) or exits
from (DEL) the skyline set at a given time t;
The operator has to maintain the set of live (non-obsolete) points in an
internal spatial data structure (DB), on which it performs insertions, deletions
and searches (e.g. vector, R-Tree, ...)
Two types of activations:
○ external: due to point arrivals;
○ internal: due to point expirations.
EAGER ALGORITHM
Due to Tao and Papadias [2006], it performs most of the work at point
arrival
Definition: the Skyline Influence Time of a point p ($SIT_p$) is the expiration
time of the youngest point r dominating p (i.e. its critical dominator)
The algorithm maintains an event list EL with two types of events:
○ skytime(p,t): indicates the entering of p into the skyline at time t;
○ expire(p,t): indicates the expiration of p at time t.
EXTERNAL ACTIVATION
At the reception of the point p:
1. Pruning: all the points in DB dominated
by p must be removed and their
associated events cleared from EL. DEL
updates for skyline points are emitted;
2. Insertion: the point p is inserted in DB;
3. Search the critical dominator r of p:
○ if it exists, add the event skytime(p, $t_r^{exp}$) to EL;
○ otherwise p is a skyline point: emit an ADD update on the output stream and
add expire(p, $t_p^{exp}$) to EL.
INTERNAL ACTIVATION
The events in EL are processed by using an internal timer. When an
event is triggered:
○ skytime(p,t): the point p is added to the
skyline and an ADD update is emitted.
A new event expire(p, $t_p^{exp}$) is inserted in
EL.
○ expire(p,t): p is removed from DB and a
DEL update is emitted.
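The external and internal activations of the eager algorithm can be sketched as a single-threaded Python class. This is a minimal sketch, not the authors' implementation: the class name `EagerSkyline`, the dictionary used as DB, the binary heap used as EL, the lazy clearing of events that belong to pruned points, and the choice to process expirations before skytime events at equal times are all assumptions made for illustration.

```python
import heapq

EXPIRE, SKYTIME = 0, 1  # assumption: at equal times, expirations are handled first


def dominates(p, r):
    """p dominates r: no worse on every attribute, strictly better on one."""
    return all(a <= b for a, b in zip(p, r)) and any(a < b for a, b in zip(p, r))


class EagerSkyline:
    def __init__(self, window):
        self.window = window  # T_w
        self.db = {}          # DB: point id -> (coordinates, arrival time)
        self.skyline = set()  # ids of the points currently in the skyline
        self.el = []          # EL: heap of (time, kind, point id)
        self.updates = []     # emitted (act, p, t) updates
        self.next_id = 0

    def advance(self, now):
        """Internal activation: trigger every event with time <= now."""
        while self.el and self.el[0][0] <= now:
            t, kind, pid = heapq.heappop(self.el)
            if pid not in self.db:
                continue  # event of a pruned point, cleared lazily
            coords, arrival = self.db[pid]
            if kind == SKYTIME:
                self.skyline.add(pid)
                self.updates.append(("ADD", coords, t))
                heapq.heappush(self.el, (arrival + self.window, EXPIRE, pid))
            else:
                del self.db[pid]
                self.skyline.discard(pid)
                self.updates.append(("DEL", coords, t))

    def insert(self, p, t_arr):
        """External activation for a point p arriving at time t_arr."""
        self.advance(t_arr)
        # 1. Pruning: drop the points dominated by p, emit DEL for skyline points.
        for pid in [i for i, (c, _) in self.db.items() if dominates(p, c)]:
            coords, _ = self.db.pop(pid)
            if pid in self.skyline:
                self.skyline.remove(pid)
                self.updates.append(("DEL", coords, t_arr))
        # 2. Insertion.
        pid, self.next_id = self.next_id, self.next_id + 1
        self.db[pid] = (p, t_arr)
        # 3. Critical dominator: the youngest point of DB that dominates p.
        dominators = [arr for c, arr in self.db.values() if dominates(c, p)]
        if dominators:
            heapq.heappush(self.el, (max(dominators) + self.window, SKYTIME, pid))
        else:
            self.skyline.add(pid)
            self.updates.append(("ADD", p, t_arr))
            heapq.heappush(self.el, (t_arr + self.window, EXPIRE, pid))


op = EagerSkyline(window=10.0)
op.insert((2, 2), 0.0)
op.insert((3, 3), 1.0)  # dominated by (2, 2): enters the skyline when (2, 2) expires
op.insert((1, 4), 2.0)
op.advance(12.0)
for update in op.updates:
    print(update)
```

In this run the point (3, 3) receives a skytime event at time 10.0, the expiration time of its critical dominator (2, 2), and only then produces its ADD update.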
PARALLELIZATION
It is based on a Map pattern with a Reduce phase. DB and EL are
partitioned among a set of Workers
For each received point p the Emitter:
1. assigns the timestamp $t_p^{arr}$ according to the current system time;
2. assigns the ownership of p to a specific Worker;
3. multicasts p to all the Workers.
(Figure: Emitter E, Workers W and Collector C; E receives the stream ...v, s, r... and sends (p,owner) to each W.)
PARALLELIZATION
A generic Worker $W_i$ will:
1. execute the events in $EL_i$ with timestamp
smaller than $t_p^{arr}$ (updates are sent);
2. prune the points dominated by p from $DB_i$;
3. compute the local $SIT_p^i$ on the local $DB_i$ and
send it to the Collector;
4. if it is not the owner, discard the point. Otherwise it has to wait for the result
of the reduce from the Collector:
a. if p does not have a dominator it is a skyline point: expire(p, $t_p^{exp}$) in $EL_i$
and ADD update to the Collector;
b. otherwise add skytime(p, $SIT_p$) to the event list
(Figure: E sends (p,owner) to $W_i$; $W_i$ sends (act,p,t) and $SIT_p^i$ to C; C returns $SIT_p$.)
PARALLELIZATION
The Collector receives two types of messages from Workers:
○ reduce messages: once it has received the local $SIT_p^i$ from all the Workers,
it computes $SIT_p = \max_i \{SIT_p^i\}$ and sends the result to the owner of p;
○ skyline updates: the Collector reorders the updates and transmits
them onto the output stream.
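The reduce on the local SIT values can be sketched as follows. This is only an illustration under stated assumptions: the function names `local_sit` and `reduce_sit` are hypothetical, each partition is modelled as a list of (coordinates, arrival time) pairs, and a Worker without any local dominator is assumed to report `None` (the slides do not say how this case is encoded).

```python
from typing import List, Optional, Sequence, Tuple

Entry = Tuple[Tuple[float, ...], float]  # (coordinates, arrival time)


def dominates(p, r):
    """p dominates r: no worse on every attribute, strictly better on one."""
    return all(a <= b for a, b in zip(p, r)) and any(a < b for a, b in zip(p, r))


def local_sit(partition: Sequence[Entry], p, window: float) -> Optional[float]:
    """Worker side: expiration time of the youngest local dominator of p."""
    arrivals = [arr for coords, arr in partition if dominates(coords, p)]
    return max(arrivals) + window if arrivals else None


def reduce_sit(local_values: Sequence[Optional[float]]) -> Optional[float]:
    """Collector side: SIT_p is the maximum of the local values, if any exists."""
    known = [v for v in local_values if v is not None]
    return max(known) if known else None


# Three hypothetical partitions DB_1, DB_2, DB_3 and a new point p.
partitions: List[List[Entry]] = [
    [((1, 1), 0.0)],
    [((2, 2), 3.0)],
    [((9, 0), 4.0)],
]
p = (3, 3)
locals_ = [local_sit(db, p, window=10.0) for db in partitions]
print(locals_, reduce_sit(locals_))
```

A result of `None` from `reduce_sit` corresponds to the case in which p has no dominator and is therefore a skyline point.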
Straightforward solution… but it has two main problems:
○ synchronous reduce phase: the owner has to wait for a reply from the
Collector;
○ load unbalancing due to pruning and unsuitable owner selection
policies.
ASYNCHRONOUS REDUCE
The reduce can also be done asynchronously.
Each Worker waits for a message from the Emitter (points) or from
the Collector (reduce results). When the reduce result is received, $EL_k$
is properly updated
$W_k$ (owner of point p) can process subsequent points while $SIT_p$ is not yet
available. For each such point r it:
○ searches the youngest dominator of r in $DB_k$ (independent of $SIT_p$);
○ prunes all the points dominated by r in $DB_k$: if p is one of them, when the
SIT is received it produces the proper updates.
OWNER SELECTION POLICIES
The Emitter has to assign the ownership of points so as to keep the $DB_i$
evenly sized.
Four heuristics, independent of the spatial coordinates of the points:
○ Round Robin (RR): ownership is interleaved among Workers;
○ On Demand (OD): ownership is assigned to the first Worker able to
accommodate the point in its input queue;
○ Least Loaded Worker (LLW): the point is assigned to the Worker with the
smallest $DB_i$;
○ Least Loaded Worker with Ownership (LLW+): for each Worker we also take
into account the number of enqueued points for which it has been
designated as the owner.
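Three of the heuristics above can be sketched in a few lines of Python. This is a hedged illustration, not the Emitter's actual code: the function names are hypothetical, and LLW+ is interpreted here as adding the number of enqueued owned points to the current size of $DB_i$, which is one plausible reading of the slide. On Demand is omitted because it depends on the state of the Workers' input queues.

```python
from itertools import count
from typing import Callable, Sequence


def round_robin(n_workers: int) -> Callable[[], int]:
    """RR: returns a function that yields owners 0, 1, ..., n-1, 0, 1, ..."""
    counter = count()
    return lambda: next(counter) % n_workers


def least_loaded(db_sizes: Sequence[int]) -> int:
    """LLW: the Worker with the smallest DB_i (first one on ties)."""
    return min(range(len(db_sizes)), key=lambda i: db_sizes[i])


def least_loaded_with_ownership(db_sizes: Sequence[int],
                                pending_owned: Sequence[int]) -> int:
    """LLW+ (assumed reading): also count enqueued points already owned."""
    return min(range(len(db_sizes)), key=lambda i: db_sizes[i] + pending_owned[i])


next_owner = round_robin(3)
rr_owners = [next_owner() for _ in range(4)]

db_sizes = [10, 7, 9]      # hypothetical |DB_i| values
pending_owned = [0, 4, 1]  # hypothetical enqueued points per owner
print(rr_owners,
      least_loaded(db_sizes),
      least_loaded_with_ownership(db_sizes, pending_owned))
```

With these numbers LLW picks Worker 1, while LLW+ avoids it because four points owned by Worker 1 are still waiting in its queue.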
EXPERIMENTS
A prototype implementation of the parallelization has been developed targeting
shared-memory architectures:
○ parallel entities have been implemented as pthreads, pinned on cores;
○ they interact through non-blocking lock-free queues provided by the
FastFlow library.
Target architecture: dual-CPU Intel Sandy Bridge Xeon E5-2650,
16 cores (32 with HT) running at 2 GHz, 32 GB of RAM
In addition to the entities required by the
parallelization, we have a Generator thread and a
Consumer thread. Therefore we can have up
to 12 Workers (if we don’t use HT)
(Figure: Generator G, Emitter E, Workers w, Collector C, Consumer C.)
EXPERIMENTAL EVALUATION
To study the effect of pruning, we considered three different point distributions
○ in all cases the number of points in DB is three orders of magnitude lower
than the number of points received;
○ pruning saves memory at the expense of increased processing time per point.
OWNER SELECTION POLICIES
We use a configuration with 4.5K non-obsolete points distributed among 12
Workers. We measured the difference $|DB_{max}| - |DB_{min}|$. Independent distribution, point dimension = 5
Strategy   avg     variance
RR         66.02   229
OD         34.13   1791
LLW        3.15    4.28
LLW+       2.55    4.05
○ load-aware policies obtain a smaller average
with lower variance;
○ LLW+ is able to achieve a 20% improvement wrt LLW
ASYNCHRONOUS REDUCE
We measure the benefit of the asynchronous reduce on throughput, with LLW+
Scenario with a rate of 100K points/sec, anticorrelated distribution and $T_w$ = 20s
The average gain is ~10% (higher with a high parallelism degree)
THROUGHPUT AND SCALABILITY
Different execution scenarios for different point distributions
Distribution     $T_w$   rate
Anticorrelated   10s     80Kp/s
Independent      10s     100Kp/s
Correlated       60s     250Kp/s
        Ant.     Indep.   Corr.
B(12)   28Kp/s   78Kp/s   237Kp/s
S(12)   11.65    10.7     8.16
|DB|    4598     4226     1192
CONCLUSIONS
We have presented a map-reduce parallelization of the skyline
operator on data streams, optimized with respect to:
○ reduce phase: asynchronous reduce;
○ owner-selection policies.
Both improved the performance of the solution
Future work:
○ investigate the correlated case;
○ enhance the implementation with autonomic features.
Thank you!
Questions?
BKP-REORDERING WITH SYNC. REDUCE
If we adopt a synchronous reduce:
○ the results are produced by each Worker in order;
○ but the Collector has to reorder them to respect their
chronological order. To do that, the Collector:
○ buffers the updates and keeps them ordered by timestamp using a
priority queue;
○ maintains the timestamp of the last received update from each
Worker;
○ the buffered updates with timestamp smaller than or equal to $\min_i \{\text{lst-t}_i\}$
can be safely transmitted
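The reordering logic of the Collector can be sketched with a priority queue. This is a minimal sketch assuming that every Worker sends its own updates in timestamp order (the synchronous-reduce case); the class name `ReorderBuffer` and the `(act, p, t)` tuple layout are hypothetical choices for the example.

```python
import heapq
from typing import List, Tuple

Update = Tuple[str, str, float]  # (act, p, t)


class ReorderBuffer:
    def __init__(self, n_workers: int):
        # lst-t_i: timestamp of the last update received from Worker i.
        self.last_ts = [float("-inf")] * n_workers
        self.heap = []  # buffered updates ordered by (timestamp, arrival order)
        self.seq = 0    # tie-breaker that keeps arrival order for equal timestamps

    def receive(self, worker: int, update: Update) -> List[Update]:
        """Buffer one update and return those that can be safely transmitted."""
        ts = update[2]
        self.last_ts[worker] = ts
        heapq.heappush(self.heap, (ts, self.seq, update))
        self.seq += 1
        safe = min(self.last_ts)  # no Worker can still send anything older
        out = []
        while self.heap and self.heap[0][0] <= safe:
            out.append(heapq.heappop(self.heap)[2])
        return out


buf = ReorderBuffer(n_workers=2)
r1 = buf.receive(0, ("ADD", "a", 1.0))
r2 = buf.receive(0, ("ADD", "b", 3.0))
r3 = buf.receive(1, ("DEL", "c", 2.0))
r4 = buf.receive(1, ("ADD", "d", 5.0))
print(r1, r2, r3, r4)
```

Nothing is released until every Worker has been heard from at least once, which is why the first two calls return an empty list in this example.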
BKP-REORDERING WITH ASYNC. REDUCE
With an asynchronous reduce, the updates produced by each Worker can be
out of order: for example, the point p has to be inserted in the skyline
by the owner, but while waiting for its reduce result other points
arrive and updates with a greater timestamp may be produced.
The solution that we have adopted is to use punctuations:
○ when the Worker receives a result for the reduce, it states to the
Collector that all the future results will have a timestamp greater
than that;
○ the Collector uses this information to reorder the results (clearly a little
bit slower than with sync. reduce)

 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 

Multicore Parallel Skyline Queries on Data Streams

  • 1. A Multicore Parallelization of Continuous Skyline Queries on Data Streams. Tiziano De Matteis, Salvatore Di Girolamo, Gabriele Mencagli. University of Pisa, Italy. Euro-Par 2015, Vienna.
  • 2. INTRODUCTION Skyline queries retrieve the interesting points from a large dataset according to multiple criteria (the Pareto-optimal points). Example: “Find cheap hotels near the City Center” (figure: hotels plotted by distance and price). Traditionally used in static DBMSs, they have become commonplace in real-time applications that process input data on the fly, such as financial applications, social network analysis and sensor networks.
  • 3. INTRODUCTION (Skyline) queries over data streams are challenging:
      ○ no control on how elements arrive;
      ○ due to unbounded input, the query is evaluated on windows that contain the most recent tuples;
      ○ performance requirements in terms of throughput and latency.
    Parallelism is unavoidable.
    Goal: parallelization of continuous skyline queries on multicores:
      ○ map-reduce pattern implementation;
      ○ taking into account optimizations such as asynchronous reduce and load balancing.
  • 4. PRELIMINARIES Each point p is represented as a tuple of $d \ge 1$ attributes $\{p_1, p_2, \dots, p_d\}$.
    Given two points p and r, we say that p dominates r ($p \prec r$) iff $\forall i \in [1,d]:\ p_i \le r_i$ and $\exists j:\ p_j < r_j$.
    Given a set of points, its skyline is the subset of all the points not dominated by any other point in the set.
    A sliding window is used to maintain the most recent tuples. Its length is expressed by the user in $T_w$ time units (e.g. seconds, minutes):
      ○ the skyline at time t is computed over all points arrived in $[t - T_w, t]$;
      ○ a point p arrived at time $t_p^{arr}$ expires at time $t_p^{exp} = t_p^{arr} + T_w$.
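The definitions on this slide can be sketched in a few lines of Python. This is a naive quadratic illustration, not the authors' implementation; the names `dominates`, `skyline` and `window_skyline` are hypothetical, and smaller attribute values are assumed to be better, as in the hotel example.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def dominates(p: Point, r: Point) -> bool:
    """p dominates r: p_i <= r_i for every i, and p_j < r_j for some j."""
    return all(a <= b for a, b in zip(p, r)) and any(a < b for a, b in zip(p, r))


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def window_skyline(timed: List[Tuple[float, Point]], t: float, tw: float) -> List[Point]:
    """Skyline at time t over the points that arrived in [t - tw, t].

    `timed` is a list of (arrival_time, point) pairs (an assumed layout).
    """
    live = [p for (t_arr, p) in timed if t - tw <= t_arr <= t]
    return skyline(live)


# Hotels as (distance, price): (3.0, 70.0) is dominated by (2.0, 60.0).
hotels = [(1.0, 90.0), (2.0, 60.0), (3.0, 70.0), (4.0, 40.0)]
result = skyline(hotels)
```

Here `result` keeps the three hotels that are not beaten on both distance and price by any other hotel.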
  • 5. CONTINUOUS SKYLINE OPERATOR The Skyline Operator (OP) has to maintain the skyline set of the points contained in the current window (i.e. received in the last $T_w$ time units):
      ○ in input we have a stream of points (...v, s, r...);
      ○ in output we have a stream of skyline updates (act, p, t), which indicate whether a point p enters (ADD) or exits (DEL) the skyline set at a given time t.
    The operator has to maintain the set of live (non-obsolete) points in an internal spatial data structure (DB), on which it performs insertions, deletions and searches (e.g. vector, R-Tree, ...).
    Two types of activations:
      ○ external: due to point arrivals;
      ○ internal: due to point expirations.
  • 6. EAGER ALGORITHM Due to Tao and Papadias [2006], it performs most of the work at point arrival.
    Definition: the Skyline Influence Time of a point p ($SIT_p$) is the expiring time of the youngest point r dominating p (i.e. its critical dominator).
    The algorithm maintains an event list EL with two types of events:
      ○ skytime(p,t): indicates the entering of p into the skyline at time t;
      ○ expire(p,t): indicates the expiring of p at time t.
  • 7. EXTERNAL ACTIVATION At the reception of the point p:
      1. Pruning: all the points in DB dominated by p must be removed and their associated events cleared from EL. DEL updates for skyline points are emitted;
      2. Insertion: the point p is inserted in DB;
      3. Search for the critical dominator r of p:
          ○ if it exists, add the event skytime(p, $t_r^{exp}$) to EL;
          ○ otherwise p is a skyline point: emit an ADD update on the output stream and add expire(p, $t_p^{exp}$) to EL.
  • 8. INTERNAL ACTIVATION The events in EL are processed by using an internal timer. When an event is triggered:
      ○ skytime(p,t): the point p is added to the skyline and an ADD update is emitted. A new event expire(p, $t_p^{exp}$) is inserted in EL;
      ○ expire(p,t): p is removed from DB and a DEL update is emitted.
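The external and internal activations can be put together in a small sequential sketch. It is only an illustration under simplifying assumptions: DB is a plain dictionary scanned linearly instead of a spatial index, events of pruned points are skipped lazily rather than removed from EL, and the caller drives time explicitly instead of using a timer. The class name `EagerSkyline` and its methods are hypothetical.

```python
import heapq
from itertools import count


def dominates(p, r):
    return all(a <= b for a, b in zip(p, r)) and any(a < b for a, b in zip(p, r))


class EagerSkyline:
    def __init__(self, tw):
        self.tw = tw
        self.db = {}        # point id -> (coords, expiring time)
        self.sky = set()    # ids of the points currently in the skyline
        self.el = []        # event list: heap of (time, rank, seq, kind, id)
        self._seq = count()
        self._ids = count()

    def _push(self, t, kind, pid):
        # At equal timestamps, expire events (rank 0) fire before skytime ones.
        rank = 0 if kind == "expire" else 1
        heapq.heappush(self.el, (t, rank, next(self._seq), kind, pid))

    def advance(self, now):
        """Internal activation: fire every event with timestamp <= now."""
        out = []
        while self.el and self.el[0][0] <= now:
            t, _, _, kind, pid = heapq.heappop(self.el)
            if pid not in self.db:          # event of a pruned point
                continue
            if kind == "skytime":
                self.sky.add(pid)
                out.append(("ADD", self.db[pid][0], t))
                self._push(self.db[pid][1], "expire", pid)
            else:                           # expire
                coords, _ = self.db.pop(pid)
                self.sky.discard(pid)
                out.append(("DEL", coords, t))
        return out

    def insert(self, coords, t_arr):
        """External activation: returns the list of (act, p, t) updates."""
        out = self.advance(t_arr)
        # 1. Pruning: drop every stored point dominated by the new one.
        for pid in [i for i, (c, _) in self.db.items() if dominates(coords, c)]:
            c, _ = self.db.pop(pid)
            if pid in self.sky:
                self.sky.remove(pid)
                out.append(("DEL", c, t_arr))
        # 2. Insertion.
        pid, t_exp = next(self._ids), t_arr + self.tw
        self.db[pid] = (coords, t_exp)
        # 3. Critical dominator: the dominator that expires last.
        dom = [e for i, (c, e) in self.db.items() if i != pid and dominates(c, coords)]
        if dom:
            self._push(max(dom), "skytime", pid)
        else:
            self.sky.add(pid)
            out.append(("ADD", coords, t_arr))
            self._push(t_exp, "expire", pid)
        return out
```

A point that arrives while dominated produces no output at first; it enters the skyline when its critical dominator expires, unless a later arrival prunes it.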
  • 9. PARALLELIZATION It is based on a Map pattern with a Reduce phase: an Emitter (E) feeds a set of Workers (W), which send their results to a Collector (C). DB and EL are partitioned among the Workers.
    For each received point p, the Emitter:
      1. assigns the timestamp $t_p^{arr}$ according to the current system time;
      2. assigns the ownership of p to a specific Worker;
      3. multicasts the pair (p, owner) to all the Workers.
  • 10. PARALLELIZATION On receiving (p, owner) from the Emitter, a generic Worker $W_i$ will:
      1. execute the events in $EL_i$ with timestamp smaller than $t_p^{arr}$ (the resulting updates (act, p, t) are sent to the Collector);
      2. prune the points dominated by p from $DB_i$;
      3. compute the local $SIT_p^i$ on its local $DB_i$ and send it to the Collector;
      4. if it is not the owner, discard the point. Otherwise it has to wait for the result of the reduce ($SIT_p$) from the Collector:
          a. if p does not have a dominator, it is a skyline point: add expire(p, $t_p^{exp}$) to $EL_i$ and send an ADD update to the Collector;
          b. otherwise add skytime(p, $SIT_p$) to the event list.
  • 11. PARALLELIZATION The Collector receives two types of messages from the Workers:
      ○ reduce messages: once it has received the local $SIT_p^i$ from all the Workers, it computes $SIT_p = \max_i \{SIT_p^i\}$ and sends the result to the owner of p;
      ○ skyline updates: the Collector reorders the updates and transmits them onto the output stream.
    A straightforward solution, but it has two main problems:
      ○ synchronous reduce phase: the owner has to wait for a reply from the Collector;
      ○ load imbalance due to pruning and to wrong owner selection policies.
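The map and reduce steps on $SIT_p$ can be illustrated with a sequential sketch, where each partition plays the role of one Worker's $DB_i$. The function names and the use of `None` for "no local dominator" are assumptions made for the example; the actual implementation exchanges these values as messages between threads.

```python
def dominates(p, r):
    return all(a <= b for a, b in zip(p, r)) and any(a < b for a, b in zip(p, r))


def local_sit(partition, p):
    """Map step on one Worker: expiring time of the youngest local dominator of p.

    `partition` is a list of (coords, expiring_time) pairs; returns None when
    no local point dominates p.
    """
    expirations = [t_exp for coords, t_exp in partition if dominates(coords, p)]
    return max(expirations) if expirations else None


def reduce_sit(local_sits):
    """Reduce step on the Collector: SIT_p is the max of the local values."""
    values = [s for s in local_sits if s is not None]
    return max(values) if values else None


# Three partitions standing in for DB_0, DB_1 and DB_2.
partitions = [
    [((1, 1), 12.0), ((5, 5), 15.0)],
    [((2, 2), 14.0)],
    [((9, 0), 20.0)],
]
p = (3, 3)
sit_p = reduce_sit(local_sit(part, p) for part in partitions)
```

A result of `None` corresponds to case (a) on the previous slide: p has no dominator and is immediately a skyline point.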
  • 12. ASYNCHRONOUS REDUCE The reduce can also be done in an asynchronous fashion. Each Worker waits for a message either from the Emitter (points) or from the Collector (reduce results). When a reduce result is received, $EL_k$ is updated accordingly.
    $W_k$ (the owner of point p) can process subsequent points while $SIT_p$ is not yet available. For each such point r, it:
      ○ searches for the youngest dominator of r in $DB_k$ (independent of $SIT_p$);
      ○ prunes all the points dominated by r in $DB_k$: if p is one of them, the proper updates are produced when $SIT_p$ is received.
  • 13. OWNER SELECTION POLICIES The Emitter has to assign the ownership of points so as to keep the $DB_i$ evenly sized. Four heuristics, independent of the spatial coordinates of the points:
      ○ Round Robin (RR): ownership is interleaved among the Workers;
      ○ On Demand (OD): ownership is assigned to the first Worker able to accommodate the point in its input queue;
      ○ Least Loaded Worker (LLW): the point is assigned to the Worker with the smallest $DB_i$;
      ○ Least Loaded Worker with Ownership (LLW+): for each Worker we also take into account the number of enqueued points for which it has been designated as the owner.
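Three of the four policies can be sketched as functions that map the current load information to a Worker index. The signature `(db_sizes, pending_owned)` is hypothetical, and the LLW+ rule below assumes that the enqueued owned points are simply added to the size of $DB_i$; the slides do not state the exact formula. OD is omitted because it depends on the state of the Workers' input queues.

```python
from itertools import count


def make_round_robin(n_workers):
    """RR: ownership is interleaved among the Workers."""
    counter = count()
    return lambda db_sizes, pending_owned: next(counter) % n_workers


def least_loaded(db_sizes, pending_owned):
    """LLW: pick the Worker with the smallest DB_i."""
    return min(range(len(db_sizes)), key=lambda i: db_sizes[i])


def least_loaded_plus(db_sizes, pending_owned):
    """LLW+: also count enqueued points already assigned to each Worker.

    Assumption: the two quantities are summed.
    """
    return min(range(len(db_sizes)), key=lambda i: db_sizes[i] + pending_owned[i])


db_sizes = [5, 3, 4]        # |DB_i| of each Worker
pending_owned = [0, 3, 0]   # owned points still waiting in each input queue
rr = make_round_robin(3)
choices = (rr(db_sizes, pending_owned), least_loaded(db_sizes, pending_owned),
           least_loaded_plus(db_sizes, pending_owned))
```

In this example LLW picks Worker 1 because its partition is the smallest, while LLW+ picks Worker 2 because Worker 1 already has three owned points in flight.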
  • 14. EXPERIMENTS A prototype implementation of the parallelization has been developed for shared-memory architectures:
      ○ parallel entities have been implemented as pthreads, pinned on cores;
      ○ they interact through non-blocking lock-free queues provided by the FastFlow library.
    Target architecture: dual-CPU Intel Sandy Bridge Xeon E5-2650, 16 cores (32 with HT) running at 2GHz, 32 GB of RAM.
    In addition to the entities required by the parallelization (Emitter, Workers, Collector), we have a Generator thread and a Consumer thread. Therefore we can have up to 12 Workers (if we don’t use HT).
  • 15. EXPERIMENTAL EVALUATION To study the effect of pruning, we considered three different point distributions:
      ○ in every case the number of points in DB is three orders of magnitude lower than the number of points received;
      ○ pruning saves memory at the expense of increased processing time per point.
  • 16. OWNER SELECTION POLICIES We use a configuration with 4.5K non-obsolete points distributed over 12 Workers. We measured the gap $|DB_{max}| - |DB_{min}|$. Independent distribution, point dimension = 5.
      Strategy    avg      variance
      RR          66.02    229
      OD          34.13    1791
      LLW         3.15     4.28
      LLW+        2.55     4.05
      ○ load-aware policies obtain a smaller avg with lower variance;
      ○ LLW+ is able to achieve a 20% improvement with respect to LLW.
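As a quick check of the last bullet, assuming the improvement is computed on the average gap reported in the table:

```latex
\[
\frac{\mathrm{avg}_{\mathrm{LLW}} - \mathrm{avg}_{\mathrm{LLW+}}}{\mathrm{avg}_{\mathrm{LLW}}}
  = \frac{3.15 - 2.55}{3.15} \approx 0.19
\]
```

This is about 19%, consistent with the rounded 20% figure.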
  • 17. ASYNCHRONOUS REDUCE We measure the benefit of the asynchronous reduce on throughput, with the LLW+ policy. Scenario: rate of 100K points/sec, anticorrelated distribution and $T_w$ = 20s. The average gain is ~10% (higher at high parallelism degrees).
  • 18. THROUGHPUT AND SCALABILITY Different execution scenarios for different point distributions:
      ○ Anticorrelated: $T_w$ = 10s, rate = 80Kp/s;
      ○ Independent: $T_w$ = 10s, rate = 100Kp/s;
      ○ Correlated: $T_w$ = 60s, rate = 250Kp/s.
                  Ant.       Indep.     Corr.
      B(12)       28Kp/s     78Kp/s     237Kp/s
      S(12)       11.65      10.7       8.16
      |DB|        4598       4226       1192
  • 19. CONCLUSIONS We have presented a map-reduce parallelization of the skyline operator on data streams, optimized with respect to:
      ○ the reduce phase: asynchronous reduce;
      ○ the owner selection policies.
    Both of them improved the performance of the solution.
    Future work:
      ○ investigate the correlated case;
      ○ enhance the implementation with autonomic features.
  • 21. BKP-REORDERING WITH SYNC. REDUCE If we adopt a synchronous reduce:
      ○ the results are produced by each Worker in order;
      ○ but the Collector has to reorder them so as to respect their chronological order.
    To do that, the Collector:
      ○ buffers the updates and keeps them ordered by timestamp using a priority queue;
      ○ maintains the timestamp of the last update received from each Worker ($\text{lst-t}_i$);
      ○ can safely transmit the buffered updates with a timestamp smaller than or equal to $\min_i \{\text{lst-t}_i\}$.
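The reordering scheme described above can be sketched with a binary heap. The class name `Reorderer` and its interface are hypothetical, and the sketch assumes, as the slide does for the synchronous reduce, that each Worker sends its own updates in timestamp order.

```python
import heapq


class Reorderer:
    """Collector-side buffer that releases updates in timestamp order."""

    def __init__(self, n_workers):
        # Timestamp of the last update received from each Worker.
        self.last = [float("-inf")] * n_workers
        self.heap = []   # buffered updates, ordered by (timestamp, arrival order)
        self.seq = 0     # tie-breaker so the heap never compares payloads

    def receive(self, worker, update):
        """Buffer `update` = (act, p, t) and return the updates safe to emit."""
        _, _, t = update
        self.last[worker] = t
        heapq.heappush(self.heap, (t, self.seq, update))
        self.seq += 1
        # No Worker can still send an update older than the minimum of the
        # last timestamps, so everything up to that point can be released.
        watermark = min(self.last)
        out = []
        while self.heap and self.heap[0][0] <= watermark:
            out.append(heapq.heappop(self.heap)[2])
        return out


collector = Reorderer(n_workers=2)
emitted = []
emitted.append(collector.receive(0, ("ADD", "a", 5)))
emitted.append(collector.receive(1, ("ADD", "b", 3)))
emitted.append(collector.receive(1, ("DEL", "c", 7)))
emitted.append(collector.receive(0, ("ADD", "d", 9)))
```

Note that in this sketch a Worker that never produces an update keeps the minimum at its initial value and holds back the output.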
  • 22. BKP-REORDERING WITH ASYNC. REDUCE With an asynchronous reduce, the updates produced by each Worker can be out of order: for example, the point p has to be inserted in the skyline by its owner, but while the owner waits for the reduce result other points arrive and updates with a greater timestamp may be produced.
    The solution that we have adopted is to use punctuations:
      ○ when a Worker receives a reduce result, it states to the Collector that all its future updates will have a timestamp greater than that one;
      ○ the Collector uses this information to reorder the updates (clearly a little slower than with the sync. reduce).