A Parallel Data
Distribution Management
Algorithm
Gabriele D’Angelo
<g.dangelo@unibo.it>
http://www.cs.unibo.it/gdangelo/
...
Presentation outline


Data Distribution Management (DDM)



DDM algorithms



Parallel Data Distribution Management

...
Data Distribution Management (DDM)


DDM services are part of the IEEE 1516 “High Level
Architecture” (HLA) specification...
DDM: simple example in 2 dimensions
dimension 2

S1

U1

S3
U2

S2

dimension 1



The problem of identifying whether two...
DDM algorithms: from simple to complex


No DDM: all updates are broadcasted, the filtering is on
receivers. CONS: commun...
DDM algorithms: from simple to complex


Sort-Based Matching (SBM): before matching, the extents
are sorted. In this way,...
DDM algorithms: from simple to complex


Sort-Based Matching (SBM): before matching, the extents
are sorted. In this way,...
DDM algorithms: from simple to complex


Sort-Based Matching (SBM): before matching, the extents
are sorted. In this way,...
Parallel Data Distribution Management


Easy solution: each core (or CPU) is responsible of a specific
dimension. The “ma...
Parallel Data Distribution Management


Easy solution: each core (or CPU) is responsible of a specific
dimension. The “ma...
Parallel Data Distribution Management


Easy solution: each core (or CPU) is responsible of a specific
dimension. The “ma...
Parallel Data Distribution Management


Easy solution: each core (or CPU) is responsible of a specific
dimension. The “ma...
Parallel Data Distribution Management


Easy solution: each core (or CPU) is responsible of a specific
dimension. The “ma...
Parallel matching algorithms


Brute Force (BF): inefficient, embarrassingly parallel



Sort-Based Matching (SBM): effi...
Our proposal: Interval Tree Matching (ITM)


Based on the simple Interval Tree data structure



It can be implemented u...
Our proposal: Interval Tree Matching (ITM)


Based on the simple Interval Tree data structure



It can be implemented u...
Our proposal: Interval Tree Matching (ITM)


Based on the simple Interval Tree data structure



It can be implemented u...
Interval Tree Matching (ITM): algorithm description


Support function INTERVAL-QUERY:
returns the list of intervals inte...
Interval Tree Matching (ITM): comparison


Support function INTERVAL-QUERY:
returns the list of intervals intersecting a ...
Interval Tree Matching (ITM): comparison


Support function INTERVAL-QUERY:
returns the list of intervals intersecting a ...
Interval Tree Matching (ITM): comparison


Support function INTERVAL-QUERY:
returns the list of intervals intersecting a ...
Experimental evaluation


The very same methodology used in [Raczy et al. 2005]



Instances with a single dimension are...
Experimental evaluation


The very same methodology used in [Raczy et al. 2005]



Instances with a single dimension are...
Experimental evaluation: sequential execution
α=0.01

α=1

α=100



BF is very slow



SBM is unaffected by



ITM has ...
Experimental evaluation: parallel execution

A Parallel Data Distribution Management Algoriithm

DS-RT 2013

Gabriele D'An...
Experimental evaluation: parallel execution

A Parallel Data Distribution Management Algoriithm

DS-RT 2013

Gabriele D'An...
Experimental evaluation: parallel execution

Execution platform with 4
physical cores but with
Hyper-Threading

A Parallel...
Experimental evaluation: parallel execution

A Parallel Data Distribution Management Algoriithm

DS-RT 2013

Gabriele D'An...
Experimental evaluation: parallel execution

A Parallel Data Distribution Management Algoriithm

DS-RT 2013

Gabriele D'An...
Experimental evaluation: parallel execution

BF is embarrassingly parallel, in the range [1, 4] (physical
cores) both algo...
Experimental evaluation: parallel execution

In the range [5, 8] (Hyper-Threading logical cores) ITM is
better than BF. HT...
Experimental evaluation: parallel execution

With more threads than logical cores there is a
significant performance drop
...
Experimental evaluation: parallel execution

8 threads

As expected, the ITM depends on both the number
of extents and the...
Conclusions


We've proposed a new parallel Data Distribution Management
(DDM) mechanism called Interval Trees Matching (...
Further information
Moreno Marzolla, Gabriele D'Angelo, Marco Mandrioli
A Parallel Data Distribution Management Algorithm
...
A Parallel Data
Distribution Management
Algorithm
Gabriele D’Angelo
<g.dangelo@unibo.it>
http://www.cs.unibo.it/gdangelo/
...
Upcoming SlideShare
Loading in …5
×

A Parallel Data Distribution Management Algorithm

670 views

Published on

Identifying intersections among a set of d-dimensional rectangular regions (d-rectangles) is a common problem in many simulation and modeling applications. Since algorithms for computing intersections over a large number of regions can be computationally demanding, an obvious solution is to take advantage of the multiprocessing capabilities of modern multicore processors. Unfortunately, many solutions employed for the Data Distribution Management service of the High Level Architecture are either inefficient, or can only partially be parallelized. In this paper we propose the Interval Tree Matching (ITM) algorithm for computing intersections among d-rectangles. ITM is based on a simple Interval Tree data structure, and exhibits an embarrassingly parallel structure. We implement the ITM algorithm, and compare its sequential performance with two widely used solutions (brute force and sort-based matching). We also analyze the scalability of ITM on shared-memory multicore processors. The results show that the sequential implementation of ITM is competitive with sort-based matching; moreover, the parallel implementation provides good speedup on multicore processors.

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
670
On SlideShare
0
From Embeds
0
Number of Embeds
96
Actions
Shares
0
Downloads
0
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

A Parallel Data Distribution Management Algorithm

  1. 1. A Parallel Data Distribution Management Algorithm Gabriele D’Angelo <g.dangelo@unibo.it> http://www.cs.unibo.it/gdangelo/ joint work with: Moreno Marzolla and Marco Mandrioli Delft, Netherlands Distributed Simulation and Real Time Applications (DS-RT), 2013
  2. 2. Presentation outline  Data Distribution Management (DDM)  DDM algorithms  Parallel Data Distribution Management  Parallel Matching Algorithms  Interval Tree Matching (ITM)  Experimental Evaluation  Conclusions A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 2 / 36
  3. 3. Data Distribution Management (DDM)  DDM services are part of the IEEE 1516 “High Level Architecture” (HLA) specification  DDM is about forwarding events generated on update regions to a set of subscription regions  A region is a d-dimensional rectangle (d-rectangle) in a ddimensional routing space  The goal is to find all the couples of update and subscription regions that overlap  We assume that each region corresponds to a single extent A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 3 / 36
  4. 4. DDM: simple example in 2 dimensions dimension 2 S1 U1 S3 U2 S2 dimension 1  The problem of identifying whether two d-rectangles intersect can be reduced to d independent intersection problems among one-dimensional segments  Two d-rectangles intersect if and only if they intersect among all their projections A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 4 / 36
  5. 5. DDM algorithms: from simple to complex  No DDM: all updates are broadcasted, the filtering is on receivers. CONS: communication inefficiency  Region-based matching (Brute Force, BF): it tests all the subscription-update pairs. CONS: computational inefficiency  Grid-based matching: it partitions the routing space into a grid of d-dimensional cells and finds matches CONS: S1  U1 S3 S2 false positives  which cell size is the best? U2 A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 5 / 36
  6. 6. DDM algorithms: from simple to complex  Sort-Based Matching (SBM): before matching, the extents are sorted. In this way, the overlaps can be found efficiently. In many cases SBM does better than grid-based [Raczy et al. 2005].  CONS: are there drawbacks? Binary partition-based: recursive binary partitioning of the extents. CONS: (more than) quadratic worst case cost A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 6 / 36
  7. 7. DDM algorithms: from simple to complex  Sort-Based Matching (SBM): before matching, the extents are sorted. In this way, the overlaps can be found efficiently. In many cases SBM does better than grid-based [Raczy et al. 2005].  CONS: are there drawbacks? Binary partition-based: recursive binary partitioning of the extents. CONS: (more than) quadratic worst case cost  … and so, what's the problem? A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 7 / 36
  8. 8. DDM algorithms: from simple to complex  Sort-Based Matching (SBM): before matching, the extents are sorted. In this way, the overlaps can be found efficiently. In many cases SBM does better than grid-based [Raczy et al. 2005].  CONS: are there drawbacks? Binary partition-based: recursive binary partitioning of the extents. CONS: (more than) quadratic worst case cost  … and so, what's the problem?  Even smartphones are multi-core and we're still dealing with sequential algorithms  The future of computing is very likely to be many-core A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 8 / 36
  9. 9. Parallel Data Distribution Management  Easy solution: each core (or CPU) is responsible of a specific dimension. The “master” core computes the final result dimension 2 S1 U1 S3 S2 U2 dimension 1 A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 9 / 36
  10. 10. Parallel Data Distribution Management  Easy solution: each core (or CPU) is responsible of a specific dimension. The “master” core computes the final result dimension 2 S1 U1 S3 core 1 → result 1 A Parallel Data Distribution Management Algoriithm S2 U2 dimension 1 DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 10 / 36
  11. 11. Parallel Data Distribution Management  Easy solution: each core (or CPU) is responsible of a specific dimension. The “master” core computes the final result dimension 2 core 2 → result 2 S1 U1 S3 core 1 → result 1 A Parallel Data Distribution Management Algoriithm S2 U2 dimension 1 DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 11 / 36
  12. 12. Parallel Data Distribution Management  Easy solution: each core (or CPU) is responsible of a specific dimension. The “master” core computes the final result dimension 2 core 2 → result 2 S1 ∩ core 1 → core 1 → U1 S3 result 1 S2 U2 dimension 1 final result A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 12 / 36
  13. 13. Parallel Data Distribution Management  Easy solution: each core (or CPU) is responsible of a specific dimension. The “master” core computes the final result  PRO: it can be applied to every DDM algorithm  CONS: what if the number of cores/CPUs is larger than the number of dimensions? This solution is not adequate for “many core” environments We need parallel matching algorithms A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 13 / 36
  14. 14. Parallel matching algorithms  Brute Force (BF): inefficient, embarrassingly parallel  Sort-Based Matching (SBM): efficient, not easy to parallelize  DESIDERATA:  efficient  easy to parallelize  based on a simple data structure  that allows the efficient update/add/delete of extents (dynamic interval management) A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 14 / 36
  15. 15. Our proposal: Interval Tree Matching (ITM)  Based on the simple Interval Tree data structure  It can be implemented using priority search trees or augmented AVL trees. For simplicity we've used the last one [9, 15] 0 – 19 [1, 13] 0 – 13 [16,18] 11 – 19 interval [0, 8] 0–8 [2, 7] 2–7 [11, 14] 11 – 14 min lower bound 0 [17, 19] 17 – 19 max upper bound 8 1 13 2 7 9 15 11 14 16 18 17 A Parallel Data Distribution Management Algoriithm DS-RT 2013 19 Gabriele D'Angelo <g.dangelo@unibo.it> 15 / 36
  16. 16. Our proposal: Interval Tree Matching (ITM)  Based on the simple Interval Tree data structure  It can be implemented using priority search trees or upper bound of the subtree augmented AVL trees. For simplicity we've used the last one maximum value of the rooted in this node minimum value of the [9, 15] 0 – 19 lower bound of the subtree rooted in this node13] [1, 0 – 13 [16,18] 11 – 19 interval [0, 8] 0–8 [2, 7] 2–7 [11, 14] 11 – 14 min lower bound 0 [17, 19] 17 – 19 max upper bound 8 1 13 2 7 9 15 11 14 16 18 17 A Parallel Data Distribution Management Algoriithm DS-RT 2013 19 Gabriele D'Angelo <g.dangelo@unibo.it> 16 / 36
  17. 17. Our proposal: Interval Tree Matching (ITM)  Based on the simple Interval Tree data structure  It can be implemented using priority search trees or augmented AVL trees. For simplicity we've used the last one [9, 15] 0 – 19 balanced tree [1, 13] 0 – 13 interval [0, 8] 0–8 insertion and delete [16,18] 11 – 19 [2, 7] 2–7 [11, 14] 11 – 14 [17, 19] 17 – 19 are efficient min lower bound 0 max upper bound 8 1 13 2 7 9 15 it contains all the subscription (or the update) regions 11 14 16 18 17 A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 17 / 36
  18. 18. Interval Tree Matching (ITM): algorithm description  Support function INTERVAL-QUERY: returns the list of intervals intersecting a given extent  Main function: 1) create the Interval Tree with all the subscription extents 2) for every update extent perform an INTERVAL-QUERY A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 18 / 36
  19. 19. Interval Tree Matching (ITM): comparison  Support function INTERVAL-QUERY: returns the list of intervals intersecting a given extent  Main function: 1) create the Interval Tree with all the subscription extents 2) for every update extent perform an INTERVAL-QUERY Algorithm Computational cost Additional space O(n·m) Brute force with n subscription extents, none m update extents Sort-based O((n+m)log(n+m)+n·m) O(n+m) O(min{n·m, (K+1)log n}) Interval tree with K ≤ n·m that is the total number of O(n) intersections A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 19 / 36
  20. 20. Interval Tree Matching (ITM): comparison  Support function INTERVAL-QUERY: returns the list of intervals intersecting a given extent  Main function: 1) create the Interval Tree with all the subscription extents 2) for every update extent perform an INTERVAL-QUERY Algorithm Brute force Computational cost Additional space Parallelization: the queries at O(n·m) point 2) are independent → it is none with n subscription extents, trivial to parallelize m update extents Sort-based O((n+m)log(n+m)+n·m) O(n+m) O(min{n·m, (K+1)log n}) Interval tree with K ≤ n·m that is the total number of O(n) intersections A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 20 / 36
  21. 21. Interval Tree Matching (ITM): comparison  Support function INTERVAL-QUERY: returns the list of intervals intersecting a given extent  Main function: 1) create the Interval Tree with all the subscription extents 2) for every update extent perform an INTERVAL-QUERY Dynamic interval management: updates are efficient Algorithm Brute force Computational cost Additional space Parallelization: the queries at O(n·m) point 2) are independent → it is none with n subscription extents, trivial to parallelize m update extents Sort-based O((n+m)log(n+m)+n·m) O(n+m) O(min{n·m, (K+1)log n}) Interval tree with K ≤ n·m that is the total number of O(n) intersections A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 21 / 36
  22. 22. Experimental evaluation  The very same methodology used in [Raczy et al. 2005]  Instances with a single dimension are considered  Main parameters:  = number of extents = [50x103, 500x103] with 50% of update extents   N α = overlapping degree = Execution platform: ∑ area of extents area of the routing space = {0.01, 1, 100} Intel® Core© i7-2600 3.40 GHz CPU with 4 physical cores and Hyper-Threading (HT), 16 GB RAM, Ubuntu 11.04 A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 22 / 36
  23. 23. Experimental evaluation  The very same methodology used in [Raczy et al. 2005]  Instances with a single dimension are considered  Main parameters:  = number of extents = [50x103, 500x103] with 50% of update extents   N α = overlapping degree = Execution platform: ∑ area of extents area of the routing space = {0.01, 1, 100} Intel® Core© i7-2600 3.40 GHz CPU with 4 physical cores and Hyper-Threading (HT), 16 GB RAM, Ubuntu 11.04 All the source code and scripts used for this performance evaluation are available with a Free Software license from: http://pads.cs.unibo.it A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 23 / 36
  24. 24. Experimental evaluation: sequential execution α=0.01 α=1 α=100  BF is very slow  SBM is unaffected by  ITM has the best performance  Increasing α, the gap between SBM and ITM decreases  Note that α = 100 is a very high overlay degree A Parallel Data Distribution Management Algoriithm α DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 24 / 36
  25. 25. Experimental evaluation: parallel execution A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 25 / 36
  26. 26. Experimental evaluation: parallel execution A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 26 / 36
  27. 27. Experimental evaluation: parallel execution Execution platform with 4 physical cores but with Hyper-Threading A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 27 / 36
  28. 28. Experimental evaluation: parallel execution A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 28 / 36
  29. 29. Experimental evaluation: parallel execution A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 29 / 36
  30. 30. Experimental evaluation: parallel execution BF is embarrassingly parallel, in the range [1, 4] (physical cores) both algorithms obtain almost the same speedup A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 30 / 36
  31. 31. Experimental evaluation: parallel execution In the range [5, 8] (Hyper-Threading logical cores) ITM is better than BF. HT provides a performance boost A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 31 / 36
  32. 32. Experimental evaluation: parallel execution With more threads than logical cores there is a significant performance drop A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 32 / 36
  33. 33. Experimental evaluation: parallel execution 8 threads As expected, the ITM depends on both the number of extents and the overlap degree α A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 33 / 36
  34. 34. Conclusions  We've proposed a new parallel Data Distribution Management (DDM) mechanism called Interval Trees Matching (ITM)  Both sequential and parallel implementations of ITM show good results when compared to the state of the art  Since updates on the interval tree are efficient, the ITM can be easily extended to deal with the dynamic matching problem  All our source code is available with a Free Software license: we believe that all the published results have to be reproducible A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 34 / 36
  35. 35. Further information Moreno Marzolla, Gabriele D'Angelo, Marco Mandrioli A Parallel Data Distribution Management Algorithm Proceedings of the 16-th ACM/IEEE International Symposium on Distributed Simulation and Real Time Applications (DS-RT 2013). Delft, Netherlands, 2013 An draft version of this paper is available on the open e-print archive All the source code used in this paper is available at http://pads.cs.unibo.it Gabriele D'Angelo  E-mail: <g.dangelo@unibo.it>  http://www.cs.unibo.it/gdangelo/ A Parallel Data Distribution Management Algoriithm DS-RT 2013 Gabriele D'Angelo <g.dangelo@unibo.it> 35 / 36
  36. 36. A Parallel Data Distribution Management Algorithm Gabriele D’Angelo <g.dangelo@unibo.it> http://www.cs.unibo.it/gdangelo/ joint work with: Moreno Marzolla and Marco Mandrioli Delft, Netherlands Distributed Simulation and Real Time Applications (DS-RT), 2013

×