7.
Some justification <ul><li>Counting and sorting are basic tasks in software and algorithms.
8.
At every stage, our programs rely heavily on one of these tasks. </li><ul><li>Take advantage of new system architecture trends such as multiprocessor environments
13.
Quiescent consistency <ul><li>Quiescent periods for concurrent objects: </li><ul><li>No pending calls </li></ul><li>The state of any quiescent object must be equivalent to some sequential order of the completed method calls.
14.
A quiescent counter: </li><ul><li>Neither omissions nor duplicates!! </li></ul></ul>
19.
To discuss... <ul><li>What about these examples? </li></ul>
20.
Linearizability <ul><li>If one call precedes another (even across different threads), then the earlier call must have taken effect before the later call.
21.
If two calls overlap, then their order is ambiguous and we are free to order them in any convenient way.
22.
Linearization points: points where the method seems to take effect. </li></ul>
24.
Measuring the performance <ul><li>Latency </li><ul><li>Time it takes an individual call to complete </li></ul><li>Throughput </li><ul><li>Average rate at which a set of method calls completes </li></ul><li>Lock-based approaches favor latency rather than throughput. </li></ul>
26.
The “classical” approach <ul><li>Counter: </li><ul><li>Shared object holding an integer value, with a getAndIncrement(int n = 1) method that returns the value and then adds n. </li></ul><li>Do you want to increment the counter? Use a lock: </li><ul><li>Acquire the lock
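The lock-based counter described above can be sketched in Python (a hedged illustration; the class name and the snake_case spelling of the slide's getAndIncrement are mine):

```python
import threading

class LockedCounter:
    """Classical lock-based counter: get_and_increment returns the
    current value and then adds n, all inside one critical section."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def get_and_increment(self, n=1):
        with self._lock:       # acquire the lock
            old = self._value  # read the value to be returned
            self._value += n   # then add n
            return old         # lock released on block exit
```

Run several threads against it and every value from 0 up to the total number of increments is returned exactly once: neither omissions nor duplicates, matching the quiescent-counter requirement.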
38.
Balanced binary tree with k levels </li><ul><li>k = min{ j | 2^j >= p }, with 2^(k-1) leaf nodes </li></ul><li>Each node holds 2 values to combine and a state.
39.
Each thread is assigned to a leaf. A leaf can be assigned to 2 threads at most.
40.
The value of the counter is stored at the root </li></ul>
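The tree shape described on the slides above can be computed directly; a small sketch (function name is mine), returning k = min{ j | 2^j ≥ p } and the leaf count for p ≥ 2 threads:

```python
def combining_tree_shape(p):
    """For p >= 2 threads: k = min{ j | 2**j >= p } levels and
    2**(k-1) leaf nodes, so each leaf serves at most 2 threads."""
    k = (p - 1).bit_length()   # equals ceil(log2(p)) for p >= 2
    return k, 2 ** (k - 1)
```

For example, 8 threads give a 3-level tree with 4 leaves, each shared by two threads.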
41.
Software Combining Trees <ul><li>To increment the counter, threads traverse the tree from a leaf node to the root. </li><ul><li>They might combine their values with other threads along the way. </li></ul><li>Node states </li><ul><li>IDLE: Initial state of the nodes
42.
FIRST: One thread has visited this node and becomes the master
43.
SECOND: A second thread (slave) is waiting for the master to combine. </li></ul></ul>
47.
Sensitive to changes in concurrency rate </li><ul><li>Threads might fail to combine immediately </li></ul><li>What about n-ary trees?
48.
How long to wait for other threads to combine? </li></ul>
49.
Counting networks - Balancers <ul><li>A balancer takes tokens arriving asynchronously on its 2 input wires and distributes them to its 2 output wires, alternating top and bottom.
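A balancer can be modeled as a toggle that routes successive tokens alternately to its top and bottom output wires. A minimal single-threaded sketch (class and method names are mine; a real implementation would flip the toggle atomically):

```python
class Balancer:
    """Routes tokens alternately to output wire 0 (top) and 1 (bottom)."""
    def __init__(self):
        self._toggle = 0

    def traverse(self):
        out = self._toggle   # output wire taken by this token
        self._toggle ^= 1    # flip for the next token
        return out
```

After any number of tokens has passed through, the two output counts differ by at most one.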
50.
Balancing networks are constructed by connecting balancers' output wires to other balancers' input wires. </li></ul>
51.
Balancing network <ul><li>A balancing network has width w, with w input wires x<sub>0</sub>, x<sub>1</sub>, ..., x<sub>w-1</sub> and w output wires y<sub>0</sub>, y<sub>1</sub>, ..., y<sub>w-1</sub>, and in quiescent periods: </li></ul><ul><li>The depth d is defined as the maximum number of balancers a token can traverse starting from any input wire. </li></ul>
52.
The step property <ul><li>If a balancing network satisfies the step property, it is called a Counting Network: in any quiescent state, the outputs satisfy 0 &lt;= y<sub>i</sub> - y<sub>j</sub> &lt;= 1 for any i &lt; j.
53.
Threads shepherd tokens through the network. </li><ul><li>Given the step property, it is easy to see that we can use the network to count how many tokens have traversed it. </li></ul></ul>
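The step property (0 ≤ y_i − y_j ≤ 1 for every i < j in a quiescent state) is easy to check mechanically. A sketch (function name mine), with a single toggle balancer standing in as the width-2 counting network:

```python
from itertools import combinations

def has_step_property(outputs):
    """True iff 0 <= y_i - y_j <= 1 for every pair of wires i < j."""
    return all(0 <= outputs[i] - outputs[j] <= 1
               for i, j in combinations(range(len(outputs)), 2))

# A single balancer is itself a width-2 counting network: shepherd
# 7 tokens through its toggle and tally how many leave on each wire.
toggle, counts = 0, [0, 0]
for _ in range(7):
    counts[toggle] += 1   # this token exits on the current wire
    toggle ^= 1           # alternate for the next token
```

Seven tokens leave as counts [4, 3], which has the step shape.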
54.
Bitonic Counting Network <ul><li>For k = 1, a single balancer
60.
Counting networks <ul><li>Bitonic and sorting networks have depth O(lg<sup>2</sup>(w)), where w = 2<sup>k</sup> is the width </li></ul>
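The O(lg² w) bound follows from the bitonic recursion: Bitonic[w] is two Bitonic[w/2] networks followed by a Merger[w] of depth lg w, giving total depth lg w · (lg w + 1) / 2. A small sketch of that closed form (function name mine; w assumed a power of two):

```python
def bitonic_depth(w):
    """Depth of Bitonic[w] for w a power of two:
    d(2) = 1 and d(w) = d(w // 2) + lg(w),
    which solves to lg(w) * (lg(w) + 1) / 2."""
    lg = w.bit_length() - 1   # lg(w) for a power of two
    return lg * (lg + 1) // 2
```

So Bitonic[2] is a single balancer of depth 1, Bitonic[4] has depth 3, and Bitonic[8] has depth 6.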
61.
Counting networks <ul><li>Saturation S measures the ratio of tokens to balancers </li><ul><li>S &gt; 1: oversaturated; S &lt; 1: undersaturated </li></ul><li>2k-block and 2k-merger networks are used in barrier implementations </li><ul><li>They are threshold networks </li></ul></ul>
62.
Counting networks <ul><li>Periodic and Bitonic are not the only counting networks: </li><ul><li>Diffracting trees, with depth O(lg w)
63.
BM or Busch-Mavronikolas, w inputs, p*w outputs for some p>1
67.
Sorting networks <ul><li>A comparator is to a sorting network what a balancer is to a counting network. </li><ul><li>But comparators are synchronous!! </li></ul></ul>
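A comparator is a synchronous compare-exchange: both inputs must be present before it fires, the minimum leaves on the top wire and the maximum on the bottom. A sketch wiring comparators into the standard 5-comparator sorting network on 4 wires (function names mine):

```python
from itertools import permutations

def comparator(a, b):
    """Synchronous compare-exchange: (min, max)."""
    return (a, b) if a <= b else (b, a)

def sort4(values):
    """The standard 5-comparator sorting network on 4 wires:
    sort each half, merge across halves, fix the middle pair."""
    v = list(values)
    for i, j in [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]:
        v[i], v[j] = comparator(v[i], v[j])
    return v
```

Because the comparator sequence is fixed in advance, the same wiring sorts every possible input permutation.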
68.
Wow... This saves time!! <ul><li>Isomorphism: </li><ul><li>If a balancing network counts, then its isomorphic comparison network sorts.
81.
Sample sorting <ul><li>Designed for large sets that do not fit in main memory </li><ul><li>Accessing them can be very expensive (if they are on disk... ouch!!)
82.
We need more locality of reference. How? </li></ul><li>p threads, n input keys </li></ul>
83.
Sample sorting – 3 magic steps <ul><li>Step 1: Choose p-1 splitter keys to divide the set evenly. </li><ul><li>But they are not sorted!!
84.
Take s samples then sort them using BitonicSort
85.
Select the keys in positions s, 2s, ..., (p-1)*s as splitters </li></ul><li>Now we have divided the big set into subsets of approximately n/p keys each. </li></ul>
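Step 1 above can be sketched as follows (the function name and the p*s total sample size are my assumptions, and Python's built-in sort stands in for the slide's BitonicSort):

```python
import random

def choose_splitters(keys, p, s):
    """Sample-sort step 1 sketch: draw p*s random sample keys, sort
    them, and take the elements at positions s, 2s, ..., (p-1)*s
    as the p-1 splitters."""
    sample = sorted(random.sample(keys, p * s))
    return [sample[i * s] for i in range(1, p)]
```

Larger s (more oversampling) makes the resulting buckets more even at the cost of a bigger sample sort.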
86.
Sample sorting – 3 magic steps <ul><li>Step 2: Each thread sequentially processes n/p items, moving each item to its bucket (defined by the splitters)
87.
Step 3: Each thread sequentially sorts the items in its bucket. </li></ul>
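Steps 2 and 3 amount to a bucket pass followed by per-bucket sorts. A sequential sketch (one loop standing in for the p threads; function name mine):

```python
import bisect

def bucket_and_sort(keys, splitters):
    """Step 2: route each key to the bucket chosen by binary search
    on the sorted splitters. Step 3: sort each bucket. The
    concatenation of the buckets is then fully sorted."""
    buckets = [[] for _ in range(len(splitters) + 1)]
    for k in keys:                                   # step 2
        buckets[bisect.bisect_right(splitters, k)].append(k)
    for b in buckets:                                # step 3
        b.sort()
    return [k for b in buckets for k in b]
```

Each bucket fits a thread's local memory, which is where the locality of reference comes from.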
89.
What about integer, fixed-width keys? Radix sort!! </li></ul><li>Sampling might be avoided with prior knowledge of the data's probability distribution </li></ul>
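For fixed-width integer keys, the sequential core of radix sort looks like this (an LSD sketch with an assumed 8-bit digit; the cited radix-sort papers parallelize the per-digit passes):

```python
def radix_sort(keys, digit_bits=8):
    """LSD radix sort for non-negative integers: stable bucket
    passes over successive digit_bits-wide digits, least
    significant digit first."""
    if not keys:
        return keys
    width = max(keys).bit_length()
    base = 1 << digit_bits
    shift = 0
    while shift < width:
        buckets = [[] for _ in range(base)]
        for k in keys:
            buckets[(k >> shift) & (base - 1)].append(k)
        keys = [k for b in buckets for k in b]   # stable reassembly
        shift += digit_bits
    return keys
```

Stability of each pass is what makes the digit-by-digit ordering accumulate into a total order.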
90.
Other alternatives to Sample Sorting <ul><li>Flash sorting
94.
An equilibrium between fairness in terms of work distribution and communication effort. </li></ul></ul>