Scala Parallel Collections 
Aleksandar Prokopec 
EPFL
Scala collections
for {
  s <- surnames
  n <- names
  if s endsWith n
} yield (n, s)
McDonald
1040 ms
Scala parallel collections
for {
  s <- surnames.par
  n <- names.par
  if s endsWith n
} yield (n, s)
cores:  2       4
time:   575 ms  305 ms
for comprehensions nested parallelized bulk operations
surnames.par.flatMap { s =>
  names.par
    .filter(n => s endsWith n)
    .map(n => (n, s))
}
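The equivalence between the for comprehension and the chain of bulk operations can be checked on small sequential data. This is a minimal sketch: the contents of `surnames` and `names` are made-up sample values (the slides do not show the real lists), and `.par` is left out so that the snippet needs only the standard library. The compiler actually translates the guard to `withFilter` rather than `filter`, which yields the same elements here.

```scala
object ForDesugarDemo {
  // Hypothetical sample data; the real lists are not shown on the slides.
  val surnames: Seq[String] = Seq("McDonald", "Johnson", "Smith")
  val names: Seq[String] = Seq("Donald", "son", "John")

  // The for comprehension from the slides (sequential version).
  val viaFor: Seq[(String, String)] =
    for {
      s <- surnames
      n <- names
      if s endsWith n
    } yield (n, s)

  // The same computation written as nested bulk operations.
  val viaBulkOps: Seq[(String, String)] =
    surnames.flatMap { s =>
      names
        .filter(n => s endsWith n)
        .map(n => (n, s))
    }

  def main(args: Array[String]): Unit = {
    println(viaFor)
    println(viaFor == viaBulkOps)
  }
}
```

Because every step is a bulk operation on a collection, replacing the receivers with their `.par` counterparts is enough to parallelize both the outer and the inner traversal.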
Nested parallelism
Nested parallelism parallel within parallel 
composition 
surnames.par.flatMap { s => 
surnameToCollection(s) 
// may invoke parallel ops 
}
Nested parallelism going recursive
recursive algorithms
def vowel(c: Char): Boolean = ...
def gen(n: Int, acc: Seq[String]): Seq[String] =
  if (n == 0) acc
  else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield
    if (s.length == 0) s + c
    else if (vowel(s.last) && !vowel(c)) s + c
    else if (!vowel(s.last) && vowel(c)) s + c
    else s
gen(5, Array(""))
1545 ms
Nested parallelism going recursive
def vowel(c: Char): Boolean = ...
def gen(n: Int, acc: ParSeq[String]): ParSeq[String] =
  if (n == 0) acc
  else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield
    if (s.length == 0) s + c
    else if (vowel(s.last) && !vowel(c)) s + c
    else if (!vowel(s.last) && vowel(c)) s + c
    else s
gen(5, ParArray(""))
cores:  1        2       4
time:   1575 ms  809 ms  530 ms
So, I just use par and I’m home free?
How to think parallel
Character count use case for foldLeft
val txt: String = ...
txt.foldLeft(0) {
  case (a, ' ') => a
  case (a, c) => a + 1
}
going left to right - not parallelizable!
[Figure: foldLeft applies _ + 1 to the elements A to F one after another, taking the accumulator from 0 to 6.]
going left to right - not really necessary
[Figure: the halves A B C and D E F are each counted with _ + 1 from 0 to 3, and the two partial counts are merged with _ + _ to give 6.]
Character count in parallel
txt.fold(0) {
  case (a, ' ') => a
  case (a, c) => a + 1
}
[Figure: each chunk A B C is counted with _ + 1, an operator of type (Int, Char) => Int.]
Character count fold not applicable
[Figure: the partial counts 3 and 3 have to be merged with _ + _, an operator of type (Int, Int) => Int!]
Character count use case for aggregate
txt.aggregate(0)({
  case (a, ' ') => a
  case (a, c) => a + 1
}, _ + _)
aggregation ⊕ element: the first function (_ + 1 for a non-space)
aggregation ⊕ aggregation: the second function (_ + _)
[Figure: each chunk A B C is counted to 3 with _ + 1, and the partial counts 3 and 3 are merged with _ + _.]
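What `aggregate` does with its two functions can be mimicked with sequential code only. The sketch below assumes a hand-made split of the text into chunks; `aggregateChunks` is a hypothetical helper, not a library method. Every chunk is folded from the zero element with the element function, then the partial results are merged with the second function.

```scala
object AggregateSketch {
  // aggregation + element: count a character unless it is a space
  val countChar: (Int, Char) => Int = {
    case (a, ' ') => a
    case (a, _)   => a + 1
  }

  // aggregation + aggregation: merge two partial counts
  val merge: (Int, Int) => Int = _ + _

  // Hypothetical helper mimicking a parallel aggregate on pre-split chunks:
  // fold each chunk independently, then merge the partial results.
  def aggregateChunks(chunks: Seq[String]): Int =
    chunks.map(_.foldLeft(0)(countChar)).foldLeft(0)(merge)

  def main(args: Array[String]): Unit = {
    val txt = "Folding me softly."
    val sequential = txt.foldLeft(0)(countChar)
    val chunked = aggregateChunks(Seq(txt.take(7), txt.drop(7)))
    println(s"sequential = $sequential, chunked = $chunked")
  }
}
```

The per-chunk folds do not depend on each other, which is what lets a parallel collection run them on different processors.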
Word count another use case for foldLeft
txt.foldLeft((0, true)) {
  case ((wc, _), ' ') => (wc, true)
  case ((wc, true), x) => (wc + 1, false)
  case ((wc, false), x) => (wc, false)
}
"Folding me softly."
initial accumulation: 0 words so far, last character was a space
a space: last seen character is a space
a non space: last seen character was a space - a new word
a non space: last seen character wasn't a space - no new word
Word count in parallel
P1: "Folding me "   wc = 2; rs = 1
P2: "softly."       wc = 1; ls = 0
combined: wc = 3
Word count must assume arbitrary partitions
P1: "Foldin"        wc = 1; rs = 0
P2: "g me softly."  wc = 3; ls = 0
combined: wc = 3
Word count initial aggregation
txt.par.aggregate((0, 0, 0))
(# spaces on the left, # words, # spaces on the right)
"" (the empty string)
Word count aggregation ⊕ aggregation
...
}, {
  // one side is empty, e.g. "" and "Folding me", or "softly." and ""
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  // no space at the boundary, e.g. "Folding m" and "e softly.": the split word was counted twice
  case ((lls, lwc, 0), (0, rwc, rrs)) =>
    (lls, lwc + rwc - 1, rrs)
  // a space at the boundary, e.g. "Folding me" and " softly.": the word counts add up
  case ((lls, lwc, _), (_, rwc, rrs)) =>
    (lls, lwc + rwc, rrs)
Word count aggregation ⊕ element
txt.par.aggregate((0, 0, 0))({
  // "_": 0 words and a space - add one more space each side
  case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1)
  // " m": 0 words and a non-space - one word, no spaces on the right side
  case ((ls, 0, _), c) => (ls, 1, 0)
  // " me_": nonzero words and a space - one more space on the right side
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  // " me sof": nonzero words, last non-space and current non-space - no change
  case ((ls, wc, 0), c) => (ls, wc, 0)
  // " me s": nonzero words, last space and current non-space - one more word
  case ((ls, wc, rs), c) => (ls, wc + 1, 0)
Word count in parallel 
txt.par.aggregate((0, 0, 0))({ 
case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) 
case ((ls, 0, _), c) => (ls, 1, 0) 
case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) 
case ((ls, wc, 0), c) => (ls, wc, 0) 
case ((ls, wc, rs), c) => (ls, wc + 1, 0) 
}, { 
case ((0, 0, 0), res) => res 
case (res, (0, 0, 0)) => res 
case ((lls, lwc, 0), (0, rwc, rrs)) => 
(lls, lwc + rwc - 1, rrs) 
case ((lls, lwc, _), (_, rwc, rrs)) => 
(lls, lwc + rwc, rrs) 
})
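The claim that the result must not depend on where the text is partitioned can be checked without any parallel machinery. In this sketch the two functions from the slide are pulled out as values; the names `Agg`, `count` and `wordsSplitAt` are illustrative additions. The text is cut at every possible position, each half is folded from `(0, 0, 0)`, and the two aggregations are merged.

```scala
object WordCountSketch {
  // (spaces on the left, words, spaces on the right)
  type Agg = (Int, Int, Int)

  // aggregation + element
  val seqop: (Agg, Char) => Agg = {
    case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
    case ((ls, 0, _), _)     => (ls, 1, 0)
    case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
    case ((ls, wc, 0), _)    => (ls, wc, 0)
    case ((ls, wc, _), _)    => (ls, wc + 1, 0)
  }

  // aggregation + aggregation
  val combop: (Agg, Agg) => Agg = {
    case ((0, 0, 0), res)               => res
    case (res, (0, 0, 0))               => res
    case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
    case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
  }

  // Fold one chunk on its own, starting from the initial aggregation.
  def count(chunk: String): Agg = chunk.foldLeft((0, 0, 0))(seqop)

  // Word count obtained when txt is cut into two chunks at position i.
  def wordsSplitAt(txt: String, i: Int): Int =
    combop(count(txt.take(i)), count(txt.drop(i)))._2

  def main(args: Array[String]): Unit = {
    val txt = "Folding me softly."
    // Every split position should report the same number of words.
    println((0 to txt.length).map(wordsSplitAt(txt, _)).distinct)
  }
}
```

Only whether the space counts are zero or nonzero matters when two aggregations are merged, which is why the merge never needs to look at the characters again.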
Word count using parallel strings? 
Word count string not really parallelizable
scala> (txt: String).par
collection.parallel.ParSeq[Char] = ParArray(…)
different internal representation!
ParArray: copy string contents into an array
Conversions going parallel 
// `par` is efficient for... 
mutable.{Array, ArrayBuffer, ArraySeq} 
mutable.{HashMap, HashSet} 
immutable.{Vector, Range} 
immutable.{HashMap, HashSet} 
most other collections construct a new parallel collection!
Conversions going parallel
sequential                     parallel
Array, ArrayBuffer, ArraySeq   mutable.ParArray
mutable.HashMap                mutable.ParHashMap
mutable.HashSet                mutable.ParHashSet
immutable.Vector               immutable.ParVector
immutable.Range                immutable.ParRange
immutable.HashMap              immutable.ParHashMap
immutable.HashSet              immutable.ParHashSet
Conversions going parallel 
// `seq` is always efficient 
ParArray(1, 2, 3).seq 
List(1, 2, 3, 4).seq 
ParHashMap(1 -> 2, 3 -> 4).seq 
"abcd".seq
// `par` may not be...
"abcd".par
Custom collections
Custom collection
class ParString(val str: String)
extends parallel.immutable.ParSeq[Char] {
  def apply(i: Int) = str.charAt(i)
  def length = str.length
  def seq = new WrappedString(str)
  def splitter: Splitter[Char] =
    new ParStringSplitter(0, str.length)
}
Custom collection splitter definition
// str refers to the string of the enclosing ParString
class ParStringSplitter(var i: Int, len: Int)
extends Splitter[Char] {
  // splitters are iterators
  def hasNext = i < len
  def next = {
    val r = str.charAt(i)
    i += 1
    r
  }
  // splitters must be duplicated
  def dup = new ParStringSplitter(i, len)
  // splitters know how many elements remain
  def remaining = len - i
  // splitters can be split
  def psplit(sizes: Int*): Seq[ParStringSplitter] = {
    val splitted = new ArrayBuffer[ParStringSplitter]
    for (sz <- sizes) {
      val next = (i + sz) min len
      splitted += new ParStringSplitter(i, next)
      i = next
    }
    splitted
  }
}
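The splitter contract can be tried out in isolation. The class below is a simplified stand-in and an assumption of this sketch: it extends plain `Iterator[Char]` instead of the library's `Splitter` trait, it takes the string as a constructor parameter, and the name `StringSplitter` is made up. It keeps the four operations from the slides: iteration, `dup`, `remaining` and `psplit`.

```scala
import scala.collection.mutable.ArrayBuffer

object SplitterSketch {
  // Simplified stand-in for ParStringSplitter; not the library's Splitter trait.
  final class StringSplitter(str: String, private var i: Int, len: Int)
      extends Iterator[Char] {

    // Splitters are iterators over the range [i, len).
    def hasNext: Boolean = i < len
    def next(): Char = { val r = str.charAt(i); i += 1; r }

    // An independent copy positioned at the same element.
    def dup: StringSplitter = new StringSplitter(str, i, len)

    // Number of elements not yet traversed.
    def remaining: Int = len - i

    // Cut the remaining range into consecutive pieces of the requested sizes.
    def psplit(sizes: Int*): Seq[StringSplitter] = {
      val parts = new ArrayBuffer[StringSplitter]
      for (sz <- sizes) {
        val end = math.min(i + sz, len)
        parts += new StringSplitter(str, i, end)
        i = end
      }
      parts.toSeq
    }
  }

  def main(args: Array[String]): Unit = {
    val txt = "Folding me softly."
    val parts = new StringSplitter(txt, 0, txt.length).psplit(8, 10)
    // Each part traverses only its own slice of the string.
    println(parts.map(_.mkString))
  }
}
```

Splitting only creates new index ranges over the same string, so no characters are copied when work is handed to another processor.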
Word count now with parallel strings 
new ParString(txt).aggregate((0, 0, 0))({ 
case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) 
case ((ls, 0, _), c) => (ls, 1, 0) 
case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) 
case ((ls, wc, 0), c) => (ls, wc, 0) 
case ((ls, wc, rs), c) => (ls, wc + 1, 0) 
}, { 
case ((0, 0, 0), res) => res 
case (res, (0, 0, 0)) => res 
case ((lls, lwc, 0), (0, rwc, rrs)) => 
(lls, lwc + rwc - 1, rrs) 
case ((lls, lwc, _), (_, rwc, rrs)) => 
(lls, lwc + rwc, rrs) 
})
Word count performance 
txt.foldLeft((0, true)) { 
case ((wc, _), ' ') => (wc, true) 
case ((wc, true), x) => (wc + 1, false) 
case ((wc, false), x) => (wc, false) 
} 
new ParString(txt).aggregate((0, 0, 0))({ 
case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) 
case ((ls, 0, _), c) => (ls, 1, 0) 
case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) 
case ((ls, wc, 0), c) => (ls, wc, 0) 
case ((ls, wc, rs), c) => (ls, wc + 1, 0) 
}, { 
case ((0, 0, 0), res) => res 
case (res, (0, 0, 0)) => res 
case ((lls, lwc, 0), (0, rwc, rrs)) => 
(lls, lwc + rwc - 1, rrs) 
case ((lls, lwc, _), (_, rwc, rrs)) => 
(lls, lwc + rwc, rrs) 
}) 
sequential foldLeft: 100 ms
ParString aggregate:
cores:  1       2      4
time:   137 ms  70 ms  35 ms
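Hierarchy
[Figure: class hierarchy. GenTraversable, GenIterable and GenSeq are the common supertypes of the sequential Traversable, Iterable and Seq and of the parallel ParIterable and ParSeq.]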
Hierarchy 
def nonEmpty(sq: Seq[String]) = { 
val res = new mutable.ArrayBuffer[String]() 
for (s <- sq) { 
if (s.nonEmpty) res += s 
} 
res 
}
Hierarchy
def nonEmpty(sq: ParSeq[String]) = {
  val res = new mutable.ArrayBuffer[String]()
  for (s <- sq) {
    if (s.nonEmpty) res += s
  }
  res
}
side-effects!
ArrayBuffer is not synchronized!
[Figure: ParSeq and Seq.]
Hierarchy 
def nonEmpty(sq: GenSeq[String]) = { 
val res = new mutable.ArrayBuffer[String]() 
for (s <- sq) { 
if (s.nonEmpty) res.synchronized { 
res += s 
} 
} 
res 
}
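An alternative that is not on the slides is to avoid the shared buffer altogether and let a transformer method build the result. The sketch below uses plain `Seq` so that it compiles with the standard library alone; `filter` is also defined on parallel sequences, where no user-level locking is needed because each task fills its own private builder.

```scala
object NonEmptySketch {
  // Side-effect-free version: no shared mutable state, so nothing to synchronize.
  def nonEmpty(sq: Seq[String]): Seq[String] = sq.filter(_.nonEmpty)

  def main(args: Array[String]): Unit = {
    println(nonEmpty(Seq("Folding", "", "me", "", "softly.")))
  }
}
```

Unlike the synchronized loop, this version also keeps the elements in their original order.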
Accessors vs. transformers some methods need more than just splitters
accessors: foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, …
transformers: map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, …
These return collections!
Sequential collections - builders
Parallel collections - combiners
Builders building a sequential collection
[Figure: a ListBuilder appends the elements 1 to 7 one at a time with +=, and result returns the finished list ending in Nil.]
How to build parallel?
Combiners building parallel collections
trait Combiner[-Elem, +To]
extends Builder[Elem, To] {
  def combine[N <: Elem, NewTo >: To]
    (other: Combiner[N, NewTo]):
    Combiner[N, NewTo]
}
[Figure: two Combiners are merged into a single Combiner.]
Should be efficient - O(log n) worst case
How to implement this combine?
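Parallel arrays
[Figure: the chunks 1, 2, 3, 4 / 5, 6, 7, 8 / 3, 1, 8, 0 / 2, 2, 1, 9 are processed separately and give the partial results 2, 4 / 6, 8 / 8, 0 / 2, 2. The partial results are merged pairwise, then the final array is allocated and the chunks are copied into it: 2 4 6 8 8 0 2 2.]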
Parallel hash tables
[Figure: a ParHashMap holding the keys 0, 1, 2, 4, 5, 7, 8, 9; e.g. calling filter. Two ParHashCombiners collect the surviving keys, 0, 1, 4 in one and 5, 7, 9 in the other.]
How to merge?
buckets! Keys are grouped into buckets by the leading bits of their hash code:
0 = 0000₂
1 = 0001₂
4 = 0100₂
[Figure: combine links the buckets of the two ParHashCombiners into a single ParHashCombiner - no copying! The resulting ParHashCombiner is then turned into a ParHashMap.]
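The bucket idea can be illustrated with a toy combiner. Everything in this sketch is an assumption made for illustration: keys are 4-bit integers as in the figure, the top 2 bits pick one of 4 buckets, and `BucketCombiner` is a made-up class rather than the library's ParHashCombiner. Merging two combiners appends chunk references bucket by bucket and never touches the elements themselves.

```scala
import scala.collection.mutable.ArrayBuffer

object BucketCombinerSketch {
  val Bits = 4        // assumed key width, as in the figure (keys 0 to 15)
  val PrefixBits = 2  // assumed: the top 2 bits select one of 4 buckets

  final class BucketCombiner {
    // For each bucket, a list of chunks holding the keys added so far.
    val buckets: Array[ArrayBuffer[ArrayBuffer[Int]]] =
      Array.fill(1 << PrefixBits)(ArrayBuffer(ArrayBuffer.empty[Int]))

    // Add a key to the last chunk of the bucket chosen by its leading bits.
    def +=(key: Int): this.type = {
      buckets(key >>> (Bits - PrefixBits)).last += key
      this
    }

    // Merge by linking the other combiner's chunks; no element is copied.
    def combine(that: BucketCombiner): BucketCombiner = {
      for (b <- buckets.indices) buckets(b) ++= that.buckets(b)
      this
    }

    // Keys of one bucket; buckets are disjoint, so each could be built independently.
    def bucket(b: Int): Seq[Int] = buckets(b).flatten.toSeq

    def result: Set[Int] = buckets.indices.flatMap(bucket).toSet
  }

  def main(args: Array[String]): Unit = {
    val left = new BucketCombiner
    Seq(0, 1, 4).foreach(left += _)
    val right = new BucketCombiner
    Seq(5, 7, 9).foreach(right += _)
    val merged = left.combine(right)
    println((0 until 4).map(merged.bucket))
    println(merged.result)
  }
}
```

Because keys from different buckets land in disjoint parts of the final table, the buckets can be filled into the table by different processors without synchronization.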
Custom combiners for methods returning custom collections
new ParString(txt).filter(_ != ' ')
What is the return type here?
creates a ParVector!
class ParString(val str: String)
extends immutable.ParSeq[Char]
with ParSeqLike[Char, ParString, WrappedString]
{
  def apply(i: Int) = str.charAt(i)
  ...
  protected[this] override def newCombiner
    : Combiner[Char, ParString] =
    new ParStringCombiner
Custom combiners for methods returning custom collections
class ParStringCombiner
extends Combiner[Char, ParString] {
  var size = 0
  val chunks = ArrayBuffer(new StringBuilder)
  var lastc = chunks.last
  def +=(elem: Char) = {
    lastc += elem
    size += 1
    this
  }
[Figure: size counts the elements, chunks holds the string builders and lastc points to the last chunk; += appends to lastc and adds 1 to size.]
Custom combiners for methods returning custom collections
...
def combine[U <: Char, NewTo >: ParString]
  (other: Combiner[U, NewTo]) = other match {
  case psc: ParStringCombiner =>
    size += psc.size
    chunks ++= psc.chunks
    lastc = chunks.last
    this
}
[Figure: the chunks of the other combiner are appended to this combiner's chunks, and lastc moves to the new last chunk.]
Custom combiners for methods returning custom collections
...
def result = {
  val rsb = new StringBuilder
  for (sb <- chunks) rsb.append(sb)
  new ParString(rsb.toString)
}
...
[Figure: result copies the chunks one after another into a single StringBuilder.]
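The three combiner operations can be exercised together in a standalone form. This sketch makes two simplifications that are not on the slides: the class does not extend the library's `Combiner` trait, and `result` returns a plain `String` instead of a `ParString`. The name `ChunkedStringCombiner` is made up, and the `that eq this` guard is an addition that protects against combining a combiner with itself.

```scala
import scala.collection.mutable.ArrayBuffer

object ChunkedStringCombinerSketch {
  // Standalone stand-in for ParStringCombiner; not the library's Combiner trait.
  final class ChunkedStringCombiner {
    var size = 0
    val chunks: ArrayBuffer[StringBuilder] = ArrayBuffer(new StringBuilder)
    var lastc: StringBuilder = chunks.last

    // Append one character to the last chunk.
    def +=(elem: Char): this.type = {
      lastc += elem
      size += 1
      this
    }

    // Merge by taking over the other combiner's chunks; characters are not copied here.
    def combine(that: ChunkedStringCombiner): ChunkedStringCombiner =
      if (that eq this) this
      else {
        size += that.size
        chunks ++= that.chunks
        lastc = chunks.last
        this
      }

    // The only step that copies characters: concatenate all chunks in order.
    def result: String = {
      val rsb = new StringBuilder
      for (sb <- chunks) rsb.append(sb)
      rsb.toString
    }
  }

  def main(args: Array[String]): Unit = {
    val a = new ChunkedStringCombiner
    "Folding".foreach(a += _)
    val b = new ChunkedStringCombiner
    "softly.".foreach(b += _)
    val c = a.combine(b)
    println(s"${c.size} ${c.result}")
  }
}
```

The cost of `combine` grows with the number of chunks rather than the number of characters, while `result` is a sequential pass over all characters.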
Custom combiners for methods expecting implicit builder factories 
// only for big boys 
... 
with GenericParTemplate[T, ParColl] 
... 
object ParColl extends ParFactory[ParColl] { 
implicit def canCombineFrom[T] = 
new GenericCanCombineFrom[T] 
...
Custom combiners performance measurement
txt.filter(_ != ' ')
106 ms
new ParString(txt).filter(_ != ' ')
cores:  1       2      4
time:   125 ms  81 ms  56 ms
[Figure: t/ms plotted against proc for 1, 2 and 4 processors (125 ms, 81 ms, 56 ms), with the share taken by def result marked as not parallelized.]
Custom combiners tricky!
• two-step evaluation
  – parallelize the result method in combiners
• efficient merge operation
  – binomial heaps, ropes, etc.
• concurrent data structures
  – non-blocking scalable insertion operation
  – we're working on this
Future work coming up
• concurrent data structures
• more efficient vectors
• custom task pools
• user defined scheduling
• parallel bulk in-place modifications
Thank you! 
Examples at: 
git://github.com/axel22/sd.git

Cyber security and its impact on E commercemanigoyal112
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfStefano Stabellini
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 

Recently uploaded (20)

Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 

Scala Parallel Collections

  • 1. Scala Parallel Collections Aleksandar Prokopec EPFL
  • 2.
  • 3. Scala collections for { s <- surnames n <- names if s endsWith n } yield (n, s) McDonald
  • 4. Scala collections for { s <- surnames n <- names if s endsWith n } yield (n, s) 1040 ms
  • 5.
  • 6. Scala parallel collections for { s <- surnames n <- names if s endsWith n } yield (n, s)
  • 7. Scala parallel collections for { s <- surnames.par n <- names.par if s endsWith n } yield (n, s)
  • 8. Scala parallel collections for { s <- surnames.par n <- names.par if s endsWith n } yield (n, s) 2 cores 575 ms
  • 9. Scala parallel collections for { s <- surnames.par n <- names.par if s endsWith n } yield (n, s) 4 cores 305 ms
  • 10. for comprehensions surnames.par.flatMap { s => names.par .filter(n => s endsWith n) .map(n => (n, s)) }
  • 11. for comprehensions nested parallelized bulk operations surnames.par.flatMap { s => names.par .filter(n => s endsWith n) .map(n => (n, s)) }
  • 13. Nested parallelism parallel within parallel composition surnames.par.flatMap { s => surnameToCollection(s) // may invoke parallel ops }
  • 14. Nested parallelism going recursive def vowel(c: Char): Boolean = ...
  • 15. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc
  • 16. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield recursive algorithms
  • 17. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c
  • 18. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c
  • 19. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, Array(""))
  • 20. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: Seq[String]): Seq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, Array("")) 1545 ms
  • 21. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: ParSeq[String]): ParSeq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, ParArray(""))
  • 22. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: ParSeq[String]): ParSeq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, ParArray("")) 1 core 1575 ms
  • 23. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: ParSeq[String]): ParSeq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, ParArray("")) 2 cores 809 ms
  • 24. Nested parallelism going recursive def vowel(c: Char): Boolean = ... def gen(n: Int, acc: ParSeq[String]): ParSeq[String] = if (n == 0) acc else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield if (s.length == 0) s + c else if (vowel(s.last) && !vowel(c)) s + c else if (!vowel(s.last) && vowel(c)) s + c else s gen(5, ParArray("")) 4 cores 530 ms
  • 25. So, I just use par and I’m home free?
  • 26. How to think parallel
  • 27. Character count use case for foldLeft val txt: String = ... txt.foldLeft(0) { case (a, ' ') => a case (a, c) => a + 1 }
  • 28. 6 5 4 3 2 1 0 Character count use case for foldLeft txt.foldLeft(0) { case (a, ' ') => a case (a, c) => a + 1 } going left to right - not parallelizable! A B C D E F _ + 1
  • 29. Character count use case for foldLeft txt.foldLeft(0) { case (a, ' ') => a case (a, c) => a + 1 } going left to right – not really necessary 3 2 1 0 A B C _ + 1 3 2 1 0 D E F _ + 1 _ + _ 6
  • 30. Character count in parallel txt.fold(0) { case (a, ' ') => a case (a, c) => a + 1 }
  • 31. Character count in parallel txt.fold(0) { case (a, ' ') => a case (a, c) => a + 1 } 3 2 1 A B C _ + 1 3 2 1 A B C : (Int, Char) => Int
  • 32. Character count fold not applicable txt.fold(0) { case (a, ' ') => a case (a, c) => a + 1 } 3 2 1 A B C _ + _ 3 3 3 2 1 A B C ! (Int, Int) => Int
  • 33. Character count use case for aggregate txt.aggregate(0)({ case (a, ' ') => a case (a, c) => a + 1 }, _ + _)
  • 34. 3 2 1 A B C Character count use case for aggregate txt.aggregate(0)({ case (a, ' ') => a case (a, c) => a + 1 }, _ + _) _ + _ 3 3 3 2 1 A B C _ + 1
  • 35. Character count use case for aggregate aggregation  element 3 2 1 A B C _ + _ 3 3 3 2 1 A B C txt.aggregate(0)({ case (a, ' ') => a case (a, c) => a + 1 }, _ + _) _ + 1
  • 36. Character count use case for aggregate aggregation  aggregation aggregation  element 3 2 1 A B C _ + _ 3 3 3 2 1 A B C txt.aggregate(0)({ case (a, ' ') => a case (a, c) => a + 1 }, _ + _) _ + 1
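The two operators that aggregate takes in the character count slides can be tried out with the standard library alone. The following is a minimal sketch, not the library's implementation: `seqop`, `combop` and `countWithSplit` are hypothetical names, and the single two-way split only stands in for whatever partitioning a parallel collection might pick.

```scala
// Sketch: emulate aggregate on a text that has been cut into two partitions.
object CharCountSketch {
  // "aggregation, element" step: count every character that is not a space.
  val seqop: (Int, Char) => Int = {
    case (a, ' ') => a
    case (a, _)   => a + 1
  }

  // "aggregation, aggregation" step: merge the counts of two partitions.
  val combop: (Int, Int) => Int = _ + _

  // Hypothetical helper: fold each half sequentially, then combine the results.
  def countWithSplit(txt: String, at: Int): Int = {
    val (left, right) = txt.splitAt(at)
    combop(left.foldLeft(0)(seqop), right.foldLeft(0)(seqop))
  }

  def main(args: Array[String]): Unit = {
    val txt = "Folding me softly."
    val sequential = txt.foldLeft(0)(seqop)
    // Every possible split point should give the same answer as foldLeft.
    val allAgree = (0 to txt.length).forall(i => countWithSplit(txt, i) == sequential)
    println(s"sequential = $sequential, all splits agree = $allAgree")
  }
}
```

The point of the sketch is the types: `seqop` is `(Int, Char) => Int` while `combop` is `(Int, Int) => Int`, which is why a single operator passed to fold cannot play both roles here.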
  • 37. Word count another use case for foldLeft txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) }
  • 38. Word count initial accumulation txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) } 0 words so far last character was a space “Folding me softly.”
  • 39. Word count a space txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) } “Folding me softly.” last seen character is a space
  • 40. Word count a non space txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) } “Folding me softly.” last seen character was a space – a new word
  • 41. Word count a non space txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) } “Folding me softly.” last seen character wasn’t a space – no new word
  • 42. Word count in parallel “softly.“ “Folding me “ P1 P2
  • 43. Word count in parallel “softly.“ “Folding me “ wc = 2; rs = 1 wc = 1; ls = 0  P1 P2
  • 44. Word count in parallel “softly.“ “Folding me “ wc = 2; rs = 1 wc = 1; ls = 0  wc = 3 P1 P2
  • 45. Word count must assume arbitrary partitions “g me softly.“ “Foldin“ wc = 1; rs = 0 wc = 3; ls = 0  P1 P2
  • 46. Word count must assume arbitrary partitions “g me softly.“ “Foldin“ wc = 1; rs = 0 wc = 3; ls = 0  P1 P2 wc = 3
  • 47. Word count initial aggregation txt.par.aggregate((0, 0, 0))
  • 48. Word count initial aggregation txt.par.aggregate((0, 0, 0)) # spaces on the left # spaces on the right #words
  • 49. Word count initial aggregation txt.par.aggregate((0, 0, 0)) # spaces on the left # spaces on the right #words ””
  • 50. Word count aggregation  aggregation ... }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res ““ “Folding me“  “softly.“ ““ 
  • 51. Word count aggregation  aggregation ... }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) “e softly.“ “Folding m“ 
  • 52. Word count aggregation  aggregation ... }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) “ softly.“ “Folding me” 
  • 53. Word count aggregation  element txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) ”_” 0 words and a space – add one more space each side
  • 54. Word count aggregation  element txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) ” m” 0 words and a non-space – one word, no spaces on the right side
  • 55. Word count aggregation  element txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) ” me_” nonzero words and a space – one more space on the right side
  • 56. Word count aggregation  element txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) ” me sof” nonzero words, last non-space and current non-space – no change
  • 57. Word count aggregation  element txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) ” me s” nonzero words, last space and current non-space – one more word
  • 58. Word count in parallel txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) })
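One way to gain confidence that the word count operators tolerate arbitrary partitions is to test every two-way split of a text against the expected count. This is a minimal sketch using only the standard library; the operators are copied from the slide, while `Acc`, `fold` and `wordsWithSplit` are hypothetical helper names introduced for the check.

```scala
// Sketch: check the word count operators against every two-way split.
object WordCountSketch {
  // (spaces on the left, words, spaces on the right)
  type Acc = (Int, Int, Int)

  // Aggregation combined with one more element.
  val seqop: (Acc, Char) => Acc = {
    case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1) // only spaces so far
    case ((ls, 0, _), _)     => (ls, 1, 0)          // first word starts
    case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)    // space after a word
    case ((ls, wc, 0), _)    => (ls, wc, 0)         // still inside a word
    case ((ls, wc, _), _)    => (ls, wc + 1, 0)     // a new word starts
  }

  // Aggregation combined with another aggregation.
  val combop: (Acc, Acc) => Acc = {
    case ((0, 0, 0), res)               => res
    case (res, (0, 0, 0))               => res
    // A word was cut in half by the partition boundary: count it once.
    case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
    case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
  }

  def fold(chunk: String): Acc = chunk.foldLeft((0, 0, 0))(seqop)

  // Hypothetical helper: split at `at`, fold both halves, combine, read the word count.
  def wordsWithSplit(txt: String, at: Int): Int = {
    val (l, r) = txt.splitAt(at)
    combop(fold(l), fold(r))._2
  }

  def main(args: Array[String]): Unit = {
    val txt = "Folding me softly."
    val counts = (0 to txt.length).map(i => wordsWithSplit(txt, i)).distinct
    println(s"distinct word counts over all splits: $counts")
  }
}
```

The check covers a single split point only; a real parallel run may split recursively, in which case the same `combop` is applied repeatedly up the tree.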
  • 59. Word count using parallel strings? txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) })
  • 60. Word count string not really parallelizable scala> (txt: String).par
  • 61. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…)
  • 62. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…) different internal representation!
  • 63. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…) different internal representation! ParArray
  • 64. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…) different internal representation! ParArray  copy string contents into an array
  • 65. Conversions going parallel // `par` is efficient for... mutable.{Array, ArrayBuffer, ArraySeq} mutable.{HashMap, HashSet} immutable.{Vector, Range} immutable.{HashMap, HashSet}
  • 66. Conversions going parallel // `par` is efficient for... mutable.{Array, ArrayBuffer, ArraySeq} mutable.{HashMap, HashSet} immutable.{Vector, Range} immutable.{HashMap, HashSet} most other collections construct a new parallel collection!
  • 67. Conversions going parallel sequential parallel Array, ArrayBuffer, ArraySeq mutable.ParArray mutable.HashMap mutable.ParHashMap mutable.HashSet mutable.ParHashSet immutable.Vector immutable.ParVector immutable.Range immutable.ParRange immutable.HashMap immutable.ParHashMap immutable.HashSet immutable.ParHashSet
  • 68. Conversions going parallel // `seq` is always efficient ParArray(1, 2, 3).seq List(1, 2, 3, 4).seq ParHashMap(1 -> 2, 3 -> 4).seq "abcd".seq // `par` may not be... "abcd".par
  • 70. Custom collection class ParString(val str: String)
  • 71. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] {
  • 72. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length
  • 73. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length def seq = new WrappedString(str)
  • 74. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length def seq = new WrappedString(str) def splitter: Splitter[Char]
  • 75. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length def seq = new WrappedString(str) def splitter = new ParStringSplitter(0, str.length)
  • 76. Custom collection splitter definition class ParStringSplitter(var i: Int, len: Int) extends Splitter[Char] {
  • 77. Custom collection splitters are iterators class ParStringSplitter(var i: Int, len: Int) extends Splitter[Char] { def hasNext = i < len def next = { val r = str.charAt(i) i += 1 r }
  • 78. Custom collection splitters must be duplicated ... def dup = new ParStringSplitter(i, len)
  • 79. Custom collection splitters know how many elements remain ... def dup = new ParStringSplitter(i, len) def remaining = len - i
  • 80. Custom collection splitters can be split ... def psplit(sizes: Int*): Seq[ParStringSplitter] = { val splitted = new ArrayBuffer[ParStringSplitter] for (sz <- sizes) { val next = (i + sz) min len splitted += new ParStringSplitter(i, next) i = next } splitted }
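The splitter operations from the slides above (iterate, duplicate, report remaining elements, split) can be sketched without the parallel collections library. `StringSplitterSketch` is a hypothetical stand-in, not the library's `Splitter` trait; it takes the string as an explicit constructor argument, an assumption made so that the class is self-contained.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a splitter over the index range [i, len) of a string.
final class StringSplitterSketch(str: String, private var i: Int, len: Int)
    extends Iterator[Char] {

  // Splitters are iterators.
  def hasNext: Boolean = i < len
  def next(): Char = {
    val r = str.charAt(i)
    i += 1
    r
  }

  // Splitters can be duplicated: the copy starts at the current position.
  def dup: StringSplitterSketch = new StringSplitterSketch(str, i, len)

  // Splitters know how many elements remain.
  def remaining: Int = len - i

  // Splitters can be split into consecutive parts of the requested sizes.
  // Each size is clamped so that no part runs past the end of the range.
  def psplit(sizes: Int*): Seq[StringSplitterSketch] = {
    val parts = new ArrayBuffer[StringSplitterSketch]
    for (sz <- sizes) {
      val end = (i + sz) min len
      parts += new StringSplitterSketch(str, i, end)
      i = end
    }
    parts.toSeq
  }
}

object StringSplitterDemo {
  def main(args: Array[String]): Unit = {
    val whole = new StringSplitterSketch("Folding me softly.", 0, 18)
    val parts = whole.psplit(8, 10)
    parts.foreach(p => println("[" + p.mkString + "]"))
  }
}
```

After `psplit` the original splitter has advanced past the elements it handed out, so the parts together cover exactly the range that was requested.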
  • 81. Word count now with parallel strings new ParString(txt).aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) })
  • 82. Word count performance txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) } new ParString(txt).aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) }) 100 ms cores: 1 2 4 time: 137 ms 70 ms 35 ms
  • 83. Hierarchy GenTraversable GenIterable GenSeq Traversable Iterable Seq ParIterable ParSeq
  • 84. Hierarchy def nonEmpty(sq: Seq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res }
  • 85. Hierarchy def nonEmpty(sq: ParSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res }
  • 86. Hierarchy def nonEmpty(sq: ParSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res } side-effects! ArrayBuffer is not synchronized!
  • 87. Hierarchy def nonEmpty(sq: ParSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res } side-effects! ArrayBuffer is not synchronized! ParSeq Seq
  • 88. Hierarchy def nonEmpty(sq: GenSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res.synchronized { res += s } } res }
  • 89. Accessors vs. transformers some methods need more than just splitters foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, …
  • 90. Accessors vs. transformers some methods need more than just splitters foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, … These return collections!
  • 91. Accessors vs. transformers some methods need more than just splitters foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, … Sequential collections – builders
  • 92. Accessors vs. transformers some methods need more than just splitters foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, … Sequential collections – builders Parallel collections – combiners
  • 93. Builders building a sequential collection 1 2 3 4 5 6 7 Nil Nil ListBuilder += += += result
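The sequential builder pattern pictured on the slide above, a run of `+=` calls followed by `result`, looks roughly like this with the standard library. This is only an illustrative sketch; `BuilderSketch` and `build` are hypothetical names.

```scala
// Sketch: build a List element by element with a standard library builder.
object BuilderSketch {
  def build(xs: Seq[Int]): List[Int] = {
    val b = List.newBuilder[Int] // a Builder[Int, List[Int]]
    for (x <- xs) b += x         // append elements one at a time, in order
    b.result()                   // produce the finished collection
  }

  def main(args: Array[String]): Unit =
    println(build(1 to 7))
}
```

A builder like this is inherently sequential, since each `+=` mutates the same object.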
  • 94. How to build parallel?
  • 95. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] }
  • 96. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] } Combiner Combiner Combiner
  • 97. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] } Should be efficient – O(log n) worst case
  • 98. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] } How to implement this combine?
  • 99. Parallel arrays 1, 2, 3, 4 5, 6, 7, 8 4 6, 8 3, 1, 8, 0 2, 2, 1, 9 8, 0 2, 2 merge merge merge copy allocate 2 4 6 8 8 0 2 2
  • 100. Parallel hash tables ParHashMap
  • 101. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 e.g. calling filter
  • 102. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 ParHashCombiner ParHashCombiner e.g. calling filter
  • 103. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 ParHashCombiner 0 1 4 ParHashCombiner 5 7 9
  • 104. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 ParHashCombiner 0 1 4 ParHashCombiner 5 9 5 7 0 1 4 7 9
  • 105. Parallel hash tables ParHashMap ParHashCombiner ParHashCombiner How to merge? 5 7 0 1 4 9
  • 106. 5 7 8 9 1 4 0 Parallel hash tables buckets! ParHashCombiner ParHashCombiner ParHashMap 2 0 = 00002 1 = 00012 4 = 01002
  • 107. Parallel hash tables ParHashCombiner ParHashCombiner 0 1 4 9 7 5 combine
  • 108. Parallel hash tables ParHashCombiner ParHashCombiner 9 7 5 0 1 4 ParHashCombiner no copying!
  • 109. Parallel hash tables 9 7 5 0 1 4 ParHashCombiner
  • 110. Parallel hash tables 9 7 5 0 1 4 ParHashMap
  • 111. Custom combiners for methods returning custom collections new ParString(txt).filter(_ != ' ') What is the return type here?
  • 112. Custom combiners for methods returning custom collections new ParString(txt).filter(_ != ' ') creates a ParVector!
  • 113. Custom combiners for methods returning custom collections new ParString(txt).filter(_ != ' ') creates a ParVector! class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) ...
  • 114. Custom combiners for methods returning custom collections class ParString(val str: String) extends immutable.ParSeq[Char] with ParSeqLike[Char, ParString, WrappedString] { def apply(i: Int) = str.charAt(i) ...
  • 115. Custom combiners for methods returning custom collections class ParString(val str: String) extends immutable.ParSeq[Char] with ParSeqLike[Char, ParString, WrappedString] { def apply(i: Int) = str.charAt(i) ... protected[this] override def newCombiner : Combiner[Char, ParString]
  • 116. Custom combiners for methods returning custom collections class ParString(val str: String) extends immutable.ParSeq[Char] with ParSeqLike[Char, ParString, WrappedString] { def apply(i: Int) = str.charAt(i) ... protected[this] override def newCombiner = new ParStringCombiner
  • 117. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] {
  • 118. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0
  • 119. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 size
  • 120. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) size
  • 121. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) size chunks
  • 122. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last size chunks
  • 123. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last size lastc chunks
  • 124. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last def +=(elem: Char) = { lastc += elem size += 1 this }
  • 125. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last def +=(elem: Char) = { lastc += elem size += 1 this } size lastc chunks +1
  • 126. Custom combiners for methods returning custom collections ... def combine[U <: Char, NewTo >: ParString] (other: Combiner[U, NewTo]) = other match { case psc: ParStringCombiner => size += psc.size chunks ++= psc.chunks lastc = chunks.last this }
  • 127. Custom combiners for methods returning custom collections ... def combine[U <: Char, NewTo >: ParString] (other: Combiner[U, NewTo]) lastc chunks lastc chunks
  • 128. Custom combiners for methods returning custom collections ... def result = { val rsb = new StringBuilder for (sb <- chunks) rsb.append(sb) new ParString(rsb.toString) } ...
  • 129. Custom combiners for methods returning custom collections ... def result = ... lastc chunks StringBuilder
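The pieces of `ParStringCombiner` shown across the preceding slides can be assembled into one self-contained sketch. To keep it runnable without the parallel collections library, it uses a hypothetical `SimpleCombiner` trait in place of `scala.collection.parallel.Combiner` and produces a plain `String` rather than a `ParString`. The helper `removeChar` is also an assumption, added only to exercise the combiner.

```scala
import scala.collection.mutable.ArrayBuffer

object StringCombinerSketch {
  // Hypothetical, simplified stand-in for the library's Combiner trait.
  trait SimpleCombiner[Elem, To] {
    def +=(elem: Elem): this.type
    def combine(that: SimpleCombiner[Elem, To]): SimpleCombiner[Elem, To]
    def result: To
  }

  final class StringCombiner extends SimpleCombiner[Char, String] {
    var size = 0
    val chunks = ArrayBuffer(new StringBuilder)
    var lastc = chunks.last

    // Appends to the last chunk only.
    def +=(elem: Char): this.type = { lastc.append(elem); size += 1; this }

    // Links the other combiner's chunks; characters are not copied here.
    def combine(that: SimpleCombiner[Char, String]): SimpleCombiner[Char, String] =
      if (that eq this) this
      else that match {
        case other: StringCombiner =>
          size += other.size
          chunks ++= other.chunks
          lastc = chunks.last
          this
        case _ =>
          throw new UnsupportedOperationException("can only combine with StringCombiner")
      }

    // Sequential: concatenates all chunks into one string.
    def result: String = {
      val rsb = new StringBuilder
      for (sb <- chunks) rsb.append(sb.toString)
      rsb.toString
    }
  }

  // Hypothetical usage: filter each half separately, then combine.
  def removeChar(txt: String, c: Char): String = {
    def run(part: String): StringCombiner = {
      val cb = new StringCombiner
      part.foreach(ch => if (ch != c) cb += ch)
      cb
    }
    val (left, right) = txt.splitAt(txt.length / 2)
    run(left).combine(run(right)).result
  }
}
```

In this sketch the two halves are processed one after the other. In the parallel setting each half would be handled by a separate task, and only `combine` and `result` would run once the tasks have finished.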
  • 130. Custom combiners for methods expecting implicit builder factories // only for big boys ... with GenericParTemplate[T, ParColl] ... object ParColl extends ParFactory[ParColl] { implicit def canCombineFrom[T] = new GenericCanCombineFrom[T] ...
  • 131. Custom combiners performance measurement txt.filter(_ != ' ') new ParString(txt).filter(_ != ' ')
  • 132. Custom combiners performance measurement txt.filter(_ != ' ') 106 ms new ParString(txt).filter(_ != ' ')
  • 133. Custom combiners performance measurement txt.filter(_ != ' ') 106 ms new ParString(txt).filter(_ != ' ') 1 core 125 ms
  • 134. Custom combiners performance measurement txt.filter(_ != ' ') 106 ms new ParString(txt).filter(_ != ' ') 1 core 125 ms 2 cores 81 ms
  • 135. Custom combiners performance measurement txt.filter(_ != ' ') 106 ms new ParString(txt).filter(_ != ' ') 1 core 125 ms 2 cores 81 ms 4 cores 56 ms
  • 136. Custom combiners performance measurement 1 core 125 ms 2 cores 81 ms 4 cores 56 ms [plot: t/ms against number of processors (1, 2, 4)]
  • 137. Custom combiners performance measurement 1 core 125 ms 2 cores 81 ms 4 cores 56 ms [plot: t/ms against number of processors (1, 2, 4)] def result (not parallelized)
  • 138. Custom combiners tricky! • two-step evaluation – parallelize the result method in combiners • efficient merge operation – binomial heaps, ropes, etc. • concurrent data structures – non-blocking scalable insertion operation – we’re working on this
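One way to read "parallelize the result method" for a chunked string combiner is sketched below. It is an assumption about how this could be done, not the author's code: `ParResultSketch` and `parResult` are hypothetical names, and `Future`s stand in for the library's task scheduler. The first step computes each chunk's offset from the chunk lengths; the second step copies the chunks into one preallocated array concurrently.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object ParResultSketch {
  // Hypothetical helper: concatenates chunks with one copy task per chunk.
  def parResult(chunks: Vector[StringBuilder]): String = {
    // Step 1: prefix sums of the chunk lengths give every chunk its offset.
    val offsets = chunks.scanLeft(0)(_ + _.length)
    val out = new Array[Char](offsets.last)

    // Step 2: chunks write to disjoint regions, so the copies can run concurrently.
    val copies = chunks.zip(offsets).map { case (sb, off) =>
      Future { sb.copyToArray(out, off); () }
    }
    Await.result(Future.sequence(copies), Duration.Inf)
    new String(out)
  }
}
```

Whether this pays off depends on chunk sizes. For many tiny chunks, the cost of scheduling tasks is likely to outweigh the cost of copying.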
  • 139. Future work coming up • concurrent data structures • more efficient vectors • custom task pools • user defined scheduling • parallel bulk in-place modifications
  • 140. Thank you! Examples at: git://github.com/axel22/sd.git