2. What is a stream
A stream is not a collection, a sequence, or a stream of objects
A stream is an abstraction that holds zero or more values
Not (necessarily) a collection: values might not be stored anywhere
Not (necessarily) a sequence: order might not matter
Values, not objects: avoids mutation and side effects
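A minimal sketch of the "values might not be stored anywhere" point: the stream below is generated on demand from a range, with no backing collection. The class and method names are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example class; not part of the original slides.
class GeneratedStreamExample {

    // The values 1..count are produced lazily by the range source.
    // Nothing is stored until the terminal collect() builds the list.
    static List<Integer> firstSquares(int count) {
        return IntStream.rangeClosed(1, count)
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstSquares(5)); // [1, 4, 9, 16, 25]
    }
}
```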
3. Pipelines
A stream source
Zero or more intermediate operations
A terminal operation
collection.stream()
.filter(…)
.map(…)
.collect(…);
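The pipeline shape above could be filled in as follows. This is only a sketch: the word list, the length filter, and the names `PipelineExample` and `lengthsOfLongWords` are assumptions chosen for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; not part of the original slides.
class PipelineExample {

    static List<Integer> lengthsOfLongWords(List<String> words, int minLength) {
        return words.stream()                           // stream source
                .filter(w -> w.length() >= minLength)   // intermediate operation
                .map(String::length)                    // intermediate operation
                .collect(Collectors.toList());          // terminal operation
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "stream", "of", "values");
        System.out.println(lengthsOfLongWords(words, 3)); // [6, 6]
    }
}
```

Nothing runs until the terminal operation is invoked; the intermediate operations only describe the work.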
4. Parallel Streams
Sources start with stream(), parallelStream(), or another stream factory
Can be switched using parallel() and sequential()
Parallel vs sequential is a property of the entire pipeline
Can’t switch between parallel and sequential in the middle
Last one wins
Parallel makes it auto-magically go faster?
NO
collection.stream()
.filter(...)
.parallel()
.map(...)
.sequential()
.collect(...)
The entire pipeline runs sequentially
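One way to check the "last one wins" rule is to ask the pipeline for its mode with isParallel() before running it. The sketch below assumes a small list of integers; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Stream;

// Hypothetical example class; not part of the original slides.
class LastOneWinsExample {

    // parallel() followed by sequential(): the later call decides.
    static boolean endsUpParallel(List<Integer> source) {
        Stream<Integer> pipeline = source.stream()
                .filter(x -> x > 0)
                .parallel()
                .map(x -> x * 2)
                .sequential();
        return pipeline.isParallel();
    }

    // Same calls in the opposite order: now parallel() is last.
    static boolean endsUpParallelReversed(List<Integer> source) {
        Stream<Integer> pipeline = source.stream()
                .filter(x -> x > 0)
                .sequential()
                .map(x -> x * 2)
                .parallel();
        return pipeline.isParallel();
    }

    public static void main(String[] args) {
        List<Integer> source = List.of(1, 2, 3);
        System.out.println(endsUpParallel(source));         // false
        System.out.println(endsUpParallelReversed(source)); // true
    }
}
```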
5. Parallel stream considerations
Parallel and sequential streams should give the same result
Parallelism can lead to nondeterminism, which is bad
Encounter order vs processing order
Stateful vs stateless : side effects
Accumulation vs Reduction
Reduction : Identity and associativity
Explicit nondeterminism can speed things up
Parallel has overhead and might also slow things down
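A small sketch contrasting reduction with accumulation through a side effect. It assumes a list of integers doubled in parallel; the names are hypothetical. The side-effect version uses a synchronized list so that it does not corrupt the list, but elements still arrive in processing order, not encounter order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example class; not part of the original slides.
class ParallelResultExample {

    // Reduction: collect() builds partial results per partition and
    // merges them, so the result follows the source's encounter order.
    static List<Integer> doubledByReduction(List<Integer> input) {
        return input.parallelStream()
                .map(x -> x * 2)
                .collect(Collectors.toList());
    }

    // Accumulation via side effect: every element is added, but in
    // whatever order the worker threads happen to process them.
    static List<Integer> doubledBySideEffect(List<Integer> input) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        input.parallelStream()
                .map(x -> x * 2)
                .forEach(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.rangeClosed(1, 1000)
                .boxed()
                .collect(Collectors.toList());
        List<Integer> sequential = input.stream()
                .map(x -> x * 2)
                .collect(Collectors.toList());

        System.out.println(doubledByReduction(input).equals(sequential)); // true
        System.out.println(doubledBySideEffect(input).size());            // 1000
    }
}
```

The side-effect list holds the same elements as the sequential result, but its order may differ from run to run.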
10. Identity Value
The starting value of each partition in parallel stream
Becomes the result if the stream is empty
The value must be correct
It must really be a VALUE (immutable)
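A sketch of why the identity has to be correct, assuming integer sum and product as the reductions. The names are hypothetical, and the wrong identity of 10 is chosen only to make the error visible.

```java
import java.util.List;

// Hypothetical example class; not part of the original slides.
class IdentityExample {

    // 0 is the identity for addition: 0 + x == x.
    static int sum(List<Integer> values) {
        return values.parallelStream().reduce(0, Integer::sum);
    }

    // 1 is the identity for multiplication: 1 * x == x.
    static int product(List<Integer> values) {
        return values.parallelStream().reduce(1, (a, b) -> a * b);
    }

    // 10 is NOT an identity for addition. It is added at least once,
    // and in a parallel run it may be added once per partition.
    static int sumWithWrongIdentity(List<Integer> values) {
        return values.parallelStream().reduce(10, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> empty = List.of();
        List<Integer> values = List.of(1, 2, 3, 4);

        System.out.println(sum(empty));                         // 0
        System.out.println(product(empty));                     // 1
        System.out.println(sum(values));                        // 10
        System.out.println(sumWithWrongIdentity(values) == 10); // false
    }
}
```

How far off the wrong-identity result is depends on how the source gets split, so it can differ between machines.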
12. Where are the threads
Stream workload is split and dispatched to the common fork-join pool
Control over concurrency is explicitly opaque in the API
Common pool controlled by system properties
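A sketch for peeking at where the work runs. It records the names of the threads that process a parallel stream and prints the common pool's parallelism. The class and method names are hypothetical, and the thread names seen will vary by machine and run.

```java
import java.util.Set;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example class; not part of the original slides.
class CommonPoolExample {

    // Collects the name of every thread that handled at least one element.
    // The calling thread usually takes part alongside the pool workers.
    static Set<String> threadNamesUsed(int elementCount) {
        return IntStream.range(0, elementCount)
                .parallel()
                .mapToObj(i -> Thread.currentThread().getName())
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println("available processors: "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("common pool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());
        System.out.println("threads used: " + threadNamesUsed(10_000));
    }
}
```

One of the system properties in question is `java.util.concurrent.ForkJoinPool.common.parallelism`, which can be set at JVM startup, for example with `-Djava.util.concurrent.ForkJoinPool.common.parallelism=4`.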
13. When to go parallel
Parallel streams have startup overhead
Typically 1000 microseconds
If your computation is shorter, do not even bother
Consider parallel if N * Q >= 10,000
N = number of elements
Q = cost per element
Assumptions
Element processing is independent and the source is splittable
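The N * Q rule of thumb can be written down as a tiny check. This is only a sketch: the helper name is hypothetical, and Q is treated as an abstract cost estimate per element because the slides do not fix its unit.

```java
// Hypothetical example class; not part of the original slides.
class ParallelHeuristicExample {

    // Threshold taken from the rule of thumb above.
    static final long THRESHOLD = 10_000;

    // n = number of elements, q = estimated cost per element
    // (abstract units; an assumption of this sketch).
    static boolean worthConsideringParallel(long n, long q) {
        return n * q >= THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(worthConsideringParallel(100, 10));      // false
        System.out.println(worthConsideringParallel(1_000_000, 1)); // true
    }
}
```

Passing the check only means parallel is worth measuring; the independence and splittability assumptions still have to hold.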