This document discusses Java 8 streams and how they are implemented under the hood. Streams use a pipeline concept where the output of one unit becomes the input of the next. Streams are built on functional interfaces and lambdas introduced in Java 8. Spliterators play an important role in enabling parallel processing of streams by splitting data sources into multiple chunks that can be processed independently in parallel. The reference pipeline implementation uses a linked list of processing stages to represent the stream operations and evaluates whether operations can be parallelized to take advantage of multiple threads or cores.
2. What it's all about
This presentation gives more insight into how the Streams API is implemented
under the hood.
It should help both library developers and application developers understand what they are
using.
3. Design of the Streams API
● The building block of a stream is a computer science concept called a pipeline.
● In a pipeline, the output of one unit of work is the input of the next.
● The Streams API uses this concept to provide powerful interfaces for manipulating the data stored in
collections.
● The Streams API is built around functional interfaces, newly introduced in Java 8 in the java.util.function package.
● Functional interfaces, together with lambdas, provide a great opportunity for behaviour-driven
code, where behaviour is passed around as a value.
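As a minimal sketch of the pipeline idea, the example below chains a filter stage and a map stage, so that the output of each unit feeds the next. The class name PipelineDemo, the helper longNamesUpperCased and the sample names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipelineDemo {
    // Hypothetical helper: keeps names longer than three characters and upper-cases them.
    static List<String> longNamesUpperCased(List<String> names) {
        return names.stream()                     // source: the collection
                .filter(n -> n.length() > 3)      // unit 1: a Predicate
                .map(String::toUpperCase)         // unit 2: a Function fed by unit 1
                .collect(Collectors.toList());    // terminal operation gathers the result
    }

    public static void main(String[] args) {
        // prints [BRIAN, CLEO]
        System.out.println(longNamesUpperCased(Arrays.asList("Ann", "Brian", "Cleo", "Di")));
    }
}
```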
4. Functional Interface
● A functional interface has exactly one abstract method.
● Other methods can be declared, but they must be default (or static) methods with an implementation.
● A lambda can be used only where a functional interface is expected.
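The rules above can be sketched with a small hand-written interface. Greeter and its methods are hypothetical names used only for this illustration; they are not part of the JDK.

```java
class FunctionalInterfaceDemo {
    // Hypothetical interface. The annotation asks the compiler to reject it
    // unless it has exactly one abstract method.
    @FunctionalInterface
    interface Greeter {
        String greet(String name);                 // the single abstract method

        default Greeter twice() {                  // a default method must have a body
            return name -> greet(greet(name));
        }
    }

    static String demo() {
        Greeter hello = name -> "Hello " + name;   // the lambda implements greet()
        return hello.twice().greet("Java");
    }

    public static void main(String[] args) {
        System.out.println(demo());                // prints Hello Hello Java
    }
}
```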
5. java.util.function.*
Function<T, R>
Takes an input and gives an output.
R apply(T t);
default <V> Function<V, R> compose(Function<? super V, ? extends T> before)
Evaluates before first, then applies this function to its result.
default <V> Function<T, V> andThen(Function<? super R, ? extends V> after)
Evaluates this function first, then passes its result to after.apply().
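The difference between compose and andThen is only the order of evaluation. A minimal sketch, where ADD_ONE and TIMES_TWO are illustrative functions chosen for this example:

```java
import java.util.function.Function;

class FunctionCompositionDemo {
    static final Function<Integer, Integer> ADD_ONE = x -> x + 1;
    static final Function<Integer, Integer> TIMES_TWO = x -> x * 2;

    // compose: the argument function (TIMES_TWO) runs first, then ADD_ONE.
    static int composed(int x) {
        return ADD_ONE.compose(TIMES_TWO).apply(x);
    }

    // andThen: ADD_ONE runs first, then the argument function (TIMES_TWO).
    static int chained(int x) {
        return ADD_ONE.andThen(TIMES_TWO).apply(x);
    }

    public static void main(String[] args) {
        System.out.println(composed(5));   // (5 * 2) + 1 = 11
        System.out.println(chained(5));    // (5 + 1) * 2 = 12
    }
}
```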
6. More functions
BiFunction<T, U, R>
Takes two inputs and gives an output.
Its default method andThen evaluates the BiFunction and passes the output to the after
function, i.e. after.apply().
Predicate<T>
Takes an input, performs a test and returns a boolean.
and(Predicate<? super T> other)
Returns a composed predicate that represents a short-circuiting logical AND of this
predicate and another.
negate()
Returns a predicate that represents the logical negation of this predicate.
or(Predicate<? super T> other)
Returns a composed predicate that represents a short-circuiting logical OR of this predicate and another.
isEqual(Object targetRef)
Static method. Returns a predicate that tests if two arguments are equal according to Objects.equals(Object, Object).
Consumer<T>
Consumes a value and returns nothing.
Its default method andThen first calls accept on the current consumer and then after.accept().
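The sketch below exercises each of these interfaces once. The lambdas and sample values are made up for illustration; the results are collected into a list so they can be inspected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.function.Predicate;

class MoreFunctionsDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();

        // BiFunction.andThen: add the two inputs, then describe the sum.
        BiFunction<Integer, Integer, Integer> add = (a, b) -> a + b;
        BiFunction<Integer, Integer, String> addThenDescribe = add.andThen(sum -> "sum=" + sum);
        log.add(addThenDescribe.apply(2, 3));

        // Predicate composition.
        Predicate<String> nonEmpty = s -> !s.isEmpty();
        Predicate<String> shortText = s -> s.length() < 5;
        log.add(String.valueOf(nonEmpty.and(shortText).test("java")));            // both hold
        log.add(String.valueOf(nonEmpty.negate().test("java")));                  // negation
        log.add(String.valueOf(nonEmpty.negate().or(shortText).test("stream")));  // neither holds
        log.add(String.valueOf(Predicate.isEqual("java").test("java")));          // Objects.equals

        // Consumer.andThen: the current consumer accepts first, then the next one.
        Consumer<String> first = s -> log.add("first:" + s);
        Consumer<String> second = s -> log.add("second:" + s);
        first.andThen(second).accept("x");

        return log;
    }

    public static void main(String[] args) {
        // prints [sum=5, true, false, false, true, first:x, second:x]
        System.out.println(run());
    }
}
```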
7. Spliterator
● The Spliterator interface is a means of achieving pipelined access to a data structure.
● A Spliterator (split iterator) can advance through the elements one at a time with tryAdvance() or in bulk with forEachRemaining(),
and can split off part of its elements into another Spliterator with trySplit().
● It reports a set of characteristics, represented by bit masks, so that a client of the Spliterator
can adapt its behaviour to them.
● Some of these are ORDERED, DISTINCT, SORTED, CONCURRENT and IMMUTABLE.
● A characteristic such as ORDERED means that traversal and splitting follow the encounter order.
● A Spliterator that reports neither CONCURRENT nor IMMUTABLE is expected to detect concurrent modification of its source,
typically by throwing ConcurrentModificationException on a best-effort basis.
● A late-binding Spliterator binds to the elements of the collection when tryAdvance(), trySplit() or a size query is first called, not when it is created.
● A Spliterator that is not late-binding binds to the source when it is constructed or on the first invocation of any of its methods.
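The three ways of moving through a Spliterator can be sketched as follows. The sample list and the labels written to the result are assumptions for illustration; where exactly trySplit() divides the elements depends on the Spliterator implementation, so the sketch does not rely on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class SpliteratorDemo {
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        List<String> data = Arrays.asList("a", "b", "c", "d", "e", "f");
        Spliterator<String> rest = data.spliterator();

        // Characteristics are bit masks that the client can query.
        seen.add("ordered=" + rest.hasCharacteristics(Spliterator.ORDERED));

        // Advance by a single element.
        rest.tryAdvance(s -> seen.add("single:" + s));

        // Split: the returned Spliterator covers a prefix of the remaining
        // elements; null means this Spliterator could not be split.
        Spliterator<String> prefix = rest.trySplit();
        if (prefix != null) {
            prefix.forEachRemaining(s -> seen.add("prefix:" + s));
        }

        // Bulk traversal of whatever is left.
        rest.forEachRemaining(s -> seen.add("rest:" + s));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Here the two halves are traversed one after the other on a single thread; a parallel framework would hand each half to a different task.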
8. Spliterator
● As an example, here is how a parallel computation framework, such as the java.util.stream package, would use a Spliterator in a
parallel computation.
● We assume that the order of processing across subtasks doesn't matter; different (forked) tasks may further split and process
elements concurrently in undetermined order.
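import java.util.Spliterator;
import java.util.concurrent.CountedCompleter;
import java.util.function.Consumer;

// Declared as a nested class of whichever class drives the computation.
static class ParEach<T> extends CountedCompleter<Void> {
    final Spliterator<T> spliterator;
    final Consumer<T> action;
    final long targetBatchSize;

    ParEach(ParEach<T> parent, Spliterator<T> spliterator,
            Consumer<T> action, long targetBatchSize) {
        super(parent);
        this.spliterator = spliterator; this.action = action;
        this.targetBatchSize = targetBatchSize;
    }

    public void compute() {
        Spliterator<T> sub;
        // Keep splitting while the remaining chunk is larger than the target batch.
        while (spliterator.estimateSize() > targetBatchSize &&
               (sub = spliterator.trySplit()) != null) {
            addToPendingCount(1);   // one more child must complete before this task does
            new ParEach<>(this, sub, action, targetBatchSize).fork();
        }
        spliterator.forEachRemaining(action);   // process what is left in this task
        propagateCompletion();
    }
}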
9. How does the reference pipeline work?
● The basic data structure involved is a linked list of pipe stages.
● The list starts with a Head stage, which is created when the stream is created.
● When an operation such as filter or map is added, a new pipe stage is appended to the linked list.
● Operations such as filter(Predicate<? super P_OUT> predicate), map(Function<? super P_OUT, ? extends R> mapper) and
flatMap(Function<? super P_OUT, ? extends Stream<? extends R>> mapper) are intermediate stages.
● The terminal operations include reduce(final P_OUT identity, final BinaryOperator<P_OUT> accumulator),
collect(Collector<? super P_OUT, A, R> collector) and forEach(Consumer<? super P_OUT> action).
● When a terminal operation is invoked, the code checks whether the pipeline should run in parallel. It then
drives the Spliterator and applies all the stage behaviours to the elements, one after another in the sequential case.
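The linked list of stages can be sketched with a toy model. This is not the JDK's ReferencePipeline: ToyPipeline, Stage, Head and wrap are hypothetical names, the toy is sequential only, and it pushes elements from an Iterable instead of a Spliterator. It only illustrates how each operation appends a stage and how the terminal operation walks the list back to the head before any element is processed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

class ToyPipeline {
    /** One pipe stage; the previous link is what forms the linked list. */
    abstract static class Stage<IN, OUT> {
        final Stage<?, IN> previous;               // null only for the head

        Stage(Stage<?, IN> previous) { this.previous = previous; }

        /** Wraps the downstream consumer with this stage's behaviour. */
        abstract Consumer<IN> wrap(Consumer<OUT> downstream);

        /** Intermediate operation: appends a mapping stage. */
        <R> Stage<OUT, R> map(Function<? super OUT, ? extends R> mapper) {
            return new Stage<OUT, R>(this) {
                @Override
                Consumer<OUT> wrap(Consumer<R> downstream) {
                    return t -> downstream.accept(mapper.apply(t));
                }
            };
        }

        /** Intermediate operation: appends a filtering stage. */
        Stage<OUT, OUT> filter(Predicate<? super OUT> predicate) {
            return new Stage<OUT, OUT>(this) {
                @Override
                Consumer<OUT> wrap(Consumer<OUT> downstream) {
                    return t -> { if (predicate.test(t)) downstream.accept(t); };
                }
            };
        }

        /** Terminal operation: nothing runs until this is called. */
        @SuppressWarnings({"unchecked", "rawtypes"})
        void forEach(Consumer<OUT> action) {
            Consumer sink = action;
            Stage stage = this;
            while (stage.previous != null) {       // walk back towards the head
                sink = stage.wrap(sink);
                stage = stage.previous;
            }
            ((Head) stage).source.forEach(sink);   // push the elements through the chain
        }
    }

    /** The head stage only holds the data source. */
    static final class Head<T> extends Stage<T, T> {
        final Iterable<T> source;

        Head(Iterable<T> source) { super(null); this.source = source; }

        @Override
        Consumer<T> wrap(Consumer<T> downstream) { return downstream; }
    }

    static List<Integer> demo() {
        List<Integer> out = new ArrayList<>();
        new Head<>(Arrays.asList(1, 2, 3, 4, 5, 6))
                .filter(n -> n % 2 == 0)
                .map(n -> n * 10)
                .forEach(out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());                // prints [20, 40, 60]
    }
}
```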
10. Parallelisation of ops
● The stages of the pipeline can be parallelized if the Spliterator implementation supports splitting; this works best when encounter order is not required for the
execution.
● How do parallel ops work? Spliterator has a method called trySplit() which splits the data structure into two.
● These two parts can be split further if the data is large enough, and the resulting tasks are submitted to the fork/join common pool.
● The fork/join tasks are CountedCompleters; each one computes its own split, and the splits are computed in parallel.
● There is an evaluateParallel step that runs the terminal operation as fork/join tasks; how far the source is split is derived from the parallelism of the common pool.
● Parallel execution is therefore not guaranteed to pay off, because the decision is left to the developer. Careless use can
overload the common pool and reduce throughput.
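A minimal sketch of the developer's side of this choice: the same pipeline is run sequentially or in parallel depending on a flag. The class name ParallelDemo and the helper sumOfSquares are made up for illustration. Both modes give the same result; whether the parallel run is faster depends on the data size, the cost per element and how busy the common pool is, so it should be measured, not assumed.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

class ParallelDemo {
    // Hypothetical helper: sums the squares of 1..n, optionally in parallel.
    static long sumOfSquares(long n, boolean parallel) {
        LongStream range = LongStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel();   // opt in: the terminal op will use fork/join tasks
        }
        return range.map(x -> x * x).sum();
    }

    public static void main(String[] args) {
        // The common pool is shared by all parallel streams in the JVM.
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        System.out.println(sumOfSquares(1000, false));   // prints 333833500
        System.out.println(sumOfSquares(1000, true));    // prints 333833500
    }
}
```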