The slides from the talk I gave at Oracle III #JuevesTecnológicos in Madrid.

A review of how parallel streams work in Java 8, and some considerations to keep in mind to get the best performance from concurrent data processing in #Java8.

- 1. ParallelStreams Concurrent data processing in Java 8 David Gómez G. @dgomezg dgomezg@autentia.com
- 2. Do you remember? use parallelStream() for (int i = 0; i < 100; i++) { long start = System.currentTimeMillis(); List<Integer> even = numbers.parallelStream() .filter(n -> n % 2 == 0) .sorted() .collect(toList()); System.out.printf( "%d elements computed in %5d msecs with %d threads\n", even.size(), System.currentTimeMillis() - start, Thread.activeCount()); } 4999299 elements computed in 225 msecs with 9 threads 4999299 elements computed in 230 msecs with 9 threads 4999299 elements computed in 250 msecs with 9 threads @dgomezg
- 3. Previously on…
- 4. Streams? What’s that?
- 5. A Stream is… A convenient way to iterate over collections declaratively List<Integer> numbers = new ArrayList<Integer>(); for (int i= 0; i < 100 ; i++) { numbers.add(i); } List<Integer> evenNumbers = numbers.stream() .filter(n -> n % 2 == 0) .collect(toList()); @dgomezg
- 6. Anatomy of a Stream: Source → Intermediate operations (filter, map, order, function) → Final operation. Together they form the pipeline. @dgomezg
- 7. Iterating a Stream List<Integer> evenNumbers = numbers.stream() .filter(n -> n % 2 == 0) .collect(toList()); Internal iteration - No manual Iterator handling - Concise - Fluent API: chain sequence processing. Elements are computed only when needed @dgomezg
- 8. Iterating a Stream List<Integer> evenNumbers = numbers.parallelStream() .filter(n -> n % 2 == 0) .collect(toList()); Easy parallelism - Concurrency is hard to get right! - Uses Fork/Join - Processing steps should be - stateless - independent @dgomezg
- 9. Parallel Streams use stream() List<Integer> numbers = new ArrayList<>(); for (int i= 0; i < 10_000_000 ; i++) { numbers.add((int)Math.round(Math.random()*100)); } /* This will use just a single thread */ Stream<Integer> evenNumbers = numbers.stream(); or parallelStream() /* Automatically selects the number of threads */ Stream<Integer> evenNumbers = numbers.parallelStream(); @dgomezg
- 10. Let’s test it use stream() for (int i = 0; i < 100; i++) { long start = System.currentTimeMillis(); List<Integer> even = numbers.stream() .filter(n -> n % 2 == 0) .sorted() .collect(toList()); System.out.printf( "%d elements computed in %5d msecs with %d threads\n", even.size(), System.currentTimeMillis() - start, Thread.activeCount()); } 5001983 elements computed in 828 msecs with 2 threads 5001983 elements computed in 843 msecs with 2 threads 5001983 elements computed in 675 msecs with 2 threads 5001983 elements computed in 795 msecs with 2 threads @dgomezg
- 11. Going parallel use parallelStream() for (int i = 0; i < 100; i++) { long start = System.currentTimeMillis(); List<Integer> even = numbers.parallelStream() .filter(n -> n % 2 == 0) .sorted() .collect(toList()); System.out.printf( "%d elements computed in %5d msecs with %d threads\n", even.size(), System.currentTimeMillis() - start, Thread.activeCount()); } 4999299 elements computed in 225 msecs with 9 threads 4999299 elements computed in 230 msecs with 9 threads 4999299 elements computed in 250 msecs with 9 threads @dgomezg
- 12. Previously on… http://www.slideshare.net/dgomezg/streams-en-java-8
- 13. Parallelism Under the hood
- 14. Fork/Join Framework Proposed by Doug Lea "a style of parallel programming in which problems are solved by (recursively) splitting them into subtasks that are solved in parallel." Available since Java 7 Used by ParallelStreams
- 15. The F/J algorithm Result solve(Problem problem) { if (problem is small) directly solve problem else { split problem into independent parts fork new subtasks to solve each part join all subtasks compose result from subresults } } as proposed by Doug Lea
- 16. ForkJoinPool ExecutorService implementation that • has a defined number of Workers (threads) • executes ForkJoinTasks • submitted by execute(ForkJoinTask task) • or by invoke(ForkJoinTask task)
- 17. ForkJoinTask Abstract class that represents a task to be run concurrently. Every ForkJoinTask can be split (if it is not small enough) and solved recursively. Two concrete implementations • RecursiveAction if not returning a value • RecursiveTask if returning a value
- 18. ForkJoinWorkerThread Any of the threads created by the ForkJoinPool. Executes ForkJoinTasks. Each one has a Deque for tasks (allows task stealing)
- 19. ForkJoinWorkerThread Result solve(Problem problem) { if (problem is small) directly solve problem else { split problem into independent parts fork new subtasks to solve each part join all subtasks compose result from subresults } } the F/J algorithm plus Task Stealing.
- 20. Fork/Join. When to use? For computations that can be split into smaller tasks, aka ‘divide and conquer’ algorithms. Independent. Reduction with no contention.
- 21. ParallelStreams in action!
- 22. ParallelStreams for (int i = 0; i < 100; i++) { long start = System.currentTimeMillis(); List<Integer> even = numbers.parallelStream() .filter(n -> n % 2 == 0) .sorted() .collect(toList()); System.out.printf( "%d elements computed in %5d msecs with %d threads\n", even.size(), System.currentTimeMillis() - start, Thread.activeCount()); } 4999299 elements computed in 225 msecs with 9 threads 4999299 elements computed in 230 msecs with 9 threads 4999299 elements computed in 250 msecs with 9 threads
- 23. Thread.activeCount not accurate for (int i = 0; i < 100; i++) { long start = System.currentTimeMillis(); List<Integer> even = numbers.parallelStream() .filter(n -> n % 2 == 0) .sorted() .collect(toList()); System.out.printf( "%d elements computed in %5d msecs with %d threads\n", even.size(), System.currentTimeMillis() - start, Thread.activeCount()); } Thread.activeCount() does not show the effective number of threads processing the stream
- 24. A better way to count the threads involved Set<String> workerThreadNames = ConcurrentHashMap.newKeySet(); for (int i = 0; i < 100; i++) { long start = System.currentTimeMillis(); List<Integer> even = numbers.parallelStream() .filter(n -> n % 2 == 0) .peek(n -> workerThreadNames.add( Thread.currentThread().getName())) .sorted() .collect(toList()); System.out.printf( "%d elements computed in %5d msecs with %d threads\n", even.size(), System.currentTimeMillis() - start, workerThreadNames.size()); }
- 25. Threads usage ParallelStreams use the common ForkJoinPool. Number of worker threads configured with -Djava.util.concurrent.ForkJoinPool.common.parallelism=n Useful to keep CPU parallelism under control… …but …
- 26. Limiting parallelism for (int i = 0; i < 100; i++) { long start = System.currentTimeMillis(); List<Integer> even = numbers.parallelStream() .filter(n -> n % 2 == 0) .peek(n -> workerThreadNames.add( Thread.currentThread().getName())) .sorted() .collect(toList()); System.out.printf( "%d elements computed in %5d msecs with %d threads\n", even.size(), System.currentTimeMillis() - start, workerThreadNames.size()); } -Djava.util.concurrent.ForkJoinPool.common.parallelism=4 5001069 elements computed in 269 msecs with 5 threads WTF
- 27. Limiting parallelism for (int i = 0; i < 100; i++) { long start = System.currentTimeMillis(); List<Integer> even = numbers.parallelStream() .filter(n -> n % 2 == 0) .peek(n -> workerThreadNames.add( Thread.currentThread().getName())) .sorted() .collect(toList()); System.out.printf( "%d elements computed in %5d msecs with %d threads\n", even.size(), System.currentTimeMillis() - start, workerThreadNames.size()); } System.out.println("credits to threads: " + workerThreadNames); 5001069 elements computed in 269 msecs with 5 threads credits to threads: ForkJoinPool.commonPool-worker-0, ForkJoinPool.commonPool-worker-1, ForkJoinPool.commonPool-worker-2, ForkJoinPool.commonPool-worker-3, main WTF
- 28. Threads Involved in ParallelStream ParallelStreams use the common ForkJoinPool Thread invoking ParallelStream also used as Worker Caveats: •ParallelStream processing is synchronous for invoking thread •Other Threads using common ForkJoinPool could be affected
- 29. ParallelStream Hack ParallelStream can be forced to use a custom ForkJoinPool ForkJoinPool forkJoinPool = new ForkJoinPool(4); long start = System.currentTimeMillis(); numbers.parallelStream() .filter(n -> n % 2 == 0) .sorted() .collect(toList());
- 30. ParallelStream Hack ParallelStream can be forced to use a custom ForkJoinPool ForkJoinPool forkJoinPool = new ForkJoinPool(4); long start = System.currentTimeMillis(); ForkJoinTask<List<Integer>> task = forkJoinPool.submit(() -> { return numbers.parallelStream() .filter(n -> n % 2 == 0) .sorted() .collect(toList()); } ); List<Integer> even = task.get();
- 31. ParallelStream Hack ParallelStream can be forced to use a custom ForkJoinPool ForkJoinPool forkJoinPool = new ForkJoinPool(4); ForkJoinTask<List<Integer>> task = forkJoinPool.submit(() -> { return numbers.parallelStream() .filter(n -> n % 2 == 0) .sorted() .collect(toList()); } ); List<Integer> even = task.get(); Task submitted in 1 msecs 5000805 elements computed in 328 msecs with 4 threads
- 32. ParallelStream Hack benefits A custom ExecutorService • Does not affect other ParallelStreams • Does not affect common ForkJoinPool users • Reduces unpredictable latency due to other load on the common ForkJoinPool • Invoking thread not used as worker (async parallel process)
- 33. Problems derived from Common ForkJoinPool
- 34. Blocking for IO If the first URLs get stuck on a connection timeout, overall performance could be affected Stream<String> urls = Files.lines(Paths.get("urlsToCheck.txt")); List<String> errors = urls.parallel().filter(url -> { /* Connect to URL and wait for 200 response or timeout */ return true; }).collect(toList());
- 35. Nested parallelStreams Outer parallelStream could exhaust ForkJoin Workers: long start = System.currentTimeMillis(); IntStream.range(0, 10_000).parallel() .forEach(i -> { results[i][0] = (int) Math.round(Math.random() * 100); IntStream.range(1, 9_999) .parallel().forEach((int j) -> results[i][j] = (int) Math.round(Math.random() * 1000)); }); Process finalized in 22974 msecs Process finalized in 22575 msecs Process finalized in 22606 msecs
- 36. Nested parallelStreams Outer parallelStream could exhaust ForkJoin Workers: long start = System.currentTimeMillis(); IntStream.range(0, 10_000).parallel() .forEach(i -> { results[i][0] = (int) Math.round(Math.random() * 100); IntStream.range(1, 9_999) .sequential().forEach((int j) -> results[i][j] = (int) Math.round(Math.random() * 1000)); }); Process finalized in 12491 msecs Process finalized in 12589 msecs Process finalized in 12798 msecs
- 37. Other performance problems
- 38. Too much Auto(un)boxing Unboxing and boxing of Integers in every filter call List<Integer> even = numbers.parallelStream() .filter(n -> n % 2 == 0) .sorted() .collect(toList()); 4999464 elements computed in 290 msecs with 8 threads 4999464 elements computed in 276 msecs with 8 threads 4999464 elements computed in 257 msecs with 8 threads 4999464 elements computed in 265 msecs with 8 threads
- 39. Less Auto(un)boxing Unbox once with mapToInt() and box once with boxed(), instead of unboxing in every filter call List<Integer> even = numbers.parallelStream() .mapToInt(n -> n) .filter(n -> n % 2 == 0) .sorted() .boxed() .collect(toList()); 4999460 elements computed in 160 msecs with 8 threads 4999460 elements computed in 243 msecs with 8 threads 4999460 elements computed in 144 msecs with 8 threads 4999460 elements computed in 140 msecs with 8 threads
- 40. Conclusions
- 41. Conclusions ParallelStreams ease concurrent processing but: • Understand how they work • Don’t abuse the default common ForkJoinPool • Don’t use them when blocking on IO • Or use a custom ForkJoinPool • Avoid unnecessary autoboxing • Don’t add contention or synchronisation • Be careful with nested parallel streams • Use method references when sorting
- 42. Thank You. @dgomezg dgomezg@autentia.com
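
The Fork/Join pseudocode on slides 15 and 19 can be made concrete with a RecursiveTask (slide 17). The following is only a minimal sketch, not code from the talk: the class names `ForkJoinEvenCount` and `EvenCounter`, the `countEven` helper, and the threshold of 1,000 elements are assumptions chosen for illustration. It counts even numbers in an array by splitting the range in half until a slice is "small", solving small slices directly, and combining the partial counts.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinEvenCount {

    // Hypothetical task: counts the even values in data[from, to).
    static class EvenCounter extends RecursiveTask<Integer> {
        // Assumed cut-off for "problem is small"; tune for real workloads.
        private static final int THRESHOLD = 1_000;
        private final int[] data;
        private final int from;
        private final int to;

        EvenCounter(int[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Integer compute() {
            if (to - from <= THRESHOLD) {
                // Small problem: solve it directly.
                int count = 0;
                for (int i = from; i < to; i++) {
                    if (data[i] % 2 == 0) {
                        count++;
                    }
                }
                return count;
            }
            // Split into two independent halves.
            int mid = (from + to) >>> 1;
            EvenCounter left = new EvenCounter(data, from, mid);
            EvenCounter right = new EvenCounter(data, mid, to);
            left.fork();                       // run the left half asynchronously
            int rightCount = right.compute();  // solve the right half in this thread
            return left.join() + rightCount;   // join and compose the result
        }
    }

    // Runs the task on a dedicated pool with the given number of workers.
    static int countEven(int[] data, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new EvenCounter(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println(countEven(data, 4)); // prints 5000
    }
}
```

Forking one half and computing the other in the current thread keeps the calling worker busy instead of leaving it idle while it waits on both subtasks.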

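The custom-pool hack from slides 29 to 31 and the thread-name counting from slide 24 can be combined into one runnable sketch. This is an illustration under assumptions rather than the speaker's exact code: the class name `CustomPoolDemo` and the `evenSorted` helper are hypothetical, and it calls `join()` instead of the slide's `get()` only to avoid checked exceptions. Note that running a parallel stream inside a task submitted to a custom ForkJoinPool relies on observed behaviour of the JDK implementation, not on a documented guarantee.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CustomPoolDemo {

    // Filters and sorts the even numbers inside a dedicated pool and records
    // the name of every thread that processed an element.
    static List<Integer> evenSorted(List<Integer> numbers, int parallelism,
                                    Set<String> threadNames) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.submit(() -> {
                // The stream is started from a worker of the custom pool,
                // so its subtasks are forked into that same pool.
                return numbers.parallelStream()
                        .filter(n -> n % 2 == 0)
                        .peek(n -> threadNames.add(Thread.currentThread().getName()))
                        .sorted()
                        .collect(Collectors.toList());
            }).join(); // blocks the caller until the result is ready
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.range(0, 100_000)
                .boxed()
                .collect(Collectors.toList());
        // Thread-safe set: many workers add their names concurrently.
        Set<String> workerThreadNames = ConcurrentHashMap.newKeySet();
        List<Integer> even = evenSorted(numbers, 4, workerThreadNames);
        System.out.printf("%d elements computed by threads %s%n",
                even.size(), workerThreadNames);
    }
}
```

The thread names printed depend on the JVM and the machine, so no sample output is shown here; the point of interest is whether the invoking thread and the common pool workers appear in the set.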