@crichardson
Map(), flatMap() and reduce() are your
new best friends:
Simpler collections, concurrency, and big
data
Chris ...
@crichardson
Presentation goal
How functional programming simplifies
your code
Show that
map(), flatMap() and reduce()
are r...
@crichardson
About Chris
@crichardson
About Chris
Founder of a buzzword compliant (stealthy, social, mobile,
big data, machine learning, ...) start...
@crichardson
Agenda
Why functional programming?
Simplifying collection processing
Eliminating NullPointerExceptions
Simpli...
@crichardson
What’s functional
programming?
@crichardson
It’s a programming paradigm
@crichardson
It’s a kind of programming
language
@crichardson
Functions as the building
blocks of the application
@crichardson
Functions as first class
citizens
Assign functions to variables
Store functions in fields
Use and write higher-...
@crichardson
Avoids mutable state
Use:
Immutable data structures
Single assignment variables
Some functional languages suc...
@crichardson
Why functional programming?
"the highest goal of
programming-language
design to enable good
ideas to be elega...
@crichardson
Why functional programming?
More expressive
More concise
More intuitive - solution matches problem definition
...
@crichardson
An ancient idea that has
recently become popular
@crichardson
Mathematical foundation:
λ-calculus
Introduced by
Alonzo Church in the 1930s
@crichardson
Lisp = an early functional
language invented in 1958
http://en.wikipedia.org/wiki/
Lisp_(programming_language...
@crichardson
My final year project in 1985:
Implementing SASL
sieve (p:xs) =
p : sieve [x | x <- xs, rem x p > 0];
primes =...
Mostly an Ivory Tower
technology
Lisp was used for AI
FP languages: Miranda,
ML, Haskell, ...
“Side-effects
kills kittens a...
@crichardson
http://steve-yegge.blogspot.com/2010/12/haskell-researchers-announce-discovery.html
@crichardson
But today FP is mainstream
Clojure - a dialect of Lisp
A hybrid OO/functional language
A hybrid OO/FP languag...
@crichardson
Java 8 lambda expressions
are functions
x -> x * x
x -> {
for (int i = 2; i <= Math.sqrt(x); i = i + 1) {
if (...
@crichardson
Java 8 lambdas are a
shorthand* for an anonymous
inner class
* not exactly. See http://programmers.stackexcha...
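To make the comparison concrete, here is a minimal sketch that writes the same squaring function both ways. The class and field names are hypothetical, chosen only for illustration; as the footnote says, the two forms are not exactly equivalent under the hood, but callers use them identically.

```java
import java.util.function.Function;

class LambdaVsAnonymousClass {
    // Pre-Java 8 style: an anonymous inner class implementing Function.
    static final Function<Integer, Integer> SQUARE_ANON = new Function<Integer, Integer>() {
        @Override
        public Integer apply(Integer x) {
            return x * x;
        }
    };

    // Java 8 style: a lambda targeting the same functional interface.
    static final Function<Integer, Integer> SQUARE_LAMBDA = x -> x * x;

    public static void main(String[] args) {
        System.out.println(SQUARE_ANON.apply(7));   // prints 49
        System.out.println(SQUARE_LAMBDA.apply(7)); // prints 49
    }
}
```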
@crichardson
Java 8 functional interfaces
Interface with a single abstract method
e.g. Runnable, Callable, Spring’s Transa...
@crichardson
Example Functional Interface
Function<Integer, Integer> square = x -> x * x;
BiFunction<Integer, Integer, Int...
@crichardson
Example Functional Interface
ExecutorService executor = ...;
final int x = 999;
Future<Boolean> outcome = exec...
@crichardson
Agenda
Why functional programming?
Simplifying collection processing
Eliminating NullPointerExceptions
Simpli...
@crichardson
Lots of application code
=
collection processing:
Mapping, filtering, and reducing
@crichardson
Social network example
public class Person {
enum Gender { MALE, FEMALE }
private Name name;
private LocalDat...
@crichardson
Mapping, filtering, and
reducing
public class Person {
public Set<Hometown> hometownsOfFriends() {
Set<Hometow...
@crichardson
Mapping, filtering, and
reducing
public class Person {
public Set<Person> friendOfFriends() {
Set<Person> resu...
@crichardson
Mapping, filtering, and
reducing
public class SocialNetwork {
private Set<Person> people;
...
public Set<Perso...
@crichardson
Mapping, filtering, and
reducing
public class SocialNetwork {
private Set<Person> people;
...
public int avera...
@crichardson
Problems with this style of
programming
Low level
Imperative (how to do it) NOT declarative (what to do)
Verb...
@crichardson
Java 8 streams to the rescue
A sequence of elements
“Wrapper” around a collection
Streams can also be infinite...
@crichardson
Using Java 8 streams -
mapping
class Person ..
private Set<Friend> friends = ...;
public Set<Hometown> hometo...
@crichardson
The map() function
s1 a b c d e ...
s2 f(a) f(b) f(c) f(d) f(e) ...
s2 = s1.map(f)
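A minimal sketch of the diagram above, assuming s1 is a list of integers and f squares its argument (both chosen for illustration; `MapSketch` and `mapAll` are hypothetical names):

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class MapSketch {
    // Applies f to every element of s1, preserving order: s2 = s1.map(f).
    static <A, B> List<B> mapAll(List<A> s1, Function<A, B> f) {
        return s1.stream().map(f).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> s1 = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> s2 = mapAll(s1, x -> x * x);
        System.out.println(s2); // prints [1, 4, 9, 16, 25]
    }
}
```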
@crichardson
public class SocialNetwork {
private Set<Person> people;
...
public Set<Person> peopleWithNoFriends() {
Set<P...
@crichardson
Using Java 8 streams - friend
of friends V1
class Person ..
public Set<Person> friendOfFriends() {
Set<Set<Fr...
@crichardson
Using Java 8 streams -
mapping
class Person ..
public Set<Person> friendOfFriends() {
return friends.stream()...
@crichardson
The flatMap() function
s1 a b ...
s2 f(a)0 f(a)1 f(b)0 f(b)1 f(b)2 ...
s2 = s1.flatMap(f)
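A minimal sketch of the diagram above, assuming each element is a sentence and f splits it into words (an illustrative choice; `FlatMapSketch` is a hypothetical name):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapSketch {
    // Each sentence maps to several words; flatMap() concatenates the
    // per-element streams into one flat stream instead of a stream of streams.
    static List<String> words(List<String> sentences) {
        return sentences.stream()
                .flatMap(s -> Arrays.stream(s.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(words(Arrays.asList("a b", "c d e"))); // prints [a, b, c, d, e]
    }
}
```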
@crichardson
Using Java 8 streams -
reducing
public class SocialNetwork {
private Set<Person> people;
...
public long aver...
@crichardson
The reduce() function
s1 a b c d e ...
x = s1.reduce(initial, f)
f(f(f(f(f(f(initial, a), b), c), d), e), ...)
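A minimal sketch of the fold above, assuming f is integer addition and initial is 0 (`ReduceSketch` is a hypothetical name). Note that Java's `Stream.reduce()` expects f to be associative, which addition is.

```java
import java.util.Arrays;
import java.util.List;

class ReduceSketch {
    // Folds the list left to right: f(f(f(initial, a), b), c) with f = addition.
    static int sum(List<Integer> s1) {
        return s1.stream().reduce(0, (x, y) -> x + y);
    }

    // The same fold used for an average. Dividing as double avoids integer
    // truncation; returning 0.0 for an empty list is a choice made for this sketch.
    static double average(List<Integer> s1) {
        return s1.isEmpty() ? 0.0 : (double) sum(s1) / s1.size();
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));  // prints 10
        System.out.println(average(Arrays.asList(1, 2)));    // prints 1.5
    }
}
```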
@crichardson
Newton's method for finding square
roots
public class SquareRootCalculator {
public double squareRoot(double i...
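A self-contained sketch of the same idea: an infinite, lazy stream of successive estimates, cut short by `findFirst()`. The `Result` type with its constructor is an assumption made so the example compiles, and the input is assumed to be positive (a negative input would never converge).

```java
import java.util.stream.Stream;

class NewtonSketch {
    // Immutable pair: the current estimate and whether it has converged.
    static final class Result {
        final double value;
        final boolean done;
        Result(double value, boolean done) { this.value = value; this.done = done; }
    }

    static double squareRoot(double input, double precision) {
        // seed, f(seed), f(f(seed)), ... is only evaluated as far as needed.
        return Stream.iterate(new Result(1.0, false),
                              current -> refine(current, input, precision))
                .filter(r -> r.done)
                .findFirst()
                .get()
                .value;
    }

    // One Newton step for f(v) = v * v - input.
    private static Result refine(Result current, double input, double precision) {
        double value = current.value;
        double next = value - (value * value - input) / (2 * value);
        return new Result(next, Math.abs(value - next) < precision);
    }
}
```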
@crichardson
Agenda
Why functional programming?
Simplifying collection processing
Eliminating NullPointerExceptions
Simpli...
@crichardson
Tony’s $1B mistake
“I call it my billion-dollar mistake.
It was the invention of the null
reference in 1965.....
@crichardson
Coding with null pointers
class Person
public Friend longestFriendship() {
Friend result = null;
for (Friend ...
@crichardson
Java 8 Optional<T>
A wrapper for nullable references
It has two states:
empty throws an exception if you try ...
@crichardson
Coding with optionals
class Person
public Optional<Friend> longestFriendship() {
Friend result = null;
for (F...
@crichardson
Using Optionals - better
Optional<Friend> oldestFriendship = ...;
Friend whoToCall1 = oldestFriendship.orElse...
@crichardson
Using Optional.map()
public class Person {
public Optional<Friend> longestFriendship() {
return ...;
}
public...
@crichardson
Using flatMap()
class Person
public Optional<Friend> longestFriendship() {...}
public Optional<Friend> longest...
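The Optional operations from the last few slides, collected into one small sketch. The `nickname` lookup is a hypothetical stand-in for any method that may have no result.

```java
import java.util.Optional;

class OptionalSketch {
    // Hypothetical lookup that may find nothing; returns empty instead of null.
    static Optional<String> nickname(String name) {
        return "Christopher".equals(name) ? Optional.of("Chris") : Optional.empty();
    }

    // map(): transform the value if present; orElse(): supply a default.
    static int nicknameLength(String name) {
        return nickname(name).map(String::length).orElse(0);
    }

    // flatMap(): chain a second Optional-returning call without ending up
    // with Optional<Optional<String>>.
    static Optional<String> nicknameOfNickname(String name) {
        return nickname(name).flatMap(OptionalSketch::nickname);
    }
}
```

No caller needs an explicit null check or a call to `isPresent()`/`get()`.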
@crichardson
Agenda
Why functional programming?
Simplifying collection processing
Eliminating NullPointerExceptions
Simpli...
@crichardson
Let’s imagine you are performing
a CPU intensive operation
class Person ..
public Set<Hometown> hometownsOfFr...
@crichardson
class Person ..
public Set<Hometown> hometownsOfFriends() {
return friends.parallelStream()
.map(f -> cpuInte...
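A runnable sketch of the same switch from sequential to parallel, using trial-division primality as a stand-in for the CPU-intensive operation (an assumption for illustration). The actual speed-up depends on the workload and the number of available cores; it is not guaranteed to be Nx.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelStreamSketch {
    // Stand-in for a CPU-intensive, side-effect-free operation.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int i = 2; (long) i * i <= n; i++) {
            if (n % i == 0) return false;
        }
        return true;
    }

    // parallel() lets the stream split the range across worker threads; the
    // result matches the sequential version because isPrime has no side effects
    // and the stream is ordered.
    static List<Integer> primesUpTo(int max) {
        return IntStream.rangeClosed(2, max)
                .parallel()
                .filter(ParallelStreamSketch::isPrime)
                .boxed()
                .collect(Collectors.toList());
    }
}
```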
@crichardson
Let’s imagine that you are
writing code to display the
products in a user’s wish list
@crichardson
The need for concurrency
Step #1
Web service request to get the user profile including wish
list (list of prod...
@crichardson
Futures are a great
concurrency abstraction
http://en.wikipedia.org/wiki/Futures_and_promises
@crichardson
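How futures work
[Diagram: on the main thread, the Client initiates an Asynchronous operation and later calls get on the Future; the operation, running on a worker thread or event-driven, sets the Outcome on the Future]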
@crichardson
Benefits
Simple way for two concurrent activities to communicate safely
Abstraction:
Client does not know how ...
@crichardson
Example wish list service
public interface UserService {
Future<UserProfile> getUserProfile(long userId);
}
p...
@crichardson
public class WishlistService {
private UserService userService;
private ProductInfoService productInfoService...
@crichardson
It works BUT
Code is very low-level and
messy
And, it’s blocking
@crichardson
Better: Futures with callbacks
no blocking!
def asyncSquare(x : Int)
: Future[Int] = ... x * x...
val f = asy...
@crichardson
But
callback-based scatter/gather
Messy, tangled code
(a.k.a. callback hell)
@crichardson
Functional futures - map
def asyncPlus(x : Int, y : Int) = ... x + y ...
val future2 = asyncPlus(4, 5).map{ _...
@crichardson
Functional futures - flatMap()
val f2 = asyncPlus(5, 8).flatMap { x => asyncSquare(x) }
assertEquals(169, Awai...
@crichardson
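flatMap() is asynchronous
f2 = f1 flatMap (someFn)
[Diagram: when f1 completes with Outcome1, someFn(outcome1) is called and returns f3; when f3 completes with Outcome3, f2 completes with Outcome3]
Implemented using callbacks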
@crichardson
class WishListService(...) {
def getWishList(userId : Long) : Future[WishList] = {
userService.getUserProfile...
@crichardson
Using Java 8 CompletableFutures
public class UserServiceImpl implements UserService {
@Override
public Comple...
@crichardson
Using Java 8 CompletableFutures
public CompletableFuture<Wishlist> getWishlistDetails(long userId) {
return u...
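Since Java 8 has no built-in equivalent of Scala's `Future.sequence()`, one common workaround is a small helper built on `CompletableFuture.allOf()`. The sketch below is one possible version, not the code from the slides; `sequence`, `asyncSquare` and `squareThenAddOne` are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class FutureSequenceSketch {
    // Turns a list of futures into a future of a list. allOf() completes
    // once every input future has completed.
    static <T> CompletableFuture<List<T>> sequence(List<CompletableFuture<T>> futures) {
        return CompletableFuture
                .allOf(futures.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> futures.stream()
                        .map(CompletableFuture::join) // does not block: all inputs are complete
                        .collect(Collectors.toList()));
    }

    // Hypothetical asynchronous operation standing in for a web service call.
    static CompletableFuture<Integer> asyncSquare(int x) {
        return CompletableFuture.supplyAsync(() -> x * x);
    }

    // thenCompose() plays the role of flatMap(); thenApply() plays the role of map().
    static CompletableFuture<Integer> squareThenAddOne(int x) {
        return asyncSquare(x)
                .thenCompose(FutureSequenceSketch::asyncSquare)
                .thenApply(y -> y + 1);
    }
}
```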
@crichardson
Introducing Reactive
Extensions (Rx)
The Reactive Extensions (Rx) is a library for composing
asynchronous and...
@crichardson
About RxJava
Reactive Extensions (Rx) for the JVM
Original motivation for Netflix was to provide rich Futures
...
@crichardson
RxJava core concepts
trait Observable[T] {
def subscribe(observer : Observer[T]) : Subscription
...
}
trait O...
Comparing Observable to...
Observer pattern - similar but
adds
Observer.onCompleted()
Observer.onError()
Iterator pattern -...
@crichardson
Fun with observables
val every10Seconds = Observable.interval(10 seconds)
-1 0 1 ...
t=0 t=10 t=20 ...
val on...
@crichardson
def getTableStatus(tableName: String) : Observable[DynamoDbStatus]=
Observable { subscriber: Subscriber[Dynam...
@crichardson
Transforming observables
val tableStatus = ticker.flatMap { i =>
logger.info("{}th describe table", i + 1)
ge...
@crichardson
Calculating rolling average
class AverageTradePriceCalculator {
def calculateAverages(trades: Observable[Trad...
@crichardson
Calculating average prices
def calculateAverages(trades: Observable[Trade]): Observable[AveragePrice] = {
trad...
@crichardson
Agenda
Why functional programming?
Simplifying collection processing
Eliminating NullPointerExceptions
Simpli...
@crichardson
Let’s imagine that you want
to count word frequencies
@crichardson
Scala Word Count
val frequency : Map[String, Int] =
Source.fromFile("gettysburgaddress.txt").getLines()
.flat...
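For comparison, a rough Java 8 streams analogue of the word count. This is a sketch and not a translation of the Scala slide: it takes lines already in memory rather than reading a file, and lower-casing and whitespace splitting are assumptions made here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class WordCountSketch {
    static Map<String, Long> frequency(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split("\\s+"))) // lines -> words
                .filter(word -> !word.isEmpty())
                .map(String::toLowerCase)
                // group identical words and count each group
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }
}
```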
@crichardson
But how to scale to a cluster
of machines?
@crichardson
Apache Hadoop
Open-source software for reliable, scalable, distributed computing
Hadoop Distributed File Syst...
@crichardson
Overview of MapReduce
[Diagram: Input Data -> Mappers -> Shuffle of (K,V) pairs -> Reducers -> Output Data]
@crichardson
MapReduce Word count -
mapper
class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
private final ...
@crichardson
Hadoop then shuffles the
key-value pairs...
@crichardson
MapReduce Word count -
reducer
class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
public vo...
@crichardson
About MapReduce
Very simple programming abstraction yet incredibly powerful
By chaining together multiple map/re...
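The map, shuffle and reduce phases can be imitated in memory with plain Java 8 streams. This is only a single-process sketch of the dataflow, with hypothetical names; it uses no Hadoop APIs and says nothing about how Hadoop distributes the work.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MapReduceSketch {
    // Map phase: emit a (word, 1) pair for every word in a line.
    static List<Map.Entry<String, Integer>> mapper(String line) {
        return Arrays.stream(line.split(" "))
                .<Map.Entry<String, Integer>>map(word -> new SimpleEntry<>(word, 1))
                .collect(Collectors.toList());
    }

    // Reduce phase: combine all the values emitted for one key.
    static int reducer(String key, List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    static Map<String, Integer> run(List<String> lines) {
        // Shuffle: group the mappers' (K,V) pairs by key.
        Map<String, List<Integer>> shuffled = lines.stream()
                .flatMap(line -> mapper(line).stream())
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
        Map<String, Integer> output = new HashMap<>();
        shuffled.forEach((key, values) -> output.put(key, reducer(key, values)));
        return output;
    }
}
```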
@crichardson
Scalding: Scala DSL for
MapReduce
class WordCountJob(args : Args) extends Job(args) {
TextLine( args("input")...
@crichardson
Apache Spark
Part of the Hadoop ecosystem
Key abstraction = Resilient Distributed Datasets (RDD)
Collection t...
@crichardson
Spark Word Count
val sc = new SparkContext(...)
sc.textFile("s3n://mybucket/...")
.flatMap { _.split(" ")}
.g...
@crichardson
Summary
Functional programming enables the elegant expression of
good ideas in a wide variety of domains
map(...
@crichardson
Questions?
@crichardson chris@chrisrichardson.net
http://plainoldobjects.com

Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)

Higher-order functions such as map(), flatmap(), filter() and reduce() have their origins in mathematics and ancient functional programming languages such as Lisp. But today they have entered the mainstream and are available in languages such as JavaScript, Scala and Java 8. They are well on their way to becoming an essential part of every developer’s toolbox.

In this talk you will learn how these and other higher-order functions enable you to write simple, expressive and concise code that solve problems in a diverse set of domains. We will describe how you use them to process collections in Java and Scala. You will learn how functional Futures and Rx (Reactive Extensions) Observables simplify concurrent code. We will even talk about how to write big data applications in a functional style using libraries such as Scalding.

Published in: Technology

  1. 1. @crichardson Map(), flatMap() and reduce() are your new best friends: Simpler collections, concurrency, and big data Chris Richardson Author of POJOs in Action Founder of the original CloudFoundry.com @crichardson chris@chrisrichardson.net http://plainoldobjects.com
  2. 2. @crichardson Presentation goal How functional programming simplifies your code Show that map(), flatMap() and reduce() are remarkably versatile functions
  3. 3. @crichardson About Chris
  4. 4. @crichardson About Chris Founder of a buzzword compliant (stealthy, social, mobile, big data, machine learning, ...) startup Consultant helping organizations improve how they architect and deploy applications using cloud, micro services, polyglot applications, NoSQL, ...
  5. 5. @crichardson Agenda Why functional programming? Simplifying collection processing Eliminating NullPointerExceptions Simplifying concurrency with Futures and Rx Observables Tackling big data problems with functional programming
  6. 6. @crichardson What’s functional programming?
  7. 7. @crichardson It’s a programming paradigm
  8. 8. @crichardson It’s a kind of programming language
  9. 9. @crichardson Functions as the building blocks of the application
  10. 10. @crichardson Functions as first class citizens Assign functions to variables Store functions in fields Use and write higher-order functions: Pass functions as arguments Return functions as values
  11. 11. @crichardson Avoids mutable state Use: Immutable data structures Single assignment variables Some functional languages such as Haskell don’t allow side effects There are benefits to immutability Easier concurrency More reliable code But be pragmatic
  12. 12. @crichardson Why functional programming? "the highest goal of programming-language design to enable good ideas to be elegantly expressed" http://en.wikipedia.org/wiki/Tony_Hoare
  13. 13. @crichardson Why functional programming? More expressive More concise More intuitive - solution matches problem definition Elimination of error-prone mutable state Easy parallelization
  14. 14. @crichardson An ancient idea that has recently become popular
  15. 15. @crichardson Mathematical foundation: λ-calculus Introduced by Alonzo Church in the 1930s
  16. 16. @crichardson Lisp = an early functional language invented in 1958 http://en.wikipedia.org/wiki/ Lisp_(programming_language) 1940 1950 1960 1970 1980 1990 2000 2010 garbage collection dynamic typing self-hosting compiler tree data structures (defun factorial (n) (if (<= n 1) 1 (* n (factorial (- n 1)))))
  17. 17. @crichardson My final year project in 1985: Implementing SASL sieve (p:xs) = p : sieve [x | x <- xs, rem x p > 0]; primes = sieve [2..] A list of integers starting with 2 Filter out multiples of p
  18. 18. Mostly an Ivory Tower technology Lisp was used for AI FP languages: Miranda, ML, Haskell, ... “Side-effects kills kittens and puppies”
  19. 19. @crichardson http://steve-yegge.blogspot.com/2010/12/haskell-researchers-announce-discovery.html !* !* !*
  20. 20. @crichardson But today FP is mainstream Clojure - a dialect of Lisp A hybrid OO/functional language A hybrid OO/FP language for .NET Java 8 has lambda expressions
  21. 21. @crichardson Java 8 lambda expressions are functions x -> x * x x -> { for (int i = 2; i <= Math.sqrt(x); i = i + 1) { if (x % i == 0) return false; } return true; }; (x, y) -> x * x + y * y
  22. 22. @crichardson Java 8 lambdas are a shorthand* for an anonymous inner class * not exactly. See http://programmers.stackexchange.com/questions/ 177879/type-inference-in-java-8
  23. 23. @crichardson Java 8 functional interfaces Interface with a single abstract method e.g. Runnable, Callable, Spring’s TransactionCallback A lambda expression is an instance of a functional interface. You can use a lambda wherever a functional interface “value” is expected The type of the lambda expression is determined from its context
  24. 24. @crichardson Example Functional Interface Function<Integer, Integer> square = x -> x * x; BiFunction<Integer, Integer, Integer> sumSquares = (x, y) -> x * x + y * y; Predicate<Integer> makeIsDivisibleBy(int y) { return x -> x % y == 0; } Predicate<Integer> isEven = makeIsDivisibleBy(2); Assert.assertTrue(isEven.test(8)); Assert.assertFalse(isEven.test(11));
  25. 25. @crichardson Example Functional Interface ExecutorService executor = ...; final int x = 999; Future<Boolean> outcome = executor.submit(() -> { for (int i = 2; i <= Math.sqrt(x); i = i + 1) { if (x % i == 0) return false; } return true; }); This lambda is a Callable
  26. 26. @crichardson Agenda Why functional programming? Simplifying collection processing Eliminating NullPointerExceptions Simplifying concurrency with Futures and Rx Observables Tackling big data problems with functional programming
  27. 27. @crichardson Lots of application code = collection processing: Mapping, filtering, and reducing
  28. 28. @crichardson Social network example public class Person { enum Gender { MALE, FEMALE } private Name name; private LocalDate birthday; private Gender gender; private Hometown hometown; private Set<Friend> friends = new HashSet<Friend>(); .... public class Friend { private Person friend; private LocalDate becameFriends; ... } public class SocialNetwork { private Set<Person> people; ...
  29. 29. @crichardson Mapping, filtering, and reducing public class Person { public Set<Hometown> hometownsOfFriends() { Set<Hometown> result = new HashSet<>(); for (Friend friend : friends) { result.add(friend.getPerson().getHometown()); } return result; }
  30. 30. @crichardson Mapping, filtering, and reducing public class Person { public Set<Person> friendOfFriends() { Set<Person> result = new HashSet<>(); for (Friend friend : friends) for (Friend friendOfFriend : friend.getPerson().friends) if (friendOfFriend.getPerson() != this) result.add(friendOfFriend.getPerson()); return result; }
  31. 31. @crichardson Mapping, filtering, and reducing public class SocialNetwork { private Set<Person> people; ... public Set<Person> lonelyPeople() { Set<Person> result = new HashSet<Person>(); for (Person p : people) { if (p.getFriends().isEmpty()) result.add(p); } return result; }
  32. 32. @crichardson Mapping, filtering, and reducing public class SocialNetwork { private Set<Person> people; ... public int averageNumberOfFriends() { int sum = 0; for (Person p : people) { sum += p.getFriends().size(); } return sum / people.size(); }
  33. 33. @crichardson Problems with this style of programming Low level Imperative (how to do it) NOT declarative (what to do) Verbose Mutable variables are potentially error prone Difficult to parallelize
  34. 34. @crichardson Java 8 streams to the rescue A sequence of elements “Wrapper” around a collection Streams can also be infinite Provides a functional/lambda-based API for transforming, filtering and aggregating elements Much simpler, cleaner code
  35. 35. @crichardson Using Java 8 streams - mapping class Person .. private Set<Friend> friends = ...; public Set<Hometown> hometownsOfFriends() { return friends.stream() .map(f -> f.getPerson().getHometown()) .collect(Collectors.toSet()); }
  36. 36. @crichardson The map() function s1 a b c d e ... s2 f(a) f(b) f(c) f(d) f(e) ... s2 = s1.map(f)
  37. 37. @crichardson public class SocialNetwork { private Set<Person> people; ... public Set<Person> peopleWithNoFriends() { Set<Person> result = new HashSet<Person>(); for (Person p : people) { if (p.getFriends().isEmpty()) result.add(p); } return result; } Using Java 8 streams - filtering public class SocialNetwork { private Set<Person> people; ... public Set<Person> lonelyPeople() { return people.stream() .filter(p -> p.getFriends().isEmpty()) .collect(Collectors.toSet()); }
  38. 38. @crichardson Using Java 8 streams - friend of friends V1 class Person .. public Set<Person> friendOfFriends() { Set<Set<Friend>> fof = friends.stream() .map(friend -> friend.getPerson().friends) .collect(Collectors.toSet()); ... } Using map() => Set of Sets :-( Somehow we need to flatten
  39. 39. @crichardson Using Java 8 streams - mapping class Person .. public Set<Person> friendOfFriends() { return friends.stream() .flatMap(friend -> friend.getPerson().friends.stream()) .map(Friend::getPerson) .filter(f -> f != this) .collect(Collectors.toSet()); } maps and flattens
  40. 40. @crichardson The flatMap() function s1 a b ... s2 f(a)0 f(a)1 f(b)0 f(b)1 f(b)2 ... s2 = s1.flatMap(f)
  41. 41. @crichardson Using Java 8 streams - reducing public class SocialNetwork { private Set<Person> people; ... public long averageNumberOfFriends() { return people.stream() .map ( p -> p.getFriends().size() ) .reduce(0, (x, y) -> x + y) / people.size(); } int x = 0; for (int y : inputStream) x = x + y; return x;
  42. 42. @crichardson The reduce() function s1 a b c d e ... x = s1.reduce(initial, f) f(f(f(f(f(f(initial, a), b), c), d), e), ...)
  43. 43. @crichardson Newton's method for finding square roots public class SquareRootCalculator { public double squareRoot(double input, double precision) { return Stream.iterate( new Result(1.0), current -> refine(current, input, precision)) .filter(r -> r.done) .findFirst().get().value; } private static Result refine(Result current, double input, double precision) { double value = current.value; double newCurrent = value - (value * value - input) / (2 * value); boolean done = Math.abs(value - newCurrent) < precision; return new Result(newCurrent, done); } static class Result { boolean done; double value; Result(double value) { this(value, false); } Result(double value, boolean done) { this.value = value; this.done = done; } } } Creates an infinite stream: seed, f(seed), f(f(seed)), ..... Don’t panic! Streams are lazy
  44. 44. @crichardson Agenda Why functional programming? Simplifying collection processing Eliminating NullPointerExceptions Simplifying concurrency with Futures and Rx Observables Tackling big data problems with functional programming
  45. 45. @crichardson Tony’s $1B mistake “I call it my billion-dollar mistake. It was the invention of the null reference in 1965....But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement...” http://qconlondon.com/london-2009/presentation/ Null+References:+The+Billion+Dollar+Mistake
  46. 46. @crichardson Coding with null pointers class Person public Friend longestFriendship() { Friend result = null; for (Friend friend : friends) { if (result == null || friend.getBecameFriends() .isBefore(result.getBecameFriends())) result = friend; } return result; } Friend oldestFriend = person.longestFriendship(); if (oldestFriend != null) { ... } else { ... } Null check is essential yet easily forgotten
  47. 47. @crichardson Java 8 Optional<T> A wrapper for nullable references It has two states: empty: throws an exception if you try to get the reference; non-empty: contains a non-null reference Provides methods for: testing whether it has a value getting the value ... Return reference wrapped in an instance of this type instead of null
  48. 48. @crichardson Coding with optionals class Person public Optional<Friend> longestFriendship() { Friend result = null; for (Friend friend : friends) { if (result == null || friend.getBecameFriends().isBefore(result.getBecameFriends())) result = friend; } return Optional.ofNullable(result); } Optional<Friend> oldestFriend = person.longestFriendship(); // Might throw java.util.NoSuchElementException: No value present // Person dangerous = popularPerson.get(); if (oldestFriend.isPresent()) { ...oldestFriend.get() } else { ... }
  49. 49. @crichardson Using Optionals - better Optional<Friend> oldestFriendship = ...; Friend whoToCall1 = oldestFriendship.orElse(mother); Avoid calling isPresent() and get() Friend whoToCall3 = oldestFriendship.orElseThrow( () -> new LonelyPersonException()); Friend whoToCall2 = oldestFriendship.orElseGet(() -> lazilyFindSomeoneElse());
  50. 50. @crichardson Using Optional.map() public class Person { public Optional<Friend> longestFriendship() { return ...; } public Optional<Long> ageDifferenceWithOldestFriend() { Optional<Friend> oldestFriend = longestFriendship(); return oldestFriend.map ( of -> Math.abs(of.getPerson().getAge() - getAge()) ); } Eliminates messy conditional logic
  51. 51. @crichardson Using flatMap() class Person public Optional<Friend> longestFriendship() {...} public Optional<Friend> longestFriendshipOfLongestFriend() { return longestFriendship() .flatMap(friend -> friend.getPerson().longestFriendship()); } not always a symmetric relationship. :-)
  52. 52. @crichardson Agenda Why functional programming? Simplifying collection processing Eliminating NullPointerExceptions Simplifying concurrency with Futures and Rx Observables Tackling big data problems with functional programming
  53. 53. @crichardson Let’s imagine you are performing a CPU intensive operation class Person .. public Set<Hometown> hometownsOfFriends() { return friends.stream() .map(f -> cpuIntensiveOperation()) .collect(Collectors.toSet()); }
  54. 54. @crichardson class Person .. public Set<Hometown> hometownsOfFriends() { return friends.parallelStream() .map(f -> cpuIntensiveOperation()) .collect(Collectors.toSet()); } Parallel streams = simple concurrency Potentially uses N cores Nx speed up
  55. 55. @crichardson Let’s imagine that you are writing code to display the products in a user’s wish list
  56. 56. @crichardson The need for concurrency Step #1 Web service request to get the user profile including wish list (list of product Ids) Step #2 For each productId: web service request to get product info Sequentially = terrible response time Need to fetch productInfo concurrently
  57. 57. @crichardson Futures are a great concurrency abstraction http://en.wikipedia.org/wiki/Futures_and_promises
  58. 58. @crichardson How futures work [Diagram: on the main thread, the Client initiates an Asynchronous operation and later calls get on the Future; the operation, running on a worker thread or event-driven, sets the Outcome on the Future]
  59. 59. @crichardson Benefits Simple way for two concurrent activities to communicate safely Abstraction: Client does not know how the asynchronous operation is implemented Easy to implement scatter/gather: Scatter: Client can invoke multiple asynchronous operations and gets a Future for each one. Gather: Get values from the futures
  60. 60. @crichardson Example wish list service public interface UserService { Future<UserProfile> getUserProfile(long userId); } public class UserServiceProxy implements UserService { private ExecutorService executorService; @Override public Future<UserProfile> getUserProfile(long userId) { return executorService.submit(() -> restfulGet("http://uservice/user/" + userId, UserProfile.class)); } ... } public interface ProductInfoService { Future<ProductInfo> getProductInfo(long productId); }
  61. 61. @crichardson public class WishlistService { private UserService userService; private ProductInfoService productInfoService; public Wishlist getWishlistDetails(long userId) throws Exception { Future<UserProfile> userProfileFuture = userService.getUserProfile(userId); UserProfile userProfile = userProfileFuture.get(300, TimeUnit.MILLISECONDS); Example wish list service get user info List<Future<ProductInfo>> productInfoFutures = userProfile.getWishListProductIds().stream() .map(productInfoService::getProductInfo) .collect(Collectors.toList()); long deadline = System.currentTimeMillis() + 300; List<ProductInfo> products = new ArrayList<ProductInfo>(); for (Future<ProductInfo> pif : productInfoFutures) { long timeout = deadline - System.currentTimeMillis(); if (timeout <= 0) throw new TimeoutException(...); products.add(pif.get(timeout, TimeUnit.MILLISECONDS)); } ... return new Wishlist(products); } asynchronously get all products wait for product info
  62. 62. @crichardson It works BUT Code is very low-level and messy And, it’s blocking
  63. 63. @crichardson Better: Futures with callbacks no blocking! def asyncSquare(x : Int) : Future[Int] = ... x * x... val f = asyncSquare(25) Guava ListenableFutures, Spring 4 ListenableFuture Java 8 CompletableFuture, Scala Futures f onSuccess { case x : Int => println(x) } f onFailure { case e : Exception => println("exception thrown") } Partial function applied to successful outcome Applied to failed outcome
  64. 64. @crichardson But callback-based scatter/gather Messy, tangled code (a.k.a. callback hell)
  65. 65. @crichardson Functional futures - map def asyncPlus(x : Int, y : Int) = ... x + y ... val future2 = asyncPlus(4, 5).map{ _ * 3 } assertEquals(27, Await.result(future2, 1 second)) Scala, Java 8 CompletableFuture Asynchronously transforms future
  66. 66. @crichardson Functional futures - flatMap() val f2 = asyncPlus(5, 8).flatMap { x => asyncSquare(x) } assertEquals(169, Await.result(f2, 1 second)) Scala, Java 8 CompletableFuture (partially) Calls asyncSquare() with the eventual outcome of asyncPlus()
  67. 67. @crichardson flatMap() is asynchronous f2 = f1 flatMap (someFn) [Diagram: when f1 completes with Outcome1, someFn(outcome1) is called and returns f3; when f3 completes with Outcome3, f2 completes with Outcome3] Implemented using callbacks
  68. 68. @crichardson Scala wishlist service class WishListService(...) { def getWishList(userId : Long) : Future[WishList] = { userService.getUserProfile(userId) flatMap { userProfile => val listOfProductFutures : List[Future[ProductInfo]] = userProfile.wishListProductIds .map { productInfoService.getProductInfo } val futureOfProductsList : Future[List[ProductInfo]] = Future.sequence(listOfProductFutures) val wishlist = futureOfProductsList.map { products => WishList(products) } val timeoutFuture = ... Future.firstCompletedOf(Seq(wishlist, timeoutFuture)) } } }
  69. 69. @crichardson Using Java 8 CompletableFutures public class UserServiceImpl implements UserService { @Override public CompletableFuture<UserInfo> getUserInfo(long userId) { return CompletableFuture.supplyAsync( () -> httpGetRequest("http://myuservice/user" + userId, UserInfo.class)); } Runs in ExecutorService
  70. 70. @crichardson Using Java 8 CompletableFutures public CompletableFuture<Wishlist> getWishlistDetails(long userId) { return userService.getUserProfile(userId).thenComposeAsync(userProfile -> { Stream<CompletableFuture<ProductInfo>> s1 = userProfile.getWishListProductIds() .stream() .map(productInfoService::getProductInfo); Stream<CompletableFuture<List<ProductInfo>>> s2 = s1.map(fOfPi -> fOfPi.thenApplyAsync(pi -> Arrays.asList(pi))); CompletableFuture<List<ProductInfo>> productInfos = s2 .reduce((f1, f2) -> f1.thenCombine(f2, ListUtils::union)) .orElse(CompletableFuture.completedFuture(Collections.emptyList())); return productInfos.thenApply(list -> new Wishlist(list)); }); } Java 8 is missing Future.sequence() flatMap()! map()!
  71. 71. @crichardson Introducing Reactive Extensions (Rx) The Reactive Extensions (Rx) is a library for composing asynchronous and event-based programs using observable sequences and LINQ-style query operators. Using Rx, developers represent asynchronous data streams with Observables , query asynchronous data streams using LINQ operators , and ..... https://rx.codeplex.com/
  72. 72. @crichardson About RxJava Reactive Extensions (Rx) for the JVM Original motivation for Netflix was to provide rich Futures Implemented in Java Adaptors for Scala, Groovy and Clojure https://github.com/Netflix/RxJava
  73. 73. @crichardson RxJava core concepts trait Observable[T] { def subscribe(observer : Observer[T]) : Subscription ... } trait Observer[T] { def onNext(value : T) def onCompleted() def onError(e : Throwable) } Notifies An asynchronous stream of items Used to unsubscribe
  74. 74. Comparing Observable to... Observer pattern - similar but adds Observer.onCompleted() Observer.onError() Iterator pattern - mirror image Push rather than pull Futures - similar Can be used as Futures But Observables = a stream of multiple values Collections and Streams - similar Functional API supporting map(), flatMap(), ... But Observables are asynchronous
  75. 75. @crichardson Fun with observables val every10Seconds = Observable.interval(10 seconds) -1 0 1 ... t=0 t=10 t=20 ... val oneItem = Observable.items(-1L) val ticker = oneItem ++ every10Seconds val subscription = ticker.subscribe { (value: Long) => println("value=" + value) } ... subscription.unsubscribe()
  76. 76. @crichardson Connecting observables to the outside world def getTableStatus(tableName: String) : Observable[DynamoDbStatus] = Observable { subscriber: Subscriber[DynamoDbMessage] => amazonDynamoDBAsyncClient.describeTableAsync( new DescribeTableRequest(tableName), new AsyncHandler[DescribeTableRequest, DescribeTableResult] { override def onSuccess(request: DescribeTableRequest, result: DescribeTableResult) = { subscriber.onNext(DynamoDbStatus(result.getTable.getTableStatus)) subscriber.onCompleted() } override def onError(exception: Exception) = exception match { case t: ResourceNotFoundException => subscriber.onNext(DynamoDbStatus("NOT_FOUND")) subscriber.onCompleted() case _ => subscriber.onError(exception) } }) }
  77. 77. @crichardson Transforming observables val tableStatus = ticker.flatMap { i => logger.info("{}th describe table", i + 1) getTableStatus(name) } Status1 Status2 Status3 ... t=0 t=10 t=20 ... + Usual collection methods: map(), filter(), take(), drop(), ...
  78. 78. @crichardson Calculating rolling average class AverageTradePriceCalculator { def calculateAverages(trades: Observable[Trade]): Observable[AveragePrice] = { ... } case class Trade( symbol : String, price : Double, quantity : Int ... ) case class AveragePrice( symbol : String, price : Double, ...)
  79. 79. @crichardson Calculating average prices def calculateAverages(trades: Observable[Trade]): Observable[AveragePrice] = { trades.groupBy(_.symbol).map { symbolAndTrades => val (symbol, tradesForSymbol) = symbolAndTrades val openingEverySecond = Observable.items(-1L) ++ Observable.interval(1 seconds) def closingAfterSixSeconds(opening: Any) = Observable.interval(6 seconds).take(1) tradesForSymbol.window(...).map { windowOfTradesForSymbol => windowOfTradesForSymbol.fold((0.0, 0, List[Double]())) { (soFar, trade) => val (sum, count, prices) = soFar (sum + trade.price, count + trade.quantity, trade.price +: prices) } map { x => val (sum, length, prices) = x AveragePrice(symbol, sum / length, prices) } }.flatten }.flatten }
  80. 80. @crichardson Agenda Why functional programming? Simplifying collection processing Eliminating NullPointerExceptions Simplifying concurrency with Futures and Rx Observables Tackling big data problems with functional programming
  81. 81. @crichardson Let’s imagine that you want to count word frequencies
  82. 82. @crichardson Scala Word Count val frequency : Map[String, Int] = Source.fromFile("gettysburgaddress.txt").getLines() .flatMap { _.split(" ") }.toList .groupBy(identity) .mapValues(_.length) frequency("THE") should be(11) frequency("LIBERTY") should be(1) Map Reduce
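The same map-then-reduce shape can be sketched with Java 8 streams: flatMap() plays the "map" role by splitting lines into words, and groupingBy() with counting() plays the "reduce" role. The class name WordCount and the two sample lines are assumptions made for illustration; the slide reads its input from a file instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordCount {

    // Counts how often each space-separated word occurs across all lines.
    static Map<String, Long> frequency(List<String> lines) {
        return lines.stream()
            // "Map" step: one line becomes many words.
            .flatMap(line -> Arrays.stream(line.split(" ")))
            // Skip empty tokens produced by repeated spaces.
            .filter(word -> !word.isEmpty())
            // "Reduce" step: group identical words and count each group.
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Illustrative input only, not the file used on the slide.
        List<String> lines = Arrays.asList(
            "government of the people",
            "by the people for the people");
        Map<String, Long> counts = frequency(lines);
        System.out.println(counts.get("people")); // prints 3
        System.out.println(counts.get("the"));    // prints 3
    }
}
```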
  83. 83. @crichardson But how to scale to a cluster of machines?
  84. 84. @crichardson Apache Hadoop Open-source software for reliable, scalable, distributed computing Hadoop Distributed File System (HDFS) Efficiently stores very large amounts of data Files are partitioned and replicated across multiple machines Hadoop MapReduce Batch processing system Provides plumbing for writing distributed jobs Handles failures ...
  85. 85. @crichardson Overview of MapReduce Input Data Mapper Mapper Mapper Reducer Reducer Reducer Output Data Shuffle (K,V) (K,V) (K,V) (K,V)* (K,V)* (K,V)* (K1,V, ....)* (K2,V, ....)* (K3,V, ....)* (K,V) (K,V) (K,V)
  86. 86. @crichardson MapReduce Word count - mapper class Map extends Mapper<LongWritable, Text, Text, IntWritable> { private final static IntWritable one = new IntWritable(1); private Text word = new Text(); public void map(LongWritable key, Text value, Context context) { String line = value.toString(); StringTokenizer tokenizer = new StringTokenizer(line); while (tokenizer.hasMoreTokens()) { word.set(tokenizer.nextToken()); context.write(word, one); } } } (“Four”, 1), (“score”, 1), (“and”, 1), (“seven”, 1), ... Four score and seven years http://wiki.apache.org/hadoop/WordCount
  87. 87. @crichardson Hadoop then shuffles the key-value pairs...
  88. 88. @crichardson MapReduce Word count - reducer class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> { public void reduce(Text key, Iterable<IntWritable> values, Context context) { int sum = 0; for (IntWritable val : values) { sum += val.get(); } context.write(key, new IntWritable(sum)); } } (“the”, 11) (“the”, (1, 1, 1, 1, 1, 1, ...)) http://wiki.apache.org/hadoop/WordCount
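The three phases (map, shuffle, reduce) can be imitated in memory to see how the data flows. The sketch below uses plain Java collections rather than the Hadoop API; the class MiniMapReduce and its method names are invented for illustration, and a real job distributes each phase across machines.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class MiniMapReduce {

    // Map phase: one line in, a list of (word, 1) pairs out.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    // Shuffle phase: collect all values that share a key, sorted by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values for a single key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs the three phases in sequence on in-memory input.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        shuffle(pairs).forEach((word, ones) -> result.put(word, reduce(word, ones)));
        return result;
    }

    public static void main(String[] args) {
        // Illustrative input only.
        System.out.println(run(Arrays.asList("the people", "the nation")));
        // prints {nation=1, people=1, the=2}
    }
}
```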
  89. 89. @crichardson About MapReduce Very simple programming abstraction yet incredibly powerful By chaining together multiple map/reduce jobs you can process very large amounts of data e.g. Apache Mahout for machine learning But Mappers and Reducers = verbose code Development is challenging, e.g. unit testing is difficult It’s disk-based batch processing, so it is slow
  90. 90. @crichardson Scalding: Scala DSL for MapReduce class WordCountJob(args : Args) extends Job(args) { TextLine( args("input") ) .flatMap('line -> 'word) { line : String => tokenize(line) } .groupBy('word) { _.size } .write( Tsv( args("output") ) ) def tokenize(text : String) : Array[String] = { text.toLowerCase.replaceAll("[^a-zA-Z0-9\\s]", "") .split("\\s+") } } https://github.com/twitter/scalding Expressive and unit testable Each row is a map of named fields
  91. 91. @crichardson Apache Spark Part of the Hadoop ecosystem Key abstraction = Resilient Distributed Datasets (RDD) Collection that is partitioned across cluster members Operations are parallelized Created from either a Scala collection or a Hadoop supported datasource - HDFS, S3 etc Can be cached in-memory for super-fast performance Can be replicated for fault-tolerance http://spark.apache.org
  92. 92. @crichardson Spark Word Count val sc = new SparkContext(...) sc.textFile("s3n://mybucket/...") .flatMap { _.split(" ")} .groupBy(identity) .mapValues(_.length) .toArray.toMap Expressive, unit testable and very fast
  93. 93. @crichardson Summary Functional programming enables the elegant expression of good ideas in a wide variety of domains map(), flatMap() and reduce() are remarkably versatile higher-order functions Use FP and OOP together Java 8 has taken a good first step towards supporting FP
  94. 94. @crichardson Questions? @crichardson chris@chrisrichardson.net http://plainoldobjects.com
