Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.

Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.

Like this presentation? Why not share!

- Faster persistent data structures t... by Johan Tibell 6072 views
- High-Performance Haskell by Johan Tibell 25989 views
- Haskell is Not For Production and O... by Katie Ots 6767 views
- A Scalable I/O Manager for GHC by Johan Tibell 3077 views
- Why Haskell by Susan Potter 8694 views
- Efficient Immutable Data Structures... by Tom Faulhaber 2001 views

4,253 views

Published on

No Downloads

Total views

4,253

On SlideShare

0

From Embeds

0

Number of Embeds

605

Shares

0

Downloads

49

Comments

0

Likes

2

No embeds

No notes for slide

- 1. Reasoning about laziness Johan Tibell johan.tibell@gmail.com 2011-02-12
- 2. Laziness Haskell is a lazy language Functions and data constructors don’t evaluate their arguments until they need them cond : : Bool −> a −> a −> a cond True t e = t cond F a l s e t e = e Same with local deﬁnitions abs : : I n t −> I n t abs x | x > 0 = x | otherwise = neg x where n e g x = negate x
- 3. Why laziness is important Laziness supports modular programming Programmer-written functions instead of built-in language constructs ( | | ) : : Bool −> Bool −> Bool True | | = True False | | x = x
- 4. Laziness and modularity Laziness lets us separate producers and consumers and still get eﬃcient execution: Generate all solutions (a huge tree structure) Find the solution(s) you want nextMove : : Board −> Move nextMove b = s e l e c t M o v e a l l M o v e s where allMoves = allMovesFrom b The solutions are generated as they are consumed.
- 5. Example: summing some numbers sum : : [ I n t ] −> I n t sum x s = sum ’ 0 x s where sum ’ a c c [ ] = acc sum ’ a c c ( x : x s ) = sum ’ ( a c c + x ) x s foldl abstracts the accumulator recursion pattern: f o l d l : : ( a −> b −> a ) −> a −> [ b ] −> a foldl f z [] = z f o l d l f z ( x : xs ) = f o l d l f ( f z x ) xs sum = f o l d l (+) 0
- 6. A misbehaving function How does evaluation of this expression proceed? sum [ 1 , 2 , 3 ] Like this: sum [1,2,3] ==> foldl (+) 0 [1,2,3] ==> foldl (+) (0+1) [2,3] ==> foldl (+) ((0+1)+2) [3] ==> foldl (+) (((0+1)+2)+3) [] ==> ((0+1)+2)+3 ==> (1+2)+3 ==> 3+3 ==> 6
- 7. Thunks A thunk represents an unevaluated expression. GHC needs to store all the unevaluated + expressions on the heap, until their value is needed. Storing and evaluating thunks is costly, and unnecessary if the expression was going to be evaluated anyway. foldl allocates n thunks, one for each addition, causing a stack overﬂow when GHC tries to evaluate the chain of thunks.
- 8. Controlling evaluation order The seq function allows to control evaluation order. seq : : a −> b −> b Informally, when evaluated, the expression seq a b evaluates a and then returns b.
- 9. Weak head normal form Evaluation stops as soon as a data constructor (or lambda) is reached: ghci> seq (1 ‘div‘ 0) 2 *** Exception: divide by zero ghci> seq ((1 ‘div‘ 0), 3) 2 2 We say that seq evaluates to weak head normal form (WHNF).
- 10. Weak head normal form Forcing the evaluation of an expression using seq only makes sense if the result of that expression is used later: l e t x = 1 + 2 i n seq x ( f x ) The expression p r i n t ( seq ( 1 + 2 ) 3 ) doesn’t make sense as the result of 1+2 is never used.
- 11. Exercise Rewrite the expression (1 + 2 , ’ a ’ ) so that the component of the pair is evaluated before the pair is created.
- 12. Solution Rewrite the expression as l e t x = 1 + 2 i n seq x ( x , ’ a ’ )
- 13. A strict left fold We want to evaluate the expression f z x before evaluating the recursive call: f o l d l ’ : : ( a −> b −> a ) −> a −> [ b ] −> a foldl ’ f z [ ] = z foldl ’ f z ( x : xs ) = l e t z ’ = f z x i n seq z ’ ( f o l d l ’ f z ’ x s )
- 14. Summing numbers, attempt 2 How does evaluation of this expression proceed? foldl’ (+) 0 [1,2,3] Like this: foldl’ (+) 0 [1,2,3] ==> foldl’ (+) 1 [2,3] ==> foldl’ (+) 3 [3] ==> foldl’ (+) 6 [] ==> 6 Sanity check: ghci> print (foldl’ (+) 0 [1..1000000]) 500000500000
- 15. Computing the mean A function that computes the mean of a list of numbers: mean : : [ Double ] −> Double mean x s = s / f r o m I n t e g r a l l where ( s , l ) = f o l d l ’ s t e p (0 , 0) xs s t e p ( s , l ) a = ( s+a , l +1) We compute the length of the list and the sum of the numbers in one pass. $ ./Mean Stack space overflow: current size 8388608 bytes. Use ‘+RTS -Ksize -RTS’ to increase it. Didn’t we just ﬁx that problem?!?
- 16. seq and data constructors Remember: Data constructors don’t evaluate their arguments when created seq only evaluates to the outmost data constructor, but doesn’t evaluate its arguments Problem: foldl ’ forces the evaluation of the pair constructor, but not its arguments, causing unevaluated thunks build up inside the pair: (0.0 + 1.0 + 2.0 + 3.0, 0 + 1 + 1 + 1)
- 17. Forcing evaluation of constructor arguments We can force GHC to evaluate the constructor arguments before the constructor is created: mean : : [ Double ] −> Double mean x s = s / f r o m I n t e g r a l l where ( s , l ) = f o l d l ’ s t e p (0 , 0) xs step (s , l ) a = let s ’ = s + a l ’ = l + 1 i n seq s ’ ( seq l ’ ( s ’ , l ’ ) )
- 18. Bang patterns A bang patterns is a concise way to express that an argument should be evaluated. {−# LANGUAGE B a n g P a t t e r n s #−} mean : : [ Double ] −> Double mean x s = s / f r o m I n t e g r a l l where ( s , l ) = f o l d l ’ s t e p (0 , 0) xs s t e p ( ! s , ! l ) a = ( s + a , l + 1) s and l are evaluated before the right-hand side of step is evaluated.
- 19. Strictness We say that a function is strict in an argument, if evaluating the function always causes the argument to be evaluated. n u l l : : [ a ] −> Bool n u l l [ ] = True null = False null is strict in its ﬁrst (and only) argument, as it needs to be evaluated to pick a return value.
- 20. Strictness - Example cond is strict in the ﬁrst argument, but not in the second and third argument: cond : : Bool −> a −> a −> a cond True t e = t cond F a l s e t e = e Reason: Each of the two branches only evaluate one of the two last arguments to cond.
- 21. Strict data types Haskell lets us say that we always want the arguments of a constructor to be evaluated: data P a i r S a b = PS ! a ! b When a PairS is evaluated, its arguments are evaluated.
- 22. Strict pairs as accumulators We can use a strict pair to simplify our mean function: mean : : [ Double ] −> Double mean x s = s / f r o m I n t e g r a l l where PS s l = f o l d l ’ s t e p ( PS 0 0 ) x s s t e p ( PS s l ) a = PS ( s + a ) ( l + 1 ) Tip: Prefer strict data types when laziness is not needed for your program to work correctly.
- 23. Reasoning about laziness A function application is only evaluated if its result is needed, therefore: One of the function’s right-hand sides will be evaluated. Any expression whose value is required to decide which RHS to evaluate, must be evaluated. By using this “backward-to-front” analysis we can ﬁgure which arguments a function is strict in.
- 24. Reasoning about laziness: example max : : I n t −> I n t −> I n t max x y | x > y = x | x < y = y | otherwise = x −− a r b i t r a r y To pick one of the three RHS, we must evaluate x > y. Therefore we must evaluate both x and y. Therefore max is strict in both x and y.
- 25. Poll data BST = L e a f | Node I n t BST BST i n s e r t : : I n t −> BST −> BST insert x Leaf = Node x L e a f L e a f i n s e r t x ( Node x ’ l r ) | x < x’ = Node x ’ ( i n s e r t x l ) r | x > x’ = Node x ’ l ( i n s e r t x r ) | o t h e r w i s e = Node x l r Which arguments is insert strict in? None 1st 2nd Both
- 26. Solution Only the second, as inserting into an empty tree can be done without comparing the value being inserted. For example, this expression i n s e r t ( 1 ‘ div ‘ 0 ) L e a f does not raise a division-by-zero expression but i n s e r t ( 1 ‘ div ‘ 0 ) ( Node 2 L e a f L e a f ) does.
- 27. Some other things worth pointing out insert x l is not evaluated before the Node is created, so it’s stored as a thunk. Most tree based data structures use strict sub-trees: data S e t a = Tip | Bin ! S i z e a ! ( S e t a ) ! ( S e t a )
- 28. Strict function arguments are great for performance Strict arguments can often be passed as unboxed values (e.g. a machine integer in a register instead of a pointer to an integer on the heap). The compiler can often infer which arguments are stricts, but can sometimes need a little help (like in the case of insert a few slides back).
- 29. Summary Understanding how evaluation works in Haskell is important and requires practice.

No public clipboards found for this slide

×
### Save the most important slides with Clipping

Clipping is a handy way to collect and organize the most important slides from a presentation. You can keep your great finds in clipboards organized around topics.

Be the first to comment