Real World Haskell:
     Lecture 6

   Bryan O’Sullivan


     2009-11-11
Models of evaluation


   Coming from a C++ or Python background, you’re surely used to
   the || and or operators in those languages.
       If the operator’s left argument evaluates to true, it “short
       circuits” the right (i.e. doesn’t evaluate it).

   We see the same behaviour in Haskell.

   Prelude> undefined
   *** Exception: Prelude.undefined

   Prelude> True || undefined
   True
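   We can write our own short-circuiting disjunction the same way; a
   minimal sketch (myOr is a made-up name, not from the Prelude):

```haskell
-- myOr short-circuits like (||): matching on the first argument
-- never forces the second when the first is True.
myOr :: Bool -> Bool -> Bool
myOr True  _ = True
myOr False b = b

main :: IO ()
main = print (myOr True undefined)  -- prints True
```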
Eager, or strict, evaluation

   In most languages, the usual model is strict evaluation: a
   function’s arguments are fully evaluated before the evaluation of
   the function’s body begins.

   def inc(x):
     return x + 1

   def bar(a):
     b = 5 if a % 2 else 8
     return inc(a + 3) * inc(inc(b))

   For example, if we run bar(1) then:
       The local variable b gets the value 5.
       The two calls to inc are fully evaluated before we invoke * on
       5 and 7, respectively.
Are things different in Haskell?

   Let’s think about our old friend the list constructor.

   Prelude> 1:[]
   [1]
   Prelude> 1:undefined
   [1*** Exception: Prelude.undefined

   And a function that operates on a list:

   Prelude> head (1:[])
   1

   What do you think will happen now?

   Prelude> head (1:undefined)
   1
Was that a special case?


   Well, uh, perhaps lists are special in Haskell. Riiight?
   -- defined in Data.Maybe
   isJust (Just _) = True
   isJust _        = False
   Let’s see how the Maybe type fares.

   Prelude Data.Maybe> Just undefined
   Just *** Exception: Prelude.undefined
   Prelude Data.Maybe> isJust (Just undefined)
   True
What else should we expect?

   Here’s a slightly different function:
   isJustOne (Just a) | a == 1 = True
   isJustOne _                 = False
   How will it behave?

   *Main> isJustOne (Just 2)
   False

   Okay, we expected that. What if we follow the same pattern as
   before, and package up an undefined value?

   *Main> isJustOne (Just undefined)
   *** Exception: Prelude.undefined
Haskell’s evaluation model


   Haskell follows a semantic model called non-strict evaluation:
   expressions are not evaluated unless (and usually until) their values
   are used.
   Perhaps you’ve heard of lazy evaluation: this is a specific kind of
   non-strict semantics.
   Haskell compilers go further, using call by need as an
   implementation strategy.
   Call by need: evaluate an expression when needed, then overwrite
   the location of the expression with the evaluated result
   (i.e. memoize it), in case it is needed again.
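   A small sketch of call by need, using the trace function from
   Debug.Trace (which we’ll meet properly later) to reveal when
   evaluation happens:

```haskell
import Debug.Trace (trace)

-- x is demanded twice, but call by need evaluates it once and
-- overwrites the thunk with 2, so "evaluating x" prints only once.
main :: IO ()
main = let x = trace "evaluating x" (1 + 1 :: Int)
       in print (x + x)  -- prints "evaluating x" once, then 4
```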
What does this mean in practice?


   Consider the isJust function again.
   isJust (Just _) = True
   isJust _        = False
   It only evaluates its argument to the point of seeing whether it was
   constructed with a Just or Nothing constructor.
   Notably, the function does not inspect the argument of the Just
   constructor.
   When we try isJust (Just undefined) in ghci, the value
   undefined is never evaluated.
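   The same shallow evaluation shows up with other functions that only
   inspect a list’s spine; length, for instance, never looks at the
   elements:

```haskell
-- length walks the (:) constructors without forcing the elements,
-- so the undefined values are never evaluated.
main :: IO ()
main = print (length [undefined, undefined, undefined])  -- prints 3
```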
And our other example, revisited




   Who can explain why this code crashes when presented with
   Just undefined?
   isJustOne (Just a) | a == 1 = True
   isJustOne _                 = False
A classic example



   Here is the infinite list of Fibonacci numbers in Haskell:
   fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

   Even though this list is conceptually infinite, its components only
   get generated on demand.

   *Main> head (drop 256 fibs)
   141693817714056513234709965875411919657707794958199867
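   Because only the demanded prefix is ever built, we can also ask for
   any finite chunk of the infinite list:

```haskell
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

main :: IO ()
main = print (take 10 fibs)  -- [0,1,1,2,3,5,8,13,21,34]
```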
Traversing lists

   What do these functions have in common?

   map f []     = []
   map f (x:xs) = f x : map f xs

   bunzip []         = ([], [])
   bunzip ((a,b):xs) = let (as, bs) = bunzip xs
                       in (a:as, b:bs)

   *Main> map succ "button"
   "cvuupo"
   *Main> bunzip [(1,'a'),(3,'b'),(5,'c')]
   ([1,3,5],"abc")
What does map do?




  If you think about what map does to the structure of a list, it
  replaces every (:) constructor with a new (:) constructor that has
  a transformed version of its arguments.

  map succ (      1 :      2 : [])
  ==       ( succ 1 : succ 2 : [])
And bunzip?


  Thinking structurally, bunzip performs the same kind of operation
  as map.

  bunzip ((1,'a') : (2,'b') : [])
  ==     ((1 : 2 : []) , ('a' : 'b' : []))

  This time, the pattern is much harder to see, but it’s still there:
       Every time we see a (:) constructor, we replace it with a
       transformed piece of data.
       In this case, the transformed data is the head pair pulled apart
       and grafted onto the heads of a pair of lists.
Abstraction! Abstraction! Abstraction!



   If we have two functions that do essentially the same thing, don’t
   we have a design pattern?
   In Haskell, we can usually do better than waffle about design
   patterns: we can reify them into code! To wit:

   foldr f z []     = z
   foldr f z (x:xs) = f x (foldr f z xs)
The right fold (no, really)

   The foldr function is called a right fold, because it associates to
   the right. What do I mean by this?

      foldr (+) 1 [2, 3, 4]
   == foldr (+) 1 (2 : 3 : 4 : [])
   == foldr (+) 1 (2 : (3 : (4 : [])))
   ==              2 + (3 + (4 + 1))

   Notice a few things:
       We replaced the empty list with the “empty” value.
       We replaced each non-empty constructor with an addition
       operator.
       That’s our structural transformation in a nutshell!
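   A quick way to convince yourself of this constructor-replacement
   view: folding with the constructors themselves rebuilds the list
   unchanged.

```haskell
-- foldr (:) [] replaces each (:) with (:) and [] with [],
-- reconstructing the original list.
main :: IO ()
main = print (foldr (:) [] [1, 2, 3 :: Int])  -- [1,2,3]
```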
map as a right fold



   Because map follows the same pattern as foldr, we can actually
   write map in terms of foldr!

   bmap :: (a -> b) -> [a] -> [b]
   bmap f xs = foldr g [] xs
       where g y ys = f y : ys

   Since we can write a map as a fold, this implies that a fold is
   somehow more primitive than a map.
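   We can check that bmap agrees with map; here is a self-contained
   version:

```haskell
bmap :: (a -> b) -> [a] -> [b]
bmap f xs = foldr g [] xs
    where g y ys = f y : ys  -- apply f to the head, keep folding

main :: IO ()
main = print (bmap succ "button")  -- "cvuupo", same as map succ
```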
unzip as a right fold


   And here’s our unzip-like function as a fold:

   bunzip :: [(a, b)] -> ([a], [b])
   bunzip xs = foldr g ([], []) xs
          where g (x, y) (as, bs) = (x : as, y : bs)

   In fact, I’d suggest that bunzip-in-terms-of-foldr is actually easier
   to understand than the original definition.
   Many other common list functions can be expressed as right folds,
   too!
Function composition (I)


   Remember your high school algebra?


                        f ◦ g (x) ≡ f (g (x))
                            f ◦ g ≡ λx → f (g x)

   Taking the above notation and writing it in Haskell, we get this:
   f . g = \x -> f (g x)
   The backslash is Haskell ASCII art for λ, and introduces an
   anonymous function. Between the backslash and the ASCII arrow
   are the function’s arguments, and following the arrow is its body.
Function composition (II)

   f . g = \x -> f (g x)
   The result of f . g is a function that accepts one argument,
   applies g to it, then applies f to the result.
   So the expression succ . succ adds two to a number, for instance.
   Why care about this? Well, here’s our old definition of
   map-as-foldr.
   bmap f xs = foldr g [] xs
       where g y ys = f y : ys
   And here’s a more succinct version.
   bmap f xs = foldr ((:) . f) [] xs
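   A quick check of composition in action (addTwo is just a throwaway
   name for this example):

```haskell
-- succ . succ applies succ twice, adding two.
addTwo :: Int -> Int
addTwo = succ . succ

main :: IO ()
main = print (addTwo 5)  -- 7
```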
The left fold


   So we’ve seen folds that associate to the right:
   1 + (2 + (3 + 4))

   What about folds that associate to the left?
   ((1 + 2) + 3) + 4

   Not surprisingly, the left fold does indeed exist, and is named foldl .
   foldl f z []     = z
   foldl f z (x:xs) = foldl f (f z x) xs
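   Subtraction is not associative, so it makes the difference between
   the two folds visible (a quick check, not from the slides):

```haskell
main :: IO ()
main = do
  print (foldr (-) 0 [1, 2, 3 :: Int])  -- 1 - (2 - (3 - 0)) = 2
  print (foldl (-) 0 [1, 2, 3 :: Int])  -- ((0 - 1) - 2) - 3 = -6
```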
The right fold in pictures
The left fold in pictures
Folds and laziness


   Which of these definitions for adding up the elements of a list is
   better?

   sum1 xs = foldr (+) 0 xs
   sum2 xs = foldl (+) 0 xs

   That’s a hard question to approach without a sense of what lazy
   evaluation will cause to happen.
   Suppose an oracle generates the list [1..1000] for us at a rate of
   one element per second.
Sum as right fold




   In the first second, we see the partial expression
   1 : (...) {- can't see anything more yet -}
   But we want to know the result as soon as possible, so we
   generate a partial result:
   1 + (...) {- can't make any more progress yet -}
Second number two

  In the second second, we now have the partial expression
  1 : (2 : ...) {- can't see anything more yet -}

  We thus construct a little more of our eventual result:
  1 + (2 + (...)) {- still no further progress -}

  Because we’re constructing a right-associative expression (that’s
  what a right fold is for), we can’t create an intermediate result of
  the-sum-so-far at any point.
  In other words, we’re creating a big expression containing 1000
  nested applications of (+), which we’ll only be able to fully
  evaluate at the end of the list!
What happens in practice?

   On casual inspection, it’s not clear that this right fold business
   really matters.

   Prelude> foldr (+) 0 [0..100000]
   5000050000

   But if we try to sum a longer list, we get a problem:

   Prelude> foldr (+) 0 [0..1000000]
   *** Exception: stack overflow

   The GHC runtime imposes a limit on the size of a deferred
   expression to reduce the likelihood of us shooting ourselves in the
   foot. Or at least to make the foot-shooting happen early enough
   that it won’t be a serious problem.
Left folds are better . . . uh, right?


   Obviously, a left fold can’t tell us the sum before we reach the end
   of a list, but it has a promising property.
   Given a list like this:
   1 : 2 : 3 :         ...

   Then our sum-via-foldl will produce a result like this:
    ((1 + 2) + 3) + . . .

   This is left associative, so we could potentially evaluate the
   leftmost portion on the fly: add 1 + 2 to give 3, then add 3 to
   give 6, and so on, keeping a single Int as the rolling sum-so-far.
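   The rolling sums-so-far are exactly what the standard scanl function
   produces; a strict left fold keeps only the latest one:

```haskell
-- scanl exposes every intermediate accumulator of a left fold.
main :: IO ()
main = print (scanl (+) 0 [1, 2, 3, 4 :: Int])  -- [0,1,3,6,10]
```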
Are we out of the woods?


   So we know that this will fail:

   Prelude> foldr (+) 0 [0..1000000]
   *** Exception: stack overflow

   But what about this?

   Prelude> foldl (+) 0 [0..1000000]
   *** Exception: stack overflow

   Hey! Shouldn’t the left fold have saved our bacon?
Why did foldl not help?


   Alas, consider the definition of foldl :
   foldl :: (a -> b -> a) -> a -> [b] -> a
   foldl f z []     = z
   foldl f z (x:xs) = foldl f (f z x) xs

   Because foldl is polymorphic, there is no way it can inspect the
   result of f z x.
   And since the intermediate result of each f z x can’t be evaluated,
   a huge unevaluated expression piles up until we reach the end of
   the list, just as with foldr!
Tracing evaluation


   How can we even get a sense of what pure Haskell code is actually
   doing?

   import Debug.Trace

   foldlT :: (Show a) =>
             (a -> b -> a) -> a -> [b] -> a
   foldlT f z []     = z
   foldlT f z (x:xs) =
       let i = f z x
       in trace ("now " ++ show i) foldlT f i xs
What does trace do?

   The trace function is a magical function of sin: it prints its first
   argument on stderr, then returns its second argument.
   The expression trace ("now " ++ show i) foldlT prints
   something, then returns foldlT.
   If you have the patience, run this in ghci:

   Prelude> foldlT (+) 0 [0..1000000]
   now 0
   now 1
   now 3
   ...blah blah blah...
   500000500000

   Whoa! It eventually prints a result, where plain foldl failed!
What’s going on?




      In order to print an intermediate result, trace must evaluate it
      first.
      Haskell’s call-by-need evaluation ensures that an unevaluated
      expression will be overwritten with the evaluated result.
      Instead of a constantly-growing expression, we thus have a
      single primitive value as the running sum at each iteration
      through the loop.
Is a debugging hack the answer to our problem?


   Clearly, using trace seems like an incredibly lame solution to the
   problem of evaluating intermediate results, although it’s very useful
   for debugging.
   (Actually, it’s the only Haskell debugger I use.)
   The real solution to our problem lies in a function named seq.
   seq :: a -> t -> t

   This function is a rather magical hack; all it does is evaluate its
   first argument until it reaches the outermost constructor (weak head
   normal form), then return its second argument.
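   Since seq only evaluates to the outermost constructor, it won't look
   inside one; a small sketch:

```haskell
main :: IO ()
main = do
  print (seq (1 + 2 :: Int) "forced")  -- evaluates 3, prints "forced"
  -- A pair is already a constructor, so seq stops at (,) and the
  -- undefined components are never touched.
  print (seq ((undefined, undefined) :: (Int, Int)) "fine")  -- prints "fine"
```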
Folding with seq
   The Data.List module defines the following function for us:

   foldl' :: (a -> b -> a) -> a -> [b] -> a
   foldl' f z []     = z
   foldl' f z (x:xs) = let i = f z x
                       in i `seq` foldl' f i xs

   Let’s compare the two in practice:

   Prelude> foldl (+) 0 [0..1000000]
   *** Exception: stack overflow

   Prelude> import Data.List
   Prelude> foldl’ (+) 0 [0..1000000]
   500000500000
Rules of thumb for folds




   If you can generate your result lazily and incrementally, e.g. as
   map does, use foldr.
   If you are generating what is conceptually a single result (e.g. one
   number), use foldl', because it will evaluate those tricky
   intermediate results strictly.
   Never use plain old foldl without the little tick at the end.
Homework
  For each of the following, choose the appropriate fold for the kind
  of result you are returning. You can find the type signature for
  each function using ghci.
      Write concat using a fold.
      Write length using a fold.
      Write (++) using a fold.
      Write and using a fold.
      Write unwords using a fold.
      Write () from Data.List using a fold.

  For super duper bonus points:
      Write either foldr or foldl in terms of the other. (Hint 1:
      only one is actually possible. Hint 2: the answer is highly
      non-obvious, and involves higher order functions.)
