Functional Programming with Immutable Data Structures

author: Ivar Thorson

Published in: Technology

  1. 1. Functional Programming with Immutable Data Structures Why Imperative Languages are Fundamentally Broken in a Multi-Threaded Environment Ivar Thorson Italian Institute of Technology November 2, 2010
  2. 2. Ivar Thorson
  3. 3. Research interests:
  4. 4. Compliant actuation
  5. 5. Hopping Robots
  6. 6. Rigid Body Dynamics Simulations
  7. 7. The Next 37 minutes:
  8. 8. Important abstractions of functional programming
  9. 9. particularly for a 4-year old language
  10. 10. Rich Hickey
  11. 11. What I have learned from his work
  12. 12. Clojure: Blending theoretical abstractions + practical know-how
  13. 13. Target Audience: C, C++, Java, Matlab Programmers
  14. 14. Tempting to start with a feature tour!
  15. 15. But you won’t understand why Clojure is cool without context
  16. 16. Actually, I’m going to try to knock the cup out of your hand.
  17. 17. Three Outdated Concepts
  18. 18. Three Outdated Concepts 1. Variables
  19. 19. Three Outdated Concepts 1. Variables 2. Syntax
  20. 20. Three Outdated Concepts 1. Variables 2. Syntax 3. Object Orientation
  21. 21. Goal is to convince you that
  22. 22. Goal is to convince you that 1. shared mutable data is now a philosophically bankrupt idea.
  23. 23. Goal is to convince you that 1. shared mutable data is now a philosophically bankrupt idea. 2. code and data should be structured as trees
  24. 24. Goal is to convince you that 1. shared mutable data is now a philosophically bankrupt idea. 2. code and data should be structured as trees 3. OOP isn’t the best way to achieve OOP’s goals
  25. 25. Speaking bluntly
  26. 26. Speaking bluntly 1. everything you know is wrong (15 min)
  27. 27. Speaking bluntly 1. everything you know is wrong (15 min) 2. lisp parentheses are better than syntax (10 min)
  28. 28. Speaking bluntly 1. everything you know is wrong (15 min) 2. lisp parentheses are better than syntax (10 min) 3. OOP inheritance sucks (5 min)
  29. 29. disclaimer: some hyperbole in previous statements
  30. 30. Oh noes, too many parentheses!
  31. 31. Good reasons for parentheses
  32. 32. Good reasons for parentheses 1. Lisp is homoiconic
  33. 33. Good reasons for parentheses 1. Lisp is homoiconic 2. Parentheses uniquely define tree-shaped computations
  34. 34. Good reasons for parentheses 1. Lisp is homoiconic 2. Parentheses uniquely define tree-shaped computations 3. Parentheses enable structural editing
  35. 35. For now, please be patient
  36. 36. Introduction: Motivation for multi-threaded programming
  37. 37. 1. Last 40 years: Moore’s Law
  38. 38. 1. Last 40 years: Moore’s Law 2. “Transistor count will double every 2 years”
  39. 39. # of transistors ≈ CPU performance
  40. 40. Constraining physical relationship between power density, switching time, oxide thickness
  41. 41. The future of hardware is increasingly parallel
  42. 42. The future of software will be ruled by Amdahl’s law
  43. 43. Some things are sequential: Two women cannot have a baby in 4.5 months.
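For reference, Amdahl's law is usually written as below. The symbols are not from the slides: p is assumed to be the fraction of the work that can run in parallel, n the number of processors, and S(n) the resulting speedup.

```latex
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}},
\qquad
\lim_{n \to \infty} S(n) = \frac{1}{1 - p}
```

As a worked example under those assumptions, p = 0.9 and n = 8 give S = 1 / (0.1 + 0.1125) ≈ 4.7, and no number of processors can push the speedup past 1 / 0.1 = 10.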
  44. 44. 1. Dividing up work is already a hard design task
  45. 45. 1. Dividing up work is already a hard design task 2. Resource contention makes this problem harder
  46. 46. Common multi-threaded bugs:
  47. 47. Common multi-threaded bugs: Invalid state
  48. 48. Common multi-threaded bugs: Invalid state Race conditions
  49. 49. Common multi-threaded bugs: Invalid state Race conditions Deadlocks
  50. 50. Common multi-threaded bugs: Invalid state Race conditions Deadlocks Livelocks
  51. 51. Common multi-threaded bugs: Invalid state Race conditions Deadlocks Livelocks Resource starvation
  52. 52. What if most bugs were merely due to the imperative programming model?
  53. 53. Part 1: The Functional Programming Style
  54. 54. Pure functions always return the same result when they get the same input.
  55. 55. Pure functions don’t...
  56. 56. Pure functions don’t... ...look outside their box
  57. 57. Pure functions don’t... ...look outside their box ...modify anything, anywhere
  58. 58. Pure functions don’t... ...look outside their box ...modify anything, anywhere ...print messages to the user
  59. 59. Pure functions don’t... ...look outside their box ...modify anything, anywhere ...print messages to the user ...write to disk
  60. 60. Pure functions have no side effects
  61. 61. Nothing would change if you ran the function again – anywhere!
  62. 62. f(x) = x^2 + 1
  63. 63. Pure functions just return a value, and do nothing more.
  64. 64. Pure functions compose to other pure functions
  65. 65. f (a, b, c) = (a + b)/(c ∗ 2)
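A minimal Clojure sketch of the two formulas above. The names `g` and `f-twice` are illustrative assumptions (the slides call both functions f).

```clojure
;; f(x) = x^2 + 1: the result depends only on the argument.
(defn f [x]
  (+ (* x x) 1))

;; f(a, b, c) = (a + b) / (c * 2), renamed g here to avoid a clash.
(defn g [a b c]
  (/ (+ a b) (* c 2)))

;; Composing pure functions yields another pure function.
(def f-twice (comp f f))

(f 3)        ; => 10
(g 1 2 3)    ; => 1/2 (Clojure keeps an exact ratio for integer division)
(f-twice 2)  ; => 26
```

Calling any of these again, on any thread, gives the same answer and changes nothing else.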
  66. 66. Languages that emphasize the use of pure functions are called functional languages
  67. 67. Imperative languages describe computation in terms of changes to state.
  68. 68. C, C++, Java, and most engineering languages are imperative.
  69. 69. Imperative languages describe memory operations instead of purely functional operations.
  70. 70. Imperative style: directly causing side effects on memory.
  71. 71. The assignment operator changes memory...often a side effect!
  72. 72. Could you write a C program...
  73. 73. Could you write a C program... without any non-local variables?
  74. 74. Could you write a C program... without any non-local variables? where = is only used for initialization?
  75. 75. Next: Variables are fundamentally a bad abstraction in multithreaded environments.
  76. 76. Claim #1. Shared mutable data is now philosophically bankrupt
  77. 77. x = x + 1
  78. 78. x = x + 1 x = 3 (...last time I checked!)
  79. 79. x = x + 1 x = 3 (...last time I checked!) In my universe, 3 = 3 + 1 is never true
  80. 80. The Big Problem: the concept of variables encourages us to forget about time.
  81. 81. x[t] = x0 x[t + 1] = x[t] + 1
  82. 82. The value of x for a given t is immutable and unchanging!
  83. 83. The Most Important Slide
  84. 84. The Most Important Slide x is a name, an identity of a sequence of values
  85. 85. The Most Important Slide x is a name, an identity of a sequence of values x has different values at different times
  86. 86. The Most Important Slide x is a name, an identity of a sequence of values x has different values at different times The values of x are related by pure functions
  87. 87. The Most Important Slide x is a name, an identity of a sequence of values x has different values at different times The values of x are related by pure functions In this case, by the increment function
  88. 88. The idea of a variable confuses identity and the most current value!
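The identity/value split can be sketched in Clojure roughly as follows. The atom is one of Clojure's standard reference types; the slides do not name it, so treat its use here, and the names `x-over-time` and `before`, as assumptions for illustration.

```clojure
;; The sequence of values: x[0] = 5, x[t+1] = (inc x[t]).
;; Each element is an immutable value.
(def x-over-time (iterate inc 5))

(take 4 x-over-time)   ; => (5 6 7 8)

;; An identity: a name that refers to different values at different times.
(def x (atom 5))

(def before @x)   ; capture the value x refers to right now
(swap! x inc)     ; x now refers to the next value, produced by a pure function

[before @x]       ; => [5 6]  (the captured value did not change)
```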
  89. 89. Locking: a tactic for winning a battle.
  90. 90. What we need is a strategy to win the war.
  91. 91. “What if all data was immutable?” – Rich Hickey (not the first one to ask this question)
  92. 92. Keeping Old Immutable Data
  93. 93. Keeping Old Immutable Data x@(t=0) → 5
  94. 94. Keeping Old Immutable Data x@(t=0) → 5 x@(t=1) → 6
  95. 95. Keeping Old Immutable Data x@(t=0) → 5 x@(t=1) → 6 x@(t=13) → 7
  96. 96. Keeping Old Immutable Data x@(t=0) → 5 x@(t=1) → 6 x@(t=13) → 7 x@(t=15) → 8
  97. 97. Keeping Old Immutable Data x@(t=0) → 5 x@(t=1) → 6 x@(t=13) → 7 x@(t=15) → 8 ...
  98. 98. The old values of the data are kept and indexed by time
  99. 99. The old values of the data are kept and indexed by time Data is immutable once created – we cannot/will not change it!
  100. 100. The old values of the data are kept and indexed by time Data is immutable once created – we cannot/will not change it! Values only destroyed when unneeded
  101. 101. Doesn’t keeping old copies of data consume too much memory?
  102. 102. (a b c) + d = (a b c d)
  103. 103. (a b c) + d = (a b c d) What if the input and output shared structure?
  104. 104. (a b c) + d = (a b c d) What if the input and output shared structure? Sharing structure is dangerous for mutable data
  105. 105. (a b c) + d = (a b c d) What if the input and output shared structure? Sharing structure is dangerous for mutable data ...but sharing structure is safe if the data is immutable.
  106. 106. The Trick: Represent the list as a tree
  107. 107. input tree → pure function → output tree
  108. 108. both trees are immutable but distinct
  109. 109. Same approach works also for insertions, modifications, deletions, and all other list operations.
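A small sketch of what this looks like with Clojure's built-in persistent collections. The variable names are illustrative; the internal tree layout of vectors is not shown, only the observable behaviour.

```clojure
(def v1 [:a :b :c])
(def v2 (conj v1 :d))      ; "adding" returns a new vector

v1                         ; => [:a :b :c]     (the input is untouched)
v2                         ; => [:a :b :c :d]

;; Modification and deletion also return new values.
(def v3 (assoc v2 0 :z))   ; => [:z :b :c :d]
(def v4 (pop v2))          ; => [:a :b :c]

;; With lists the sharing is directly observable: the tail of the
;; new list is the very same object as the old list.
(def l1 (list :b :c))
(def l2 (cons :a l1))
(identical? l1 (rest l2))  ; => true
```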
  110. 110. a million threads, a million trees, a million references to trees, zero locks
  111. 111. If you want a more current worldview, just get its reference
  112. 112. Advantages of immutable trees:
  113. 113. Advantages of immutable trees: No locking required
  114. 114. Advantages of immutable trees: No locking required No ‘stopping the world’ to see (readers don’t block writers)
  115. 115. Advantages of immutable trees: No locking required No ‘stopping the world’ to see (readers don’t block writers) Worldview never becomes corrupted
  116. 116. Advantages of immutable trees: No locking required No ‘stopping the world’ to see (readers don’t block writers) Worldview never becomes corrupted Minimizes memory use while maintaining multiple copies
  117. 117. Advantages of immutable trees: No locking required No ‘stopping the world’ to see (readers don’t block writers) Worldview never becomes corrupted Minimizes memory use while maintaining multiple copies Unused nodes are garbage-collected
  118. 118. We’ve gone a long way, but we’re only half way to real concurrency
  119. 119. Immutability lets us read concurrently, but not write concurrently to a single piece of data
  120. 120. How can we coordinate the actions of different threads working on the same data at the same time?
  121. 121. “Treat changes in an identity’s value as a database transaction.” – Rich Hickey
  122. 122. Database Details:
  123. 123. Database Details: Software Transactional Memory (STM)
  124. 124. Database Details: Software Transactional Memory (STM) Multi-Version Concurrency Control (MVCC)
  125. 125. The STM Guarantees:
  126. 126. The STM Guarantees: Atomicity (All or nothing)
  127. 127. The STM Guarantees: Atomicity (All or nothing) Consistency (Validation before commits)
  128. 128. The STM Guarantees: Atomicity (All or nothing) Consistency (Validation before commits) Isolation (Transactions can’t see each other)
  129. 129. Transactions are speculative and may be retried if there is a collision.
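A minimal sketch of a transaction using Clojure's `ref`, `dosync`, and `alter`. The account names and the `transfer!` function are hypothetical, chosen only to show an all-or-nothing update of two identities.

```clojure
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer!
  "Moves amount between two refs. Both changes commit together or not at all;
  if another transaction touches the same refs first, this body is retried."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)

[@checking @savings]   ; => [70 30]
```

Because the body may run more than once, it should contain only pure functions of the refs' values, which is the same discipline as Part 1.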
  130. 130. For even more concurrency:
  131. 131. For even more concurrency: Sometimes you don’t care about the order of function application
  132. 132. For even more concurrency: Sometimes you don’t care about the order of function application Commutative writers won’t need to retry
  133. 133. For even more concurrency: Sometimes you don’t care about the order of function application Commutative writers won’t need to retry Writers don’t interfere with other writers!
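A sketch of a commutative write using `commute`. The counter `hits` and the function `record-hit!` are hypothetical names; plain Java threads are an assumption made to keep the example self-contained.

```clojure
(def hits (ref 0))

(defn record-hit!
  "Increments the counter. Increments commute, so the order in which
  concurrent transactions apply them does not matter."
  []
  (dosync (commute hits inc)))

;; Start 100 threads that each record one hit, then wait for all of them.
(let [threads (doall (repeatedly 100 #(Thread. ^Runnable record-hit!)))]
  (doseq [^Thread t threads] (.start t))
  (doseq [^Thread t threads] (.join t)))

@hits   ; => 100
```

Swapping `commute` for `alter` would give the same final count, but conflicting writers would then have to retry.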
  134. 134. STMs exist for other languages...
  135. 135. STMs exist for other languages... ... but Clojure is first to have built-in STM with pervasive immutability
  136. 136. Part 1 Summary: In Clojure...
  137. 137. Part 1 Summary: In Clojure... ...Readers don’t block anybody
  138. 138. Part 1 Summary: In Clojure... ...Readers don’t block anybody ...Writers don’t block anybody
  139. 139. Part 1 Summary: In Clojure... ...Readers don’t block anybody ...Writers don’t block anybody ...Writers retry if conflicting
  140. 140. Part 1 Summary: In Clojure... ...Readers don’t block anybody ...Writers don’t block anybody ...Writers retry if conflicting ...Writers don’t retry if commutative
  141. 141. Variables couldn’t do this because:
  142. 142. Variables couldn’t do this because: Confuse identity and values
  143. 143. Variables couldn’t do this because: Confuse identity and values Are mutable and can be corrupted
  144. 144. Variables couldn’t do this because: Confuse identity and values Are mutable and can be corrupted Assume a single thread of control, no interruptions
  145. 145. Variables couldn’t do this because: Confuse identity and values Are mutable and can be corrupted Assume a single thread of control, no interruptions Maintain only the last written copy
  146. 146. “Mutable stateful objects are the new spaghetti code” – Rich Hickey
  147. 147. “We oppose the uncontrolled mutation of variables.” – Stuart Halloway
  148. 148. “Mutable objects are a concurrency disaster.” – Rich Hickey
  149. 149. “The future is a function of the past, but doesn’t change it.” – Rich Hickey
  150. 150. “Many people can watch a baseball game, but only one can be at bat.” – Rich Hickey
  151. 151. Part 2: Revenge of the Lisp
  152. 152. Claim #2: Your language’s syntax is unnecessarily complex
  153. 153. Now we’ll explain why lisp has parentheses!
  154. 154. Pure functions represent computation as trees
  155. 155. Reason 1: The tree structure is made explicitly visible by the parentheses
  156. 156. Sorry, Haskell/Erlang/OCaml/ML/etc!
  157. 157. Infix Notation 1 + 1
  158. 158. Prefix Notation (+ 1 1)
  159. 159. (fn arg1 arg2 arg3 ...)
  160. 160. 1 + 2 + 3 + 4
  161. 161. (+ 1 2 3 4)
  162. 162. With prefix notation, you can forget about the rules of precedence.
  163. 163. 6 + 12 / 2 * 3
  164. 164. (6 + 12) / (2 * 3)
  165. 165. (/ (+ 6 12) (* 2 3))
  166. 166. (/ (+ 6 12) (* 2 3))
  167. 167. Triangular Tree Structure
  168. 168. Lisp code’s tree of computation is exceptionally visible and regular.
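The infix examples above can be checked at a Clojure REPL. The first form spells out what the usual precedence rules would do implicitly; that reading of the unparenthesized expression is the conventional one, not something the slides state.

```clojure
;; 6 + 12 / 2 * 3 under the usual precedence rules: 6 + ((12 / 2) * 3)
(+ 6 (* (/ 12 2) 3))   ; => 24

;; (6 + 12) / (2 * 3)
(/ (+ 6 12) (* 2 3))   ; => 3

;; 1 + 2 + 3 + 4: the operator appears once and takes any number of arguments.
(+ 1 2 3 4)            ; => 10
```

In each prefix form the nesting of the parentheses is the evaluation tree, so there is nothing left for precedence rules to decide.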
  169. 169. Reason 2: Homoiconicity
  170. 170. Homoiconic = Homo + icon = “same” + “representation”
  171. 171. The property where code & a language primitive look the same.
  172. 172. An example: Writing C with XML
  173. 173. for (i=0; i<100; i++) { printf("%d\n", i); dostuff(); }
  174. 174. <for> <init>i = 0</init> <test>i < 100</test> <count>i++</count> <body> <print format="%d" args="i"/> <dostuff/> </body> </for>
  175. 175. Imagine how simple it would be to use an XML generator to emit compilable source code.
  176. 176. We could modify our code programmatically.
  177. 177. In lisp, you can write programs that write programs.
  178. 178. (list 1 2 3) -> (1 2 3)
  179. 179. (list '+ 1 2 3) -> (+ 1 2 3) (+ 1 2 3) -> 6
  180. 180. (defmacro and ([] true) ([x] x) ([x & rest] `(let [and# ~x] (if and# (and ~@rest) and#))))
  181. 181. Why lispers go nuts:
  182. 182. Why lispers go nuts: Macros in lisp are far more powerful than in other languages.
  183. 183. Why lispers go nuts: Macros in lisp are far more powerful than in other languages. You can build new constructs that are just as legitimate as existing constructs like if
  184. 184. Why lispers go nuts: Macros in lisp are far more powerful than in other languages. You can build new constructs that are just as legitimate as existing constructs like if You can abstract away boilerplate code
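A short sketch of both ideas: building code as a list, and defining a new construct with a macro. The `unless` macro is a hypothetical example, not part of Clojure's core library.

```clojure
;; Code is data: build an expression as an ordinary list, then evaluate it.
(def expr (list '+ 1 2 3))

expr          ; => (+ 1 2 3)
(eval expr)   ; => 6

;; A new control construct: run the body only when the test is false.
(defmacro unless [test & body]
  `(if ~test nil (do ~@body)))

(unless false :ran)   ; => :ran
(unless true  :ran)   ; => nil

;; The macro is just a function from code to code.
(macroexpand-1 '(unless false :ran))
;; => (if false nil (do :ran))
```

Because `unless` expands at compile time, its body is not evaluated when the test is true, which an ordinary function could not guarantee.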
  185. 185. Aside
  186. 186. Aside If Java used parentheses properly, XML wouldn’t exist
  187. 187. Aside If Java used parentheses properly, XML wouldn’t exist Lisp parentheses describe structure
  188. 188. Aside If Java used parentheses properly, XML wouldn’t exist Lisp parentheses describe structure Most languages use ad-hoc syntax, data formats
  189. 189. Aside If Java used parentheses properly, XML wouldn’t exist Lisp parentheses describe structure Most languages use ad-hoc syntax, data formats Simplicity is elegance
  190. 190. Reason 3: Structural Editing
  191. 191. Good editors let you work with code in blocks and forget about the parentheses.
  192. 192. Hard-to-show Examples
  193. 193. Hard-to-show Examples When you type (, emacs adds the )
  194. 194. Hard-to-show Examples When you type (, emacs adds the ) Indentation is automatic
  195. 195. Hard-to-show Examples When you type (, emacs adds the ) Indentation is automatic You can easily navigate hierarchically
  196. 196. Hard-to-show Examples When you type (, emacs adds the ) Indentation is automatic You can easily navigate hierarchically Take next three expressions, apply them to a function
  197. 197. Summary of Lisp Parentheses
  198. 198. Summary of Lisp Parentheses Parentheses render explicit the tree-structure of your program
  199. 199. Summary of Lisp Parentheses Parentheses render explicit the tree-structure of your program Homoiconicity lets you write programs with programs
  200. 200. Summary of Lisp Parentheses Parentheses render explicit the tree-structure of your program Homoiconicity lets you write programs with programs Structural Editing is fun and easy
  201. 201. Syntax is bad because
  202. 202. Syntax is bad because It hides the program’s structure
  203. 203. Syntax is bad because It hides the program’s structure It destroys homoiconicity
  204. 204. Syntax is bad because It hides the program’s structure It destroys homoiconicity It is needlessly complex
  205. 205. “Things should be made as simple as possible – but no simpler.” – Albert Einstein
  206. 206. Part 3: OOP isn’t the only path to Polymorphism and Code Reuse
  207. 207. OOP has good goals
  208. 208. OOP has good goals 1. to group objects together
  209. 209. OOP has good goals 1. to group objects together 2. to encapsulate
  210. 210. OOP has good goals 1. to group objects together 2. to encapsulate 3. to dispatch polymorphically
  211. 211. OOP has good goals 1. to group objects together 2. to encapsulate 3. to dispatch polymorphically 4. to reuse code
  212. 212. These are all good ideas and good goals.
  213. 213. However, OOP is not the only way to reach these goals.
  214. 214. Claim #3: “Functions compose better than objects.”
  215. 215. The fundamental mechanism of OOP – the inheritance of data, interfaces, type, or methods from a parent – is often more difficult to use in practice than techniques that use functions to achieve the same effect.
  216. 216. Functions are simpler than objects.
  217. 217. Objects are semantic compounds of types, data, and methods.
  218. 218. Implementation inheritance is bad:
  219. 219. Implementation inheritance is bad: Forces “is-a” relationship instead of “has-a”, and “has-a” is almost always better
  220. 220. Implementation inheritance is bad: Forces “is-a” relationship instead of “has-a”, and “has-a” is almost always better Hierarchical nominalization is difficult
  221. 221. Implementation inheritance is bad: Forces “is-a” relationship instead of “has-a”, and “has-a” is almost always better Hierarchical nominalization is difficult Changes to a class affect all the subclasses
  222. 222. An example will help clarify.
  223. 223. Balls
  224. 224. Balls Ball class (presumably round)
  225. 225. Balls Ball class (presumably round) rollingBall subclass
  226. 226. Balls Ball class (presumably round) rollingBall subclass bouncingBall subclass
  227. 227. Problems
  228. 228. Problems What happens if we want to make a ball that both rolls and bounces?
  229. 229. Problems What happens if we want to make a ball that both rolls and bounces? Do/Can we inherit from both?
  230. 230. Problems What happens if we want to make a ball that both rolls and bounces? Do/Can we inherit from both? What if our ball cracks and loses its bounciness?
  231. 231. Problems What happens if we want to make a ball that both rolls and bounces? Do/Can we inherit from both? What if our ball cracks and loses its bounciness? Is a non-round rugby ball a subclass of ball too?
  232. 232. Interfaces are Simpler
  233. 233. Interfaces are Simpler Define functional interfaces, but don’t inherit the implementation
  234. 234. Interfaces are Simpler Define functional interfaces, but don’t inherit the implementation If you want to use another object’s function to accomplish a task, just use it
  235. 235. Interfaces are Simpler Define functional interfaces, but don’t inherit the implementation If you want to use another object’s function to accomplish a task, just use it No need to encapsulate their function in an object
  236. 236. Interfaces are Simpler Define functional interfaces, but don’t inherit the implementation If you want to use another object’s function to accomplish a task, just use it No need to encapsulate their function in an object Multiple interfaces are simpler than multiple inheritance
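One way to sketch the ball example without a class hierarchy is with Clojure protocols. The protocol and record names below are hypothetical, and protocols are only one of several mechanisms the talk could have had in mind.

```clojure
;; Two independent interfaces; neither knows about the other.
(defprotocol Rolls
  (roll [this]))

(defprotocol Bounces
  (bounce [this]))

;; A ball that both rolls and bounces simply implements both.
(defrecord Basketball []
  Rolls
  (roll [_] "basketball rolls")
  Bounces
  (bounce [_] "basketball bounces"))

;; A non-round ball implements only what applies to it.
(defrecord RugbyBall []
  Bounces
  (bounce [_] "rugby ball bounces unpredictably"))

(roll (->Basketball))              ; => "basketball rolls"
(bounce (->RugbyBall))             ; => "rugby ball bounces unpredictably"
(satisfies? Rolls (->RugbyBall))   ; => false
```

No implementation is inherited, so the questions about inheriting from both subclasses or about where the rugby ball sits in the tree never come up.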
  237. 237. FREE THE VERBS!! Separate your object methods from your objects!
  238. 238. Example: Single vs Multiple Dispatch
  239. 239. Making Drum Noises
  240. 240. Making Drum Noises Drum, cymbal and stick classes
  241. 241. Making Drum Noises Drum, cymbal and stick classes When I hit something with the stick, it makes a noise
  242. 242. Making Drum Noises Drum, cymbal and stick classes When I hit something with the stick, it makes a noise Single Dispatch: drum.makeNoise(drumstick)
  243. 243. Making Drum Noises Drum, cymbal and stick classes When I hit something with the stick, it makes a noise Single Dispatch: drum.makeNoise(drumstick) cymbal.makeNoise(drumstick)
  244. 244. The verbs are owned by the nouns.
  245. 245. But what happens when I add a different stick class?
  246. 246. Now I will need to modify the drum and cymbal classes and add new methods to handle the mallets!
  247. 247. When two objects hit, the sound is a function of both objects.
  248. 248. With multi-method functions
  249. 249. With multi-method functions hit(drumstick, cymbal) = crash
  250. 250. With multi-method functions hit(drumstick, cymbal) = crash hit(mallet, cymbal) = roar
  251. 251. With multi-method functions hit(drumstick, cymbal) = crash hit(mallet, cymbal) = roar hit(drumstick, drum) = bam
  252. 252. With multi-method functions hit(drumstick, cymbal) = crash hit(mallet, cymbal) = roar hit(drumstick, drum) = bam hit(mallet, drum) = bom
  253. 253. With multi-method functions hit(drumstick, cymbal) = crash hit(mallet, cymbal) = roar hit(drumstick, drum) = bam hit(mallet, drum) = bom hit(drumstick, drumstick) = click
  254. 254. With multi-method functions hit(drumstick, cymbal) = crash hit(mallet, cymbal) = roar hit(drumstick, drum) = bam hit(mallet, drum) = bom hit(drumstick, drumstick) = click hit(cymbal, cymbal) = loud crash
  255. 255. IMPORTANT: hit() is not a single function, but a collection of functions. (It is called a multi-method or generic function)
  256. 256. The particular function that is called is determined by the type of both of its arguments.
  257. 257. As you add more classes, just add more definitions of hit().
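The table of hit() definitions maps closely onto Clojure multimethods. In this sketch the instruments are plain maps with a `:kind` key, which is an assumption made for illustration; the sounds are taken from the slides.

```clojure
;; Dispatch on the kinds of BOTH arguments.
(defmulti hit
  (fn [striker target] [(:kind striker) (:kind target)]))

(defmethod hit [:drumstick :cymbal]    [_ _] "crash")
(defmethod hit [:mallet    :cymbal]    [_ _] "roar")
(defmethod hit [:drumstick :drum]      [_ _] "bam")
(defmethod hit [:mallet    :drum]      [_ _] "bom")
(defmethod hit [:drumstick :drumstick] [_ _] "click")
(defmethod hit [:cymbal    :cymbal]    [_ _] "loud crash")

(def drumstick {:kind :drumstick})
(def mallet    {:kind :mallet})
(def drum      {:kind :drum})
(def cymbal    {:kind :cymbal})

(hit drumstick cymbal)   ; => "crash"
(hit mallet drum)        ; => "bom"
(hit cymbal cymbal)      ; => "loud crash"
```

Supporting a new kind of stick means adding more `defmethod` lines; the existing drum and cymbal definitions are left alone.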
  258. 258. Part 3 Conclusion
  259. 259. Part 3 Conclusion Polymorphism is better done through interfaces than subtype inheritance.
  260. 260. Part 3 Conclusion Polymorphism is better done through interfaces than subtype inheritance. Functions do not require a hierarchy
  261. 261. Part 3 Conclusion Polymorphism is better done through interfaces than subtype inheritance. Functions do not require a hierarchy Functions allow simple multiple dispatch
  262. 262. The End
  263. 263. Any Questions?