Functional Programming for Computing Clouds


With the advent of multicore CPUs, cloud computing and Big Data, we are
currently observing changes that will eventually lead information
technology into a whole new era, and we need to search for
programming language paradigms that match it. Will Functional
Programming languages (FPLs) be the game changer?

Published in: Technology

  1. Functional Programming for computing clouds
     Joerg Fritsch, NATO CI Agency
     School of Computer Science & Informatics
     Cardiff University, 24 October 2012
  2. Agenda
     • Essentials of Functional Programming Languages
     • Vision and requirements of computing clouds
     • Haskell: a Pure Functional Programming Language
     • Some innocent code: working with immutable data
     • Gaps between the cloud computing vision and FPLs
  3. Functional Programming Languages
     • Based on the λ-calculus
     • Declarative
     • Functions are declared; they describe the relation between input and output
     • Functions always evaluate to the same value for a given argument (“free of side effects”).
     • Variables are assigned once.
     • Functional PLs that by default exclude destructive modifications (to data structures) are called “pure.”
  4. λ-calculus
     • Alonzo Church, 1930s
     • Small grammar
     • Parts of the grammar reappear in LISP and Haskell syntax
     • Can express everything that is computable
     • No state!
  5. Pure FPLs
     • Functions can be composed, curried, etc.
     • All pure functions can be executed in parallel
     • Compiler can make it fit for multicore: e.g. re-arrange the order of function execution or inline.
     • Runtime can cache function evaluation.
     • IO is a beast that disturbs these concepts & needs to be tamed (for example with a monad).
     • Every Haskell program is a function in the IO monad.
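
The points on currying, composition and IO can be illustrated with a minimal sketch. It is not taken from the slides; the names `scale`, `double`, `increment`, `doubleThenIncrement` and `report` are made up for illustration.

```haskell
-- Currying: applying a two-argument function to one argument
-- yields a new function that waits for the second argument.
scale :: Int -> Int -> Int
scale factor x = factor * x

double :: Int -> Int
double = scale 2              -- partial application of a curried function

increment :: Int -> Int
increment = (+ 1)

-- Composition: (f . g) x is the same as f (g x).
doubleThenIncrement :: Int -> Int
doubleThenIncrement = increment . double

-- The IO in the type marks the only place where a side effect happens;
-- everything above stays pure.
report :: Int -> IO ()
report n = putStrLn ("result: " ++ show (doubleThenIncrement n))
```

Only `report` carries `IO` in its type, so the compiler can tell the pure functions apart from the effectful one.
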
  6. Pure FPLs (continued)
     • Haskell
     • Clean
     • Go
     • F#
     • ML / OCaml
     • Lisp / Scheme
     • Scala
     • Clojure
     • XSLT
     • Erlang
     • SQL
     • Mathematica
  7. Vision & requirements of cloud computing
     • Clouds will need to support scalable programs.
     • “Any” application scaled through distribution over parallel (multicore) hardware.
     • Applications with high concurrency are good candidates for parallelization.
  8. Elasticity in Computing Clouds (now)
     • Duplication!
     • IaaS
       – Duplicate VMs including OS.
     • PaaS
       – Duplicate language app servers (e.g. JVM, Rails) or RTS and guest code.
       – Duplicate app execution engine (that is, a component of the PaaS platform).
     • (Virtualized) load balancers are the glue. “Clustering”
     • Concurrency is an enabler for parallelization.
     • MapReduce sold as a separate capability.
     • Multi-tenancy always supported.
  9. Elasticity in Computing Clouds (continued)

     Virtualization            | Hardware  | Operating systems           | Operating systems | Deployed application | Example
     (either 1:many or many:1) | platforms | instantiated                | administered      | instances            |
     --------------------------+-----------+-----------------------------+-------------------+----------------------+------------
     Hardware Virtualization   | 1         | 3                           | 3                 | 3                    | IBM z/VM
     Hypervisor IaaS           | 1         | 4 (including hypervisor OS) | 3                 | 3                    | KVM, VMWare
     OS Virtualization         | 1         | 1                           | (0 - )3           | 3                    | BSD Jails
     PaaS (current)            | 1         | 4                           | 0                 | 3                    | Appscale
     PaaS (ideal)              | 1         | 1                           | 0                 | 3                    | cwmwl
     Clustering                | 3         | 3                           | 3                 | 3                    |
  10. Elasticity in Computing Clouds (in the future?)
      Legacy/IaaS
      • Currently prevailing
      • Unit of scale = OS, VM, Runtime
      • Duplication of units
      Future/PaaS
      • Borders of building blocks are dissolved
      • Unit of scale = (Green)thread?
      • Requires new software, new programming languages, new designs.
  11. Haskell
      • Named after Haskell Brooks Curry (1900 - 1982). Combinatory logic (1930s).
      • Born as the Haskell 1.0 standard in 1990 (at approximately the same time as Erlang)
      • Haskell 98 is the most prominent definition yet
  12. Haskell (continued)
      • Is a pure functional PL
      • Has a static type system
      • Is lazy
      • Function composition and currying mimic mathematical functions
      • Has monads (related to category theory)
      • Is sometimes mind-boggling (or mind-blowing)
  13. What does Haskell bring to the table?
      • Inherently parallel
      • Immutable by default
      • Strong types
      • Lazy evaluation
      • Functional
      • Code maintainability
  14. Functions
      • Functions are data as well
      • Functions consist of way less code than objects
      • Higher-order functions
      • return is a function name
      • Function signatures declare constraints (types) and computational strategies.

          -- type signature, general form: fun_name :: input_type -> output_type
          adder :: [Int] -> Int
          adder []     = 0              -- define output for the empty list
          adder (x:xs) = x + adder xs   -- use some fancy recursion
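
The statements that functions are data, that functions can take other functions as arguments, and that `return` is just a function name can be sketched as follows. This is an illustration under assumed names (`applyTwice`, `pipeline`, `runPipeline`, `wrapped`), none of which appear in the slides.

```haskell
-- A higher-order function: it takes a function and applies it twice.
applyTwice :: (a -> a) -> a -> a
applyTwice f = f . f

-- Functions are ordinary values and can be stored in a list.
pipeline :: [Int -> Int]
pipeline = [(+ 1), (* 2), subtract 3]

-- Apply every function in the list, left to right, to a start value.
runPipeline :: [a -> a] -> a -> a
runPipeline fs x = foldl (\acc f -> f acc) x fs

-- `return` is an ordinary function that wraps a value in a monad;
-- it does not exit from anything. In the Maybe monad it builds a Just.
wrapped :: Maybe Int
wrapped = return 5
```
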
  15. Immutability of Data
      • The consequences are huge. There is more data than you think. For example a counter: c = c + 1;
      • The Haskell implementation of “counters” depends on what you need to achieve.
      • It is common to use map and fold (aka reduce)
      • Ultimately, counters represent some sort of state. Use the state monad: Control.Monad.State
      • Haskell is pure by default. Mutable data structures can be used (Data.IORef, Data.Judy) but are seen as “not idiomatic”.
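
Two of the ways to build a “counter” named above, a fold and `Data.IORef`, might look like the following sketch. The task (counting even numbers) and the names `countEvens` and `countEvensIO` are assumptions chosen for illustration; only modules from the base package are used.

```haskell
import Control.Monad (when)
import Data.IORef (modifyIORef', newIORef, readIORef)
import Data.List (foldl')

-- Pure counter: the running count is the accumulator of a strict left fold.
-- Nothing is ever overwritten; each step produces a new count.
countEvens :: [Int] -> Int
countEvens = foldl' step 0
  where
    step n x
      | even x    = n + 1
      | otherwise = n

-- Mutable counter: an IORef is updated in place, so the function lives in IO.
countEvensIO :: [Int] -> IO Int
countEvensIO xs = do
  ref <- newIORef 0
  mapM_ (\x -> when (even x) (modifyIORef' ref (+ 1))) xs
  readIORef ref
```

Both compute the same number; the difference shows up in the types, where the mutable variant is marked with `IO`.
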
  16. Immutability of Data (continued)
      • Data.IORef is part of the base package.
      • The function unsafePerformIO can “subvert” the type system and allows any kind of mutable state.
      • A large number of Haskell modules make use of it!
        – Randomness & Encryption, GUIs, …
      • Is immutability over-emphasized?
  17. Immutability of Data: There is no list
      Lists are built on top of cons cells. Cons cells contain pairs of values.
      Example: cons (:) and append (++) applied to a “list”.
          [1,2,3,4] = 1:2:3:4:[] = 1 : (2 : (3 : (4 : [])))
      cons (:)
          0:1:2:3:4:[] = 0 : (1 : (2 : (3 : (4 : []))))
      The result is a new cons cell holding 0 plus a pointer to the previous list. Runs in O(1) time. This is also called “sharing”.
      append (++)
          1:2:3:4:5:[] = 1 : (2 : (3 : (4 : (5 : []))))
      The whole data structure is taken apart recursively and rebuilt, so the result is an all-new data structure (the original list itself is left unchanged). Runs in O(n) time.
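
The cost difference between cons and append can be made visible with a hand-written append. This is a sketch; `original`, `consed`, `appended` and `append` are illustrative names, and `append` is meant to mirror the behaviour of the Prelude's `(++)`.

```haskell
original :: [Int]
original = [1, 2, 3, 4]

-- cons: one new cell in front; the tail is the existing list (sharing), O(1).
consed :: [Int]
consed = 0 : original

-- append with the built-in operator: the left operand is walked to its end, O(n).
appended :: [Int]
appended = original ++ [5]

-- A hand-written append: every cell of the left list is rebuilt,
-- only the right list is reused as-is.
append :: [a] -> [a] -> [a]
append []     ys = ys
append (x:xs) ys = x : append xs ys
```

In both cases `original` still evaluates to `[1, 2, 3, 4]` afterwards, because no existing cell is modified.
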
  18. Data.Map.Fold (Map Reduce)
      • Fold

          adder :: [Int] -> Int
          adder xs = foldr (+) 0 xs   -- reduce a list using (+)

      • Data structures can have a left and a right: foldl, foldr, foldM
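
How `foldr`, `foldl` and `foldM` differ is easiest to see with an operator that is not associative and with a monad that can abort. The following sketch uses illustrative names (`subR`, `subL`, `sumNonNegative`) that are not from the slides.

```haskell
import Control.Monad (foldM)

-- foldr nests to the right: 1 - (2 - (3 - 0))
subR :: [Int] -> Int
subR = foldr (-) 0

-- foldl nests to the left: ((0 - 1) - 2) - 3
subL :: [Int] -> Int
subL = foldl (-) 0

-- foldM folds inside a monad. In Maybe, the first Nothing aborts the fold.
sumNonNegative :: [Int] -> Maybe Int
sumNonNegative = foldM step 0
  where
    step acc x
      | x < 0     = Nothing
      | otherwise = Just (acc + x)
```

For `[1, 2, 3]`, `subR` gives 2 and `subL` gives -6, so the direction matters as soon as the operator is not associative.
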
  19. Strong Type System
      • All monomorphic types are part of the category of Haskell types, “Hask”. Maps between types are the functions in Haskell.
      • Data types can be tainted, e.g. IO Int, Maybe Int
      • The type system supports safety and correctness. Haskell code is reasonably easy to test.
      • At the beginning the type system frequently gets in your way.
      • Maintainability: I am often positively surprised how many changes to my existing code work at the first compilation (once I get the types right).
      • Defining your own types and type classes lays the foundation for great flexibility.
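
A “tainted” type such as Maybe Int and a user-defined type can be sketched as below. The functions `safeDiv`, `parseAndHalve`, `area` and the type `Shape` are assumptions made for illustration; `readMaybe` is the function from `Text.Read` in base.

```haskell
import Text.Read (readMaybe)

-- Maybe Int: the type itself says that the result may be missing.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- The taint propagates: a caller cannot forget the failure case,
-- because the result is again a Maybe Int.
parseAndHalve :: String -> Maybe Int
parseAndHalve s = do
  n <- readMaybe s
  safeDiv n 2

-- A user-defined type; the compiler checks every use of it.
data Shape = Circle Double | Rect Double Double

area :: Shape -> Double
area (Circle r) = pi * r * r
area (Rect w h) = w * h
```
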
  20. (Parametric) Polymorphism: Type Variables

          adder :: Num a => [a] -> a
          adder xs = foldr (+) 0 xs   -- reduce a list using (+)

      • More powerful types of polymorphism: type classes, kinds, …
      • With compiler extensions the type system is Turing complete & allows manipulations far exceeding those of most other PLs
      • Type classes & type-level programming
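
The polymorphic `adder` from the slide works for any numeric type, and a type class lets you add your own overloaded operations. In this sketch `intTotal`, `doubleTotal`, `Scalable`, `scaleBy` and `Metres` are hypothetical names introduced only for illustration.

```haskell
-- adder as on the slide: one definition for every type with a Num instance.
adder :: Num a => [a] -> a
adder xs = foldr (+) 0 xs

intTotal :: Int
intTotal = adder [1, 2, 3]

doubleTotal :: Double
doubleTotal = adder [0.5, 1.5]

-- A user-defined type class with a single operation.
class Scalable a where
  scaleBy :: Double -> a -> a

newtype Metres = Metres Double
  deriving (Eq, Show)

instance Scalable Metres where
  scaleBy k (Metres m) = Metres (k * m)
```
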
  21. Lazy Evaluation
      • Lazy evaluation, “call-by-need”.
      • Partially the paradigm that makes immutable data structures workable (see also “sharing”).
      • Risk of space leaks
      • Opens up a door to “infinity”: infinite lists [1, …], Fibonacci numbers, primes, e, … & to new strategies in AI (Hughes, 1990).
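
The “door to infinity” can be sketched with three infinite lists. The definitions below are common textbook formulations, not code from the slides; the trial-division `primes` is chosen for brevity and is far from efficient.

```haskell
-- All natural numbers starting at 1; only the demanded prefix is ever built.
naturals :: [Integer]
naturals = [1 ..]

-- Fibonacci numbers, defined in terms of the list itself.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (drop 1 fibs)

-- Primes by repeated filtering (simple, not fast).
primes :: [Integer]
primes = sieve [2 ..]
  where
    sieve []     = []
    sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
```

Because evaluation is call-by-need, an expression such as `take 10 fibs` terminates even though `fibs` has no end.
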
  22. Do Cloud Computing and FPLs match?
      • Immutable data. Shared nothing.
      • Message passing (e.g. actors) available to re-synchronize processes
      • STM is more manageable than locks.
      • FPLs are inherently parallel. Functions, closures, currying
      • Declarative
      • Compiler has freedom to re-arrange “everything”
      • Elasticity is left to the developer or to the “app engine”
      • Code easily testable & maintainable
      • No … “Safe Haskell” may be a good start.
  23. Multi Tenancy: Safe Haskell
      • Released to the public in early 2012.
      • Vision: tenants upload code (e.g. a worker) that gets compiled and executed as a plugin by a Haskell app engine.
      • Plugin concept based on the library System.Eval.Haskell
      • New language extensions to allow secure code only: -XSafe, -XTrustworthy, -XUnsafe
      • Ultimately based on type safety.
  24. Safe Haskell (continued)
      • Two routes decide what is to be trusted:
      • -XSafe = trust inferred by the compiler, limiting Haskell to a (small) subset.
      • In PaaS, subsets and restrictions are “normal”. Think Java on the Google App Engine.
      • -XTrustworthy = trust decided by a person. Not a powerful security concept?
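
What a tenant-supplied module under -XSafe might look like is sketched below. The module content and the name `worker` are hypothetical; only the `Safe` pragma itself is part of GHC's Safe Haskell.

```haskell
{-# LANGUAGE Safe #-}
-- With the Safe pragma, GHC only accepts imports of modules that are
-- themselves safe or marked trustworthy; an import of System.IO.Unsafe
-- (home of unsafePerformIO) would be rejected at compile time.
import Data.Char (toUpper)

-- Hypothetical worker uploaded by a tenant: a pure function, so its
-- type already guarantees that it performs no IO.
worker :: String -> String
worker = map toUpper
```

A hosting app engine would still have to decide which types of workers it accepts; the pragma only restricts what the code can import and use.
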
  25. Issues
      • There is no obvious way to match functions to threads.
      • Threads are more related to sequential programming (with shared memory) than to FP. Think CSP.
      • Many programs have to parallelize relatively small computations with high inter-dependency.
      • Message passing & actors are also a poor fit for distributing small computations.
      • Function composition is … sequential execution!
  26. Issues (continued)
      • When a computation is moved to a remote node, little is known about the cost of transport and the state (e.g. load) of the remote node. Multi-tenancy!
      • Cost model required.
      • (Network) protocols are the most prominent cost center.
      • It is extremely unlikely that commercial clouds will use “niche” hardware or proprietary protocols.
      • Protocol design will need to be simple and lightweight.
      • Protocols in distributed environments will orchestrate and coordinate. Basis for a DS coordination language?
  27. Amdahl’s Law
      • Possible to calculate the speed improvement when n% of the code is parallel.
      • Unknown under what conditions the law holds.
      • Relatively small influences have huge adverse effects:
        – code that has a parallel portion of 95% results in a speed improvement of factor six on an 8 core CPU.
        – code that has a parallel portion of 75% results in a speed improvement of factor three on an 8 core CPU.
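
The two figures above follow from the usual statement of Amdahl's law, speedup = 1 / ((1 - p) + p / n) for a parallel fraction p on n cores. The function name `amdahl` in this sketch is illustrative.

```haskell
-- Amdahl's law: the serial fraction (1 - p) runs at full cost,
-- the parallel fraction p is divided across n cores.
amdahl :: Double -> Int -> Double
amdahl p n = 1 / ((1 - p) + p / fromIntegral n)
```

`amdahl 0.95 8` evaluates to roughly 5.93 and `amdahl 0.75 8` to roughly 2.91, which are the “factor six” and “factor three” quoted on the slide.
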
  28. Amdahl’s Law (continued) [figure not reproduced]
  29. If Amdahl’s law holds, then …
      We had better go on & develop sequential code, because inefficiencies and overhead add up:
      • Compiler
      • Runtime
      • Competition for the CPU cores & resources
      • By the way: (OS) threads are costly to provision; here elastic may become plastic!
  30. Thank You.
  31. Spare Slides
  32. Mythical Walk-Through
      Quantum Field Theory
      Jones Polynomial
      Knot Theory
      Category of Tangles
      Category Theory
      “Hask”, Category of Haskell Types (and maps)
      Haskell
  33. OOP
      • “OOP is eliminated entirely from the introductory curriculum [of Carnegie Mellon University], because it is both anti-modular and anti-parallel by its very nature, and hence unsuitable for a modern CS curriculum.” (Harper, 2011)
  34. Common Claims & Expectations
      • FPLs may let us get away with less duplication.
      • FPLs are inherently parallel
      • FPLs are inherently thread safe
      • FPLs are inherently modular
      • FPLs are easily testable and maintainable.