The Promise of Functional Programming by Konrad Hinsen


Published on

ogrammimg style could make your programs more robust,more compact,and more easily parallelizable.However,mastering it requires some effort.

Published in: Technology, Education
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

The Promise of Functional Programming by Konrad Hinsen

  1. 1. S cie n t i f ic Programming Editors: Konstantin Läufer, Konrad Hinsen, hinsen@cnrs-orleans.frThe Promisesof Functional ProgrammingBy Konrad HinsenAdopting a functional programming style could make your programs more robust, more compact, and moreeasily parallelizable. However, mastering it requires some effort.S ince the early days of com- puting, software development techniques have changedalmost as much as computer tech-nology itself. Ever more powerful (Alonzo Church’s λ-calculus) and as a programming technique in the 1950s (John McCarthy’s Lisp language)— functional programming. Although functional programming has been probably Mathematica, although most Mathematica users don’t write com- plex functional programs. Another example of symbolic processing is the widely used fast Fourier transform li-hardware made it possible to write very popular in computer science brary, FFTW,1 which uses a function-ever more complex software, which research for several decades, its use al code optimizer written in OCamlboth required ever better develop- for writing real-life programs is rela- to produce optimized C code for ament tools and techniques and made tively recent. Of the many reasons Fourier transform of a given length.their implementation possible. Pro- for this, the two most important are In numerical applications, functionalgrammers thus moved from machine that functional programming is very programming doesn’t yet play an im-code to assembly languages and then different from traditional program- portant role. Perhaps the most ambi-problem-oriented programming lan- ming (also referred to as imperative tious project to introduce it into theguages, which have evolved to inte- programming) and thus requires a lot number-crunching world was Law-grate techniques such as structural of learning and unlearning, and that rence Livermore National Laborato-and object-oriented programming. computer hardware implements the ry’s Streams and Iteration in a SingleAnother evolution went from mono- imperative programming model, so Assignment Language (SISAL) proj-lithic programs via separately compil- imperative programs are easier to ect, which started in 1983. SISAL is aable modules and libraries to software compile into efficient machine code functional language with paralleliza-component technologies. However, in than functional programs. However, tion support, designed specifically forone respect, today’s popular program- several clear signs indicate a growing the needs of scientific applications.ming techniques are still the same as interest in functional programming Unfortunately, it didn’t attract thethose the pioneers used: our programs techniques—recent programming attention it deserved, and fundingconsist of statements that modify data languages (such as Sun’s Fortress or stopped in 1996. Today, SISAL livesstored in the computer’s memory until Microsoft’s F#) have the explicit goal on as an open source project (http://that memory contains the desired re- of supporting it. The reason is that This approach closely resembles functional programming has several This article’s purpose is to explainhow a computer works at the hardware advantages for concurrent and par- what functional programming is andlevel: the processor fetches data from allel programming. Experience also how it differs from traditional impera-memory, performs elementary opera- suggests that functional programs are tive programming. I also explain howtions on it, and writes the result back more robust and easier to test than functional programming helps withto a memory cell. imperative ones. concurrent and parallel program- In fact, this approach is so com- In scientific computing, research- ming. The language I use in the ex-mon that most of you have probably ers have mainly used functional pro- amples is Clojure, a modern dialectnever questioned it. And yet, an al- gramming for symbolic processing. In of Lisp (see the sidebar “Clojure andternative approach was developed fact, the most widely used functional the Lisp Family”), but everything saidas mathematical theory in the 1930s programming language in science is here applies equally to other function-86 Copublished by the IEEE CS and the AIP 1521-9615/09/$25.00 © 2009 IEEE Computing in Science Engineering
  2. 2. Clojure and the Lisp Family T he language used in the main text’s examples is Clojure, a modern dialect of the Lisp language family. Lisp stands for “list processing,” which hints at the motivations of John McCarthy, who developed the first Lisp language in the late 1950s for use in artificial intelligence research. Today, the most widelyal languages, only the syntax will be used Lisp dialects are Scheme and Common Lisp.different. If you’re interested in func- The Lisp language family’s distinctive feature is the principle that “code istional programming, you should also data.” Lisp provides a syntax for a simple yet flexible data structure—nested (or reread) Jerzy Karczmarczuk’s It then defines how to interpret nested lists that follow specific conventions as1999 article in this magazine,2 which executable code. The advantage of expressing code in a data structure is thatillustrates how functional program- Lisp code can easily generate (and then execute) other Lisp code, a fact that pro-ming can yield elegant solutions to grammers have exploited from the beginning through Lisp macros, the world’sproblems in mathematical modeling. first meta-programming system and probably still its most powerful one. Lisp has the reputation of being slow, but that isn’t true anymore. ManyThe Basics modern compilers can produce code that can compete with C in performanceThe fundamental principle of func- when given appropriate optimization hints. However, it remains very difficulttional programming is that you re- to produce programs that are both efficient and portable between compilersalize a computation by composing and platforms.functions. The word “function” is Clojure is a recent Lisp dialect that sets itself apart by three features: it hasused here in the mathematical sense— four highly optimized data structures (lists, vectors, maps, and sets) designeda mapping from input values to output for pure functional programming, it offers extensive support for concurrency,values—whereas what most program- and it was designed for the Java Virtual Machine with the goal of easy inter­ming languages call “functions” are operability with other JVM languages, including Java itself. For more informa-subroutines that return a value. One tion about Clojure, see its Web site at difference is that a func-tion in the mathematical sense alwaysproduces the same output when giventhe same input. An operation such as write a useful program in a functional guage like Python and JavaScript). If n“get the next line from a file” isn’t a style, but keep reading. is zero, the return value of countdownfunction because each time you call, What replaces loops in functional is (list 0), a list containing the sin-it produces a different return value. programs is recursion. A function gle element 0. Otherwise (we’re look-Another important difference is that is called recursive if it calls itself—­ ing at lines four to six now), the returna mathematical function doesn’t “do” directly or indirectly. Of course, value is a list constructed by prepend-anything other than return a value. It calling itself makes sense only if the ing n to the return value of count-isn’t supposed to have side effects—for arguments change between calls. down for (- n 1), which is Clojure’sexample, it shouldn’t write anything to Moreover, if you ever want to get out way of writing n-1. The function calla file or change a variable in memory. of the call-yourself chain, a recursive (countdown 1) thus returns (cons If a program is composed of func- function must return without calling 1 (countdown 0)), which, after ex-tions, and functions aren’t supposed itself for some arguments. Let’s look ecuting the recursive function call,to change any variables, then what are at a simple example of recursion: becomes (cons 1 (list 0)). Look-variables good for? Nothing, and that’s ing up the definitions of the functionswhy functional programming doesn’t (defn countdown [n] cons and list will tell you that thehave variables. This is probably the (if (zero? n) final result is the two-element list (1biggest surprise to those who discover (list 0) 0). This simple example illustratesfunctional programming because vari- (cons n how to use recursion for looping: withables are so very fundamental to our (countdown each recursive call, the argument be-traditional ways of writing programs. (- n 1))))) comes smaller, up to the point whereThe other missing fundamental con- it’s handled directly without any fur-cept is loops. After all, what’s the point These six lines define a function ther recursion.of a loop if nothing can change be- countdown of a single argument n, A closer look at the chain of recur-tween iterations because there are no which should be an integer (although sive calls to countdown also revealsvariables? By now, you might be con- this is neither said nor enforced, Clo- why functional programming can livevinced that it’s absolutely impossible to jure being a dynamicallly typed lan- without variables. At each recursiveJuly/August 2009 87
  3. 3. S cie n t i f ic P r o g r a m m i n gcall, n decreases by one, which looks a to the programmer. As is so often the are called higher-order functions, as op-bit as if n were a variable decremented case, both approaches have their good posed to first-order functions whosefrom its initial value to zero; indeed, and bad sides. arguments and return values are alla compiler could transform the recur- standard data items. In mathematics,sive function into a subroutine with a Functional Abstractions you would use the term operator, anloop over n for efficiency reasons. The Abstractions are fundamental to writ- example being the derivative operatorcrucial difference is that n isn’t a refer- ing nontrivial programs. They permit that maps a function to its derivative,ence to a piece of memory that could expressing an algorithm in terms of which is itself a modified at will, intentionally or concepts that are useful in its context, As a simple example, let’s considerby mistake. In fact, as anyone with rather than in terms of operations that the following operations: calculatingdebugging experience has learned the the computer or the programming the sum of a list of numbers, calculat-hard way, the problem with variables language already happens to provide. ing the product of a list of numbers,is often that looking at a variable’s A programmer would write a least- and finding the set of all items that oc-value doesn’t tell you where that value squares fit problem, for example, in cur in a list of values. What these (andcame from. In contrast, you can al- terms of linear-algebra operations— many more) operations have in com-ways trace the call chain that leads to such as matrix multiplication and mon is a simple algorithmic pattern:n having a specific value at a specific solving linear systems of equations— you start with an initially “empty” ac-point in the function countdown back that work on an array data structure. cumulator value (0, 1, the empty set)to its beginning. The function call Compilers and libraries (written by and then iterate over a list, combiningchain is a complete description of the you or by someone else) then trans- at each step the current accumula-data flow through the program, and form the algorithm into something tor value with one list element. Theit’s a very useful feature for verifying a that a computer can handle. combination operations for the threeprogram’s correctness. The abstractions provided by popu- examples given are addition, multi- I hope I’ve convinced you that vari- lar programming languages for sci- plication, and adding an item to a set.ables and loops aren’t as essential as entific computing include basic data This algorithmic pattern is known asyou might have thought. But what structures (integer, real and complex reduction; it’s implemented in Clojureabout other side effects? Is it really numbers, arrays, and structures) via the function reduce. We can thuspractical to work with programs that and a notation for numerical expres- write our three examples ascan’t write data to a file? Or, in fact, sions that’s similar to mathemati-produce any output? Of course, the cal notation. Programmer-provided (defn sum [numbers]answer is no: the pure functional pro- abstractions are mainly subroutines. (reduce + 0 numbers))grams I’ve described to this point will Object-oriented languages add power- (defn product [numbers]just heat up your computer. You need ful constructs for data abstraction: the (reduce * 1 numbers))side effects if you want your program programmer can add problem-specific (defn set-of-items [items]to have any interaction with the real data types to the general ones provid- (reduce conj #{} items))world. But you can—and should— ed by compiler and runtime systems.limit side effects to very few places in In functional programming, al- The first argument to reduce is thea program. gorithmic abstractions are the most combination operation, which is a We can categorize functional pro- prominent. A developer identifies and function of two arguments. In Clojure,gramming languages by their attitude implements patterns that occur re- we can simply use + and * for additionto unavoidable side effects. Pure lan- peatedly in algorithms in the form of and multiplication because they’reguages (such as Haskell) allow them functions, which is made possible by plain functions—there’s no specialonly inside special language con- the fact that functional programming notation for mathematical operators.structs, permitting the compiler to languages consider functions to be In the last example, conj is a functionverify the absence of accidental side data. It’s possible, and even very com- that adds an item to a collection.effects everywhere else. Impure lan- mon, to write functions that take oth- It’s interesting to see how we couldguages (the majority) leave the respon- er functions as parameters, returning implement reduce if it weren’t provid-sibility for the use of side effects fully yet another function. Such functions ed already. Here’s one way to write it:88 Computing in Science Engineering
  4. 4. Other Functional Languages is possible only in a pure functional setting because side effects would otherwise occur at completely unpredictable W e can group the most popular languages that sup- port functional programming into just a few families. The oldest one, the Lisp family, is described in the “Clojure times. Lazy evaluation makes it possible to work with infi- nite data structures, avoid unnecessary computations, and define control structures as simple functions. However, lazy and the Lisp Family” sidebar. evaluation also makes a program’s CPU time and memory The second group is the ML family, which first appeared usage profile more difficult to understand and generally in the 1970s. Today, its most prominent members are Stan- leads to slower programs overall because of unavoidable dard ML and OCaml, but Microsoft’s recently published F# bookkeeping overhead. language might soon catch up with them. The ML languag- Two recent languages, Scala (for the Java Virtual Ma- es differ from the Lisp family in two respects: they’re stati- chine) and Nemerle (for the .NET platform), are hybrid cally typed, using the Hindley-Milner inference algorithm to languages that add functional programming features to permit the compiler to deduce the types of most functions otherwise quite standard object-oriented languages. Sun’s from the way they’re used, and they propose a pattern- new Fortress language, whose main intended application matching syntax for defining functions that makes many domain is high-performance computing, also proposes a definitions look similar to common mathematical notation. mixture of functional and object-oriented features. The Haskell language is the result of a collective effort to Moving on to special-purpose languages, Wolfram’s define a common functional language for use in program- computer algebra system Mathematica is based on a ming language research. It’s similar in many respects to the proprietary functional programming language. Other ML family, the most important difference being lazy evalu- computer algebra systems also propose functional features, ation: a data item (such as a list element) is evaluated only although not always to the point of allowing a full func- when its value is actually required in a computation. This tional programming style.(defn my-reduce make-adder that takes a numerical eral execution threads operating on [op initial items] argument x and returns a function of the same data) and parallelism (the (if (empty? items) another numerical argument y that division of a computational task into initial adds x and y: multiple communicating processes (my-reduce running in parallel) are two aspects op (defn make-adder [x] of computing rapidly gaining in im- (op initial (fn [y] (+ x y))) portance. The main reason is that (first items)) single-processor performance is no (rest items)))) Calling (make-adder 2) returns a longer improving at the pace it used function that adds 2 to its argument. to; instead, computers are becom-This is again a recursive function. If We can use this function just like any ing more powerful by integratingits input list items is empty, it just re- other one, such as more computational cores. Exploitingturns the initial value. This is the re- such machines requires concurrency,cursion’s exit; otherwise, it applies the ((make-adder 2) 3) parallelism, or both. Unfortunately,function op to the initial value and the today’s mainstream techniques forfirst element of the list ((op ini- which yields 5. Note that the result concurrent and parallel programmingtial (first items))) and feeds of (make-adder 2) is a function are difficult to learn and quite error-the result to a recursive call on what’s that stores an integer value (2) inter- prone in practice.left of the list after removing the first nally that was passed as an argument The big issue with concurrency iselement ((rest items)). Note that to make-adder. A function that cap- the difficulty of maintaining the dataClojure doesn’t treat the argument op tures a value in this way is called a clo- in a coherent state: it must be impos-in any special way merely because it’s sure; it’s a widely used programming sible for one execution thread to mod-a function. Functions are perfectly technique in functional programs. As ify data that another one is accessingordinary data items, just like integers you might have guessed, this is where (reading or writing) at the same time.and text strings. the Clojure language derived its name, So, if several threads need to modify a Functions can also create and re- with the “j” hinting at Java. data item, they must do so in a coordi-turn other functions, as illustrated in nated way. This is currently achievedthe following (somewhat contrived) Concurrency and Parallelism via locks, but they’re difficult to use,example, which defines a function Concurrency (the existence of sev- and their incorrect use can go un-July/August 2009 89
  5. 5. S cie n t i f ic P r o g r a m m i n gnoticed for a long time before errors Functional programming is fre- while guaranteeing that the program’sshow up. As for parallelism, the main quently cited as a promising tech- result won’t change.difficulties are identifying independent nique in this context. Pure functionalcomputations inside a program and code has no variables and thus no datacoordinating them with the requiredcommunication operations such thatthe resulting program always produces coherence issues or need for locking. Moreover, all data dependencies are explicit, making it possible to apply a I f this makes you hope that auto- matic parallelization will be your welcome gift once you succeed inthe correct result and does so efficient- large number of program transforma- entering the world of functionally for typical input parameters. tions (with the goal of parallelization) programming, you’re in for disap- pointment. Although it’s true that compilers for functional languages could in principle transform a se- rial into an equivalent parallel pro- gram automatically, there remains the problem of finding such a trans- The American Institute of Physics is a not-for-profit membership corporation chartered in New York State in 1931 for the purpose of promoting the advancement and diffusion formation that actually yields an ef- of the knowledge of physics and its application to human welfare. Leading societies in ficient program for a given parallel the fields of physics, astronomy, and related sciences are its members. computer and given input data. Com- In order to achieve its purpose, AIP serves physics and related fields of science and piler technology isn’t yet up to this technology by serving its member societies, individual scientists, educators, students, task, although this could well change RD leaders, and the general public with programs, services, and publications— in the future, in particular with par- information that matters. The Institute publishes its own scientific journals as well as allelizing just-in-time compilers that those of its member societies; provides abstracting and indexing services; provides have access to a program’s execution online database services; disseminates reliable information on physics to the public; collects and analyzes statistics on the profession and on physics education; encourages time profile. For the near future, it’s and assists in the documentation and study of the history and philosophy of physics; reasonable to expect compilers that cooperates with other organizations on educational projects at all levels; and collects create parallel programs semiauto- and analyzes information on federal programs and budgets. matically based on programmer-pro- The scientists represented by the Institute through its member societies number more vided performance hints. This would than 134 000. In addition, approximately 6000 students in more than 700 colleges and already be a significant step forward universities are members of the Institute’s Society of Physics Students, which includes compared to today’s parallel pro- the honor society Sigma Pi Sigma. Industry is represented through the membership of 37 gramming techniques. Corporate Associates. Governing Board: Louis J. Lanzerotti (chair)*, Lila M. Adair, David E. Aspnes, Anthony References Atchley*, Arthur Bienenstock, Charles W. Carter Jr*, Timothy A. Cohn*, Bruce H. Curran*, 1. M. Frigo and S.G. Johnson, “The Design and Morton M. Denn*, Alexander Dickison, Michael D. Duncan, H. Frederick Dylla (ex Implementation of FFTW3,” Proc. IEEE, vol. officio)*, Janet Fender, Judith Flippen-Anderson, Judy R. Franz*, Brian J. Fraser, Jaime 93, no. 2, 2005, pp. 216–231. Fucugauchi, John A. Graham, Timothy Grove, Mark Hamilton, Warren W. Hein*, William 2. J. Karczmarczuk, “Scientific Computation and Hendee, James Hollenhorst, Judy C. Holoviak, Leo Kadanoff, Angela R. Keyser, Timothy Functional Programming,” Computing in Sci- L. Killeen, Harvey Leff, Rudolf Ludeke*, Kevin B. Marvel*, Patricia Mooney, Cherry ence Eng., vol. 1, no. 3, 1999, pp. 64–72. Murray, Elizabeth A. Rogan*, Bahaa E. A. Saleh, Charles E. Schmid, Joseph Serene, Benjamin B. Snavely (ex officio)*, A. F. Spilhaus Jr, Gene Sprouse, Hervey (Peter) Stockman, Quinton L. Williams. *Members of the Executive Committee. Konrad Hinsen is a researcher at the Centre de Biophysique Moléculaire in Orléans and at Management Committee: H. Frederick Dylla, Executive Director and CEO; Richard the Synchrotron Soleil in Saint Aubin, France. Baccante, Treasurer and CFO; Theresa C. Braun, Vice President, Human Resources; Catherine O’Riordan, Vice President, Physics Resources; John S. Haynes, Vice His research interests include protein struc- President, Publishing; Benjamin B. Snavely, Secretary. ture and dynamics and scientific computing. Hinsen has a PhD in theoretical physics from w w w. RWTH Aachen University, Germany. Contact him at Computing in Science Engineering
  6. 6. This article was featured in For access to more content from the IEEE Computer Society, see articles, podcasts, and more.