We do research and development on anti-money laundering, and some of the largest banks in the world use products we build every day.
How many of you loved Legos when you were kids? Isn't this what we really want programming to be like? A big pile of little parts: it's obvious how they work, but we don't want to have to make them all from scratch. I loved Legos as a child. They're simple and intuitive, but you can compose them into amazing things. If you had to make the Legos yourself from smaller pieces, building would be tedious; but big blocks like Duplos severely constrain your imagination. Legos sit at just the right level of abstraction. Learning to program in BASIC was more of a challenge, though; it ended up being more like weaving and less like building with Legos. Tragically, I didn't discover functional programming until I was almost 30.
At one time I was an imperative guy who wrote image processing code in C++ and C#. Those were hard times, full of null exceptions and race conditions. It could take a sizable team several months to make a product releasable. Then I went to a talk by Rich Hickey and he showed me just how awesome FP can be. After I learned functional programming in F#, my productivity skyrocketed: bugs disappeared, and I was able to make much cooler stuff in much less time. I even had more time, because the old stuff required less maintenance. It more than quadrupled my productivity. Now I spend all that newly found free time giving talks and arguing with object-oriented programmers on the internet.
Here's where it gets tricky: just what is functional programming? Well, it depends on who you ask. Python users will tell you it's something that comes in a module, while Haskell programmers will tell you that just about everyone else is faking it. Really, it's more of a spectrum: as you get more and more functional you gain more and more benefits, but you also have to give up some things along the way.
Now, I could spend days telling you all about functional programming, but there's one idea in FP that I would say is the most important: referential transparency. All referential transparency means is that, from here, I can understand what all the stuff in scope does. It's all deterministic: for a given input, you'll always get the same output (barring something like hardware failure). The most interesting thing about referential transparency is that it doesn't need to hold for your entire program to hold for most of it. You can write that algorithm you know is fast in an imperative style, and if you wrap it intelligently it's just as useful from the outside as if it were written with pure functional programming. But you do lose some confidence about the properties of that function.
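To make that concrete, here's a minimal sketch, in plain Python, of wrapping an imperative algorithm so that, from the outside, it's referentially transparent. The function and names are my own illustration, not anything from the talk:

```python
def sort_desc(xs):
    """Referentially transparent from the caller's point of view:
    same input list always yields the same output list."""
    buf = list(xs)  # defensive copy: never mutate the caller's data
    # An imperative, in-place selection sort hides inside the wrapper.
    for i in range(len(buf)):
        best = i
        for j in range(i + 1, len(buf)):
            if buf[j] > buf[best]:
                best = j
        buf[i], buf[best] = buf[best], buf[i]
    return buf  # the caller only ever sees a fresh value

data = [3, 1, 4, 1, 5]
assert sort_desc(data) == [5, 4, 3, 1, 1]
assert data == [3, 1, 4, 1, 5]  # the original is untouched
```

All the mutation is confined to `buf`, which no one outside the function can see, so callers can reason about `sort_desc` as if it were pure.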
As a fuzzy definition, functional programming is a complementary set of convenience features and constraints. For example, if you know how all the code underneath you works, it's much easier to ensure safety at a higher level. Constraints make it much easier to think about what your program is doing.
We're all being dragged along, some faster than others depending on the kind of work we do. In fact, some older programming languages like Python and OCaml were effectively crippled by decisions made years ago, when it looked like we could scale on one CPU forever. In both languages the root of this problem is a giant lock around the runtime (Python's is the Global Interpreter Lock). I hope that as we move to even larger scopes those locks become somewhat irrelevant, as the cost of having many processes becomes dwarfed by other factors.
Source: http://www.indybay.org/newsitems/2006/05/18/18240941.php
This is the mandatory Moore's Law slide before talking about cloud computing. All aspects of computing will eventually end up looking like the graph on the right. It's strange that computation flattened out well before storage and memory, but that's just how it ended up. Now we're left having to find interesting ways to deal with it.
(Raise hands if you've seen this slide before.) However, something like Moore's Law still lives, for now. As you've probably heard, we're still gaining ground on the power-efficiency front. We can scale out instead of up.
You really can't escape it: tablets are just the beginning, and desktop computers as we know them are on the way out for most people. The unfortunate part of this whole deal is that we can't apply most of our existing work directly in any of the scaled-out models. So we're stuck in a world that boldly marches on, dragging us kicking and screaming into a much harder way of doing things. There are a lot of new things to consider in this brave new world beyond clock cycles.
This is a complete conceptual hazard. It's really hard to keep all of this in your head at once and still come up with solutions to interesting problems. Each of these things has different sub-properties to consider in different situations as well. For example, sometimes just network bandwidth matters, sometimes latency, sometimes both. To get things done in the global network, we're going to need a way to reason about these things without having to keep them all in our heads at once.
At first, I was convinced that cloud hosting in general was pretty much a big scam, but as the data has grown bigger I've seen the error of my ways. Often you don't need these computers for very long, and you avoid maintaining the cluster yourself. For many problems you can get near-linear scaling, so it's pretty awesome to be able to fire up a ton of instances; in general this costs about the same as using fewer computers for longer. These platforms provide great frameworks and tools for thinking about these kinds of problems, reducing the need to worry so much about resources. They allow you to think in more general terms (like big-O notation). With each methodology you take on some communication constraints in order to make problems easier to think about. While type systems are like a constraining floor that you can't fall through, cloud computing methodologies are more like a ceiling that dictates how the parts of your program are combined.
In the past three years the number of papers published on algorithms in the cloud has skyrocketed. Rich Hickey is working on Datomic; Simon Peyton Jones is working on parallel Haskell. Why? Because it's hugely useful for solving hard problems, and computers just aren't getting much faster. And there's the fact that the amount of data lying around is skyrocketing as our storage capabilities continue to increase. What are we going to do with all that data?
MapReduce – you can only do two things, and in this order. MPI/agents – more about how you communicate. And there are generalizations of MapReduce. So what do we do? We force you into one of several choices of computing methodology, each with different constraints. With multi-paradigm problems you can often fake it for smaller data sets, but as they grow it becomes more and more important to be flexible.
- Finally, unlike functional programming, there is no magic escape hatch. Calling a C library is no longer the answer to all of life's performance problems.
Whirlwind tour! When you first get the cloud computing bug it can all be a bit overwhelming. There are just a ton of frameworks, each sporting very nice benchmarks for the things they choose to benchmark on. Some are mature and others are small projects. They all have limitations and cohorts of enthusiastic followers. You may be familiar with some of these, but we're just going to focus on a few.
MPI – centrally controlled
Agents – can launch each other
MapReduce – very constraining, but people always break the rules
Iterative MapReduce – academic/unpolished
Spark – what I see as the future of cloud computing; constraining, but not so much that you must constantly break the rules
This research was done with 14 off-the-shelf computers put together by college students in Hungary. It's one of the first examples of entity resolution that can handle data at the scale that exists right now in the real world. In my business, large-scale entity resolution with any kind of guarantees seemed like a pipe dream; this paper changed my whole perspective. Sure, we're measuring time in hours, but if your task already takes hours on one computer, what does it matter?
- Word Count in MapReduce with Java, pretty much the "Hello World" of the MapReduce paradigm. This hurts to even look at. Just when I thought I had escaped the tedium of object-oriented programming, here I was trying to use a paradigm that fits functional programming like a glove, and yet I was reduced to writing pages of code to do a simple word count. To make matters even worse, I had lost my beautiful Visual Studio tooling. The friction was just unbelievable. The thought of how many lines of code a real entity resolution system would take made me a bit queasy, to say the least.
- Note the iteration over elements. Is this really necessary when we can have higher-level abstractions?!
- Map -> Choose (open map) – one to many
- Partition -> Sort and group by key
- Reduce -> Constrained reduce – many to many (or fewer)
Now, don't get me wrong. There are reasons why many of the largest software companies (including IBM and Microsoft) are embracing Hadoop. It's big, it schedules well, it's pretty darn fast and, most importantly, it's mature.
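A toy, single-process sketch of those three phases, in plain Python with illustrative names of my own (this is the shape of the paradigm, not Hadoop's actual API):

```python
from itertools import groupby

def map_phase(line):
    # Map: one input line -> many (key, value) pairs
    return [(word, 1) for word in line.split()]

def partition_phase(pairs):
    # Partition: sort and group by key, like the shuffle between map and reduce
    ordered = sorted(pairs, key=lambda kv: kv[0])
    return {key: [v for _, v in grp]
            for key, grp in groupby(ordered, key=lambda kv: kv[0])}

def reduce_phase(key, values):
    # Reduce: many values per key -> fewer (here, one count per word)
    return key, sum(values)

lines = ["the quick fox", "the lazy dog"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, vs) for k, vs in partition_phase(mapped).items())
assert counts == {"the": 2, "quick": 1, "fox": 1, "lazy": 1, "dog": 1}
```

On a real cluster each phase runs distributed across machines, and the partition step is where data crosses the network; the constraint is that your whole job must be expressed as these stages.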
Very simple and clean: you say what you want, not how you want it done.
Very similar to Scoobi, although based on Cascading. For some reason the Scalding folks love to use a lot of type annotations.
Both of these programs are doing almost exactly what that Java code before was doing, except they compose little functional subprograms instead of trying to do it all by hand. Actually, the Scalding version is doing a bit more work in this code, as it lowercases and removes punctuation. Otherwise they're quite similar and fairly beautiful to look at. Against the pages of Java it took to do word count before, this is a godsend. I do think the Scoobi looks a bit nicer because it's a bit less verbose, but that's a matter of style and comfort with the language and less about the frameworks.
There are just a ton of Hadoop toolkits, but let's focus on the blue and green ones. Here we run into a bit of a problem though. On one hand we have Scalding, a big project by the folks at Twitter; but seeing as it's built on top of Cascading, it seems like a poor choice for a small company with limited resources. On the other we have Scoobi, made by NICTA, which is a small institution with a bent toward scientific computing. Pangool is a slightly less horrible API for Java; Scrunch is a Scala layer for Crunch.
But MapReduce is just one of many choices of cloud computing paradigm. When you go to solve a difficult problem with the cloud, your choice should depend on a ton of factors. Can you accomplish what you're looking to do? What technologies are you comfortable with? Are you comfortable using research software? Most importantly: can you get this done without talking to anyone from IT?
For a smaller company with limited resources like mine, Mesos is quite significant. It allows you to build one cluster and perform many different styles of computation, all sharing the same scheduler. Notice that Spark reads a lot like Scoobi, but without all of the ceremony. It also loosens the straps of the standard MapReduce straitjacket a bit by allowing you to keep things in memory between iterations.
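A sketch of why that in-memory loosening matters, in plain Python standing in for the cluster (none of this is Spark's real API; `load_dataset` is a made-up stand-in for reading input off the distributed file system):

```python
loads = 0

def load_dataset():
    # Stand-in for reading the input from the distributed file system
    global loads
    loads += 1
    return list(range(100))

# MapReduce-style iteration: every pass starts from disk again
for _ in range(3):
    data = load_dataset()
    total = sum(data)
assert loads == 3

# Spark-style iteration: load once, keep the working set in memory
loads = 0
cached = load_dataset()
for _ in range(3):
    total = sum(cached)
assert loads == 1 and total == 4950
```

For iterative algorithms (clustering, page rank, entity resolution passes) that re-read cost dominates, which is exactly where the straitjacket chafes most.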
For a lot of difficult problems Spark is hugely better, but Hadoop has better tooling and a lot of people using it. With Mesos you get the best of both worlds. It’s a recent discovery for me, but I’m already a huge fan.
Just to come back to this for a minute: look at this code and imagine it was your future. The thought of this for myself almost brought me to tears. While the giants of tech like Microsoft and IBM have been asleep at the wheel, functional programmers have been busy solving the hard problems. There's absolutely no reason to go back to this kind of nightmare; you'd have to be certifiably insane.
Now imagine this: you split the lines into words, pair each word with a number, group them up, and then add the numbers. It reads almost like English.
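That reading translates almost line for line into code. A rough sketch in plain Python (the shape of the pipeline, not any particular framework's API):

```python
from functools import reduce

lines = ["to be or not to be", "that is the question"]

# split the lines into words
words = [word for line in lines for word in line.split()]
# pair each word with a one
pairs = [(word, 1) for word in words]

# group the pairs up and add the numbers
def combine(acc, pair):
    word, n = pair
    acc[word] = acc.get(word, 0) + n
    return acc

counts = reduce(combine, pairs, {})
assert counts["to"] == 2 and counts["be"] == 2 and counts["question"] == 1
```

Each step is a small, referentially transparent piece, which is exactly what lets a framework distribute them across machines without you thinking about it.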
So, just as predicted years ago, functional programming has come to dominate at least one aspect of modern computing, and not just through its ideas. Even using Java, you are forced to write small programs which are referentially transparent and operate in parallel. But with functional programming you can have small, composable, referentially transparent parts with hidden implementations you don't have to care about.
We'll have some of the Cloud Numerics guys there giving a tutorial on how to do linear algebra in the cloud.
Functional Ideas for a Cloudy Future
Functional Ideas for a Cloudy Future
Richard Minerich @Rickasaurus
Senior Researcher at Bayard Rock
Properties of FP? (It Depends on Who You Ask)
- First Class Functions
- Currying, Composition, Combinators
- Low Level Abstraction, Metaprogramming
- Immutability, Fancy Types, Constraints
- Fast Tail Recursion, Scope Minimization
The Spectrum of Functional
Convenience <-> "FP" <-> Constraints
Make Life Easy Now | Make Life Easy Later
Referential Transparency
- It's all about scope!
- Mutation only infects as far as its scope reaches
- Global variables can be OK, if your referential transparency scope is a process
- The scope can be a function, class, thread, process, or even a whole computer
What is Functional Programming?
- Complementary convenience and constraints
- A highly constrained set of approaches to programming
- Where you lose in order to gain
- Low-level constraints that propagate upward to the top level of your program
Program Scope over my Career
• Largest scope was usually a process with one thread
• Then a process with a few threads
• Then a process with many threads
• Then a few machines
• Now a ton of machines
We need to scale out
• Desktop apps are going away
• Hosted hardware is on the way out
• No one cares about little data
- But! -
• Old algorithms don't generalize well
• New tradeoffs between speed and scope
• Too many costs to keep track of
Thinking about Resource Costs
- Far Machines / Far Network
- Machines / Network
- Processes / Disk
- Threads / Memory
- Instructions / Cache
What is Cloud Computing?
- More than just a sneaky way to charge a ton for hosting
- Paradigms that simplify resource management
- You always lose in order to gain
- High-level constraints that propagate downward into your subtasks
Papers Published Over Time (Microsoft Academic Search, April 2012)
[chart comparing papers matching "Cloud Computing" vs. "Type System"]
Properties of Cloud Computing
- Resources (Network, Disk, Memory, Cache)
- What constraints can make this easier?
  - Force everything into one of a few styles of computation?
  - What if what we want to do is still possible but doesn't fit our cluster's paradigm?
  - Where's the escape hatch?
Cloud Computing is Functional Programming
- Can't Escape Referential Transparency
- Simple Composition is Key to Small Programs
- Object Oriented: a Square Peg in a Round Hole
Thanks for Listening! Any Questions?
Visit my blog for ants and rants: RichardMinerich.com
Follow me on Twitter: @Rickasaurus
Come to NYC for the SkillsMatter F# Tutorials, June 5th and 6th: is.gd/fsharptutorials