Expressiveness, Simplicity and Users

Expressiveness, Simplicity, and Users
  Craig Chambers
  Google

A Brief Bio
  MIT, 82-86: Argus, with Barbara Liskov, Bill Weihl, Mark Day
  Stanford, 86-91: Self, with David Ungar, Urs Hölzle, …
  U. of Washington, 91-07: Cecil, MultiJava, ArchJava; Vortex, DyC, Rhodium, ...; with Jeff Dean, Dave Grove, Jonathan Aldrich, Todd Millstein, Sorin Lerner, …
  Google, 07-: Flume, …

Some Questions
  What makes an idea successful?
  Which ideas are adopted most?
  Which ideas have the most impact?

Outline
  Some past projects
    Self language, Self compiler
    Cecil language, Vortex compiler
  A current project
    Flume: data-parallel programming system

Self Language [Ungar & Smith 87]
  Purified essence of Smalltalk-like languages
    all data are objects
      no classes
    all actions are messages
      field accesses, control structures
  Core ideas are very simple
    widely cited and understood

Self v2 [Chambers, Ungar, Chang 91]
  Added encapsulation and privacy
  Added prioritized multiple inheritance
    supported both ordered and unordered multiple inheritance
  Sophisticated, or complicated?
  Unified, or kitchen sink?
  Not adopted; dropped from Self v3

Self Compiler [Chambers, Ungar 89-91]
  Dynamic optimizer (an early JIT compiler)
    Customization: specialize code for each receiver class
    Class/type dataflow analysis; lots of inlining
    Lazy compilation of uncommon code paths
  89: customization + simple analysis: effective
  90: + complicated analysis: more effective but slow
  91: + lazy compilation: still more effective, and fast
  [Hölzle, … 92-94]: + dynamic type feedback: zowie!
  Simple analysis + type feedback widely adopted
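
As a rough illustration of customization only, here is a toy sketch in Java. It keeps one "compiled" version of an operation per receiver class and builds each version the first time that class is seen. The class and method names (CustomizationSketch, compileFor, describe) are hypothetical, and the "compilation" is just a choice of lambda; this is not how the Self compiler was implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model of customization: one specialized version of describe() per receiver class.
// All names are hypothetical; a real compiler would generate machine code instead.
final class CustomizationSketch {
    int compilations = 0;  // how many specialized versions have been built so far
    private final Map<Class<?>, Function<Object, String>> specialized = new HashMap<>();

    // "Compile" a version of describe() that assumes the receiver's exact class.
    private Function<Object, String> compileFor(Class<?> cls) {
        compilations++;
        if (cls == Integer.class) return o -> "int " + (Integer) o;
        if (cls == String.class) return o -> "string of length " + ((String) o).length();
        return o -> "object " + o;  // generic fallback for every other class
    }

    // Look up the version for this receiver class, building it on first use.
    String describe(Object receiver) {
        return specialized.computeIfAbsent(receiver.getClass(), this::compileFor).apply(receiver);
    }
}
```

Calling describe(1), describe(2), and describe("ab") on one instance builds two specialized versions, one for Integer and one for String; the second Integer call reuses the first.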

Cecil Language [Chambers, Leavens, Millstein, Litvinov 92-99]
  Pure objects, pure messages
  Multimethods, static typechecking
    encapsulation
    modules, modular typechecking
    constraint-based polymorphic type system
      integrates F-bounded polymorphism and “where” clauses
    later: MultiJava, EML [Lee], Diesel, …
  Work on multimethods, “open classes” is well-known
  Multimethods not widely available

Vortex Compiler [Chambers, Dean, Grove, Lerner, … 94-01]
  Whole-program optimizer, for Cecil, Java, …
    Class hierarchy analysis
    Profile-guided class/type feedback
    Dataflow analysis, code specialization
    Interprocedural static class/type analysis
      Fast context-insensitive [Defouw], context-sensitive
    Incremental recompilation; composable dataflow analyses
  Project well-known
    CHA: my most cited paper; a very simple idea
    More-sophisticated work less widely adopted
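
To show why class hierarchy analysis (CHA) counts as a very simple idea, here is a minimal sketch in Java. It assumes the whole class hierarchy is known and represents classes and methods by name only. If a message sent to a receiver of a given static class can reach just one implementation across that class and all its subclasses, the call can be bound statically. ChaSketch and its methods are hypothetical names, not Vortex code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy class hierarchy analysis over a whole program described by name.
// All names here are hypothetical.
final class ChaSketch {
    private final Map<String, String> superclass = new HashMap<>();
    private final Map<String, Set<String>> declared = new HashMap<>();
    private final Map<String, List<String>> subclasses = new HashMap<>();

    // Register a class, its parent (null for a root), and the methods it declares itself.
    void addClass(String name, String parent, String... methods) {
        superclass.put(name, parent);
        declared.put(name, new HashSet<>(Arrays.asList(methods)));
        subclasses.computeIfAbsent(name, k -> new ArrayList<>());
        if (parent != null) {
            subclasses.computeIfAbsent(parent, k -> new ArrayList<>()).add(name);
        }
    }

    // Walk up from cls to the class whose definition of method it declares or inherits.
    private String lookup(String cls, String method) {
        for (String c = cls; c != null; c = superclass.get(c)) {
            if (declared.get(c).contains(method)) return c;
        }
        return null;
    }

    // Every implementation a call could reach when the receiver's static class is cls.
    Set<String> possibleTargets(String cls, String method) {
        Set<String> targets = new HashSet<>();
        collect(cls, method, targets);
        return targets;
    }

    private void collect(String cls, String method, Set<String> targets) {
        String impl = lookup(cls, method);
        if (impl != null) targets.add(impl + "." + method);
        for (String sub : subclasses.get(cls)) collect(sub, method, targets);
    }

    // With exactly one possible target, dynamic dispatch can become a direct call.
    boolean canBindStatically(String cls, String method) {
        return possibleTargets(cls, method).size() == 1;
    }
}
```

For a hierarchy where Shape declares area and name, and Circle and Square each override only area, a call to name on a Shape can be bound statically, while a call to area on a Shape cannot.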

Some Other Work
  DyC [Grant, Philipose, Mock, Eggers 96-00]
    Dynamic compilation for C
  ArchJava, AliasJava, … [Aldrich, Notkin 01-04 …]
    PL support for software architecture
  Cobalt, Rhodium [Lerner, Millstein 02-05 …]
    Provably correct compiler optimizations

Trends
  Simpler ideas easier to adopt
  Sophisticated ideas need a simple story to be impactful
    Ideal: “deceptively simple”
    Unification != Swiss Army Knife
  Language papers have had more citations; compiler work has had more practical impact
    The combination can work well

A Current Project: Flume [Chambers, Raniwala, Perry, ... 10]
  Make data-parallel MapReduce-like pipelines easy to write yet efficient to run

Data-Parallel Programming
  Analyze & transform large, homogeneous data sets, processing separate elements in parallel
    Web pages
    Click logs
    Purchase records
    Geographical data sets
    Census data
    …
  Ideal: “embarrassingly parallel” analysis of petabytes of data

Challenges
  Parallel distributed programming is hard
  To do:
    Assign machines
    Distribute program binaries
    Partition input data across machines
    Synchronize jobs, communicate data when needed
    Monitor jobs
    Deal with faults in programs, machines, network, …
    Tune: stragglers, work stealing, …
  What if user is a domain expert, not a systems/PL expert?

MapReduce [Dean & Ghemawat, 04]
  input     purchases                queries
  map       item -> co-item          term -> hour+city
  shuffle   item -> all co-items     term -> (hour+city)*
  reduce    item -> recommend        term -> what’s hot, when
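
The three phases of the purchases example can be mimicked in memory. The sketch below is plain Java, not the MapReduce library: it assumes a purchase is a basket of item names and that "recommend" means the most frequent co-item, with ties broken alphabetically. Both are assumptions made for illustration, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory imitation of map -> shuffle -> reduce for the purchases example.
// Hypothetical names; a real MapReduce runs these phases across many machines.
final class MapReduceSketch {
    // map: one basket -> an (item, co-item) pair for every two distinct items in it
    static List<String[]> map(List<String> basket) {
        List<String[]> out = new ArrayList<>();
        for (String item : basket)
            for (String other : basket)
                if (!item.equals(other)) out.add(new String[] {item, other});
        return out;
    }

    // shuffle: group every emitted co-item under its item
    static Map<String, List<String>> shuffle(List<String[]> pairs) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String[] p : pairs)
            grouped.computeIfAbsent(p[0], k -> new ArrayList<>()).add(p[1]);
        return grouped;
    }

    // reduce: all co-items of one item -> the most frequent one (assumed recommendation rule)
    static String reduce(List<String> coItems) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String c : coItems) counts.merge(c, 1, Integer::sum);
        String best = null;
        for (Map.Entry<String, Integer> e : counts.entrySet())
            if (best == null || e.getValue() > counts.get(best)) best = e.getKey();
        return best;
    }

    // Run the three phases in sequence over all baskets.
    static Map<String, String> recommend(List<List<String>> baskets) {
        List<String[]> pairs = new ArrayList<>();
        for (List<String> b : baskets) pairs.addAll(map(b));
        Map<String, String> result = new TreeMap<>();
        shuffle(pairs).forEach((item, coItems) -> result.put(item, reduce(coItems)));
        return result;
    }
}
```

Only map and reduce carry application logic here; shuffle is generic, which is the part a MapReduce framework takes over along with distribution and fault handling.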

MapReduce
  Greatly eases writing fault-tolerant data-parallel programs
  Handles many tedious and/or tricky details
  Has excellent (batch) performance
  Offers a simple programming model
  Lots of knobs for tuning
  Pipelines of MapReduces?
    Additional details to handle
      temp files
      pipeline control
    Programming model becomes low-level

Flume
  Ease task of writing data-parallel pipelines
  Offer high-level data-parallel abstractions, as a Java or C++ library
    Classes for (possibly huge) immutable collections
    Methods for data-parallel operations
    Easily composed to form pipelines
    Entire pipeline in a single program
  Automatically optimize and execute pipeline, e.g., via a series of MapReduces
    Manage lower-level details automatically

Flume Classes and Methods
  Core data-parallel collection classes:
    PCollection<T>, PTable<K,V>
  Core data-parallel methods:
    parallelDo(DoFn)
    groupByKey()
    combineValues(CombineFn)
    flatten(...)
    read(Source), writeTo(Sink), …
  Derive other methods from these primitives:
    join(...), count(), top(CompareFn,N), ...
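
One way a derived method such as count() could be built from the primitives is sketched below, using in-memory java.util collections as stand-ins for PCollection and PTable. This is not the FlumeJava API: the static methods only borrow the primitive names, their signatures are assumptions, and everything runs sequentially.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// In-memory stand-ins for three primitives, plus count() derived from them.
// Hypothetical signatures; List and Map play the roles of PCollection and PTable.
final class PrimitivesSketch {
    // parallelDo stand-in: apply fn to each element and collect everything it emits
    static <T, R> List<R> parallelDo(List<T> in, Function<T, List<R>> fn) {
        List<R> out = new ArrayList<>();
        for (T t : in) out.addAll(fn.apply(t));
        return out;
    }

    // groupByKey stand-in: gather all values that share a key
    static <K extends Comparable<K>, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> in) {
        Map<K, List<V>> grouped = new TreeMap<>();
        for (Map.Entry<K, V> e : in)
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        return grouped;
    }

    // combineValues stand-in: fold each key's (non-empty) value list with an associative operator
    static <K extends Comparable<K>, V> Map<K, V> combineValues(
            Map<K, List<V>> in, BinaryOperator<V> combine) {
        Map<K, V> out = new TreeMap<>();
        in.forEach((k, vs) -> out.put(k, vs.stream().reduce(combine).get()));
        return out;
    }

    // count() derived from the primitives: emit (word, 1), group by word, sum the ones
    static Map<String, Long> count(List<String> words) {
        List<Map.Entry<String, Long>> ones =
            parallelDo(words, w -> List.of(Map.entry(w, 1L)));
        return combineValues(groupByKey(ones), Long::sum);
    }
}
```

In this model count(List.of("a", "b", "a")) maps "a" to 2 and "b" to 1. The derived method adds no new machinery; it is only a composition of the three primitives.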

Example: TopWords
  PCollection<String> lines =
      read(TextIO.source("/gfs/corpus/*.txt"));
  PCollection<String> words =
      lines.parallelDo(new ExtractWordsFn());
  PTable<String, Long> wordCounts =
      words.count();
  PCollection<Pair<String, Long>> topWords =
      wordCounts.top(new OrderCountsFn(), 1000);
  PCollection<String> formattedOutput =
      topWords.parallelDo(new FormatCountFn());
  formattedOutput.writeTo(TextIO.sink("cnts.txt"));
  FlumeJava.run();

Example: TopWords
  read(TextIO.source("/gfs/corpus/*.txt"))
      .parallelDo(new ExtractWordsFn())
      .count()
      .top(new OrderCountsFn(), 1000)
      .parallelDo(new FormatCountFn())
      .writeTo(TextIO.sink("cnts.txt"));
  FlumeJava.run();

Execution Graph
  Data-parallel primitives (e.g., parallelDo) are “lazy”
    Don’t actually run right away, but wait until demanded
    Calls to primitives build an execution graph
      Nodes are operations to be performed
      Edges are PCollections that will hold the results
  An unevaluated result PCollection is a “future”
    Points to the graph that computes it
  Derived operations (e.g., count, user code) call lazy primitives and so get inlined away
  Evaluation is “demanded” by FlumeJava.run()
    Optimizes, then executes
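
The lazy behaviour can be sketched in a few lines of Java. Below, calling parallelDo only creates a graph node that remembers its input and its function; no work happens until materialize() is called, which plays the role that FlumeJava.run() plays in the slides. Node, read, materialize, and the log list are hypothetical names, there is no optimizer step, and everything runs in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Toy deferred-evaluation graph. Hypothetical names; no optimizer, no parallelism.
final class LazyGraphSketch {
    static final List<String> log = new ArrayList<>();  // records when operations really run

    // A graph node that doubles as a "future" for the collection it will produce.
    static abstract class Node<T> {
        private List<T> result;  // stays null until evaluation is demanded

        abstract List<T> compute();

        // Demand the result: evaluate inputs first, then this node, at most once.
        List<T> materialize() {
            if (result == null) result = compute();
            return result;
        }

        // Lazy primitive: builds a new node and returns immediately.
        <R> Node<R> parallelDo(String name, Function<T, R> fn) {
            Node<T> input = this;
            return new Node<R>() {
                List<R> compute() {
                    List<T> in = input.materialize();
                    log.add(name);
                    List<R> out = new ArrayList<>();
                    for (T t : in) out.add(fn.apply(t));
                    return out;
                }
            };
        }
    }

    // Source node: also deferred until something downstream is materialized.
    static <T> Node<T> read(List<T> source) {
        return new Node<T>() {
            List<T> compute() {
                log.add("read");
                return source;
            }
        };
    }
}
```

After building read(...).parallelDo("upper", ...) the log is still empty; it fills with "read" and then "upper" only once materialize() is called on the last node. Because the whole graph exists before anything runs, an optimizer has the chance to rewrite it first.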

Execution Graph
  [Figure: execution graph built by the TopWords pipeline]
  Calls: read(TextIO.source("/…/*.txt")), parallelDo(new ExtractWordsFn()), count(), top(new OrderCountsFn(), 1000), parallelDo(new FormatCountFn()), writeTo(TextIO.sink("cnts.txt"))
  Operations, in order: read, pDo, pDo, gbk, cv, pDo, gbk, pDo, pDo, write

Optimizer
  Fuse trees of parallelDo operations into one
    Producer-consumer, co-consumers (“siblings”)
    Eliminate now-unused intermediate PCollections
  Form MapReduces
    pDo + gbk + cv + pDo -> MapShuffleCombineReduce (MSCR)
    General: multi-mapper, multi-reducer, multi-output
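
Producer-consumer fusion is essentially function composition. In the sketch below, written under the assumption that a DoFn maps one element to zero or more outputs, two consecutive parallelDo steps are replaced by one whose function feeds each output of the first directly into the second, so the intermediate collection is never built. The DoFn interface here is a hypothetical stand-in, not the FlumeJava class, and sibling fusion and MSCR formation are not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Toy producer-consumer fusion of two parallelDo steps. Hypothetical names throughout.
final class FusionSketch {
    // DoFn stand-in: one input element in, zero or more output elements out.
    interface DoFn<A, B> extends Function<A, List<B>> {}

    // Fuse: pipe each output of `first` straight into `second`, element by element.
    static <A, B, C> DoFn<A, C> fuse(DoFn<A, B> first, DoFn<B, C> second) {
        return a -> {
            List<C> out = new ArrayList<>();
            for (B b : first.apply(a)) out.addAll(second.apply(b));
            return out;
        };
    }

    // Sequential stand-in for parallelDo over an in-memory collection.
    static <A, B> List<B> parallelDo(List<A> in, DoFn<A, B> fn) {
        List<B> out = new ArrayList<>();
        for (A a : in) out.addAll(fn.apply(a));
        return out;
    }
}
```

With a word-splitting function and an upper-casing function, parallelDo(parallelDo(lines, split), upper) and parallelDo(lines, fuse(split, upper)) give the same list, but the fused form makes a single pass and never materializes the collection of split words.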

Final Pipeline
  [Figure: the TopWords execution graph after fusion; 8 operations are reduced to 2 mscr operations]

Executor
  Runs each optimized MSCR
    If small data, runs locally, sequentially
      develop and test in normal IDE
    If large data, runs remotely, in parallel
  Handles creating, deleting temp files
  Supports fast re-execution of incomplete runs
    Caches, reuses partial pipeline results

Another Example: SiteData
  [Figure: execution graph for the SiteData pipeline, built from pDo, gbk, cv, and join() operations; user functions shown: GetPScoreFn, GetVerticalFn, GetDocInfoFn, PickBestFn, MakeDocTraitsFn]

Another Example: SiteData
  [Figure: the SiteData execution graph after optimization; 11 ops are reduced to 2 mscr ops]

Experience
  FlumeJava released to Google users in May 2009
  Now: hundreds of pipelines run by hundreds of users every month
  Real pipelines process anywhere from megabytes to petabytes
  Users find FlumeJava a lot easier than MapReduce
  Advanced users can exert control over optimizer and executor if/when necessary
  But when things go wrong, lower abstraction levels intrude

How Well Does It Work?
  How does FlumeJava compare in speed to:
    an equally modular Java MapReduce pipeline?
    a hand-optimized Java MapReduce pipeline?
    a hand-optimized Sawzall pipeline?
      Sawzall: language for logs processing
  How big are pipelines in practice?
  How much does the optimizer help?

Current and Future Work
  FlumeC++ just released to Google users
  Auto-tuner
    Profile executions, choose good settings for tuning MapReduces
  Other execution substrates than MapReduce
  Continuous/streaming execution?
  Dynamic code generation and optimization?

A More Advanced Approach
  Apply advanced PL ideas to the data-parallel domain
    A custom language tuned to this domain
    A sophisticated static optimizer and code generator
    An integrated parallel run-time system

Lumberjack
  A language designed for data-parallel programming
  An implicitly parallel model
    All collections potentially PCollections
    All loops potentially parallel
  Functional
    Mostly side-effect free
    Concise lambdas
  Advanced type system to minimize verbosity

Static Optimizer
  Decide which collections are PCollections, which loops are parallel loops
  Interprocedural context-sensitive analysis
    OO type analysis
    side-effect analysis
    inlining
    dead assignment elimination
    …

Parallel Run-Time System
  Similar to Flume’s run-time system
    Schedules MapReduces
    Manages temp files
    Handles faults

Result: Not Successful
  A new language is a hard sell to most developers
    Language details obscure key new concepts
    Hard to be proficient in yet another language with yet another syntax
    Libraries?
    Increases risk to their projects
  Optimizer constrained by limits of static analysis

Response: FlumeJava
  Replace custom language with Java + Flume library
    More verbose syntactically
    All standard libraries & coding idioms preserved
    Easy to try out, easy to like, easy to adopt
