What’s in Store for Scala? Martin Odersky DEVOXX 2011
What’s in Store for Scala? Martin Odersky Typesafe and EPFL
Scala Today
Some adoption vectors: Web platforms, Trading platforms, Financial modeling, Simulation. Fast to first product, scalable afterwards.
Scala 2.9: Parallel and concurrent computing libraries; DelayedInit and App; faster REPL; progress on IDEs: Eclipse, IntelliJ, NetBeans, ENSIME; better docs.
Play Framework 2.0: Play Framework is an open source web application framework for Java and Scala, heavily inspired by Ruby on Rails. Play Framework 2.0 retains full Java support while moving to a Scala core and builds on key pieces of the Typesafe Stack, including Akka middleware and SBT. Play will be integrated in the Typesafe Stack 2.0. Typesafe will contribute to development and provide commercial support and maintenance.
Scala Eclipse IDE 2.0: The Scala Eclipse IDE has been completely reworked. The first Release Candidate for IDE 2.0 is available now. Goals: reliable (no crashes/lock-ups), responsive (never wait when typing), works with large projects/files.
Scala 2.10: new reflection framework; reification; type Dynamic; more IDE improvements: find-references, debugger, worksheet; faster builds; SIPs: string interpolation, simpler implicits. ETA: early 2012.
“Scala” comes from “Scalable”. Scalability: powerful concepts that hold up from small to large. At JavaOne - Scala in the small: winner of the Script Bowl (Thank you, Dick!). This talk - Scala in the large: scale to many cores, scale to large systems.
Scaling to Many Cores. The world of mainstream software is changing: Moore’s law is now sustained by increasing the number of cores, not clock speed. Huge-volume workloads require horizontal scaling. The “PPP” Grand Challenge. Data from Kunle Olukotun, Lance Hammond, Herb Sutter, Burton Smith, Chris Batten, and Krste Asanovic. “The free lunch is over.”
Concurrency and Parallelism. Parallel programming: execute programs faster on parallel hardware. Concurrent programming: manage concurrent execution threads explicitly. Both are too hard!
The Root of The Problem: non-determinism caused by concurrent threads accessing shared mutable state. It helps to encapsulate state in actors or transactions, but the fundamental problem stays the same. So, non-determinism = parallel processing + mutable state. To get deterministic processing, avoid the mutable state! Avoiding mutable state means programming functionally.

var x = 0
async { x = x + 1 }
async { x = x * 2 }   // can give 0, 1, 2
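(Illustration, not from the slides: a minimal, self-contained version of that race, with plain JVM threads standing in for the pseudo-code async blocks. Replaying it many times shows the non-deterministic outcomes 0, 1, and 2.)

object Race {
  def main(args: Array[String]) {
    var outcomes = Set.empty[Int]
    for (i <- 1 to 1000) {                // replay the race many times
      var x = 0
      val t1 = new Thread(new Runnable { def run() { x = x + 1 } })
      val t2 = new Thread(new Runnable { def run() { x = x * 2 } })
      t1.start(); t2.start()
      t1.join(); t2.join()
      outcomes += x                       // 0, 1, or 2 depending on interleaving
    }
    println(outcomes)                     // typically some subset of Set(0, 1, 2)
  }
}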
Space vs Time: Time (imperative/concurrent) vs Space (functional/parallel).
Scala is a Unifier: Agile, with lightweight syntax. Object-Oriented. Functional. Safe and performant, with strong static typing.
Scala is a Unifier: Agile, with lightweight syntax. Parallel. Object-Oriented. Functional. Sequential. Safe and performant, with strong static typing.
Scala’s Toolbox
Different Tools for Different Purposes. Parallelism: Parallel collections, Distributed collections, Parallel DSLs. Concurrency: Actors, Software transactional memory, Akka, Futures.
Let’s see an example:
A class ...

... in Java:

public class Person {
  public final String name;
  public final int age;
  Person(String name, int age) {
    this.name = name;
    this.age = age;
  }
}

... in Scala:

class Person(val name: String, val age: Int)
... and its usage

... in Java:

import java.util.ArrayList;
...
Person[] people;
Person[] minors;
Person[] adults;
{
  ArrayList<Person> minorsList = new ArrayList<Person>();
  ArrayList<Person> adultsList = new ArrayList<Person>();
  for (int i = 0; i < people.length; i++)
    (people[i].age < 18 ? minorsList : adultsList)
      .add(people[i]);
  minors = minorsList.toArray(people);
  adults = adultsList.toArray(people);
}

... in Scala:

val people: Array[Person]
val (minors, adults) = people partition (_.age < 18)

(a simple pattern match, an infix method call, a function value)
Going Parallel

... in Java: ? (for now)

... in Scala:

val people: Array[Person]
val (minors, adults) = people.par partition (_.age < 18)
Parallel Collections use the Java 7 fork/join framework. Work is split by the number of processors. Each thread has a work queue that is split exponentially, with the largest chunk at the end of the queue. Granularity is balanced against scheduling overhead. On completion, threads “work-steal” from the end of other threads’ queues.
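(Hypothetical sketch, not from the slides: the same computation run sequentially and through a .par view. The parallel version hands the work to the fork/join pool, which splits the range across worker threads and work-steals as described above; the result is the same.)

object ParDemo {
  def main(args: Array[String]) {
    val xs = 1 to 1000000
    val seqSum = xs.map(i => i.toLong * i).sum        // sequential
    val parSum = xs.par.map(i => i.toLong * i).sum    // parallel, fork/join under the hood
    println(seqSum == parSum)                         // true: same deterministic result
  }
}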
General Collection Hierarchy: GenTraversable, GenIterable, GenSeq at the top; the sequential side refines these as Traversable, Iterable, Seq, and the parallel side as ParIterable, ParSeq.
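(Hypothetical sketch: the Gen* layer is what lets one method accept both sequential and parallel collections.)

import scala.collection.GenSeq

object GenDemo {
  class Person(val name: String, val age: Int)

  // Written against GenSeq, so it works for Seq and ParSeq alike.
  def minors(people: GenSeq[Person]): Int = people.count(_.age < 18)

  def main(args: Array[String]) {
    val people = List(new Person("Ann", 12), new Person("Bob", 30))
    println(minors(people))        // sequential List
    println(minors(people.par))    // parallel view of the same data
  }
}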
Going Distributed: Can we get the power of parallel collections to work on 10’000s of computers? Hot technologies: MapReduce (Google’s and Hadoop). But not everything is easy to fit into that mold; sometimes 100s of map-reduce steps are needed. Distributed collections retain most operations and provide a powerful frontend for MapReduce computations. Scala’s uniform collection model is designed to also accommodate parallel and distributed collections. Projects at Google (Cascade), Berkeley (Spark), EPFL.
The Future: Scala’s persistent collections are
easy to use: few steps to do the job
concise: one word replaces a whole loop
safe: the type checker is really good at catching errors
fast: collection ops are tuned, can be parallelized
scalable: one vocabulary to work on all kinds of collections: sequential, parallel, or distributed
We see them play a rapidly increasing role in software development.
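(Illustration, not from the slides, of “one word replaces a whole loop”:)

val words = List("scala", "parallel", "collections", "are", "concise")
val lengths = words.map(_.length)                  // transform: one word, no loop
val (short, long) = words.partition(_.length < 8)  // split in one call
val longest = words.maxBy(_.length)                // find an extreme
val byInitial = words.groupBy(_.head)              // index by first letter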
Going further: Parallel DSLs. But how do we keep tomorrow’s hardware loaded? How do we find and deal with 10000+ threads in an application? Parallel collections and actors are necessary but not sufficient for this. Our bet for the mid-term future: parallel embedded DSLs. Find parallelism in domains: physics simulation, machine learning, statistics, ... Joint work with Kunle Olukotun, Pat Hanrahan @ Stanford. EPFL side funded by ERC.
EPFL / Stanford Research (layered architecture from the slide):
Applications: Virtual Worlds, Personal Robotics, Data informatics, Scientific Engineering.
Domain Specific Languages: Physics (Liszt), Scripting, Probabilistic (RandomT), Machine Learning (OptiML), Rendering.
Domain Embedding Language (Scala).
DSL Infrastructure: Parallel Runtime (Delite, Sequoia, GRAMPS); static and dynamic domain-specific optimization; staging; polymorphic embedding; task & data parallelism; locality-aware scheduling.
Heterogeneous Hardware: OOO cores, SIMD cores, threaded cores, specialized cores.
Hardware Architecture: programmable hierarchies, scalable coherence, isolation & atomicity, on-chip networks, pervasive monitoring.
Example: Liszt - A DSL for Physics Simulation. Mesh-based numeric simulation. Huge domains: millions of cells. Example: unstructured Reynolds-averaged Navier-Stokes (RANS) solver (figure labels: fuel injection, transition, thermal turbulence, turbulence, combustion).
Liszt as Virtualized Scala

// calculating scalar convection (Liszt)
val Flux = new Field[Cell,Float]
val Phi = new Field[Cell,Float]
val cell_volume = new Field[Cell,Float]
val deltat = .001
...
untilconverged {
  for (f <- interior_faces) {
    val flux = calc_flux(f)
    Flux(inside(f)) -= flux
    Flux(outside(f)) += flux
  }
  for (f <- inlet_faces) {
    Flux(outside(f)) += calc_boundary_flux(f)
  }
  for (c <- cells(mesh)) {
    Phi(c) += deltat * Flux(c) / cell_volume(c)
  }
  for (f <- faces(mesh))
    Flux(f) = 0.f
}

Toolchain sketch from the slide: DSL library, AST, optimisers, generators and schedulers, hardware (GPU, multi-core, etc.).
New in Scala 2.10: Reflection. Previously you needed to use Java reflection; no runtime info was available on Scala’s types. Now you can do:
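(Sketch of the kind of thing the new runtime reflection allows; the API was still evolving when this talk was given, so the names below follow the shape it eventually took in 2.10 and should be read as indicative only.)

import scala.reflect.runtime.universe._

val tpe = typeOf[List[Int]]
println(tpe)                                      // List[Int]: a full Scala type, not an erased Class
println(tpe <:< typeOf[Seq[Int]])                 // true: subtyping checked at runtime
println(tpe.members.filter(_.isMethod).take(5))   // inspect members with Scala-level signatures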
(Bare-Bones) Reflection in Java Want to know whether type A conforms to B? Write your own Java compiler! Why not add some meaningful operations? Need to write essential parts of a compiler (hard). Need to ensure that both compilers agree (almost impossible).
How to do Better? The problem is managing dependencies between the compiler and reflection. Time to look at DI again. Dependency Injection idea: avoid hard dependencies on specific classes. Instead of calling specific classes with new, have someone else do the wiring.
Using Guice for Dependency Injection (Example by Jan Kriesten)
... plus some Boilerplate
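(The code shown on these two slides is not in this transcript. The following is a hedged reconstruction of what typical Guice wiring in Scala looks like; all trait, class, and object names are made up for illustration.)

import com.google.inject.{AbstractModule, Guice, Inject}

trait Greeter { def greet(name: String): String }
class EnglishGreeter extends Greeter { def greet(name: String) = "Hello, " + name }

class App @Inject() (greeter: Greeter) {
  def run() { println(greeter.greet("Devoxx")) }
}

// ... plus some boilerplate: a module that does the wiring.
object GreeterModule extends AbstractModule {
  def configure() {
    bind(classOf[Greeter]).to(classOf[EnglishGreeter])
  }
}

object Main {
  def main(args: Array[String]) {
    val injector = Guice.createInjector(GreeterModule)
    injector.getInstance(classOf[App]).run()
  }
}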
Dependency Injection in Scala: Components are classes or traits. Requirements are abstract values. Wiring by implementing requirement values. But what about cyclic dependencies? See the sketch below.
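(Minimal sketch with made-up names: a component as a trait, a requirement as an abstract value, and wiring by implementing that value.)

trait UserRepository { def find(id: Int): String }

trait UserService {
  val repo: UserRepository                          // requirement: an abstract value
  def describe(id: Int) = "user: " + repo.find(id)
}

// Wiring: supply the required value when assembling the application.
object MyApp extends UserService {
  val repo = new UserRepository { def find(id: Int) = "user-" + id }
}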
The Cake Pattern: Requirements are expressed as the type of this (self-types). Components are traits. Wiring by mixin composition, as in the sketch below.
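(Minimal cake-pattern sketch, again with made-up component names.)

trait RepositoryComponent {
  trait UserRepository { def find(id: Int): String }
  val userRepository: UserRepository
}

trait ServiceComponent { self: RepositoryComponent =>   // requirement: the type of this
  class UserService { def describe(id: Int) = "user: " + userRepository.find(id) }
}

// Wiring by mixin composition.
object MyApp extends ServiceComponent with RepositoryComponent {
  val userRepository = new UserRepository { def find(id: Int) = "user-" + id }
}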
Cake Pattern in the Compiler: The Scala compiler uses the cake pattern for everything. Here’s a schema (in reality there are about 20 slices in the cake):
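(The schema on the original slide is not in this transcript; very roughly, the compiler cake is layered like this, heavily simplified.)

trait Trees   { self: SymbolTable => /* tree definitions */ }
trait Symbols { self: SymbolTable => /* symbol definitions */ }
trait Types   { self: SymbolTable => /* type definitions */ }

abstract class SymbolTable extends Trees with Symbols with Types

class Global extends SymbolTable { /* scalac-specific parts: parser, typer, backend, ... */ }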
Towards Better Reflection: Can we unify the core parts of the compiler and reflection? They have different requirements - error diagnostics, file access, classpath handling - but we are close!
Compiler Architecture: reflect.internal.Universe is shared by nsc.Global (scalac) and reflect.runtime.Mirror. Problem: this exposes way too much detail!
Complete Reflection Architecture. Cleaned-up facade: reflect.api.Universe / reflect.mirror. Full implementation: reflect.internal.Universe, nsc.Global (scalac), reflect.runtime.Mirror.
How to Make a Facade: the facade vs. the implementation. Interfaces are not enough!
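(Sketch of the idea, heavily simplified and with made-up bodies: a Java-style interface cannot leave the identity of types like Tree abstract and let the implementation fix them later, but a Scala trait with abstract type members can. That is what the facade relies on.)

trait Universe {                   // the facade, in the spirit of reflect.api.Universe
  type Tree                        // abstract type member, refined by implementations
  type Symbol
  def showRaw(tree: Tree): String
}

trait InternalUniverse extends Universe {   // the implementation side
  class TreeImpl
  class SymbolImpl
  type Tree = TreeImpl             // tie the abstract types to concrete classes
  type Symbol = SymbolImpl
  def showRaw(tree: Tree): String = tree.toString
}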
Conclusion: Scala is a very regular language when it comes to composition. Everything can be nested: classes, methods, objects, types. Everything can be abstract: methods, values, types. The type of this can be declared freely and can thus express dependencies. This gives great flexibility for software architecture and allows us to attack previously unsolvable problems.
Follow us on Twitter: @typesafe. scala-lang.org, typesafe.com, akka.io


Editor's Notes

  • #28 This leads to our vision, applications driven by a set of interoperable DSLs. We are developing DSLs to provide evidence as to their effectiveness in extracting parallel performance. But we are also very interested in empowering other to easily build such DSLs, so we are investing heavily in developing frameworks and runtimes to make parallel DSL development easier. And the goal is to run single source programs on a variety of very different hardware targets.
  • #29 Liszt is another language we have implemented. It is designed to support the creation of solvers for mesh-based partial differential equations. Problems in this domain typically simulate complex physical systems such as fluid flow or mechanics by breaking up space into discrete cells. A typical mesh may contain hundreds of millions of these cells (here we are visualizing a scram-jet designed to work at hypersonic speeds). Liszt is an ideal candidate for a DSL because while the problems are large and highly parallel, the mesh introduces many data-dependencies that are difficult to reason about, making writing solvers tedious.