Practical type mining in Scala


As the author of an open-source serialization library in Scala, I have struggled a great deal to understand and harness the power of Scala's type system. My library was based on parsing pickled Scala signatures, a subterranean and sparsely documented feature of Scala 2.8. I wanted to serialize and deserialize options, lists and maps, which required defeating type erasure when serializing while skating by on type erasure when deserializing. I struggled with multiple constructors, checking for annotation types, specialization, and more. The new reflection libraries introduced in Scala 2.10 provided easier access to the same information I had been getting from the pickled signatures. This talk addresses practical aspects of type mining, providing a library of hands-on examples using the Scala 2.10 reflection API.


  1. Practical type mining in Scala
     the fastest way from A to Z.tpe
     Rose Toomey, Novus Partners
     11 June 2013 @ Scala Days
  2. Where Salat started
     Salat began because we wanted a simple, seamless way to serialize and deserialize our data model without external mappings.
     Pickled Scala signatures (SID #10) allowed us to mine type information without resorting to runtime reflection.
  3. Scala reflection before 2.10
     Why did Salat resort to pickled Scala signatures?
     • Scala before 2.10 used reflection from Java
       – Reflection didn't know about Scala features like implicits, path-dependent types, etc.
       – Type erasure: parameterized types were unrecoverable at runtime without the Manifest workaround
     Why should we settle for having less information than the compiler does?
     Workaround: raid the compiler's secret stash.
  4. Benefits of Scala 2.10 reflection
     • Choose between runtime and compile-time reflection
     • Significant parts of the compiler API are now exposed
     • Reify Scala expressions into abstract syntax trees
     • Vastly better documentation!
  5. Navigating the universe
     A universe is an environment with access to trees, symbols and their types.
     • scala.reflect.runtime.universe links symbols and types to the underlying classes and runtime values of the JVM
     • scala.reflect.macros.Universe is the compiler universe
  6. Macros and the compiler
     The compiler universe has one mirror.
     Macros access the compiler universe and mirror via an instance of scala.reflect.macros.Context.
     To get started with a simple example, see Eugene Burmako's printf macro.
  7. Mirror, mirror
     Mirrors provide access to the symbol table within a universe.
     The compiler has one universe and one mirror, which loads symbols from pickled Scala signatures using ClassFileParser.
     At runtime there is only one universe, but it has a mirror for each classloader. The classloader mirror creates invoker mirrors, which are used for instances, classes, methods, fields: everything.
  8. Which universe?
     Play with the compiler's universe using the Scala REPL :power mode.
     At runtime, get a mirror for your classloader and then use reflect, reflectClass and reflectModule to get more specific invoker mirrors.
         scala.reflect.runtime.currentMirror
     For macros, your macro implementation takes a Context c and then imports the macro universe.
     The macro universe exposes the compiler universe and provides mutability for reflection artifacts so your macros can create or transform ASTs.
         import c.universe._
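The runtime path can be sketched roughly as follows. This is a minimal illustration that assumes Scala 2.10 with scala-reflect on the classpath; `MirrorSketch`, `Person` and `readAge` are hypothetical names, not part of the talk.

```scala
import scala.reflect.runtime.{universe => ru}
import scala.reflect.runtime.currentMirror

object MirrorSketch {
  // Hypothetical class used only for illustration.
  case class Person(name: String, age: Int)

  // Reads the `age` member of a Person through reflection.
  def readAge(p: Person): Any = {
    // Instance mirror: an invoker mirror bound to one runtime value.
    val instanceMirror = currentMirror.reflect(p)
    // Look up the symbol for `age` on the type, then narrow it to a term symbol.
    val ageSymbol = ru.typeOf[Person].member(ru.newTermName("age")).asTerm
    // Field mirror: reads the underlying field of that particular instance.
    instanceMirror.reflectField(ageSymbol).get
  }
}
```

Calling `MirrorSketch.readAge(MirrorSketch.Person("Ada", 36))` goes from the classloader mirror to an instance mirror to a field mirror, which is the chain of invoker mirrors described on the slide.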
  9. Symbols and Types
     Symbols exist in a hierarchy that provides all available information about the declaration of entities and members.
     Types represent information about the type of a symbol: its members, base types, erasure, modifiers, etc.
  10. What can I do with a Type?
      • Comparisons: check equality, subtyping
      • Mine type information about the members and inner types
        – declarations gets all the members declared on the type
        – members gets all the members of this type, either declared or inherited
        – Use declaration or member to find a type by symbol
      • Get the type's own termSymbol or typeSymbol
      Type instances represent information about the type of a corresponding symbol, so to understand types we need to examine which types of symbols are interesting and why.
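To make the difference between declarations and members concrete, here is a small sketch. It assumes the Scala 2.10 names used in these slides (later Scala versions deprecate `declarations` in favor of `decls`); `MembersSketch` and `Point` are hypothetical.

```scala
import scala.reflect.runtime.universe._

object MembersSketch {
  // Hypothetical class used only for illustration.
  case class Point(x: Int, y: Int)

  private val tpe = typeOf[Point]

  // Names declared directly on Point. Field names carry a trailing space
  // internally, so trim them to line up with their accessors.
  val declared: Set[String] = tpe.declarations.map(_.name.toString.trim).toSet

  // Names declared on Point or inherited from its base types.
  val all: Set[String] = tpe.members.map(_.name.toString.trim).toSet
}
```

Here `declared` contains names such as `x` and `copy`, while inherited names such as `getClass` only show up in `all`.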
  11. Great! Now I want a type…
      Import a universe and use typeOf:
          scala> import scala.reflect.runtime.universe._
          import scala.reflect.runtime.universe._
          scala> case class Foo(x: Int)
          defined class Foo
          scala> val fooTpe = typeOf[Foo]
          fooTpe: reflect.runtime.universe.Type = Foo
      scala.reflect.internal.Definitions defines value class types (Unit, primitives) and trivial types (Any, AnyVal, AnyRef).
  12. Comparing types
      Don't compare types using == to check for equality because under certain conditions it does not work. Type aliases are one example, but due to some internal implementation details, == could fail even for the same types if they were loaded differently.
      Use these handy emoji instead:
          =:=  Is this type equal to that type?
          <:<  Is this type a subtype of that type?
      * Don't confuse these type comparisons with deprecated Manifest operations like <:< and >:>
      * Tip of the hat to @softprops for the original emoji usage in his presentation on sbt
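A short sketch of the two comparisons, assuming Scala 2.10 runtime reflection; `CompareSketch` and the alias `Name` are hypothetical.

```scala
import scala.reflect.runtime.universe._

object CompareSketch {
  // Hypothetical type alias: the kind of case where == is not reliable.
  type Name = String

  // =:= sees through the alias and reports type equality.
  val aliasIsEqual: Boolean = typeOf[Name] =:= typeOf[String]

  // <:< checks subtyping: every List[Int] is a Seq[Int], but not the reverse.
  val listIsSeq: Boolean = typeOf[List[Int]] <:< typeOf[Seq[Int]]
  val seqIsList: Boolean = typeOf[Seq[Int]] <:< typeOf[List[Int]]
}
```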
  13. Inspecting types in detail
      The REPL :power mode is full of undocumented treats like :type -v
          scala> :type -v case class Foo[T](t: T)
          // Type signature
          [T]AnyRef
                  with Product
                  with Serializable {
            val t: T
            private[this] val t: T
            def <init>(t: T): Foo[T]
            def copy[T](t: T): Foo[T]
            ...
          }
          // Internal Type structure
          PolyType(
            typeParams = List(TypeParam(T))
            resultType = ClassInfoType(...)
          )
  14. Symbols in more depth
      Start here: scala.reflect.internal.Symbols
      TypeSymbol represents types, classes, traits and type parameters. It provides information about covariance and contravariance.
      TermSymbol covers a lot of ground: var, val, def, object declarations.
      SymbolApi provides "is" methods to check whether a Symbol instance can be cast to a more specific type of symbol, as well as "as" methods to actually cast, e.g. isTerm and asTerm.
  15. Interesting type symbols
      ClassSymbol provides access to all the information contained in a class or trait.
      • baseClasses in linear order from most to least specific
      • isAbstractClass, isTrait, isCaseClass
      • isNumeric, isPrimitive, isPrimitiveValueClass
      • Find companion objects
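A few of these ClassSymbol queries, sketched under the assumption of Scala 2.10 runtime reflection; `ClassSymbolSketch`, `Shape` and `Circle` are hypothetical.

```scala
import scala.reflect.runtime.universe._

object ClassSymbolSketch {
  // Hypothetical hierarchy used only for illustration.
  trait Shape
  case class Circle(radius: Double) extends Shape

  // typeSymbol returns a general Symbol; asClass narrows it to a ClassSymbol.
  val circle: ClassSymbol = typeOf[Circle].typeSymbol.asClass
  val shape: ClassSymbol = typeOf[Shape].typeSymbol.asClass

  val circleIsCaseClass: Boolean = circle.isCaseClass
  val shapeIsTrait: Boolean = shape.isTrait

  // Linearized base classes, from most specific (Circle) to least specific (Any).
  val bases: List[String] = circle.baseClasses.map(_.name.toString)
}
```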
  16. The world of term symbols
      Term symbols represent val, var, def, and object declarations as well as packages and value parameters.
      Accordingly you can find interesting methods on them like:
      • isVal, isVar
      • isGetter, isSetter, isAccessor, isParamAccessor
      • isParamWithDefault (note there is not any easy way to get the value of the default argument yet)
      • isByNameParam (big improvement!)
      • isLazy
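As an illustration of these predicates, the sketch below separates getters from setters. It assumes Scala 2.10 runtime reflection; `TermSymbolSketch` and `Account` are hypothetical.

```scala
import scala.reflect.runtime.universe._

object TermSymbolSketch {
  // Hypothetical class: one immutable and one mutable constructor parameter.
  class Account(val id: Long, var balance: BigDecimal)

  // Keep only term symbols and narrow them so the TermSymbol predicates are available.
  private val terms: List[TermSymbol] =
    typeOf[Account].declarations.filter(_.isTerm).map(_.asTerm).toList

  // Both `id` and `balance` get a getter; only the var `balance` gets a setter.
  val getters: Set[String] = terms.filter(_.isGetter).map(_.name.toString).toSet
  val setters: Set[String] = terms.filter(_.isSetter).map(_.name.toString).toSet
}
```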
  17. Term symbols: methods
      Use MethodSymbol to get all the details of methods:
      • is it a constructor? the primary constructor?
      • use paramss to get all the parameter lists of the method (ss = list of lists of symbols)
      • return type
      • type params (empty for non-parameterized methods)
      • does the method support variable length argument lists?
      When members or member(ru.Name) returns a Symbol, you can convert it to a MethodSymbol using asMethod.
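The MethodSymbol details listed above might be read like this. The sketch assumes the Scala 2.10 API names from the slides (later versions prefer `paramLists` over `paramss`); `MethodSymbolSketch`, `Greeter` and `greet` are hypothetical.

```scala
import scala.reflect.runtime.universe._

object MethodSymbolSketch {
  // Hypothetical class with two parameter lists, the second one variadic.
  class Greeter {
    def greet(greeting: String)(names: String*): String =
      names.map(n => greeting + ", " + n).mkString("; ")
  }

  // member returns a plain Symbol; asMethod narrows it to a MethodSymbol.
  val greet: MethodSymbol = typeOf[Greeter].member(newTermName("greet")).asMethod

  // One inner list of parameter names per parameter list.
  val paramNames: List[List[String]] = greet.paramss.map(_.map(_.name.toString))

  val returnsString: Boolean = greet.returnType =:= typeOf[String]
  val hasTypeParams: Boolean = greet.typeParams.nonEmpty
  val isVariadic: Boolean = greet.isVarargs
}
```

If a name is overloaded, `member` returns a single overloaded term symbol, so `asMethod` would not apply directly; the sketch avoids that case by using a name with a single definition.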
  18. Term symbols: modules
      Use ModuleSymbol to navigate object declarations:
      • Find companion objects (see this StackOverflow discussion)
      • Find nested objects (see this StackOverflow discussion)
      Given a ClassSymbol, use companionSymbol.asModule to get a ModuleSymbol, which you can turn into a companion object instance using the mirror: reflectModule(moduleSymbol).instance
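A minimal sketch of that companion lookup, assuming Scala 2.10 runtime reflection and a class nested in a top-level object; `CompanionSketch` and `Foo` are hypothetical.

```scala
import scala.reflect.runtime.universe._
import scala.reflect.runtime.currentMirror

object CompanionSketch {
  // Hypothetical case class with an explicit companion object.
  case class Foo(x: Int)
  object Foo {
    val default: Foo = Foo(0)
  }

  // ClassSymbol -> companion ModuleSymbol.
  val module: ModuleSymbol = typeOf[Foo].typeSymbol.asClass.companionSymbol.asModule

  // ModuleSymbol -> module mirror -> the singleton companion instance.
  val instance: Any = currentMirror.reflectModule(module).instance
}
```

The value obtained this way is the companion object itself, so `CompanionSketch.instance` and `CompanionSketch.Foo` refer to the same singleton.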
  19. Getting symbols out of types
      Have a Type?
      – typeSymbol returns either NoSymbol or a Symbol which can be cast using asType
      – similarly, termSymbol
      Use the members method to get a MemberScope, which has an iterator of symbols:
          scala> typeOf[Foo].members
          res61: reflect.runtime.universe.MemberScope = Scopes(constructor Foo, value x, ...
  20. Ask for it by name
      If you know exactly what you want, use newTermName and newTypeName. If it doesn't work out, you'll get back NoSymbol.
      (Here ru is the universe alias from import scala.reflect.runtime.{universe => ru}.)
          scala> case class Foo(x: Int)
          defined class Foo
          scala> typeOf[Foo].member(ru.newTermName("x"))
          res64: reflect.runtime.universe.Symbol = value x
          scala> typeOf[Foo].member(ru.newTypeName("x"))
          res65: reflect.runtime.universe.Symbol = <none>
  21. Find the constructor
      scala.reflect.api.StandardNames provides standard term names as nme, available from your universe.
          scala> typeOf[Foo].member(nme.CONSTRUCTOR)
          res66: reflect.runtime.universe.Symbol = constructor Foo
          scala> res66.asMethod.isPrimaryConstructor
          res68: Boolean = true
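Once the constructor symbol is in hand, a class mirror can invoke it. The following is only a sketch, assuming Scala 2.10 runtime reflection and a class with a single constructor; `ConstructorSketch` and `build` are hypothetical.

```scala
import scala.reflect.runtime.universe._
import scala.reflect.runtime.currentMirror

object ConstructorSketch {
  // Hypothetical case class with exactly one constructor.
  case class Foo(x: Int)

  // Builds a Foo reflectively from its constructor symbol.
  def build(x: Int): Any = {
    // With a single constructor, the member lookup yields a MethodSymbol directly.
    val constructor = typeOf[Foo].member(nme.CONSTRUCTOR).asMethod
    // Class mirror for Foo, then a method mirror bound to that constructor.
    val classMirror = currentMirror.reflectClass(typeOf[Foo].typeSymbol.asClass)
    classMirror.reflectConstructor(constructor)(x)
  }
}
```

The result is typed as Any because the mirror does not know the static type of what it constructs; a caller would typically cast it back with `asInstanceOf`.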
  22. Trees
      Trees (ASTs) are the foundation of Scala's abstract syntax for representing code.
      The parser creates an untyped tree structure that is immutable except for Position, Symbol and Type. A later stage of the compiler then fills in this information.
  23. From tree to Scala signature
          $ scalac -Xshow-phases
          phase name  id  description
          ----------  --  -----------
              parser   1  parse source into ASTs, perform simple desugaring
               namer   2  resolve names, attach symbols to named trees
               typer   4  the meat and potatoes: type the trees
             pickler   8  serialize symbol tables
      • The parser creates trees
      • The namer fills in tree symbols, creates completers
      • The typer computes types for trees
      • The pickler serializes symbols along with types into a ScalaSignature annotation
  24. Make it so
      reify takes a Scala expression and converts it into a tree.
      When you use reify to create a tree, it is hygienic: once the identifiers in the tree are bound, the meaning cannot later change.
      The return type of reify is Expr, which wraps a typed tree with its TypeTag and some methods like splice for transforming trees.
  25. Creating a tree
          scala> reify{ object MyOps { def add(a: Int, b: Int) = a + b } }.tree
          res15: reflect.runtime.universe.Tree =
          {
            object MyOps extends AnyRef {
              def <init>() = {
                super.<init>();
                ()
              };
              def add(a: Int, b: Int) = a.$plus(b)
            };
            ()
          }
  26. Inspecting the raw tree
      Once you've reified an expression using the macro universe, you can use showRaw to show the raw tree, which you can use in a macro:
          scala> showRaw(reify{ object MyOps { def add(a: Int, b: Int) = a + b } })
          res16: String = Expr(Block(List(ModuleDef(Modifiers(), newTermName("MyOps"), Template(List(Ident(newTypeName("AnyRef"))), emptyValDef, List(DefDef(Modifiers(), nme.CONSTRUCTOR, List(), List(List()), TypeTree(), Block(List(Apply(Select(Super(This(tpnme.EMPTY), tpnme.EMPTY), nme.CONSTRUCTOR), List())), Literal(Constant(())))), DefDef(Modifiers(), newTermName("add"), List(), List(List(ValDef(Modifiers(PARAM), newTermName("a"), Ident(scala.Int), EmptyTree), ValDef(Modifiers(PARAM), newTermName("b"), Ident(scala.Int), EmptyTree))), TypeTree(), Apply(Select(Ident(newTermName("a")), newTermName("$plus")), List(Ident(newTermName("b"))))))))), Literal(Constant(()))))
  27. Scala ToolBox: compile at runtime
      Runtime classloader mirrors can create a compilation toolbox whose symbol table is populated by that mirror.
      Want a tree? Use ToolBox#parse to turn a string of code representing an expression into an AST.
      Have a tree? Use ToolBox#eval to spawn the compiler, compile in memory, and launch the code.
      For more, see also this StackOverflow discussion.
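A minimal sketch of the parse-then-eval round trip. It assumes Scala 2.10 with scala-compiler on the classpath in addition to the reflection library; `ToolBoxSketch` and `evaluate` are hypothetical names.

```scala
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox

object ToolBoxSketch {
  // A toolbox whose symbol table is populated by the current classloader mirror.
  private val toolbox = currentMirror.mkToolBox()

  // Parses a string of Scala code into a tree, then compiles and runs it in memory.
  def evaluate(code: String): Any = toolbox.eval(toolbox.parse(code))
}
```

For example, `ToolBoxSketch.evaluate("1 + 2")` parses the expression, compiles it and returns the result as Any. Each call spins up compilation work, so this is better suited to experiments than to hot paths.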
  28. Type erasure: fighting the good fight
          $ scalac -Xshow-phases
          phase name  id  description
          ----------  --  -----------
             erasure  16  erase types, add interfaces for traits
      When you inspect types at runtime, you will be missing some of the type information that was available to the compiler during stages before the JVM bytecode was generated.
      If you want to mine types out of options, collections and parameterized classes, you need to ask the compiler to stash the type information where you'll be able to get to it at runtime.
  29. Across the river
      What ferries compiler type information to runtime?
      Before 2.10: Manifest[T]
      After 2.10: TypeTag[T]
      Ask the compiler to generate this information using:
      – an implicit parameter of type Manifest or TypeTag
      – a context bound on a type parameter of a method or a class
      – the methods manifest[T] or typeTag[T]
  30. Before Scala 2.10: manifests
      The manifest is a shim where the compiler stores type information, which is used to later provide runtime access to the erased type as a Class instance.
          scala> case class A[T : Manifest](t: T) { def m = manifest[T] }
          defined class A
          scala> A("test").m
          res26: Manifest[java.lang.String] = java.lang.String
          scala> A(1).m
          res27: Manifest[Int] = Int
  31. Scala 2.10: type tag
      Mirabile visu: instead of getting back a manifest, we get back an actual type.
          scala> case class A[T : TypeTag](t: T) { def tpe = typeOf[T] }
          defined class A
          scala> A("test").tpe
          res19: reflect.runtime.universe.Type = String
          scala> A(1).tpe
          res20: reflect.runtime.universe.Type = Int
  32. Type arguments: before Scala 2.10
      Using manifests:
          scala> A(Map.empty[String, A[Int]]).m.erasure
          res5: java.lang.Class[_] = interface scala.collection.immutable.Map
          scala> A(Map.empty[String, A[Int]]).m.typeArguments
          res6: List[scala.reflect.Manifest[_]] = List(java.lang.String, A[Int])
  33. Type arguments: Scala 2.10
      The parameterized types are now a list of types:
          scala> A(Map.empty[String, A[Int]]).tpe.erasure
          res17: reflect.runtime.universe.Type = scala.collection.immutable.Map[_, Any]
          scala> A(Map.empty[String, A[Int]]).tpe match { case TypeRef(_, _, args) => args }
          res18: List[reflect.runtime.universe.Type] = List(String, A[Int])
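The same idea can be packaged as a small helper that uses a TypeTag context bound to recover type arguments that erasure would otherwise discard. This is a sketch assuming Scala 2.10 runtime reflection; `TypeArgsSketch` and `typeArgsOf` are hypothetical names.

```scala
import scala.reflect.runtime.universe._

object TypeArgsSketch {
  // The context bound asks the compiler to stash a TypeTag[T] at each call site.
  def typeArgsOf[T: TypeTag](value: T): List[Type] = typeOf[T] match {
    // A type reference carries its type arguments as a list of types.
    case TypeRef(_, _, args) => args
    // Anything that is not a plain type reference: report no arguments.
    case _ => Nil
  }
}
```

For instance, `TypeArgsSketch.typeArgsOf(Map.empty[String, Int])` yields the types for String and Int, while a non-parameterized value such as a plain String yields an empty list. The `value` parameter exists only so the compiler can infer T.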
  34. Sadly…
      The runtime reflection API is not currently thread safe.
      Keep an eye on this issue for developments.
      The reflection used in macros is not affected.
  35. Reflection tools
      The Scala REPL has a magnificent :power mode which is not well explained. Examine its underpinnings, and get more details by using scalac to compile small test files: start by playing around with the -Xprint compiler option.
  36. sbt project
      To use Scala 2.10 reflection:
          libraryDependencies <+= (scalaVersion)("org.scala-lang" % "scala-compiler" % _)
      To use pickled Scala signatures:
          libraryDependencies <+= scalaVersion("org.scala-lang" % "scalap" % _)
  37. Macros in the wild
      • Spire: a numeric library for Scala (examples of macros and specialization)
      • Sherpa: a serialization toolkit and 'reflection-less' case class mapper for Scala
      • sqlτyped: a macro which infers Scala types by analysing SQL statements
  38. Things to read, things to watch
      • Martin Odersky's Lang-NEXT 2012 keynote, Reflection and compilers
      • Paul Phillips's ScalaDays 2012 presentation, Inside the Sausage Factory: scalac internals
      • Eugene Burmako's Metaprogramming in Scala
      • Daniel Sobral's blog posts on JSON serialization with reflection in Scala (Part I / Part II)
      • StackOverflow posts tagged with Scala 2.10 reflection
      • Scala issue tracker reflection tickets contain detailed discussion and useful links
  39. Thanks to…
      • Eugene Burmako (@xeno_by), not only for many helpful StackOverflow posts, but also for his comments on these slides
      Follow me on Twitter for more interesting presentations: @prasinous