Scala collections wizardry - Scalapeño
© All Rights Reserved

Presentation Transcript

  • Scala collections. Sagie Davidovich, @mesagie, singularityworld.com, linkedin.com/in/sagied
  • Warm up example: Fibonacci sequence
      val fibs: Stream[Int] = 0 #:: fibs.scanLeft(1)(_ + _)
    Key concepts:
      • Recursive values
      • Streams
      • Scan
      • Binary placeholder notation
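To see what the warm-up one-liner produces, here is a minimal sketch. It assumes Scala 2.13 or later, where `LazyList` is the successor of the `Stream` type shown on the slide; the object name `FibDemo` and the value names are illustrative only.

```scala
object FibDemo {
  // scanLeft emits every intermediate accumulator, starting with the seed.
  val runningTotals: List[Int] = List(1, 2, 3, 4).scanLeft(0)(_ + _)

  // Same definition as on the slide, with LazyList standing in for Stream
  // (an assumption about the Scala version in use).
  // The value refers to itself; lazy evaluation of the tail makes that safe.
  val fibs: LazyList[Int] = 0 #:: fibs.scanLeft(1)(_ + _)

  // Only the requested prefix of the infinite sequence is ever computed.
  val firstTen: List[Int] = fibs.take(10).toList
}
```

Here `_ + _` is the binary placeholder notation: each underscore stands for one of the two arguments of the function passed to `scanLeft`.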
  • Immutable collections. You’ll learn about:
      • Avoiding memory allocation for empty collections
      • Optimizing for small collections
      • The equals-hashCode contract
      • Asymptotic behavior of common operations
  • Nil: List.empty and Nil are singletons. No new memory is allocated.
  • Option[A]
  • Immutable Sets – emptySet: emptySet is a singleton too
  • Immutable Sets – Set1: optimized for sets of size 1
  • Immutable Sets – Set2: optimized for sets of size 2
  • Immutable Sets – Set4: once a fifth element is added, a HashSet is (finally) instantiated
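The small-collection optimizations above can be observed directly. The following is a sketch rather than a specification: it relies on the concrete classes `Set.Set1`, `Set.Set4` and `HashSet` in `scala.collection.immutable`, which are implementation details of the standard library, and the object name `SmallCollections` is made up for the example.

```scala
import scala.collection.immutable

object SmallCollections {
  // Empty collections are shared singletons: reference equality holds.
  val nilIsShared: Boolean = List.empty[Int] eq Nil
  val noneIsShared: Boolean = Option.empty[Int] eq None
  val emptySetIsShared: Boolean =
    (Set.empty[Int]: AnyRef) eq (Set.empty[String]: AnyRef)

  // Sets of one to four elements use dedicated classes without a hash table.
  val oneIsSet1: Boolean = Set(1).isInstanceOf[immutable.Set.Set1[_]]
  val fourIsSet4: Boolean = Set(1, 2, 3, 4).isInstanceOf[immutable.Set.Set4[_]]

  // Adding a fifth element switches to a real HashSet.
  val fiveIsHashSet: Boolean =
    (Set(1, 2, 3, 4) + 5).isInstanceOf[immutable.HashSet[_]]
}
```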
  • Immutable Collections
  • Mutable Collections
  • One liners
  • Computing a derivative
      def derivative(nums: Iterable[Double]) =
        nums.sliding(2)
            // sliding yields groups (not tuples), so use head and last
            .map(pair => pair.last - pair.head)
    What can be improved in this solution?
    Bonus question: change a few characters to find the max slope.
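One possible answer to the question on this slide, offered as a sketch and not as the speaker's intended solution: pair each element with its successor using `zip` and `drop(1)`, which avoids the one-element window that `sliding(2)` returns for a single-element input. The names `Derivative` and `maxSlope` are illustrative.

```scala
object Derivative {
  // Pair each element with its successor; drop(1) is safe on empty input.
  def derivative(nums: Iterable[Double]): List[Double] =
    nums.zip(nums.drop(1)).map { case (a, b) => b - a }.toList

  // Largest difference between neighbours, or None when there are no pairs.
  def maxSlope(nums: Iterable[Double]): Option[Double] =
    derivative(nums).reduceOption(_ max _)
}
```

Returning an `Option` from `maxSlope` keeps the empty and single-element cases explicit instead of throwing.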
  • Counting occurrences (histogram)
      "encyclopedia" groupBy identity mapValues (_.size)
    Result:
      Map(e -> 2, n -> 1, y -> 1, a -> 1, i -> 1, l -> 1, p -> 1, c -> 2, o -> 1, d -> 1)
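The same histogram can be written as a small reusable function. This sketch swaps `mapValues` for `map`, because in Scala 2.13 `mapValues` is deprecated and returns a lazy view; the names `Histogram` and `histogram` are chosen for the example.

```scala
object Histogram {
  // Group equal characters together, then count the size of each group.
  def histogram(s: String): Map[Char, Int] =
    s.groupBy(identity).map { case (ch, occurrences) => ch -> occurrences.length }
}
```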
  • Word n-grams
      val range = 1 to 3
      val text = "hello sweet world"
      val tokenize = (s: String) => s.split(" ")
      range flatMap (size => tokenize(text) sliding size)
    Result:
      Vector(Array(hello), Array(sweet), Array(world),
             Array(hello, sweet), Array(sweet, world),
             Array(hello, sweet, world))
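A variant of the n-gram one-liner, sketched under two assumptions that are not in the slide: tokens are kept in `List`s so that results compare structurally (arrays compare by reference), and windows shorter than the requested size are discarded. The names `NGrams` and `ngrams` are illustrative.

```scala
object NGrams {
  def ngrams(text: String, sizes: Range): Vector[List[String]] = {
    val tokens = text.split(" ").toList
    sizes.toVector.flatMap { n =>
      // sliding returns one short window when n exceeds the token count,
      // so keep only windows of exactly n tokens.
      tokens.sliding(n).filter(_.length == n).toVector
    }
  }
}
```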
  • Are all members of a greater than the corresponding members of b?
      val a = List(2, 3, 4)
      val b = List(1, 2, 3)

      // O(n^2) and not very elegant.
      (0 until a.size) forall (i => a(i) > b(i))

      // O(n), but creates tuples and a temporary list. Yet, more elegant.
      a zip b forall (x => x._1 > x._2)

      // Same as above, but doesn't create a temporary list (lazy).
      a.view zip b forall (x => x._1 > x._2)

      // O(n), without tuple or temporary list creation, and even more elegant.
      (a corresponds b)(_ > _)
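The variants above agree when both lists have the same length, but they differ otherwise: `zip` silently truncates to the shorter list, while `corresponds` also requires equal lengths. A small sketch (the object and value names are illustrative):

```scala
object Pairwise {
  val a = List(2, 3, 4)
  val b = List(1, 2, 3)
  val shorter = List(1, 2)

  // Same length: both approaches give the same answer.
  val viaZip: Boolean = a.zip(b).forall { case (x, y) => x > y }
  val viaCorresponds: Boolean = a.corresponds(b)(_ > _)

  // Different lengths: zip drops the unmatched element of a ...
  val zipIgnoresLength: Boolean = a.zip(shorter).forall { case (x, y) => x > y }
  // ... whereas corresponds reports a mismatch.
  val correspondsChecksLength: Boolean = a.corresponds(shorter)(_ > _)
}
```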
  • Strings are collections. How come?
      "abc".max
      @inline implicit def augmentString(x: String) = new StringOps(x)
      String <% StringOps <: StringLike <: IndexedSeqOptimized …
  • Complexity of collection operations
      • Linear:
          – Unary, O(n):
              • Mappers: map, collect
              • Reducers: reduce, foldLeft, foldRight
              • Others: foreach, filter, indexOf, reverse, find, mkString
          – Binary, O(n + m):
              • union, diff, and intersect
  • Immutable Collections time complexity
    (C = constant, eC = effectively constant, aC = amortized constant, L = linear, - = not supported)

                head  tail  apply  update  prepend  append
      List      C     C     L      L       C        L
      Stream    C     C     L      L       C        L
      Vector    eC    eC    eC     eC      eC       eC
      Stack     C     C     L      L       C        L
      Queue     aC    aC    L      L       L        C
      Range     C     C     C      -       -        -
      String    C     L     C      L       L        L
  • Mutable Collections time complexity
    (C = constant, aC = amortized constant, L = linear, - = not supported)

                     head  tail  apply  update  prepend  append  insert
      ArrayBuffer    C     L     C      C       L        aC      L
      ListBuffer     C     L     L      L       C        C       L
      StringBuilder  C     L     C      C       L        aC      L
      MutableList    C     L     L      L       C        C       L
      Queue          C     L     L      L       C        C       L
      ArraySeq       C     L     C      C       -        -       -
      Stack          C     L     L      L       C        L       L
      ArrayStack     C     L     C      C       aC       L       L
      Array          C     L     C      C       -        -       -
  • Bonus question What’s the complexity of Range.sum?
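As a hint toward the bonus question: the elements of an integer range form an arithmetic progression, so their sum has a closed form, n × (first + last) / 2. The sketch below applies that formula; it is an illustration under that assumption and not a copy of the library's code, and the names `RangeSum` and `closedFormSum` are made up.

```scala
object RangeSum {
  // Arithmetic series: n * (first + last) / 2, computed in Long to avoid overflow.
  // n * (first + last) is always even for an integer progression.
  def closedFormSum(r: Range): Long =
    if (r.isEmpty) 0L
    else r.length.toLong * (r.head.toLong + r.last.toLong) / 2
}
```

The formula uses only the length and the two end points, so its cost does not depend on how many elements the range contains.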
  • Range
  • Equals-hashCode contract
      (a equals b) ⇒ (a.hashCode == b.hashCode)
    All Scala collections implement the contract.
      Bad idea: Set[Array[Int]]
      Good idea: Set[Vector[Int]]
      Bad idea: Set[ArrayBuffer[Int]]
      Bad idea: Set[collection.mutable._]
      Good idea: Set[collection.immutable._]
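Why `Set[Array[Int]]` is a bad idea can be shown in a few lines: arrays are plain JVM arrays and compare by reference, while `Vector` compares element by element. A minimal sketch, with illustrative names:

```scala
object EqualityContract {
  // Two distinct arrays with the same contents are not equal to each other,
  // so the lookup misses.
  val arrayFound: Boolean = Set(Array(1, 2)).contains(Array(1, 2))

  // Vectors implement structural equals and hashCode, so the lookup succeeds.
  val vectorFound: Boolean = Set(Vector(1, 2)).contains(Vector(1, 2))

  // Arrays can still be compared by content explicitly.
  val arraysSameElements: Boolean = Array(1, 2).sameElements(Array(1, 2))
}
```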
  • More on collections equality
      val (a, b) = (1 to 3, List(1, 2, 3))
      a == b // true
    Q: Wait, how efficient is Range.hashCode?
    A: O(n)
      override def hashCode = util.hashing.MurmurHash3.seqHash(seq)
    Challenge yourself: is there a closed-form (O(1)) formula for a range's hashCode?
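Because a `Range` and a `List` with the same elements are equal, the contract from the previous slide forces their hash codes to agree as well. The sketch below checks exactly that; the object and value names are illustrative.

```scala
object SeqEquality {
  val range = 1 to 3
  val list = List(1, 2, 3)
  val vector = Vector(1, 2, 3)

  // Sequences compare by elements, regardless of the concrete class.
  val rangeEqualsList: Boolean = range == list
  val vectorEqualsList: Boolean = vector == list

  // Equal values must report equal hash codes.
  val sameHash: Boolean =
    range.hashCode == list.hashCode && list.hashCode == vector.hashCode
}
```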
  • Java interoperability
    Implicit (less boilerplate):
      import collection.JavaConversions._
      javaCollection.filter(…)
    Explicit (better control):
      import collection.JavaConverters._
      javaCollection.asScala.filter(…)
      scalaCollection.asJava
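A runnable sketch of the explicit style. It assumes Scala 2.13 or later, where the same `asScala` and `asJava` extension methods are provided by `scala.jdk.CollectionConverters` (the implicit `JavaConversions` import was removed in 2.13). The object and value names are illustrative.

```scala
import java.util.{ArrayList => JArrayList, List => JList}
import scala.jdk.CollectionConverters._

object JavaInterop {
  // A plain Java collection, built the Java way.
  val javaList: JList[Int] = {
    val l = new JArrayList[Int]()
    l.add(1); l.add(2); l.add(3)
    l
  }

  // asScala wraps the Java list (no copy), so Scala operations become available.
  val evens: List[Int] = javaList.asScala.filter(_ % 2 == 0).toList

  // asJava wraps a Scala collection for use from Java APIs.
  val backToJava: JList[Int] = List(1, 2, 3).asJava
}
```

Keeping the conversion explicit makes it visible at the call site where a wrapper is created.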
  • The power of type-level programming: graph path-finding at compile time
      import scala.language.implicitConversions

      // Vertices
      case class A(l: List[Char])
      case class B(l: List[Char])
      case class C(l: List[Char])
      case class D(l: List[Char])
      case class E(l: List[Char])

      // Edges
      implicit def ad[A1 <% A](x: A1) = D(x.l :+ 'A')
      implicit def bc[B1 <% B](x: B1) = C(x.l :+ 'B')
      implicit def ce[C1 <% C](x: C1) = E(x.l :+ 'C')
      implicit def ea[E1 <% E](x: E1) = A(x.l :+ 'E')

      def pathFrom(end: D) = end

      pathFrom(B(Nil)) // res0: D = D(List(B, C, E, A))
  • Want to go Pro?
      • Shapeless (Miles Sabin)
          – Polytypic programming & heterogeneous lists
          – github.com/milessabin/shapeless
      • Scalaxy (Olivier Chafik)
          – Macros for boosting performance of collections
          – github.com/ochafik/Scalaxy