Tales About Scala Performance


A session given at the Scalapeño 2013 conference.

Published in: Technology
  1. Tales About Scala Performance © Copyright Performize-IT LTD.
  2. About Me. My name: Haim Yadid. Hard to pronounce; luckily, it is meaningful: Haim => Life, Yadid => Friend. Hybrid nick: lifey. this :: :: :: :: :: :: Nil
  3. Performize-IT
  4. Performize-IT: optimizing software since 2007. Performance bottlenecks, OutOfMemory, crashes, concurrency, GC tuning, training and mentoring.
  5. Contact Me: lifey@performize-it.com, blog.performize-it.com, www.performize-it.com, https://github.com/lifey, http://il.linkedin.com/in/haimyadid, @lifeyx
  6. Once Upon A Time
  7. Benchmarks by Google. So we are done.
  8. So what is this talk about? Best practices. Micro benchmarks? Understanding.
  9. Understand how to find performance problems, how to solve them, and how to reach a well-performing production system. Prerequisites: familiarity with the JVM and basic knowledge of Scala.
  10. Performance is all about methodology: monitoring, hotspots, isolation, analysis, solution. Tools are your best friends for this task.
  11. Scala Runs on the JVM. All JVM capabilities and tools still apply. Take your best friends with you.
  12. Premature Optimization: "I shall not optimize prematurely" (repeated many times on the slide).
  13. Monitoring the JVM. Java Management Extensions (JMX): on the same machine (attach), or remotely via command-line parameters. Tools: JConsole, JVisualVM, Mission Control.
  14. Remote Monitoring - JMX. Add these parameters to the command line of the profiled application:

          -Dcom.sun.management.jmxremote
          -Dcom.sun.management.jmxremote.port=<port>
          -Dcom.sun.management.jmxremote.authenticate=false
          -Dcom.sun.management.jmxremote.ssl=false

      Authentication and security are recommended; refer to http://java.sun.com/j2se/1.5.0/docs/guide/management/agent.html [Production]
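
The figures that JConsole or VisualVM read over JMX are also available from inside the process through the platform MXBeans. The following is a minimal sketch, assuming you only want heap numbers; the object and method names (`HeapSnapshot`, `heapUsage`) are illustrative and not part of the talk.

```scala
import java.lang.management.ManagementFactory

object HeapSnapshot {
  // Returns (used, committed, max) heap sizes in bytes.
  // max is -1 when the JVM does not define a maximum.
  def heapUsage(): (Long, Long, Long) = {
    val heap = ManagementFactory.getMemoryMXBean.getHeapMemoryUsage
    (heap.getUsed, heap.getCommitted, heap.getMax)
  }

  def main(args: Array[String]): Unit = {
    val (used, committed, max) = heapUsage()
    println(s"heap used=$used committed=$committed max=$max")
  }
}
```

The same bean is what a remote JMX client sees, so this can serve as a quick sanity check before the remote connection is set up.
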
  15. A Tale about a Stack
  16. Your First Scala Function. Functional programming: recursion, easy to understand. Your first program in Scala will probably look like this:

          // A recursive method needs an explicit result type.
          def sumOfSquares(st: Int, end: Int): Int = {
            if (st > end) 0 else st * st + sumOfSquares(st + 1, end)
          }

  17. And your first exception will be:

          java.lang.StackOverflowError
            at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:8)
            at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
            at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
            at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
            at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
            at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)

  18. Tail Recursion. The recursive call to the function must be the value returned. The following is not tail recursive, because the multiplication happens after factorial returns:

          if (number == 1) 1 else number * factorial(number - 1)

  19. Favor tail recursion. The JVM does not optimize recursion: every iteration costs an extra call, and recursion depth is limited. The Scala compiler can optimize tail recursion!

          import scala.annotation.tailrec

          // The method must not be overridable: declare it in an object, or make it final or private.
          @tailrec
          def sumOfSquares(st: Int, end: Int, sum: Int = 0): Int = {
            if (st > end) sum else sumOfSquares(st + 1, end, sum + st * st)
          }

  20. @tailrec Annotation. A compile-time directive: compilation fails if the tail-recursion optimization cannot be applied. Use it wherever tail recursion is mandatory for performance or correctness.
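
To compare the two shapes side by side, here is a small self-contained sketch. It assumes a `Long` accumulator (a choice made for this example, since the sum of squares overflows `Int` for large ranges), and the names `TailRecDemo` and `sumOfSquaresNaive` are illustrative.

```scala
import scala.annotation.tailrec

object TailRecDemo {
  // Not tail recursive: the addition runs after the recursive call returns,
  // so every step keeps its own stack frame.
  def sumOfSquaresNaive(st: Int, end: Int): Long =
    if (st > end) 0L else st.toLong * st + sumOfSquaresNaive(st + 1, end)

  // Tail recursive: the recursive call is the last action, so scalac
  // compiles it to a loop. @tailrec makes the compiler verify that.
  @tailrec
  def sumOfSquares(st: Int, end: Int, sum: Long = 0L): Long =
    if (st > end) sum else sumOfSquares(st + 1, end, sum + st.toLong * st)

  def main(args: Array[String]): Unit = {
    println(sumOfSquares(1, 1000000))
    // sumOfSquaresNaive(1, 1000000) is likely to overflow a default-sized stack.
  }
}
```

Moving `@tailrec` onto `sumOfSquaresNaive` turns the hidden stack risk into a compile error, which is the point of the annotation.
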
  21. Stack Size. The default ranges from 256k to 1024k, depending on platform and JVM version. What is it on your system?

          java -XX:+PrintFlagsFinal -version |& grep ThreadStackSize

      Tune the thread stack to your needs, for example: -Xss1312k [Production]
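
When only one computation needs a deep stack, an alternative to raising `-Xss` for every thread is to run that computation on a dedicated thread with its own stack size, using the four-argument `java.lang.Thread` constructor. This is a sketch under the assumption that the JVM honors the requested size (the Javadoc describes it as a hint); `DeepRecursionThread`, `runWithStack` and `depth` are names made up for this example.

```scala
object DeepRecursionThread {
  // Deliberately not tail recursive, so each level consumes a stack frame.
  def depth(n: Int): Int = if (n == 0) 0 else 1 + depth(n - 1)

  // Runs `body` on a new thread created with the requested stack size in bytes.
  // The JVM treats the size as a hint: it may round it, or ignore it on some platforms.
  def runWithStack[A](stackBytes: Long)(body: => A): A = {
    var result: Option[A] = None
    val worker = new Thread(null, new Runnable {
      def run(): Unit = { result = Some(body) }
    }, "deep-recursion", stackBytes)
    worker.start()
    worker.join() // join() also makes the worker's write to `result` visible here
    result.getOrElse(sys.error("worker thread failed, probably with StackOverflowError"))
  }

  def main(args: Array[String]): Unit =
    println(runWithStack(64L * 1024 * 1024)(depth(100000)))
}
```
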
  22. Stacks in Scala. A Scala stack is just like a Java stack, and jstack is your best friend. Scala names may be obscured by name mangling, e.g. the List cons class :: will look like $colon$colon.
  23. JStack. Part of the JDK. Dumps stack traces of all live threads. Synopsis: jstack -l <pid>. Use it to get a snapshot of program activity or to detect deadlocks.
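
A similar snapshot can be taken from inside the application through `ThreadMXBean`, which is handy when attaching jstack is not possible. This is a rough sketch, not a replacement for jstack: `ThreadInfo.toString` prints only the first few frames of each stack, and the names `ThreadDump`, `dump` and `deadlocked` are illustrative.

```scala
import java.lang.management.ManagementFactory

object ThreadDump {
  private val threads = ManagementFactory.getThreadMXBean

  // One formatted entry per live thread, including held monitors and
  // ownable synchronizers where the JVM supports reporting them.
  def dump(): Seq[String] =
    threads.dumpAllThreads(
      threads.isObjectMonitorUsageSupported,
      threads.isSynchronizerUsageSupported
    ).map(_.toString).toSeq

  // IDs of deadlocked threads; empty when there is no deadlock.
  // findDeadlockedThreads returns null in that case, hence the Option wrapper.
  def deadlocked(): Seq[Long] =
    Option(threads.findDeadlockedThreads()).map(_.toSeq).getOrElse(Seq.empty)

  def main(args: Array[String]): Unit = {
    dump().foreach(println)
    println(s"deadlocked thread ids: ${deadlocked().mkString(", ")}")
  }
}
```
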
  24. Takipi's Stackifier: www.stackifier.com
  25. Humpty Dumpty sat on a heap, Humpty Dumpty had an OutOfMemory flip. All the king's horses and all the king's men couldn't put Humpty together again. (Figure: heap chart with the labels Max and Used Heap.)
  26. In a Perfect World... The heap (or PermGen) is depleted: -XX:+HeapDumpOnOutOfMemoryError. Scala code does not have a larger memory footprint. Scala code may have a larger PermGen footprint. [Production]
  27. MAT (Memory Analyzer Tool). A very powerful tool for analyzing heap dumps. Use it to investigate memory leaks, OutOfMemory errors, and memory footprint. Alternatives: YourKit, JProbe, JProfiler (commercial); VisualVM, jhat (JDK).
  28. MAT-name-resolver. An add-on for MAT that helps it understand Scala. Developed by Iulian Dragos from Typesafe. GitHub project: https://github.com/dragos/MAT-name-resolver
  29. List[Int] ?
  30. OutOfMemory: Perm Space. Class bytecode resides in PermGen. Scala will use more perm space: a small piece of code can generate a lot of bytecode.
  31. @ScalaSignature. @ScalaSignature(bytes="... Metadata needed for reflection and compilation. Results in larger class files.
  32. More classes. Each closure is actually a JVM class. Implicit conversions are classes. Companion objects are also classes.
  33. Well. The Scala source:

          object ClosureExample extends App {
            val f = (x: Int) => x*x
            println (s"closure ${f(5)}");
          }

      compiles to four class files in package com.performizeit.scalapeno.demos: ClosureExample.class, ClosureExample$.class, ClosureExample$$anonfun$1.class and ClosureExample$delayedInit$body.class. The slide shows the decompiled Java of all four. The closure itself becomes:

          public final class ClosureExample$$anonfun$1 extends AbstractFunction1.mcII.sp implements Serializable {
            public static final long serialVersionUID = 0L;
            public final int apply(int x) { return apply$mcII$sp(x); }
            public int apply$mcII$sp(int x) { return x * x; }
          }

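
A quick way to see what a closure turns into on your own compiler is to print its runtime class. This is a minimal sketch with an illustrative object name (`ClosureClassDemo`). The printed name depends on the compiler version: compilers of the era of this talk emit an `$$anonfun$` class file per closure, while Scala 2.12 and later compile lambdas through the JVM's invokedynamic support, so no separate class file is written for each one.

```scala
object ClosureClassDemo {
  val square: Int => Int = (x: Int) => x * x

  def main(args: Array[String]): Unit = {
    // The runtime class that backs the closure; its name varies by compiler version.
    println(square.getClass.getName)
    println(s"closure ${square(5)}")
  }
}
```
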
  34. @specialized. Generics are implemented by type erasure. For primitive types this means boxing/unboxing, a performance hit, and a larger memory footprint. The @specialized annotation enables specialized implementations.
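
As a small illustration of the annotation, the sketch below specializes a container for two primitive types only, which keeps the number of generated variants low. The names `Box` and `SpecBox` are invented for this example, and the behavior described assumes a Scala 2 compiler.

```scala
object SpecializedDemo {
  // Plain generic container: after erasure the field is an Object,
  // so an Int stored here is boxed.
  class Box[A](val value: A)

  // Asks scalac to also generate Int and Double variants that store the
  // primitive directly. Listing the types limits how many classes are emitted.
  class SpecBox[@specialized(Int, Double) A](val value: A)

  def main(args: Array[String]): Unit = {
    val plain = new Box(42)
    val spec  = new SpecBox(42)
    // Printing the runtime classes shows whether a specialized variant was picked.
    println(plain.getClass.getName)
    println(spec.getClass.getName)
    println(plain.value + spec.value)
  }
}
```
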
  35. What about the code cache? The code cache holds the optimized machine code produced by the JIT compiler, and it should be large enough to hold it. If you need more PermGen, you may need more code cache: -XX:ReservedCodeCacheSize=<size>. Monitor it via JMX. [Production]
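
The code cache and the class metadata area are exposed over JMX as non-heap memory pools, so their usage can be read the same way a monitoring tool would. The following is a sketch; `NonHeapPools` is an illustrative name, and pool names differ between JVM versions (for example a "Perm Gen" pool on older HotSpot releases and "Metaspace" on newer ones), so the code filters by pool type rather than by name.

```scala
import java.lang.management.{ManagementFactory, MemoryType}

object NonHeapPools {
  // (pool name, used bytes, max bytes) for each valid non-heap pool.
  // max is -1 when the JVM does not define a limit for the pool.
  def nonHeapPools(): Seq[(String, Long, Long)] = {
    val pools = ManagementFactory.getMemoryPoolMXBeans
    (0 until pools.size)
      .map(i => pools.get(i))
      .filter(p => p.getType == MemoryType.NON_HEAP && p.isValid)
      .map { p =>
        val usage = p.getUsage
        (p.getName, usage.getUsed, usage.getMax)
      }
  }

  def main(args: Array[String]): Unit =
    nonHeapPools().foreach { case (name, used, max) =>
      println(s"$name: used=$used max=$max")
    }
}
```
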
  36. @specialized Nightmare.

          class SpecializeNightmare {
            trait S1[@specialized A, @specialized B] {
              def f(p1: A): Unit
            }
          }

      Generates 165 classes. Don't try with 3, 4, or 5 type parameters.
  37. OutOfMemory: Perm Gen Space. Congrats, you have a PermGen OOM: -XX:MaxPermSize=1024m (or -J-XX:MaxPermSize=1024m if you use the scala command line). [Production]
  38. Oh dear! Oh dear! I shall be too late!
  39. -optimise. A scalac command-line parameter. Performs bytecode optimizations: inlining, boxing/unboxing elimination, etc. Improves performance at the cost of slower compilation. [Production]
  40. Inlining. Scala uses the information it has at compile time to know which methods can be inlined. It can do a better job than the JVM. Automatic when you use -optimise. [Production]
  41. Inlining Visibility: the Scala compiler level. Add -Ylog:inline to see what was inlined:

          scalac -optimise -Ylog:inline -d ../bin com/performizeit/scalapeno/demos/ClosureExampleInline.scala |& grep inlined
          [log inliner] inlined com.performizeit.scalapeno.demos.ClosureExampleInline.<init> // 1 inlined: com.performizeit.scalapeno.demos.ClosureExampleInline.delayedInit
          [log inliner] inlined com.performizeit.scalapeno.demos.ClosureExampleInline$$anonfun$f$1.apply // 1 inlined: com.performizeit.scalapeno.demos.anonfun$f$1.apply$mcII$sp

  42. Inlining Visibility: the JVM. JIT compiler options, not recommended for production: -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining [! Prod]
  43. @inline. You may direct the compiler to inline a method. Usually you will not need it: the compiler or the JVM will do it anyway. No real need to clutter the code....

          @inline final def f = (x: Int) => x*x

  44. Member accessors. Scala generates getters for val fields, and getters and setters for var fields. Will you pay for this? Nope! The JVM inlines accessor methods (by default). If you insist on the penalty: -XX:-UseFastAccessorMethods
  45. Parallel Collections: ParArray, ParVector, mutable.ParHashMap, mutable.ParHashSet, immutable.ParHashMap, immutable.ParHashSet, ParRange, ParTrieMap.
  46. Parallel Collections. Very easy to use; behind the scenes they run on the Fork/Join framework (Java 6). Apply only where a location is a hotspot and parallelization is proven to improve it. Dangerous when the code has side effects or the operation is non-associative. Example:

          val v = Vector(Range(0,10000000)).flatten
          v.par.map(_ + 1)

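
Why a non-associative operation is dangerous can be shown without any parallelism at all. The sketch below imitates, sequentially, what a parallel reduce is free to do: reduce chunks independently and then combine the partial results. It is an illustration only, not the library's actual scheduling; `chunkedReduce` and `chunkSize` are hypothetical names introduced here.

```scala
object AssociativityCheck {
  // Reduces each chunk on its own, then reduces the partial results.
  // A real parallel collection picks its own chunk boundaries at run time.
  def chunkedReduce(xs: Vector[Int], chunkSize: Int)(op: (Int, Int) => Int): Int =
    xs.grouped(chunkSize).map(_.reduce(op)).reduce(op)

  def main(args: Array[String]): Unit = {
    val v = (1 to 8).toVector
    // Addition is associative: grouping does not change the result.
    println(v.reduce(_ + _) == chunkedReduce(v, 3)(_ + _)) // true
    // Subtraction is not: the same data gives a different answer once it is chunked.
    println(v.reduce(_ - _) == chunkedReduce(v, 3)(_ - _)) // false
  }
}
```

If an operation fails this kind of check, its result under `.par` depends on how the work happens to be split, so it should stay sequential.
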
  47. Profiler - JVisualVM. Part of the JDK. A profiler. Use it to identify hotspots and to analyze memory allocation bottlenecks. Alternatives (commercial): YourKit, JProbe, JProfiler.
  48. Sampling vs Instrumentation. Sampling: sample application threads and stack traces to get statistics. Instrumentation: modify bytecode to record times and invocation counts.
  49. Scala Stacks revisited.

          while (true) {
            var a = List(Range(0,1000)).flatten
            // println(a)
            for (i <- 1 to 10) {
              a = a :+ i
              println(a.last)
            }
          }

  50. Garbage Collection
  51. Immutability. Immutability may cause more object allocation, but that is not necessarily a performance hit. Short-lived objects: the GC handles them efficiently. Escape analysis. Parallelization!!!
  52. VisualVM (allocation hotspots). Find locations where large amounts of bytes or large numbers of objects are being allocated.
  53. Large (im)mutable state. You have a huge graph which changes gradually. It eventually ends up in the Old Generation. A small change may have a huge impact on the state. That may screw up GC.
  54. GC Visibility. GC can be partially visualized through JMX. The best way to get the whole picture is GC logs:

          -Xloggc:<log file name> -XX:+PrintGCDetails -XX:+PrintGCDateStamps

      Java 7 supports a "rolling appender":

          -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=<#files> -XX:GCLogFileSize=<number>M

      [Prod]
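
The partial view that JMX gives consists of per-collector counters, which can be sampled in-process as sketched below. `GcCounters` and `snapshot` are illustrative names, and collector names depend on the JVM and the GC algorithm in use. These counters only show totals; pause-by-pause detail still needs the GC log.

```scala
import java.lang.management.ManagementFactory

object GcCounters {
  // (collector name, collection count, accumulated collection time in ms).
  // A value of -1 means the JVM does not report that figure for the collector.
  def snapshot(): Seq[(String, Long, Long)] = {
    val collectors = ManagementFactory.getGarbageCollectorMXBeans
    (0 until collectors.size).map { i =>
      val gc = collectors.get(i)
      (gc.getName, gc.getCollectionCount, gc.getCollectionTime)
    }
  }

  def main(args: Array[String]): Unit =
    snapshot().foreach { case (name, count, millis) =>
      println(s"$name: collections=$count time=${millis}ms")
    }
}
```

Sampling this twice and subtracting gives the number of collections and the GC time spent in the interval between the two samples.
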
  55. GCViewer. Analyzes GC logs. Use when you experience GC problems. Is GC efficient (throughput)? Does GC stop the application (pause time)? Alternatives: Censum (commercial).
  56. And They Lived Happily Ever After
  57. slides /: (_ + _). Don't be afraid of Scala. You will be able to optimize large-scale apps. Optimize where needed. You need to (Java =>) Scala yourself. ATM: know Java to optimize Scala.
  58. Q&A
  59. The End