
Pharo VM Performance

Talk from Pharodays 2017.
Video: https://www.youtube.com/watch?v=lRQ447XI6Hc

Published in: Software

  1. Pharo VM performance, Clément Béra
  2. Myself • Clément Béra • 2011-2013: Engineer on the Pharo VM • 2013-2017: PhD student • Optimisations of the Pharo VM JIT compiler
  3. [Chart: Binary tree benchmark across VM generations: Interpreter (2005), Stack (2009), Cog V1 (2010), Cog V2 (2011), Spur (2014), Sista (future); axis 0 to 16]
  4. [Chart: Binary tree benchmark across VM generations: Interpreter (2005), Stack (2009), Cog V1 (2010), Cog V2 (2011), Spur (2014), Sista (future); axis 0 to 16; annotation: Pharo 5 (2016)]
  5. Plan • Pharo 5 (stable): first time we outperformed most competitors on benchmarks • Pharo 6 (to be released next week?) • Pharo 7
  6. Code execution • GC
  7. GC • Pharo 5: new memory manager (Spur) • Pharo 6: new compactor • Pharo 7: incremental GC?
  8. Pharo 5: Spur • Efficient scavenges • In most applications, most GC time is now spent in scavenges
  9. Pharo 6: New compactor • Loading a 200 MB Moose model in a 250 MB image, February vs. April • Total time: 2 min vs. 1 min 2 sec • Time in full GC: 1 min vs. 2 sec • Full GC average pause: 15 sec vs. 0.5 sec • Time in scavenge: 15 sec vs. 15 sec
  10. Pharo 6: New compactor • Loading a 200 MB Moose model in a 250 MB image, February vs. April • Total time: 2 min vs. 1 min 2 sec • Time in full GC: 1 min vs. 2 sec • Full GC average pause: 15 sec vs. 0.5 sec • Time in scavenge: 15 sec vs. 15 sec (GC tuning gets it down to 5 sec)
  11. Pharo 7: Incremental GC? • Full GC pauses: ~500 ms at ~500 MB • Java default GC: 200 ms, soft real time • Solution: incremental marking and incremental compaction
  12. Code execution • Pharo 5: Spur got 1.8x • Pharo 6: polishing and micro-optimisations • Pharo 7: Sista gets 1.5x-5x
  13. Pharo 5: Spur 1.8x • Class table speeds up look-up caches • New immediate objects • 22-bit hash
  14. Pharo 6 • Register allocation improvements • Two-path compilation • Frameless code for setter-like methods
  15. Sista: Pharo 7? • Program introspection • Speculate on types based on previous runs • Optimize frequently used code • Deoptimize and reoptimize incorrectly speculated code
  16. Goals • Program readability • Performance
  17. Program readability • 1 to: array size do: [ :i | (array at: i) yourself ]. • array do: [ :elem | elem yourself ]. • array do: #yourself.
  18. Program readability • 1 to: array size do: [ :i | (array at: i) yourself ]. • array do: [ :elem | elem yourself ]. • array do: #yourself. • [Chart: throughput of the loop forms; axis ticks 0, 2, 5, 20; data labels: 87M/sec, 28M/sec, 13M/sec, 3.7M/sec; 15M/sec, 21M/sec, 10M/sec, 3.9M/sec; 94M/sec, 40M/sec, 22M/sec, 6.5M/sec]
  19. Performance • [Chart: Sista vs. Pharo on the A*, ThreadRing, SpectralNorm, JSJSON, BinaryTree, DeltaBlue, Richards, TCAP and Kmeans benchmarks; axis 0 to 6]
  20. Getting stable • Supports most development workflows • Supports image recompilation • Integration has started
  21. In-image design • Smalltalk image: Scorch, the optimising JIT (CompiledCode to CompiledCode, Smalltalk-specific optimisations) • Virtual machine: Cogit, the baseline JIT (CompiledCode to native code, machine-specific optimisations) • CompiledCode is persisted across start-ups • Native functions are discarded on shut-down
  22. Missing • IDE support • Debugger • Methods to show • Stability, testing
  23. Are you interested? • Incremental GC? • VM performance? • VM features? • Come and talk to us!
  24. We are looking for… • Use-cases showing what to improve • Large real-world benchmarks • Contributors • Investment
  25. Conclusion • Pharo 5: Fastest VM • Pharo 6: Polishing • Pharo 7: Going further
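
The three loop forms shown under "Program readability" can be timed in a Pharo playground with a sketch like the one below. It is only an illustration, not the benchmark used in the talk: the array size, the repetition count and the variable names are assumptions, and it assumes timeToRun answers a Duration, as it does in recent Pharo versions. Results will vary by VM and machine.

```smalltalk
"Hypothetical micro-benchmark: array size and repetition count are arbitrary choices."
| array repetitions indexLoop blockLoop symbolLoop |
array := (1 to: 100000) asArray.
repetitions := 100.

"Form 1: explicit index loop."
indexLoop := [ repetitions timesRepeat: [
	1 to: array size do: [ :i | (array at: i) yourself ] ] ] timeToRun.

"Form 2: do: with a block."
blockLoop := [ repetitions timesRepeat: [
	array do: [ :elem | elem yourself ] ] ] timeToRun.

"Form 3: do: with a symbol, which is sent to each element."
symbolLoop := [ repetitions timesRepeat: [
	array do: #yourself ] ] timeToRun.

Transcript
	show: 'index loop: ', indexLoop printString; cr;
	show: 'block do:   ', blockLoop printString; cr;
	show: 'symbol do:  ', symbolLoop printString; cr.
```

Running the same snippet on different VMs gives a rough view of how much each form costs relative to the others, which is the comparison the slide charts.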
