Implementing a JavaScript Engine

Presentation done at JingJS, Nov 10, 2013.

  • Self 4.0 cannot run a block after its enclosing method has returned.
  • Chakra employs a conservative, quasi-generational, mark and sweep, garbage collector that does most of its work concurrently on a dedicated thread to minimize script execution pauses that would interrupt the user experience. (Source: http://blogs.msdn.com/b/ie/archive/2012/06/13/advances-in-javascript-performance-in-ie10-and-windows-8.aspx)

    1. 1. Implementing a JavaScript Engine Krystal Mok (@rednaxelafx) 2013-11-10
    2. 2. Implementing a (Modern, High Performance?) JavaScript Engine
    3. 3. About Me • Programming language and virtual machine enthusiast • Worked on the HotSpot JVM at Taobao and Oracle • Also worked on a JavaScript engine project • Twitter / Sina Weibo: @rednaxelafx • Blog: English / Chinese
    4. 4. Agenda • Know the Heritage • JavaScript Engine Overview • Implementation Strategies and Tradeoffs • A bit about Nashorn
    5. 5. KNOW THE HERITAGE: The roots of the JavaScript language and modern JavaScript engines
    6. 6. Heritage of the Language • Scheme: function closure • Self: prototype-based OO • Java: C-like syntax, built-in objects • … • All feeding into JavaScript
    7. 7. Language Comparison
        Self: • Prototype-based OO • Multiple Prototypes • Dynamically Typed • Dynamically Extend Objects • Mirror-based Reflection • Block (closure) • Supports Non-local Return • (pass an error handler to methods that might need one)
        JavaScript (ECMAScript 5): • Prototype-based OO • Single Prototype • Dynamically Typed • Dynamically Extend Objects • Reflection • First-class Function (closure) • (no non-local return) • Exception Handling
    8. 8. Heritage of the Language
        function MyPoint(x, y) {
          this.x = x;
          this.y = y;
        }
        MyPoint.prototype.distance = function (p) {
          var xd = this.x - p.x,
              yd = this.y - p.y;
          return Math.sqrt(xd*xd + yd*yd);
        };
        var p = new MyPoint(2013, 11);
    9. 9. Heritage of the Language
        traits myPoint = (|
          parent* = traits clonable.
          initX: newX Y: newY = (x: newX. y: newY)
          distance: p = (| xd. yd |
            xd: x - p x.
            yd: y - p y.
            (xd squared + yd squared) squareRooted ).
        |).
        myPoint = (|
          parent* = traits myPoint.
          x <- 0.
          y <- 0
        |).
        p: myPoint copy initX: 2013 Y: 11
    10. 10. Heritage of the Language on Self 4.4 / Mac OS X 10.7.5
    11. 11. Heritage of the VM • Self VM (Self) • Strongtalk VM (Smalltalk) • HotSpot VM (Java) • CLDC-HI (Java) • V8 (JavaScript)
    12. 12. What’s in common? • Lars Bak!
    13. 13. VM Comparison
        Self VM (3rd Generation): • Fast Object w/Hidden Class • Tiered Compilation – OSR and deoptimization – support for full-speed debugging • Type Feedback – Polymorphic Inline Caching • Type Inference • Method/Block Inlining • Method Customization • Generational Scavenging – Scavenging + Mark-Compact
        V8 (with Crankshaft): • Fast Object w/Hidden Class • Tiered Compilation – OSR and deoptimization – support for full-speed debugging • Type Feedback – Polymorphic Inline Caching • Type Inference • Function Inlining • Generational Scavenging – Scavenging + Mark-Sweep/Mark-Compact
    14. 14. To Implement a High Performance JavaScript Engine • Learn from Self VM as a basis!
    15. 15. Themes • Pay-as-you-go / Lazy • Take advantage of runtime information – Type feedback • Take advantage of actual code stability – Try to behave as static as possible
    16. 16. JAVASCRIPT ENGINE OVERVIEW: Outline of the main components
    17. 17. Components of a JavaScript Engine • Parser • Runtime • Execution Engine • Garbage Collector (GC) • Foreign Function Interface (FFI) • Debugger and Diagnostics
    18. 18. Components of a JavaScript Engine (diagram: Source Code → Parser → AST → Execution Engine • Memory (Runtime Data Areas): Call Stack, JavaScript Objects • GC • FFI ↔ host / external library)
    20. 20. Parser • Parse source code into internal representation • Usually generates AST • Example: var z = x + y becomes VarDecl: z, whose initializer is BinaryArith: + with operands x and y
    22. 22. Runtime • Value Representation • Object Model • Built-in Objects • Misc. (diagram: the Object and Function built-ins, their prototype objects, and an instance with x = 2013 and y = 11, linked through __proto__, prototype, and constructor references; the chain ends at null)
    24. 24. Execution Engine • Execute JavaScript Code • Example: the AST VarDecl: z / BinaryArith: + (x, y) executes as machine code such as addl %rcx, %rax
    26. 26. Garbage Collector • Collect memory from unused objects
    28. 28. Foreign Function Interface • Handles interaction between JavaScript and “the outside world” • JavaScript calls out to native functions • Native functions call into JavaScript functions, or access JavaScript objects
    30. 30. Debugger and Diagnostics
    31. 31. IMPLEMENTATION STRATEGIES AND TRADEOFFS
    32. 32. Parser • LR • LL • Recursive Descent • Operator Precedence • Lazy Parsing / Deferred Parsing
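As a rough illustration of two of the strategies listed above, the sketch below combines recursive descent with operator precedence (precedence climbing) to parse the deck's `var z = x + y` example into an AST. The node names `VarDecl` and `BinaryArith` come from the slides; the tokenizer, the `Number` and `Identifier` node types, and all function names are assumptions made for this example and do not describe any particular engine.

```javascript
// Tokenizer: identifiers, integer literals, and single-character operators.
function tokenize(src) {
  return src.match(/[A-Za-z_]\w*|\d+|[-+*\/()=]/g) || [];
}

// Binding power of each binary operator (higher binds tighter).
const PRECEDENCE = { "+": 1, "-": 1, "*": 2, "/": 2 };

// Precedence climbing: parse operators whose precedence is >= minPrec.
function parseExpression(tokens, minPrec = 1) {
  let left = parsePrimary(tokens);
  while (tokens.length > 0 && PRECEDENCE[tokens[0]] >= minPrec) {
    const op = tokens.shift();
    // Left-associative: the right operand must bind strictly tighter.
    const right = parseExpression(tokens, PRECEDENCE[op] + 1);
    left = { type: "BinaryArith", op, left, right };
  }
  return left;
}

// Recursive descent for the operands: literals, names, parentheses.
function parsePrimary(tokens) {
  const tok = tokens.shift();
  if (tok === undefined) throw new SyntaxError("unexpected end of input");
  if (tok === "(") {
    const inner = parseExpression(tokens);
    if (tokens.shift() !== ")") throw new SyntaxError("expected ')'");
    return inner;
  }
  if (/^\d+$/.test(tok)) return { type: "Number", value: Number(tok) };
  if (/^[A-Za-z_]/.test(tok)) return { type: "Identifier", name: tok };
  throw new SyntaxError("unexpected token " + tok);
}

// Parses `var NAME = EXPR` into a VarDecl node.
function parseVarDecl(src) {
  const tokens = tokenize(src);
  if (tokens.shift() !== "var") throw new SyntaxError("expected 'var'");
  const name = tokens.shift();
  if (tokens.shift() !== "=") throw new SyntaxError("expected '='");
  const init = parseExpression(tokens);
  if (tokens.length > 0) throw new SyntaxError("unexpected token " + tokens[0]);
  return { type: "VarDecl", name, init };
}
```

Calling `parseVarDecl("var z = x + y * 2")` yields a `VarDecl` whose initializer is a `+` node with the `*` node as its right child, which is how the precedence table shapes the tree.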
    33. 33. Value Representation • Pointers, and all values allocated on heap • Discriminated Union • Tagged Value / Tagged Pointer
    34. 34. Value Representation • Pointers, and all values allocated on heap: typedef Object* JSValue; (diagram: a heap object holding Tag_Int and 2013) • Discriminated Union • Tagged Value / Tagged Pointer
    35. 35. Value Representation • Pointers, and all values allocated on heap • Discriminated Union (diagram: a value holding Tag_Int and 2013) • Tagged Value / Tagged Pointer
        class JSValue {
          ObjectType ot;
          union {
            double n;
            bool b;
            Object* o;
            // …
          } u;
        };
    36. 36. Tagged • Tagged Pointer – Non-zero tag on pointer – Favor small integer arithmetics – Layout: small integer ends in tag 00, pointer ends in tag 01 • Tagged Value – Non-zero tag on non-pointer – Favor pointer access • NaN-boxing – use special NaN value as box
    37. 37. Tagged • Tagged Pointer – Non-zero tag on pointer – Favor small integer arithmetics • Tagged Value – Non-zero tag on non-pointer – Favor pointer access – Layout: small integer ends in tag 01, pointer ends in tag 00 • NaN-boxing – use special NaN value as box
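The tagged-pointer layout from slide 36 (small integer tag 00, pointer tag 01) can be sketched with 32-bit words as below. This is only a model: JavaScript has no raw pointers, so a hypothetical `heap` array stands in for memory, and every name here (`tagSmi`, `allocate`, `deref`, and so on) is an assumption made for the example.

```javascript
const TAG_BITS = 2;
const TAG_MASK = 0b11;
const TAG_SMI = 0b00;      // small integers carry a zero tag
const TAG_POINTER = 0b01;  // pointers carry a non-zero tag

// Toy heap: an "address" is an index shifted left so its low bits are 00.
const heap = [];

function allocate(obj) {
  heap.push(obj);
  const address = (heap.length - 1) << TAG_BITS;
  return address | TAG_POINTER;
}

function tagSmi(n) {
  // With a 2-bit tag, 30 bits remain for the signed payload.
  if (!Number.isInteger(n) || n < -(2 ** 29) || n >= 2 ** 29) {
    throw new RangeError("value does not fit in a small integer");
  }
  return (n << TAG_BITS) | TAG_SMI;
}

function isSmi(word) { return (word & TAG_MASK) === TAG_SMI; }
function isPointer(word) { return (word & TAG_MASK) === TAG_POINTER; }

// Arithmetic shift keeps the sign of negative small integers.
function untagSmi(word) { return word >> TAG_BITS; }

// Strip the tag, then turn the aligned address back into a heap index.
function deref(word) { return heap[(word & ~TAG_MASK) >> TAG_BITS]; }

// Because the small-integer tag is zero, two tagged words can be added
// directly without untagging (overflow checking is omitted here).
function addSmi(a, b) { return (a + b) | 0; }
```

The design choice the slide names is visible in `addSmi`: a zero tag on integers makes arithmetic free of tag manipulation, while every pointer access pays for the mask in `deref`. Swapping the two tags gives the tagged-value layout from slide 37, with the costs reversed.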
    38. 38. Tagged • Tagged Pointer – Non-zero tag on pointer – Favor small integer arithmetics • Tagged Value – Non-zero tag on non-pointer – Favor pointer access • NaN-boxing – use special QNaN value as box (diagram: 64-bit layouts for pointer, double, and integer, distinguished by their high bits)
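A minimal sketch of NaN-boxing follows. The exact bit layout on the slide did not survive in this transcript, so the layout used here is an assumption chosen for the example: real doubles are stored as their own bits, and a 32-bit integer is stored in the low bits of a quiet NaN whose high 16 bits are `0x7ffc`. The 64-bit words are modelled as BigInt values; all names are hypothetical.

```javascript
const view = new DataView(new ArrayBuffer(8));

const TAG_MASK = 0xffff000000000000n;      // high 16 bits select the type
const INT_TAG = 0x7ffc000000000000n;       // assumed tag for boxed integers
const CANONICAL_NAN = 0x7ff8000000000000n; // the only NaN stored as a double

function boxDouble(d) {
  // Canonicalize NaN so a genuine NaN can never collide with a tag.
  if (Number.isNaN(d)) return CANONICAL_NAN;
  view.setFloat64(0, d);
  return view.getBigUint64(0);
}

function unboxDouble(bits) {
  view.setBigUint64(0, bits);
  return view.getFloat64(0);
}

function boxInt(i) {
  // Keep the low 32 bits (two's complement) as the NaN payload.
  return INT_TAG | BigInt.asUintN(32, BigInt(i));
}

function isInt(bits) {
  return (bits & TAG_MASK) === INT_TAG;
}

function unboxInt(bits) {
  // Reinterpret the low 32 bits as a signed integer.
  return Number(BigInt.asIntN(32, bits & 0xffffffffn));
}
```

Under this assumed layout a type check is a single mask and compare, and doubles need no decoding at all, which is the appeal of the technique on 64-bit platforms.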
    39. 39. Value Representation in Self
    40. 40. Numeric Tower • Internal Numeric Tower • Smi -> HeapDouble • int -> long -> double • unboxed number
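The "Smi -> HeapDouble" step of an internal numeric tower might look like the sketch below: integer addition stays on a fast path until the result no longer fits the small-integer range, then widens to a heap-allocated double. The 31-bit range and the `kind` field are assumptions for this example, not a description of any specific engine.

```javascript
// Assumed small-integer range: 31-bit signed.
const SMI_MIN = -(2 ** 30);
const SMI_MAX = 2 ** 30 - 1;

function fitsSmi(n) {
  return Number.isInteger(n) && n >= SMI_MIN && n <= SMI_MAX;
}

// Hypothetical value constructors for the two tower levels.
function makeSmi(n) {
  if (!fitsSmi(n)) throw new RangeError("not a small integer");
  return { kind: "Smi", value: n };
}

function makeHeapDouble(d) {
  return { kind: "HeapDouble", value: d };
}

function add(a, b) {
  if (a.kind === "Smi" && b.kind === "Smi") {
    // Fast path: both operands are small, so the sum is exact.
    const sum = a.value + b.value;
    if (fitsSmi(sum)) return makeSmi(sum);
    // Overflow: widen to the next level of the tower.
    return makeHeapDouble(sum);
  }
  // Slow path: at least one operand is already a double.
  return makeHeapDouble(a.value + b.value);
}
```

For instance, `add(makeSmi(SMI_MAX), makeSmi(1))` returns a `HeapDouble`, while `add(makeSmi(1), makeSmi(2))` stays a `Smi`.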
    41. 41. Object Model • Hash based – “Dictionary Mode” • Hidden Class based – “Fast Object”
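To make the hidden-class ("Fast Object") model concrete, here is a small sketch in which objects that gain the same properties in the same order end up sharing one hidden class, so a property name resolves to a fixed slot index. The class and method names (`HiddenClass`, `FastObject`, `addProperty`) are hypothetical and chosen for this example only.

```javascript
class HiddenClass {
  constructor(offsets = new Map()) {
    this.offsets = offsets;        // property name -> slot index
    this.transitions = new Map();  // property name -> next HiddenClass
  }

  // Returns the class reached by adding `name`, creating it on first use.
  addProperty(name) {
    let next = this.transitions.get(name);
    if (!next) {
      const offsets = new Map(this.offsets);
      offsets.set(name, offsets.size); // new property takes the next slot
      next = new HiddenClass(offsets);
      this.transitions.set(name, next);
    }
    return next;
  }
}

const ROOT = new HiddenClass();

class FastObject {
  constructor() {
    this.hiddenClass = ROOT;
    this.slots = [];
  }

  set(name, value) {
    let offset = this.hiddenClass.offsets.get(name);
    if (offset === undefined) {
      // Adding a property moves the object to a different hidden class.
      this.hiddenClass = this.hiddenClass.addProperty(name);
      offset = this.hiddenClass.offsets.get(name);
    }
    this.slots[offset] = value;
  }

  get(name) {
    const offset = this.hiddenClass.offsets.get(name);
    return offset === undefined ? undefined : this.slots[offset];
  }
}
```

Two objects built with `set("x", …)` then `set("y", …)` share a hidden class, whereas one built in the order `y`, `x` does not. A hash-based ("dictionary mode") object would skip all of this and keep a per-object map instead.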
    42. 42. Object Model Example: behind Groovy’s “object literal”-ish syntax
        Groovy code: obj = [ x: 2013, y: 42 ]; i = obj.x;
        Equivalent Java code: obj = new LinkedHashMap(2); obj.put("x", 2013); obj.put("y", 42); i = obj.get("x");
    43. 43. (diagram: heap layout of that LinkedHashMap: the map object with its fields (keySet, values, table, size, threshold, loadFactor, modCount, entrySet, header, accessOrder), a header entry, and one entry each for “x” → 2013 and “y” → 42, each with key, value, hash, next, before, and after fields)
    44. 44. Nashorn Object Model (diagram: a property map with Key / Getter / Setter columns: “x” → x getter, x setter; “y” → y getter, y setter. The object has fields map, __proto__, context, flags 0, spill, arrayData EMPTY_ARRAY, L0 (x) = 2013, L1 (y) = 42, L2 (unused), L3 (unused))
    45. 45. Let’s ignore some fields for now (same diagram as the previous slide)
    46. 46. … and we’ll get this (diagram: the property map “x” → x getter, x setter; “y” → y getter, y setter. The object keeps only map, L0 (x) = 2013, L1 (y) = 42)
    47. 47. looks just like a Java object … with boxed fields • Key → Offset: “x” → +12, “y” → +16 • Object layout: metadata, x → 2013, y → 42 • class Point { Object x; Object y; }
    48. 48. would be even better if … • Key → Offset: “x” → +12, “y” → +16 • Object layout: metadata, x = 2013, y = 42 • class Point { int x; int y; } • but Nashorn doesn’t go this far yet
    49. 49. (diagram: a Nashorn object with properties x, y, z, a, b. The property map lists a getter and setter for each key; x, y, z, a occupy fields L0 to L3, b is held in the spill array, and arrayData holds the indexed elements)
    50. 50. Inline Cache • Facilitated by use of hidden class • Improve property access efficiency • Collect type information for type feedback – later fed to JIT compilers for better optimization • Works with both interpreted and compiled code
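The points above can be illustrated with a monomorphic inline cache for a single property-access site. It remembers the last hidden class (called a "shape" here) and the slot offset it resolved to, so a repeat access with the same shape skips the lookup; the set of shapes it has seen is the kind of information a JIT could use as type feedback. The shapes, the `InlineCache` class, and its counters are all assumptions made for this sketch.

```javascript
// Hypothetical shapes (hidden classes): property name -> slot index.
const pointShape = { offsets: new Map([["x", 0], ["y", 1]]) };
const point3dShape = { offsets: new Map([["z", 0], ["x", 1], ["y", 2]]) };

function makeObject(shape, slots) {
  return { shape, slots };
}

// One cache per property-access site, e.g. the `p.x` in a hot function.
class InlineCache {
  constructor(name) {
    this.name = name;
    this.cachedShape = null;
    this.cachedOffset = -1;
    this.hits = 0;
    this.misses = 0;
    this.seenShapes = new Set(); // type feedback for a later compiler tier
  }

  load(obj) {
    if (obj.shape === this.cachedShape) {
      // Fast path: one comparison, then a direct slot read.
      this.hits++;
      return obj.slots[this.cachedOffset];
    }
    // Slow path: full lookup, then re-prime the cache.
    this.misses++;
    const offset = obj.shape.offsets.get(this.name);
    if (offset === undefined) return undefined;
    this.cachedShape = obj.shape;
    this.cachedOffset = offset;
    this.seenShapes.add(obj.shape);
    return obj.slots[offset];
  }
}
```

A polymorphic inline cache, as named on the VM comparison slide, would keep several shape and offset pairs per site instead of one.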
    51. 51. String • Flat string • Rope / ConsString / ConcatString • Substring / Span • Symbol / Atom • External String
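As a sketch of the rope idea from the list above, concatenation below allocates a small `ConsString` node instead of copying characters, and the flat character data is only produced (and cached) when something asks for it. The class names echo the slide's terms, but the fields and the `flatten` method are assumptions for this example.

```javascript
class FlatString {
  constructor(text) {
    this.text = text;
    this.length = text.length;
  }
  flatten() { return this.text; }
}

class ConsString {
  constructor(left, right) {
    this.left = left;
    this.right = right;
    this.length = left.length + right.length; // known without copying
    this.flat = null;                         // filled in lazily
  }

  flatten() {
    if (this.flat === null) {
      // Iterative traversal, so long chains of concatenations
      // do not overflow the call stack.
      const parts = [];
      const stack = [this];
      while (stack.length > 0) {
        const node = stack.pop();
        if (node instanceof ConsString && node.flat === null) {
          stack.push(node.right, node.left); // left is visited first
        } else {
          parts.push(node instanceof ConsString ? node.flat : node.text);
        }
      }
      this.flat = parts.join("");
    }
    return this.flat;
  }
}

// O(1) concatenation: no characters are copied here.
function concat(a, b) {
  return new ConsString(a, b);
}
```

This trades cheap repeated concatenation for a one-time flattening cost on first character access.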
    52. 52. RegExp • NFA • Optimize to DFA where profitable • Interpreted • JIT Compiled
    53. 53. Call Stack • Native or separate? • Native – fast – easier transition between execution modes – harder to implement • Separate (aka “stack-less”) – easy to implement – slow – overhead when transitioning between exec modes
    54. 54. Execution Engine • Interpreter • Compiler – Ahead-of-Time Compiler – Just-in-Time Compiler – Dynamic / Adaptive Compiler • Mixed-mode • Tiered
    55. 55. Execution Engine in Self
    56. 56. Interpreter • Line Interpreter • AST Interpreter • Stack-based Bytecode Interpreter • Register-based Bytecode Interpreter
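Of the interpreter styles listed, the stack-based bytecode interpreter is perhaps the simplest to sketch. The instruction set below is invented for this example (it is not any engine's real bytecode); the program encodes the deck's running example, `var z = x + y`.

```javascript
// Hypothetical opcodes for a tiny stack machine.
const Op = { PUSH: 0, LOAD: 1, STORE: 2, ADD: 3, MUL: 4, HALT: 5 };

function run(code, variables) {
  const stack = [];
  let pc = 0;
  for (;;) {
    const op = code[pc++];
    switch (op) {
      case Op.PUSH:  // push an immediate operand
        stack.push(code[pc++]);
        break;
      case Op.LOAD:  // push the value of a named variable
        stack.push(variables[code[pc++]]);
        break;
      case Op.STORE: // pop into a named variable
        variables[code[pc++]] = stack.pop();
        break;
      case Op.ADD: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a + b);
        break;
      }
      case Op.MUL: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a * b);
        break;
      }
      case Op.HALT:
        return variables;
      default:
        throw new Error("unknown opcode " + op);
    }
  }
}

// Bytecode for: var z = x + y
const program = [Op.LOAD, "x", Op.LOAD, "y", Op.ADD, Op.STORE, "z", Op.HALT];
```

Running `run(program, { x: 2013, y: 11 })` leaves the sum in `z`. A register-based design would instead name its operands in each instruction, giving fewer but wider instructions.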
    57. 57. Interpreter • Written in – C/C++ – Assembler – others?
    58. 58. Compiler Concurrency • Foreground/Blocking Compilation • Background Compilation • Parallel Compilation
    59. 59. Baseline Compiler • Fast compilation, little optimization • Should generate type-stable code
    60. 60. Optimizing Compiler • Type Feedback • Type Inference • Function Inlining
    61. 61. On-stack Replacement
    62. 62. Garbage Collection • Reference Counting? – not really used by any mainstream impl • Tracing GC – mark-sweep – mark-compact – copying
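A tracing collector of the mark-sweep kind can be sketched over a toy heap as below: mark everything reachable from the roots, then sweep whatever was left unmarked. The `Heap` class and its object layout (`refs`, `marked`) are assumptions for the example. Note that the unreachable cycle in the usage note is reclaimed, which plain reference counting would not do.

```javascript
class Heap {
  constructor() {
    this.objects = new Set(); // every allocated object
    this.roots = new Set();   // e.g. globals and stack slots
  }

  allocate(name) {
    const obj = { name, refs: [], marked: false };
    this.objects.add(obj);
    return obj;
  }

  // Mark phase: trace everything reachable from the roots.
  mark() {
    const worklist = [...this.roots];
    while (worklist.length > 0) {
      const obj = worklist.pop();
      if (obj.marked) continue;
      obj.marked = true;
      worklist.push(...obj.refs);
    }
  }

  // Sweep phase: free unmarked objects, reset marks on survivors.
  sweep() {
    let freed = 0;
    for (const obj of this.objects) {
      if (obj.marked) {
        obj.marked = false;
      } else {
        this.objects.delete(obj);
        freed++;
      }
    }
    return freed;
  }

  collect() {
    this.mark();
    return this.sweep();
  }
}
```

For example, with a root `a` that references `b`, and two objects `c` and `d` that reference only each other, `collect()` frees `c` and `d` and keeps `a` and `b`.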
    63. 63. GC Advances • Generational GC • Incremental GC • Concurrent GC • Parallel GC
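    64. 64. GC Concurrency: Mark-Sweep (timeline diagram: the application thread alternates between JavaScript and GC; GC phases: mark, sweep)
    65. 65. GC Concurrency: Mark-Compact (timeline diagram: the application thread alternates between JavaScript and GC; GC phases: mark, compact)
    66. 66. GC Concurrency: Scavenging (timeline diagram: the application thread alternates between JavaScript and GC; GC phase: scavenge)
    67. 67. GC Concurrency: Incremental Mark (timeline diagram: the application thread alternates between JavaScript and GC; GC phases: incremental mark, sweep)
    68. 68. GC Concurrency: Lazy Sweep (timeline diagram: the application thread alternates between JavaScript and GC; GC phases: mark, lazy sweep)
    69. 69. GC Concurrency: Incremental Mark + Lazy Sweep (timeline diagram: the application thread alternates between JavaScript and GC; GC phases: incremental mark, lazy sweep)
    70. 70. GC Concurrency: Generational: Scavenging + (Incremental Mark + Lazy Sweep) (timeline diagram: the application thread alternates between JavaScript and GC; GC phases: incremental mark and scavenge, lazy sweep)
    71. 71. GC Concurrency: (Mostly) Concurrent Mark-Sweep (timeline diagram: an application thread running JavaScript and GC, plus a separate GC thread; phases shown: initial mark, concurrent mark, remark, concurrent sweep, reset)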
    72. 72. A BIT ABOUT NASHORN: A new high performance JavaScript engine on top of the JVM
    73. 73. What is Nashorn? Overview • Oracle’s ECMAScript 5.1 implementation, on the JVM • Clean code base, 100% Java – started from scratch; no code from Rhino • An OpenJDK project • GPLv2 licensed
    74. 74. What is Nashorn? Origins of the “Nashorn” name: the Rhino book
    75. 75. What is Nashorn? Origins of the “Nashorn” name: Mozilla Rhino
    76. 76. What is Nashorn? Origins of the “Nashorn” name: the unofficial Nashorn logo
    77. 77. What is Nashorn? Origins of the “Nashorn” name: my impression
    78. 78. Dynamic Languages on the JVM Can easily get to a sports-car-ish level
    79. 79. Dynamic Languages on the JVM Takes some effort to get to a decent sports car level
    80. 80. Dynamic Languages on the JVM Hard to achieve extremely good performance
    81. 81. Nashorn Execution Model • JavaScript Source Code • Parser (Compiler Frontend): Lexical Analysis, Syntax Analysis • AST • Compiler Backend: Constant Folding, Control-flow Lowering, Type Annotating, Range Analysis (*), Code Splitting, Type Hardening, Bytecode Generation • Java Bytecode • (*) Not complete yet
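To give a feel for the first backend stage named above, here is a generic constant-folding pass over a small AST. It is written in JavaScript for consistency with the deck's examples and is not Nashorn's code (Nashorn is implemented in Java); the node shapes (`BinaryArith`, `Number`, `Identifier`) and the `fold` function are assumptions for this sketch.

```javascript
// Folds arithmetic on literal operands at compile time, bottom-up.
function fold(node) {
  if (node.type !== "BinaryArith") return node;
  const left = fold(node.left);
  const right = fold(node.right);
  if (left.type === "Number" && right.type === "Number") {
    switch (node.op) {
      case "+": return { type: "Number", value: left.value + right.value };
      case "-": return { type: "Number", value: left.value - right.value };
      case "*": return { type: "Number", value: left.value * right.value };
      case "/": return { type: "Number", value: left.value / right.value };
    }
  }
  // Not foldable: keep the operator, but use the folded children.
  return { type: "BinaryArith", op: node.op, left, right };
}

// AST for: x + 2 * 3
const tree = {
  type: "BinaryArith",
  op: "+",
  left: { type: "Identifier", name: "x" },
  right: {
    type: "BinaryArith",
    op: "*",
    left: { type: "Number", value: 2 },
    right: { type: "Number", value: 3 },
  },
};
```

Here `fold(tree)` keeps the outer `+` (its left operand is a variable) but replaces the `2 * 3` subtree with the literal 6, so later stages have less work to do.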