High Performance JavaScript


  1. 1. High Performance JavaScript Andreas Gal (with lots of help from David Mandelin and David Anderson) July 28, 2011 Lancaster, UK Thursday, July 28, 2011
  2. 2. Power of the Web
  3. 3. How did we get here? • Making existing web tech faster and faster • Adding new technologies, like video, audio, integration and 2D+3D drawing
  4. 4. The Browser
  5. 5. Getting a Webpage twitter.com 199.59.149.198
  6. 6. Getting a Webpage GET / HTTP/1.0 twitter.com (HTTP Request) 199.59.149.198
  7. 7. Getting a Webpage <html> <head> <title>Twitter / Home</title> </head> <script language="javascript"> function doStuff() {...} </script> <body> ... (HTML, CSS, JavaScript)
  11. 11. Core Web Components • HTML provides content • CSS provides styling • JavaScript provides a programming interface
  12. 12. New Technologies • 2D Canvas • Video, Audio tags • WebGL (OpenGL + JavaScript) • Local Storage
  13. 13. DOM (Source: http://www.w3schools.com/htmldom/default.asp)
  14. 14. CSS • Cascading Style Sheets • Separates presentation from content .warning { color: red; text-decoration: underline; } • <b class="warning">This is red</b>
  18. 18. What is JavaScript? • C-like syntax • De facto language of the web • Interacts with the DOM and the browser
  19. 19. C-Like Syntax • Braces, semicolons, familiar keywords if (x) { for (i = 0; i < 100; i++) print("Hello!"); }
  24. 24. Displaying a Webpage • During parsing, the JavaScript engine executes code as it is encountered. • This can change the DOM, or even trigger a “reflow” (layout) • Layout engine applies CSS to DOM • Computes geometry of elements • Finished layout is sent to graphics layer
  25. 25. Run to completion • JavaScript is executed during HTML parsing. Semantically, nothing can (*) proceed until JS execution is done. • Very different VM latency requirements than Java or C#. • Sub-millisecond compilation delays matter.
  26. 26. The Result
  27. 27. JavaScript • Invented by Brendan Eich in 1995 for Netscape 2 • Initial version written in 10 days • Used for anything from small browser interactions and complex apps (Gmail) to intense graphics (WebGL)
  31. 31. Think LISP or Self, not Java • Untyped - no type declarations • Multi-Paradigm – objects, closures, first-class functions • Highly dynamic - objects are dictionaries
  32. 32. No Type Declarations • Properties, variables, return values can be anything: function f(a) { var x = 72.3; if (a) x = a + "string"; return x; }
  33. 33. No Type Declarations • Properties, variables, return values can be anything: function f(a) { var x = "hi"; if (a) x = a + 33.2; return x; }
  34. 34. No Type Declarations • Properties, variables, return values can be anything: function f(a) { var x = "hi"; if (a) x = a + 33.2; return x; } • Callout on a + 33.2: if a is a number: add; if a is a string: concat
  35. 35. Functional • Functions may be returned, passed as arguments: function f(a) { return function () { return a; } } var m = f(5); print(m());
  36. 36. Objects • Objects are dictionaries mapping strings to values • Properties may be deleted or added at any time! var point = { x : 5, y : 10 }; delete point.x; point.z = 12;
  40. 40. Prototypes • Every object can have a prototype object • If a property is not found on an object, its prototype is searched instead • … And the prototype’s prototype, etc.
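
The prototype lookup described on the slide above can be tried directly. This is a minimal sketch using the standard `Object.create`; the names `base` and `derived` are made up for the example and do not come from the talk.

```javascript
// `base` and `derived` are hypothetical names chosen for this example.
const base = { kind: "base", greet() { return "hello from base"; } };

// Object.create makes a new object whose prototype is `base`.
const derived = Object.create(base);
derived.kind = "derived"; // own property shadows the prototype's `kind`

// `kind` is found on `derived` itself; `greet` is found on its prototype.
console.log(derived.kind);
console.log(derived.greet());

// Not found anywhere along the chain: the lookup yields undefined.
console.log(derived.missing);

// Deleting the own property exposes the prototype's value again.
delete derived.kind;
console.log(derived.kind);
```

The last line shows why an engine cannot resolve `derived.kind` once and for all: the answer depends on the current state of every object on the chain.
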
  41. 41. Numbers • JavaScript specifies that numbers are IEEE-754 64-bit floating point • Engines use 32-bit integers to optimize • Must preserve semantics: integers overflow to doubles
  42. 42. Numbers • JavaScript: var x = 0x7FFFFFFF; x++; gives x = 2147483648 • C++: int x = 0x7FFFFFFF; x++; gives x = -2147483648 (typical wraparound; signed overflow is formally undefined behavior in C++)
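
The overflow rule on the slide above can be observed from JavaScript itself. The sketch below is only an illustration: `addFitsInInt32` is a hypothetical helper, not an engine function, and `| 0` is used to show what a raw 32-bit add would have produced.

```javascript
const INT32_MAX = 0x7FFFFFFF; // 2147483647

// Number semantics: the result is simply the next double, no wraparound.
const asNumber = INT32_MAX + 1;

// `| 0` applies ToInt32, giving the value a raw 32-bit add would leave behind.
const asInt32 = (INT32_MAX + 1) | 0;

console.log(asNumber); // 2147483648
console.log(asInt32);  // -2147483648

// Hypothetical helper: true when the sum of two int32 inputs still fits in
// an int32, i.e. when an engine could keep using its integer representation.
function addFitsInInt32(a, b) {
  const sum = a + b;
  return sum === (sum | 0);
}

console.log(addFitsInInt32(1, 2));
console.log(addFitsInInt32(INT32_MAX, 1));
```
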
  43. 43. JavaScript Implementations • Values (Objects, strings, numbers) • Garbage collector • Runtime library • Execution engine (VM)
  44. 44. Values • The runtime must be able to query a value’s type to perform the right computation. • When storing values to variables or object fields, the type must be stored as well.
  45. 45. Values • Boxed: purpose is storage; definition is (type tag, C++ value); examples: (INT32, 9000), (STRING, “hi”), (DOUBLE, 3.14) • Unboxed: purpose is computation; definition is a C++ value; examples: (int)9000, (String *)“hi”, (double)3.14 • Boxed values required for variables, object fields • Unboxed values required for computation
  49. 49. Boxed Values • Need a representation in C++ • Easy idea: 96-bit struct (Lua) • 32 bits for type • 64 bits for double, pointer, or integer • Too big! We have to pack better.
  50. 50. Boxed Values • We used to use pointers, and tag the low bits (V8 still does) • Doubles have to be allocated on the heap • Indirection, GC pressure is bad • There is a middle ground…
  55. 55. Nunboxing (IA32, ARM) • Values are 64-bit • Doubles are normal IEEE-754 • How to pack non-doubles? • 51 bits of NaN space!
  59. 59. Nunboxing (IA32, ARM) • Bits 63-32 (type): 0x400c0000; bits 31-0 (payload): 0x00000000 • Full value: 0x400c000000000000 • Type is double because it’s not a NaN • Encodes: (Double, 3.5)
  64. 64. Nunboxing (IA32, ARM) • Bits 63-32 (type): 0xFFFF0001; bits 31-0 (payload): 0x00000040 • Full value: 0xFFFF000100000040 • Value is in NaN space • Type is 0xFFFF0001 (Int32) • Value is (Int32, 0x00000040)
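
The two worked examples above can be decoded with a few lines of JavaScript. This is a sketch under stated assumptions, not the engine's code: `TAG_INT32` is the tag shown on the slides, while `TAG_BASE` (the lowest high word treated as a tag) and the `decode` function are invented for illustration.

```javascript
const TAG_INT32 = 0xFFFF0001; // from the slides
const TAG_BASE = 0xFFFF0000;  // assumption: high words at or above this are tags

// Decode a 64-bit boxed value given as two 32-bit words.
function decode(typeWord, payloadWord) {
  if ((typeWord >>> 0) >= TAG_BASE) {
    // High word is in the NaN space reserved for tags.
    if ((typeWord >>> 0) === TAG_INT32) {
      return { type: "Int32", value: payloadWord | 0 };
    }
    return { type: "Other", value: payloadWord >>> 0 };
  }
  // Not a tag: the two words together are an ordinary IEEE-754 double.
  const view = new DataView(new ArrayBuffer(8));
  view.setUint32(0, typeWord);    // high 32 bits (DataView is big-endian by default)
  view.setUint32(4, payloadWord); // low 32 bits
  return { type: "Double", value: view.getFloat64(0) };
}

console.log(decode(0x400c0000, 0x00000000)); // { type: 'Double', value: 3.5 }
console.log(decode(0xFFFF0001, 0x00000040)); // { type: 'Int32', value: 64 }
```

Decoding is a single comparison on the high word, which is the "trivial to decode" property of this layout on 32-bit targets.
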
  68. 68. NaN boxing • Nunboxing is a word-play on NaN boxing • NaN boxing is like Nunboxing, but prefers pointers over doubles (mask to get a pointer, shift to get a double) • Attributed to Apple’s JSC, but really invented by Ed Smith/Adobe (I think)
  74. 74. Punboxing (x86-64) • Bits 63-47 (type): 1111 1111 1111 1111 1; bits 46-0 (payload): 000 ... 0000 0000 0000 0000 0100 0000 • Full value: 0xFFFF800000000040 • Value is in NaN space • Bits >> 47 == 0x1FFFF == INT32 • Bits 47-63 masked off to retrieve payload • Value(Int32, 0x00000040)
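
The shift-and-mask decoding above can be sketched with BigInt, since JavaScript has no native 64-bit integer type. The tag value and the shift of 47 come from the slide; the function name `decodePunbox` and the decision to handle only Int32 and doubles are assumptions made for this example.

```javascript
const TAG_SHIFT = 47n;
const TAG_INT32 = 0x1FFFFn;                  // from the slide
const PAYLOAD_MASK = (1n << TAG_SHIFT) - 1n; // keeps bits 0-46

// Decode a 64-bit boxed value given as a non-negative BigInt.
function decodePunbox(bits) {
  const tag = bits >> TAG_SHIFT;
  if (tag === TAG_INT32) {
    // Mask off the tag, then reinterpret the low 32 bits as a signed integer.
    const payload = bits & PAYLOAD_MASK;
    return { type: "Int32", value: Number(BigInt.asIntN(32, payload)) };
  }
  // This sketch treats every other bit pattern as a plain double.
  const view = new DataView(new ArrayBuffer(8));
  view.setBigUint64(0, bits);
  return { type: "Double", value: view.getFloat64(0) };
}

console.log(decodePunbox(0xFFFF800000000040n)); // { type: 'Int32', value: 64 }
console.log(decodePunbox(0x400c000000000000n)); // { type: 'Double', value: 3.5 }
```
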
  75. 75. Nunboxing vs. Punboxing • Fits in register: Nunboxing NO, Punboxing YES • Trivial to decode: Nunboxing YES, Punboxing NO • Portability: Nunboxing 32-bit only*, Punboxing x64 only • * Some 64-bit OSes can restrict mmap() to 32 bits
  76. 76. Garbage Collection • Need to reclaim memory without pausing user workflow, animations, etc • Need very fast object allocation • Consider lots of small objects like points, vectors • Reading and writing to the heap must be fast
  77. 77. Object Representation • JavaScript objects are very different from Java objects • Start empty, grow and shrink over time. • We use a header with a variable number of built-in slots, malloc() beyond that. • Predicting number of built-in slots to use is black magic.
  82. 82. Mark and Sweep • All live objects are found via a recursive traversal of the root value set • All dead objects are added to a free list • Very slow to traverse entire heap • Building free lists can bring “dead” memory into cache
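
The two phases above can be illustrated on a toy heap. This is a simplified sketch, not the collector discussed in the talk: the heap is a plain array, each object lists its outgoing references in a hypothetical `refs` field, and an explicit worklist replaces recursion.

```javascript
// Toy heap object: records its name and the objects it references.
function makeObj(name) {
  return { name, refs: [], marked: false };
}

function markAndSweep(heap, roots) {
  // Mark: visit everything reachable from the roots.
  const worklist = [...roots];
  while (worklist.length > 0) {
    const obj = worklist.pop();
    if (obj.marked) continue; // already visited (handles cycles)
    obj.marked = true;
    worklist.push(...obj.refs);
  }
  // Sweep: walk the whole heap; unmarked objects go on the free list.
  const live = [];
  const freeList = [];
  for (const obj of heap) {
    if (obj.marked) {
      obj.marked = false; // reset for the next collection
      live.push(obj);
    } else {
      freeList.push(obj);
    }
  }
  return { live, freeList };
}

const a = makeObj("a"), b = makeObj("b"), c = makeObj("c"), d = makeObj("d");
a.refs.push(b);
b.refs.push(a); // cycle between two live objects
c.refs.push(d);
d.refs.push(c); // cycle that is unreachable from the roots

const result = markAndSweep([a, b, c, d], [a]);
console.log(result.live.map(o => o.name));     // [ 'a', 'b' ]
console.log(result.freeList.map(o => o.name)); // [ 'c', 'd' ]
```

The sweep loop touches every object, dead or alive, which is the cost the last two bullets on the slide point at.
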
  85. 85. Improvements • Sweeping, freelist building have been moved off-thread • Marking can be incremental, interleaved with program execution
  86. 86. Generational GC • Observation: most objects are short-lived • All new objects are bump-allocated into a newborn space • Once newborn space is full, live objects are moved to the tenured space • Newborn space is then reset to empty
  87. 87. Generational GC (diagram: Newborn Space with two 8MB halves, labeled Inactive and Active; Tenured Space)
  88. 88. After Minor GC (diagram: Newborn Space with two 8MB halves, now labeled Active and Inactive; Tenured Space)
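
The allocation scheme above can be sketched as a toy model. Everything here is an assumption made for illustration: the class name `ToyHeap`, a newborn space measured in object count rather than bytes, and a `roots` set standing in for real liveness tracing.

```javascript
class ToyHeap {
  constructor(nurserySize) {
    this.nursery = new Array(nurserySize); // fixed-size newborn space
    this.top = 0;                          // bump pointer
    this.tenured = [];                     // long-lived objects
    this.roots = new Set();                // stand-in for "still reachable"
  }

  alloc(value) {
    if (this.top === this.nursery.length) this.minorGC();
    const obj = { value };
    this.nursery[this.top++] = obj; // bump allocation: one store, one increment
    return obj;
  }

  minorGC() {
    for (let i = 0; i < this.top; i++) {
      const obj = this.nursery[i];
      if (this.roots.has(obj)) this.tenured.push(obj); // survivors are promoted
      this.nursery[i] = undefined;
    }
    this.top = 0; // newborn space is reset to empty
  }
}

const heap = new ToyHeap(4);
const keep = heap.alloc("long-lived");
heap.roots.add(keep);
for (let i = 0; i < 5; i++) heap.alloc("temp" + i); // short-lived garbage

console.log(heap.tenured.length); // 1
console.log(heap.top);            // 2
```

Only the one surviving object is copied during the minor collection; the dead temporaries cost nothing beyond resetting the bump pointer.
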
  89. 89. Barriers • Incremental marking, generational GC need read or write barriers • Read barriers are much slower • Write barrier seems to be well-predicted?
  90. 90. Unknowns • How much do cache effects matter? • Memory GC has to touch • Locality of objects • What do we need to consider to make GC fast? • Lots of research on Java. Almost nothing on JavaScript. Very different heaps.
  95. 95. Running JavaScript • Interpreter – Runs code, not fast • Basic JIT – Simple, untyped compiler • Trace Compiler - Typed compiler for traces (low latency, often fast). • Heavy-Duty JIT – Typed compiler for whole methods (high latency, always fast).
  96. 96. Interpreter • Good for code that runs once • Giant switch loop • Handles all edge cases of JS semantics
  97. 97. Interpreter while (true) { switch (*pc) { case OP_ADD: ... case OP_SUB: ... case OP_RETURN: ... } pc++; }
  98. 98. Interpreter case OP_ADD: { Value rhs = POP(); Value lhs = POP(); Value result; if (lhs.isInt32() && rhs.isInt32()) { int left = lhs.toInt32(); int right = rhs.toInt32(); if (!AddOverflows(left, right, left + right)) result.setInt32(left + right); else result.setNumber(double(left) + double(right)); } else if (lhs.isString() || rhs.isString()) { String *left = ValueToString(lhs); String *right = ValueToString(rhs); String *r = Concatenate(left, right); result.setString(r); } else { double left = ValueToNumber(lhs); double right = ValueToNumber(rhs); result.setDouble(left + right); } PUSH(result); break; }
  101. 101. Interpreter • Very slow! • Lots of opcode and type dispatch • Lots of interpreter stack traffic • Just-in-time compilation solves both
  104. 104. Just In Time • Compilation must be very fast • Can’t introduce noticeable pauses • People care about memory use • Can’t JIT everything • Can’t create bloated code • May have to discard code at any time
  108. 108. Basic JIT • JägerMonkey in Firefox 4 • Every opcode has a hand-coded template of assembly (registers left blank) • Method at a time: • Single pass through bytecode stream! • Compiler uses assembly templates corresponding to each opcode
  109. 109. Bytecode • Source: function Add(x, y) { return x + y; } • Bytecode: GETARG 0 ; fetch x • GETARG 1 ; fetch y • ADD • RETURN
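
The four-instruction bytecode above can be run by a toy switch-loop interpreter. This is an illustrative sketch only: the opcode numbers and the flat array encoding are hypothetical, and the host language's `+` stands in for the full ADD semantics.

```javascript
// Hypothetical opcode numbers; only the opcode names come from the slide.
const OP_GETARG = 0, OP_ADD = 1, OP_RETURN = 2;

function interpret(code, args) {
  const stack = [];
  let pc = 0;
  while (true) {
    const op = code[pc++];
    switch (op) {
      case OP_GETARG:
        stack.push(args[code[pc++]]); // operand is the argument index
        break;
      case OP_ADD: {
        const rhs = stack.pop(); // top of stack is the right operand
        const lhs = stack.pop();
        stack.push(lhs + rhs);   // host `+` covers the number and string cases
        break;
      }
      case OP_RETURN:
        return stack.pop();
      default:
        throw new Error("unknown opcode " + op);
    }
  }
}

// function Add(x, y) { return x + y; }
const addBytecode = [OP_GETARG, 0, OP_GETARG, 1, OP_ADD, OP_RETURN];

console.log(interpret(addBytecode, [2, 3]));   // 5
console.log(interpret(addBytecode, ["a", 1])); // a1
```

Every operation goes through the opcode dispatch and the operand stack, which is the overhead a JIT is meant to remove.
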
  110. 110. Interpreter case OP_ADD: { Value rhs = POP(); Value lhs = POP(); Value result; if (lhs.isInt32() && rhs.isInt32()) { int left = lhs.toInt32(); int right = rhs.toInt32(); if (!AddOverflows(left, right, left + right)) result.setInt32(left + right); else result.setNumber(double(left) + double(right)); } else if (lhs.isString() || rhs.isString()) { String *left = ValueToString(lhs); String *right = ValueToString(rhs); String *r = Concatenate(left, right); result.setString(r); } else { double left = ValueToNumber(lhs); double right = ValueToNumber(rhs); result.setDouble(left + right); } PUSH(result); break; }
  113. 113. Assembling ADD • Inlining that huge chunk for every ADD would be very slow • Observation: • Some input types much more common than others
  114. 114. Integer Math – Common: var j = 0; for (i = 0; i < 10000; i++) { j += i; } • Weird Stuff – Rare!: var j = 12.3; for (i = 0; i < 10000; i++) { j += new Object() + i.toString(); }
  115. 115. Assembling ADD • Only generate code for the easiest and most common cases • Large design space • Can consider integers common, or • Integers and doubles, or • Anything!
  118. 118. Basic JIT - ADD if (arg0.type != INT32) goto slow_add; if (arg1.type != INT32) goto slow_add; R0 = arg0.data; R1 = arg1.data; (callout on the register assignments: Greedy Register Allocator)
  119. 119. Basic JIT - ADD if (arg0.type != INT32) goto slow_add; if (arg1.type != INT32) goto slow_add; R0 = arg0.data; R1 = arg1.data; R2 = R0 + R1; if (OVERFLOWED) goto slow_add;
  120. 120. Slow Paths slow_add: Value result = runtime::Add(arg0, arg1);
  121. 121. Inline: if (arg0.type != INT32) goto slow_add; if (arg1.type != INT32) goto slow_add; R0 = arg0.data; R1 = arg1.data; R2 = R0 + R1; if (OVERFLOWED) goto slow_add; rejoin: • Out-of-line: slow_add: Value result = Interpreter::Add(arg0, arg1); R2 = result.data; goto rejoin;
  122. 122. Rejoining • Inline: if (arg0.type != INT32) goto slow_add; if (arg1.type != INT32) goto slow_add; R0 = arg0.data; R1 = arg1.data; R2 = R0 + R1; if (OVERFLOWED) goto slow_add; R3 = TYPE_INT32; rejoin: • Out-of-line: slow_add: Value result = runtime::Add(arg0, arg1); R2 = result.data; R3 = result.type; goto rejoin;
  123. 123. Final Code • Inline: if (arg0.type != INT32) goto slow_add; if (arg1.type != INT32) goto slow_add; R0 = arg0.data; R1 = arg1.data; R2 = R0 + R1; if (OVERFLOWED) goto slow_add; R3 = TYPE_INT32; rejoin: return Value(R3, R2); • Out-of-line: slow_add: Value result = Interpreter::Add(arg0, arg1); R2 = result.data; R3 = result.type; goto rejoin;
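
The fast-path and slow-path split above can be mirrored in plain JavaScript. This is a sketch under assumptions: `jitAdd`, `slowAdd`, `isInt32` and the `slowPathHits` counter are hypothetical names, and the overflow guard compares against `sum | 0` because JavaScript has no access to the processor's overflow flag.

```javascript
let slowPathHits = 0; // counter kept only to make the slow path visible

// Stands in for the out-of-line call into the runtime.
function slowAdd(lhs, rhs) {
  slowPathHits++;
  return lhs + rhs;
}

// True for values an engine could store as a 32-bit integer.
function isInt32(v) {
  return typeof v === "number" && v === (v | 0) && !Object.is(v, -0);
}

function jitAdd(lhs, rhs) {
  if (!isInt32(lhs)) return slowAdd(lhs, rhs);     // type guard on arg0
  if (!isInt32(rhs)) return slowAdd(lhs, rhs);     // type guard on arg1
  const sum = lhs + rhs;
  if (sum !== (sum | 0)) return slowAdd(lhs, rhs); // overflow guard
  return sum;                                      // fast path: result is an int32
}

console.log(jitAdd(1, 2));          // 3
console.log(jitAdd(0x7FFFFFFF, 1)); // 2147483648
console.log(jitAdd("a", 1));        // a1
console.log(slowPathHits);          // 2
```

Only the first call stays on the fast path; the overflowing add and the string concatenation both fall through to the general routine.
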
  126. 126. Observations • Even though we have fast paths, no type information can flow in between opcodes • Cannot assume result of ADD is integer • Means more branches • Types increase register pressure
  127. 127. Basic JIT – Object Access function f(x) { return x.y; } • x.y is a dictionary search! • How can we create a fast-path? • We could try to cache the lookup, but • We want it to work for >1 objects
  130. 130. Object Access • Observation • Usually, objects flowing through an access site look the same • If we knew an object’s layout, we could bypass the dictionary lookup
  131. 131. Object Layout var obj = { x: 50, y: 100, z: 20 };
  135. 135. Object Layout • JSObject: Shape *shape; Value *slots; • Shape (property name → slot number): x → 0, y → 1, z → 2 • slots: 0 (Int32, 50), 1 (Int32, 100), 2 (Int32, 20)
  136. 136. Object Layout var obj = { x: 50, y: 100, z: 20 }; … var obj2 = { x: 78, y: 93, z: 600 };
  140. 140. Object Layout • Shape (shared by both objects; property name → slot number): x → 0, y → 1, z → 2 • First JSObject: Shape *shape; Value *slots; slots: 0 (Int32, 50), 1 (Int32, 100), 2 (Int32, 20) • Second JSObject: Shape *shape; Value *slots; slots: 0 (Int32, 78), 1 (Int32, 93), 2 (Int32, 600)
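
The shape-plus-slots layout above can be modeled in a few lines. This is a sketch, not the engine's data structure: `Shape`, `shapeFor`, `makeObject` and `getProp` are hypothetical names, and the cache key assumes property names contain no commas.

```javascript
// A shape maps property names to slot numbers.
class Shape {
  constructor(names) {
    this.slotOf = new Map(names.map((name, i) => [name, i]));
  }
}

// Hypothetical cache so that objects created with the same properties in the
// same order share a single Shape.
const shapeCache = new Map();
function shapeFor(names) {
  const key = names.join(",");
  if (!shapeCache.has(key)) shapeCache.set(key, new Shape(names));
  return shapeCache.get(key);
}

// An object is a shape pointer plus an array of slot values.
function makeObject(props) {
  const names = Object.keys(props);
  return { shape: shapeFor(names), slots: names.map(n => props[n]) };
}

function getProp(obj, name) {
  const slot = obj.shape.slotOf.get(name);
  return slot === undefined ? undefined : obj.slots[slot];
}

const obj = makeObject({ x: 50, y: 100, z: 20 });
const obj2 = makeObject({ x: 78, y: 93, z: 600 });

console.log(obj.shape === obj2.shape);              // true
console.log(getProp(obj, "y"), getProp(obj2, "y")); // 100 93
```

Because both objects point at the same shape, code that has learned "y is slot 1" for one of them can reuse that fact for the other after a single pointer comparison.
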
  144. 144. Familiar Problems • We don’t know an object’s shape during JIT compilation • ... Or even that a property access receives an object! • Solution: leave code “blank,” lazily generate it later
  145. 145. Inline Caches function f(x) { return x.y; }
  146. 146. Inline Caches • Start as normal, guarding on type and loading data if (arg0.type != OBJECT) goto slow_property; JSObject *obj = arg0.data;
  147. 147. Inline Caches • No information yet: leave blank (nops). if (arg0.type != OBJECT) goto slow_property; JSObject *obj = arg0.data; goto property_ic;
  148. 148. Property ICs function f(x) { return x.y; } f({x: 5, y: 10, z: 30});
  149. 149. Property ICs • JSObject: Shape *shape; Value *slots; slots: 0 (Int32, 5), 1 (Int32, 10), 2 (Int32, 30) • Shape S1 (property name → slot number): x → 0, y → 1, z → 2
  153. 153. Inline Caches • Shape = S1, slot is 1 if (arg0.type != OBJECT) goto slow_property; JSObject *obj = arg0.data; goto property_ic;
  155. 155. Inline Caches • Shape = SHAPE1, slot is 1 if (arg0.type != OBJECT) goto slow_property; JSObject *obj = arg0.data; if (obj->shape != SHAPE1) goto property_ic; R0 = obj->slots[1].type; R1 = obj->slots[1].data;
  160. 160. Polymorphism function f(x) { return x.y; } f({ x: 5, y: 10, z: 30}); f({ y: 40}); • What happens if two different shapes pass through a property access?
  161. 161. Polymorphism • Main Method: if (arg0.type != OBJECT) goto slow_path; JSObject *obj = arg0.data; if (obj->shape != SHAPE1) goto property_ic; R0 = obj->slots[1].type; R1 = obj->slots[1].data; rejoin:
  162. 162. Polymorphism • Main Method: if (arg0.type != OBJECT) goto slow_path; JSObject *obj = arg0.data; if (obj->shape != SHAPE1) goto property_ic; R0 = obj->slots[1].type; R1 = obj->slots[1].data; rejoin: • Generated Stub: stub: if (obj->shape != SHAPE2) goto property_ic; R0 = obj->slots[0].type; R1 = obj->slots[0].data; goto rejoin;
  163. 163. Polymorphism • Main Method: if (arg0.type != OBJECT) goto slow_path; JSObject *obj = arg0.data; if (obj->shape != SHAPE1) goto stub; R0 = obj->slots[1].type; R1 = obj->slots[1].data; rejoin: • Generated Stub: stub: if (obj->shape != SHAPE2) goto property_ic; R0 = obj->slots[0].type; R1 = obj->slots[0].data; goto rejoin;
  164. 164. Chain of Stubs if (obj->shape == S1) { result = obj->slots[0]; } else { if (obj->shape == S2) { result = obj->slots[1]; } else { if (obj->shape == S3) { result = obj->slots[2]; } else { if (obj->shape == S4) { result = obj->slots[3]; } else { goto property_ic; } } } }
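
The growing chain of shape guards above can be simulated without generating machine code. In this sketch a list of `{ shape, slot }` records plays the role of the patched stubs; `makePropertyIC`, `makeObject` and the `stats` helper are hypothetical names, and shapes are plain objects mapping property names to slot numbers.

```javascript
function makeObject(shape, slots) {
  return { shape, slots };
}

// Shapes for { x, y, z } and { y }, matching the two calls on the slides.
const S1 = { x: 0, y: 1, z: 2 };
const S2 = { y: 0 };

// Builds an accessor for one property at one access site.
function makePropertyIC(name) {
  const stubs = []; // each entry acts as one generated stub
  let misses = 0;

  function get(obj) {
    for (const stub of stubs) { // chain of shape guards
      if (obj.shape === stub.shape) return obj.slots[stub.slot];
    }
    // Miss: do the slow dictionary lookup, then attach a new stub.
    misses++;
    const slot = obj.shape[name];
    if (slot === undefined) return undefined;
    stubs.push({ shape: obj.shape, slot });
    return obj.slots[slot];
  }

  get.stats = () => ({ stubs: stubs.length, misses });
  return get;
}

const getY = makePropertyIC("y");
console.log(getY(makeObject(S1, [5, 10, 30]))); // 10 (miss, stub added for S1)
console.log(getY(makeObject(S2, [40])));        // 40 (miss, stub added for S2)
console.log(getY(makeObject(S1, [1, 2, 3])));   // 2  (hit on the S1 stub)
console.log(getY.stats());                      // { stubs: 2, misses: 2 }
```

A real engine would cap the length of the chain; that limit is left out of this sketch.
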
  165. 165. Code Memory • Generated code is patched from inside the method • Self-modifying, but single threaded • Code memory is always rwx • Concerned about protection- flipping expense July 28, 2011 Lancaster, UK Thursday, July 28, 2011
  169. 169. Basic JIT Summary • Generates simple code to handle common cases • Inline caches adapt code based on runtime observations • Lots of guards, poor type information
  173. 173. Optimizing Harder • Single pass too limited: we want to perform whole-method optimizations • We could generate an IR, but without type information... • Slow paths prevent most optimizations.
  174. 174. Code Motion? function Add(x, n) { var sum = 0; for (var i = 0; i < n; i++) sum += x + n; return sum; } The addition looks loop invariant. Can we hoist it?
  175. 175. Code Motion? function Add(x, n) { var sum = 0; var temp0 = x + n; for (var i = 0; i < n; i++) sum += temp0; return sum; } Let’s try it.
  178. 178. Wrong Result var global = 0; var obj = { valueOf: function () { return ++global; } } Add(obj, 10); • Original code returns 155 • Hoisted version returns 110!
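The two results can be reproduced directly. In this sketch the names `addOriginal`, `addHoisted`, and `makeCounter` are introduced for illustration; each call gets a fresh counter object so the two runs do not share state.

```javascript
// Original loop: `x + n` is evaluated on every iteration, so
// x.valueOf() runs n times.
function addOriginal(x, n) {
  let sum = 0;
  for (let i = 0; i < n; i++) sum += x + n;
  return sum;
}

// Hoisted loop: `x + n` is evaluated once, so x.valueOf() runs once.
function addHoisted(x, n) {
  let sum = 0;
  const temp0 = x + n;
  for (let i = 0; i < n; i++) sum += temp0;
  return sum;
}

// An object whose numeric value changes every time it is read.
function makeCounter() {
  let count = 0;
  return { valueOf() { return ++count; } };
}

const original = addOriginal(makeCounter(), 10); // 11 + 12 + ... + 20
const hoisted = addHoisted(makeCounter(), 10);   // 11 * 10
console.log(original, hoisted); // 155 110
```

With plain numbers the two functions agree, which is why the hoist is only safe once the type of `x` is known.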
  179. 179. Idempotency • Untyped operations are usually not provably idempotent • Slow paths are re-entrant, can have observable side effects • This prevents code motion, redundancy elimination • Just ask the Chakra team...
  184. 184. Ideal Scenario • Actual knowledge about types! • Remove slow paths • Record and execute typed traces! (TraceMonkey) • Or even perform whole-method optimizations
  189. 189. Heavy Duty JITs • Make optimistic guesses about types • Guesses must be informed • Don’t want to waste time compiling code that can’t or won’t run • Using type inference or runtime profiling • Generate an IR, perform textbook compiler optimizations
  190. 190. Optimism Pays Off • People naturally write code as if it were typed • Variables and object fields that change types are rare
  195. 195. IonMonkey • Work in progress • Constructs high and low-level IRs • Applies type information using runtime feedback • Inlining, global value numbering (GVN), loop-invariant code motion (LICM), linear scan register allocation (LSRA)
  196. 196. function Add(x, y) { return x + y + x; } GETARG 0 ; fetch x GETARG 1 ; fetch y ADD GETARG 0 ; fetch x ADD RETURN
  197. 197. Build SSA GETARG 0 ; fetch x v0 = arg0 GETARG 1 ; fetch y v1 = arg1 ADD GETARG 0 ; fetch x ADD RETURN
  199. 199. Type Oracle • Need a mechanism to inform compiler about likely types • Type Oracle: given program counter, returns types for inputs and outputs • May use any data source – runtime profiling, type inference, etc.
  200. 200. Type Oracle • For this example, assume the type oracle returns “integer” for the output and both inputs to all ADD operations
  203. 203. Build SSA GETARG 0 ; fetch x v0 = arg0 GETARG 1 ; fetch y v1 = arg1 ADD v2 = iadd(v0, v1) GETARG 0 ; fetch x ADD v3 = iadd(v2, v0) RETURN v4 = return(v3)
  206. 206. SSA (Intermediate) v0 = arg0 (untyped) v1 = arg1 (untyped) v2 = iadd(v0, v1) (typed) v3 = iadd(v2, v0) (typed) v4 = return(v3)
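The translation from stack bytecode to SSA can be sketched as an abstract interpretation of the operand stack. Everything here is an assumption made for illustration: the bytecode encoding, the `buildSSA` helper, and a `typeOracle` that always answers "int32", matching the example's premise.

```javascript
// Hypothetical stack bytecode for: function Add(x, y) { return x + y + x; }
const bytecode = [
  { op: "GETARG", arg: 0 },
  { op: "GETARG", arg: 1 },
  { op: "ADD" },
  { op: "GETARG", arg: 0 },
  { op: "ADD" },
  { op: "RETURN" },
];

// Assumed oracle: every ADD is reported as an int32 addition.
const typeOracle = (pc) => "int32";

function buildSSA(code, numArgs, oracle) {
  const ssa = [];
  const emit = (text) => {
    const name = `v${ssa.length}`;
    ssa.push(`${name} = ${text}`);
    return name;
  };

  // Arguments become the first SSA values, so GETARG emits nothing new.
  const args = [];
  for (let i = 0; i < numArgs; i++) args.push(emit(`arg${i}`));

  const stack = []; // abstract operand stack holding SSA names
  code.forEach((ins, pc) => {
    switch (ins.op) {
      case "GETARG":
        stack.push(args[ins.arg]);
        break;
      case "ADD": {
        const rhs = stack.pop();
        const lhs = stack.pop();
        // Ask the oracle which flavor of addition to emit at this pc.
        const opName = oracle(pc) === "int32" ? "iadd" : "add";
        stack.push(emit(`${opName}(${lhs}, ${rhs})`));
        break;
      }
      case "RETURN":
        emit(`return(${stack.pop()})`);
        break;
      default:
        throw new Error(`unknown opcode ${ins.op}`);
    }
  });
  return ssa;
}

const ssa = buildSSA(bytecode, 2, typeOracle);
console.log(ssa.join("\n"));
```

The second `GETARG 0` produces no new value; it simply pushes `v0` again, which is why `v3` reads `iadd(v2, v0)`.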
  209. 209. Intermediate SSA • SSA does not type check • Type analysis: • Makes sure SSA inputs have correct type • Inserts conversions, value decoding
  210. 210. Unoptimized SSA v0 = arg0 v1 = arg1 v2 = unbox(v0, INT32) v3 = unbox(v1, INT32) v4 = iadd(v2, v3) v5 = unbox(v0, INT32) v6 = iadd(v4, v5) v7 = return(v6)
  212. 212. Optimization v0 = arg0 v1 = arg1 v2 = unbox(v0, INT32) v3 = unbox(v1, INT32) v4 = iadd(v2, v3) v5 = unbox(v0, INT32) v6 = iadd(v4, v5) v7 = return(v6) • v5 repeats the unbox of v0 already computed as v2, so v5 can be removed and its use in v6 replaced by v2
  213. 213. Optimized SSA v0 = arg0 v1 = arg1 v2 = unbox(v0, INT32) v3 = unbox(v1, INT32) v4 = iadd(v2, v3) v5 = iadd(v4, v2) v6 = return(v5)
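One way to get from the unoptimized to the optimized SSA is a simple value-numbering pass over straight-line code. The sketch below is an assumption about how such a pass could look, not IonMonkey's implementation; the instruction encoding and the helpers `eliminateRedundant` and `format` are hypothetical.

```javascript
// Each instruction has an op and args. Numeric args refer to earlier
// instructions by index; string args are literal tags such as "INT32".
const unoptimized = [
  { op: "arg0", args: [] },
  { op: "arg1", args: [] },
  { op: "unbox", args: [0, "INT32"] },
  { op: "unbox", args: [1, "INT32"] },
  { op: "iadd", args: [2, 3] },
  { op: "unbox", args: [0, "INT32"] },
  { op: "iadd", args: [4, 5] },
  { op: "return", args: [6] },
];

// Straight-line code only: no control flow or phi nodes are modeled.
function eliminateRedundant(instrs) {
  const seen = new Map(); // "op(args)" key -> new index of first occurrence
  const remap = [];       // old index -> new index
  const out = [];
  instrs.forEach((ins, oldIndex) => {
    // Rewrite operands first so duplicates of duplicates are also found.
    const args = ins.args.map((a) => (typeof a === "number" ? remap[a] : a));
    const key = `${ins.op}(${args.join(",")})`;
    const reusable = ins.op !== "return"; // "return" is always kept
    if (reusable && seen.has(key)) {
      remap[oldIndex] = seen.get(key);    // reuse the earlier value
      return;
    }
    remap[oldIndex] = out.length;
    if (reusable) seen.set(key, out.length);
    out.push({ op: ins.op, args });
  });
  return out;
}

function format(instrs) {
  return instrs.map((ins, i) => {
    const args = ins.args.map((a) => (typeof a === "number" ? `v${a}` : a));
    return args.length
      ? `v${i} = ${ins.op}(${args.join(", ")})`
      : `v${i} = ${ins.op}`;
  });
}

const optimized = format(eliminateRedundant(unoptimized));
console.log(optimized.join("\n"));
```

Dropping the second `unbox(v0, INT32)` is safe here because the first one dominates it: if the type guard was going to fail, it already failed.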
  214. 214. Register Allocation • Linear Scan Register Allocation • Based on Christian Wimmer’s work • Linear Scan Register Allocation for the Java HotSpot™ Client Compiler. Master's thesis, Institute for System Software, Johannes Kepler University Linz, 2004 • Linear Scan Register Allocation on SSA Form. In Proceedings of the International Symposium on Code Generation and Optimization, pages 170–179. ACM Press, 2010. • Same algorithm used in HotSpot JVM
  215. 215. Allocate Registers 0: stack arg0 1: stack arg1 2: r1 unbox(0, INT32) 3: r0 unbox(1, INT32) 4: r0 iadd(2, 3) 5: r0 iadd(4, 2) 6: r0 return(5)
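A heavily simplified linear scan over the optimized SSA gives the flavor of the allocation step. This sketch assumes two registers, keeps arguments on the stack, and does not model spilling, lifetime holes, or interval splitting, all of which the cited algorithm handles. The `linearScan` helper and the instruction encoding are hypothetical, and the registers it picks differ from the slide's while being equally valid.

```javascript
// Optimized SSA from the example; `uses` lists the values each one reads.
const ssaInstrs = [
  { op: "arg0", uses: [] },
  { op: "arg1", uses: [] },
  { op: "unbox", uses: [0] },
  { op: "unbox", uses: [1] },
  { op: "iadd", uses: [2, 3] },
  { op: "iadd", uses: [4, 2] },
  { op: "return", uses: [5] },
];

function linearScan(instrs, registers) {
  // The live interval of value i runs from its definition to its last use.
  const lastUse = instrs.map((_, i) => i);
  instrs.forEach((ins, i) => ins.uses.forEach((u) => { lastUse[u] = i; }));

  const free = [...registers];
  const active = [];   // { value, reg } pairs currently holding a register
  const location = [];
  instrs.forEach((ins, i) => {
    if (ins.op.startsWith("arg")) {
      location[i] = "stack"; // arguments stay in their stack slots
      return;
    }
    // Expire intervals whose last use is at or before this instruction,
    // so an input that dies here can donate its register to the result.
    for (let k = active.length - 1; k >= 0; k--) {
      if (lastUse[active[k].value] <= i) {
        free.push(active[k].reg);
        active.splice(k, 1);
      }
    }
    if (free.length === 0) {
      throw new Error("out of registers (spilling is not modeled)");
    }
    free.sort();
    const reg = free.shift(); // lowest-named free register
    active.push({ value: i, reg });
    location[i] = reg;
  });
  return location;
}

const allocation = linearScan(ssaInstrs, ["r0", "r1"]);
allocation.forEach((loc, i) => console.log(`${i}: ${loc} ${ssaInstrs[i].op}`));
```

The key property is that values whose intervals overlap, such as the two unboxed arguments, never share a register.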
  221. 221. Code Generation SSA: 0: stack arg0 1: stack arg1 2: r1 unbox(0, INT32) 3: r0 unbox(1, INT32) 4: r0 iadd(2, 3) 5: r0 iadd(4, 2) 6: r0 return(5) Native Code: if (arg0.type != INT32) goto BAILOUT; r1 = arg0.data; if (arg1.type != INT32) goto BAILOUT; r0 = arg1.data; r0 = r0 + r1; if (OVERFLOWED) goto BAILOUT; r0 = r0 + r1; if (OVERFLOWED) goto BAILOUT; return r0;
  225. 225. Generated Code if (arg0.type != INT32) goto BAILOUT; r1 = arg0.data; if (arg1.type != INT32) goto BAILOUT; r0 = arg1.data; r0 = r0 + r1; if (OVERFLOWED) goto BAILOUT; r0 = r0 + r1; if (OVERFLOWED) goto BAILOUT; return r0; • Bailout exits JIT • May recompile function
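The guard-then-bailout structure can be modeled in JavaScript itself. This is a sketch under assumptions: `Bailout`, `addSpecialized`, `addGeneric`, and `add` are hypothetical names, and the fallback simply reruns a generic version from the start, whereas a real engine resumes in less optimized code at the point of failure.

```javascript
const INT32_MIN = -(2 ** 31);
const INT32_MAX = 2 ** 31 - 1;

// True only for values an engine could hold as a 32-bit integer
// (so -0, fractions, and non-numbers are excluded).
const isInt32 = (v) =>
  Number.isInteger(v) && v >= INT32_MIN && v <= INT32_MAX && !Object.is(v, -0);

class Bailout extends Error {}

// Specialized path for x + y + x: two type guards, two overflow guards.
function addSpecialized(arg0, arg1) {
  if (!isInt32(arg0)) throw new Bailout("arg0 is not int32");
  const x = arg0;
  if (!isInt32(arg1)) throw new Bailout("arg1 is not int32");
  const y = arg1;
  let acc = x + y;
  if (acc > INT32_MAX || acc < INT32_MIN) throw new Bailout("overflow");
  acc = acc + x;
  if (acc > INT32_MAX || acc < INT32_MIN) throw new Bailout("overflow");
  return acc;
}

// Generic path: stands in for the interpreter or baseline code.
function addGeneric(x, y) {
  return x + y + x;
}

function add(x, y) {
  try {
    return { value: addSpecialized(x, y), path: "jit" };
  } catch (e) {
    if (!(e instanceof Bailout)) throw e;
    return { value: addGeneric(x, y), path: "bailout" };
  }
}

console.log(add(1, 2));          // stays on the specialized path
console.log(add(2147483647, 1)); // overflow guard fails
console.log(add("a", "b"));      // type guard fails
```

Every call returns the same value the generic code would; the guards only decide which path computes it.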
  226. 226. Guards Still Needed • Object shapes for reading properties • Types of… • Arguments • Values read from the heap • Values returned from C++
  227. 227. Guards Still Needed • Math • Integer Overflow • Divide by Zero • Multiply by -0 • NaN is falsey in JS, truthy in ISAs
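Each math guard corresponds to a case where JavaScript number semantics and machine integer semantics disagree. The checks below are a small illustration using only standard built-ins; the `checks` object is introduced here for the demonstration.

```javascript
const checks = {
  // Integer overflow: JS keeps the exact double, int32 math would wrap.
  overflowAsDouble: 2147483647 + 1,
  overflowWrapped: (2147483647 + 1) | 0,

  // Divide by zero: JS yields Infinity or NaN instead of trapping.
  divideByZero: 1 / 0,
  zeroOverZero: 0 / 0,

  // Multiply by -0: JS produces negative zero, an int32 multiply cannot.
  negativeZero: Object.is(0 * -1, -0),
  intMultiplyLosesSign: Object.is(Math.imul(0, -1), 0),

  // NaN is falsey in JS, but its bit pattern is not all zeros, so a
  // naive "is it non-zero" machine test would treat it as true.
  nanIsFalsey: Boolean(NaN) === false,
  nanBitsAreNonZero: new Uint32Array(new Float64Array([NaN]).buffer)
    .some((word) => word !== 0),
};
console.log(checks);
```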
  231. 231. Heavy Duty JITs • Full type speculation - no slow paths • Can move and eliminate code • If speculation fails, method is recompiled or deoptimized • Can still use techniques like inline caching!
  238. 238. To Infinity, and Beyond • Heavy-Duty JIT example has four guards: • Two type checks • Two overflow checks • Better than Basic JIT, but we can do better • Interval analysis can remove overflow checks • Type Inference!
  242. 242. Type Inference • Whole program analysis to determine types of variables • Hybrid: both static and dynamic • Replaces checks in JIT with checks in virtual machine • If VM breaks an assumption held in any JIT code, that JIT code is discarded
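The idea of moving checks out of JIT code and into the VM can be sketched with a tiny dependency registry. This is a hypothetical model, not the engine's design: `TypeSet` and `compileTimesTwo` are invented names, and `typeof` stands in for the engine's much finer-grained type tags.

```javascript
// The VM tracks every type ever stored in a variable. Compiled code
// registers itself as depending on that set staying unchanged.
class TypeSet {
  constructor() {
    this.types = new Set();
    this.dependents = new Set();
  }

  // Called by the VM on every write: this is the check that replaces
  // the guard inside the compiled code.
  record(value) {
    const type = typeof value;
    if (this.types.has(type)) return;
    this.types.add(type);
    // A new type breaks the assumption of every dependent compilation.
    for (const code of this.dependents) code.valid = false;
    this.dependents.clear();
  }
}

// Compile a guard-free fast path only when the variable has only ever
// held numbers; otherwise decline to compile.
function compileTimesTwo(typeSet) {
  if (typeSet.types.size !== 1 || !typeSet.types.has("number")) return null;
  const code = { valid: true, run: (v) => v * 2 };
  typeSet.dependents.add(code);
  return code;
}

const observed = new TypeSet();
observed.record(1);
observed.record(2);

const code = compileTimesTwo(observed);
const resultBefore = code.run(21);  // fast path, no type guard needed
const validBefore = code.valid;

observed.record("now a string");    // the VM sees a new type
const validAfter = code.valid;      // the compiled code was discarded
console.log(resultBefore, validBefore, validAfter);
```

The fast path carries no type check of its own; correctness depends on the VM invalidating it before any non-number value can reach it.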
  243. 243. Conclusions • The web needs JavaScript to be fast • Untyped, highly dynamic nature makes this challenging • Value boxing has different tradeoffs on different ISAs • GC needs to be fast. Lots of room for future research.
  244. 244. Conclusions • JITs getting more powerful • Type specialization is key • Still need checks and special cases on many operations
  245. 245. Onwards • The web is just getting warmed up • Fast JavaScript execution and new platform capabilities are pushing the boundaries of what's possible inside a web browser.
  247. 247. pdf.js • 12k lines of JS code • C++ poppler library closer to 200k lines of code • Excellent performance (GPU accelerated canvas API) • Unbeatable security story
  249. 249. dom.js • The DOM is the OS / API of the web platform • Traditionally implemented in C++. • Calling into the DOM requires expensive wrapping. • C++ code is re-entrant. Nightmare for any analysis.
  250. 250. dom.js • dom.js re-implements the DOM in JavaScript • Several new language features are being standardized for JS to make this possible (Proxies, WeakMaps) • Massive speedup for some operations (by using JS vs C++!)
  251. 251. Why should you care? • The web is displacing proprietary operating systems as the next application platform • A lot of academic research is still focused on Java/C++/etc. Those are obsolete technologies as far as the web is concerned
  252. 252. Research Opportunities • JavaScript still 1.5-10x behind C++ in some cases • Many unknowns in the area of GC, Debugging, Profiling, Static Analysis
  253. 253. Next Up: Boot to Web
  254. 254. Want to get involved? • The web is open technology. Anyone can help extend and improve it • Several universities participate in the JavaScript standards group (TC39), for example • Tom Van Cutsem’s JavaScript Proxies work already available to 300 million people. Soon also in Chrome
  255. 255. Questions?
