COSCUP 2015
ZongShen Shen
andy.zsshen@gmail.com
A Beginner’s Journey to Mozilla
SpiderMonkey JS Engine

(COSCUP 2015) A Beginner's Journey to Mozilla SpiderMonkey JS Engine


These are my slides from COSCUP 2015 in Taipei, Taiwan.

The material is an overview of the engine implementation, from a contributor with three months of experience working on mentored bugs.

Published in: Technology

  1. 1. COSCUP 2015 ZongShen Shen andy.zsshen@gmail.com A Beginner’s Journey to Mozilla SpiderMonkey JS Engine
  2. 2. Why Join SpiderMonkey • Explore a real language engine implementation • Good First Features that encourage beginners
  3. 3. About the Talk • Under the hood of engine implementation • Beginner’s view and experience sharing
  4. 4. Outline •Bytecode & Interpreter Basics •JIT Optimization
  5. 5. SpiderMonkey Overview NativeCode Bytecode JIT Compiler JS Source Compiler Interpreter CPU
  6. 6. SpiderMonkey Overview NativeCode Bytecode JIT Compiler JS Source Compiler Interpreter CPU Bytecode Generation
  7. 7. SpiderMonkey Overview NativeCode Bytecode JIT Compiler JS Source Compiler Interpreter CPU Bytecode Interpretation Bytecode Generation
  8. 8. SpiderMonkey Overview NativeCode Bytecode JIT Compiler JS Source Compiler Interpreter CPU Bytecode Interpretation Hot Code Optimization Native Code Execution Bytecode Generation
  9. 9. Bytecode Compiler • Lexical Analysis • Split the source script into token stream • Syntactic Analysis • Parse token stream and build Abstract Syntax Tree • Code Generation • Traverse the AST to emit bytecode
  10. 10. Lexical Analysis var x = y + z ; var a = b * c ; Variable Name Assignment Add Semicolon
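The lexical analysis step above can be sketched as a small regular-expression tokenizer. This is only an illustration, not SpiderMonkey's tokenizer: the token names reuse the labels on the slide, and `Mul`, `Skip`, `TOKEN_SPEC` and `tokenize` are names assumed here for the example.

```python
import re

# Token kinds loosely follow the slide's labels; "Mul" and "Skip" are
# assumptions added so that both example statements can be tokenized.
TOKEN_SPEC = [
    ("Variable", r"\bvar\b"),
    ("Name", r"[A-Za-z_$][A-Za-z0-9_$]*"),
    ("Assignment", r"="),
    ("Add", r"\+"),
    ("Mul", r"\*"),
    ("Semicolon", r";"),
    ("Skip", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(source):
    """Split a source string into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "Skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("var x = y + z ;")
```

The keyword pattern is listed before the general name pattern so that `var` is classified as a keyword rather than an identifier.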
  11. 11. Syntactic Analysis
      VarOrExprs → var Vars | Expr
      Vars → Var | Var , Vars
      Var → Id | Id = AssignExpr
      Expr → AssignExpr | AssignExpr , Expr
      AssignExpr → CondExpr | CondExpr AssignOp AssignExpr
      . . .
      AddExpr → MulExpr | MulExpr + AddExpr
      MulExpr → UnaryExpr | UnaryExpr * MulExpr
      . . .
      PrimaryExpr → ( Expr ) | Id | LitInt | LitFloat | LitString | false | true | null | this
      Recursive descent parsing: top to bottom, left to right
  12. 12. Syntactic Analysis Statement List Assignment Def : x BinaryAdd Use : y Use : z Assignment Def : a BinaryMultiply Use : b Use : c Result AST
  13. 13. Code Generation = = x y S z + a b c *
  14. 14. Code Generation = = x y S z + a b c * DefVar x BindName x
  15. 15. Code Generation = = x y S z + a b c * DefVar x BindName x GetName y
  16. 16. Code Generation = = x y S z + a b c * DefVar x BindName x GetName y GetName z
  17. 17. Code Generation = = x y S z + a b c * DefVar x BindName x GetName y GetName z Add
  18. 18. Code Generation = = x y S z + a b c * DefVar x BindName x GetName y GetName z Add SetName x
  19. 19. Code Generation = = x y S z + a b c * DefVar x DefVar a BindName x GetName y GetName z Add SetName x BindName a GetName b GetName c Mul SetName a
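The code generation walk shown on the slides above can be sketched as follows. The tuple layout of the AST and the names `program`, `BINARY_OPS` and `emit` are assumptions made for this illustration; the emitted mnemonics are the ones on the slide, not the engine's full opcode set.

```python
# AST for:  var x = y + z;  var a = b * c;
# Each statement is (target, operator, left, right); this layout is an
# assumption for the sketch, not SpiderMonkey's parse-node structure.
program = [
    ("x", "+", "y", "z"),
    ("a", "*", "b", "c"),
]

BINARY_OPS = {"+": "Add", "*": "Mul"}


def emit(statements):
    """Emit a flat bytecode list for a list of simple assignments."""
    code = []
    # Declarations come first, mirroring the DefVar lines on the slide.
    for target, _, _, _ in statements:
        code.append(("DefVar", target))
    for target, op, left, right in statements:
        code.append(("BindName", target))   # locate the binding to assign to
        code.append(("GetName", left))      # push the left operand
        code.append(("GetName", right))     # push the right operand
        code.append((BINARY_OPS[op],))      # pop two values, push the result
        code.append(("SetName", target))    # store the result
    return code


bytecode = emit(program)
```

Operands are emitted before the operator, so the instruction order is a post-order traversal of each expression tree.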
  20. 20. Bytecode Interpreter
      • Prepare the stack frame to interpret bytecode
      • Dispatch bytecode in a large switch statement
          INTERPRETER_LOOP ( )
            CASE ( JSOP_GETNAME ) { GetNameOperation( ) }
            CASE ( JSOP_ADD ) { AddOperation( ) }
            CASE ( JSOP_SETNAME ) { SetNameOperation( ) }
            ... ... More Handlers ... ...
          END_LOOP ( )
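The dispatch loop above can be sketched with a handler table standing in for the switch statement. The handler names, the `HANDLERS` table and `interpret` are hypothetical. Treating `BindName` as a no-op is a simplification made here, not something the slides state.

```python
def op_defvar(stack, env, name):
    env.setdefault(name, None)      # declare the variable if it is new


def op_bindname(stack, env, name):
    pass                            # simplification: binding lookup is a no-op here


def op_getname(stack, env, name):
    stack.append(env[name])         # push the variable's current value


def op_add(stack, env):
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)


def op_mul(stack, env):
    right = stack.pop()
    left = stack.pop()
    stack.append(left * right)


def op_setname(stack, env, name):
    env[name] = stack.pop()         # store the top of the stack


# The table plays the role of the CASE labels in the interpreter loop.
HANDLERS = {
    "DefVar": op_defvar, "BindName": op_bindname, "GetName": op_getname,
    "Add": op_add, "Mul": op_mul, "SetName": op_setname,
}


def interpret(code, env):
    """Run straight-line bytecode against a dict of variables."""
    stack = []
    for op, *operands in code:
        HANDLERS[op](stack, env, *operands)
    return env


code = [
    ("DefVar", "x"), ("BindName", "x"),
    ("GetName", "y"), ("GetName", "z"), ("Add",), ("SetName", "x"),
]
result = interpret(code, {"y": 1, "z": 2})
```

Every instruction pays for a table lookup and an indirect call, which is part of the overhead that the later JIT slides aim to remove.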
  21. 21. Interpretation Example
      function add (src, dst) { return src + dst; }
      add("coscup", 2015);
      Caller bytecode: GetName "add", Undefined, String "coscup", Int16 2015, Call 2
      Callee bytecode: GetArg 0, GetArg 1, Add, Return
  22. 22. GetName “add” Undefined String “coscup” Int16 2015 Call 2 GetArg 0 GetArg 1 Add Return Caller Callee Stack Frame Interpretation Example
  23. 23. GetName “add” Undefined String “coscup” Int16 2015 Call 2 GetArg 0 GetArg 1 Add Return JSVal: Func_add Caller Callee Stack Frame Interpretation Example
  24. 24. GetName “add” Undefined String “coscup” Int16 2015 Call 2 GetArg 0 GetArg 1 Add Return JSVal: Func_add JSVal: Undef Caller Callee Stack Frame Interpretation Example
  25. 25. GetName “add” Undefined String “coscup” Int16 2015 Call 2 GetArg 0 GetArg 1 Add Return JSVal: Func_add JSVal: Undef JSVal:“coscup” Caller Callee Stack Frame Interpretation Example
  26. 26. GetName “add” Undefined String “coscup” Int16 2015 Call 2 GetArg 0 GetArg 1 Add Return JSVal: Func_add JSVal: Undef JSVal:“coscup” JSVal: 2015 Caller Callee Stack Frame Interpretation Example
  27. 27. GetName “add” Undefined String “coscup” Int16 2015 Call 2 GetArg 0 GetArg 1 Add Return JSVal: Func_add JSVal: Undef JSVal:“coscup” JSVal: 2015 Caller Callee Stack Frame JSVal:“coscup” JSVal: 2015 Interpretation Example
  28. 28. GetName “add” Undefined String “coscup” Int16 2015 Call 2 GetArg 0 GetArg 1 Add Return JSVal: Func_add JSVal: Undef JSVal:“coscup” JSVal: 2015 JSVal:“coscup” Caller Callee Stack Frame JSVal:“coscup” JSVal: 2015 Interpretation Example
  29. 29. GetName “add” Undefined String “coscup” Int16 2015 Call 2 GetArg 0 GetArg 1 Add Return JSVal: Func_add JSVal: Undef JSVal:“coscup” JSVal: 2015 JSVal:“coscup” JSVal: 2015 Caller Callee Stack Frame JSVal:“coscup” JSVal: 2015 Interpretation Example
  30. 30. GetName “add” Undefined String “coscup” Int16 2015 Call 2 GetArg 0 GetArg 1 Add Return JSVal: Func_add JSVal: Undef JSVal:“coscup” JSVal: 2015 Caller Callee JSVal:“coscup2015” Stack Frame JSVal:“coscup” JSVal: 2015 Interpretation Example
  31. 31. GetName “add” Undefined String “coscup” Int16 2015 Call 2 GetArg 0 GetArg 1 Add Return JSVal: Func_add JSVal: Undef JSVal:“coscup” JSVal: 2015 Caller Callee JSVal:“coscup2015” Stack Frame Interpretation Example
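The stack-frame walkthrough above can be replayed with a toy interpreter. Everything here is an assumption for illustration: `run` and `js_add` are hypothetical names, the `+` semantics only cover strings and integers, and Python's `None` stands in for the JS `undefined` value pushed as the `this` argument.

```python
def js_add(left, right):
    """Simplified JS '+': concatenate if either side is a string."""
    if isinstance(left, str) or isinstance(right, str):
        return str(left) + str(right)
    return left + right


def run(code, args=(), global_names=None):
    """Interpret one function body; each call gets its own operand stack."""
    stack = []
    for op, *operands in code:
        if op == "GetName":
            stack.append(global_names[operands[0]])
        elif op == "Undefined":
            stack.append(None)
        elif op in ("String", "Int16"):
            stack.append(operands[0])
        elif op == "GetArg":
            stack.append(args[operands[0]])
        elif op == "Add":
            right = stack.pop()
            left = stack.pop()
            stack.append(js_add(left, right))
        elif op == "Call":
            argc = operands[0]
            call_args = [stack.pop() for _ in range(argc)][::-1]
            stack.pop()                 # the 'this' value (Undefined here)
            callee = stack.pop()        # the function value pushed by GetName
            stack.append(run(callee, call_args, global_names))
        elif op == "Return":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack.pop() if stack else None


callee_code = [("GetArg", 0), ("GetArg", 1), ("Add",), ("Return",)]
caller_code = [
    ("GetName", "add"), ("Undefined",), ("String", "coscup"), ("Int16", 2015), ("Call", 2),
]
result = run(caller_code, global_names={"add": callee_code})
```

As on the slides, the callee's result replaces the function value, the `this` value and the two arguments on the caller's stack.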
  32. 32. Performance Disadvantage • Immediate execution without proper redundancy elimination and task specialized optimization
  33. 33. Performance Disadvantage • Immediate execution without proper redundancy elimination and task specialized optimization Example Object Property Access Obj.Prop
  34. 34. JS Object
      var People = { Name : "Me", Age : 1, Gender : "M" };
      Property Access: People.Name, People.Age, People.Gender
      Property Value: "Me", 1, "M"
  35. 35. Object Internal • A list of shapes each of which • Represents a named property • A vector of slots each of which • Stores the value of the mapped property • A shape to describe its overall attributes Object Name “Me” Shape List SlotVectorAttr Shape Age Gender 1 “M”
  36. 36. Object Property Access • Object layout traversal 1. Search shape list to locate the target property shape 2. Access slot vector with the index found in the shape P1 Pi Pj Pn Object
  37. 37. Object Property Access • Object layout traversal 1. Search shape list to locate the target property shape 2. Access slot vector with the index found in the shape • To speed up traversal • Attach hash tables with some shapes for table indexing P1 Pi Pj Pn Object Pi Pj
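The two-step traversal above (search the shape list, then index the slot vector) can be sketched like this. The `Shape` and `JSObject` classes are simplified stand-ins invented for the example. The hash-table acceleration mentioned on the slide is left out.

```python
class Shape:
    """One named property; `slot` is its index in the owning object's slot vector."""

    def __init__(self, name, slot, parent=None):
        self.name = name
        self.slot = slot
        self.parent = parent    # previously added property, forming a linked list


class JSObject:
    def __init__(self):
        self.last_shape = None  # most recently added property shape
        self.slots = []         # property values, indexed by Shape.slot

    def add_property(self, name, value):
        self.last_shape = Shape(name, len(self.slots), self.last_shape)
        self.slots.append(value)

    def get_property(self, name):
        shape = self.last_shape
        while shape is not None:                 # step 1: search the shape list
            if shape.name == name:
                return self.slots[shape.slot]    # step 2: index the slot vector
            shape = shape.parent
        return None                              # stands in for JS undefined


people = JSObject()
people.add_property("Name", "Me")
people.add_property("Age", 1)
people.add_property("Gender", "M")
age = people.get_property("Age")
```

The lookup cost grows with the number of properties, which is why the slide mentions attaching hash tables to some shapes.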
  38. 38. Performance Gap
      AoT Compilation (direct access)
          struct Object { int Prop1; int Prop2; };
          int prop = obj -> Prop2;
          lea eax, obj
          mov ebx, [eax + 4]
      VS
      Interpretation (slow object layout traversal)
          var obj = { Prop1 : 1, Prop2 : 2, }
          var prop = obj.Prop2;
          GetName obj
          GetProp Prop2
  39. 39. Can we improve the performance? In addition to object property access, Still many issues…
  40. 40. Can we improve the performance? In addition to object property access, Still many issues… Interpretation JIT Compilation
  41. 41. JIT Compilation • Generate extremely fast native code • Baseline for hot methods • Inline cache to speed up dynamic property lookup • IonMonkey for very hot methods • Comprehensive optimization to remove redundancy
  42. 42. Inline Cache • Objective • Mitigate the overhead of object layout traversal for each single property access • Idea • Cache the resolved value after dynamic lookup • Emit a piece of direct access code for that value
  43. 43. Inline Cache var res = obj.prop; GetName “obj” GetProp “prop”
  44. 44. Inline Cache var res = obj.prop; GetName “obj” GetProp “prop” Dynamic lookup logic
  45. 45. Inline Cache • Efficient code for direct access • But if obj is modified, the code will be unsafe var res = obj.prop; GetName “obj” GetProp “prop” mov eax, obj mov eax, [eax + OfstSlot]
  46. 46. Direct Access Guard • If an object is modified by property insertion or deletion, its layout also changes • Executing the cached code may then cause an invalid access • A guard is needed to check for object modification • If the object remains the same, enter the cached code • Otherwise, fall back to dynamic lookup and reoptimize
  47. 47. Direct Access Guard • Benefit from object shape • Object has a shape to describe its overall attribute • The object shape is synchronized with its layout
  48. 48. Direct Access Guard • Benefit from object shape • Object has a shape to describe its overall attribute • The object shape is synchronized with its layout • Applying object shape to guard the cached code mov eax, obj cmp [eax + ShapeOfst], CachedShape
  49. 49. Inline Cache Instance Prologue mov eax, obj call VM_CallBack
  50. 50. Inline Cache Instance Prologue Interpreter Callback mov eax, obj call VM_CallBack 1. Resolve designated property
  51. 51. Inline Cache Instance Prologue Interpreter Callback mov eax, obj call VM_CallBack 1. Resolve designated property 2. Generate direct access code cmp [eax+ShapeOfst], CachedShape jne MISS mov eax, [eax+CachedSlotOfst] jmp EXIT MISS: call VM_CallBack EXIT: Cached code
  52. 52. Inline Cache Instance Prologue Interpreter Callback mov eax, obj 1. Resolve designated property 2. Generate direct access code 3. Modify original call site cmp [eax+ShapeOfst], CachedShape jne MISS mov eax, [eax+CachedSlotOfst] jmp EXIT MISS: call VM_CallBack EXIT: Cached code call VM_CallBack call Cached_Code
  53. 53. Inline Cache Instance Prologue Interpreter Callback mov eax, obj 1. Resolve designated property 2. Generate direct access code 3. Modify original call site 4. Jump to cached code cmp [eax+ShapeOfst], CachedShape jne MISS mov eax, [eax+CachedSlotOfst] jmp EXIT MISS: call VM_CallBack EXIT: Cached code call VM_CallBack call Cached_Code
  54. 54. Inline Cache Instance
      Prologue
          mov eax, obj
          call VM_CallBack  (rewritten to: call Cached_Code)
      Interpreter Callback
          1. Resolve designated property
          2. Generate direct access code
          3. Modify original call site
          4. Jump to cached code
      Cached code
          cmp [eax+ShapeOfst], CachedShape
          jne MISS
          mov eax, [eax+CachedSlotOfst]
          jmp EXIT
          MISS: call VM_CallBack
          EXIT:
      After code linking, the property access is direct as long as the shape has not changed.
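The guard-then-direct-access pattern above can be modelled in a few lines. This sketch rests on stated assumptions: a shared tuple of property names stands in for the object shape, an identity check stands in for the `cmp`, and `Obj`, `GetPropSite` and `slow_lookup` are hypothetical names. No machine code is generated.

```python
class Obj:
    def __init__(self, layout, slots):
        self.layout = layout    # tuple of property names; stands in for the shape
        self.slots = slots      # property values in layout order


def slow_lookup(obj, name):
    """Dynamic lookup: search the layout for the property's slot index."""
    return obj.layout.index(name)


class GetPropSite:
    """One property-access site with a single-entry (monomorphic) inline cache."""

    def __init__(self, name):
        self.name = name
        self.cached_layout = None
        self.cached_slot = None
        self.misses = 0

    def get(self, obj):
        if obj.layout is self.cached_layout:    # guard: compare shape with cached shape
            return obj.slots[self.cached_slot]  # hit: direct slot access
        self.misses += 1                        # miss: call back into the slow path
        slot = slow_lookup(obj, self.name)
        self.cached_layout = obj.layout         # re-point the cache at the new shape
        self.cached_slot = slot
        return obj.slots[slot]


LAYOUT = ("Prop1", "Prop2")
site = GetPropSite("Prop2")
first = site.get(Obj(LAYOUT, [1, 2]))
second = site.get(Obj(LAYOUT, [3, 4]))
```

Only the first access takes the slow path. Later objects that share the layout pass the guard and read the slot directly.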
  55. 55. What If ...
      var dog = { Name : "dog", Bow : function( ){ }, }
      var cat = { Name : "cat", Meow : function( ){ }, }
      for (var i = 0 ; i < 100 ; i++) { WhoAmI(dog); WhoAmI(cat); }
      function WhoAmI (obj) { return obj.Name; }
      The access site sees dog, cat, dog, cat, . . . : expensive cache and flush
  56. 56. Polymorphic IC
      • Cache multiple sets of object shapes and the resolved values
          cmp [eax+ShapeOfst], CachedShape1
          jne SHAPE2
          mov eax, [eax+CachedSlotOfst1]
          jmp EXIT
          SHAPE2: cmp [eax+ShapeOfst], CachedShape2
          jne SHAPE3
          mov eax, [eax+CachedSlotOfst2]
          jmp EXIT
          ………
          MISS: call VM_CallBack
          EXIT:
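The chain of shape checks above can be sketched as a list of cached (shape, slot) pairs that is scanned in order. `PolymorphicSite`, `Obj` and the `MAX_ENTRIES` limit are assumptions for this example. As before, a shared tuple of property names stands in for the object shape.

```python
class Obj:
    def __init__(self, layout, slots):
        self.layout = layout    # tuple of property names; stands in for the shape
        self.slots = slots


class PolymorphicSite:
    """A property-access site that caches several shapes instead of one."""

    MAX_ENTRIES = 4             # assumed limit; the real limit is engine-specific

    def __init__(self, name):
        self.name = name
        self.entries = []       # list of (layout, slot) pairs, checked in order
        self.misses = 0

    def get(self, obj):
        for layout, slot in self.entries:       # cmp shape / jne next-shape chain
            if obj.layout is layout:
                return obj.slots[slot]
        self.misses += 1                        # MISS: dynamic lookup
        slot = obj.layout.index(self.name)
        if len(self.entries) < self.MAX_ENTRIES:
            self.entries.append((obj.layout, slot))
        return obj.slots[slot]


DOG_LAYOUT = ("Name", "Bow")
CAT_LAYOUT = ("Name", "Meow")
dog = Obj(DOG_LAYOUT, ["dog", None])
cat = Obj(CAT_LAYOUT, ["cat", None])

site = PolymorphicSite("Name")
names = []
for _ in range(100):
    names.append(site.get(dog))
    names.append(site.get(cat))
```

With the dog/cat loop from the previous slide, a single-entry cache would miss on every call, while this version misses only twice.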
  57. 57. IonMonkey • Translate bytecode to static single assignment form (SSA) and build control flow graph • Apply data and control flow hybrid optimization • Translate optimized SSAs to native code
  58. 58. Warm up for basic terms…
  59. 59. Static Single Assignment
      • Each expression has at most 3 operands
      • Each target operand has a unique assignment
      Original Code    SSA Form
      X = 1            X1 = 1
      X = 2            X2 = 2
      Y = X + 1        Y1 = X2 + 1
      Z = 3            Z1 = 3
      Y = X + 2        Y2 = X2 + 2
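For straight-line code, the renaming above amounts to keeping a version counter per variable. The sketch below assumes simple `target = expression` lines with alphabetic variable names that are defined before use. `to_ssa` is a hypothetical helper. Real SSA construction also needs phi nodes at control-flow merges, which are out of scope here.

```python
import re


def to_ssa(lines):
    """Rename variables in straight-line assignments into SSA form."""
    version = {}    # variable name -> latest version number
    out = []

    def use(match):
        name = match.group()
        return f"{name}{version[name]}"     # a use refers to the latest definition

    for line in lines:
        target, expr = [part.strip() for part in line.split("=")]
        expr = re.sub(r"[A-Za-z_]+", use, expr)         # rename uses first
        version[target] = version.get(target, 0) + 1    # then create a new definition
        out.append(f"{target}{version[target]} = {expr}")
    return out


ssa = to_ssa(["X = 1", "X = 2", "Y = X + 1", "Z = 3", "Y = X + 2"])
```

Uses are renamed before the target's counter is bumped, so a statement such as `X = X + 1` would read the old version and define a new one.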
  60. 60. Control Flow Graph • The control flow relation among basic blocks • Basic block: consecutive instructions with the last one as control transfer [Diagram: blocks B1 to B5 ending in Goto or Cond, connected by T/F edges; instructions shown: X1 = 3, Y1 = A1+B1, Z1 = X1+3, V1 = A1+B1, W1 = B1-3, U1 = B1-3]
  61. 61. Let's start the optimizations…
  62. 62. Value Numbering • Eliminate redundant expressions X1 = A1 + B1 Y1 = 1 Z1 = A1 + B1 X1 = A1 + B1 Y1 = 1 Z1 = X1 • Often combined with other optimizations • Constant folding and propagation • Expression simplification • Unreachable code elimination
  63. 63. Value Numbering • Assign a hash value to each expression • An expression that has the same value as an earlier expression can be reduced • Same set of source values • Same operator, taking algebraic commutativity into account. Example: X1 = A1 + B1 and Z1 = B1 + A1 share the hash key (+,V1,V2) with value V3, so Z1 = X1
  64. 64. X1 = A1–B1 X2 = 3 Y1 = A1+B1 Z1 = 3 + 3 T1 = Z1+ 3 U1 = B1+A1 V1 = B1* 8 A1 B1 3 8 Operand V1 V2 V3 V4 ValueHash Key (A1) (B1) (3) (8) Local Scope
  65. 65. X1 = A1–B1 X2 = 3 Y1 = A1+B1 Z1 = 3 + 3 T1 = Z1+ 3 U1 = B1+A1 V1 = B1* 8 A1 B1 3 8 X1 Operand V1 V2 V3 V4 V5 ValueHash Key (A1) (B1) (3) (8) (-,V1,V2) Local Scope
  66. 66. X1 = A1–B1 X2 = 3 Y1 = A1+B1 Z1 = 3 + 3 T1 = Z1+ 3 U1 = B1+A1 V1 = B1* 8 A1 B1 3 8 X1 X2 Operand V1 V2 V3 V4 V5 V3 ValueHash Key (A1) (B1) (3) (8) (-,V1,V2) (V3) Local Scope
  67. 67. X1 = A1–B1 X2 = 3 Y1 = A1+B1 Z1 = 3 + 3 T1 = Z1+ 3 U1 = B1+A1 V1 = B1* 8 A1 B1 3 8 X1 X2 Y1 Operand V1 V2 V3 V4 V5 V3 V6 ValueHash Key (A1) (B1) (3) (8) (-,V1,V2) (V3) (+,V1,V2) Local Scope
  68. 68. X1 = A1–B1 X2 = 3 Y1 = A1+B1 Z1 = 3 + 3 T1 = Z1+ 3 U1 = B1+A1 V1 = B1* 8 Z1 = 6 A1 B1 3 8 6 X1 X2 Y1 Z1 Operand V1 V2 V3 V4 V7 V5 V3 V6 V7 ValueHash Key (A1) (B1) (3) (8) (6) (-,V1,V2) (V3) (+,V1,V2) (V7) Local Scope Constant Folding
  69. 69. X1 = A1–B1 X2 = 3 Y1 = A1+B1 Z1 = 3 + 3 T1 = Z1+ 3 U1 = B1+A1 V1 = B1* 8 Z1 = 6 T1 = 9 A1 B1 3 8 6 9 X1 X2 Y1 Z1 T1 Operand V1 V2 V3 V4 V7 V8 V5 V3 V6 V7 V8 ValueHash Key (A1) (B1) (3) (8) (6) (9) (-,V1,V2) (V3) (+,V1,V2) (V7) (V8) Local Scope Constant Folding Const Propagation
  70. 70. X1 = A1–B1 X2 = 3 Y1 = A1+B1 Z1 = 3 + 3 T1 = Z1+ 3 U1 = B1+A1 V1 = B1* 8 Z1 = 6 T1 = 9 U1 =Y1 A1 B1 3 8 6 9 X1 X2 Y1 Z1 T1 U1 Operand V1 V2 V3 V4 V7 V8 V5 V3 V6 V7 V8 V6 ValueHash Key (A1) (B1) (3) (8) (6) (9) (-,V1,V2) (V3) (+,V1,V2) (V7) (V8) (+,V1,V2) Local Scope Constant Folding Const Propagation
  71. 71. Local Scope: Constant Folding, Const Propagation, Expr Simplification
      Code          Rewritten as
      X1 = A1-B1
      X2 = 3
      Y1 = A1+B1
      Z1 = 3+3      Z1 = 6       (constant folding)
      T1 = Z1+3     T1 = 9       (const propagation)
      U1 = B1+A1    U1 = Y1
      V1 = B1*8     V1 = B1<<3   (expr simplification)

      Operand   Value   Hash Key
      A1        V1      (A1)
      B1        V2      (B1)
      3         V3      (3)
      8         V4      (8)
      6         V7      (6)
      9         V8      (9)
      X1        V5      (-,V1,V2)
      X2        V3      (V3)
      Y1        V6      (+,V1,V2)
      Z1        V7      (V7)
      T1        V8      (V8)
      U1        V6      (+,V1,V2)
      V1        V9      (<<,V2,V3)
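The local-scope walkthrough above can be approximated with a small hash-based value numbering pass. This is a sketch under stated assumptions, not IonMonkey's implementation. The instruction tuple format and `value_number` are hypothetical. The value numbers come out in a different order than on the slide. The `B1*8` to `B1<<3` simplification is not modelled.

```python
COMMUTATIVE = {"+", "*"}
FOLD = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}


def value_number(block):
    """Number a basic block given as (target, op, left, right) tuples.

    A plain copy or constant assignment uses op=None and right=None.
    Returns (number_of, constant): variable -> value number, and
    value number -> known constant.
    """
    table = {}      # hash key -> value number
    number_of = {}  # variable name -> value number
    constant = {}   # value number -> constant value, when known

    def lookup(key):
        if key not in table:
            table[key] = len(table) + 1
        return table[key]

    def operand(x):
        if isinstance(x, int):
            vn = lookup(("const", x))
            constant[vn] = x
            return vn
        if x not in number_of:              # live-in variable such as A1 or B1
            number_of[x] = lookup(("var", x))
        return number_of[x]

    for target, op, left, right in block:
        if op is None:
            number_of[target] = operand(left)
            continue
        l, r = operand(left), operand(right)
        if l in constant and r in constant:     # constant folding and propagation
            number_of[target] = operand(FOLD[op](constant[l], constant[r]))
            continue
        if op in COMMUTATIVE and l > r:         # canonical order for commutative ops
            l, r = r, l
        number_of[target] = lookup((op, l, r))
    return number_of, constant


block = [
    ("X1", "-", "A1", "B1"),
    ("X2", None, 3, None),
    ("Y1", "+", "A1", "B1"),
    ("Z1", "+", 3, 3),
    ("T1", "+", "Z1", 3),
    ("U1", "+", "B1", "A1"),
    ("V1", "*", "B1", 8),
]
numbers, constants = value_number(block)
```

`U1` receives the same value number as `Y1` because the operands of `+` are sorted before hashing, and `Z1` and `T1` resolve to the constants 6 and 9.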
  72. 72. Extend to Global Scope • Requires analysis of the dominance relation in the CFG • For exprs e1 and e2, e2 can be reduced if • e2 has the same value as e1 • e1 dominates e2 in the CFG, that is, all paths from the entry point to e2 must go through e1 • Examine basic blocks in reverse post order • This guarantees that dominating exprs are handled first
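Reverse post order and the dominance relation above can be computed with a depth-first search and a simple fixed-point iteration. The edge list below is an assumption inferred from the diagram on the following slides (B1 branches to B2 and B3, B2 branches to B4 and B5). `reverse_post_order` and `dominators` are hypothetical helper names.

```python
def reverse_post_order(cfg, entry):
    """Return blocks in reverse post order of a depth-first search from entry."""
    seen, order = set(), []

    def visit(block):
        seen.add(block)
        for succ in cfg[block]:
            if succ not in seen:
                visit(succ)
        order.append(block)     # post order: a block is appended after its successors

    visit(entry)
    return order[::-1]


def dominators(cfg, entry):
    """Map each reachable block to the set of blocks that dominate it."""
    blocks = reverse_post_order(cfg, entry)
    preds = {b: [] for b in blocks}
    for b in blocks:
        for succ in cfg[b]:
            preds[succ].append(b)

    dom = {b: set(blocks) for b in blocks}  # start from "everything dominates"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            # A block is dominated by itself plus whatever dominates all its predecessors.
            new = set.intersection(*(dom[p] for p in preds[b])) | {b}
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


# Assumed edges: B1 -> B2 (T), B1 -> B3 (F), B2 -> B4 (T), B2 -> B5 (F).
cfg = {"B1": ["B2", "B3"], "B2": ["B4", "B5"], "B3": [], "B4": [], "B5": []}
rpo = reverse_post_order(cfg, "B1")
dom = dominators(cfg, "B1")
```

Under these assumed edges, the traversal yields the same order as the slides list, and B5 does not dominate B4, so an expression in B5 cannot be reused in B4.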
  73. 73. Global Scope GotoCond X1 = 3 Y1 = A1+B1 Z1 = X1+ 3 T1 = A1 – B1 Z1 > 3 V1 = A1+B1 W1 = B1- 3 U1 = B1- 3 T F T F B1 B2 B3 B4 B5 • Dominating relation • B1 dominates B2,B3,B4,B5 • Reverse post order • B1, B3, B2, B5, B4 • In B1 • In B4
  75. 75. Global Scope GotoCond X1 = 3 Y1 = A1+B1 Z1 = X1+ 3 T1 = A1 – B1 Z1 > 3 V1 = A1+B1 W1 = B1- 3 U1 = B1- 3 T F T F B1 B2 B3 B4 B5 • Dominating relation • B1 dominates B2,B3,B4,B5 • Reverse post order • B1, B3, B2, B5, B4 • In B1 • Z1 = 6 • In B4
  76. 76. Global Scope Cond X1 = 3 Y1 = A1+B1 Z1 = X1+ 3 T1 = A1 – B1 Z1 > 3 V1 = A1+B1 W1 = B1- 3 U1 = B1- 3 T T F B1 B2 B4 B5 • Dominating relation • B1 dominates B2,B3,B4,B5 • Reverse post order • B1, B3, B2, B5, B4 • In B1 • Z1 = 6 • B3 is removed via UCE • In B4
  78. 78. Global Scope Cond X1 = 3 Y1 = A1+B1 Z1 = X1+ 3 T1 = A1 – B1 Z1 > 3 V1 = A1+B1 W1 = B1- 3 U1 = B1- 3 T T F B1 B2 B4 B5 • Dominating relation • B1 dominates B2,B3,B4,B5 • Reverse post order • B1, B3, B2, B5, B4 • In B1 • Z1 = 6 • B3 is removed via UCE • In B4 • V1 =Y1
  79. 79. Global Scope Cond X1 = 3 Y1 = A1+B1 Z1 = X1+ 3 T1 = A1 – B1 Z1 > 3 V1 = A1+B1 W1 = B1- 3 U1 = B1- 3 T T F B1 B2 B4 B5 • Dominating relation • B1 dominates B2,B3,B4,B5 • Reverse post order • B1, B3, B2, B5, B4 • In B1 • Z1 = 6 • B3 is removed via UCE • In B4 • V1 =Y1 • W1 cannot be simplified
  80. 80. Loop Invariant Code Motion • Hoist the loop invariant exprs outside the loop • For a loop invariant expression x = y + z • y and z should not depend on the operands defined in the loop
  81. 81. Loop Invariant Code Motion X1 = A1+B1 Y1 = X1+ 3 Z1 =Y1+ A1 T1 = A1- B1 U1 =T1+ 3 V1 =Y1+ U1 • Invariant expressions • e1: Y1 = X1 + 3 • e2: T1 = A1 – B1 • Hoist e1 and e2 from B3 to B1 B1 B2 B3 V1 < 100
  82. 82. Loop Invariant Code Motion X1 = A1+B1 Y1 = X1+ 3 T1 = A1-B1 Z1 =Y1+ A1 U1 =T1+ 3 V1 =Y1+ U1 • Invariant expressions • e1: Y1 = X1 + 3 • e2: T1 = A1 – B1 • Hoist e1 and e2 from B3 to B1 B1 B2 B3 V1 < 100
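The hoisting shown above can be sketched as one pass over the loop body that applies the rule from the first slide of this section: an expression is hoisted when neither operand is defined inside the loop. The tuple format and `hoist_invariants` are assumptions for the example. A production pass would also check for side effects and would iterate, since hoisting one expression can make others invariant.

```python
def hoist_invariants(preheader, body):
    """Move loop-invariant instructions from the loop body to the preheader.

    Instructions are (target, op, left, right) tuples. An instruction is
    treated as invariant when neither operand is defined inside the loop
    body. This is a single pass, matching the slide's example.
    """
    defined_in_loop = {target for target, _, _, _ in body}
    hoisted, remaining = [], []
    for instr in body:
        _, _, left, right = instr
        if left in defined_in_loop or right in defined_in_loop:
            remaining.append(instr)     # depends on a value computed in the loop
        else:
            hoisted.append(instr)       # operands come from outside the loop
    return preheader + hoisted, remaining


preheader = [("X1", "+", "A1", "B1")]
body = [
    ("Y1", "+", "X1", 3),
    ("Z1", "+", "Y1", "A1"),
    ("T1", "-", "A1", "B1"),
    ("U1", "+", "T1", 3),
    ("V1", "+", "Y1", "U1"),
]
new_preheader, new_body = hoist_invariants(preheader, body)
```

The result has `Y1` and `T1` in the preheader and `Z1`, `U1` and `V1` left in the loop, matching the after-hoisting slide.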
  83. 83. More Optimizations • SSA and control flow optimizations • Dead code elimination • Value range analysis • Loop unrolling • And more . . . • Native code generation • Linear scan register allocation • And more . . .
  84. 84. Conclusion • Under the hood of SpiderMonkey • General but slow bytecode interpretation • Two-level JIT optimization for hot code
  85. 85. About Me Security Researcher from DSNS Lab @ NCTU • Interests • Virtual Machine • Binary Translation • Current Work • Android Code Obfuscation • App Protection
  86. 86. Thanks for Listening
