Virtual Machine & JavaScript Engine
               @nwind
(HLL) Virtual Machine
Take the red pill
I will show you the rabbit hole.
Virtual Machine history
•   pascal 1970

•   smalltalk 1980

•   self 1986

•   python 1991

•   java 1995

•   javascript...
The Smalltalk demonstration showed three amazing features. One
was how computers could be networked; the second was how
ob...
How Does a Virtual Machine Work?

•   Parser

•   Intermediate Representation (IR)

•   Interpreter

•   Garbage Collection

•   ...
Parser


•   Tokenize

•   AST
Tokenize
                    identifier           number


keyword
          var foo = 10;                          semicol...
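The token classes labeled on this slide can be reproduced with a small regular-expression tokenizer. This is a minimal sketch, not how any real engine tokenizes: the function name `tokenize`, the keyword set, and the token type names are assumptions chosen to match the slide labels, and whitespace is skipped rather than emitted as a "space" token.

```javascript
// Hypothetical mini tokenizer; it only handles the subset needed for `var foo = 10;`.
const KEYWORDS = new Set(["var", "function", "return"]);

function tokenize(source) {
  const tokens = [];
  // One alternative per token class; the sticky flag anchors each match at `pos`.
  const pattern = /\s+|([A-Za-z_$][\w$]*)|(\d+)|(=)|(;)/y;
  let pos = 0;
  while (pos < source.length) {
    pattern.lastIndex = pos;
    const m = pattern.exec(source);
    if (m === null) throw new SyntaxError("Unexpected character at " + pos);
    pos = pattern.lastIndex;
    if (m[1] !== undefined) {
      tokens.push({ type: KEYWORDS.has(m[1]) ? "keyword" : "identifier", value: m[1] });
    } else if (m[2] !== undefined) {
      tokens.push({ type: "number", value: m[2] });
    } else if (m[3] !== undefined) {
      tokens.push({ type: "equal", value: "=" });
    } else if (m[4] !== undefined) {
      tokens.push({ type: "semicolon", value: ";" });
    }
    // Otherwise the match was whitespace, which is dropped.
  }
  return tokens;
}
```

Calling `tokenize("var foo = 10;")` yields five tokens in source order: keyword, identifier, equal, number, semicolon.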
AST

               Assign




Variable foo            Constant 10
{
            AST demo (Esprima)
    "type": "Program",
    "body": [
        {
            "type": "VariableDeclaration",...
Intermediate Representation


•   Bytecode

•   Stack vs. register
Bytecode (SpiderMonkey)
                      00000:   deffun 0 null
                      00005:   nop
                  ...
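To connect the AST slide to a bytecode listing like the one above, here is a rough sketch of lowering an ESTree-style expression (the shape Esprima produces) to stack bytecode. The opcode names `getarg`, `push`, and `add` are illustrative assumptions loosely modeled on the listing, not SpiderMonkey's actual instruction set, and only `+` is supported.

```javascript
// Emits hypothetical stack opcodes for an ESTree-style expression node.
// `params` lists the enclosing function's parameter names.
function compile(node, params, out = []) {
  switch (node.type) {
    case "Identifier": {
      const index = params.indexOf(node.name);
      if (index < 0) throw new Error("unknown variable " + node.name);
      out.push(["getarg", index]);
      break;
    }
    case "Literal":
      out.push(["push", node.value]);
      break;
    case "BinaryExpression":
      // Post-order walk: operands first, then the operator.
      compile(node.left, params, out);
      compile(node.right, params, out);
      if (node.operator !== "+") throw new Error("unsupported operator " + node.operator);
      out.push(["add"]);
      break;
    default:
      throw new Error("unsupported node type " + node.type);
  }
  return out;
}

// AST for the expression `bar + 1` inside `function foo(bar)`.
const barPlusOne = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Identifier", name: "bar" },
  right: { type: "Literal", value: 1 },
};
const bytecode = compile(barPlusOne, ["bar"]);
```

The post-order traversal is what makes the output a valid stack program: both operands are on the stack by the time `add` runs.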
Bytecode (JSC)
                      8 m_instructions; 168 bytes at 0x7fc1ba3070e0;
                      1 parameter(s); ...
Stack vs. register
•   Stack

    •   JVM, .NET, PHP, Python, older JavaScript engines

•   Register

    •   Lua, Dalvik, Al...
Stack vs. register
                                                local a,t,i    1:   PUSHNIL      3
                    ...
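The difference between the two instruction styles can be sketched with two toy machines running `a = a + i`. This is an illustration only: the opcode names echo the listing on the slide, but the tuple encoding and the functions `runStack` and `runRegister` are assumptions made for this example.

```javascript
// Stack machine: operands are pushed and popped implicitly.
function runStack(code, locals) {
  const stack = [];
  for (const [op, arg] of code) {
    switch (op) {
      case "GETLOCAL": stack.push(locals[arg]); break;
      case "ADD": {
        const right = stack.pop();
        const left = stack.pop();
        stack.push(left + right);
        break;
      }
      case "SETLOCAL": locals[arg] = stack.pop(); break;
      default: throw new Error("unknown opcode " + op);
    }
  }
  return locals;
}

// Register machine: each instruction names its destination and both operands.
function runRegister(code, regs) {
  for (const [op, dst, a, b] of code) {
    switch (op) {
      case "ADD": regs[dst] = regs[a] + regs[b]; break;
      default: throw new Error("unknown opcode " + op);
    }
  }
  return regs;
}

// a = a + i, with `a` in slot 0 and `i` in slot 2.
const stackCode = [["GETLOCAL", 0], ["GETLOCAL", 2], ["ADD"], ["SETLOCAL", 0]];
const registerCode = [["ADD", 0, 0, 2]];
```

The stack version needs four instructions where the register version needs one, at the cost of wider instructions that carry three operand fields.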
Interpreter


•   Switch statement

•   Direct threading, Indirect threading, Token threading ...
Switch statement
while (true) {
  switch (opcode) {   mov    %edx,0xffffffffffffffe4(%rbp)
    case ADD:         cmpl   $0...
Direct threading
typedef void *Inst;                      mov       0xffffffffffffffe8(%rbp),%rdx
Inst program[] = { &&ADD...
Garbage Collection
•   Reference counting (php, python ...), smart pointer

•   Tracing

    •   Stop the world

    •   C...
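As an illustration of the tracing family, here is a minimal stop-the-world mark-and-sweep sketch over a toy heap. The heap model (objects with a `refs` array and a `marked` flag) and the function names are assumptions for this example; real collectors work on raw memory and object headers.

```javascript
// Hypothetical heap model: each object is { name, refs: [...], marked: false }.
function mark(root) {
  // Iterative marking with an explicit worklist to avoid deep recursion.
  const worklist = [root];
  while (worklist.length > 0) {
    const current = worklist.pop();
    if (current.marked) continue;
    current.marked = true;
    for (const child of current.refs) worklist.push(child);
  }
}

function collect(heap, roots) {
  for (const root of roots) mark(root);
  // Sweep: keep marked objects and clear their mark bit for the next cycle.
  const survivors = heap.filter((obj) => obj.marked);
  for (const obj of survivors) obj.marked = false;
  return survivors;
}

// a -> b is reachable from the root; c <-> d is an unreachable cycle.
const a = { name: "a", refs: [], marked: false };
const b = { name: "b", refs: [], marked: false };
const c = { name: "c", refs: [], marked: false };
const d = { name: "d", refs: [], marked: false };
a.refs.push(b);
c.refs.push(d);
d.refs.push(c);
const live = collect([a, b, c, d], [a]);
```

The `c`/`d` cycle is reclaimed because tracing only follows references from the roots; plain reference counting would keep both objects alive.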
Precise vs. conservative
•   Conservative

    •   If it looks like a pointer, treat it as a pointer

    •   Might have m...
It is time for the DARK
          Magic
Optimization Magic
•   Interpreter optimization

•   Compiler optimization

•   JIT

•   Type inference

•   Hidden Type

...
Interpreter optimization
Why is switch dispatch inefficient?
CPU Pipeline


•   Fetch, Decode, Execute, Write-back

•   Branch prediction
http://en.wikipedia.org/wiki/File:Pipeline,_4_stage.svg
Solution: Inline Threading
ICONST_1_START: *sp++ = 1;
ICONST_1_END: goto **(pc++);
INEG_START: sp[-1] = -sp[-1];
INEG_END:...
Compiler optimization
Compiler optimization

•   SSA

•   Data-flow

•   Control-flow

•   Loop

•   ...
What a JVM can do...
compiler tactics                          language-specific techniques              loop transformati...
Just-In-Time (JIT)
JIT


•   Method JIT, Trace JIT, Regular expression JIT

•   Code generation

•   Register allocation
How does a JIT work?

•   mmap/new/malloc (mprotect)

•   generate native code

•   c cast/reinterpret_cast

•   call the function
Trampoline (JSC x86)
                                                        asm (
                                       ...
Register allocation


•   Linear scan

•   Graph coloring
Code generation


•   Pipelining

•   SIMD (SSE2, SSE3 ...)

•   Debug
Type inference
a+b
Property access
“foo.bar”
foo.bar in C


00001f63   movl  %ecx,0x04(%edx)
__ZN2v88internal7HashMap6LookupEPvjb:
00000338   pushl %ebp
00000339   pushl %ebx
0000033a   pushl %edi
0000033b
00000...
How to optimize?
Hidden Type
                         add property x




 then add property y




                       http://code.google...
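The transition idea on this slide can be sketched as follows. It is a simplified model, not V8's implementation: the classes `HiddenClass` and `DynamicObject` and their fields are hypothetical names introduced for this example.

```javascript
// Each hidden class maps property names to slot indexes and remembers
// which hidden class to move to when a given property is added.
class HiddenClass {
  constructor(offsets = new Map()) {
    this.offsets = offsets;        // property name -> slot index
    this.transitions = new Map();  // property name -> next HiddenClass
  }
  addProperty(name) {
    if (!this.transitions.has(name)) {
      const next = new Map(this.offsets);
      next.set(name, this.offsets.size);
      this.transitions.set(name, new HiddenClass(next));
    }
    return this.transitions.get(name);
  }
}

const ROOT_CLASS = new HiddenClass();

class DynamicObject {
  constructor() {
    this.hiddenClass = ROOT_CLASS;
    this.slots = [];
  }
  set(name, value) {
    if (!this.hiddenClass.offsets.has(name)) {
      this.hiddenClass = this.hiddenClass.addProperty(name);
    }
    this.slots[this.hiddenClass.offsets.get(name)] = value;
  }
  get(name) {
    return this.slots[this.hiddenClass.offsets.get(name)];
  }
}

const p1 = new DynamicObject(); p1.set("x", 1); p1.set("y", 2);
const p2 = new DynamicObject(); p2.set("x", 3); p2.set("y", 4);
const p3 = new DynamicObject(); p3.set("y", 5); p3.set("x", 6);
```

Objects that add the same properties in the same order (`p1`, `p2`) end up sharing one hidden class, so a property becomes a fixed slot offset. `p3` adds them in the opposite order and lands on a different class.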
But nothing is perfect
one secret in V8 hidden class




                                          20 times
                                    ...
in Figure 5, reads are far more common than writes: over all
                                                             ...
Optimize method call
bar can be anything



function foo(bar) {
    return bar.pro();
}
Adaptive optimization for Self
Polymorphic inline cache
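A polymorphic inline cache for a call site such as `bar.pro()` can be sketched like this. It is a conceptual model under stated assumptions: receivers carry an explicit `shape` tag standing in for a hidden class, and `makeCallSite`, `lookup`, and `maxEntries` are hypothetical names. Real engines patch machine code instead of scanning an array.

```javascript
// Builds one call site with its own small cache of (shape -> target) entries.
function makeCallSite(lookup, maxEntries = 4) {
  const cache = [];
  let misses = 0;
  function call(receiver) {
    for (const entry of cache) {
      // Fast path: a cheap shape check replaces the full method lookup.
      if (entry.shape === receiver.shape) return entry.target(receiver);
    }
    // Slow path: do the full lookup and remember the result for this shape.
    misses++;
    const target = lookup(receiver);
    if (cache.length < maxEntries) cache.push({ shape: receiver.shape, target });
    return target(receiver);
  }
  call.misses = () => misses;
  return call;
}

const slowLookup = (receiver) => receiver.methods.pro;
const callPro = makeCallSite(slowLookup);

const objA = { shape: "A", methods: { pro: () => "from A" } };
const objB = { shape: "B", methods: { pro: () => "from B" } };
const results = [callPro(objA), callPro(objA), callPro(objB), callPro(objA)];
```

Four calls cause only two slow lookups, one per distinct shape seen at the call site.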
Tagged pointer
Tagged pointer
typedef union {
  void *p;
  double d;
  long l;
} Value;

typedef struct {
  unsigned char type;    sizeof...
Tagged pointer

   On almost all systems, pointer addresses are aligned (to 4 or 8 bytes)


“The address of a block retu...
Tagged pointer

Example: 0xc00ab958               the pointer’s last 2 or 3 bits must be 0



                1     0   0 ...
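The low-bit trick can be sketched with plain 32-bit integer operations. The tag assignment here (low bit 1 for a small integer, low bit 0 for an aligned pointer) follows the example on the slide, but the function names are assumptions, and a small integer then has only 31 bits of payload.

```javascript
// A "word" is treated as a 32-bit value. Aligned pointers always end in 0,
// so a 1 in the lowest bit can safely mark a small integer.
function tagInt(n) {
  return (n << 1) | 1;       // shift the payload up and set the tag bit
}
function isInt(word) {
  return (word & 1) === 1;
}
function isPointer(word) {
  return (word & 1) === 0;
}
function untagInt(word) {
  return word >> 1;          // arithmetic shift restores the sign
}
```

For example, `isPointer(0xc00ab958)` is true and `isInt(0xc00ab959)` is true, while `untagInt(tagInt(-5))` gives back `-5` with no heap allocation involved.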
How about double?
NaN-tagging (JSC 64 bit)
On a 64-bit system, pointers use only 48 bits, so the top 16 bits are 0

           * Th...
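The 64-bit layout can be modeled with `BigInt` values. This is a sketch under assumptions: doubles are moved out of the `0000` pointer range by adding 2^48, which yields the `0001`..`FFFE` range shown on the slide, but the constant and all function names here are illustrative and should not be read as JSC's exact code.

```javascript
const DOUBLE_OFFSET = 1n << 48n;            // assumed offset; shifts doubles above the pointer range
const INT_TAG = 0xffff000000000000n;        // top 16 bits FFFF mark a 32-bit integer
const MASK64 = (1n << 64n) - 1n;
const scratch = new DataView(new ArrayBuffer(8));

function encodeDouble(d) {
  // Relies on NaN being stored in canonical form (top bits 7FF8),
  // otherwise adding the offset could overflow into another range.
  scratch.setFloat64(0, d);
  return (scratch.getBigUint64(0) + DOUBLE_OFFSET) & MASK64;
}
function decodeDouble(v) {
  scratch.setBigUint64(0, (v - DOUBLE_OFFSET) & MASK64);
  return scratch.getFloat64(0);
}
function encodeInt32(i) {
  return INT_TAG | BigInt(i >>> 0);         // low 32 bits hold the integer
}
function decodeInt32(v) {
  return Number(v & 0xffffffffn) | 0;       // reinterpret as signed 32-bit
}
function kindOf(v) {
  const top = v >> 48n;
  if (top === 0n) return "pointer";
  if (top === 0xffffn) return "integer";
  return "double";
}
```

A 48-bit address such as `0x7fc1ba3070e0n` classifies as a pointer without any change, which is the point of the scheme: pointers need no decoding.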
V8
V8

•   Lars Bak

•   Hidden Class, PICs

•   Built-in objects written in JavaScript

•   Crankshaft

•   Precise generati...
Lars Bak

•   has implemented VMs since 1988

•   Beta

•   Self

•   HotSpot
Source code   Native Code



              High-Level IR    Low-Level IR   Opt Native Code



                    }   Cran...
Hotspot client compiler
Crankshaft

•   Profiling

•   Compiler optimization

•   On-stack replacement

•   Deoptimize
High-Level IR (Hydrogen)
     •   function inlining

     •   type inference

     •   stack check elimination

     •   loo...
Low-Level IR (Lithium)


  •   linear-scan register allocator

  •   code generation

  •   lazy deoptimization




http://w...
Built-in objects written in JS
function ArraySort(comparefn) {
  if (IS_NULL_OR_UNDEFINED(this) && !IS_UNDETECTABLE(this))...
GC
V8 performance
Can V8 be faster?
Dart
•   Clear syntax, Optional types, Libraries

•   Performance

•   Can compile to JavaScript

•   But IE, WebKit and M...
Embed V8
Embed
Expose Function
v8::Handle<v8::Value> Print(const v8::Arguments& args) {
  for (int i = 0; i < args.Length(); i++) {
    v...
Node.JS
•   Pros                              •   Cons

    •   Async                             •   Lack of great librar...
JavaScriptCore (Nitro)
Where does it come from?
1997 Macworld
“Apple has decided to make Internet Explorer its default browser
on Macintosh.”

“Since we believe in choice, we’re going to...
JavaScriptCore History
•   2001 KJS (kde-2.2)       •   2008 SquirrelFish Extreme

    •   Bison                    •   PI...
Interpreter




AST   Bytecode   Method JIT




                 SSA           DFG JIT
SpiderMonkey
Monkey
•   SpiderMonkey                  •   JägerMonkey

    •   Written by Brendan Eich       •   PICs

    •   interpre...
IonMonkey
•   SSA

•   function inlining

•   linear-scan register allocation

•   dead code elimination

•   loop-invariant...
http://www.arewefastyet.com/
Chakra (IE9)
Chakra

•   Interpreter/JIT

•   Type System (hidden class)

•   PICs

•   Deferred parsing

•   Uses UTF-8 internally
Unlocking the JavaScript Opportunity with Internet Explorer 9
Unlocking the JavaScript Opportunity with Internet Explorer 9
Carakan (Opera)
Carakan

•   Register VM

•   Method JIT, Regex JIT

•   Hidden type

•   Function inlining
Rhino and JVM
Rhino is SLOW, why?
Because JVM is slow?
The JVM didn’t support dynamic
     languages well
Solution: invokedynamic
Hard to optimize in JVM




Before   Caller    Some tricks              Method




                  Invokedynamic
After  ...
One ring to rule them all?
Rhino + invokedynamic
•   Pros                               •   Cons

    •   Easier to implement                •   Only...
Compiler optimization is
       HARD
Is there an easy way?
LLVM
LLVM
•   Clang, VMKit, GHC, PyPy, Rubinius ...

•   DragonEgg: replace GCC back-end

•   IR

•   Optimization

•   Link, C...
LLVM simplify
define i32 @foo(i32 %bar) nounwind ssp {
                        entry:
                          %bar_addr = alloca i32, ...
define i32 @foo(i32 %bar) nounwind ssp {
entry:
  %bar_addr = alloca i32, align 4
  %retval = alloca i32
  %0 = alloca i32...
Optimization (70+)




    http://llvm.org/docs/Passes.html
define i32 @foo(i32 %bar) nounwind readnone ssp {
entry:
  %0 = add nsw i32 %bar, 1
  ret i32 %0
}
                       ...
exe &
                                             Libraries                             LLVM
                    LLVM
   ...
LLVM on JavaScript
Emscripten


•   C/C++ to LLVM IR

•   LLVM IR to JavaScript

•   Run on browser
...

                                                    function _foo($bar) {
define i32 @foo(i32 %bar) nounwind readnone...
Emscripten demo

•   Python, Ruby, Lua virtual machine (http://repl.it/)

•   OpenJPEG

•   Poppler

•   FreeType

•   ......
Performance? good enough!

    benchmark             SM      V8      gcc     ratio     two Ja
    fannkuch (10)        1.1...
JavaScript on LLVM
Fabric Engine

•   JavaScript Integration

•   Native code compilation (LLVM)

•   Multi-threaded execution

•   OpenGL Re...
Fabric Engine




http://fabric-engine.com/2011/11/server-performance-benchmarks/
Conclusion?
All problems in computer science can be solved
by another level of indirection


                               David Whee...
References
•   The behavior of efficient virtual   •   Context Threading: A Flexible
    machine interpreters on           ...
References
•   Design of the Java HotSpotTM    •   LLVM: A Compilation
    Client Compiler for Java 6          Framework f...
References
•   Adaptive Optimization for SELF   •   Design, Implementation, and
                                         E...
References
•   Representing Type Information      •   The Structure and Performance
    in Dynamically Typed              ...
Thank you
An introduction to virtual machine and JavaScript engine implementation.

Virtual machine and javascript engine

  1. Virtual Machine & JavaScript Engine @nwind
  2. (HLL) Virtual Machine
  3. Take the red pill I will show you the rabbit hole.
  4. Virtual Machine history • pascal 1970 • smalltalk 1980 • self 1986 • python 1991 • java 1995 • javascript 1995
  5. The Smalltalk demonstration showed three amazing features. One was how computers could be networked; the second was how object-oriented programming worked. But Jobs and his team paid little attention to these attributes because they were so amazed by the third feature, ...
  6. How Does a Virtual Machine Work? • Parser • Intermediate Representation (IR) • Interpreter • Garbage Collection • Optimization
  7. Parser • Tokenize • AST
  8. Tokenize identifier number keyword var foo = 10; semicolon space equal
  9. AST Assign Variable foo Constant 10
  10. AST demo (Esprima) for var foo = bar + 1; { "type": "Program", "body": [ { "type": "VariableDeclaration", "declarations": [ { "id": { "type": "Identifier", "name": "foo" }, "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Identifier", "name": "bar" }, "right": { "type": "Literal", "value": 1 } } } ], "kind": "var" } ] } http://esprima.org/demo/parse.html
  11. Intermediate Representation • Bytecode • Stack vs. register
  12. Bytecode (SpiderMonkey) Source: function foo(bar) { return bar + 1; } foo(2); Bytecode: 00000: deffun 0 null 00005: nop 00006: callvar 0 00009: int8 2 00011: call 1 00014: pop 00015: stop foo: 00020: getarg 0 00023: one 00024: add 00025: return 00026: stop
  13. Bytecode (JSC) Source: function foo(bar) { return bar + 1; } foo(2); Bytecode: 8 m_instructions; 168 bytes at 0x7fc1ba3070e0; 1 parameter(s); 10 callee register(s) [ 0] enter [ 1] mov r0, undefined(@k0) [ 4] get_global_var r1, 5 [ 7] mov r2, undefined(@k0) [ 10] mov r3, 2(@k1) [ 13] call r1, 2, 10 [ 17] op_call_put_result r0 [ 19] end r0 Constants: k0 = undefined k1 = 2 3 m_instructions; 64 bytes at 0x7fc1ba306e80; 2 parameter(s); 1 callee register(s) [ 0] enter [ 1] add r0, r-7, 1(@k0) [ 6] ret r0 Constants: k0 = 1 End: 3
  14. Stack vs. register • Stack • JVM, .NET, PHP, Python, older JavaScript engines • Register • Lua, Dalvik, all modern JavaScript engines • Smaller, Faster (about 30%) • RISC
  15. Stack vs. register Stack: local a,t,i 1: PUSHNIL 3 a=a+i 2: GETLOCAL 0 ; a 3: GETLOCAL 2 ; i 4: ADD 5: SETLOCAL 0 ; a a=a+1 6: SETLOCAL 0 ; a 7: ADDI 1 8: SETLOCAL 0 ; a a=t[i] 9: GETLOCAL 1 ; t 10: GETINDEXED 2 ; i 11: SETLOCAL 0 ; a Register: local a,t,i 1: LOADNIL 0 2 0 a=a+i 2: ADD 0 0 2 a=a+1 3: ADD 0 0 250 ; a a=t[i] 4: GETTABLE 0 1 2
  16. Interpreter • Switch statement • Direct threading, Indirect threading, Token threading ...
  17. Switch statement Source: while (true) { switch (opcode) { case ADD: ... break; case SUB: ... break; ... } } Compiled code: mov %edx,0xffffffffffffffe4(%rbp) cmpl $0x1,0xffffffffffffffe4(%rbp) je 6e <interpret+0x6e> cmpl $0x1,0xffffffffffffffe4(%rbp) jb 4a <interpret+0x4a> cmpl $0x2,0xffffffffffffffe4(%rbp) je 93 <interpret+0x93> ... jmp 22 <interpret+0x22> ...
  18. Direct threading Source: typedef void *Inst; Inst program[] = { &&ADD, &&SUB }; Inst *ip = program; goto *ip++; ADD: ... goto *ip++; SUB: ... goto *ip++; Compiled code: mov 0xffffffffffffffe8(%rbp),%rdx lea 0xffffffffffffffe8(%rbp),%rax addq $0x8,(%rax) mov %rdx,0xffffffffffffffd8(%rbp) jmpq *0xffffffffffffffd8(%rbp) ADD: ... mov 0xffffffffffffffe8(%rbp),%rdx lea 0xffffffffffffffe8(%rbp),%rax addq $0x8,(%rax) mov %rdx,0xffffffffffffffd8(%rbp) jmp 2c <interpreter+0x2c> http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
  19. Garbage Collection • Reference counting (php, python ...), smart pointer • Tracing • Stop the world • Copying, Mark-and-sweep, Mark-and-compact • Generational GC • Precise vs. conservative
  20. Precise vs. conservative • Conservative • If it looks like a pointer, treat it as a pointer • Might leak memory • Can’t move objects, so memory becomes fragmented • Precise • Indirect vs. direct references
  21. It is time for the DARK Magic
  22. Optimization Magic • Interpreter optimization • Compiler optimization • JIT • Type inference • Hidden Type • Method inline, PICs
  23. Interpreter optimization
  24. Why is switch dispatch inefficient?
  25. CPU Pipeline • Fetch, Decode, Execute, Write-back • Branch prediction
  26. http://en.wikipedia.org/wiki/File:Pipeline,_4_stage.svg
  27. Solution: Inline Threading ICONST_1_START: *sp++ = 1; ICONST_1_END: goto **(pc++); INEG_START: sp[-1] = -sp[-1]; INEG_END: goto **(pc++); DISPATCH_START: goto **(pc++); DISPATCH_END: ; size_t iconst_size = (&&ICONST_1_END - &&ICONST_1_START); size_t ineg_size = (&&INEG_END - &&INEG_START); size_t dispatch_size = (&&DISPATCH_END - &&DISPATCH_START); void *buf = malloc(iconst_size + ineg_size + dispatch_size); void *current = buf; memcpy(current, &&ICONST_1_START, iconst_size); current += iconst_size; memcpy(current, &&INEG_START, ineg_size); current += ineg_size; memcpy(current, &&DISPATCH_START, dispatch_size); ... goto *buf; Interpreter? JIT!
  28. Compiler optimization
  29. Compiler optimization • SSA • Data-flow • Control-flow • Loop • ...
  30. What a JVM can do... compiler tactics language-specific techniques loop transformations delayed compilation class hierarchy analysis loop unrolling Tiered compilation devirtualization loop peeling on-stack replacement symbolic constant propagation safepoint elimination delayed reoptimization autobox elimination iteration range splitting program dependence graph representation escape analysis range check elimination static single assignment representation lock elision loop vectorization proof-based techniques lock fusion global code shaping exact type inference de-reflection inlining (graph integration) memory value inference speculative (profile-based) techniques global code motion memory value tracking optimistic nullness assertions heat-based code layout constant folding optimistic type assertions switch balancing reassociation optimistic type strengthening throw inlining operator strength reduction optimistic array length strengthening control flow graph transformation null check elimination untaken branch pruning local code scheduling type test strength reduction optimistic N-morphic inlining local code bundling type test elimination branch frequency prediction delay slot filling algebraic simplification call frequency prediction graph-coloring register allocation common subexpression elimination memory and placement transformation linear scan register allocation integer range typing expression hoisting live range splitting flow-sensitive rewrites expression sinking copy coalescing conditional constant propagation redundant store elimination constant splitting dominating test detection adjacent store fusion copy removal flow-carried type narrowing card-mark elimination address mode matching dead code elimination merge-point splitting instruction peepholing DFA-based code generator http://www.oracle.com/us/technologies/java/java7-renaissance-vm-428200.pdf
  31. Just-In-Time (JIT)
  32. JIT • Method JIT, Trace JIT, Regular expression JIT • Code generation • Register allocation
  33. How does a JIT work? • mmap/new/malloc (mprotect) • generate native code • C cast/reinterpret_cast • call the function
  34. Trampoline (JSC x86) asm ( ".text\n" ".globl " SYMBOL_STRING(ctiTrampoline) "\n" HIDE_SYMBOL(ctiTrampoline) "\n" SYMBOL_STRING(ctiTrampoline) ":" "\n" "pushl %ebp" "\n" "movl %esp, %ebp" "\n" "pushl %esi" "\n" "pushl %edi" "\n" "pushl %ebx" "\n" "subl $0x3c, %esp" "\n" "movl $512, %esi" "\n" "movl 0x58(%esp), %edi" "\n" "call *0x50(%esp)" "\n" "addl $0x3c, %esp" "\n" "popl %ebx" "\n" "popl %edi" "\n" "popl %esi" "\n" "popl %ebp" "\n" "ret" "\n" ); // Execute the code! inline JSValue execute(RegisterFile* registerFile, CallFrame* callFrame, JSGlobalData* globalData) { JSValue result = JSValue::decode( ctiTrampoline( m_ref.m_code.executableAddress(), registerFile, callFrame, 0, Profiler::enabledProfilerReference(), globalData)); return globalData->exception ? jsNull() : result; }
  35. Register allocation • Linear scan • Graph coloring
  36. Code generation • Pipelining • SIMD (SSE2, SSE3 ...) • Debug
  37. Type inference
  38. a+b
  39. Property access
  40. “foo.bar”
  41. foo.bar in C 00001f63 movl %ecx,0x04(%edx)
  42. foo.bar in JavaScript __ZN2v88internal7HashMap6LookupEPvjb means: v8::internal::HashMap::Lookup(void*, unsigned int, bool) __ZN2v88internal7HashMap6LookupEPvjb: 00000338 pushl %ebp 00000339 pushl %ebx 0000033a pushl %edi 0000033b pushl %esi 0000033c subl $0x0c,%esp 0000033f movl 0x20(%esp),%esi 00000343 movl 0x08(%esi),%eax 00000346 movl 0x0c(%esi),%ecx 00000349 imull $0x0c,%ecx,%edi 0000034c leal 0xff(%ecx),%ecx 0000034f addl %eax,%edi 00000351 movl 0x28(%esp),%ebx 00000355 andl %ebx,%ecx 00000357 imull $0x0c,%ecx,%ebp 0000035a addl %eax,%ebp 0000035c jmp 0x0000036a 0000035e nop 00000360 addl $0x0c,%ebp 00000363 cmpl %edi,%ebp 00000365 jb 0x0000036a 00000367 movl 0x08(%esi),%ebp 0000036a movl 0x00(%ebp),%eax 0000036d testl %eax,%eax 0000036f je 0x0000038b 00000371 cmpl %ebx,0x08(%ebp) 00000374 jne 0x00000360 00000376 movl %eax,0x04(%esp) 0000037a movl 0x24(%esp),%eax 0000037e movl %eax,(%esp) 00000381 call *0x04(%esi) 00000384 testb %al,%al 00000386 je 0x00000360 00000388 movl 0x00(%ebp),%eax 0000038b testl %eax,%eax 0000038d jne 0x00000418 00000393 cmpb $0x00,0x2c(%esp) 00000398 jne 0x0000039e 0000039a xorl %ebp,%ebp 0000039c jmp 0x00000418 0000039e movl 0x24(%esp),%eax 000003a2 movl %eax,0x00(%ebp) 000003a5 movl $0x00000000,0x04(%ebp) 000003ac movl %ebx,0x08(%ebp) 000003af movl 0x10(%esi),%eax 000003b2 leal 0x01(%eax),%ecx 000003b5 movl %ecx,0x10(%esi) 000003b8 shrl $0x02,%ecx 000003bb leal 0x01(%ecx,%eax),%eax ... 27 lines more
  43. How to optimize?
  44. Hidden Type add property x then add property y http://code.google.com/apis/v8/design.html
  45. But nothing is perfect
  46. One secret in V8 hidden class: delete is 20 times slower! http://jsperf.com/test-v8-delete
  47. But properties are rarely deleted. “In Figure 5, reads are far more common than writes: over all traces the proportion of reads to writes is 6 to 1. Deletes comprise only .1% of all events.” [Figure 5. Instruction mix. The per-site proportion of read, write, delete, call instructions (averaged over multiple traces).] Only 0.1% delete. Source: An Analysis of the Dynamic Behavior of JavaScript Programs
  48. Optimize method call
  49. bar can be anything function foo(bar) { return bar.pro(); }
  50. Adaptive optimization for Self
  51. Polymorphic inline cache
  52. Tagged pointer
  53. Tagged pointer typedef union { void *p; double d; long l; } Value; typedef struct { unsigned char type; Value value; } Object; Object a; sizeof(a)?? If everything is an object, small integers carry too much overhead.
  54. Tagged pointer On almost all systems, pointer addresses are aligned (to 4 or 8 bytes). “The address of a block returned by malloc or realloc in the GNU system is always a multiple of eight (or sixteen on 64-bit systems).” http://www.gnu.org/s/libc/manual/html_node/Aligned-Memory-Blocks.html
  55. Tagged pointer Example: 0xc00ab958, the pointer’s last 2 or 3 bits must be 0. Last hex digit 8 = 1 0 0 0: Pointer. Last hex digit 9 = 1 0 0 1: Small Number.
  56. How about double?
  57. NaN-tagging (JSC 64 bit) On a 64-bit system, pointers use only 48 bits, so the top 16 bits are 0. * The top 16-bits denote the type of the encoded JSValue: * * Pointer { 0000:PPPP:PPPP:PPPP * / 0001:****:****:**** * Double { ... * \ FFFE:****:****:**** * Integer { FFFF:0000:IIII:IIII
  58. V8
  59. V8 • Lars Bak • Hidden Class, PICs • Built-in objects written in JavaScript • Crankshaft • Precise generation GC
  60. Lars Bak • has implemented VMs since 1988 • Beta • Self • HotSpot
  61. Source code Native Code High-Level IR Low-Level IR Opt Native Code } Crankshaft
  62. Hotspot client compiler
  63. Crankshaft • Profiling • Compiler optimization • On-stack replacement • Deoptimize
  64. High-Level IR (Hydrogen) • function inlining • type inference • stack check elimination • loop-invariant code motion • common subexpression elimination • ... http://wingolog.org/archives/2011/08/02/a-closer-look-at-crankshaft-v8s-optimizing-compiler
  65. Low-Level IR (Lithium) • linear-scan register allocator • code generation • lazy deoptimization http://wingolog.org/archives/2011/09/05/from-ssa-to-native-code-v8s-lithium-language
  66. Built-in objects written in JS function ArraySort(comparefn) { if (IS_NULL_OR_UNDEFINED(this) && !IS_UNDETECTABLE(this)) { throw MakeTypeError("called_on_null_or_undefined", ["Array.prototype.sort"]); } // In-place QuickSort algorithm. // For short (length <= 22) arrays, insertion sort is used for efficiency. if (!IS_SPEC_FUNCTION(comparefn)) { comparefn = function (x, y) { if (x === y) return 0; if (%_IsSmi(x) && %_IsSmi(y)) { return %SmiLexicographicCompare(x, y); } x = ToString(x); y = ToString(y); if (x == y) return 0; else return x < y ? -1 : 1; }; } ... v8/src/array.js
  67. GC
  68. V8 performance
  69. Can V8 be faster?
  70. Dart • Clear syntax, Optional types, Libraries • Performance • Can compile to JavaScript • But IE, WebKit and Mozilla rejected it • What do you think? • My thought: Will XML replace HTML? No, but thanks, Google, for pushing the web forward
  71. Embed V8
  72. Embed
  73. Expose Function
      v8::Handle<v8::Value> Print(const v8::Arguments& args) {
        for (int i = 0; i < args.Length(); i++) {
          v8::HandleScope handle_scope;
          v8::String::Utf8Value str(args[i]);
          const char* cstr = ToCString(str);
          printf("%s", cstr);
        }
        return v8::Undefined();
      }
      v8::Handle<v8::ObjectTemplate> global = v8::ObjectTemplate::New();
      global->Set(v8::String::New("print"), v8::FunctionTemplate::New(Print));
  74. Node.JS • Pros: Async; One language for everything; Faster than PHP, Python; Community • Cons: Lack of great libraries; ES5 code hard to maintain; Still too young
  75. JavaScriptCore (Nitro)
  76. Where does it come from?
  77. 1997 Macworld
  78. “Apple has decided to make Internet Explorer its default browser on the Macintosh.” “Since we believe in choice, we're going to be shipping other Internet browsers...” Steve Jobs
  79. JavaScriptCore History • 2001 KJS (kde-2.2): Bison, AST interpreter • 2008 SquirrelFish: Bytecode (Register), Direct threading • 2008 SquirrelFish Extreme: PICs, method JIT, regular expression JIT • DFG JIT (March 2011)
  80. [Diagram of the JavaScriptCore pipeline; labels: Interpreter, AST, Bytecode, Method JIT, SSA, DFG JIT]
  81. SpiderMonkey
  82. Monkey • SpiderMonkey: Written by Brendan Eich, interpreter • TraceMonkey: trace JIT, removed • JägerMonkey: PICs, method JIT (from JSC) • IonMonkey: Type Inference, Compiler optimization
  83. IonMonkey • SSA • function inlining • linear-scan register allocation • dead code elimination • loop-invariant code motion • ...
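Dead code elimination is particularly simple on SSA form, because every value has exactly one definition. The sketch below is a generic mark-and-sweep pass over a made-up instruction format (`{ dest, op, args, effect }`); it is not IonMonkey's IR or code.

```javascript
// Keep only the instructions whose results are (transitively) needed by an
// instruction with a side effect, such as a return or a store.
function eliminateDeadCode(instructions) {
  const defs = new Map();                     // SSA name -> defining instruction
  for (const ins of instructions) {
    if (ins.dest !== null) defs.set(ins.dest, ins);
  }
  const live = new Set();
  const worklist = instructions.filter((ins) => ins.effect);
  while (worklist.length > 0) {
    const ins = worklist.pop();
    if (live.has(ins)) continue;
    live.add(ins);
    for (const arg of ins.args) {
      const def = defs.get(arg);              // constants have no definition
      if (def) worklist.push(def);
    }
  }
  return instructions.filter((ins) => live.has(ins));
}

const program = [
  { dest: "v0", op: "param", args: [], effect: false },
  { dest: "v1", op: "add", args: ["v0", 1], effect: false },
  { dest: "v2", op: "mul", args: ["v0", 2], effect: false },     // never used by a live value
  { dest: "v3", op: "add", args: ["v2", "v2"], effect: false },  // only uses dead v2
  { dest: null, op: "return", args: ["v1"], effect: true },
];

const optimized = eliminateDeadCode(program);
// optimized keeps v0, v1 and the return; v2 and v3 are removed.
```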
  84. http://www.arewefastyet.com/
  85. Chakra (IE9)
  86. Chakra • Interpreter/JIT • Type System (hidden class) • PICs • Deferred parsing • Uses UTF-8 internally
  87. Unlocking the JavaScript Opportunity with Internet Explorer 9
  88. Unlocking the JavaScript Opportunity with Internet Explorer 9
  89. Carakan (Opera)
  90. Carakan • Register VM • Method JIT, Regex JIT • Hidden type • Function inlining
  91. Rhino and JVM
  92. Rhino is SLOW, why?
  93. Because JVM is slow?
  94. JVM didn’t support dynamic languages well
  95. Solution: invokedynamic
  96. Hard to optimize in JVM • Before: Caller → some tricks → Method • After (invokedynamic): Caller → method handle → Method
  97. One ring to rule them all?
  98. Rhino + invokedynamic • Pros: Easier to implement; Lots of great Java Libraries; JVM optimization for free • Cons: Only in JVM7; Not fully optimized yet; Hard to beat V8
  99. Compiler optimization is HARD
  100. Is there an easy way?
  101. LLVM
  102. LLVM • Clang, VMKit, GHC, PyPy, Rubinius ... • DragonEgg: replace GCC back-end • IR • Optimization • Link, Code generate, JIT • Apple
  103. LLVM simplify
  104. C source:
      int foo(int bar) {
        int one = 1;
        return bar + one;
      }
      int main() {
        foo(3);
      }
      LLVM IR:
      define i32 @foo(i32 %bar) nounwind ssp {
      entry:
        %bar_addr = alloca i32, align 4
        %retval = alloca i32
        %0 = alloca i32
        %one = alloca i32
        %"alloca point" = bitcast i32 0 to i32
        store i32 %bar, i32* %bar_addr
        store i32 1, i32* %one, align 4
        %1 = load i32* %bar_addr, align 4
        %2 = load i32* %one, align 4
        %3 = add nsw i32 %1, %2
        store i32 %3, i32* %0, align 4
        %4 = load i32* %0, align 4
        store i32 %4, i32* %retval, align 4
        br label %return
      return:
        %retval1 = load i32* %retval
        ret i32 %retval1
      }
      define i32 @main() nounwind ssp {
      entry:
        %retval = alloca i32
        %"alloca point" = bitcast i32 0 to i32
        %0 = call i32 @foo(i32 3) nounwind ssp
        br label %return
      return:
        %retval1 = load i32* %retval
        ret i32 %retval1
      }
  105. Optimization
      Before:
      define i32 @foo(i32 %bar) nounwind ssp {
      entry:
        %bar_addr = alloca i32, align 4
        %retval = alloca i32
        %0 = alloca i32
        %one = alloca i32
        %"alloca point" = bitcast i32 0 to i32
        store i32 %bar, i32* %bar_addr
        store i32 1, i32* %one, align 4
        %1 = load i32* %bar_addr, align 4
        %2 = load i32* %one, align 4
        %3 = add nsw i32 %1, %2
        store i32 %3, i32* %0, align 4
        %4 = load i32* %0, align 4
        store i32 %4, i32* %retval, align 4
        br label %return
      return:
        %retval1 = load i32* %retval
        ret i32 %retval1
      }
      define i32 @main() nounwind ssp {
      entry:
        %retval = alloca i32
        %"alloca point" = bitcast i32 0 to i32
        %0 = call i32 @foo(i32 3) nounwind ssp
        br label %return
      return:
        %retval1 = load i32* %retval
        ret i32 %retval1
      }
      After:
      define i32 @foo(i32 %bar) nounwind readnone ssp {
      entry:
        %0 = add nsw i32 %bar, 1
        ret i32 %0
      }
      define i32 @main() nounwind readnone ssp {
      entry:
        ret i32 undef
      }
  106. Optimization (70+) http://llvm.org/docs/Passes.html
  107. LLVM backend
      define i32 @foo(i32 %bar) nounwind readnone ssp {
      entry:
        %0 = add nsw i32 %bar, 1
        ret i32 %0
      }
      define i32 @main() nounwind readnone ssp {
      entry:
        ret i32 undef
      }
  108. [Figure 4: LLVM system architecture diagram (compiler front ends 1..N, LLVM linker with IPO/IPA, native CodeGen, runtime JIT optimizer, offline reoptimizer driven by profile and trace info)]
  109. LLVM on JavaScript
  110. Emscripten • C/C++ to LLVM IR • LLVM IR to JavaScript • Runs in the browser
  111. LLVM IR:
      define i32 @foo(i32 %bar) nounwind readnone ssp {
      entry:
        %0 = add nsw i32 %bar, 1
        ret i32 %0
      }
      define i32 @main() nounwind readnone ssp {
      entry:
        ret i32 undef
      }
      Generated JavaScript:
      ...
      function _foo($bar) {
        var __label__;
        var $0=((($bar)+1)|0);
        return $0;
      }
      function _main() {
        var __label__;
        return undef;
      }
      Module["_main"] = _main;
      ...
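The `|0` in the generated code is what makes a JavaScript addition behave like LLVM's `add i32`. The snippet below is a small illustration of that one detail; `addI32` is a helper invented for the example, and `_foo` is copied from the slide minus the unused `__label__` variable.

```javascript
// JavaScript numbers are doubles, so a plain `+` never wraps around.
// `|0` converts the result back to a signed 32-bit integer, which gives the
// same wrap-around behaviour as a 32-bit machine add.
function addI32(a, b) {
  return (a + b) | 0;
}

function _foo($bar) {
  var $0 = ((($bar) + 1) | 0);
  return $0;
}

const small = _foo(3);                 // 4
const wrapped = _foo(2147483647);      // -2147483648: INT32_MAX + 1 wraps around
const unwrapped = 2147483647 + 1;      // 2147483648: what plain JavaScript gives
```

This works for addition because the sum of two 32-bit integers is always exactly representable as a double before the coercion is applied.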
  112. Emscripten demo • Python, Ruby, Lua virtual machine (http://repl.it/) • OpenJPEG • Poppler • FreeType • ... https://github.com/kripken/emscripten/wiki
  113. Performance? good enough!
      benchmark            SM      V8      gcc     ratio
      fannkuch (10)        1.158   0.931   0.231   4.04
      fasta (2100000)      1.115   1.128   0.452   2.47
      primes               1.443   3.194   0.438   3.29
      raytrace (7,256)     1.930   2.944   0.228   8.46
      dlmalloc (400,400)   5.050   1.880   0.315   5.97
      The first column is the name of the benchmark, and in parentheses any parameters used in running it.
  114. JavaScript on LLVM
  115. Fabric Engine • JavaScript Integration • Native code compilation (LLVM) • Multi-threaded execution • OpenGL Rendering
  116. Fabric Engine http://fabric-engine.com/2011/11/server-performance-benchmarks/
  117. Conclusion?
  118. All problems in computer science can be solved by another level of indirection David Wheeler
  123. Thank you

Introduction to virtual machine and JavaScript engine implementation.
