Virtual Machine & JavaScript Engine
               @nwind
(HLL) Virtual Machine
Take the red pill
I will show you the rabbit hole.
Virtual Machine history
•   Pascal 1970

•   Smalltalk 1980

•   Self 1986

•   Python 1991

•   Java 1995

•   JavaScript 1995
The Smalltalk demonstration showed three amazing features. One
was how computers could be networked; the second was how
object-oriented programming worked. But Jobs and his team paid
little attention to these attributes because they were so amazed by
the third feature, ...
How does a virtual machine work?

•   Parser

•   Intermediate Representation (IR)

•   Interpreter

•   Garbage Collection

•   Optimization
Parser


•   Tokenize

•   AST
Tokenize
var foo = 10;

•   keyword: var
•   space
•   identifier: foo
•   equal: =
•   number: 10
•   semicolon: ;
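The token classes on this slide can be produced by a small rule-driven scanner. The following is only a minimal sketch: the rule table, the keyword list, and the name `tokenize` are assumptions made for illustration, not how any particular engine tokenizes.

```javascript
// Token classes follow the slide: keyword, space, identifier, equal, number, semicolon.
// RULES and tokenize are hypothetical names; rule order matters (keywords before identifiers).
const RULES = [
  ["space", /^\s+/],
  ["keyword", /^(?:var|function|return)\b/],
  ["identifier", /^[A-Za-z_$][A-Za-z0-9_$]*/],
  ["number", /^\d+/],
  ["equal", /^=/],
  ["semicolon", /^;/],
];

function tokenize(source) {
  const tokens = [];
  let rest = source;
  while (rest.length > 0) {
    let matched = false;
    // Try each rule at the current position and take the first that matches.
    for (const [type, pattern] of RULES) {
      const m = pattern.exec(rest);
      if (m) {
        tokens.push({ type, text: m[0] });
        rest = rest.slice(m[0].length);
        matched = true;
        break;
      }
    }
    if (!matched) throw new SyntaxError("Unexpected character: " + rest[0]);
  }
  return tokens;
}

const tokens = tokenize("var foo = 10;");
console.log(tokens.map((t) => t.type).join(" "));
// keyword space identifier space equal space number semicolon
```

A production tokenizer would normally scan character by character rather than re-slicing the string, but the output has the same shape.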
AST

           Assign
          /      \
Variable foo      Constant 10
AST demo (Esprima)

var foo = bar + 1;

{
    "type": "Program",
    "body": [
        {
            "type": "VariableDeclaration",
            "declarations": [
                {
                    "id": {
                        "type": "Identifier",
                        "name": "foo"
                    },
                    "init": {
                        "type": "BinaryExpression",
                        "operator": "+",
                        "left": {
                            "type": "Identifier",
                            "name": "bar"
                        },
                        "right": {
                            "type": "Literal",
                            "value": 1
                        }
                    }
                }
            ],
            "kind": "var"
        }
    ]
}

http://esprima.org/demo/parse.html
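To see why the tree form is convenient, here is a minimal sketch of a tree-walking evaluator over the AST from this slide. The tree is written out by hand so no parser is needed; `evaluate` and `env` are hypothetical names, and only the node types that appear on the slide are handled.

```javascript
// Hand-written copy of the slide's AST for `var foo = bar + 1;`.
const program = {
  type: "Program",
  body: [{
    type: "VariableDeclaration",
    kind: "var",
    declarations: [{
      id: { type: "Identifier", name: "foo" },
      init: {
        type: "BinaryExpression",
        operator: "+",
        left: { type: "Identifier", name: "bar" },
        right: { type: "Literal", value: 1 },
      },
    }],
  }],
};

// Recursively evaluates a node; variables live in the plain object `env`.
function evaluate(node, env) {
  switch (node.type) {
    case "Program":
      node.body.forEach((stmt) => evaluate(stmt, env));
      return env;
    case "VariableDeclaration":
      for (const d of node.declarations) {
        env[d.id.name] = d.init ? evaluate(d.init, env) : undefined;
      }
      return undefined;
    case "BinaryExpression": {
      const left = evaluate(node.left, env);
      const right = evaluate(node.right, env);
      if (node.operator === "+") return left + right;
      throw new Error("Unsupported operator: " + node.operator);
    }
    case "Identifier":
      if (!(node.name in env)) throw new ReferenceError(node.name + " is not defined");
      return env[node.name];
    case "Literal":
      return node.value;
    default:
      throw new Error("Unsupported node type: " + node.type);
  }
}

const env = evaluate(program, { bar: 41 });
console.log(env.foo); // 42
```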
Intermediate Representation


•   Bytecode

•   Stack vs. register
Bytecode (SpiderMonkey)
function foo(bar) {
    return bar + 1;
}

foo(2);

00000:   deffun 0 null
00005:   nop
00006:   callvar 0
00009:   int8 2
00011:   call 1
00014:   pop
00015:   stop

foo:
00020:   getarg 0
00023:   one
00024:   add
00025:   return
00026:   stop
Bytecode (JSC)
function foo(bar) {
    return bar + 1;
}

foo(2);

8 m_instructions; 168 bytes at 0x7fc1ba3070e0;
1 parameter(s); 10 callee register(s)

[    0]   enter
[    1]   mov                   r0, undefined(@k0)
[    4]   get_global_var        r1, 5
[    7]   mov                   r2, undefined(@k0)
[   10]   mov                   r3, 2(@k1)
[   13]   call                  r1, 2, 10
[   17]   op_call_put_result    r0
[   19]   end                   r0

Constants:
   k0 = undefined
   k1 = 2

3 m_instructions; 64 bytes at 0x7fc1ba306e80;
2 parameter(s); 1 callee register(s)

[    0]   enter
[    1]   add                   r0, r-7, 1(@k0)
[    6]   ret                   r0

Constants:
   k0 = 1

End: 3
Stack vs. register
•   Stack

    •   JVM, .NET, PHP, Python, older JavaScript engines

•   Register

    •   Lua, Dalvik, all modern JavaScript engines

    •   Fewer instructions executed, faster (about 30%)

    •   RISC
Stack vs. register
Stack-based:

local a,t,i    1:   PUSHNIL      3
a=a+i          2:   GETLOCAL     0   ; a
               3:   GETLOCAL     2   ; i
               4:   ADD
               5:   SETLOCAL     0   ; a
a=a+1          6:   GETLOCAL     0   ; a
               7:   ADDI         1
               8:   SETLOCAL     0   ; a
a=t[i]         9:   GETLOCAL     1   ; t
              10:   GETINDEXED   2   ; i
              11:   SETLOCAL     0   ; a

Register-based:

local a,t,i    1:   LOADNIL    0   2   0
a=a+i          2:   ADD        0   0   2
a=a+1          3:   ADD        0   0   250 ; a
a=t[i]         4:   GETTABLE   0   1   2
Interpreter


•   Switch statement

•   Direct threading, Indirect threading, Token threading ...
Switch statement
while (true) {
    switch (opcode) {
        case ADD:
            ...
            break;
        case SUB:
            ...
            break;
        ...
    }
}

mov    %edx,0xffffffffffffffe4(%rbp)
cmpl   $0x1,0xffffffffffffffe4(%rbp)
je     6e <interpret+0x6e>
cmpl   $0x1,0xffffffffffffffe4(%rbp)
jb     4a <interpret+0x4a>
cmpl   $0x2,0xffffffffffffffe4(%rbp)
je     93 <interpret+0x93>
jmp    22 <interpret+0x22>
...
Direct threading
typedef void *Inst;
Inst program[] = { &&ADD, &&SUB };
Inst *ip = program;
goto **ip++;

ADD:
   ...
   goto **ip++;

SUB:
   ...
   goto **ip++;

mov       0xffffffffffffffe8(%rbp),%rdx
lea       0xffffffffffffffe8(%rbp),%rax
addq      $0x8,(%rax)
mov       %rdx,0xffffffffffffffd8(%rbp)
jmpq      *0xffffffffffffffd8(%rbp)

ADD:
   ...
   mov       0xffffffffffffffe8(%rbp),%rdx
   lea       0xffffffffffffffe8(%rbp),%rax
   addq      $0x8,(%rax)
   mov       %rdx,0xffffffffffffffd8(%rbp)
   jmp       2c <interpreter+0x2c>

http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
Garbage Collection
•   Reference counting (PHP, Python ...), smart pointers

•   Tracing

    •   Stop the world

    •   Copying, Mark-and-sweep, Mark-and-compact

    •   Generational GC

    •   Precise vs. conservative
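As a rough illustration of the tracing family, here is a minimal stop-the-world mark-and-sweep sketch over a toy heap. `Heap`, `alloc`, and `collect` are hypothetical names, and real collectors work on raw memory rather than JavaScript objects.

```javascript
// Toy heap: every object carries a list of outgoing references and a mark bit.
class Heap {
  constructor() {
    this.objects = new Set(); // everything ever allocated and not yet freed
    this.roots = new Set();   // objects reachable directly (stack, globals)
  }

  alloc(name) {
    const obj = { name, refs: [], marked: false };
    this.objects.add(obj);
    return obj;
  }

  collect() {
    // Mark: trace everything reachable from the roots, using an explicit worklist.
    const worklist = [...this.roots];
    while (worklist.length > 0) {
      const obj = worklist.pop();
      if (obj.marked) continue;
      obj.marked = true;
      worklist.push(...obj.refs);
    }
    // Sweep: free unmarked objects and clear marks for the next cycle.
    let freed = 0;
    for (const obj of this.objects) {
      if (obj.marked) {
        obj.marked = false;
      } else {
        this.objects.delete(obj);
        freed++;
      }
    }
    return freed;
  }
}

const heap = new Heap();
const a = heap.alloc("a");
const b = heap.alloc("b");
const c = heap.alloc("c");
const d = heap.alloc("d");
heap.roots.add(a);
a.refs.push(b);
c.refs.push(d);
d.refs.push(c); // c and d form an unreachable cycle
console.log(heap.collect(), heap.objects.size); // 2 2
```

The cycle between `c` and `d` is reclaimed because neither is reachable from a root; plain reference counting would leave both alive, since each still has a count of one.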
Precise vs. conservative
•   Conservative

    •   If it looks like a pointer, treat it as a pointer

    •   May leak memory

    •   Can't move objects, so memory becomes fragmented

•   Precise

    •   Indirect vs. direct references
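The "looks like a pointer" rule can be shown with a toy model. In this sketch the heap maps numeric addresses to objects and the stack is an array of untyped machine words; all names, addresses, and the pointer-slot map are assumptions for illustration.

```javascript
// Toy heap: address -> object.
const heap = new Map([
  [0x1000, { name: "live" }],
  [0x2000, { name: "garbage" }],
]);

// Conservative root scan: any word equal to a valid heap address counts as a pointer.
function conservativeRoots(stackWords) {
  return stackWords.filter((word) => heap.has(word));
}

// Precise root scan: a compiler-supplied map says which stack slots hold pointers.
function preciseRoots(stackWords, pointerSlots) {
  return pointerSlots
    .map((slot) => stackWords[slot])
    .filter((word) => heap.has(word));
}

// Slot 0 is a real pointer; slot 1 is an integer that happens to equal 0x2000.
const stack = [0x1000, 0x2000, 7];
console.log(conservativeRoots(stack)); // [ 4096, 8192 ]
console.log(preciseRoots(stack, [0])); // [ 4096 ]
```

The conservative scan keeps the object at `0x2000` alive by accident, which is the leak mentioned above. It also cannot move that object, because rewriting slot 1 would corrupt the value if it really is an integer.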
It is time for the DARK Magic
Optimization Magic
•   Interpreter optimization

•   Compiler optimization

•   JIT

•   Type inference

•   Hidden Type

•   Method inlining, PICs (polymorphic inline caches)
Interpreter optimization
Why is switch dispatch inefficient?
CPU Pipeline


•   Fetch, Decode, Execute, Write-back

•   Branch prediction
http://en.wikipedia.org/wiki/File:Pipeline,_4_stage.svg
Solution: Inline Threading
ICONST_1_START: *sp++ = 1;
ICONST_1_END: goto **(pc++);
INEG_START: sp[-1] = -sp[-1];
INEG_END: goto **(pc++);
DISPATCH_START: goto **(pc++);
DISPATCH_END: ;

size_t iconst_size = (&&ICONST_1_END - &&ICONST_1_START);
size_t ineg_size = (&&INEG_END - &&INEG_START);
size_t dispatch_size = (&&DISPATCH_END - &&DISPATCH_START);

void *buf = malloc(iconst_size + ineg_size + dispatch_size);
void *current = buf;
memcpy(current, &&ICONST_1_START, iconst_size); current += iconst_size;
memcpy(current, &&INEG_START, ineg_size); current += ineg_size;
memcpy(current, &&DISPATCH_START, dispatch_size);
...

goto *buf;
                                                              Interpreter? JIT!
Compiler optimization

•   SSA

•   Data-flow

•   Control-flow

•   Loop

•   ...
What a JVM can do...
compiler tactics
  delayed compilation
  tiered compilation
  on-stack replacement
  delayed reoptimization
  program dependence graph representation
  static single assignment representation

proof-based techniques
  exact type inference
  memory value inference
  memory value tracking
  constant folding
  reassociation
  operator strength reduction
  null check elimination
  type test strength reduction
  type test elimination
  algebraic simplification
  common subexpression elimination
  integer range typing

flow-sensitive rewrites
  conditional constant propagation
  dominating test detection
  flow-carried type narrowing
  dead code elimination

language-specific techniques
  class hierarchy analysis
  devirtualization
  symbolic constant propagation
  autobox elimination
  escape analysis
  lock elision
  lock fusion
  de-reflection

speculative (profile-based) techniques
  optimistic nullness assertions
  optimistic type assertions
  optimistic type strengthening
  optimistic array length strengthening
  untaken branch pruning
  optimistic N-morphic inlining
  branch frequency prediction
  call frequency prediction

memory and placement transformation
  expression hoisting
  expression sinking
  redundant store elimination
  adjacent store fusion
  card-mark elimination
  merge-point splitting

loop transformations
  loop unrolling
  loop peeling
  safepoint elimination
  iteration range splitting
  range check elimination
  loop vectorization

global code shaping
  inlining (graph integration)
  global code motion
  heat-based code layout
  switch balancing
  throw inlining

control flow graph transformation
  local code scheduling
  local code bundling
  delay slot filling
  graph-coloring register allocation
  linear scan register allocation
  live range splitting
  copy coalescing
  constant splitting
  copy removal
  address mode matching
  instruction peepholing
  DFA-based code generator


http://www.oracle.com/us/technologies/java/java7-renaissance-vm-428200.pdf
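Constant folding, one of the proof-based techniques in the list, is small enough to sketch. The node shapes follow the Esprima-style AST from the earlier slide; `fold` and the helper constructors are hypothetical names, and only `+`, `-`, and `*` on numeric literals are handled.

```javascript
// Folds BinaryExpression nodes whose operands are both numeric literals.
function fold(node) {
  if (node.type !== "BinaryExpression") return node;
  // Fold the children first so nested constants collapse bottom-up.
  const left = fold(node.left);
  const right = fold(node.right);
  const bothNumbers =
    left.type === "Literal" && typeof left.value === "number" &&
    right.type === "Literal" && typeof right.value === "number";
  if (bothNumbers) {
    switch (node.operator) {
      case "+": return { type: "Literal", value: left.value + right.value };
      case "-": return { type: "Literal", value: left.value - right.value };
      case "*": return { type: "Literal", value: left.value * right.value };
      default: break; // unknown operator: leave the expression in place
    }
  }
  return { ...node, left, right };
}

// Small constructors to keep the example readable.
const lit = (value) => ({ type: "Literal", value });
const id = (name) => ({ type: "Identifier", name });
const bin = (operator, left, right) => ({ type: "BinaryExpression", operator, left, right });

// bar + (2 * 3) becomes bar + 6; the identifier blocks further folding.
const folded = fold(bin("+", id("bar"), bin("*", lit(2), lit(3))));
console.log(JSON.stringify(folded));
```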
Just-In-Time (JIT)
JIT


•   Method JIT, Trace JIT, Regular expression JIT

•   Code generation

•   Register allocation
How does a JIT work?

•   mmap/new/malloc (mprotect)

•   generate native code

•   C cast / reinterpret_cast to a function pointer

•   call the function
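JavaScript cannot map executable memory, so the steps above cannot be reproduced literally here. The sketch below only mirrors their shape: it generates code at run time (as JavaScript source text instead of machine code), turns it into something callable with the `Function` constructor, and calls it. The toy opcodes echo the SpiderMonkey listing shown earlier; `compile` is a hypothetical name.

```javascript
// Bytecode for `function foo(bar) { return bar + 1; }` in a toy stack encoding.
const bytecode = [["GETARG", 0], ["ONE"], ["ADD"], ["RETURN"]];

// "Generate code": translate each opcode into a line of JavaScript source.
function compile(code) {
  const lines = ["const stack = [];"];
  for (const [op, arg] of code) {
    switch (op) {
      case "GETARG": lines.push(`stack.push(args[${arg}]);`); break;
      case "ONE": lines.push("stack.push(1);"); break;
      case "ADD":
        lines.push("{ const b = stack.pop(); const a = stack.pop(); stack.push(a + b); }");
        break;
      case "RETURN": lines.push("return stack.pop();"); break;
      default: throw new Error("Unknown opcode: " + op);
    }
  }
  // "Cast to a function": the Function constructor makes the text callable.
  return new Function("...args", lines.join("\n"));
}

// "Call the function": no interpreter loop runs from here on.
const foo = compile(bytecode);
console.log(foo(2)); // 3
```

A native JIT performs the same three moves with a writable and executable buffer, an assembler, and a function-pointer cast.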
Trampoline (JSC x86)
// Execute the code!
inline JSValue execute(RegisterFile* registerFile,
                       CallFrame* callFrame,
                       JSGlobalData* globalData)
{
    JSValue result = JSValue::decode(
      ctiTrampoline(
        m_ref.m_code.executableAddress(),
        registerFile,
        callFrame,
        0,
        Profiler::enabledProfilerReference(),
        globalData));
    return globalData->exception ? jsNull() : result;
}

asm (
".text\n"
".globl " SYMBOL_STRING(ctiTrampoline) "\n"
HIDE_SYMBOL(ctiTrampoline) "\n"
SYMBOL_STRING(ctiTrampoline) ":" "\n"
    "pushl %ebp" "\n"
    "movl %esp, %ebp" "\n"
    "pushl %esi" "\n"
    "pushl %edi" "\n"
    "pushl %ebx" "\n"
    "subl $0x3c, %esp" "\n"
    "movl $512, %esi" "\n"
    "movl 0x58(%esp), %edi" "\n"
    "call *0x50(%esp)" "\n"
    "addl $0x3c, %esp" "\n"
    "popl %ebx" "\n"
    "popl %edi" "\n"
    "popl %esi" "\n"
    "popl %ebp" "\n"
    "ret" "\n"
);
Register allocation


•   Linear scan

•   Graph coloring
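Linear scan is the simpler of the two to sketch: walk the live intervals in order of start point, hand out free registers, and spill the interval that ends last when none are free. The input format, the name `linearScan`, and the `"spill"` marker are assumptions for this example.

```javascript
// intervals: [{ name, start, end }]; numRegs: number of physical registers.
// Returns a Map from variable name to a register index or the string "spill".
function linearScan(intervals, numRegs) {
  const result = new Map();
  const free = Array.from({ length: numRegs }, (_, i) => i);
  let active = []; // intervals currently holding a register, sorted by end
  const sorted = [...intervals].sort((a, b) => a.start - b.start);

  for (const cur of sorted) {
    // Expire intervals that ended before the current one starts.
    while (active.length > 0 && active[0].end < cur.start) {
      free.push(result.get(active.shift().name));
    }
    if (free.length > 0) {
      result.set(cur.name, free.shift());
      active.push(cur);
    } else {
      // No register left: spill whichever interval ends last.
      const last = active[active.length - 1];
      if (last && last.end > cur.end) {
        result.set(cur.name, result.get(last.name)); // steal its register
        result.set(last.name, "spill");
        active.pop();
        active.push(cur);
      } else {
        result.set(cur.name, "spill");
      }
    }
    active.sort((a, b) => a.end - b.end);
  }
  return result;
}

const allocation = linearScan(
  [
    { name: "a", start: 0, end: 10 },
    { name: "b", start: 1, end: 3 },
    { name: "c", start: 2, end: 8 },
    { name: "d", start: 4, end: 6 },
  ],
  2
);
console.log([...allocation]); // a is spilled; b -> 1, c -> 0, d -> 1
```

With two registers, `a` is spilled when `c` arrives because `a` has the furthest end point, and `d` reuses the register freed by `b`.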
Code generation


•   Pipelining

•   SIMD (SSE2, SSE3 ...)

•   Debug
Type inference
a+b
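Why `a+b` needs type inference can be sketched by comparing the generic path with a specialized one. This is heavily simplified: the real `+` semantics also involve object-to-primitive conversion, BigInt, and more, and the names `genericAdd`, `makeInt32Add`, and `bailout` are hypothetical.

```javascript
// What `a + b` must handle when nothing is known about the operands.
function genericAdd(a, b) {
  if (typeof a === "number" && typeof b === "number") return a + b;
  if (typeof a === "string" || typeof b === "string") return String(a) + String(b);
  return Number(a) + Number(b);
}

// What a compiler could emit once it infers (or speculates) that both operands
// are 32-bit integers: a cheap guard, one add, and a fallback when the guess fails.
function makeInt32Add(bailout) {
  return function int32Add(a, b) {
    if ((a | 0) !== a || (b | 0) !== b) return bailout(a, b); // type guard failed
    const sum = a + b;
    return (sum | 0) === sum ? sum : bailout(a, b); // result left the int32 range
  };
}

let bailouts = 0;
const add = makeInt32Add((a, b) => {
  bailouts++;
  return genericAdd(a, b);
});

console.log(add(1, 2));          // 3, fast path
console.log(add("1", 2));        // 12 (a string), guard fails
console.log(add(2147483647, 1)); // 2147483648, overflow check fails
console.log(bailouts);           // 2
```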
Property access
“foo.bar”
foo.bar in C


00001f63    movl    %ecx,0x04(%edx)

foo.bar in JavaScript

__ZN2v88internal7HashMap6LookupEPvjb means:
v8::internal::HashMap::Lookup(void*, unsigned int, bool)

__ZN2v88internal7HashMap6LookupEPvjb:
00000338    pushl   %ebp
00000339    pushl   %ebx
0000033a    pushl   %edi
0000033b    pushl   %esi
0000033c    subl    $0x0c,%esp
0000033f    movl    0x20(%esp),%esi
00000343    movl    0x08(%esi),%eax
00000346    movl    0x0c(%esi),%ecx
00000349    imull   $0x0c,%ecx,%edi
0000034c    leal    0xff(%ecx),%ecx
0000034f    addl    %eax,%edi
00000351    movl    0x28(%esp),%ebx
00000355    andl    %ebx,%ecx
00000357    imull   $0x0c,%ecx,%ebp
0000035a    addl    %eax,%ebp
0000035c    jmp     0x0000036a
0000035e    nop
00000360    addl    $0x0c,%ebp
00000363    cmpl    %edi,%ebp
00000365    jb      0x0000036a
00000367    movl    0x08(%esi),%ebp
0000036a    movl    0x00(%ebp),%eax
0000036d    testl   %eax,%eax
0000036f    je      0x0000038b
00000371    cmpl    %ebx,0x08(%ebp)
00000374    jne     0x00000360
00000376    movl    %eax,0x04(%esp)
0000037a    movl    0x24(%esp),%eax
0000037e    movl    %eax,(%esp)
00000381    call    *0x04(%esi)
00000384    testb   %al,%al
00000386    je      0x00000360
00000388    movl    0x00(%ebp),%eax
0000038b    testl   %eax,%eax
0000038d    jne     0x00000418
00000393    cmpb    $0x00,0x2c(%esp)
00000398    jne     0x0000039e
0000039a    xorl    %ebp,%ebp
0000039c    jmp     0x00000418
0000039e    movl    0x24(%esp),%eax
000003a2    movl    %eax,0x00(%ebp)
000003a5    movl    $0x00000000,0x04(%ebp)
000003ac    movl    %ebx,0x08(%ebp)
000003af    movl    0x10(%esi),%eax
000003b2    leal    0x01(%eax),%ecx
000003b5    movl    %ecx,0x10(%esi)
000003b8    shrl    $0x02,%ecx
000003bb    leal    0x01(%ecx,%eax),%eax
... 27 lines more
How to optimize?
Hidden Type
•   add property x

•   then add property y

                       http://code.google.com/apis/v8/design.html
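The transitions on this slide can be modelled in a few lines. This is a toy sketch of the idea only, not V8's data structures: `Shape` and `Obj` are hypothetical names, and each shape simply records property offsets plus the transitions taken when a property is added.

```javascript
// A hidden class ("shape"): property offsets plus transitions to successor shapes.
class Shape {
  constructor(offsets = new Map()) {
    this.offsets = offsets;       // property name -> slot index
    this.transitions = new Map(); // property name -> next Shape
  }

  // Returns the shape reached by adding `name`, creating it only once.
  add(name) {
    if (!this.transitions.has(name)) {
      const offsets = new Map(this.offsets).set(name, this.offsets.size);
      this.transitions.set(name, new Shape(offsets));
    }
    return this.transitions.get(name);
  }
}

const EMPTY = new Shape();

// An object is a pointer to its shape plus a flat array of slots.
class Obj {
  constructor() {
    this.shape = EMPTY;
    this.slots = [];
  }

  set(name, value) {
    if (!this.shape.offsets.has(name)) this.shape = this.shape.add(name);
    this.slots[this.shape.offsets.get(name)] = value;
  }

  get(name) {
    const offset = this.shape.offsets.get(name);
    return offset === undefined ? undefined : this.slots[offset];
  }
}

const p = new Obj(); p.set("x", 1); p.set("y", 2);
const q = new Obj(); q.set("x", 3); q.set("y", 4);
const r = new Obj(); r.set("y", 5); r.set("x", 6);
console.log(p.shape === q.shape, p.shape === r.shape, q.get("y")); // true false 4
```

Objects built by adding `x` then `y` share one shape, so a property read can reduce to a shape check plus a fixed-offset load. Adding the same properties in a different order lands on a different shape.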
But nothing is perfect
One secret in V8 hidden classes

20 times slower!



       http://jsperf.com/test-v8-delete
But properties are rarely deleted

"in Figure 5, reads are far more common than writes: over all traces the proportion of reads to writes is 6 to 1. Deletes comprise only .1% of all events. That graph further breaks reads, writes and deletes into various specific types ..."

Only 0.1% delete

[Figure: proportion of events by type (Read, Write, and Delet for prop, hash, and indx; Define, Create, Call, Throw, Catch) per site: 280s, Apme, Bing, Blog, Digg, Fbok, Flkr, Gmai, Gmap, Goog, IShk, Lvly, Twit, Wiki, Word, Ebay, YTub, All*]




                                                                                                                                                                                                                                            Word

                                                                                                                                                                                                                                                       Ebay

                                                                                                                                                                                                                                                                 YTub

                                                                                                                                                                                                                                                                           All*
                                                                                                                                                   6
                                                                                                                                                      280S

                                                                                                                                                                    BING

                                                                                                                                                                                  BLOG




                                                                                                                                                                                                              EBAY

                                                                                                                                                                                                                            FBOK
                                                                                                                                                                                                DIGG




                                                                                                                                                                                                                                          FLKR

                                                                                                                                                                                                                                                      GMIL

                                                                                                                                                                                                                                                               GMAP

                                                                                                                                                                                                                                                                        GOGL

                                                                                                                                                                                                                                                                                  ISHK

                                                                                                                                                                                                                                                                                           LIVE

                                                                                                                                                                                                                                                                                                   MECM

                                                                                                                                                                                                                                                                                                           TWIT




                                                                                                                                                                                                                                                                                                                                        ALL*
                                                                                                                                                                                                                                                                                                                   WIKI

                                                                                                                                                                                                                                                                                                                          WORD

                                                                                                                                                                                                                                                                                                                                 YTUB
                                                                                                                                                                                                                                     Figure 7 breaks down the
                                                    Fbok
       Bing

                      Blog

                                     Digg




                                                                   Flkr

                                                                                  Gmai

                                                                                                 Gmap




                                                                                                                                                    Lvly

                                                                                                                                                                  Twit

                                                                                                                                                                                Wiki
                                                                                                                Goog

                                                                                                                                     IShk




                                                                                                                                                                                              Word

                                                                                                                                                                                                            Ebay

                                                                                                                                                                                                                          YTub

                                                                                                                                                                                                                                        All*
                                                                                                                                                                                                                                     into a number of categorie
                                                                                                                                                   5




                                                                                                                                                                                                                                     built-in data types: dates (D
Fbok




                                             Gmap




                                                                                          Lvly

                                                                                                         Twit

                                                                                                                  Wiki
               Flkr

                              Gmai




                                                            Goog

                                                                           IShk




                                                                                                                                               Word

                                                                                                                                                           Ebay

                                                                                                                                                                         YTub

                                                                                                                                                                                       All*
                                                                                                                 0.2




                                                                                                                                                                                                                                     ument and layout objects
                                                                                                                                                   4




                                                                                                                                                                                                                                     rors. The remaining objec
                                      Lvly

                                                     Twit

                                                                    Wiki
        Goog

                       IShk




                                                                                   Word

                                                                                                  Ebay

                                                                                                                 0.0 YTub
                                                                                                                                        All*




                                                                                                                                                                                                                                     mous objects, instances, fu
                                                                                                                                                                                                                                     jects are constructed with a
                                                                                                                                                   3




                                                                                                                                                   Figure 5. Instruction mix. The per-site proportion of read, write,                while instances are constr
                                                                                                                                                        280S

                                                                                                                                                                     BING

                                                                                                                                                                                   BLOG




                                                                                                                                                                                                               EBAY

                                                                                                                                                                                                                             FBOK




                                                                                                                                                                                                                                                                                           LIVE




                                                                                                                                                                                                                                                                                                                                        ALL*
                                                                                                                                                                                                 DIGG




                                                                                                                                                                                                                                           FLKR

                                                                                                                                                                                                                                                       GMIL

                                                                                                                                                                                                                                                                GMAP

                                                                                                                                                                                                                                                                         GOGL

                                                                                                                                                                                                                                                                                  ISHK




                                                                                                                                                                                                                                                                                                   MECM

                                                                                                                                                                                                                                                                                                           TWIT

                                                                                                                                                                                                                                                                                                                   WIKI

                                                                                                                                                                                                                                                                                                                          WORD

                                                                                                                                                                                                                                                                                                                                 YTUB
                                                                                                                                                   delete, call instructions (averaged over multiple traces).                        A function object is creat
                                                                                                                                                   2




                                                                                                                                                                                                              An Analysis of the Dynamic Behavior ofthe interpreter a
                                                                                                                                                                                                                                     uated by JavaScript Programs
Optimize method call
bar can be anything



function foo(bar) {
    return bar.pro();
}
Adaptive optimization for Self
Polymorphic inline cache
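A rough sketch of the idea behind a polymorphic inline cache for the call site bar.pro(). It is a simplified model, not how any engine implements it: the "shape" of the receiver is approximated by its prototype (so an own property named pro would be missed), and makeCallSite, maxEntries and stats are hypothetical names chosen for illustration.

```javascript
// Model of one call site that remembers the method it found for each
// receiver shape it has already seen.
function makeCallSite(methodName, maxEntries = 4) {
  const cache = []; // entries: { shape, target }
  const stats = { hits: 0, misses: 0 };

  function call(receiver, ...args) {
    // Assumption: the prototype stands in for the hidden class.
    const shape = Object.getPrototypeOf(receiver);
    for (const entry of cache) {
      if (entry.shape === shape) {
        // Fast path: one shape comparison, no property lookup.
        stats.hits++;
        return entry.target.apply(receiver, args);
      }
    }
    // Slow path: full lookup, then remember the result for this shape.
    stats.misses++;
    const target = receiver[methodName];
    if (cache.length < maxEntries) cache.push({ shape, target });
    return target.apply(receiver, args);
  }

  return { call, stats };
}

class A { pro() { return "A"; } }
class B { pro() { return "B"; } }

const site = makeCallSite("pro");
const results = [new A(), new B(), new A(), new B()].map((o) => site.call(o));
console.log(results, site.stats);
```

The first call for each class misses and fills the cache; later calls with the same shape take the fast path.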
Tagged pointer
typedef union {
  void *p;
  double d;
  long l;
} Value;

typedef struct {
  unsigned char type;
  Value value;
} Object;

Object a;    /* sizeof(a)?? */

If everything is an object, the overhead is too high for small integers.
Tagged pointer

   On almost all systems, pointer addresses are aligned (to 4 or 8 bytes)


“The address of a block returned by malloc or realloc in the GNU system is
always a multiple of eight (or sixteen on 64-bit systems).”
                             http://www.gnu.org/s/libc/manual/html_node/Aligned-Memory-Blocks.html
Tagged pointer

Example: 0xc00ab958 (the pointer’s last 2 or 3 bits must be 0)

    last hex digit 8 = 1 0 0 0    Pointer
    last hex digit 9 = 1 0 0 1    Small Number
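A small sketch of low-bit tagging, modelled on 32-bit words in JavaScript. It follows the picture above (low bit 0 for a pointer, low bit 1 for a small number); the exact tag assignment differs between engines, and the function names here are made up for illustration.

```javascript
// Assumption: "pointers" are at least 4-byte aligned, so their lowest bit
// is always 0 and is free to act as a type tag.
const TAG_SMALL_INT = 1;

function boxSmallInt(n) {
  // One bit is spent on the tag, so only 31 bits are left for the value.
  if (!Number.isInteger(n) || n < -(2 ** 30) || n >= 2 ** 30) {
    throw new RangeError("not a small integer");
  }
  return (n << 1) | TAG_SMALL_INT;
}

function isSmallInt(word) {
  return (word & 1) === TAG_SMALL_INT;
}

function isPointer(word) {
  return (word & 1) === 0;
}

function unboxSmallInt(word) {
  return word >> 1; // arithmetic shift keeps the sign
}

const pointerWord = 0xc00ab958 | 0; // the aligned address from the slide
const numberWord = boxSmallInt(4);  // binary ...1001

console.log(isPointer(pointerWord), isSmallInt(numberWord), unboxSmallInt(numberWord));
```

No heap object is allocated for the integer: the value lives in the word itself, which is the saving the slide is after.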
How about double?
NaN-tagging (JSC 64 bit)
On a 64-bit system, pointers use only 48 bits, so the top 16 bits are always 0

           * The top 16-bits denote the type of the encoded JSValue:
           *
           *     Pointer {  0000:PPPP:PPPP:PPPP
           *              / 0001:****:****:****
           *     Double  {         ...
           *              \ FFFE:****:****:****
           *     Integer {  FFFF:0000:IIII:IIII
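The layout above can be modelled with BigInt as a rough sketch. The assumption here is that doubles are shifted into the 0001..FFFE range by adding 2^48 to their bit pattern and that NaN is canonicalized first; the helper names are hypothetical and this is not the JSC source.

```javascript
const DOUBLE_OFFSET = 1n << 48n;          // moves doubles into 0001..FFFE
const INT_TAG = 0xffff000000000000n;      // FFFF:0000:IIII:IIII
const CANONICAL_NAN = 0x7ff8000000000000n;

const scratch = new DataView(new ArrayBuffer(8));

function doubleToBits(d) {
  if (Number.isNaN(d)) return CANONICAL_NAN; // keep NaN payloads out of the tag space
  scratch.setFloat64(0, d);
  return scratch.getBigUint64(0);
}

function bitsToDouble(bits) {
  scratch.setBigUint64(0, bits);
  return scratch.getFloat64(0);
}

function encodeDouble(d) { return doubleToBits(d) + DOUBLE_OFFSET; }
function encodeInt32(i) { return INT_TAG | BigInt(i >>> 0); }
function encodePointer(addr) { return addr; } // addr: BigInt below 2^48

function typeOfValue(v) {
  const top = v >> 48n; // the top 16 bits
  if (top === 0n) return "pointer";
  if (top === 0xffffn) return "integer";
  return "double";
}

function decode(v) {
  switch (typeOfValue(v)) {
    case "integer": return Number(v & 0xffffffffn) | 0;
    case "double": return bitsToDouble(v - DOUBLE_OFFSET);
    default: return v; // pointer payload, returned as-is
  }
}

console.log(typeOfValue(encodeDouble(1.5)), decode(encodeDouble(1.5)));
console.log(typeOfValue(encodeInt32(-7)), decode(encodeInt32(-7)));
console.log(typeOfValue(encodePointer(0x7f00ab958n)));
```

A single 64-bit word can then carry a pointer, a double or a 32-bit integer, and the type check is a comparison on the top 16 bits.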
V8

•   Lars Bak

•   Hidden Class, PICs

•   Built-in objects written in JavaScript

•   Crankshaft

•   Precise generational GC
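A toy model of the hidden-class idea from the list above: objects that gain the same properties in the same order share one layout description, so a property access becomes a fixed slot offset. HiddenClass and ModelObject are hypothetical names; V8's real maps and transitions are considerably more involved.

```javascript
class HiddenClass {
  constructor(offsets = new Map()) {
    this.offsets = offsets;       // property name -> slot index
    this.transitions = new Map(); // property name -> next HiddenClass
  }

  // Returns the class reached by adding `name`, creating it only once.
  addProperty(name) {
    let next = this.transitions.get(name);
    if (!next) {
      const offsets = new Map(this.offsets);
      offsets.set(name, offsets.size);
      next = new HiddenClass(offsets);
      this.transitions.set(name, next);
    }
    return next;
  }
}

const ROOT = new HiddenClass();

class ModelObject {
  constructor() {
    this.hiddenClass = ROOT;
    this.slots = [];
  }
  set(name, value) {
    if (!this.hiddenClass.offsets.has(name)) {
      this.hiddenClass = this.hiddenClass.addProperty(name);
    }
    this.slots[this.hiddenClass.offsets.get(name)] = value;
  }
  get(name) {
    const offset = this.hiddenClass.offsets.get(name);
    return offset === undefined ? undefined : this.slots[offset];
  }
}

const p1 = new ModelObject(); p1.set("x", 1); p1.set("y", 2);
const p2 = new ModelObject(); p2.set("x", 3); p2.set("y", 4);
const p3 = new ModelObject(); p3.set("y", 5); p3.set("x", 6);

console.log(p1.hiddenClass === p2.hiddenClass); // same insertion order
console.log(p1.hiddenClass === p3.hiddenClass); // different insertion order
```

Because p1 and p2 share a class, an inline cache keyed on that class can serve both; p3 ends up with a different class only because its properties were added in another order.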
Lars Bak

•   Has implemented VMs since 1988

•   Beta

•   Self

•   HotSpot
Source code -> Native Code

Source code -> High-Level IR -> Low-Level IR -> Opt Native Code    (Crankshaft)
HotSpot client compiler
Crankshaft

•   Profiling

•   Compiler optimization

•   On-stack replacement

•   Deoptimization
High-Level IR (Hydrogen)
     •   function inlining

     •   type inference

     •   stack check elimination

     •   loop-invariant code motion

     •   common subexpression elimination

     •   ...

http://wingolog.org/archives/2011/08/02/a-closer-look-at-crankshaft-v8s-optimizing-compiler
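To make two of the passes above concrete, here is a source-level illustration of loop-invariant code motion and common subexpression elimination. The compiler performs these on its IR, not on source text, so this is only a hand-written before/after under that assumption; the function names are invented.

```javascript
// Before: what the programmer wrote.
function scaleNaive(values, a, b) {
  const out = [];
  for (let i = 0; i < values.length; i++) {
    const factor = Math.sqrt(a * a + b * b);           // same value on every iteration
    out.push(values[i] * factor + values[i] * factor); // subexpression computed twice
  }
  return out;
}

// After: roughly what the two passes achieve.
function scaleOptimized(values, a, b) {
  const out = [];
  const factor = Math.sqrt(a * a + b * b); // hoisted out of the loop
  for (let i = 0; i < values.length; i++) {
    const t = values[i] * factor;          // computed once, reused
    out.push(t + t);
  }
  return out;
}

console.log(scaleNaive([1, 2], 3, 4), scaleOptimized([1, 2], 3, 4));
```

Both versions return the same result; the second does less work per iteration.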
Low-Level IR (Lithium)


  •   linear-scan register allocator

  •   code generation

  •   lazy deoptimization




http://wingolog.org/archives/2011/09/05/from-ssa-to-native-code-v8s-lithium-language
Built-in objects written in JS
function ArraySort(comparefn) {
  if (IS_NULL_OR_UNDEFINED(this) && !IS_UNDETECTABLE(this)) {
    throw MakeTypeError("called_on_null_or_undefined",
                        ["Array.prototype.sort"]);
  }

  // In-place QuickSort algorithm.
  // For short (length <= 22) arrays, insertion sort is used for efficiency.

  if (!IS_SPEC_FUNCTION(comparefn)) {
    comparefn = function (x, y) {
      if (x === y) return 0;
      if (%_IsSmi(x) && %_IsSmi(y)) {
        return %SmiLexicographicCompare(x, y);
      }
      x = ToString(x);
      y = ToString(y);
      if (x == y) return 0;
      else return x < y ? -1 : 1;
    };
  }
  ...

                                     v8/src/array.js
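The default comparefn above compares values as strings, even for small integers. This short example shows the visible effect from user code; it relies only on the standard Array.prototype.sort behavior, and the variable names are illustrative.

```javascript
const numbers = [10, 9, 1, 100];

// No comparator: elements are ordered by their string form.
const defaultOrder = numbers.slice().sort();

// Explicit numeric comparator.
const numericOrder = numbers.slice().sort((x, y) => x - y);

console.log(defaultOrder); // [1, 10, 100, 9]
console.log(numericOrder); // [1, 9, 10, 100]
```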
GC
V8 performance
Can V8 be faster?
Dart
•   Clear syntax, Optional types, Libraries

•   Performance

•   Can compile to JavaScript

•   But IE, WebKit and Mozilla rejected it

•   What do you think?

    •   My thought: Will XML replace HTML? No, but thanks,
        Google, for pushing the web forward
Embed V8
Embed
Expose Function
v8::Handle<v8::Value> Print(const v8::Arguments& args) {
  for (int i = 0; i < args.Length(); i++) {
    v8::HandleScope handle_scope;
    v8::String::Utf8Value str(args[i]);
    const char* cstr = ToCString(str);
    printf("%s", cstr);
  }
  return v8::Undefined();
}



v8::Handle<v8::ObjectTemplate> global = v8::ObjectTemplate::New();
global->Set(v8::String::New("print"), v8::FunctionTemplate::New(Print));
Node.JS
•   Pros                              •   Cons

    •   Async                             •   Lack of great libraries

    •   One language for everything       •   ES5 code hard to maintain

    •   Faster than PHP, Python           •   Still too young

    •   Community
JavaScriptCore (Nitro)
Where does it come from?
1997 Macworld
“Apple has decided to make Internet Explorer its default browser
on Macintosh.”

“Since we believe in choice, we’re going to be shipping other Internet
browsers...”
                                                        Steve Jobs
JavaScriptCore History
•   2001 KJS (kde-2.2)           •   2008 SquirrelFish Extreme

    •   Bison                        •   PICs

    •   AST interpreter              •   method JIT

•   2008 SquirrelFish                •   regular expression JIT

    •   Bytecode (register)          •   DFG JIT (March 2011)

    •   Direct threading
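A minimal sketch of a register-based bytecode interpreter with a dispatch loop that approximates direct threading. JavaScript has no computed goto, so the assumption here is that each instruction is pre-decoded into a direct reference to its handler function; the instruction set (loadk, add, ret) is invented and is not SquirrelFish's.

```javascript
// Handlers operate on numbered registers instead of an operand stack.
const handlers = {
  loadk(vm, dst, constant) { vm.regs[dst] = constant; },
  add(vm, dst, lhs, rhs) { vm.regs[dst] = vm.regs[lhs] + vm.regs[rhs]; },
  ret(vm, src) { vm.result = vm.regs[src]; vm.done = true; },
};

// "Threading": resolve every opcode name to its handler once, up front,
// so the run loop never has to decode an opcode again.
function thread(program) {
  return program.map(([op, ...operands]) => {
    const handler = handlers[op];
    if (!handler) throw new Error("unknown opcode " + op);
    return { handler, operands };
  });
}

function run(program) {
  const code = thread(program);
  const vm = { regs: [], pc: 0, done: false, result: undefined };
  while (!vm.done) {
    const { handler, operands } = code[vm.pc++];
    handler(vm, ...operands);
  }
  return vm.result;
}

const program = [
  ["loadk", 0, 10],  // r0 = 10
  ["loadk", 1, 32],  // r1 = 32
  ["add", 2, 0, 1],  // r2 = r0 + r1
  ["ret", 2],        // return r2
];

console.log(run(program));
```

A stack machine would need separate push and pop instructions for the same addition; the register form names its operands directly.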
Interpreter




AST   Bytecode   Method JIT




                 SSA           DFG JIT
SpiderMonkey
Monkey
•   SpiderMonkey                  •   JägerMonkey

    •   Written by Brendan Eich       •   PICs

    •   interpreter                   •   method JIT (from JSC)

•   TraceMonkey                   •   IonMonkey

    •   trace JIT                     •   Type Inference

    •   removed                       •   Compiler optimization
IonMonkey
•   SSA

•   function inlining

•   linear-scan register allocation

•   dead code elimination

•   loop-invariant code motion

•   ...
http://www.arewefastyet.com/
Chakra (IE9)
Chakra

•   Interpreter/JIT

•   Type System (hidden class)

•   PICs

•   Deferred parsing

•   Uses UTF-8 internally
Unlocking the JavaScript Opportunity with Internet Explorer 9
Carakan (Opera)
Carakan

•   Register VM

•   Method JIT, Regex JIT

•   Hidden type

•   Function inlining
Rhino and JVM
Rhino is SLOW, why?
Because JVM is slow?
JVM didn’t support dynamic
     languages well
Solution: invokedynamic
Hard to optimize in JVM




Before   Caller -> Some tricks -> Method

After    Caller -> Invokedynamic (method handle) -> Method
One ring to rule them all?
Rhino + invokedynamic
•   Pros                               •   Cons

    •   Easier to implement                •   Only in JVM7

    •   Lots of great Java Libraries       •   Not fully optimized yet

    •   JVM optimization for free          •   Hard to beat V8
Compiler optimization is HARD
Is there an easy way?
LLVM
•   Clang, VMKit, GHC, PyPy, Rubinius ...

•   DragonEgg: replace GCC back-end

•   IR

•   Optimization

•   Link, Code generate, JIT

•   Apple
LLVM simplify
int foo(int bar) {
    int one = 1;
    return bar + one;
}
int main() {
  foo(3);
}

define i32 @foo(i32 %bar) nounwind ssp {
entry:
  %bar_addr = alloca i32, align 4
  %retval = alloca i32
  %0 = alloca i32
  %one = alloca i32
  %"alloca point" = bitcast i32 0 to i32
  store i32 %bar, i32* %bar_addr
  store i32 1, i32* %one, align 4
  %1 = load i32* %bar_addr, align 4
  %2 = load i32* %one, align 4
  %3 = add nsw i32 %1, %2
  store i32 %3, i32* %0, align 4
  %4 = load i32* %0, align 4
  store i32 %4, i32* %retval, align 4
  br label %return

return:
  %retval1 = load i32* %retval
  ret i32 %retval1
}

define i32 @main() nounwind ssp {
entry:
  %retval = alloca i32
  %"alloca point" = bitcast i32 0 to i32
  %0 = call i32 @foo(i32 3) nounwind ssp
  br label %return

return:
  %retval1 = load i32* %retval
  ret i32 %retval1
}
define i32 @foo(i32 %bar) nounwind ssp {
entry:
  %bar_addr = alloca i32, align 4
  %retval = alloca i32
  %0 = alloca i32
  %one = alloca i32
  %"alloca point" = bitcast i32 0 to i32
  store i32 %bar, i32* %bar_addr
  store i32 1, i32* %one, align 4
  %1 = load i32* %bar_addr, align 4
  %2 = load i32* %one, align 4
  %3 = add nsw i32 %1, %2
  store i32 %3, i32* %0, align 4
  %4 = load i32* %0, align 4
  store i32 %4, i32* %retval, align 4
  br label %return

return:
  %retval1 = load i32* %retval
  ret i32 %retval1
}

define i32 @main() nounwind ssp {
entry:
  %retval = alloca i32
  %"alloca point" = bitcast i32 0 to i32
  %0 = call i32 @foo(i32 3) nounwind ssp
  br label %return

return:
  %retval1 = load i32* %retval
  ret i32 %retval1
}

Optimization

define i32 @foo(i32 %bar) nounwind readnone ssp {
entry:
  %0 = add nsw i32 %bar, 1
  ret i32 %0
}

define i32 @main() nounwind readnone ssp {
entry:
  ret i32 undef
}
Optimization (70+)




    http://llvm.org/docs/Passes.html
define i32 @foo(i32 %bar) nounwind readnone ssp {
entry:
  %0 = add nsw i32 %bar, 1
  ret i32 %0
}

define i32 @main() nounwind readnone ssp {
entry:
  ret i32 undef
}

-> LLVM backend
[Diagram: Compiler FE 1 ... Compiler FE N emit LLVM in .o files; Linker (IPO/IPA) with Libraries; Native CodeGen or JIT produce exe & LLVM; the CPU yields Profile & Trace Info for the Runtime Optimizer; Profile Info feeds the Offline Reoptimizer.]

                                       Figure 4: LLVM system architecture diagram
LLVM on JavaScript
Emscripten


•   C/C++ to LLVM IR

•   LLVM IR to JavaScript

•   Run in the browser
...

LLVM IR:

define i32 @foo(i32 %bar) nounwind readnone ssp {
entry:
  %0 = add nsw i32 %bar, 1
  ret i32 %0
}
define i32 @main() nounwind readnone ssp {
entry:
  ret i32 undef
}

Generated JavaScript:

function _foo($bar) {
  var __label__;
  var $0=((($bar)+1)|0);
  return $0;
}
function _main() {
  var __label__;
  return undef;
}
Module["_main"] = _main;

...
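
The `|0` in the generated `_foo` is what keeps JavaScript arithmetic inside signed 32-bit range, so that it lines up with the `i32` add in the IR. A minimal sketch of that idiom, assuming nothing beyond standard JavaScript; `addI32` is a hypothetical helper name for illustration, not something Emscripten emits:

```javascript
// Hypothetical helper illustrating the `|0` coercion seen in the generated code.
// JavaScript numbers are doubles; `| 0` converts the result to a signed
// 32-bit integer, wrapping around on overflow.
function addI32(a, b) {
  return (a + b) | 0;
}

console.log(addI32(2, 1));          // 3
console.log(addI32(2147483647, 1)); // -2147483648 (wraps past the 32-bit maximum)
```

Without the coercion the second call would produce 2147483648, a value no `i32` can hold.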
Emscripten demo

•   Python, Ruby, Lua virtual machine (http://repl.it/)

•   OpenJPEG

•   Poppler

•   FreeType

•   ...


                https://github.com/kripken/emscripten/wiki
Performance? good enough!

    benchmark             SM      V8      gcc     ratio
    fannkuch (10)        1.158   0.931   0.231    4.04
    fasta (2100000)      1.115   1.128   0.452    2.47
    primes               1.443   3.194   0.438    3.29
    raytrace (7,256)     1.930   2.944   0.228    8.46
    dlmalloc (400,400)   5.050   1.880   0.315    5.97

The first column is the name of the benchmark, and in parentheses any parameters used in running it. The source ...
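
The cut-off caption does not say how the ratio column is defined. Assuming it is the faster of the two JavaScript engine times divided by the gcc time (an inference from the numbers, not a statement from the source), this sketch reproduces the column to within rounding:

```javascript
// Rows copied from the table above: [benchmark, SM, V8, gcc, reported ratio].
const rows = [
  ["fannkuch (10)",      1.158, 0.931, 0.231, 4.04],
  ["fasta (2100000)",    1.115, 1.128, 0.452, 2.47],
  ["primes",             1.443, 3.194, 0.438, 3.29],
  ["raytrace (7,256)",   1.930, 2.944, 0.228, 8.46],
  ["dlmalloc (400,400)", 5.050, 1.880, 0.315, 5.97],
];

// Assumed definition: best JavaScript engine time relative to the gcc time.
function ratio(sm, v8, gcc) {
  return Math.min(sm, v8) / gcc;
}

for (const [name, sm, v8, gcc, reported] of rows) {
  console.log(name, ratio(sm, v8, gcc).toFixed(2), "reported:", reported);
}
```

Under that reading, the compiled JavaScript runs roughly 2.5 to 8.5 times slower than native code on these benchmarks.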
JavaScript on LLVM
Fabric Engine

•   JavaScript Integration

•   Native code compilation (LLVM)

•   Multi-threaded execution

•   OpenGL Rendering
Fabric Engine




http://fabric-engine.com/2011/11/server-performance-benchmarks/
Conclusion?
All problems in computer science can be solved
by another level of indirection


                               David Wheeler
References
•   The Behavior of Efficient Virtual Machine Interpreters on Modern Architectures
•   Virtual Machine Showdown: Stack Versus Registers
•   The Implementation of Lua 5.0
•   Why Is the New Google V8 Engine so Fast?
•   Context Threading: A Flexible and Efficient Dispatch Technique for Virtual Machine Interpreters
•   Effective Inline-Threaded Interpretation of Java Bytecode Using Preparation Sequences
•   Smalltalk-80: The Language and Its Implementation
•   Design of the Java HotSpot™ Client Compiler for Java 6
•   Oracle JRockit: The Definitive Guide
•   Virtual Machines: Versatile Platforms for Systems and Processes
•   Fast and Precise Hybrid Type Inference for JavaScript
•   LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
•   Emscripten: An LLVM-to-JavaScript Compiler
•   An Analysis of the Dynamic Behavior of JavaScript Programs
•   Adaptive Optimization for SELF
•   Bytecodes meet Combinators: invokedynamic on the JVM
•   Efficient Implementation of the Smalltalk-80 System
•   Design, Implementation, and Evaluation of Optimizations in a Just-In-Time Compiler
•   Optimizing Direct Threaded Code by Selective Inlining
•   Linear Scan Register Allocation
•   Optimizing Invokedynamic
•   Representing Type Information in Dynamically Typed Languages
•   Trace-based Just-in-Time Type Specialization for Dynamic Languages
•   The Structure and Performance of Efficient Interpreters
•   Know Your Engines: How to Make Your JavaScript Fast
•   IE Blog, Chromium Blog, WebKit Blog, Opera Blog, Mozilla Blog, Wingolog’s Blog, RednaxelaFX’s Blog, David Mandelin’s Blog...
Thank you

Virtual machine and javascript engine

  • 1. Virtual Machine & JavaScript Engine @nwind
  • 3. Take the red pill I will show you the rabbit hole.
  • 4. Virtual Machine history • pascal 1970 • smalltalk 1980 • self 1986 • python 1991 • java 1995 • javascript 1995
  • 5. The Smalltalk demonstration showed three amazing features. One was how computers could be networked; the second was how object-oriented programming worked. But Jobs and his team paid little attention to these attributes because they were so amazed by the third feature, ...
  • 6. How Virtual Machine Work? • Parser • Intermediate Representation (IR) • Interpreter • Garbage Collection • Optimization
  • 7. Parser • Tokenize • AST
  • 8. Tokenize identifier number keyword var foo = 10; semicolon space equal
  • 9. AST Assign Variable foo Constant 10
  • 10. { AST demo (Esprima) "type": "Program", "body": [ { "type": "VariableDeclaration", "declarations": [ { "id": { "type": "Identifier", "name": "foo" }, "init": { "type": "BinaryExpression", var foo = bar + 1; "operator": "+", "left": { "type": "Identifier", "name": "bar" }, "right": { "type": "Literal", "value": 1 } } } ], "kind": "var" } } ] http://esprima.org/demo/parse.html
  • 11. Intermediate Representation • Bytecode • Stack vs. register
  • 12. Bytecode (SpiderMonkey) 00000: deffun 0 null 00005: nop 00006: callvar 0 00009: int8 2 function foo(bar) { 00011: call 1 return bar + 1; 00014: pop } 00015: stop foo(2); foo: 00020: getarg 0 00023: one 00024: add 00025: return 00026: stop
  • 13. Bytecode (JSC) 8 m_instructions; 168 bytes at 0x7fc1ba3070e0; 1 parameter(s); 10 callee register(s) [ 0] enter [ 1] mov! ! r0, undefined(@k0) [ 4] get_global_var! r1, 5 [ 7] mov! ! r2, undefined(@k0) [ 10] mov! ! r3, 2(@k1) [ 13] call!! r1, 2, 10 function foo(bar) { [ 17] op_call_put_result! ! r0 return bar + 1; [ 19] end! ! r0 } Constants: k0 = undefined foo(2); k1 = 2 3 m_instructions; 64 bytes at 0x7fc1ba306e80; 2 parameter(s); 1 callee register(s) [ 0] enter [ 1] add! ! r0, r-7, 1(@k0) [ 6] ret! ! r0 Constants: k0 = 1 End: 3
  • 14. Stack vs. register • Stack • JVM, .NET, php, python, Old JavaScript engine • Register • Lua, Dalvik, All modern JavaScript engine • Smaller, Faster (about 30%) • RISC
  • 15. Stack vs. register local a,t,i 1: PUSHNIL 3 a=a+i 2: GETLOCAL 0 ; a 3: GETLOCAL 2 ; i 4: ADD local a,t,i 1: LOADNIL 0 2 0 5: SETLOCAL 0 ; a a=a+i 2: ADD 0 0 2 a=a+1 6: SETLOCAL 0 ; a a=a+1 3: ADD 0 0 250 ; a 7: ADDI 1 a=t[i] 4: GETTABLE 0 1 2 8: SETLOCAL 0 ; a a=t[i] 9: GETLOCAL 1 ; t 10: GETINDEXED 2 ; i 11: SETLOCAL 0 ; a
  • 16. Interpreter • Switch statement • Direct threading, Indirect threading, Token threading ...
  • 17. Switch statement while (true) { ! switch (opcode) { mov %edx,0xffffffffffffffe4(%rbp) ! ! case ADD: cmpl $0x1,0xffffffffffffffe4(%rbp) ! ! ! ... je 6e <interpret+0x6e> ! ! ! break; cmpl $0x1,0xffffffffffffffe4(%rbp) ! ! case SUB: jb 4a <interpret+0x4a> ! ! ! ... cmpl $0x2,0xffffffffffffffe4(%rbp) ! ! ! break; je 93 <interpret+0x93> ... jmp 22 <interpret+0x22> ! } ... }
  • 18. Direct threading typedef void *Inst; mov 0xffffffffffffffe8(%rbp),%rdx Inst program[] = { &&ADD, &&SUB }; lea 0xffffffffffffffe8(%rbp),%rax Inst *ip = program; addq $0x8,(%rax) goto *ip++; mov %rdx,0xffffffffffffffd8(%rbp) jmpq *0xffffffffffffffd8(%rbp) ADD: ... ADD: goto *ip++; ... mov 0xffffffffffffffe8(%rbp),%rdx SUB: lea 0xffffffffffffffe8(%rbp),%rax ... addq $0x8,(%rax) goto *ip++; mov %rdx,0xffffffffffffffd8(%rbp) jmp 2c <interpreter+0x2c> http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
  • 19. Garbage Collection • Reference counting (php, python ...), smart pointer • Tracing • Stop the world • Copying, Mark-and-sweep, Mark-and-compact • Generational GC • Precise vs. conservative
  • 20. Precise vs. conservative • Conservative • If it looks like a pointer, treat it as a pointer • Might have memory leak • Cant’ move object, have memory fragmentation • Precise • Indirectly vs. Directly reference
  • 21.
  • 22. It is time for the DARK Magic
  • 23.
  • 24. Optimization Magic • Interpreter optimization • Compiler optimization • JIT • Type inference • Hidden Type • Method inline, PICs
  • 27. CPU Pipeline • Fetch, Decode, Execute, Write-back • Branch prediction
  • 29. Solution: Inline Threading ICONST_1_START: *sp++ = 1; ICONST_1_END: goto **(pc++); INEG_START: sp[-1] = -sp[-1]; INEG_END: goto **(pc++); DISPATCH_START: goto **(pc++); DISPATCH_END: ; size_t iconst_size = (&&ICONST_1_END - &&ICONST_1_START); size_t ineg_size = (&&INEG_END - &&INEG_START); size_t dispatch_size = (&&DISPATCH_END - &&DISPATCH_START); void *buf = malloc(iconst_size + ineg_size + dispatch_size); void *current = buf; memcpy(current, &&ICONST_START, iconst_size); current += iconst_size; memcpy(current, &&INEG_START, ineg_size); current += ineg_size; memcpy(current, &&DISPATCH_START, dispatch_size); ... goto **buf; Interpreter? JIT!
  • 31. Compiler optimization • SSA • Data-flow • Control-flow • Loop • ...
  • 32. What a JVM can do... compiler tactics language-specific techniques loop transformations delayed compilation class hierarchy analysis loop unrolling Tiered compilation devirtualization loop peeling on-stack replacement symbolic constant propagation safepoint elimination delayed reoptimization autobox elimination iteration range splitting program dependence graph representation escape analysis range check elimination static single assignment representation lock elision loop vectorization proof-based techniques lock fusion global code shaping exact type inference de-reflection inlining (graph integration) memory value inference speculative (profile-based) techniques global code motion memory value tracking optimistic nullness assertions heat-based code layout constant folding optimistic type assertions switch balancing reassociation optimistic type strengthening throw inlining operator strength reduction optimistic array length strengthening control flow graph transformation null check elimination untaken branch pruning local code scheduling type test strength reduction optimistic N-morphic inlining local code bundling type test elimination branch frequency prediction delay slot filling algebraic simplification call frequency prediction graph-coloring register allocation common subexpression elimination memory and placement transformation linear scan register allocation integer range typing expression hoisting live range splitting flow-sensitive rewrites expression sinking copy coalescing conditional constant propagation redundant store elimination constant splitting dominating test detection adjacent store fusion copy removal flow-carried type narrowing card-mark elimination address mode matching dead code elimination merge-point splitting instruction peepholing DFA-based code generator http://www.oracle.com/us/technologies/java/java7-renaissance-vm-428200.pdf
  • 34. JIT • Method JIT, Trace JIT, Regular expression JIT • Code generation • Register allocation
  • 35. How JIT work? • mmap/new/malloc (mprotect) • generate native code • c cast/reinterpret_cast • call the function
  • 36. Trampoline (JSC x86) asm ( ".textn" ".globl " SYMBOL_STRING(ctiTrampoline) "n" // Execute the code! HIDE_SYMBOL(ctiTrampoline) "n" inline JSValue execute(RegisterFile* registerFile, SYMBOL_STRING(ctiTrampoline) ":" "n" CallFrame* callFrame, "pushl %ebp" "n" JSGlobalData* globalData) "movl %esp, %ebp" "n" { "pushl %esi" "n" JSValue result = JSValue::decode( "pushl %edi" "n" ctiTrampoline( "pushl %ebx" "n" m_ref.m_code.executableAddress(), "subl $0x3c, %esp" "n" registerFile, "movl $512, %esi" "n" callFrame, "movl 0x58(%esp), %edi" "n" 0, "call *0x50(%esp)" "n" Profiler::enabledProfilerReference(), "addl $0x3c, %esp" "n" globalData)); "popl %ebx" "n" return globalData->exception ? jsNull() : result; "popl %edi" "n" } "popl %esi" "n" "popl %ebp" "n" "ret" "n" );
  • 37. Register allocation • Linear scan • Graph coloring
  • 38. Code generation • Pipelining • SIMD (SSE2, SSE3 ...) • Debug
  • 40. a+b
  • 41.
  • 44. foo.bar in C 00001f63!movl! %ecx,0x04(%edx)
  • 45. __ZN2v88internal7HashMap6LookupEPvjb: 00000338! pushl!%ebp 00000339! pushl!%ebx 0000033a! pushl!%edi 0000033b! 0000033c! 0000033f! 00000343! 00000346! 00000349! pushl!%esi subl! $0x0c,%esp movl! 0x20(%esp),%esi movl! 0x08(%esi),%eax movl! 0x0c(%esi),%ecx imull!$0x0c,%ecx,%edi foo.bar in JavaScript 0000034c! leal! 0xff(%ecx),%ecx 0000034f! addl! %eax,%edi 00000351! movl! 0x28(%esp),%ebx 00000355! andl! %ebx,%ecx 00000357! imull!$0x0c,%ecx,%ebp 0000035a! addl! %eax,%ebp 0000035c! jmp! 0x0000036a 0000035e! nop 00000360! addl! $0x0c,%ebp 00000363! cmpl! %edi,%ebp 00000365! jb! 0x0000036a 00000367! movl! 0x08(%esi),%ebp 0000036a! 0000036d! 0000036f! movl! 0x00(%ebp),%eax testl!%eax,%eax je! 0x0000038b __ZN2v88internal7HashMap6LookupEPvjb 00000371! cmpl! %ebx,0x08(%ebp) 00000374! jne! 0x00000360 00000376! movl! %eax,0x04(%esp) 0000037a! 0000037e! 00000381! movl! 0x24(%esp),%eax movl! %eax,(%esp) call! *0x04(%esi) means: 00000384! testb!%al,%al 00000386! je! 0x00000360 00000388! movl! 0x00(%ebp),%eax 0000038b! 0000038d! 00000393! testl!%eax,%eax jne! 0x00000418 cmpb! $0x00,0x2c(%esp) v8::internal::HashMap::Lookup(void*, unsigned int, bool) 00000398! jne! 0x0000039e 0000039a! xorl! %ebp,%ebp 0000039c! jmp! 0x00000418 0000039e! movl! 0x24(%esp),%eax 000003a2! movl! %eax,0x00(%ebp) 000003a5! movl! $0x00000000,0x04(%ebp) 000003ac! movl! %ebx,0x08(%ebp) 000003af! movl! 0x10(%esi),%eax 000003b2! leal! 0x01(%eax),%ecx 000003b5! movl! %ecx,0x10(%esi) 000003b8! shrl! $0x02,%ecx 000003bb! leal! 0x01(%ecx,%eax),%eax ... 27 lines more
  • 47. Hidden Type add property x then add property y http://code.google.com/apis/v8/design.html
  • 48. But nothing is perfect
  • 49. one secret in V8 hidden class 20x times slower! http://jsperf.com/test-v8-delete
  • 50. in Figure 5, reads are far more common than writes: over all Write_indx roughly comparable to me 1. Write_prop Read_prop 0.8 traces the proportion of reads to writes is 6 to 1. Deletes comprise Write_hash Read_hash class-based languages, suc Write_indx Read_indx only .1% of all events. That graph further breaks reads, writes Write_prop Read_prop Delet_prop ric discussed in [23]. Studi But property are rarely deleted and deletes into various specific types; prop Delet_hash to accesses refers 0.8 Write_hash Read_hash DIT of 8 and a median of 0.6 Write_indx Read_indx Delet_indx Write_prop Read_prop Delet_prop Define and maximum of 10. Figu Write_hash Read_hash Write_indx Read_indx Delet_hash Delet_indx Create Call median prototype chain le 0.6 10 Write_prop Read_prop Delet_prop Define Throw chain length 1, the minimu 0.4 Write_hash Read_hash Delet_hash Create Catch Write_indx Read_indx Delet_indx Call have at least one prototyp Read_prop Define Object.prototype. The m 1.0 Delet_prop Throw 9 0.4 Read_hash Delet_hash Create Catch is 10. The majority of site 0.2 Read_indx Delet_indx Call Delet_prop Define Throw Delet_hash Create Catch reuse, but this is possibly 8 0.8 Delet_indx Call to achieve code reuse in J 0.2 Define Throw 0.0 Create Catch sures directly into a field o prototypes have similar in 7 Call 280s Fbok Apme Bing Blog Digg Flkr Gmai Gmap Lvly Twit Wiki Goog IShk Word Ebay YTub All* Prototype chain length Throw 0.6 0.4 Flkr 0.0 Catch Only 0.1% delete 5.4 Object Kinds 280s Fbok Gmai Gmap Lvly Twit Wiki Apme Bing Blog Digg Goog IShk Word Ebay YTub All* 6 280S BING BLOG EBAY FBOK DIGG FLKR GMIL GMAP GOGL ISHK LIVE MECM TWIT ALL* WIKI WORD YTUB Figure 7 breaks down the Fbok Bing Blog Digg Flkr Gmai Gmap Lvly Twit Wiki Goog IShk Word Ebay YTub All* into a number of categorie 5 built-in data types: dates (D Fbok Gmap Lvly Twit Wiki Flkr Gmai Goog IShk Word Ebay YTub All* 0.2 ument and layout objects 4 rors. 
The remaining objec Lvly Twit Wiki Goog IShk Word Ebay 0.0 YTub All* mous objects, instances, fu jects are constructed with a 3 Figure 5. Instruction mix. The per-site proportion of read, write, while instances are constr 280S BING BLOG EBAY FBOK LIVE ALL* DIGG FLKR GMIL GMAP GOGL ISHK MECM TWIT WIKI WORD YTUB delete, call instructions (averaged over multiple traces). A function object is creat 2 An Analysis of the Dynamic Behavior ofthe interpreter a uated by JavaScript Programs
  • 52. bar can be anything function foo(bar) { return bar.pro(); }
  • 56. Tagged pointer typedef union { void *p; double d; long l; } Value; typedef struct { unsigned char type; sizeof(a)?? Value value; } Object; if everything is object, it will be too much overhead for small integer Object a;
  • 57. Tagged pointer In almost all system, the pointer address will be aligned (4 or 8 bytes) “The address of a block returned by malloc or realloc in the GNU system is always a multiple of eight (or sixteen on 64-bit systems). ” http://www.gnu.org/s/libc/manual/html_node/Aligned-Memory-Blocks.html
  • 58. Tagged pointer Example: 0xc00ab958 the pointer’s last 2 or 3 bits must be 0 1 0 0 0 1 0 0 1 8 9 Pointer Small Number
  • 60. NaN-tagging (JSC 64 bit) In 64 bit system, we can only use 48 bits, that means it will have 16 bits are 0 * The top 16-bits denote the type of the encoded JSValue: * * Pointer { 0000:PPPP:PPPP:PPPP * / 0001:****:****:**** * Double { ... * FFFE:****:****:**** * Integer { FFFF:0000:IIII:IIII
  • 61. V8
  • 62.
  • 63. V8 • Lars Bak • Hidden Class, PICs • Built-in objects written in JavaScript • Crankshaft • Precise generation GC
  • 64.
  • 65. Lars Bak • implement VM since 1988 • Beta • Self • HotSpot
  • 66. Source code Native Code High-Level IR Low-Level IR Opt Native Code } Crankshaft
  • 68. Crankshaft • Profiling • Compiler optimization • On-stack replacement • Deoptimize
  • 69.
  • 70. High-Level IR (Hydrogen) • function inline • type inference • stack check elimination • loop-invariant code motion • common subexpression elimination • ... http://wingolog.org/archives/2011/08/02/a-closer-look-at-crankshaft-v8s-optimizing-compiler
  • 71. Low-Level IR (Lithium) • linear-scan register allocator • code generate • lazy deoptimization http://wingolog.org/archives/2011/09/05/from-ssa-to-native-code-v8s-lithium-language
  • 72.
  • 73. Built-in objects written in JS function ArraySort(comparefn) { if (IS_NULL_OR_UNDEFINED(this) && !IS_UNDETECTABLE(this)) { throw MakeTypeError("called_on_null_or_undefined", ["Array.prototype.sort"]); } // In-place QuickSort algorithm. // For short (length <= 22) arrays, insertion sort is used for efficiency. if (!IS_SPEC_FUNCTION(comparefn)) { comparefn = function (x, y) { if (x === y) return 0; if (%_IsSmi(x) && %_IsSmi(y)) { return %SmiLexicographicCompare(x, y); } x = ToString(x); y = ToString(y); if (x == y) return 0; else return x < y ? -1 : 1; }; } ... v8/src/array.js
  • 74. GC
  • 76. Can V8 be faster?
  • 77. Dart • Clear syntax, optional types, libraries • Performance • Can compile to JavaScript • But IE, WebKit and Mozilla rejected it • What do you think? • My thought: Will XML replace HTML? No, but thanks, Google, for pushing the web forward
  • 79. Embed
  • 80. Expose Function
      v8::Handle<v8::Value> Print(const v8::Arguments& args) {
        for (int i = 0; i < args.Length(); i++) {
          v8::HandleScope handle_scope;
          v8::String::Utf8Value str(args[i]);
          const char* cstr = ToCString(str);
          printf("%s", cstr);
        }
        return v8::Undefined();
      }

      v8::Handle<v8::ObjectTemplate> global = v8::ObjectTemplate::New();
      global->Set(v8::String::New("print"), v8::FunctionTemplate::New(Print));
  • 82. Node.JS • Pros: Async • One language for everything • Faster than PHP, Python • Community • Cons: Lack of great libraries • ES5 code hard to maintain • Still too young
  • 84. Where does it come from?
  • 86. “Apple has decided to make Internet Explorer its default browser on Macintosh.” “Since we believe in choice, we are going to be shipping other Internet browsers...” Steve Jobs
  • 87. JavaScriptCore History • 2001 KJS (kde-2.2): Bison, AST interpreter • 2008 SquirrelFish: Bytecode (register), direct threading • 2008 SquirrelFish Extreme: PICs, method JIT, regular expression JIT • DFG JIT (March 2011)
  • 88. [Diagram: Interpreter, AST, Bytecode, Method JIT, SSA, DFG JIT]
  • 91. Monkey • SpiderMonkey: written by Brendan Eich, interpreter • TraceMonkey: trace JIT, removed • JägerMonkey: PICs, method JIT (from JSC) • IonMonkey: type inference, compiler optimization
  • 92. IonMonkey • SSA • function inlining • linear-scan register allocation • dead code elimination • loop-invariant code motion • ...
  • 95. Chakra • Interpreter/JIT • Type system (hidden class) • PICs • Deferred parsing • Uses UTF-8 internally
  • 96. Unlocking the JavaScript Opportunity with Internet Explorer 9
  • 99. Carakan • Register VM • Method JIT, Regex JIT • Hidden type • Function inlining
  • 101. Rhino is SLOW, why?
  • 102. Because the JVM is slow?
  • 103. The JVM didn’t support dynamic languages well
  • 105. Hard to optimize in the JVM • Before: Caller -> some tricks -> Method • After (invokedynamic): Caller -> method handle -> Method
  • 106. One ring to rule them all?
  • 107. Rhino + invokedynamic • Pros: Easier to implement • Lots of great Java libraries • JVM optimization for free • Cons: Only in JVM7 • Not fully optimized yet • Hard to beat V8
  • 109. Is there an easy way?
  • 110. LLVM
  • 112. LLVM • Clang, VMKit, GHC, PyPy, Rubinius ... • DragonEgg: replace GCC back-end • IR • Optimization • Linking, code generation, JIT • Apple
  • 114. C source:
      int foo(int bar) {
          int one = 1;
          return bar + one;
      }
      int main() {
          foo(3);
      }

    LLVM IR:
      define i32 @foo(i32 %bar) nounwind ssp {
      entry:
        %bar_addr = alloca i32, align 4
        %retval = alloca i32
        %0 = alloca i32
        %one = alloca i32
        %"alloca point" = bitcast i32 0 to i32
        store i32 %bar, i32* %bar_addr
        store i32 1, i32* %one, align 4
        %1 = load i32* %bar_addr, align 4
        %2 = load i32* %one, align 4
        %3 = add nsw i32 %1, %2
        store i32 %3, i32* %0, align 4
        %4 = load i32* %0, align 4
        store i32 %4, i32* %retval, align 4
        br label %return
      return:
        %retval1 = load i32* %retval
        ret i32 %retval1
      }

      define i32 @main() nounwind ssp {
      entry:
        %retval = alloca i32
        %"alloca point" = bitcast i32 0 to i32
        %0 = call i32 @foo(i32 3) nounwind ssp
        br label %return
      return:
        %retval1 = load i32* %retval
        ret i32 %retval1
      }
  • 115. Optimization: the unoptimized IR for foo and main (same as slide 114) becomes:
      define i32 @foo(i32 %bar) nounwind readnone ssp {
      entry:
        %0 = add nsw i32 %bar, 1
        ret i32 %0
      }

      define i32 @main() nounwind readnone ssp {
      entry:
        ret i32 undef
      }
  • 116. Optimization (70+) http://llvm.org/docs/Passes.html
  • 117. Optimized IR handed to the LLVM backend:
      define i32 @foo(i32 %bar) nounwind readnone ssp {
      entry:
        %0 = add nsw i32 %bar, 1
        ret i32 %0
      }

      define i32 @main() nounwind readnone ssp {
      entry:
        ret i32 undef
      }
  • 118. [Figure 4: LLVM system architecture diagram. Components shown: compiler front ends (FE 1 ... FE N), .o files, LLVM linker with IPO/IPA, native code generation, LLVM exe, runtime optimizer and JIT, profile and trace info, offline reoptimizer]
  • 120. Emscripten • C/C++ to LLVM IR • LLVM IR to JavaScript • Runs in the browser
  • 121. Generated JavaScript:
      ...
      function _foo($bar) {
        var __label__;
        var $0=((($bar)+1)|0);
        return $0;
      }
      function _main() {
        var __label__;
        return undef;
      }
      Module["_main"] = _main;
      ...

    LLVM IR it was generated from:
      define i32 @foo(i32 %bar) nounwind readnone ssp {
      entry:
        %0 = add nsw i32 %bar, 1
        ret i32 %0
      }

      define i32 @main() nounwind readnone ssp {
      entry:
        ret i32 undef
      }
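The `|0` in the generated `_foo` is what keeps the result a 32-bit integer, matching the `i32` type in the IR. A short sketch of that effect, using a hand-written helper (`addI32` is a name chosen here, not something Emscripten emits):

```javascript
// JavaScript numbers are doubles; "| 0" truncates the result to a
// signed 32-bit integer, which gives C-like wraparound for int addition.
function addI32(a, b) {
  return (a + b) | 0;
}

const plain = 2147483647 + 1;           // stays a double: 2147483648
const wrapped = addI32(2147483647, 1);  // wraps to -2147483648

console.log(plain, wrapped, addI32(3, 1));
```

Besides preserving integer semantics, the coercion tells the JavaScript engine that the value is always an int32, which helps its type inference.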
  • 122. Emscripten demo • Python, Ruby, Lua virtual machine (http://repl.it/) • OpenJPEG • Poppler • FreeType • ... https://github.com/kripken/emscripten/wiki
  • 123. Performance? good enough!
      benchmark             SM      V8      gcc     ratio
      fannkuch (10)         1.158   0.931   0.231   4.04
      fasta (2100000)       1.115   1.128   0.452   2.47
      primes                1.443   3.194   0.438   3.29
      raytrace (7,256)      1.930   2.944   0.228   8.46
      dlmalloc (400,400)    5.050   1.880   0.315   5.97
    The first column is the name of the benchmark, and in parentheses any parameters used in running it.
  • 125. Fabric Engine • JavaScript Integration • Native code compilation (LLVM) • Multi-threaded execution • OpenGL Rendering
  • 128. All problems in computer science can be solved by another level of indirection David Wheeler