JavaScript Engine
  Performance
About me

•   Senior engineer at Baidu

•   Currently working mainly on performance optimization

•   Member of the W3C "HTML" and "Web Performance" working groups

             @nwind
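Please note

•   I am not a virtual machine expert; this is only a hobby

•   Much of the content is simplified; the real situation is far more complex

•   The views here are my own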
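Outline

•   Basic principles of virtual machines

•   How JavaScript engines optimize performance

•   An introduction to V8, Dart and Node.js

•   How to write high-performance JavaScript code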
VM basics
Virtual Machine history
•   Pascal 1970

•   Smalltalk 1980

•   Self 1986

•   Python 1991

•   Java 1995

•   JavaScript 1995
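The Smalltalk demo showed three amazing achievements, including how
computers could be networked and how object-oriented programming worked.

But Jobs and his team were not interested in those, because their attention was drawn by...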
How Does a Virtual Machine Work?

•   Parser

•   Intermediate Representation

•   Interpreter, JIT

•   Runtime, Garbage Collection
Parser


•   Tokenize

•   AST
Tokenize

          var foo = 10;

          var    keyword
          foo    identifier
          =      equal
          10     number
          ;      semicolon
AST

               Assign
              /      \
  Variable foo        Constant 10
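
The two parser steps above can be sketched in a few lines. This is a minimal illustration, not how any real engine parses: it only handles the single statement form `var <identifier> = <number>;`, and the function names `tokenize` and `parseAssignment` are made up for this example.

```javascript
// Token kinds follow the slide: keyword, identifier, equal, number, semicolon.
function tokenize(source) {
  const rules = [
    ["keyword", /^var\b/],
    ["identifier", /^[A-Za-z_$][\w$]*/],
    ["number", /^\d+/],
    ["equal", /^=/],
    ["semicolon", /^;/],
  ];
  const tokens = [];
  let rest = source.trim();
  while (rest.length > 0) {
    // The first rule that matches at the current position wins.
    const rule = rules.find(([, re]) => re.test(rest));
    if (!rule) throw new SyntaxError("Unexpected input: " + rest);
    const text = rest.match(rule[1])[0];
    tokens.push({ type: rule[0], text });
    rest = rest.slice(text.length).trim();
  }
  return tokens;
}

// Parses exactly one statement of the form `var <identifier> = <number>;`
// and returns the Assign node shown on the AST slide.
function parseAssignment(tokens) {
  const expected = ["keyword", "identifier", "equal", "number", "semicolon"];
  expected.forEach((type, i) => {
    if (!tokens[i] || tokens[i].type !== type) {
      throw new SyntaxError("Expected " + type + " at token " + i);
    }
  });
  return {
    type: "Assign",
    target: { type: "Variable", name: tokens[1].text },
    value: { type: "Constant", value: Number(tokens[3].text) },
  };
}

const ast = parseAssignment(tokenize("var foo = 10;"));
```

Here `ast` is an `Assign` node with a `Variable` child named `foo` and a `Constant` child holding 10.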
Intermediate Representation


•   Bytecode

•   Stack vs. register
Bytecode (SpiderMonkey)
function foo(bar) {
    return bar + 1;
}

foo(2);

                      00000:   deffun 0 null
                      00005:   nop
                      00006:   callvar 0
                      00009:   int8 2
                      00011:   call 1
                      00014:   pop
                      00015:   stop

                      foo:
                      00020:   getarg 0
                      00023:   one
                      00024:   add
                      00025:   return
                      00026:   stop
Bytecode (JSC)
function foo(bar) {
    return bar + 1;
}

foo(2);

                      8 m_instructions; 168 bytes at 0x7fc1ba3070e0;
                      1 parameter(s); 10 callee register(s)

                      [    0]   enter
                      [    1]   mov                  r0, undefined(@k0)
                      [    4]   get_global_var       r1, 5
                      [    7]   mov                  r2, undefined(@k0)
                      [   10]   mov                  r3, 2(@k1)
                      [   13]   call                 r1, 2, 10
                      [   17]   op_call_put_result   r0
                      [   19]   end                  r0

                      Constants:
                         k0 = undefined
                         k1 = 2

                      3 m_instructions; 64 bytes at 0x7fc1ba306e80;
                      2 parameter(s); 1 callee register(s)

                      [    0]   enter
                      [    1]   add                  r0, r-7, 1(@k0)
                      [    6]   ret                  r0

                      Constants:
                         k0 = 1

                      End: 3
Stack vs. register
•   Stack

    •   JVM, .NET, PHP, Python, older JavaScript engines

•   Register

    •   Lua, Dalvik, modern JavaScript engines

    •   Smaller, Faster (about 20%~30%)

    •   RISC
Stack vs. register
Stack-based (Lua 4.0)

local a,t,i    1:   PUSHNIL      3
a=a+i          2:   GETLOCAL     0   ; a
               3:   GETLOCAL     2   ; i
               4:   ADD
               5:   SETLOCAL     0   ; a
a=a+1          6:   GETLOCAL     0   ; a
               7:   ADDI         1
               8:   SETLOCAL     0   ; a
a=t[i]         9:   GETLOCAL     1   ; t
              10:   GETINDEXED   2   ; i
              11:   SETLOCAL     0   ; a

Register-based (Lua 5.0)

local a,t,i    1:   LOADNIL    0   2   0
a=a+i          2:   ADD        0   0   2
a=a+1          3:   ADD        0   0   250 ; a
a=t[i]         4:   GETTABLE   0   1   2
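
To make the difference concrete, here is a rough sketch of two toy interpreters running `a = a + i`. The opcode names are borrowed from the listing, but the instruction encoding (arrays of opcode plus operands) and the functions `runStack` and `runRegister` are assumptions made for this illustration only.

```javascript
// Stack machine: operands are pushed and popped implicitly.
function runStack(program, locals) {
  const stack = [];
  for (const [op, arg] of program) {
    switch (op) {
      case "GETLOCAL": stack.push(locals[arg]); break;
      case "SETLOCAL": locals[arg] = stack.pop(); break;
      case "ADD": {
        const right = stack.pop();
        stack.push(stack.pop() + right);
        break;
      }
      default: throw new Error("Unknown opcode " + op);
    }
  }
  return locals;
}

// Register machine: each instruction names its destination and operands.
function runRegister(program, regs) {
  for (const [op, dst, lhs, rhs] of program) {
    switch (op) {
      case "ADD": regs[dst] = regs[lhs] + regs[rhs]; break;
      default: throw new Error("Unknown opcode " + op);
    }
  }
  return regs;
}

// a = a + i, with a in slot 0 and i in slot 2.
const stackProgram = [["GETLOCAL", 0], ["GETLOCAL", 2], ["ADD"], ["SETLOCAL", 0]];
const registerProgram = [["ADD", 0, 0, 2]];

const stackResult = runStack(stackProgram, [5, null, 7]);
const registerResult = runRegister(registerProgram, [5, null, 7]);
```

Both runs leave 12 in slot 0, but the stack version dispatches four instructions where the register version dispatches one. Fewer dispatches is the main reason register bytecode tends to run faster.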
Interpreter


•   Switch statement

•   Direct threading, Indirect threading, Token threading ...
Switch statement
 while (true) {
     switch (opcode) {
         case ADD:
             ...
             break;

         case SUB:
             ...
             break;
         ...
     }
 }
Direct threading
typedef void *Inst;
Inst program[] = { &&ADD, &&SUB };   /* addresses of the handler labels */
Inst *ip = program;
goto **ip++;                         /* jump to *ip, then advance ip */

ADD:
    ...
    goto **ip++;

SUB:
    ...

http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
Threaded Code
http://en.wikipedia.org/wiki/File:Pipeline,_4_stage.svg
Context Threading
Essence of our Solution

Bytecode:
    …
    iload_1
    iload_1
    iadd
    istore_1
    iload_1
    bipush 64
    if_icmplt 2
    …

CTT - Context Threading Table (generated code):
    call   iload_1
    call   iload_1
    call   iadd
    call   istore_1
    call   iload_1
    ..

Bytecode bodies (ret terminated):
    iload_1:
      ..
      ret;

    iadd:
      ..
      ret;

Return Branch Predictor Stack

Package bodies as subroutines and call them

Source: Context Threading: A flexible and efficient dispatch technique for virtual machine interpreters
Garbage Collection

•   Reference counting (php, python ...), Smart pointer

•   Tracing

    •   Generational

    •   Stop-the-world, Concurrent, Incremental
    •   Copying, Sweep, Compact
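
As a rough illustration of the tracing approach, the sketch below implements a stop-the-world mark-and-sweep pass over a toy heap. The `Heap` class and its fields are invented for this example; a real collector works on raw memory rather than JavaScript objects.

```javascript
// A toy heap: every allocated object is tracked so the sweeper can visit it.
class Heap {
  constructor() {
    this.objects = new Set();
    this.roots = new Set();
  }
  allocate(name) {
    const obj = { name, refs: [], marked: false };
    this.objects.add(obj);
    return obj;
  }
  // Mark phase: flag everything reachable from the roots.
  mark() {
    const pending = [...this.roots];
    while (pending.length > 0) {
      const obj = pending.pop();
      if (obj.marked) continue;
      obj.marked = true;
      pending.push(...obj.refs);
    }
  }
  // Sweep phase: drop unmarked objects and reset the marks on survivors.
  sweep() {
    let freed = 0;
    for (const obj of this.objects) {
      if (obj.marked) {
        obj.marked = false;
      } else {
        this.objects.delete(obj);
        freed++;
      }
    }
    return freed;
  }
  collect() {
    this.mark();
    return this.sweep();
  }
}

const heap = new Heap();
const rootNode = heap.allocate("root");
const child = heap.allocate("child");
const cycleA = heap.allocate("cycleA");
const cycleB = heap.allocate("cycleB");
rootNode.refs.push(child);
// cycleA and cycleB reference each other but are unreachable from the roots.
cycleA.refs.push(cycleB);
cycleB.refs.push(cycleA);
heap.roots.add(rootNode);
const freedCount = heap.collect();
```

The unreachable cycle is reclaimed here, which plain reference counting cannot do without extra cycle detection.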
Why Is JavaScript Slow?

•   Dynamic Type

•   Weak Type

•   Need to parse every time

•   GC
Fight with Weak Type
Object model in most VM
     typedef union {
       void *p;
       double d;
       long l;
     } Value;

     typedef struct {
       unsigned char type;
       Value value;
     } Object;

     Object a;
Tagged pointer
On almost all systems, pointer addresses are aligned (to 4 or 8 bytes)




         http://www.gnu.org/s/libc/manual/html_node/Aligned-Memory-Blocks.html
This means
0xc00ab958               the last 2 or 3 bits of a pointer are always 0


             so the last bit can be set to 1 to mark a pointer

       1     0   0   1      1   0   0   0
              9                  8
           Pointer         Small Number
Tagged pointer
                Memory
                    ...
var a = 1           2

var b = {a:1}   0x3d2aa00
                    ...
                    ...
                 object b
                    ...
Small Number

 2^30 − 1 =  1073741823

−2^30     = −1073741824

 31 bits can represent about a billion, which is enough for most applications
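
The tagging scheme can be sketched as follows, assuming a 32-bit word, the convention from the slides (low bit 0 for a small number, low bit 1 for a pointer) and 4-byte alignment. The helper names are hypothetical, and a real VM does this in C++ on machine words rather than on JavaScript numbers.

```javascript
const SMI_MIN = -(2 ** 30);
const SMI_MAX = 2 ** 30 - 1;

// Small integers are shifted left by one, leaving the low bit 0.
function tagSmallInt(n) {
  if (!Number.isInteger(n) || n < SMI_MIN || n > SMI_MAX) {
    throw new RangeError(n + " does not fit in 31 bits");
  }
  return n << 1;
}

// Pointers are aligned, so their low bit is free to be set to 1.
function tagPointer(address) {
  if (address % 4 !== 0) throw new RangeError("address must be 4-byte aligned");
  return (address | 1) >>> 0;
}

function isSmallInt(word) {
  return (word & 1) === 0;
}

function untag(word) {
  // >> is an arithmetic shift, so negative small integers keep their sign.
  return isSmallInt(word) ? word >> 1 : (word & ~1) >>> 0;
}
```

With this encoding `tagSmallInt(1)` is stored as 2, matching the memory diagram above. Checking the type costs a single bit test instead of a load from a separate type field.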
External Fixed Typed Array

•   Strong type, Fixed length

•   Out of VM heap

•   Example: Int32Array, Float64Array
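
A short example of the properties listed above, using the standard `Int32Array`, `Float64Array`, `Uint8Array` and `ArrayBuffer` constructors. The variable names are only for illustration.

```javascript
// Fixed length, fixed element type: every slot is a 32-bit signed integer.
const counts = new Int32Array(4);
counts[0] = 7;
counts[1] = 3.9;       // stored as 3: the fractional part is dropped
counts[2] = 2 ** 31;   // stored as -2147483648: values wrap modulo 2^32

// Several typed views can share one underlying buffer.
const buffer = new ArrayBuffer(16);
const floats = new Float64Array(buffer);   // two 64-bit floats
const bytes = new Uint8Array(buffer);      // the same 16 bytes
floats[0] = 1.5;
```

Because the element type and length never change, the engine can store the raw values directly and does not need to box or tag each element.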
Small Number + Typed Array
[Bar chart: fannkuch-redux run time in seconds (smaller is better). C/C++ 50, Java (HotSpot) 70, V8 80, PHP 3180, Ruby 4020, Python 4200. The gap between V8 and PHP is marked 40x.]

                               http://shootout.alioth.debian.org/u32/performance.php?test=fannkuchredux
Warning: Benchmark
       lies
Structs are proposed for ES6
ES6 StructType
Point2D = new StructType({
    x: uint32,
    y: uint32
});

Color = new StructType({
    r: uint8,
    g: uint8,
    b: uint8
});

Pixel = new StructType({
    point: Point2D,
    color: Color
});
Use typed arrays to run faster
Fight with Dynamic
       Type
foo.bar
foo.bar in C

movl 4(%edx), %ecx   //get
movl %ecx, 4(%edx)   //put
foo.bar in JavaScript
found = HashTable.FindEntry(key)
if (found) return found;

for (pt = GetPrototype();
       pt != null;
       pt = pt.GetPrototype()) {
    found = pt.HashTable.FindEntry(key)
    if (found) return found;
}
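
A runnable version of the lookup above, as a sketch only: the `DictObject` class, its `Map`-based table and the probe counter are assumptions added to show how the cost grows with the depth of the prototype chain.

```javascript
// Each object owns a hash table of properties plus a link to its prototype.
class DictObject {
  constructor(prototype = null) {
    this.table = new Map();
    this.prototype = prototype;
  }
  set(key, value) {
    this.table.set(key, value);
  }
  // Searches the object, then each prototype in turn; counts hash probes.
  get(key) {
    let probes = 0;
    for (let obj = this; obj !== null; obj = obj.prototype) {
      probes++;
      if (obj.table.has(key)) return { value: obj.table.get(key), probes };
    }
    return { value: undefined, probes };
  }
}

const base = new DictObject();
base.set("toString", "base.toString");
const middle = new DictObject(base);
const leaf = new DictObject(middle);
leaf.set("bar", 42);

const own = leaf.get("bar");
const inherited = leaf.get("toString");
const missing = leaf.get("nope");
```

An own property costs one hash probe, while an inherited or missing property costs one probe per object in the chain. Compare that with the single `movl` in the C version.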
How to optimize?
First, we need to know
     the object layout
Add Type for object

[Figure: the object's hidden class changes when property x is added, and again when property y is added]

                     http://code.google.com/apis/v8/design.html
Inline Cache

•   Slow lookup the first time

•   Patch the JIT code in place

•   The next access jumps directly to the cached address
Inline cache, simplified

function fun(foo) {
    return foo.bar;
}

Before (generic lookup):

    return foo.lookupProperty(bar);

After (inline cache patched in):

    if (foo[hiddenClass] == 0xfe1) {
        return foo[indexOf_bar];
    }
    return foo.lookupProperty(bar);
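
The interplay of hidden classes and inline caches can be simulated in plain JavaScript. This is a simplified model, not V8's implementation: `HiddenClass`, `ShapedObject` and `makeGetter` are hypothetical names, and a real engine patches machine code instead of updating a `site` record.

```javascript
// A hidden class maps property names to slot indexes; adding a property
// follows (or creates) a transition to another hidden class.
class HiddenClass {
  constructor(slots = new Map()) {
    this.slots = slots;
    this.transitions = new Map();
  }
  withProperty(name) {
    if (!this.transitions.has(name)) {
      const slots = new Map(this.slots);
      slots.set(name, slots.size);
      this.transitions.set(name, new HiddenClass(slots));
    }
    return this.transitions.get(name);
  }
}

const EMPTY_CLASS = new HiddenClass();

class ShapedObject {
  constructor() {
    this.hiddenClass = EMPTY_CLASS;
    this.storage = [];
  }
  add(name, value) {
    if (!this.hiddenClass.slots.has(name)) {
      this.hiddenClass = this.hiddenClass.withProperty(name);
    }
    this.storage[this.hiddenClass.slots.get(name)] = value;
    return this;
  }
}

// One cache per property-access site: it remembers the last hidden class
// seen and the slot index that the slow lookup found for it.
function makeGetter(name) {
  const site = { cachedClass: null, cachedSlot: -1, hits: 0, misses: 0 };
  const get = (obj) => {
    if (obj.hiddenClass === site.cachedClass) {
      site.hits++;
      return obj.storage[site.cachedSlot];   // fast path: one compare, one load
    }
    site.misses++;
    const slot = obj.hiddenClass.slots.get(name);   // slow path
    if (slot === undefined) return undefined;
    site.cachedClass = obj.hiddenClass;
    site.cachedSlot = slot;
    return obj.storage[slot];
  };
  return { get, site };
}

const p1 = new ShapedObject().add("x", 1).add("y", 2);
const p2 = new ShapedObject().add("x", 10).add("y", 20);
const other = new ShapedObject().add("y", 5);
const getY = makeGetter("y");
const results = [getY.get(p1), getY.get(p2), getY.get(other)];
```

`p1` and `p2` were built in the same order, so they share a hidden class and the second access hits the cache. `other` has a different layout and falls back to the slow lookup.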
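In real-world code, JavaScript is not that dynamic
delete operations account for only 0.1%
                     “An Analysis of the Dynamic Behavior of JavaScript...”

99% of primitive types can be determined by static analysis
97% of property accesses can be inline cached

                    “TypeCastor: Demystify Dynamic Typing of JavaScript...”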
V8 can’t handle delete yet




                                         20x slower!


      http://jsperf.com/test-v8-delete
Avoid altering an object's
     property layout
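
One way to follow this advice is shown below. It is only a sketch: the `Point` constructor and `clearLabel` helper are made up, and how much an engine gains from it depends on the engine and its version.

```javascript
// Every Point gets the same properties in the same order.
function Point(x, y) {
  this.x = x;
  this.y = y;
  this.label = null;   // reserved up front instead of being added later
}

// Clears a field without changing the set of properties.
function clearLabel(point) {
  point.label = null;  // rather than: delete point.label
}

const origin = new Point(0, 0);
origin.label = "origin";
clearLabel(origin);
```

After `clearLabel` the object still has the properties `x`, `y` and `label`, so all points keep the same layout.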
Faster Data Structure
    & Algorithm
Array push is faster
than String concat?
http://jsperf.com/nwind-string-concat-vs-array-push
Why?
Other string optimizations

•   Adaptive string search

    •   Single char, Linear, Boyer-Moore-Horspool

•   Adaptive ASCII and two-byte (UTF-16) representations

•   Zero-copy substring
Feel free to use strings in
     modern engines
Just-In-Time (JIT)
JIT

•   Method JIT, Trace JIT, Regular expression JIT

•   Register allocation

•   Code generation
How does a JIT work?

•   Allocate executable memory: mmap, or malloc + mprotect

•   Generate native code into it

•   Cast the buffer to a function pointer: cast (C), reinterpret_cast (C++)

•   Call the function
V8
V8

•   Lars Bak

•   Hidden Class, PICs

•   Some built-in objects are written in JavaScript

•   Crankshaft
•   Precise generational GC
Lars Bak
•   Implementing VMs since 1988

•   Beta

•   Self

•   JVM (VM architect at Sun)

•   V8 (Google)
Lines of code (VM only)

VM              .cpp/.c     .h
HotSpot         359986      110831
V8              224038      70787
SpiderMonkey    135547      63975
JSC             83920       80867
Ruby            120941      8043
CPython         108280      15475
PHP-Zend        44646       42113
Crankshaft
Source code -> Native code -> (runtime profiling) -> High-Level IR -> Low-Level IR -> Optimized native code

The stages from High-Level IR to optimized native code make up Crankshaft.
Crankshaft

•   Profiling

•   Compiler optimization

•   Generate new JIT code

•   On-stack replacement
•   Deoptimize
High-Level IR (Hydrogen)
•   AST to SSA

•   Type inference (type feedback from inline cache)

•   Compiler optimization

    •   Function inlining
    •   Loop-invariant code motion, Global value numbering

    •   Eliminate dead phis

    •   ...
Loop-invariant code motion

Before:

for (i = 0; i < n; i++) {
    a[i] = x + y;
}

After:

tmp = x + y;
for (i = 0; i < n; i++) {
    a[i] = tmp;
}
Current function inlining limits
•   Big functions (larger than 600 bytes)

•   Recursive functions

•   Functions containing unsupported statements

    •   with, switch
    •   try/catch/finally

    •   ...
Avoid “with”, “switch” and
    “try” in hot paths
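
A common way to apply this is to keep the unsupported statement in a thin wrapper and leave the hot loop in its own small function. The sketch below assumes an engine with the inlining limits listed above. The function names are invented, and returning `NaN` on failure is just a placeholder policy.

```javascript
// Hot inner loop: no try/catch, with, or switch in its body.
function sumSquares(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i] * values[i];
  }
  return total;
}

// The try/catch lives in a wrapper that runs once per call, not per element.
function safeSumSquares(values) {
  try {
    return sumSquares(values);
  } catch (err) {
    return NaN;
  }
}
```

`safeSumSquares([1, 2, 3])` returns 14, and `safeSumSquares(null)` returns `NaN` because reading `values.length` throws inside the wrapper's `try`.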
Built-in objects written in JS
   function ArraySort(comparefn) {
     ...
     // In-place QuickSort algorithm.
     // For short (length <= 22) arrays, insertion sort is used for efficiency.

     if (!IS_SPEC_FUNCTION(comparefn)) {
       comparefn = function (x, y) {
         if (x === y) return 0;
         if (%_IsSmi(x) && %_IsSmi(y)) {
           return %SmiLexicographicCompare(x, y);
         }
         x = ToString(x);
         y = ToString(y);
         if (x == y) return 0;
         else return x < y ? -1 : 1;
       };
     }
     ...

                               v8/src/array.js
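
The default comparator in that listing compares elements by their string forms, which is standard `Array.prototype.sort` behavior and easy to observe from user code. The array contents here are arbitrary.

```javascript
const numbers = [10, 9, 1, 100];

// No comparefn: elements are compared by their string forms.
const defaultOrder = numbers.slice().sort();

// Explicit numeric comparefn.
const numericOrder = numbers.slice().sort((x, y) => x - y);
```

`defaultOrder` is `[1, 10, 100, 9]`, while `numericOrder` is `[1, 9, 10, 100]`.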
GC

•   Precise

•   Stop-the-world

•   Generational

•   Incremental (2011-10)
V8 performance

     Why?

Unfair: they are using the GMP library
Warning: Benchmark
       lies
Node.js
•   Pros

    •   Easy to write async I/O

    •   One language for everything

    •   Maybe faster than PHP, Python

    •   Betting on JavaScript is safe

•   Cons

    •   Lack of great libraries

    •   Large JS codebases are hard to maintain

    •   Easy to leak memory (compared to PHP, Erlang)

    •   Still too young, unproven
Why Dart?

•   Built for large applications

    •   optional types, structure, libraries, tools

•   Performance

    •   lightweight processes like Erlang
    •   easier to write a fast VM for than JavaScript
The future of Dart?

•   It will not replace JS

•   But it may replace GWT and become a better choice for
    building large front-end applications

    •   with a great IDE, mature libraries

    •   and some way to communicate with JavaScript
How to make
JavaScript faster?
How to make JavaScript faster?
 •   Wait for ES6: StructType, const, WeakMap, yield...

 •   High-performance built-in libraries

 •   WebCL

 •   Embed another language

     •   KL (FabricEngine), GLSL (WebGL)

 •   Wait for quantum computers :)
Other things you can learn
•   NaN tagging

•   Polymorphic Inline Cache

•   Type Inference

•   Regex JIT

•   Runtime optimization

•   ...
References
•   The Behavior of Efficient Virtual Machine Interpreters on Modern Architectures
•   Virtual Machine Showdown: Stack Versus Registers
•   The Implementation of Lua 5.0
•   Why Is the New Google V8 Engine so Fast?
•   Context Threading: A Flexible and Efficient Dispatch Technique for Virtual Machine Interpreters
•   Effective Inline-Threaded Interpretation of Java Bytecode Using Preparation Sequences
•   Smalltalk-80: The Language and Its Implementation
•   Design of the Java HotSpot(TM) Client Compiler for Java 6
•   Oracle JRockit: The Definitive Guide
•   Virtual Machines: Versatile Platforms for Systems and Processes
•   Fast and Precise Hybrid Type Inference for JavaScript
•   LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
•   Emscripten: An LLVM-to-JavaScript Compiler
•   An Analysis of the Dynamic Behavior of JavaScript Programs
•   Adaptive Optimization for SELF
•   Bytecodes meet Combinators: invokedynamic on the JVM
•   Efficient Implementation of the Smalltalk-80 System
•   Design, Implementation, and Evaluation of Optimizations in a Just-In-Time Compiler
•   Optimizing Direct Threaded Code by Selective Inlining
•   Linear Scan Register Allocation
•   Optimizing Invokedynamic
•   Threaded Code
•   Why Not a Bytecode VM?
•   A Survey of Adaptive Optimization in Virtual Machines
•   An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes
•   Making the Compilation "Pipeline" Explicit: Dynamic Compilation Using Trace Tree Specialization
•   Uniprocessor Garbage Collection Techniques
•   Representing Type Information in Dynamically Typed Languages
•   Trace-based Just-in-Time Type Specialization for Dynamic Languages
•   The Structure and Performance of Efficient Interpreters
•   Know Your Engines: How to Make Your JavaScript Fast
•   IE Blog, Chromium Blog, WebKit Blog, Opera Blog, Mozilla Blog, Wingolog’s Blog, RednaxelaFX’s Blog, David Mandelin’s Blog, Brendan Eich’s Blog...
Thank you
