JavaScript engine performance
Published in: Technology
Transcript

  • 1. JavaScript Engine Performance
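  • 2. About me
      • Senior engineer at Baidu
      • Currently working mainly on performance optimization
      • Participant in the W3C “HTML” and “Web Performance” working groups
      • @nwind
  • 3. Please note
      • I am not a virtual machine expert; this is only a hobby
      • Much of the content has been simplified; the real situation is far more complex
      • The views here are only my own
  • 4. Outline
      • Basic principles of virtual machines
      • How JavaScript engines optimize performance
      • An introduction to V8, Dart, and Node.js
      • How to write high-performance JavaScript code
  • 5. VM basics
  • 6. Virtual machine history
      • Pascal 1970
      • Smalltalk 1980
      • Self 1986
      • Python 1991
      • Java 1995
      • JavaScript 1995
  • 7. The Smalltalk demo showed three astonishing achievements, including how computers could be networked and how object-oriented programming worked. But Jobs and his team were not interested in these, because their attention was captured by...
  • 8. How does a virtual machine work?
      • Parser
      • Intermediate representation
      • Interpreter, JIT
      • Runtime, garbage collection
  • 9. Parser
      • Tokenize
      • AST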
  • 10. Tokenize
      • Source: var foo = 10;
      • Tokens: keyword (var), identifier (foo), equal (=), number (10), semicolon (;)
  • 11. AST
      • Assign
          • Variable foo
          • Constant 10
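The two parser stages above can be sketched in a few lines of JavaScript. This is a minimal illustration, not how any real engine parses: the names `tokenize`, `parseDeclaration` and `KEYWORDS` are hypothetical, and the grammar is assumed to cover only the single statement shape shown on the slide.

```javascript
// Words that are treated as keywords rather than identifiers (assumed subset).
const KEYWORDS = new Set(["var"]);

// Split source text into tokens, using the token labels from the slide.
function tokenize(source) {
  const text = source.trim();
  // One alternative per token kind; the sticky flag resumes at lastIndex.
  const pattern = /\s*(?:([A-Za-z_$][\w$]*)|(\d+)|(=)|(;))/y;
  const tokens = [];
  while (pattern.lastIndex < text.length) {
    const offset = pattern.lastIndex;
    const match = pattern.exec(text);
    if (match === null) {
      throw new SyntaxError(`Unexpected character at offset ${offset}`);
    }
    const [, word, digits, equal, semicolon] = match;
    if (word !== undefined) {
      tokens.push({ type: KEYWORDS.has(word) ? "keyword" : "identifier", text: word });
    } else if (digits !== undefined) {
      tokens.push({ type: "number", text: digits });
    } else if (equal !== undefined) {
      tokens.push({ type: "equal", text: equal });
    } else {
      tokens.push({ type: "semicolon", text: semicolon });
    }
  }
  return tokens;
}

// Build the Assign / Variable / Constant tree for `var <name> = <number>;`.
function parseDeclaration(tokens) {
  const expected = ["keyword", "identifier", "equal", "number", "semicolon"];
  const actual = tokens.map((token) => token.type);
  if (actual.join(" ") !== expected.join(" ")) {
    throw new SyntaxError(`Expected ${expected.join(" ")}, got ${actual.join(" ")}`);
  }
  return {
    type: "Assign",
    target: { type: "Variable", name: tokens[1].text },
    value: { type: "Constant", value: Number(tokens[3].text) },
  };
}

const declarationAst = parseDeclaration(tokenize("var foo = 10;"));
console.log(JSON.stringify(declarationAst));
```

A real parser handles the whole language grammar and reports source positions; the sketch only shows how the token list feeds the tree.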
  • 12. Intermediate representation
      • Bytecode
      • Stack vs. register
  • 13. Bytecode (SpiderMonkey)
      • Source:
            function foo(bar) {
                return bar + 1;
            }
            foo(2);
      • Bytecode:
            00000: deffun 0 null
            00005: nop
            00006: callvar 0
            00009: int8 2
            00011: call 1
            00014: pop
            00015: stop
            foo:
            00020: getarg 0
            00023: one
            00024: add
            00025: return
            00026: stop
  • 14. Bytecode (JSC)
      • Source:
            function foo(bar) {
                return bar + 1;
            }
            foo(2);
      • Bytecode:
            8 m_instructions; 168 bytes at 0x7fc1ba3070e0; 1 parameter(s); 10 callee register(s)
            [   0] enter
            [   1] mov                 r0, undefined(@k0)
            [   4] get_global_var      r1, 5
            [   7] mov                 r2, undefined(@k0)
            [  10] mov                 r3, 2(@k1)
            [  13] call                r1, 2, 10
            [  17] op_call_put_result  r0
            [  19] end                 r0
            Constants:
               k0 = undefined
               k1 = 2

            3 m_instructions; 64 bytes at 0x7fc1ba306e80; 2 parameter(s); 1 callee register(s)
            [   0] enter
            [   1] add                 r0, r-7, 1(@k0)
            [   6] ret                 r0
            Constants:
               k0 = 1
            End: 3
  • 15. Stack vs. register
      • Stack
          • JVM, .NET, PHP, Python, old JavaScript engines
      • Register
          • Lua, Dalvik, modern JavaScript engines
          • Smaller, faster (about 20%~30%)
          • RISC
  • 16. Stack vs. register
      • Stack:
            local a,t,i     1: PUSHNIL 3
            a=a+i           2: GETLOCAL 0 ; a
                            3: GETLOCAL 2 ; i
                            4: ADD
                            5: SETLOCAL 0 ; a
            a=a+1           6: SETLOCAL 0 ; a
                            7: ADDI 1
                            8: SETLOCAL 0 ; a
            a=t[i]          9: GETLOCAL 1 ; t
                           10: GETINDEXED 2 ; i
                           11: SETLOCAL 0 ; a
      • Register:
            local a,t,i     1: LOADNIL 0 2 0
            a=a+i           2: ADD 0 0 2
            a=a+1           3: ADD 0 0 250 ; a
            a=t[i]          4: GETTABLE 0 1 2
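To make the instruction-count difference concrete, here is a toy sketch of both designs running `a = a + i`. The opcode names are borrowed from the listing above, but the encoding (arrays of strings and numbers) and the functions `runStack` and `runRegister` are assumptions made for illustration; neither is Lua's actual implementation.

```javascript
// Stack machine: operands are pushed and popped implicitly.
function runStack(program, locals) {
  const stack = [];
  let dispatched = 0;
  for (const [op, arg] of program) {
    dispatched++;
    switch (op) {
      case "GETLOCAL":
        stack.push(locals[arg]);
        break;
      case "SETLOCAL":
        locals[arg] = stack.pop();
        break;
      case "ADD": {
        const right = stack.pop();
        const left = stack.pop();
        stack.push(left + right);
        break;
      }
      default:
        throw new Error(`Unknown stack opcode ${op}`);
    }
  }
  return dispatched;
}

// Register machine: each instruction names its destination and operands.
function runRegister(program, registers) {
  let dispatched = 0;
  for (const [op, dst, lhs, rhs] of program) {
    dispatched++;
    switch (op) {
      case "ADD":
        registers[dst] = registers[lhs] + registers[rhs];
        break;
      default:
        throw new Error(`Unknown register opcode ${op}`);
    }
  }
  return dispatched;
}

// Slots 0, 1, 2 hold a, t, i as in the listing.
const stackLocals = [1, null, 41];
const stackDispatches = runStack(
  [["GETLOCAL", 0], ["GETLOCAL", 2], ["ADD"], ["SETLOCAL", 0]],
  stackLocals
);

const registerFile = [1, null, 41];
const registerDispatches = runRegister([["ADD", 0, 0, 2]], registerFile);

console.log(stackLocals[0], stackDispatches);     // 42 4
console.log(registerFile[0], registerDispatches); // 42 1
```

Both loops use switch-based dispatch, so the register version saves three trips through the dispatch loop for this statement, at the cost of wider instructions.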
  • 17. Interpreter
      • Switch statement
      • Direct threading, indirect threading, token threading ...
  • 18. Switch statement
            while (true) {
                switch (opcode) {
                    case ADD:
                        ...
                        break;
                    case SUB:
                        ...
                        break;
                    ...
                }
            }
  • 19. Direct threading
            typedef void *Inst;
            Inst program[] = { &&ADD, &&SUB };
            Inst *ip = program;
            goto *ip++;
            ADD:
                ...
                goto *ip++;
            SUB:
                ...
      • http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
  • 20. Threaded Code
  • 21. http://en.wikipedia.org/wiki/File:Pipeline,_4_stage.svg
  • 22. Context Threading: essence of our solution…
      • Bytecode: iload_1, iload_1, iadd, istore_1, iload_1, bipush 64, if_icmplt 2, …
      • CTT, the Context Threading Table (generated code): call iload_1, call iload_1, call iadd, call istore_1, call iload_1, ..
      • Bytecode bodies (ret terminated): iload_1: .. ret; iadd: .. ret;
      • Return branch predictor stack
      • Package bodies as subroutines and call them
      • Source: Context Threading: A flexible and efficient dispatch technique for virtual machine interpreters
  • 23. Garbage collection
      • Reference counting (PHP, Python ...), smart pointers
      • Tracing
          • Generational
          • Stop-the-world, concurrent, incremental
          • Copying, sweep, compact
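As a small illustration of the tracing family, the sketch below implements a stop-the-world mark-and-sweep pass over a simulated heap. The `Heap` class and its `allocate` and `collect` methods are hypothetical names invented for this example. It also shows the classic case that separates tracing from plain reference counting: two objects that only reference each other are unreachable from the roots, so tracing reclaims them even though their reference counts would never drop to zero.

```javascript
class Heap {
  constructor() {
    this.objects = new Set();
  }

  // Create a simulated heap object that can reference other heap objects.
  allocate(name) {
    const object = { name, references: [], marked: false };
    this.objects.add(object);
    return object;
  }

  // Stop-the-world mark-and-sweep; returns the names of freed objects.
  collect(roots) {
    // Mark phase: flag everything reachable from the roots.
    const pending = [...roots];
    while (pending.length > 0) {
      const object = pending.pop();
      if (object.marked) continue;
      object.marked = true;
      pending.push(...object.references);
    }
    // Sweep phase: free unmarked objects and reset marks for the next cycle.
    const freed = [];
    for (const object of this.objects) {
      if (object.marked) {
        object.marked = false;
      } else {
        this.objects.delete(object);
        freed.push(object.name);
      }
    }
    return freed;
  }
}

const heap = new Heap();
const rootObject = heap.allocate("root");
const childObject = heap.allocate("child");
const cycleA = heap.allocate("cycleA");
const cycleB = heap.allocate("cycleB");
rootObject.references.push(childObject);
cycleA.references.push(cycleB); // unreachable cycle
cycleB.references.push(cycleA);

const freedNames = heap.collect([rootObject]);
console.log(freedNames); // [ 'cycleA', 'cycleB' ]
```

Generational, incremental and compacting collectors refine this basic scheme; the sketch does not attempt any of them.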
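  • 24. Why is JavaScript slow?
      • Dynamic typing
      • Weak typing
      • Needs to be parsed every time
      • GC
  • 25. Fighting weak typing
  • 26. Object model in most VMs
            typedef union {
                void *p;
                double d;
                long l;
            } Value;
            typedef struct {
                unsigned char type;
                Value value;
            } Object;
            Object a;
  • 27. Tagged pointer
  • 28. On almost all systems, pointer addresses are aligned (to 4 or 8 bytes) http://www.gnu.org/s/libc/manual/html_node/Aligned-Memory-Blocks.html
  • 29. This means the last 2 or 3 bits of a pointer such as 0xc00ab958 are always 0 (the low byte 0x98 is 1001 1000 in binary), so the last bit can be set to 1 to mark a value as a pointer and left at 0 for a small number
  • 30. Tagged pointer
      • var a = 1 is stored in place as the tagged value 2
      • var b = {a:1} is stored as a pointer (0x3d2aa00) to object b elsewhere in memory
  • 31. Small number
      • 2^30 − 1 = 1073741823
      • −2^30 = −1073741824
      • 31 bits can represent about one billion, which is enough for most applications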
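The tagging scheme can be simulated with ordinary bit operations. This is a sketch under stated assumptions: words are 32 bits wide, a low bit of 0 means small integer and a low bit of 1 means pointer (as on the slides), and the function names are hypothetical. Real engines do this in native code on machine words, not on JavaScript numbers.

```javascript
// 31 bits of signed payload remain once one bit is used as the tag.
const SMALL_INTEGER_MAX = 2 ** 30 - 1; // 1073741823
const SMALL_INTEGER_MIN = -(2 ** 30);  // -1073741824

// Small integers are shifted left; the low bit stays 0.
function tagSmallInteger(value) {
  if (!Number.isInteger(value) || value < SMALL_INTEGER_MIN || value > SMALL_INTEGER_MAX) {
    throw new RangeError(`${value} does not fit in 31 bits`);
  }
  return value << 1;
}

// Aligned addresses have spare low bits; set the lowest one to 1.
function tagPointer(address) {
  if (address % 4 !== 0) {
    throw new RangeError("address must be at least 4-byte aligned");
  }
  return (address | 1) >>> 0;
}

function isPointer(word) {
  return (word & 1) === 1;
}

// Recover the original integer or address from a tagged word.
function untag(word) {
  return isPointer(word) ? (word & ~1) >>> 0 : word >> 1;
}

console.log(tagSmallInteger(1));                         // 2
console.log(untag(tagSmallInteger(-5)));                 // -5
console.log(untag(tagPointer(0xc00ab958)).toString(16)); // c00ab958
```

The payoff is that a small integer needs no heap allocation and a type check is a single bit test.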
  • 32. External fixed typed array
      • Strongly typed, fixed length
      • Allocated outside the VM heap
      • Examples: Int32Array, Float64Array
  • 33. Small number + typed array
      • Bar chart of fannkuch-redux run time in seconds (smaller is better) for C/C++, Java (HotSpot), V8, PHP, Ruby and Python; the bar labels read 50, 70, 80, 3180, 4020 and 4200, with a “40x” annotation
      • http://shootout.alioth.debian.org/u32/performance.php?test=fannkuchredux
  • 34. Warning: benchmarks lie
  • 35. ES6 will have struct
  • 36. ES6 StructType
            Point2D = new StructType({
                x: uint32,
                y: uint32
            });
            Color = new StructType({
                r: uint8,
                g: uint8,
                b: uint8
            });
            Pixel = new StructType({
                point: Point2D,
                color: Color
            });
  • 37. Use typed arrays to run faster
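A short example of what "strongly typed, fixed length" means in practice, using the standard `Int32Array`, `Float64Array` and `ArrayBuffer` constructors. The variable and function names are illustrative only, and no speed-up is claimed or measured here; whether typed arrays are faster depends on the engine and the workload.

```javascript
// Fixed length; every element is a 32-bit signed integer, initially 0.
const counters = new Int32Array(4);
counters[0] = 7;
counters[1] = 3.9; // the fractional part is dropped on store

// Typed arrays are views over an ArrayBuffer, a raw block of bytes.
const rawBytes = new ArrayBuffer(16);
const doubles = new Float64Array(rawBytes); // 16 bytes / 8 bytes each = 2 elements
doubles[0] = 0.5;

// A numeric loop in which every element is known to be an integer.
function sumOfSquares(count) {
  const values = new Int32Array(count);
  for (let i = 0; i < count; i++) {
    values[i] = i;
  }
  let total = 0;
  for (let i = 0; i < count; i++) {
    total += values[i] * values[i];
  }
  return total;
}

console.log(counters.length, counters[1], doubles.length, sumOfSquares(5));
// 4 3 2 30
```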
  • 38. Fighting dynamic typing
  • 39. foo.bar
  • 40. foo.bar in C
            movl 4(%edx), %ecx    // get
            movl %ecx, 4(%edx)    // put
  • 41. foo.bar in JavaScript
            found = HashTable.FindEntry(key);
            if (found) return found;
            for (pt = GetPrototype(); pt != null; pt = pt.GetPrototype()) {
                found = pt.HashTable.FindEntry(key);
                if (found) return found;
            }
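The pseudocode above can be turned into a runnable model. The sketch assumes each object is a hash table plus a prototype link; `DictionaryObject` and the `probes` counter are hypothetical names used only to show how many table lookups one property read can cost.

```javascript
class DictionaryObject {
  constructor(prototype = null) {
    this.table = new Map();     // own properties
    this.prototype = prototype; // next object in the chain, or null
  }

  put(key, value) {
    this.table.set(key, value);
  }

  // Walk the prototype chain, probing one hash table per object.
  get(key, stats = { probes: 0 }) {
    for (let holder = this; holder !== null; holder = holder.prototype) {
      stats.probes++;
      if (holder.table.has(key)) {
        return holder.table.get(key);
      }
    }
    return undefined;
  }
}

const baseObject = new DictionaryObject();
baseObject.put("bar", 42);
const middleObject = new DictionaryObject(baseObject);
const fooObject = new DictionaryObject(middleObject);

const lookupStats = { probes: 0 };
console.log(fooObject.get("bar", lookupStats), lookupStats.probes); // 42 3
```

Compared with the single `movl` in the C version, every read here costs one hash probe per level of the chain, which is the overhead the following slides set out to remove.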
  • 42. How to optimize?
  • 43. First, we need to know the object layout
  • 44. Add a type for each object
      • Add property x
      • Add property y
      • http://code.google.com/apis/v8/design.html
  • 45. Inline cache
      • Slow lookup the first time
      • Modify the JIT code in place
      • Next time, jump directly to the address
  • 46. Inline cache, simplified
      • Source:
            function fun(foo) {
                return foo.bar;
            }
      • Before caching:
            return foo.lookupProperty(bar);
      • After caching:
            if (foo[hiddenClass] == 0xfe1) {
                return foo[indexOf_bar];
            }
            return foo.lookupProperty(bar);
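The idea can be modelled in plain JavaScript. This is a simulation under assumptions, not V8's implementation: `HiddenClass`, `ModelObject` and `makeCachedLoad` are hypothetical names, properties are assumed to be added once and never deleted, and the cache holds a single class (a monomorphic cache), whereas a real engine patches generated machine code.

```javascript
// A hidden class maps property names to slot indexes in object storage.
class HiddenClass {
  constructor(slots = new Map()) {
    this.slots = slots;
    this.transitions = new Map(); // property name -> next hidden class
  }

  // Adding the same property from the same class always yields the same class.
  withProperty(name) {
    if (this.slots.has(name)) {
      throw new Error(`property ${name} already exists`);
    }
    let next = this.transitions.get(name);
    if (next === undefined) {
      next = new HiddenClass(new Map(this.slots).set(name, this.slots.size));
      this.transitions.set(name, next);
    }
    return next;
  }
}

const EMPTY_CLASS = new HiddenClass();

class ModelObject {
  constructor() {
    this.hiddenClass = EMPTY_CLASS;
    this.storage = [];
  }

  add(name, value) {
    this.hiddenClass = this.hiddenClass.withProperty(name);
    this.storage.push(value);
    return this;
  }
}

// One cache per load site: remember the last class seen and its slot.
function makeCachedLoad(name) {
  let cachedClass = null;
  let cachedSlot = -1;
  const stats = { hits: 0, misses: 0 };
  function load(object) {
    if (object.hiddenClass === cachedClass) {
      stats.hits++; // fast path: one comparison, one indexed read
      return object.storage[cachedSlot];
    }
    stats.misses++; // slow path: look the name up, then refill the cache
    const slot = object.hiddenClass.slots.get(name);
    if (slot === undefined) return undefined;
    cachedClass = object.hiddenClass;
    cachedSlot = slot;
    return object.storage[slot];
  }
  return { load, stats };
}

const pointOne = new ModelObject().add("x", 1).add("y", 2);
const pointTwo = new ModelObject().add("x", 10).add("y", 20);
const reversed = new ModelObject().add("y", 5).add("x", 6); // other order, other class

const loadY = makeCachedLoad("y");
console.log(loadY.load(pointOne), loadY.load(pointTwo), loadY.load(reversed)); // 2 20 5
console.log(loadY.stats); // { hits: 1, misses: 2 }
```

Objects built in the same order share a hidden class and hit the cache; building the same properties in a different order produces a different class and a miss.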
  • 47. JavaScript in real code is not that dynamic
      • Delete operations account for only 0.1% (“An Analysis of the Dynamic Behavior of JavaScript...”)
      • 99% of primitive types can be determined at run time through static analysis
      • 97% of property accesses can be inline cached (“TypeCastor: Demystify Dynamic Typing of JavaScript...”)
  • 48. V8 can’t handle delete yet: 20 times slower! http://jsperf.com/test-v8-delete
  • 49. Avoid altering an object's property layout
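One way to follow this advice is to declare every property in the constructor and clear a value by assignment instead of `delete`. The sketch below only demonstrates the observable difference in the property set; the `Point` class and helper names are illustrative, and any speed difference depends on the engine and version, so measure before relying on it.

```javascript
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
    this.label = undefined; // declared up front so the set of properties never changes
  }
}

// Keeps the property set intact: the property still exists, with no value.
function clearLabel(point) {
  point.label = undefined;
}

// Changes the property set: the property is removed from the object.
function removeLabel(point) {
  delete point.label;
}

const stablePoint = new Point(1, 2);
stablePoint.label = "origin";
clearLabel(stablePoint);

const reshapedPoint = new Point(3, 4);
removeLabel(reshapedPoint);

console.log(Object.keys(stablePoint));   // [ 'x', 'y', 'label' ]
console.log(Object.keys(reshapedPoint)); // [ 'x', 'y' ]
```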
  • 50. Faster data structures & algorithms
  • 51. Is array push faster than string concat?
  • 52. http://jsperf.com/nwind-string-concat-vs-array-push
  • 53. Why?
  • 54. Other string optimizations
      • Adaptive string search
          • Single char, linear, Boyer-Moore-Horspool
      • Adaptive ASCII and UTF-8
      • Zero-copy substring
  • 55. Feel free to use strings in modern engines
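For reference, these are the two styles being compared. Both helper names are made up for this example. The code only shows that the results are identical; it makes no timing claim, since the outcome of such a comparison varies by engine.

```javascript
// Style 1: grow a string with +=.
function buildWithConcat(parts) {
  let result = "";
  for (const part of parts) {
    result += part;
  }
  return result;
}

// Style 2: collect the pieces in an array and join once at the end.
function buildWithArray(parts) {
  const pieces = [];
  for (const part of parts) {
    pieces.push(part);
  }
  return pieces.join("");
}

const sampleParts = ["<li>", "item", "</li>"];
console.log(buildWithConcat(sampleParts) === buildWithArray(sampleParts)); // true
```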
  • 56. Just-In-Time (JIT)
  • 57. JIT
      • Method JIT, trace JIT, regular expression JIT
      • Register allocation
      • Code generation
  • 58. How does a JIT work?
      • mmap, malloc (mprotect)
      • Generate native code
      • cast (C), reinterpret_cast (C++)
      • Call the function
  • 59. V8
  • 60. V8
      • Lars Bak
      • Hidden classes, PICs
      • Some built-in objects are written in JavaScript
      • Crankshaft
      • Precise generational GC
  • 61. Lars Bak
      • Has implemented VMs since 1988
      • Beta
      • Self
      • JVM (VM architect at Sun)
      • V8 (Google)
  • 62. Lines of code (VM only)
      • Bar chart of .cpp/.c and .h line counts for HotSpot, V8, SpiderMonkey, JSC, Ruby, CPython and PHP-Zend, on an axis from 0 to 500000
  • 63. Crankshaft
  • 64. Source code → native code → (runtime profiling) → high-level IR → low-level IR → optimized native code; the later stages are bracketed as Crankshaft
  • 65. Crankshaft
      • Profiling
      • Compiler optimization
      • Generate new JIT code
      • On-stack replacement
      • Deoptimization
  • 66. High-level IR (Hydrogen)
      • AST to SSA
      • Type inference (type feedback from the inline cache)
      • Compiler optimization
          • Function inlining
          • Loop-invariant code motion, global value numbering
          • Eliminate dead phis
          • ...
  • 67. Loop-invariant code motion
      • Before:
            for (i = 0; i < n; i++) {
                a[i] = x + y;
            }
      • After:
            tmp = x + y;
            for (i = 0; i < n; i++) {
                a[i] = tmp;
            }
  • 68. Function inlining limits for now
      • Big functions (larger than 600 bytes)
      • Recursive functions
      • Functions with unsupported statements
          • with, switch
          • try/catch/finally
          • ...
  • 69. Avoid “with”, “switch” and “try” in hot paths
  • 70. Built-in objects written in JS
            function ArraySort(comparefn) {
              ...
              // In-place QuickSort algorithm.
              // For short (length <= 22) arrays, insertion sort is used for efficiency.
              if (!IS_SPEC_FUNCTION(comparefn)) {
                comparefn = function (x, y) {
                  if (x === y) return 0;
                  if (%_IsSmi(x) && %_IsSmi(y)) {
                    return %SmiLexicographicCompare(x, y);
                  }
                  x = ToString(x);
                  y = ToString(y);
                  if (x == y) return 0;
                  else return x < y ? -1 : 1;
                };
              }
              ...
      • v8/src/array.js
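The default comparator in the excerpt converts both values to strings, which is why sorting numbers without a comparison function gives lexicographic order. The following uses only the standard `Array.prototype.sort`; the variable names are illustrative.

```javascript
const readings = [10, 9, 1, 100];

// No comparator: elements are compared as strings ("1" < "10" < "100" < "9").
const lexicographicOrder = [...readings].sort();

// Explicit numeric comparator.
const numericOrder = [...readings].sort((a, b) => a - b);

console.log(lexicographicOrder); // [ 1, 10, 100, 9 ]
console.log(numericOrder);       // [ 1, 9, 10, 100 ]
```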
  • 71. GC
      • Precise
      • Stop-the-world
      • Generational
      • Incremental (2011-10)
  • 72. V8 performance
  • 73. V8 performance
  • 74. V8 performance: why?
  • 75. V8 performance: unfair, they are using the GMP library
  • 76. Warning: benchmarks lie
  • 77. Node.js
      • Pros
          • Easy to write async I/O
          • One language for everything
          • Maybe faster than PHP, Python
          • Betting on JavaScript is safe
      • Cons
          • Lack of great libraries
          • Large JS codebases are hard to maintain
          • Easy to leak memory (compared to PHP, Erlang)
          • Still too young, unproven
  • 78. Why Dart?
      • Built for large applications
          • Optional types, structured, libraries, tools
      • Performance
          • Lightweight processes like Erlang
          • Easier to write a fast VM for than JavaScript
  • 79. The future of Dart?
      • It will not replace JS
      • But it may replace GWT and become a better choice for building large front-end applications
          • with a great IDE and mature libraries
          • and some way to communicate with JavaScript
  • 80. How to make JavaScript faster?
  • 81. How to make JavaScript faster?
      • Wait for ES6: StructType, const, WeakMap, yield...
      • High-performance built-in libraries
      • WebCL
      • Embed another language
          • KL (FabricEngine), GLSL (WebGL)
      • Wait for quantum computers :)
  • 82. Things you can also learn
      • NaN tagging
      • Polymorphic inline cache
      • Type inference
      • Regex JIT
      • Runtime optimization
      • ...
  • 83. References
      • The behavior of efficient virtual machine interpreters on modern architectures
      • Virtual Machine Showdown: Stack Versus Registers
      • The implementation of Lua 5.0
      • Why Is the New Google V8 Engine so Fast?
      • Context Threading: A Flexible and Efficient Dispatch Technique for Virtual Machine Interpreters
      • Effective Inline-Threaded Interpretation of Java Bytecode Using Preparation Sequences
      • Smalltalk-80: the language and its implementation
  • 84. References
      • Design of the Java HotSpot™ Client Compiler for Java 6
      • Oracle JRockit: The Definitive Guide
      • Virtual Machines: Versatile platforms for systems and processes
      • Fast and Precise Hybrid Type Inference for JavaScript
      • LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
      • Emscripten: An LLVM-to-JavaScript Compiler
      • An Analysis of the Dynamic Behavior of JavaScript Programs
  • 85. References
      • Adaptive Optimization for SELF
      • Bytecodes meet Combinators: invokedynamic on the JVM
      • Context Threading: A Flexible and Efficient Dispatch Technique for Virtual Machine Interpreters
      • Efficient Implementation of the Smalltalk-80 System
      • Design, Implementation, and Evaluation of Optimizations in a Just-In-Time Compiler
      • Optimizing direct threaded code by selective inlining
      • Linear scan register allocation
      • Optimizing Invokedynamic
      • Threaded Code
  • 86. References
      • Why Not a Bytecode VM?
      • A Survey of Adaptive Optimization in Virtual Machines
      • An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes
      • Making the Compilation "Pipeline" Explicit: Dynamic Compilation Using Trace Tree Specialization
      • Uniprocessor Garbage Collection Techniques
  • 87. References
      • Representing Type Information in Dynamically Typed Languages
      • The Behavior of Efficient Virtual Machine Interpreters on Modern Architectures
      • Trace-based Just-in-Time Type Specialization for Dynamic Languages
      • The Structure and Performance of Efficient Interpreters
      • Know Your Engines: How to Make Your JavaScript Fast
      • IE Blog, Chromium Blog, WebKit Blog, Opera Blog, Mozilla Blog, Wingolog’s Blog, RednaxelaFX’s Blog, David Mandelin’s Blog, Brendan Eich’s Blog...
  • 88. Thank you
