Are We Fast Yet? HTML & Javascript Performance - UtahJS


Presentation to UtahJS on webkit.js and HTML/Javascript performance.

Published in: Technology

  1. 1. Are we fast yet? JavaScript & HTML performance. Trevor Linton - July 2014
  2. 2. JavaScript? Sure.
      • Firefox has asm.js (a subset of JavaScript)
      • Chrome has V8
      • Safari now has LLVM optimizations
      Chart (series labels: C++/Clang, Firefox asm.js): In some measurements we’re butting up against native C++.
  3. 3. JS has road blocks in HTML though.
      Diagram columns: HTML Renderer, JS compiler, Code. Annotations:
      • JIT begins optimizing.
      • STOP: unknown what this function may do or return.
      • Recalculate layouts.
      • STOP: waiting for return value from renderer.
      • STOP: events have unknown values, cannot pre-optimize.
      • Mouse click from user; renderer kicks off JS.
      Code shown on the slide:
          function() {
            ....
            var el = document.getElementById('id')
            ....
            var bounds = el.getBoundingClientRect()
            ....
          }
          addEventListener('click', function(e) { ... });
  4. 4. An experiment to overcome this: re-implement HTML5 rendering to be JavaScript based.
  5. 5. An experiment to overcome this
      • Re-implement HTML5 rendering in JavaScript.
      • JS can fully JIT through any DOM operation and optimize.
      • The JS optimizer has the ability to anticipate inputs from C++ in sync/async events.
      • Using asm.js we can get near C++ runtime speeds.
      Diagram: original C++ WebKit code (WebCore, actually) is compiled down to JavaScript using LLVM/Clang and Emscripten, producing webkit.js.
  6. 6. webkit.js speed results (x = iterations)
      • Rendering becomes substantially faster after progressive runs.
      • Rendering speed on par with native speeds.
      • Firefox faster due to built-in 1:1 asm.js optimizations.
      • DEMO
      Chart: x-axis iterations 1 to 8, y-axis 0 to 1.2; series: webkit.js in Chrome 35, webkit.js in Firefox 30, Chrome 35, Firefox 30.
  7. 7. Getting over the DOM fence...
      • Continue building a JS based HTML renderer.
      • Firefox, Chromium and others are working on pulling more of the DOM into native JS.
      • A proposal is out to recreate CSS styles in JS for Chromium.
      • Firefox is already getting close to this...
  8. 8. But WebKit is complex...
      The slide is an architecture diagram; its labels and annotations are listed here in roughly the order they appear.

      Areas: CSS Animation; Rendering; Hardware Compositing; Other Things… (Layout, Network, Parsing, DOM, CSS, JavaScript); Compositing, Painting, Drawing and Rendering.

      Compositing and rendering components:
      • WebView
      • ChromeClient (implemented as ChromeClient***.cpp; created by WebView and passed to Frame/Page for WebCore to use)
      • ChromeClientJS
      • Chrome class
      • AcceleratedCompositor (GraphicsLayerClient)
      • GraphicsLayer (TextureMapperLayer); GraphicsLayer (CREATION)
      • GraphicsLayerFactory
      • TextureMapper; TextureMapperGL (TextureMapper / GraphicsLayerClient)
      • ContextGL, OpenGLES V2 (platform specific, accelerated)
      • GraphicsContext3D (GraphicsContext); GraphicsContext (library/platform specific)
      • HostWindow or IPC Channel for WebKit2/Chrome
      • RenderLayerCompositor, RenderLayer, RenderLayerBacking
      • WebCore::Frame, WebCore::FrameView, WebCore::Page, WebCore::Document
      • Platform Blit Surface (non-accelerated); Software Compositing
      • Video Codecs; Plugins
      • glBindTexture() / Canvas / SDL / GLUT / XWindow / DWM / NSOpenGL / etc.

      Edge labels: "creates but does not manage"; "creates ChromeClientJS and hands off AcceleratedCompositor once created".

      Rendering paths:
      • Non-accelerated, non-composited, blittable path.
      • Composited, but not accelerated path (not bit-blted).
      • Accelerated, but not composited bit-blt path.
      • Composited and accelerated path.

      Painting / Drawing:
      • cairo (or other drawing library: skia, CoreAnimation, etc.)
      • pixman for fast patched drawing optimizations
      • Image, ImageBuffer
      • libjpegturbo, libpng (note: gif and bmp are built in to WebCore)
      • zlib (decompressing PNGs)
      • FreeType, FontConfig, used for font parsing and layout

      Annotations:
      • ChromeClientJS executes on AcceleratedContext: setRootGraphicsLayer, enabled?(), scheduleLayerFlush, resizeRootLayer.
      • Chrome class: a proxy for the ChromeClient interface passed into Frame. When a graphics layer is created it sends attachRootGraphicsLayer to ChromeClientJS; in addition it will execute WidgetSizeChanged (or WebView may), setNeedsOneShotDrawingSynchronization, scheduleCompositingLayerFlush and scheduleAnimation. These are all passed through to the AcceleratedCompositor on behalf of WebKit. Chrome/WebKit/WebCore will only do this if accelerated compositing is turned on by settings and ACCELERATED_COMPOSITING=1 && TEXTURE_MAPPER_GL=1 && TEXTURE_MAPPER=1 in compiler settings.
      • DEVICE SCALE FACTOR, PAGE SIZE, ETC.: Executes setDeviceScaleFactor(float), usually 2 in webkit.js for HiDPI rendering. Also executes viewport size to set the size of the view. This causes the frame, in both accelerated and non-accelerated mode, to kick out a bitmap image twice the size when bit-blting. However, all coordinates are still in logical pixels.
      • The “layout black box”: this is where the magic happens; we will be re-informed of results through the ChromeClient executed by Chrome.
      • Shoots the created layers and root layers to the texture mapper, which tiles and uploads them to GL for display. These manage scrolling, memory use and other things for us so we don’t just haphazardly create 20,000 different compositing layers, textures, etc.
      • paintContents on ChromeClientJS actually draws contents to TextureMapperLayer as the TextureMapperGL interface needs; it’ll request these through paintContents.
      • TextureMapper sends paintContents to AcceleratedCompositor, which in turn manages clearing OpenGL and maintaining buffers/contexts. Clears buffers, makes the context current.
      • Chrome/WebKit pushes a graphics layer to ChromeClient that is created by GraphicsLayerFactory.
      • GraphicsLayerFactory: created by taking the ChromeClientJS that’s held by Frame, or a global default constructor, to create all RenderLayers and GraphicsLayers. Note: on some platforms this is part of ChromeClient.
      • WebView creates the chrome client that is platform specific; it’s sent to WebCore::Frame and a copy is retained for WebView.
      • WebCore::Frame, WebCore::FrameView, WebCore::Page and a whole host of other classes run methods on the chrome client when specific work needs to be done. They inform each other of size changes, when graphics layers need to be flushed, and a whole host of other things to sync states.
      • Pushes textures that are tiled or full as “composited” layers to GL. Used for special transforms or accelerated scaling.
      • There’s also coordinated graphics and tiling.
      • When attachRootGraphicsLayer is executed by Chrome, the graphics layer is passed into the accelerated compositor. The compositor is checked to see if it’s enabled; if not, compositing is turned off; if so, compositing is turned on.
      • WebView creates a device GL and EGL (OpenGL ES v2) context via SDL. This context in WebKit is globally available once created. It then creates AcceleratedCompositor and does nothing else than hand it to ChromeClientJS. It also makes these the current context and sets the device viewport size (not the GL context size). ContextGL and ContextEGL are hacked to pass specific params to Emscripten to create the right compatible surface; these hacks are wrapped in PLATFORM(JS) preprocessor checks.
      • Compiled Vertexes & Shaders: classes compile layout commands into an OpenGL vertex and shader program.
      • GraphicsLayerTextureMapper.cpp, GraphicsLayer::create: factory ? factory->create : GraphicsLayerTextureMapper()
      • ChromeClient->graphicsLayerFactory() (GraphicsLayerFactory passed through from ChromeClient->factory(); if none exists, use the default TextureMapper implementation).
      • Layout and painting produce a render tree that is managed by a host of classes. The RenderLayers and RenderLayerTree communicate with the render layer compositor to determine the GraphicsLayers that are then passed on through the RenderLayerBacking.

      CSS animation components:
      • AnimationController, AnimationControllerPrivate, AnimationBase, CompositeAnimation, KeyframeAnimation
      • AnimationUpdateBlock (implemented in AnimationController.h)
      • WaitingAnimationSet (an array of AnimationBase)
      • Document, Element, RenderElement, RenderStyle
      • Document::updateStyleIfNeeded, Element::setNeedsStyleRecalc
      • flushPendingLayerChanges, flushCompositingStateIncludingSubframes

      Animation state, viewed as a state machine with enum m_animState: New, StartWaitTimer, StartWaitStyleAvailable, StartWaitResponse, Looping, Ending, PausedNew, PausedWaitTimer, PausedWaitResponse, PausedWaitStyleAvailable, PausedRun, Done, FillingForwards.

      Animation annotations:
      • Knows about, and fires AnimationController methods as states change.
      • Element: knows about and executes style recalculations on documents and elements. However, it does not actually change the style’s value, just whether it should recalculate and potentially layout/render.
      • RenderElement: knows about and interacts with AnimationBase, unclear why.
      • WaitingAnimationSet: seems to be a list of animations (AnimationBase classes) waiting to be animated; their state is stored in AnimationBase and could potentially become out of sync by being in an array while technically not waiting.
      • AnimationController has two paths based on whether request animation frame is enabled or not; in addition there is a request animation frame timing feature that further branches into a new path, confusing how the implementation path flows. Performs most of its work in AnimationControllerPrivate as a proxy; seems unnecessary and unclear why.
      • Performs separate paths for compositing animations; this makes for confusing bugs.
      • AnimationUpdateBlock: issues beginAnimationUpdate or endAnimationUpdate simply through its constructor/destructor; very unclear why, and seems to pollute the paths.
      • animation() contains one controller per frame.
      • Has a circular dependency with AnimationController, unclear why.
      • Runs on a one-shot timer, unclear why.
      • Has a circular dependency with AnimationBase, unclear why.
      • Implementation hides “AnimationControllerPrivate” rather than implementing AnimationController. Unclear why.
      • Creates an animation update block on the stack, letting the constructor/destructor fire begin/end calls to AnimationController. Gets the AnimationController from frame.animation(). Uses the frame reference only to get access to the frame view class to execute flushCompositingStateIncludingSubframes and other flush compositing state methods.
      • Combined with RenderLayerCompositor, these do the actual changes to the styles and are called by AnimationController, AnimationBase and Frame/Element.

      KNOWN DESIGN ISSUES: This system has a race condition. If the compositor is flushed or invalidated too quickly (e.g., the chrome client calls scheduleLayerFlush on AcceleratedContext.cpp), the animation base’s timer (within AnimationController) fails to remove waiting animations that have already completed from the WaitingAnimationSet. Since the AnimationController gets no chance to remove these on its next timer run between the AcceleratedContext’s scheduled layer flushes, items within WaitingAnimationSet are thought to be “waiting” for an animation but have an m_animState (on AnimationBase) of Ending, Done or other. In other words, the AnimationController thinks that animations that have completed are still waiting for their style because the accelerated compositor is plowing through them too quickly. The cure is to treat requests from ChromeClient to flush, invalidate or paint as “suggestions”: prevent them from executing more often than once every 1/60th of a second, and do not allow more than one flush to be issued at a time (e.g., two timers on separate threads running a flush concurrently).
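The cure described in the known design issues (treat flush requests as suggestions, run at most one per 1/60th of a second, never two at once) could be sketched roughly as follows. This is only an illustration in JavaScript, not code from WebKit or webkit.js; `createFlushGate`, `doFlush` and the injectable `now` clock are hypothetical names.

```javascript
// Hypothetical sketch: treat flush requests as suggestions.
// At most one flush runs per `minIntervalMs`, and never two at once.
function createFlushGate(doFlush, minIntervalMs, now) {
  var lastFlush = -Infinity; // time of the last flush that actually ran
  var flushing = false;      // guards against re-entrant flushes
  return function requestFlush() {
    var t = now();
    if (flushing || t - lastFlush < minIntervalMs) {
      return false; // suggestion declined
    }
    flushing = true;
    try {
      doFlush();
    } finally {
      flushing = false;
      lastFlush = t;
    }
    return true;
  };
}

// Usage with a fake clock so the behavior is easy to follow.
var gateClock = 0;
var gateFlushCount = 0;
var requestFlush = createFlushGate(
  function () { gateFlushCount += 1; },
  1000 / 60,
  function () { return gateClock; }
);
var gateResults = [];
gateResults.push(requestFlush()); // t = 0ms: runs
gateClock = 5;
gateResults.push(requestFlush()); // t = 5ms: too soon, declined
gateClock = 20;
gateResults.push(requestFlush()); // t = 20ms: runs again
```

A real implementation would probably reschedule a declined request for the next interval rather than drop it; the sketch only shows the gating itself.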
  9. 9. Academic exercises aside... What can we do now? FIRST REMEMBER:
      • There’s a difference between perceived performance and actual performance (e.g., is your event just firing late?)
      • Be careful when optimizing your code; it’s a rabbit hole and sometimes a pitfall (80/20 rule).
  10. 10. Some rules of thumb: Avoid interacting with the DOM with these patterns:
      • Changing a DOM parameter (adding, modifying, removing elements) then reading from another.
      This requires a layout validation/invalidation since the renderer has no idea if the change you made could potentially cause a change to the value you’re trying to read!
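One way to stay clear of the write-then-read pattern is to do all reads first and all writes afterwards. A minimal sketch, assuming elements that expose `offsetHeight` and `style`; the function name `equalizeHeights` and the plain objects standing in for elements are made up for illustration.

```javascript
// Sketch: read every layout value first, then write.
// `items` are DOM elements in a browser; anything with
// `offsetHeight` and `style` works (assumption for illustration).
function equalizeHeights(items) {
  // Phase 1: reads only. No DOM writes happen in between, so the
  // renderer should not need to re-validate layout for each read.
  var tallest = 0;
  for (var i = 0; i < items.length; i += 1) {
    tallest = Math.max(tallest, items[i].offsetHeight);
  }
  // Phase 2: writes only.
  for (var j = 0; j < items.length; j += 1) {
    items[j].style.height = tallest + 'px';
  }
  return tallest;
}

// Plain objects stand in for elements so the sketch runs outside a browser.
var fakeItems = [
  { offsetHeight: 40, style: {} },
  { offsetHeight: 75, style: {} },
  { offsetHeight: 60, style: {} }
];
var tallestHeight = equalizeHeights(fakeItems);
```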
  11. 11. Some rules of thumb: Avoid incremental changes to the DOM if you can batch them together:
      For instance, if you need to create HTML elements in JavaScript, using innerHTML is faster than using document.createElement, that is, if you’re creating nested elements or more than one element.
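As a rough sketch of batching, the markup for several elements can be assembled as one string and assigned to innerHTML once. The helpers `escapeHtml` and `buildListHtml` are hypothetical; the escaping step is an addition not mentioned on the slide, included because innerHTML interprets its input as markup.

```javascript
// Escape text so it is safe to place inside HTML (added precaution).
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the markup for a whole list as a single string.
function buildListHtml(labels) {
  var parts = ['<ul>'];
  for (var i = 0; i < labels.length; i += 1) {
    parts.push('<li>' + escapeHtml(labels[i]) + '</li>');
  }
  parts.push('</ul>');
  return parts.join('');
}

var listHtml = buildListHtml(['one', 'two & three', '<four>']);
// In a browser this becomes a single DOM write:
//   container.innerHTML = listHtml;
```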
  12. 12. Some rules of thumb: Avoid JavaScript that interacts with DOM functions (vs. strings or properties on the DOM)
      • JavaScript can safely optimize more if you’re modifying a string rather than executing a function.
      • Again, innerHTML does not cause a JS optimization pause (if you’re writing or appending but not reading), but document.createElement will.
  13. 13. Some rules of thumb: Give the browser as much information about animations as you can. Use declarative animation styles in CSS.
      • Use animation key frames and transitions in CSS.
      • Use the will-change CSS property for properties that frequently change (not yet implemented, but SOON!)
      • These can be pre-compiled by the RenderLayer prior to the animation ever being executed!
  14. 14. Some rules of thumb: Use linear transformations rather than standard CSS style rules to change the position or scale.
      • Using the CSS transform property you can apply linear transformations that can be done entirely in the compositor and GPU.
      • Changing the X/Y (left/top) or width/height will cause a reflow/relayout and a new texture to upload to the GPU.
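In script, the difference comes down to which style property gets written. A small sketch, where `moveTo` is a hypothetical helper and a plain object stands in for an element:

```javascript
// Sketch: position an element with a transform instead of left/top.
// `el` is a DOM element in a browser; any object with a `style` works here.
function moveTo(el, x, y) {
  // Avoided: el.style.left = x + 'px'; el.style.top = y + 'px';
  // (per the slide, that causes a reflow and a new texture upload)
  el.style.transform = 'translate(' + x + 'px, ' + y + 'px)';
}

var fakeBox = { style: {} };
moveTo(fakeBox, 10, 20);
```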
  15. 15. Some rules of thumb: Use requestAnimationFrame whenever possible.
      • requestAnimationFrame prevents layout thrashing as it’s explicitly done before the next layout loop and after compositing.
      • The compositor is aware of requestAnimationFrame and lets you modify elements prior to compositing frames.
      • This can go a long way toward keeping you from interrupting a layout and forcing a new one to run.
  16. 16. Some rules of thumb: Simplify your CSS
      • Do not use overly complex selectors.
      • Duplicate styles must be resolved and increase layout time.
      • This has an r*e growth rate (r = rules, e = elements); reducing either will lower your layout time.
      • Rules have a z*r*e growth rate (z = number of selector parameters).
  17. 17. Some rules of thumb: Do not add CSS rules or explicitly set style parameters after a document load.
      • Browsers can cache possible states (or visited style states), but not when they’re set dynamically.
      • Create various possible “style states” for each element and switch the class on the element rather than setting the style attribute.
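A sketch of what switching between predefined “style states” might look like. The class names in `PANEL_STATES` and the helper `setPanelState` are hypothetical, and the matching CSS rules are assumed to be in a stylesheet loaded with the document.

```javascript
// Hypothetical state classes; their CSS rules (.panel--open etc.) are
// assumed to exist in a stylesheet loaded with the document.
var PANEL_STATES = ['panel--closed', 'panel--open', 'panel--disabled'];

function setPanelState(el, state) {
  if (PANEL_STATES.indexOf(state) === -1) {
    throw new Error('Unknown state: ' + state);
  }
  // Keep unrelated classes, drop any previous state class.
  var kept = el.className.split(/\s+/).filter(function (name) {
    return name !== '' && PANEL_STATES.indexOf(name) === -1;
  });
  kept.push(state);
  el.className = kept.join(' ');
}

// A plain object stands in for an element so the sketch runs anywhere.
var fakePanel = { className: 'panel panel--closed' };
setPanelState(fakePanel, 'panel--open');
```

Only the class attribute changes; no inline style is written, so the browser can resolve the element against rules it already knows.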
  18. 18. Some rules of thumb: Avoid, if you can, using libraries and frameworks.
      Best practice:
      • Prototype with libraries, then profile and begin removing/replacing functionality with a smaller, limited set that covers your needs.
      • Most libraries and frameworks are built for ease of use, and not performance.
  19. 19. Some rules of thumb: JavaScript memory leaks are easy to create.
      It’s fairly easy to accidentally have an object refer to another object that refers back to itself. Be careful (and aware) of these corner cases.
      • Use closures and avoid objects that take in other objects.
      • Avoid defining variables in the global scope.
  20. 20. Some rules of thumb: Use prototypes rather than user-defined objects.
          var obj = { foo: function() { console.log('hello'); } };
      Create 10,000 of these and you’ll have 10,000 definitions AND instances. Careful!
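To make the difference concrete, here is a small sketch comparing an object literal that carries its own method with a constructor whose method lives on the prototype. `makeLiteral` and `Greeter` are illustrative names.

```javascript
// Per-instance method: every object carries its own copy of `foo`.
function makeLiteral() {
  return { foo: function () { return 'hello'; } };
}

// Prototype method: one function shared by all instances.
function Greeter() {}
Greeter.prototype.foo = function () { return 'hello'; };

var literalA = makeLiteral();
var literalB = makeLiteral();
var protoA = new Greeter();
var protoB = new Greeter();

// Distinct function objects per literal, one shared function on the prototype.
var literalsShareFoo = literalA.foo === literalB.foo; // false
var protosShareFoo = protoA.foo === protoB.foo;       // true
```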
  21. 21. Some rules of thumb: Don’t fear iframes
      • If you have complex controls (visjs, d3?) that may need their own UI loop, consider placing them into iframes.
      • iframes can give you a new thread and potentially a new process!
      • Useful, but don’t overdo it; iframes are heavyweight.
  22. 22. Some rules of thumb: Be careful mixing contents
      • Plugins, video, WebGL, CSS animations and traditional DOM rendering all run on separate contexts.
      • They’re pulled together via render layers and graphics layers.
      • The more contexts you introduce, the more complex the synchronization between them can become.
      • Contexts != compositing layers (but sometimes they can be).
  23. 23. Some rules of thumb: Avoid listening to high-throughput events
      • A common performance mistake is not removing event listeners on DOM elements or reacting to the DOM event in the event thread.
      • High-throughput events such as mousemove, touchmove and scroll should very rarely be used.
      • If you need to use these, cache the result and animate in requestAnimationFrame, NOT the event.
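The “cache the result and animate in requestAnimationFrame” advice could look roughly like this. `createPointerTracker` is a hypothetical helper; the scheduler is passed in as a parameter (it would be requestAnimationFrame in a browser) so the sketch also runs without one.

```javascript
// `schedule` is requestAnimationFrame in a browser; it is a parameter here
// so the sketch can run without one.
function createPointerTracker(schedule, render) {
  var latest = null;        // most recent event data, overwritten freely
  var framePending = false; // true while a frame callback is queued
  return function onMove(e) {
    latest = { x: e.clientX, y: e.clientY }; // cheap: just cache
    if (framePending) { return; }
    framePending = true;
    schedule(function () {
      framePending = false;
      render(latest); // the expensive work happens once per frame
    });
  };
}

// Fake scheduler and renderer standing in for the browser.
var frameQueue = [];
var renderedPositions = [];
var onMove = createPointerTracker(
  function (cb) { frameQueue.push(cb); },
  function (pos) { renderedPositions.push(pos); }
);
onMove({ clientX: 1, clientY: 1 });
onMove({ clientX: 2, clientY: 2 });
onMove({ clientX: 3, clientY: 3 });
var queuedFrames = frameQueue.length; // three events, one queued frame
frameQueue.shift()();                 // the "frame" fires once

// In a browser (assumed usage):
//   el.addEventListener('mousemove', createPointerTracker(
//     function (cb) { requestAnimationFrame(cb); }, draw));
```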
  24. 24. Some rules of thumb: Be conservative when forcing a compositing layer (e.g., translate3d(0,0,0) or translateZ(0))
      • Creating a graphics layer/render layer is expensive.
      • Generally the rendering sub-system is very efficient at figuring out what should and shouldn’t be layers.
      • It makes very little sense to force compositing layers in a nested manner; be careful doing this!
      • It makes very little sense to force compositing layers if they don’t have a linear transformation or mask (e.g., overflow:scroll).
  25. 25. CSS styles that cause paints
      Repaints are the most expensive operation, and should ALWAYS be declarative (when possible).
      color, border-style, visibility, background, text-decoration, background-image, background-position, background-repeat, outline-color, outline, outline-style, border-radius, outline-width, box-shadow, background-size
  26. 26. CSS styles that cause layout
      Layouts are lighter than repaints but can (in certain circumstances) trigger a repaint as well!
      width, height, overflow-y, font-weight, padding, margin, display, border-width, border, top, position, font-size, float, text-align, overflow, left, font-family, line-height, vertical-align, right, clear, white-space, bottom, min-height
  27. 27. CSS styles that cause a composite
      Composites are generally not expensive; declarative or imperative style declarations are fine (note: not all of these cause a NEW composite layer, but they cause a composite of existing layers):
      opacity, -webkit-user-select, cursor, -webkit-transform, z-index, transform: scale(), transform: translate3d(), transform: translateZ(), transform: rotate()
  28. 28. Identifying Issues: Jank
      The overuse of graphics layers, causing pages to take excessively long to composite.
      Cause: Composite CSS calls used in a nested pattern.
      Diagnose: Large composite times.
      Cure: Remove nested translate3d/translateZ, reduce linear transforms, remove scroll event listeners, remove opacity or CSS composite filters.
  29. 29. Identifying Issues: Paint Storm
      Cause: Changing a paint CSS style on a high-throughput event, or circularly flip-flopping a CSS paint style.
      Diagnose: Very frequent paint->composite in frames.
      Cure: Find where paint CSS styles are changing.
  30. 30. Identifying Issues: Layout Thrashing
      Common Cause: Reading layout DOM properties or modifying the DOM.
      Diagnose: Frequent but short layout requests without paint/composite after.
      Cure: You’re most likely reading a CSS property or DOM property A LOT in your JavaScript code (perhaps in a tight loop?), for example:
          // els is an array of elements
          for (var i = 0; i < els.length; i += 1) {
            var w = someOtherElement.offsetWidth / 3;
            els[i].style.width = w + 'px';
          }
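One possible fix for the loop above, as a sketch: read `offsetWidth` once, before any style is written. The wrapper function `setWidthsToThird` and the fake objects (including the read-counting getter) are additions for illustration only.

```javascript
// Same effect as the slide's loop, but `offsetWidth` is read once, before
// any style is written, so the writes cannot invalidate a later read.
function setWidthsToThird(els, someOtherElement) {
  var w = someOtherElement.offsetWidth / 3; // single layout read
  for (var i = 0; i < els.length; i += 1) {
    els[i].style.width = w + 'px';          // writes only
  }
  return w;
}

// Fake objects stand in for DOM elements; the getter counts layout reads.
var widthReads = 0;
var fakeSource = {
  get offsetWidth() { widthReads += 1; return 300; }
};
var thirdEls = [{ style: {} }, { style: {} }, { style: {} }];
var thirdWidth = setWidthsToThird(thirdEls, fakeSource);
```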
  31. 31. Identifying Issues: JS Memory Leaks
      Common Cause: Circular references or code holding onto object references for longer than necessary.
      Diagnose: Use DOM shims and run your code in node with the --expose_gc option; use “gc()” to force garbage collection.
      Cure: Use binary-search type methods for isolating the offending code and fix/refactor.
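A rough sketch of the diagnose step in Node: force a collection, measure the heap, run the suspect code many times, and measure again. `heapUsedAfterGc`, `measureHeapGrowth` and the deliberately leaky `leaky` function are hypothetical; heap readings are noisy, so treat the number as a trend indicator rather than an exact figure.

```javascript
// Run with: node --expose_gc leak-check.js   (file name is hypothetical)
// Without the flag, global.gc is undefined and the forced collection is skipped.
function heapUsedAfterGc() {
  if (typeof global.gc === 'function') {
    global.gc(); // force a collection so readings are comparable
  }
  return process.memoryUsage().heapUsed;
}

// `suspect` is the code under test; run it many times and check whether
// the heap keeps growing even after forced collections.
function measureHeapGrowth(suspect, iterations) {
  var before = heapUsedAfterGc();
  for (var i = 0; i < iterations; i += 1) {
    suspect(i);
  }
  var after = heapUsedAfterGc();
  return after - before; // bytes; noisy, so compare across several runs
}

// A deliberately leaky function: it keeps every string reachable.
var retained = [];
function leaky() {
  retained.push(new Array(1000).join('x'));
}

var leakGrowth = measureHeapGrowth(leaky, 5000);
```

To follow the binary-search cure, the same measurement can be repeated while disabling half of the suspect code at a time.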
  32. 32. Mobile App Best Practices
      • Don’t use touch move events on scrollable items.
      • Nest overflow elements to produce scroll effects.
      • Overflow elements should be in 500px intervals.
        – WebKit uses tiling for composite layers; each tile is 500px.
      • Use absolute positioning/transforms wherever possible.
      • Avoid nesting elements.
      • Less is more when listening to events.
      • Pre-paint items soon to show up; use display:none to hide.
      • Mobile has more memory to lend, less GPU/CPU.
        – Declarative style CSS animations are key here.
        – Be careful when forcing a compositing layer with transforms.