
CSS parsing: performance tips & tricks


Technical talk about CSS parsing: tips and tricks used in CSSTree to make it the fastest detailed CSS parser written in JavaScript.

Published in: Technology


  1. 1. CSS Parsing: performance tips & tricks. Roman Dvornov, Avito. Moscow, September 2016
  2. 2. Frontend lead at Avito. Specializes in SPA. Maintainer of: basis.js, CSSO, component-inspector, csstree and others
  3. 3. This talk is a continuation of: CSS parsing (Russian), tinyurl.com/csstree-intro
  4. 4. CSSTree
  5. 5. CSSTree: the fastest detailed CSS parser
  6. 6. How this project was born
  7. 7. About a year ago I started maintaining CSSO (a CSS minifier): github.com/css/csso
  8. 8. CSSO was based on Gonzales (a CSS parser): github.com/css/gonzales
  9. 9. What's wrong with Gonzales • Development stopped in 2013 • Inconvenient and buggy AST format • Parsing mistakes • Excessively complex code base • Slow, high memory consumption, pressure on GC
  10. 10. But I didn't want to spend my time developing a parser…
  11. 11. Alternatives?
  12. 12. You can find a lot of CSS parsers
  13. 13. Common problems • No longer developed • Outdated (don't support the latest CSS features) • Buggy • Inconvenient AST • Slow
  14. 14. The PostCSS parser is a good choice if you need one now: postcss.org
  15. 15. PostCSS pros • Actively developed • Parses CSS well, even non-standard syntax, and has a tolerant mode • Saves formatting info • Handy API to work with the AST • Fast
  16. 16. General con: selectors and values are not parsed (they are represented as strings)
  17. 17. That forces developers to • Use non-robust or inefficient approaches • Invent their own parsers • Use additional parsers: postcss-selector-parser, postcss-value-parser
  18. 18. Switching to PostCSS meant writing our own selector and value parsers, which is pretty much the same as writing an entirely new parser
  19. 19. However, as a result of continuous refactoring, within a few months the CSSO parser was completely rewritten (which was not planned)
  20. 20. And it was extracted into a separate project: github.com/csstree/csstree
  21. 21. Performance
  22. 22. My previous talk about parser performance: CSSO – performance boost story (Russian), tinyurl.com/csso-speedup
  23. 23. After my talk at the HolyJS conference, the parser's performance was improved one more time :) * Thanks to Vyacheslav @mraleph Egorov for the inspiration
  24. 24. Benchmark on bootstrap.css v3.3.7 (146Kb), github.com/postcss/benchmark: CSSTree: 24 ms; Mensch: 31 ms; CSSOM: 36 ms; PostCSS: 38 ms; Rework: 81 ms; PostCSS Full: 100 ms; Gonzales: 175 ms; Stylecow: 176 ms; Gonzales PE: 214 ms; ParserLib: 414 ms. Legend: non-detailed AST, detailed AST. PostCSS Full = PostCSS + postcss-selector-parser + postcss-value-parser
  25. 25. Epic fail: as I realized later, I had extracted the wrong version of the parser 😱 github.com/csstree/csstree/commit/57568c758195153e337f6154874c3bc42dd04450
  26. 26. Benchmark on bootstrap.css v3.3.7 (146Kb), github.com/postcss/benchmark: CSSTree: 24 ms; Mensch: 31 ms; CSSOM: 36 ms; PostCSS: 38 ms; Rework: 81 ms; PostCSS Full: 100 ms; Gonzales: 175 ms; Stylecow: 176 ms; Gonzales PE: 214 ms; ParserLib: 414 ms. Time after parser update: 13 ms
  27. 27. Parsers: basic training
  28. 28. Main steps • Tokenization • Tree assembling 28
  29. 29. Tokenization
  30. 30. Split text into tokens: • whitespaces: [ \n\r\t\f]+ • keyword: [a-zA-Z…]+ • number: [0-9]+ • string: "string" or 'string' • comment: /* comment */ • punctuation: [;,.#{}\[\]()…]
  31. 31. .foo { width: 10px; } → [ '.', 'foo', ' ', '{', '\n ', 'width', ':', ' ', '10', 'px', ';', '\n', '}' ]
  32. 32. We need more info about every token: type and location. It is more efficient to compute type and location during the tokenization step
  33. 33. .foo { width: 10px; } → [ { type: 'FullStop', value: '.', offset: 0, line: 1, column: 1 }, … ]
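A minimal sketch of a tokenizer that records type and location in the same pass, as slides 30-33 describe. The token type names other than 'FullStop' and the exact character classes are assumptions made for illustration; it is not CSSTree's tokenizer. It deliberately uses RegExp and slice for readability, which the talk later lists among the performance killers.

```javascript
// Illustrative tokenizer: computes type, offset, line and column in one pass.
// Type names other than 'FullStop' are hypothetical.
function tokenize(source) {
  const tokens = [];
  let offset = 0;
  let line = 1;
  let column = 1;

  while (offset < source.length) {
    const rest = source.slice(offset);
    let match;
    let type;

    if ((match = /^[ \n\r\t\f]+/.exec(rest))) {
      type = 'Whitespace';
    } else if ((match = /^[a-zA-Z_-]+/.exec(rest))) {
      type = 'Identifier';
    } else if ((match = /^[0-9]+/.exec(rest))) {
      type = 'Number';
    } else {
      // Anything else is treated as a one-character punctuation token
      match = [rest[0]];
      type = rest[0] === '.' ? 'FullStop' : 'Punctuation';
    }

    const value = match[0];
    tokens.push({ type, value, offset, line, column });

    // Advance the location past the token just produced
    for (let i = 0; i < value.length; i++) {
      if (value[i] === '\n') {
        line++;
        column = 1;
      } else {
        column++;
      }
    }
    offset += value.length;
  }

  return tokens;
}

const tokens = tokenize('.foo {\n  width: 10px;\n}');
```

For this input the sketch yields 13 tokens, the same count as the token list on slide 31.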
  34. 34. Tree assembling
  35. 35. Creating a node: function getSelector() { var selector = { type: 'Selector', sequence: [] }; /* main loop */ return selector; }
  36. 36. Main loop: for (; currentToken < tokenCount; currentToken++) { switch (tokens[currentToken].type) { case TokenType.Hash: /* # */ selector.sequence.push(getId()); break; case TokenType.FullStop: /* . */ selector.sequence.push(getClass()); break; … } }
  37. 37. Result: { "type": "StyleSheet", "rules": [{ "type": "Atrule", "name": "import", "expression": { "type": "AtruleExpression", "sequence": [ ... ] }, "block": null }] }
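The "creating a node" and "main loop" slides can be combined into one small runnable sketch. Everything beyond the shape shown on the slides is an assumption: the TokenType values, the node types 'Id', 'Class' and 'Type', and the getName helper are hypothetical and only illustrate how a switch over token types assembles a node.

```javascript
// Hypothetical token types; the slides only show TokenType.Hash and TokenType.FullStop.
const TokenType = { Hash: 'Hash', FullStop: 'FullStop', Identifier: 'Identifier' };

function parseSelector(tokens) {
  let currentToken = 0;

  // Consume the identifier that must follow '.' or '#'
  function getName() {
    const token = tokens[++currentToken];
    if (token === undefined || token.type !== TokenType.Identifier) {
      throw new SyntaxError('Identifier is expected');
    }
    return token.value;
  }

  const selector = { type: 'Selector', sequence: [] };

  // Main loop: dispatch on the type of the current token
  for (; currentToken < tokens.length; currentToken++) {
    const token = tokens[currentToken];
    switch (token.type) {
      case TokenType.Hash: // #
        selector.sequence.push({ type: 'Id', name: getName() });
        break;
      case TokenType.FullStop: // .
        selector.sequence.push({ type: 'Class', name: getName() });
        break;
      case TokenType.Identifier:
        selector.sequence.push({ type: 'Type', name: token.value });
        break;
      default:
        throw new SyntaxError('Unexpected token: ' + token.type);
    }
  }

  return selector;
}

// Tokens for ".foo#bar"
const selectorAst = parseSelector([
  { type: TokenType.FullStop, value: '.' },
  { type: TokenType.Identifier, value: 'foo' },
  { type: TokenType.Hash, value: '#' },
  { type: TokenType.Identifier, value: 'bar' }
]);
```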
  38. 38. Parser performance boost Part 2: new horizons
  39. 39. [ { type: 'FullStop', value: '.', offset: 0, line: 1, column: 1 }, … ] Token's cost: 24 + 5 * 4 + array = min 50 bytes per token. Our project: ~1Mb of CSS, 254 062 tokens = min 12.7 Mb
  40. 40. Out of the box: changing approach
  41. 41. Computing all tokens at once and then assembling the tree is much easier, but it needs more memory and is therefore slower
  42. 42. Scanner (lazy tokenizer) 42
  43. 43. Key API: scanner.token /* current token or null */ scanner.next() /* go to the next token */ scanner.lookup(N) /* look ahead: returns the Nth token from the current one */
  44. 44. • lookup(N): fills the token buffer up to N tokens (if they are not computed yet) and returns the token at index N-1 of the buffer • next(): shifts a token from the buffer, if any, or computes the next token
  45. 45. The same number of tokens is computed, but not all at once, so less memory is required
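A rough sketch of the lazy scanner described on slides 43-45, assuming a deliberately simplified tokenizer (runs of whitespace, runs of letters and digits, and single punctuation characters). The class name LazyScanner, the readToken method and the charClass helper are hypothetical; only token, next() and lookup(N) come from the slides.

```javascript
// Hypothetical character classes: 1 = whitespace, 2 = letter or digit, 0 = anything else
function charClass(code) {
  if (code === 32 || (code >= 9 && code <= 13)) return 1;
  if ((code >= 48 && code <= 57) || (code >= 65 && code <= 90) || (code >= 97 && code <= 122)) return 2;
  return 0;
}

class LazyScanner {
  constructor(source) {
    this.source = source;
    this.pos = 0;       // offset of the first character not tokenized yet
    this.buffer = [];   // tokens computed ahead of time by lookup()
    this.token = null;  // current token or null
    this.next();
  }

  // Compute exactly one token starting at this.pos, or return null at the end of input
  readToken() {
    if (this.pos >= this.source.length) return null;
    const start = this.pos;
    const cls = charClass(this.source.charCodeAt(start));
    let end = start + 1;
    if (cls !== 0) {
      while (end < this.source.length && charClass(this.source.charCodeAt(end)) === cls) end++;
    }
    this.pos = end;
    return { start, end };
  }

  // Look ahead: fill the buffer up to n tokens and return the one at index n - 1
  lookup(n) {
    while (this.buffer.length < n) {
      const token = this.readToken();
      if (token === null) return null;
      this.buffer.push(token);
    }
    return this.buffer[n - 1];
  }

  // Take a token from the buffer if there is one, otherwise compute the next token
  next() {
    this.token = this.buffer.length > 0 ? this.buffer.shift() : this.readToken();
  }
}
```

Each call still allocates a token object, which is the GC pressure the following slides set out to remove.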
  46. 46. Problem: the approach puts pressure on GC 46
  47. 47. Reducing token's cost step by step
  48. 48. [ { type: 'FullStop', value: '.', offset: 0, line: 1, column: 1 }, … ] A string type is easy to understand, but it is for internal use only, so we can replace it with a number
  49. 49. [ { type: FULLSTOP, value: '.', offset: 0, line: 1, column: 1 }, … ] … var FULLSTOP = 46; /* '.'.charCodeAt(0) */ …
  50. 50. [ { type: 46, value: '.', offset: 0, line: 1, column: 1 }, … ]
  51. 51. [ { type: 46, value: '.', offset: 0, line: 1, column: 1 }, … ] We can avoid storing a substring in the token: it is very expensive for punctuation (moreover, those substrings are never used). Many constructions are assembled from several substrings; one long substring is better than a concatenation of several small ones
  52. 52. [ { type: 46, value: '.', offset: 0, line: 1, column: 1 }, … ] → [ { type: 46, start: 0, end: 1, line: 1, column: 1 }, … ]
  53. 53. [ { type: 46, start: 0, end: 1, line: 1, column: 1 }, … ] Look, Ma! No strings, just numbers!
  54. 54. Moreover, not an Array but a TypedArray: array of objects → arrays of numbers
  55. 55. Array vs. TypedArray: a TypedArray • Can't have holes • Is faster in theory (fewer checks) • Can be stored outside the heap (when big enough) • Is prefilled with zeros
  56. 56. [ { type: 46, start: 0, end: 1, line: 1, column: 1 }, … ] Storage: type in a Uint8Array (1 byte); start, end, line and column in Uint32Arrays (4 bytes each); 17 bytes per token. Sized by token count: 254 062 x 17 = 4.3Mb
  57. 57. 4.3Mb vs. 12.7Mb (min)
  58. 58. Houston, we have a problem: a TypedArray has a fixed length, but we don't know how many tokens will be found
  59. 59. [ { type: 46, start: 0, end: 1, line: 1, column: 1 }, … ] Storage: type in a Uint8Array (1 byte); start, end, line and column in Uint32Arrays (4 bytes each); 17 bytes per token. Sized by symbol count: 983 085 x 17 = 16.7Mb
  60. 60. 16.7Mb vs. 12.7Mb (min)
  61. 61. 16.7Mb vs. 12.7Mb (min). Don't give up, let's look at the arrays more closely
  62. 62. start = [ 0, 5, 6, 7, 9, 11, …, 35 ] end = [ 5, 6, 7, 9, 11, 12, …, 36 ]
  63. 63. start = [ 0, 5, 6, 7, 9, 11, …, 35 ] end = [ 5, 6, 7, 9, 11, 12, …, 36 ] …
  64. 64. start = [ 0, 5, 6, 7, 9, 11, …, 35 ] and end = [ 5, 6, 7, 9, 11, 12, …, 36 ] merge into offset = [ 0, 5, 6, 7, 9, 11, …, 35, 36 ], where start = offset[i] and end = offset[i + 1]
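The merged offset array can be sketched as follows, assuming a simplified tokenizer (runs of whitespace, runs of letters and digits, single punctuation characters). The function names tokenizeOffsets and charClass are hypothetical. The array is sized by the number of characters plus one, since the token count is not known in advance and each token is at least one character long.

```javascript
// Hypothetical character classes: 1 = whitespace, 2 = letter or digit, 0 = anything else
function charClass(code) {
  if (code === 32 || (code >= 9 && code <= 13)) return 1;
  if ((code >= 48 && code <= 57) || (code >= 65 && code <= 90) || (code >= 97 && code <= 122)) return 2;
  return 0;
}

function tokenizeOffsets(source) {
  // Worst case: one token per character, plus one slot for the end of the last token
  const offset = new Uint32Array(source.length + 1);
  let count = 0;
  let pos = 0;

  while (pos < source.length) {
    offset[count++] = pos; // start of the token; also the end of the previous one
    const cls = charClass(source.charCodeAt(pos++));
    if (cls !== 0) {
      while (pos < source.length && charClass(source.charCodeAt(pos)) === cls) pos++;
    }
  }
  offset[count] = source.length; // end of the last token

  return { offset, count };
}

const source = '.foo{}';
const { offset, count } = tokenizeOffsets(source);

// Token i spans offset[i] .. offset[i + 1]
const secondToken = source.substring(offset[1], offset[2]);
```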
  65. 65. [ { type: 46, start: 0, end: 1, line: 1, column: 1 }, … ] Storage: type in a Uint8Array (1 byte); offset (replacing start and end), line and column in Uint32Arrays (4 bytes each); 13 bytes per token. 983 085 x 13 = 12.7Mb
  66. 66. lines & columns: a { top: 0; } lines = [ 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3 ] columns = [ 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1 ]
  68. 68. lines & columns: line = lines[offset]; column = offset - lines.lastIndexOf(line - 1, offset);
  69. 69. lines & columns: line = lines[offset]; column = offset - lines.lastIndexOf(line - 1, offset); This is acceptable only for short lines, which is why we cache the last line start offset
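A small sketch of computing the column from the lines array, including one possible form of the "cache the last line start offset" idea. The helper names buildLines and createLocator are hypothetical, and the caching scheme is an assumption about what the slide means rather than CSSTree's code.

```javascript
// lines[i] is the 1-based line number of the character at offset i.
// A newline character belongs to the line it terminates.
function buildLines(source) {
  const lines = new Uint32Array(source.length);
  let line = 1;
  for (let i = 0; i < source.length; i++) {
    lines[i] = line;
    if (source.charCodeAt(i) === 10) line++; // '\n'
  }
  return lines;
}

// Returns a function mapping an offset to { line, column } (both 1-based).
// The start offset of the most recently requested line is cached, so repeated
// lookups on the same line skip the backward search.
function createLocator(lines) {
  let cachedLine = 0;
  let cachedLineStart = 0;

  return function locate(offset) {
    const line = lines[offset];
    if (line !== cachedLine) {
      cachedLine = line;
      // The last character of the previous line sits just before the line start;
      // lastIndexOf returns -1 for the first line, which gives a start of 0
      cachedLineStart = lines.lastIndexOf(line - 1, offset) + 1;
    }
    return { line, column: offset - cachedLineStart + 1 };
  };
}

const cssSource = 'a {\n  top: 0;\n}';
const lineMap = buildLines(cssSource);
const locate = createLocator(lineMap);
```

With this input, lineMap matches the lines array on slide 66, and locate(6) (the "t" of "top") reports line 2, column 3.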
  70. 70. [ { type: 46, start: 0, end: 1, line: 1, column: 1 }, … ] Storage: type in a Uint8Array (1 byte); offset and lines in Uint32Arrays (4 bytes each); 9 bytes per token. 983 085 x 9 = 8.8Mb
  71. 71. 8.8Mb vs. 12.7Mb (min)
  72. 72. Reduce operations with strings
  73. 73. Performance «killers»* • RegExp • String concatenation • toLowerCase/toUpperCase • substr/substring • … 69 * Polluted GC pulls performance down
  74. 74. Performance «killers»* • RegExp • String concatenation • toLowerCase/toUpperCase • substr/substring • … 70 * Polluted GC pulls performance down We can’t avoid using these things, but we can get rid of the rest
  75. 75. Avoid string concatenations: var start = scanner.tokenStart; … scanner.next(); … scanner.next(); … return source.substring(start, scanner.tokenEnd);
  76. 76. 72 function cmpStr(source, start, end, str) { if (end - start !== str.length) { return false; } for (var i = start; i < end; i++) { var sourceCode = source.charCodeAt(i); var strCode = str.charCodeAt(i - start); if (sourceCode !== strCode) { return false; } } return true; } String comparison No substring!
  77. 77. 73 function cmpStr(source, start, end, str) { if (end - start !== str.length) { return false; } for (var i = start; i < end; i++) { var sourceCode = source.charCodeAt(i); var strCode = str.charCodeAt(i - start); if (sourceCode !== strCode) { return false; } } return true; } String comparison Length fast-check
  78. 78. 74 function cmpStr(source, start, end, str) { if (end - start !== str.length) { return false; } for (var i = start; i < end; i++) { var sourceCode = source.charCodeAt(i); var strCode = str.charCodeAt(i - start); if (sourceCode !== strCode) { return false; } } return true; } String comparison Compare strings 
 by char codes
  79. 79. Case-insensitive comparison of strings*? * Meaning: without toLowerCase/toUpperCase
  80. 80. Heuristics • We only ever compare against reference strings (str) • Reference strings may be in lower case and contain Latin letters only (no Unicode) • I read once on Twitter…
  81. 81. Setting the 6th bit to 1 changes an upper-case Latin letter to lower case (works for ASCII Latin letters only): 'A' = 01000001, 'a' = 01100001, ('A'.charCodeAt(0) | 32) === 'a'.charCodeAt(0)
  82. 82. 78 function cmpStr(source, start, end, str) { … for (var i = start; i < end; i++) { … // source[i].toLowerCase() if (sourceCode >= 65 && sourceCode <= 90) { // 'A' .. 'Z' sourceCode = sourceCode | 32; } if (sourceCode !== strCode) { return false; } } … } Case insensitive string comparison
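Putting the pieces from the cmpStr slides together gives the following complete function. This is a straightforward assembly of the fragments shown above, with the elided parts filled in from the earlier case-sensitive version; the reference string str is assumed to be lower case and ASCII only, as the heuristics slide states.

```javascript
// Case-insensitive comparison of source[start..end) with a lower-case ASCII
// reference string, without creating any temporary strings.
function cmpStr(source, start, end, str) {
  // Length fast-check: most mismatches stop here
  if (end - start !== str.length) {
    return false;
  }

  for (let i = start; i < end; i++) {
    let sourceCode = source.charCodeAt(i);
    const strCode = str.charCodeAt(i - start);

    // Equivalent of source[i].toLowerCase() for 'A'..'Z'
    if (sourceCode >= 65 && sourceCode <= 90) {
      sourceCode = sourceCode | 32;
    }

    if (sourceCode !== strCode) {
      return false;
    }
  }

  return true;
}

const isImport = cmpStr('@IMPORT url(a)', 1, 7, 'import');
```

Note that only the source side is lower-cased, so an upper-case reference string never matches a lower-case source.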
  83. 83. Benefits • Most comparisons stop at the length check • No substring (no pressure on GC) • No temporary strings (e.g. the result of toLowerCase/toUpperCase) • String comparison doesn't pollute GC
  84. 84. Results • RegExp • string concatenation • toLowerCase/toUpperCase • substr/substring 80
  85. 85. No arrays in AST
  86. 86. What's wrong with arrays? • As arrays grow, their memory has to be reallocated and moved frequently (unnecessary memory copying) • Pressure on GC • We don't know the size of the resulting arrays in advance
  87. 87. Solution? 83
  88. 88. Bi-directional (doubly linked) list
  89. 89. [Diagram]
  90. 90. [Diagram: AST nodes linked into a bi-directional list]
  91. 91. Needs a little bit more memory than arrays, but…
  92. 92. Pros • No memory relocation • No GC pollution during AST assembly • next/prev references for free • Cheap insertion and deletion • Better for monomorphic walkers 87
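A minimal bi-directional list along the lines the slides describe, where each item wraps one AST node. The class name List, its methods and the item shape { prev, next, data } are assumptions for illustration and are not taken from CSSTree's source.

```javascript
// Minimal doubly linked list; each item is { prev, next, data }.
class List {
  constructor() {
    this.head = null;
    this.tail = null;
  }

  // Append in O(1); existing items are never moved or copied
  append(data) {
    const item = { prev: this.tail, next: null, data };
    if (this.tail !== null) {
      this.tail.next = item;
    } else {
      this.head = item;
    }
    this.tail = item;
    return item;
  }

  // Remove in O(1) given a reference to the item
  remove(item) {
    if (item.prev !== null) {
      item.prev.next = item.next;
    } else {
      this.head = item.next;
    }
    if (item.next !== null) {
      item.next.prev = item.prev;
    } else {
      this.tail = item.prev;
    }
    item.prev = null;
    item.next = null;
  }

  // Walk the list; the next item is saved first so fn may remove the current one
  each(fn) {
    let next;
    for (let item = this.head; item !== null; item = next) {
      next = item.next;
      fn(item.data, item);
    }
  }
}

const sequence = new List();
const first = sequence.append({ type: 'Class', name: 'foo' });
const second = sequence.append({ type: 'Combinator', name: ' ' });
const third = sequence.append({ type: 'Id', name: 'bar' });
sequence.remove(second);
```

Because every item has the same shape, a walker that visits item.data stays monomorphic regardless of how many children a node has.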
  93. 93. These approaches and others reduced memory consumption and GC pressure and made the parser twice as fast as before
  94. 94. Benchmark on bootstrap.css v3.3.7 (146Kb), github.com/postcss/benchmark: CSSTree: 24 ms; Mensch: 31 ms; CSSOM: 36 ms; PostCSS: 38 ms; Rework: 81 ms; PostCSS Full: 100 ms; Gonzales: 175 ms; Stylecow: 176 ms; Gonzales PE: 214 ms; ParserLib: 414 ms. These changes are what the 13 ms result is about
  95. 95. But the story goes on 😋 90
  96. 96. Parser performance boost story Part 3: a week after FrontTalks
  97. 97. In general • Simplified AST structure • Lower memory consumption • Array reuse • list.map().join() -> loop + string concatenation • and others…
  98. 98. Once more about token costs
  99. 99. [ { type: 46, start: 0, end: 1, line: 1, column: 1 }, … ] Storage: types in a Uint8Array (1 byte); offsets and lines in Uint32Arrays (4 bytes each); 9 bytes per token. 983 085 x 9 = 8.8Mb
  100. 100. lines can be computed on demand
  101. 101. [ { type: 46, start: 0, end: 1, line: 1, column: 1 }, … ] Storage: types in a Uint8Array (1 byte); offsets in a Uint32Array (4 bytes); 5 bytes per token. 983 085 x 5 = 4.9Mb
  102. 102. Do we really need all 32 bits for the offset? Heuristics: no one parses more than 16Mb of CSS
  103. 103. offset = [ 0, 5, 6, 7, 9, 11, 11, …, 1234 ] type = [ 1, 47, 47, 4, 4, 47, 5, …, 3 ]
  104. 104. offset = [ 0, 5, 6, 7, 9, 11, 11, …, 1234 ] type = [ 1, 47, 47, 4, 4, 47, 5, …, 3 ] Merged: offsetAndType[i] = type[i] << 24 | offset[i]
  105. 105. offset = [ 0, 5, 6, 7, 9, 11, 11, …, 1234 ] type = [ 1, 47, 47, 4, 4, 47, 5, …, 3 ] Merged: offsetAndType[i] = type[i] << 24 | offset[i]; offsetAndType = [ 16777216, 788529157, … ]
  106. 106. offset = [ 0, 5, 6, 7, 9, 11, 11, …, 1234 ] type = [ 1, 47, 47, 4, 4, 47, 5, …, 3 ] Merged: offsetAndType[i] = type[i] << 24 | offset[i]; offsetAndType = [ 16777216, 788529157, … ] Reading back: start = offsetAndType[i] & 0xFFFFFF; type = offsetAndType[i] >> 24;
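The packing arithmetic can be checked with a short sketch. The helper names pack, unpackOffset and unpackType are hypothetical. One deliberate difference from the slide: the type is read back with the unsigned shift >>> instead of >>, because a value read from a Uint32Array with the top bit set (type codes of 128 and above) would come out negative with >>. For type codes below 128, as in the slide's example, both operators give the same result.

```javascript
const OFFSET_MASK = 0xFFFFFF; // low 24 bits: offsets up to 16Mb
const TYPE_SHIFT = 24;        // high 8 bits: token type

// Pack a type (0..255) and an offset (0..0xFFFFFF) into one unsigned 32-bit number
function pack(type, offset) {
  return ((type << TYPE_SHIFT) | offset) >>> 0;
}

function unpackOffset(value) {
  return value & OFFSET_MASK;
}

function unpackType(value) {
  return value >>> TYPE_SHIFT;
}

// The first two tokens from the slide: (type 1, offset 0) and (type 47, offset 5)
const offsetAndType = new Uint32Array([pack(1, 0), pack(47, 5)]);
```

The two packed values are 16777216 and 788529157, the same numbers the slide shows.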
  107. 107. [ { type: 46, start: 0, end: 1, line: 1, column: 1 }, … ] Storage: type and offset packed into a single Uint32Array (4 bytes); 4 bytes per token. 983 085 x 4 = 3.9Mb
  108. 108. 3.9-7.8 Mb vs. 12.7 Mb (min)
  109. 109. 104 class Scanner { ... next() { var next = this.currentToken + 1; this.currentToken = next; this.tokenStart = this.tokenEnd; this.tokenEnd = this.offsetAndType[next + 1] & 0xFFFFFF; this.tokenType = this.offsetAndType[next] >> 24; } } Needs 2 reads for 3 values (tokenEnd becomes tokenStart)
  110. 110. 105 class Scanner { ... next() { var next = this.currentToken + 1; this.currentToken = next; this.tokenStart = this.tokenEnd; this.tokenEnd = this.offsetAndType[next + 1] & 0xFFFFFF; this.tokenType = this.offsetAndType[next] >> 24; } } But 2 reads look redundant, let's fix it…
  111. 111. 106 offset = [ 0, 5, 6, 7, 9, 11, 11, …, 1234 ] type = [ 1, 47, 47, 4, 4, 47, 5, …, 3 ] offsetAndType[i] = type[i] << 24 | offset[i] start = end end = offsetAndType[i + 1] & 0xFFFFFF; type = offsetAndType[i] >> 24;
  112. 112. 106 offset = [ 0, 5, 6, 7, 9, 11, 11, …, 1234 ] type = [ 1, 47, 47, 4, 4, 47, 5, …, 3 ] offsetAndType[i] = type[i] << 24 | offset[i] start = end end = offsetAndType[i + 1] & 0xFFFFFF; type = offsetAndType[i] >> 24; …
  113. 113. offset = [ 0, 5, 6, 7, 9, 11, 11, …, 1234 ] type = [ 1, 47, 47, 4, 4, 47, 5, …, 3 ] The first offset is always zero
  114. 114. offset = [ 0, 5, 6, 7, 9, 11, 11, …, 1234 ] type = [ 1, 47, 47, 4, 4, 47, 5, …, 3 ] Shift offsets to the left
  115. 115. offset = [ 5, 6, 7, 9, 11, 11, …, 1234 ] type = [ 1, 47, 47, 4, 4, 47, 5, …, 3 ] offsetAndType[i] = type[i] << 24 | offset[i + 1] (indices of the original offset array), that is, offsetAndType[i] = type[i] << 24 | offset[i] (indices of the shifted array); start = end; end = offsetAndType[i] & 0xFFFFFF; type = offsetAndType[i] >> 24; …
  116. 116. 110 class Scanner { ... next() { var next = this.currentToken + 1; this.currentToken = next; this.tokenStart = this.tokenEnd; this.tokenEnd = this.offsetAndType[next] & 0xFFFFFF; this.tokenType = this.offsetAndType[next] >> 24; } } Now we need just 
 one read
  117. 117. 111 class Scanner { ... next() { var next = this.currentToken + 1; this.currentToken = next; this.tokenStart = this.tokenEnd; next = this.offsetAndType[next]; this.tokenEnd = next & 0xFFFFFF; this.tokenType = next >> 24; } } -50% reads (~250k) 👌
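To see the one-read scanner work end to end, here is a self-contained sketch that walks the tokens of ".foo{}". The class name PackedScanner, the tokenCount field, the end-of-input handling and the numeric type codes are assumptions for illustration; the body of next() follows the slide.

```javascript
// Each entry stores the token's type in the high 8 bits and the token's END
// offset in the low 24 bits, so the start is simply the previous token's end.
class PackedScanner {
  constructor(offsetAndType, tokenCount) {
    this.offsetAndType = offsetAndType;
    this.tokenCount = tokenCount;
    this.currentToken = -1;
    this.tokenStart = 0;
    this.tokenEnd = 0;
    this.tokenType = 0;
  }

  next() {
    let next = this.currentToken + 1;
    this.currentToken = next;
    this.tokenStart = this.tokenEnd;

    if (next < this.tokenCount) {
      next = this.offsetAndType[next]; // the single array read per token
      this.tokenEnd = next & 0xFFFFFF;
      this.tokenType = next >> 24;     // fine here: all type codes are below 128
    } else {
      this.tokenType = 0;              // hypothetical end-of-input marker
    }
  }
}

// Hypothetical type codes: punctuation uses its char code, 1 means identifier.
// Tokens of ".foo{}" end at offsets 1, 4, 5 and 6.
const packed = Uint32Array.from([
  (46 << 24) | 1,  // '.'
  (1 << 24) | 4,   // 'foo'
  (123 << 24) | 5, // '{'
  (125 << 24) | 6  // '}'
]);

const scanner = new PackedScanner(packed, 4);
const spans = [];
for (scanner.next(); scanner.tokenType !== 0; scanner.next()) {
  spans.push([scanner.tokenType, scanner.tokenStart, scanner.tokenEnd]);
}
```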
  118. 118. Re-use
  119. 119. The scanner creates new arrays every time it parses a new string
  121. 121. New strategy • Preallocate a 16Kb buffer by default • Create a new buffer only if the current one is smaller than needed for parsing • Significantly improves performance, especially when parsing many small CSS fragments
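The reuse strategy might look like the sketch below. The names MIN_BUFFER_SIZE, sharedBuffer and getBuffer are hypothetical, and interpreting "16Kb" as 16 * 1024 array elements is an assumption; the slide does not say whether it means bytes or elements.

```javascript
// Assumption: "16Kb" is taken to mean 16 * 1024 elements
const MIN_BUFFER_SIZE = 16 * 1024;

// One buffer shared between parser runs instead of a fresh array per parse
let sharedBuffer = new Uint32Array(MIN_BUFFER_SIZE);

// Return a buffer able to hold one entry per source character plus one.
// A new array is allocated only when the current one is too small.
function getBuffer(sourceLength) {
  const needed = sourceLength + 1;
  if (sharedBuffer.length < needed) {
    sharedBuffer = new Uint32Array(needed);
  }
  return sharedBuffer;
}
```

A reused buffer still holds data from the previous run, so the scanner has to track its own token count rather than rely on the array being zero-filled.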
  122. 122. Current results. Benchmark on bootstrap.css v3.3.7 (146Kb), github.com/postcss/benchmark: CSSTree: 24 ms, then 13 ms, now 7 ms; Mensch: 31 ms; CSSOM: 36 ms; PostCSS: 38 ms; Rework: 81 ms; PostCSS Full: 100 ms; Gonzales: 175 ms; Stylecow: 176 ms; Gonzales PE: 214 ms; ParserLib: 414 ms
  123. 123. And still not the end… 😋 116
  124. 124. One more thing
  125. 125. CSSTree is not just about performance
  126. 126. New feature*: parsing and matching of CSS value syntax * Currently unique among CSS parsers
  127. 127. Example 120
  128. 128. CSS syntax reference: csstree.github.io/docs/syntax.html
  129. 129. CSS values validator: csstree.github.io/docs/validator.html
  130. 130. Your own validator in 8 lines of code: var csstree = require('css-tree'); var syntax = csstree.syntax.defaultSyntax; var ast = csstree.parse('… your css …'); csstree.walkDeclarations(ast, function(node) { if (!syntax.match(node.property.name, node.value)) { console.log(syntax.lastMatchError); } });
  131. 131. Some tools and plugins • csstree-validator – npm package + cli command • stylelint-csstree-validator – plugin for stylelint • gulp-csstree – plugin for gulp • SublimeLinter-contrib-csstree – plugin for Sublime Text • vscode-csstree – plugin for VS Code • csstree-validator – plugin for Atom
 More to come…
  132. 132. Conclusion
  133. 133. If you want your JavaScript to work as fast as C, make it look like C
  134. 134. Previous talks • CSSO – performance boost story (Russian): tinyurl.com/csso-speedup • CSS parsing (Russian): tinyurl.com/csstree-intro
  135. 135. github.com/csstree/csstree Your feedback is welcome
  136. 136. Questions? Roman Dvornov, @rdvornov, github.com/lahmatiy, rdvornov@gmail.com, github.com/csstree/csstree
