Don't Be Afraid of Abstract Syntax Trees

ASTs are an incredibly powerful tool for understanding and manipulating JavaScript. We'll explore this topic by looking at examples from ESLint, a pluggable static analysis tool, and Browserify, a client-side module bundler. Through these examples we'll see how ASTs can be great for analyzing and even for modifying your JavaScript. This talk should be interesting to anyone who regularly builds apps in JavaScript, whether on the client side or on the server side.

Published in: Technology

  1. 1. Don’t Be Afraid of ASTs Jamund Ferguson
  2. 2. Our Basic Plan 1. High-level overview 2. Static Analysis with ASTs 3. Transforming and refactoring 4. A quick look at the Mozilla Parser API (de-facto standard AST format)
  3. 3. An abstract syntax tree is basically a DOM for your code.
  4. 4. An AST makes it easier to inspect and manipulate your code with confidence.
  5. 5. { "type": "Program", "body": [ { "type": "VariableDeclaration", "kind": "var", "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "fullstack" }, "init": { "type": "BinaryExpression", "left": { "type": "Identifier", "name": "node" }, "operator": "+", "right": { "type": "Identifier", "name": "ui" } } } ] } ] }
  6. 6. var fullstack = node + ui;
  7. 7. { "type": "Program", "body": [ { "type": "VariableDeclaration", "kind": "var", "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "fullstack" }, "init": { "type": "BinaryExpression", "left": { "type": "Identifier", "name": "node" }, "operator": "+", "right": { "type": "Identifier", "name": "ui" } } } ] } ] }
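
To make the "DOM for your code" idea concrete, here is a minimal sketch that walks the tree shown above and collects every identifier name. The `walk` helper is a hypothetical illustration, not part of Esprima or any library mentioned in the talk, and the AST is typed in by hand rather than produced by a parser.

```javascript
// Hand-written copy of the AST for `var fullstack = node + ui;`.
var ast = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'fullstack' },
      init: {
        type: 'BinaryExpression',
        left: { type: 'Identifier', name: 'node' },
        operator: '+',
        right: { type: 'Identifier', name: 'ui' }
      }
    }]
  }]
};

// Hypothetical helper: call `visit` on every object that has a string
// `type` property, depth first, in property order.
function walk(node, visit) {
  if (Array.isArray(node)) {
    node.forEach(function (child) { walk(child, visit); });
  } else if (node && typeof node === 'object') {
    if (typeof node.type === 'string') { visit(node); }
    Object.keys(node).forEach(function (key) { walk(node[key], visit); });
  }
}

var identifiers = [];
walk(ast, function (node) {
  if (node.type === 'Identifier') { identifiers.push(node.name); }
});

console.log(identifiers); // [ 'fullstack', 'node', 'ui' ]
```

The same walk-and-match pattern is the core of most of the tools discussed in the rest of the deck.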
  8. 8. Things Built On ASTs • Syntax Highlighting • Code Completion • Static Analysis (aka JSLint, etc.) • Code Coverage • Minification • JIT Compilation • Source Maps • Compile to JS Languages So much more…
  9. 9. Static Analysis
  10. 10. It’s not just about formatting.
  11. 11. Fix a bug. Add a unit test. Fix a similar bug…
  12. 12. Write some really solid static analysis. Never write that same type of bug again.
  13. 13. function loadUser(req, res, next) { User.loadUser(function(err, user) { req.session.user = user; next(); }); } Bad Example We forgot to handle the error!
  14. 14. handle-callback-err 1. Each time a function is declared, check whether it has an error* parameter. 2. If so, set a count to 0. 3. Increment the count each time the error is used. 4. At the end of the function, warn if the count is still 0. * The parameter name can be defined by the user.
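
The counting idea above can be sketched without ESLint at all. The code below is an illustration under stated assumptions, not the actual rule: `id`, `countUses` and `checkFunction` are hypothetical helpers, the ASTs are hand-written, the warning text is only illustrative, and a nested function that redeclares the same parameter name is not treated as a separate scope.

```javascript
// Shorthand for building Identifier nodes by hand.
function id(name) { return { type: 'Identifier', name: name }; }

// Count references to the variable `name` anywhere under `node`.
function countUses(node, name) {
  if (Array.isArray(node)) {
    return node.reduce(function (sum, child) {
      return sum + countUses(child, name);
    }, 0);
  }
  if (!node || typeof node !== 'object') { return 0; }
  if (node.type === 'Identifier') { return node.name === name ? 1 : 0; }
  if (node.type === 'MemberExpression' && !node.computed) {
    // In `obj.err`, the property is not a use of the `err` variable.
    return countUses(node.object, name);
  }
  return Object.keys(node).reduce(function (sum, key) {
    return sum + countUses(node[key], name);
  }, 0);
}

// Returns a warning string when the error parameter is never used, else null.
function checkFunction(fn, errorName) {
  var hasErrorParam = fn.params.some(function (param) {
    return param.type === 'Identifier' && param.name === errorName;
  });
  if (!hasErrorParam) { return null; }
  return countUses(fn.body, errorName) === 0
    ? 'Expected error to be handled.'
    : null;
}

function callNext(args) {
  return {
    type: 'ExpressionStatement',
    expression: { type: 'CallExpression', callee: id('next'), arguments: args }
  };
}

// function (err, user) { next(); }
var unhandled = {
  type: 'FunctionExpression',
  params: [id('err'), id('user')],
  body: { type: 'BlockStatement', body: [callNext([])] }
};

// function (err, user) { if (err) { next(err); } }
var handled = {
  type: 'FunctionExpression',
  params: [id('err'), id('user')],
  body: { type: 'BlockStatement', body: [{
    type: 'IfStatement',
    test: id('err'),
    consequent: { type: 'BlockStatement', body: [callNext([id('err')])] },
    alternate: null
  }] }
};

console.log(checkFunction(unhandled, 'err')); // Expected error to be handled.
console.log(checkFunction(handled, 'err'));   // null
```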
  15. 15. Static Analysis • Complexity Analysis • Catching Mistakes • Consistent Style
  16. 16. History Lesson • 1995: JavaScript • 2002: JSLint started by Douglas Crockford • 2011: JSHint comes out as a fork of JSLint. Esprima AST parser released. • 2012: plato, escomplex, complexity-report • 2013: Nicholas Zakas releases ESLint. Marat Dulin releases JSCS.
  17. 17. My static analysis tool of choice is ESLint.
  18. 18. JSHint Mixes the rule engine with the parser
  19. 19. Examples
  20. 20. no-console return { "MemberExpression": function(node) { if (node.object.name === "console") { context.report(node, "Unexpected console statement."); } } }; https://github.com/eslint/eslint/blob/master/lib/rules/no-console.js
  21. 21. no-loop-func function checkForLoops(node) { var ancestors = context.getAncestors(); if (ancestors.some(function(ancestor) { return ancestor.type === "ForStatement" || ancestor.type === "WhileStatement" || ancestor.type === "DoWhileStatement"; })) { context.report(node, "Don't make functions within a loop"); } } return { "FunctionExpression": checkForLoops, "FunctionDeclaration": checkForLoops };
  22. 22. max-params var numParams = context.options[0] || 3; function checkParams(node) { if (node.params.length > numParams) { context.report(node, "This function has too many parameters ({{count}}). Maximum allowed is {{max}}.", { count: node.params.length, max: numParams }); } } return { "FunctionDeclaration": checkParams, "FunctionExpression": checkParams };
  23. 23. no-jquery function isjQuery(name) { return name === '$' || name === 'jquery' || name === 'jQuery'; } return { "CallExpression": function(node) { var name = node.callee && node.callee.name; if (isjQuery(name)) { context.report(node, 'Please avoid using jQuery here.'); } } };
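
The rules above all return an object keyed by node type and rely on a `context` supplied by the linter. The sketch below shows one way such visitor objects could be driven. `runRule` is a hypothetical stand-in, far simpler than ESLint's real traversal, and its `context` implements only the two methods used on these slides (`report` and `getAncestors`). The two rules are condensed from the no-console and no-loop-func slides.

```javascript
// Hypothetical mini runner: walks the tree, calls the visitor registered for
// each node type, and tracks the chain of ancestors on the way down.
function runRule(createRule, ast) {
  var messages = [];
  var ancestors = [];
  var context = {
    report: function (node, message) { messages.push(message); },
    getAncestors: function () { return ancestors.slice(); }
  };
  var visitors = createRule(context);

  function traverse(node) {
    if (Array.isArray(node)) { node.forEach(traverse); return; }
    if (!node || typeof node.type !== 'string') { return; }
    if (visitors[node.type]) { visitors[node.type](node); }
    ancestors.push(node);
    Object.keys(node).forEach(function (key) { traverse(node[key]); });
    ancestors.pop();
  }

  traverse(ast);
  return messages;
}

function noConsole(context) {
  return {
    MemberExpression: function (node) {
      if (node.object.name === 'console') {
        context.report(node, 'Unexpected console statement.');
      }
    }
  };
}

function noLoopFunc(context) {
  function checkForLoops(node) {
    var inLoop = context.getAncestors().some(function (ancestor) {
      return ancestor.type === 'ForStatement' ||
        ancestor.type === 'WhileStatement' ||
        ancestor.type === 'DoWhileStatement';
    });
    if (inLoop) { context.report(node, "Don't make functions within a loop"); }
  }
  return { FunctionExpression: checkForLoops, FunctionDeclaration: checkForLoops };
}

// Hand-written AST for: while (x) { console.log(function () {}); }
var ast = {
  type: 'Program',
  body: [{
    type: 'WhileStatement',
    test: { type: 'Identifier', name: 'x' },
    body: { type: 'BlockStatement', body: [{
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: {
          type: 'MemberExpression',
          computed: false,
          object: { type: 'Identifier', name: 'console' },
          property: { type: 'Identifier', name: 'log' }
        },
        arguments: [{
          type: 'FunctionExpression',
          id: null,
          params: [],
          body: { type: 'BlockStatement', body: [] }
        }]
      }
    }] }
  }]
};

console.log(runRule(noConsole, ast));  // [ 'Unexpected console statement.' ]
console.log(runRule(noLoopFunc, ast)); // [ "Don't make functions within a loop" ]
```

Keeping the traversal separate from the rules is the design point: each rule only declares which node types it cares about.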
  24. 24. More Complex Rules • indent • no-extend-native • no-next-next • security • internationalization
  25. 25. Other Areas for Static Analysis Code complexity and visualization is another area where static analysis is really useful. Plato is an exciting start, but I believe there are tons of more interesting things that can be done in this area.
  26. 26. Recap • Static analysis can help you catch real bugs and keep your code maintainable • ESLint and JSCS both use ASTs to inspect your code, which makes it easy to add new rules cleanly • Static analysis can also help you manage code complexity • What exactly does a for loop sound like?
  27. 27. Transforms
  28. 28. Sometimes you want to step into the future, but something is keeping you in the past.
  29. 29. Maybe it’s Internet Explorer
  30. 30. Maybe it’s the size of your code base
  31. 31. ASTs to the rescue!
  32. 32. Tools like falafel and recast give you an API to manipulate an AST and then convert that back into source code.
  33. 33. Two Types of AST Transformations Regenerative: Regenerate the full file from the AST, often losing comments and non-essential formatting. Fine for code not read by humans (e.g. browserify transforms). Partial-source transformation: Regenerate only the parts of the source that have changed based on the AST modifications. Nicer for one-time changes in source.
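
A partial-source transformation can be sketched with nothing more than character offsets. The example below assumes each node carries a `range` of `[start, end)` offsets into the original source (location data of the kind a parser can attach). Here the node and its range are written by hand, and `applyEdits` is a hypothetical helper rather than an API from falafel or recast.

```javascript
var source = 'var fullstack = node + ui; // keep this comment';

// Hand-written stand-in for the single node we want to change.
var uiNode = { type: 'Identifier', name: 'ui', range: [23, 25] };

// Splice replacements into the original text. Edits are applied from the end
// of the source backwards so that earlier offsets stay valid.
function applyEdits(text, edits) {
  return edits
    .slice()
    .sort(function (a, b) { return b.range[0] - a.range[0]; })
    .reduce(function (result, edit) {
      return result.slice(0, edit.range[0]) +
        edit.replacement +
        result.slice(edit.range[1]);
    }, text);
}

var output = applyEdits(source, [
  { range: uiNode.range, replacement: 'browserify' }
]);

console.log(output); // var fullstack = node + browserify; // keep this comment
```

Because everything outside the edited range is copied through untouched, the comment and spacing survive, which is the property that makes this style suitable for one-time changes to source that people read.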
  34. 34. Build a Simple Browserify Transform
  35. 35. var fullstack = node + ui; becomes var fullstack = node + browserify;
  36. 36. 4 Steps 1. Buffer up the stream of source code 2. Convert the source into an AST 3. Transform the AST 4. Re-generate and output the source
  37. 37. Step 1 Use through to grab the source code var through = require('through'); var buffer = []; return through(function write(data) { buffer.push(data); }, function end () { var source = buffer.join(''); });
  38. 38. Step 2 Use falafel to create an AST var falafel = require('falafel'); function end () { var source = buffer.join(''); var out = falafel(source, parse).toString(); }
  39. 39. Step 3 Use falafel to transform the AST function parse(node) { if (node.type === 'Identifier' && node.name === 'ui') { node.update('browserify'); } }
  40. 40. Step 4 Stream the source with through and close the stream function end () { var source = buffer.join(''); var out = falafel(source, parse).toString(); this.queue(out); this.queue(null); // end the stream }
  41. 41. var through = require('through'); var falafel = require('falafel'); module.exports = function() { var buffer = []; return through(function write(data) { buffer.push(data); }, function end() { var source = buffer.join(''); var out = falafel(source, parse).toString(); this.queue(out); this.queue(null); // close the stream }); }; function parse(node) { if (node.type === 'Identifier' && node.name === 'ui') { node.update('browserify'); } }
  42. 42. It Works! browserify -t ./ui-to-browserify.js code.js (function e(t,n,r){function s(o,u){if(!n[o]){if(!t[o]){var a=typeof require=="var fullstack = node + browserify; },{}]},{},[1]);
  43. 43. Lots of code to do something simple?
  44. 44. Probably, but… It will do exactly what is expected 100% of the time.
  45. 45. And it’s a building block for building a bunch of cooler things.
  46. 46. What sort of cooler things?
  47. 47. How about performance?
  48. 48. V8 doesn’t do it, but there’s nothing stopping you*.
  49. 49. *Except it’s hard.
  50. 50. A Basic Map/Filter var a = [1, 2, 3]; var b = a.filter(function(n) { return n > 1; }).map(function(k) { return k * 2; });
  51. 51. Faster Like This var a = [1, 2, 3]; var b = []; for (var i = 0; i < a.length; i++) { if (a[i] > 1) { b.push(a[i] * 2); } }
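
A rewrite like this is only safe if both forms produce the same result. A minimal sketch of that check follows; `viaMapFilter` and `viaLoop` are hypothetical wrapper names around the two snippets from the slides. It checks equivalence only and does not measure speed.

```javascript
// The chained version from the "A Basic Map/Filter" slide.
function viaMapFilter(a) {
  return a.filter(function (n) {
    return n > 1;
  }).map(function (k) {
    return k * 2;
  });
}

// The hand-fused loop from the "Faster Like This" slide.
function viaLoop(a) {
  var b = [];
  for (var i = 0; i < a.length; i++) {
    if (a[i] > 1) {
      b.push(a[i] * 2);
    }
  }
  return b;
}

console.log(viaMapFilter([1, 2, 3])); // [ 4, 6 ]
console.log(viaLoop([1, 2, 3]));      // [ 4, 6 ]
```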
  52. 52. A Basic Recast Script var fs = require('fs'); var recast = require('recast'); var code = fs.readFileSync('code.js', 'utf-8'); var ast = recast.parse(code); var faster = transform(ast); var output = recast.print(faster).code;
  53. 53. function transform(ast) { var transformedAST = new MapFilterEater({ body: ast.program.body }).visit(ast); return transformedAST; } var Visitor = recast.Visitor; var MapFilterEater = Visitor.extend({ init: function(options) {}, visitForStatement: function(ast) {}, visitIfStatement: function(ast) {}, visitCallExpression: function(ast) {}, visitVariableDeclarator: function(ast) {} });
  54. 54. How Does it Work? 1. Move the right side of the b declaration into a for loop 2. Set b = [] 3. Place the .filter() contents inside of an if statement 4. Unwrap the .map contents and .push() them into b 5. Replace all of the local counters with a[_i]
  55. 55. And Voilà… var a = [1, 2, 3]; var b = []; for (var i = 0; i < a.length; i++) { if (a[i] > 1) { b.push(a[i] * 2); } }
  56. 56. Worth the effort? YES!
  57. 57. The most well-read documentation for how to engineer your app is the current codebase.
  58. 58. If you change your code, you can change the future.
  59. 59. Knowledge What is an AST and what does it look like?
  60. 60. Parser 1. Read your raw JavaScript source. 2. Parse out every single thing that’s happening. 3. Return an AST that represents your code
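
The three parser steps can be illustrated with a deliberately tiny toy. The sketch below understands only statements of the form `var name = a + b + ...;` and is nowhere near a real JavaScript parser such as Esprima. `tokenize` and `parse` are hypothetical names, and the tokenizer silently skips any character it does not recognize.

```javascript
// Step 1: read the raw source and split it into tokens.
function tokenize(source) {
  return source.match(/[A-Za-z_$][A-Za-z0-9_$]*|[=+;]/g) || [];
}

// Steps 2 and 3: consume the tokens and return a Parser-API-style tree.
function parse(source) {
  var tokens = tokenize(source);
  var pos = 0;

  function next(expected) {
    var token = tokens[pos++];
    if (token === undefined || (expected && token !== expected)) {
      throw new SyntaxError('Unexpected token: ' + token);
    }
    return token;
  }

  function identifier() {
    var name = next();
    if (!/^[A-Za-z_$]/.test(name)) {
      throw new SyntaxError('Expected an identifier but found: ' + name);
    }
    return { type: 'Identifier', name: name };
  }

  // Left-associative chain of `+` operations.
  function expression() {
    var left = identifier();
    while (tokens[pos] === '+') {
      next('+');
      left = {
        type: 'BinaryExpression',
        operator: '+',
        left: left,
        right: identifier()
      };
    }
    return left;
  }

  next('var');
  var id = identifier();
  next('=');
  var init = expression();
  next(';');

  return {
    type: 'Program',
    body: [{
      type: 'VariableDeclaration',
      kind: 'var',
      declarations: [{ type: 'VariableDeclarator', id: id, init: init }]
    }]
  };
}

var tree = parse('var fullstack = node + ui;');
console.log(JSON.stringify(tree, null, 2));
```

For `var fullstack = node + ui;` this yields the same tree shape as the JSON shown elsewhere in the deck, which is a useful sanity check when reading real parser output.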
  61. 61. Esprima is a very popular* parser that converts your code into an abstract syntax tree. *FB recently forked it to add support for ES6 and JSX
  62. 62. Parsers narcissus ZeParser Treehugger Uglify-JS Esprima Acorn
  63. 63. Esprima follows the Mozilla Parser API which is a well documented AST format used internally by Mozilla (and now by basically everyone else*)
  64. 64. var fullstack = node + ui;
  65. 65. { "type": "Program", "body": [ { "type": "VariableDeclaration", "kind": "var", "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "fullstack" }, "init": { "type": "BinaryExpression", "left": { "type": "Identifier", "name": "node" }, "operator": "+", "right": { "type": "Identifier", "name": "ui" } } } ] } ] }
  66. 66. { "type": "Program", "body": [ { "type": "VariableDeclaration", "kind": "var", "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "fullstack" }, "init": { "type": "BinaryExpression", "left": { "type": "Identifier", "name": "node" }, "operator": "+", "right": { "type": "Identifier", "name": "ui" } } } ] } ] }
  67. 67. Node Types SwitchCase (1) Property (1) Literal (1) Identifier (1) Declaration (3) Expression (14) Statement (18)
  68. 68. Expression Types • FunctionExpression • MemberExpression • CallExpression • NewExpression • ConditionalExpression • LogicalExpression • UpdateExpression • AssignmentExpression • BinaryExpression • UnaryExpression • SequenceExpression • ObjectExpression • ArrayExpression • ThisExpression
  69. 69. Statement Types • DebuggerStatement • ForInStatement • ForStatement • DoWhileStatement • WhileStatement • CatchClause • TryStatement • ThrowStatement • ReturnStatement • SwitchStatement • WithStatement • ContinueStatement • BreakStatement • LabeledStatement • IfStatement • ExpressionStatement • BlockStatement • EmptyStatement
  70. 70. Debugging ASTs
  71. 71. • When debugging, console.log(ast) will not print a large nested AST properly. Instead you can use util.inspect: var util = require('util'); var tree = util.inspect(ast, { depth: null }); console.log(tree); • When transforming code, start with the AST you want and then work backward. • Often this means pasting code into the Esprima online visualization tool or just outputting the trees into JS files and manually diffing them.
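
The difference is easy to see on the small AST used throughout the deck. In this sketch the tree is hand-written, and `shallow` approximates what `console.log(ast)` prints, since Node's default inspection depth stops a couple of levels down.

```javascript
var util = require('util');

// Hand-written AST for `var fullstack = node + ui;`.
var ast = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'fullstack' },
      init: {
        type: 'BinaryExpression',
        left: { type: 'Identifier', name: 'node' },
        operator: '+',
        right: { type: 'Identifier', name: 'ui' }
      }
    }]
  }]
};

// Default depth: deeper levels are collapsed into placeholders like [Array].
var shallow = util.inspect(ast);

// depth: null removes the limit, so every nested node is printed.
var deep = util.inspect(ast, { depth: null });

console.log(shallow);
console.log(deep);
```

`JSON.stringify(ast, null, 2)` is another common option when the tree contains only plain data.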
  72. 72. Oftentimes it helps to print out the code representation of a single node. In recast you can do: var source = recast.prettyPrint(ast, { tabWidth: 2 }).code; In ESLint you can get the source of the current node with: var source = context.getSource(node);
  73. 73. ASTs can turn your code into play-dough
  74. 74. Totally worth the effort!
  75. 75. the end @xjamundx
  76. 76. Related: - http://pegjs.majda.cz/ - https://www.kickstarter.com/projects/michaelficarra/make-a-better-coffeescript-compiler - http://coffeescript.org/documentation/docs/grammar.html - https://github.com/padolsey/parsers-built-with-js - https://github.com/zaach/jison - https://github.com/substack/node-falafel - esprima-based code modifier - https://github.com/substack/node-burrito - uglify-based AST walking code-modifier - https://github.com/jscs-dev/node-jscs/blob/e745ceb23c5f1587c3e43c0a9cfb05f5ad86b5ac/lib/js-file.js - JSCS’s way of walking the AST - https://www.npmjs.org/package/escodegen - converts an AST into real code again - https://www.npmjs.org/package/ast-types - esprima-ish parser - http://esprima.org/demo/parse.html - the most helpful tool - https://github.com/RReverser/estemplate - AST-based search and replace - https://www.npmjs.org/package/aster - build system thing Technical Papers: http://aosd.net/2013/escodegen.html Videos / Slides: http://slidedeck.io/benjamn/fluent2014-talk http://vimeo.com/93749422 https://speakerdeck.com/michaelficarra/spidermonkey-parser-api-a-standard-for-structured-js-representations https://www.youtube.com/watch?v=fF_jZ7ErwUY https://speakerdeck.com/ariya/bringing-javascript-code-analysis-to-the-next-level Just in Time Compilers: http://blogs.msdn.com/b/ie/archive/2012/06/13/advances-in-javascript-performance-in-ie10-and-windows-8.aspx https://blog.mozilla.org/luke/2014/01/14/asm-js-aot-compilation-and-startup-performance/ https://blog.indutny.com/4.how-to-start-jitting Podcasts: http://javascriptjabber.com/082-jsj-jshint-with-anton-kovalyov/ http://javascriptjabber.com/054-jsj-javascript-parsing-asts-and-language-grammar-w-david-herman-and-ariya-hidayat/
  77. 77. RANDOM EXTRA SLIDES
  78. 78. Static analysis tools like ESLint and JSCS provide an API to let you inspect an AST to make sure it’s following certain patterns.
  79. 79. function isEmptyObject( obj ) { for ( var name in obj ) { return false; } return true; }
  80. 80. static analysis > unit testing > functional testing
  81. 81. function loadUser(req, res, next) { User.loadUser(function(err, user) { if (err) { next(err); } req.session.user = user; next(); }); } Another Example
  82. 82. Abstract Christmas Trees [Figure labels: Program, VariableDeclaration, VariableDeclarator, FunctionExpression, ExpressionStatement, Identifier, Identifier]
  83. 83. Tools like falafel and recast give you an API to manipulate an AST and then convert that back into source code.
