5. Case study: Cython
● Compiles Python to C using libpython
● Produces a single binary
● Simple FFI
○ Though not as simple as Chicken Scheme
● Superset of Python
○ Pure Python: OK
○ Optional types for more efficient code: OK
6. Me vs. team of PhDs over years?
Not really trying to compete on speed
7. AOT-compilation can help any language
● Simplify deployment & packaging
● Simplify FFIs
● Make performance more predictable
○ Compared to JIT interpreters
8. AOT-compilation and dynamic languages cont.
● With additional sacrifices to dynamism e.g.:
○ No dynamic imports
○ Optional types
● Can get:
○ Relatively efficient generated code with little effort
○ Smaller binaries/output
9. My background
● Working on a Scheme interpreter in 2017
● Wanted to add a compiler backend
● Process made trivial by using
○ Existing parser
○ Interpreter runtime as a library
11. Nothing like this exists for JavaScript (that I know of)
Source code → 3rd-party JavaScript parser → AST → Fancy new compiler → C++ & V8 → Final product: native code
13. Jsc v0
● Proof-of-concept ES5 compiler
○ Written in Rust, uses an existing ES5 parser frontend
● Targets native Node addons in C++
○ Uses node-gyp
● Uses a 1-line entrypoint to load and run the addon
● Completely type unaware
○ Not even leaf-type propagation
14. Supported functionality
● Functions and function calls
● Basic tail-call optimization
● Var declarations
● Many primitive operators
● Object, array, number, string, boolean and null literals
● Access to Node builtins via `global`
● Basic source-to-source comments for debugging generated output
16. Example
function fib(n, a, b) {
  if (n == 0) { return a; }
  if (n == 1) { return b; }
  return fib(n - 1, b, a + b);
}
function main() {
  console.log(fib(50, 0, 1));
}
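The "basic tail-call optimization" listed above can be illustrated on this example. The sketch below is a hand-written JavaScript equivalent, not jsc's actual output: `fibLoop` is a hypothetical name, and it assumes the optimization turns the self-call in tail position into a jump back to the top of the function with the parameters reassigned.

```javascript
// Tail-recursive original, as on the slide.
function fib(n, a, b) {
  if (n == 0) { return a; }
  if (n == 1) { return b; }
  return fib(n - 1, b, a + b);
}

// Hypothetical loop form: the tail call becomes "reassign parameters, repeat".
function fibLoop(n, a, b) {
  while (true) {
    if (n == 0) { return a; }
    if (n == 1) { return b; }
    // Evaluate every new argument before overwriting any parameter,
    // because the new b depends on the old a.
    const nextN = n - 1;
    const nextA = b;
    const nextB = a + b;
    n = nextN;
    a = nextA;
    b = nextB;
  }
}

console.log(fib(50, 0, 1), fibLoop(50, 0, 1)); // prints 12586269025 12586269025
```

The loop form uses constant stack space, which is why the optimization matters for a compiler that maps each JavaScript function to a native function.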
20. Analysis: the good
● It works!
● V8 makes compiling to C++ dead simple!
○ Object representation: V8!
■ String::NewFromUtf8(isolate, "Boolean")
■ Number::New(isolate, 0)
○ Memory management: V8!
● Source-to-source commenting is helpful
○ // return a;
args.GetReturnValue().Set(a_2);
return;
21. Analysis: the bad
● Relatively bloated code
○ 5 LOC JavaScript -> 54 LOC C++
● TONS of redundant casting, moves
● What is going on with the Boolean check???
○ Local<Context> ctx_5 = isolate->GetCurrentContext();
Local<Object> global_6 = ctx_5->Global();
Local<Function> Boolean_7 = Local<Function>::Cast(global_6->Get(String::NewFromUtf8(isolate, "Boolean")));
...
Local<Value> result_13 = Boolean_7->Call(Null(isolate), 1, argv_12);
if (result_13->ToBoolean()->Value()) {
// return a;
args.GetReturnValue().Set(a_2);
return;
}
22. Challenges: Rust
● First time writing it 🤦
○ Borrow checker took a while
○ But JavaScripters would love the syntax
○ Didn’t make use of traits, but they are super expressive
● Parser frontend library wasn’t super mature
● Parser frontend library didn’t support TypeScript/Flow
○ Limits specializing code generation
23. Challenges: V8 documentation
● Better than expected!
● Not a ton of people writing/documenting
○ How to array literal?
○ How to convert std::string -> v8::String, and vice versa?
● What is built-in vs. not?
○ E.g. Value::Equals, Value::StrictEquals, String::Concat
○ But no Value::Add, Number::Add
24. Frustrated by...
● Lack of Rust knowledge
● No Rust TypeScript parser
● Bad design of code generator
25. Jsc v0.5
● Proof-of-concept TypeScript compiler
○ Written in TypeScript, uses the TypeScript compiler API
● Targets native Node addons in C++
○ Uses node-gyp
● Uses a 1-line entrypoint to load and run the addon
26. Two major changes for generated code quality
1. Destination-driven code generation
2. Basic type awareness
27. 1. Destination-driven code generation
● By Kent Dybvig (later at Cisco) for Chez Scheme
● Parents pass destination of output to child
● Easy to implement
● Highly space-efficient
● Single-pass
● Adapted by V8 team
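A minimal sketch of the idea, assuming a toy AST and data destinations only (the full technique also passes control destinations, which are left out here). The node shapes, `makeCompiler`, and the `tmp_N` naming are all hypothetical and are not taken from jsc.

```javascript
// A destination is either a variable name (store the value there)
// or null (the value is not needed).
function makeCompiler() {
  let counter = 0;
  const out = [];
  const fresh = () => `tmp_${counter++}`;

  function compile(node, dest) {
    switch (node.type) {
      case "num":
        if (dest !== null) out.push(`${dest} = ${node.value};`);
        break;
      case "var":
        if (dest !== null && dest !== node.name) out.push(`${dest} = ${node.name};`);
        break;
      case "binop": {
        // The parent decides where each operand lands, then combines
        // them directly into its own destination.
        const left = dest === null ? null : fresh();
        const right = dest === null ? null : fresh();
        compile(node.left, left);
        compile(node.right, right);
        if (dest !== null) out.push(`${dest} = ${left} ${node.op} ${right};`);
        break;
      }
      case "assign":
        // The assigned variable is the destination of the right-hand side,
        // so no extra copy from a result temporary is needed.
        compile(node.value, node.name);
        if (dest !== null && dest !== node.name) out.push(`${dest} = ${node.name};`);
        break;
      default:
        throw new Error(`unsupported node type: ${node.type}`);
    }
  }
  return { compile, out };
}

const { compile, out } = makeCompiler();
compile({
  type: "assign",
  name: "x",
  value: {
    type: "binop",
    op: "+",
    left: { type: "var", name: "a" },
    right: { type: "num", value: 1 },
  },
}, null);
console.log(out.join("\n"));
```

Because the assignment hands `x` down as the destination, the addition writes straight into `x` in one pass over the tree; a generator that let each child pick its own result location would need an extra copy at every level.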
28. 2. Basic type awareness
● Taking advantage of obvious type information
○ Leaf-type propagation only
● Not even tapping TypeScript yet
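A rough sketch of what leaf-type propagation could look like, assuming a toy AST: only literals carry a known type, and everything else is "unknown". `leafType`, `emitOperand`, and `emitPlus` are hypothetical names, and the specialized output for two number literals is an illustration rather than what jsc emits. `genericPlus`, `Number::New`, and `String::NewFromUtf8` are the calls that appear on the slides.

```javascript
// Returns the statically obvious type of a leaf, or "unknown".
function leafType(node) {
  if (node.type !== "literal") return "unknown";
  if (node.value === null) return "null";
  return typeof node.value; // "number", "string" or "boolean"
}

// Emits a C++-like operand for a leaf node.
function emitOperand(node) {
  if (node.type === "identifier") return node.name;
  switch (leafType(node)) {
    case "number":
      return `Number::New(isolate, ${node.value})`;
    case "string":
      return `String::NewFromUtf8(isolate, ${JSON.stringify(node.value)})`;
    default:
      throw new Error("operand kind not covered by this sketch");
  }
}

// Emits a C++-like expression for `left + right`.
function emitPlus(left, right) {
  if (leafType(left) === "number" && leafType(right) === "number") {
    // Both types are known up front, so no runtime dispatch is needed.
    return `Number::New(isolate, ${left.value} + ${right.value})`;
  }
  // Otherwise fall back to the generic runtime helper.
  return `genericPlus(isolate, ${emitOperand(left)}, ${emitOperand(right)})`;
}

console.log(emitPlus({ type: "literal", value: 1 }, { type: "literal", value: 2 }));
console.log(emitPlus({ type: "identifier", name: "a" }, { type: "literal", value: 1 }));
```

Only the first call avoids the generic helper; as soon as one operand is an identifier, nothing is known about its type without consulting the TypeScript checker.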
29. Supported functionality
● Function declarations and function calls
● Basic tail-call optimization
● Var declarations
● Few primitive operators
● Number, string, boolean and null literals
● Access to Node builtins via `global`
● Static imports
31. Example
function fib(n, a, b) {
  if (n == 0) { return a; }
  if (n == 1) { return b; }
  return fib(n - 1, b, a + b);
}
function main() {
  console.log(fib(50, 0, 1));
}
35. Analysis: the good
● It works!
● DDCG & type propagation seriously reduce bloat
○ Down to 30 LOC C++
● Better boolean checks
○ Local<Boolean> sym_anon_9 = args[0]->StrictEquals(sym_rhs_11) ? True(isolate) : False(isolate);
if (sym_anon_9->IsTrue()) {
● Complexity of common operations hidden by inline functions (in lib.cc)
○ Local<Value> sym_arg_17 = genericMinus(isolate, args[0], sym_rhs_19);
Local<Value> sym_arg_21 = genericPlus(isolate, args[1], args[2]);
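The contents of lib.cc are not shown, so the following is only a model, in JavaScript, of the behavior a helper like `genericPlus` has to reproduce given that V8 offers no `Value::Add`. `genericPlusModel` is a hypothetical name, and the sketch assumes primitive operands only (no objects, symbols, or BigInts).

```javascript
// Model of JavaScript's `+` for primitive operands:
// string concatenation if either side is a string, numeric addition otherwise.
function genericPlusModel(a, b) {
  if (typeof a === "string" || typeof b === "string") {
    return String(a) + String(b);
  }
  return Number(a) + Number(b);
}

console.log(genericPlusModel(1, 2));     // prints 3
console.log(genericPlusModel("a", 1));   // prints a1
console.log(genericPlusModel(true, 1));  // prints 2
```

A C++ version would make the same two-way decision using the type checks and conversions on `v8::Value`, which is the complexity the inline helper hides from the generated code.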
36. Analysis: the bad
● What’s with copying args on every function?
○ Isolate* isolate = _args.GetIsolate();
std::vector<Local<Value>> args(_args.Length());;
for (int i = 0; i < _args.Length(); i++) args[i] = _args[i];
● Regression: no source-to-source commenting
● Reduced syntax support in TypeScript port
● Tracking types increases complexity of the compiler
37. Challenges: TypeScript API documentation
● Better than expected
● Could use more!
● E.g. behavior of createProgram and getSourceFiles???
38. Performance?
● Trivial benchmarks only (e.g. fib)
○ Numbers not worth sharing
● First iteration non-TCO was awful
● First iteration TCO was on par with Node
● Second iteration TCO and non-TCO on par with Node
○ Not sure why…
● Need to build out syntax support for more complex microbenchmarks
○ MD5, SHA-1, N Queens etc.
39. What next?
● More syntax support
● Measuring binary size
● Performance benchmarking
● Specialized (unboxed) blocks
● Seamless FFI (embedded C++?)
● Tree-shaking
● Self-hosting
● Own (Node API-compatible) runtime?
● Blogging about ^^^