Your SlideShare is downloading. ×
A(n abridged) tour of the Rust compiler [PDX-Rust March 2014]
Upcoming SlideShare
Loading in...5

Thanks for flagging this SlideShare!

Oops! An error has occurred.

Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

A(n abridged) tour of the Rust compiler [PDX-Rust March 2014]


Published on

Some notes on the internals of the Rust compiler.

Some notes on the internals of the Rust compiler.

Published in: Software

  • Be the first to comment

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

No notes for slide


  • 1. A(n abridged) tour of the Rust compiler Tom Lee @tglee Saturday, March 29, 14
  • 2. Brace yourselves. • I’ve contributed some code to Rust • Mostly fixing crate metadata quirks • Haven’t touched most of the stuff I’m covering today • Sorry in advance for any lies I tell you Saturday, March 29, 14
  • 3. What is this? • We’re digging into the innards of Rust’s compiler. • Along the way, I’ll cover some “compilers 101” stuff that may not be common knowledge. • Not really covering any of the runtime stuff -- data representation, garbage collection, etc. Saturday, March 29, 14
  • 4. Intro to compilers • Most compilers follow a familiar pattern: scan, parse, generate code • A scanner converts raw source code into a stream of tokens. • A parser converts the stream of tokens into an intermediate representation. • A code generator emits the target code (e.g. bytecode, x86_64 assembly, etc.) Saturday, March 29, 14
  • 5. Intro to compilers (cont.) • Real-world compilers do other stuff too. • Semantic analysis often follows the parse phase. For example, if the language is statically typed, a type checking step might happen here. • Often one or more optimization steps. • The compiler may also be kind enough to invoke external tools on your behalf. Saturday, March 29, 14
  • 6. A 10,000 foot view of Rust’s compiler • Scan • Parse • Semantic Analysis • (Optimizations occur somewhere here) • Generate target code • Link object files into an ELF/PE/Mach-O binary. Saturday, March 29, 14
  • 7. A 10,000 ft view (cont.) • Where does it all begin? • src/librustc/ main(...) and run_compiler(...) • src/librustc/driver/ see compile_input and all the phase_X methods like phase_1_parse_input, phase_2_configure_and_expand, etc. Saturday, March 29, 14
  • 8. Scanners • Raw source code goes in e.g. if (should_buy(goat_simulator)) { ... } • Tokens come out e.g. [IF, LPAREN, ID(“should_buy”), LPAREN, ID(“goat_simulator”), RPAREN, RPAREN, LBRACE, ..., RBRACE] • This simple translation makes the parser’s job easier. Saturday, March 29, 14
  • 9. Rust’s Scanner • Fully contained within libsyntax • src/libsyntax/parse/ (another name for scanning is “lexical analysis”, ergo “lexer”) Refer to the Reader trait • src/libsyntax/parse/ Tokens and keywords defined here. Saturday, March 29, 14
  • 10. Parsers •Nom on a token stream from the scanner/lexer e.g. [IF, LPAREN, ID(“should_buy”), LPAREN, ID(“goat_simulator”), RPAREN, RPAREN, LBRACE, ..., RBRACE] •Apply grammar rules to convert the token stream into an Abstract Syntax Tree (or some other representative data structure) Saturday, March 29, 14
  • 11. Abstract Syntax Trees • Or “AST” • Data structure representing the syntactic structure of your source program. • Abstract in that it omits unnecessary crap (parentheses, quotes, etc.) Saturday, March 29, 14
  • 12. Abstract Syntax Trees (cont.) If( Call( Id(“should_buy”), [Id(“goat_simulator”)]), [...]) example AST for input “if (should_buy(goat_simulator)) { ... }” Saturday, March 29, 14
  • 13. Rust’s Parser and AST • Also fully contained within libsyntax • src/libsyntax/ the Expr_ enum is an interesting starting point, containing the AST representations of most Rust expressions. • src/libsyntax/parse/ see parse_crate_from_file • src/libsyntax/parse/ Most of the interesting stuff is in impl<‘a> Parser<‘a>. Maybe check out parse_while_expr, for example. Saturday, March 29, 14
  • 14. Semantic Analysis • Language- & implementation-specific, but there are common themes. • Typically performed by analyzing and/or annotating the AST (directly or indirectly). • Statically typed languages often do type checking etc. here. Saturday, March 29, 14
  • 15. Semantic Analysis in Rust • Here we apply all the weird & wonderful rules that make Rust unique. • Mostly handled by src/librustc/middle/*.rs • Name resolution ( • Type checking (typeck/*.rs) • Much, much more... see phase_3_run_analysis_passes in compile_input for the full details Saturday, March 29, 14
  • 16. Semantic Analysis in Rust: Name Resolution • src/librustc/middle/ • Resolve names “what does this name mean in this context?” • Type? Function? Local variable? • Rust has two namespaces: types and values this is why you can e.g. refer to the str type and the str module at the same time • resolve_item seems to be the real workhorse here. Saturday, March 29, 14
  • 17. Semantic Analysis in Rust: Type Checking • src/librustc/middle/typeck/ see check_crate • Infer and unify types. • Using inferred & explicit type info, ensure that the input program satisfies all of Rust’s type rules. Saturday, March 29, 14
  • 18. Semantic Analysis in Rust: Rust-y Stuff •A borrow checking pass enforces memory safety rules see src/librustc/middle/borrowck/ for details •An effect checking pass to ensure that unsafe operations occur in unsafe contexts. see src/librustc/middle/ •A kind checking pass enforces special rules for built-in traits like Send and Drop see src/librustc/middle/ Saturday, March 29, 14
  • 19. Semantic Analysis in Rust: More Rust-y Stuff •A compute moves pass to determine whether the use of a value will result in a move in a given expression. Important to enforce rules on non-copyable (”linear”) types. see src/librustc/middle/ Saturday, March 29, 14
  • 20. Code Generators • Takes an AST as input e.g. If(Call(Id(“should_buy”), [Id(“goat_simulator”)]), [...]) • Emits some sort of target code e.g. (some made up bytecode) LOAD goat_simulator CALL should_buy JMPIF if_stmt_body_addr Saturday, March 29, 14
  • 21. Rust’s Code Generator • First, Rust translates the analyzed, type- checked AST into an LLVM module. This is phase_4_translate_to_llvm • src/librustc/middle/trans/ trans_crate is a good place to start Saturday, March 29, 14
  • 22. Rust’s Code Generator (cont.) • src/librustc/back/ • Passes are run over the LLVM module to write the target code to disk this is phase_5_run_llvm_passes in, which calls the appropriate stuff on rustc::back::link • We can tweak the output format using command line options: assembly code, LLVM bitcode files, object files, etc. see build_session_options and the OutputType* variants as used in Saturday, March 29, 14
  • 23. Rust’s Code Generator (cont.) • If you’re trying to build a native executable, the previous step will produce object files... • ... but LLVM won’t link our object files into a(n ELF/PE) binary. this is phase_6_link_output • Rust calls out to the system’s cc program to do the link step. see link_binary, link_natively and get_cc_prog in src/librustc/back/ Saturday, March 29, 14