A(n abridged) tour of the Rust compiler [PDX-Rust March 2014]

Tom Lee, Senior Engineer at New Relic
A(n abridged) tour of
the Rust compiler
Tom Lee
@tglee
Saturday, March 29, 14
Brace yourselves.
• I’ve contributed some code to Rust
• Mostly fixing crate metadata quirks
• Haven’t touched most of the stuff I’m
covering today
• Sorry in advance for any lies I tell you
What is this?
• We’re digging into the innards of Rust’s
compiler.
• Along the way, I’ll cover some “compilers
101” stuff that may not be common
knowledge.
• Not really covering any of the runtime stuff
-- data representation, garbage collection,
etc.
Intro to compilers
• Most compilers follow a familiar pattern:
scan, parse, generate code
• A scanner converts raw source code into
a stream of tokens.
• A parser converts the stream of tokens
into an intermediate representation.
• A code generator emits the target code
(e.g. bytecode, x86_64 assembly, etc.)
Intro to compilers
(cont.)
• Real-world compilers do other stuff too.
• Semantic analysis often follows the
parse phase.
For example, if the language is statically typed, a type checking
step might happen here.
• Often one or more optimization steps.
• The compiler may also be kind enough to
invoke external tools on your behalf.
A 10,000 foot view of
Rust’s compiler
• Scan
• Parse
• Semantic Analysis
• (Optimizations occur somewhere here)
• Generate target code
• Link object files into an ELF/PE/Mach-O
binary.
A 10,000 ft view (cont.)
• Where does it all begin?
• src/librustc/lib.rs
main(...) and run_compiler(...)
• src/librustc/driver/driver.rs
see compile_input and all the phase_X methods
like phase_1_parse_input,
phase_2_configure_and_expand, etc.
Scanners
• Raw source code goes in e.g.
if (should_buy(goat_simulator)) { ... }
• Tokens come out e.g.
[IF, LPAREN, ID("should_buy"), LPAREN, ID("goat_simulator"), RPAREN, RPAREN,
LBRACE, ..., RBRACE]
• This simple translation makes the parser’s
job easier.
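To make the scanning step concrete, here is a minimal sketch of a hand-written scanner for the slide's example. The `Token` enum and the `scan` function are illustrative names invented for this sketch; they are not taken from libsyntax, and the real lexer handles far more (literals, comments, spans, and so on).

```rust
/// Token kinds mirroring the slide's example; the names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    If,
    LParen,
    RParen,
    LBrace,
    RBrace,
    Id(String),
}

/// Convert raw source text into a flat list of tokens.
/// Returns an error message for the first character it does not understand.
pub fn scan(src: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        match c {
            // Whitespace separates tokens but produces none.
            c if c.is_whitespace() => {
                chars.next();
            }
            '(' => {
                chars.next();
                tokens.push(Token::LParen);
            }
            ')' => {
                chars.next();
                tokens.push(Token::RParen);
            }
            '{' => {
                chars.next();
                tokens.push(Token::LBrace);
            }
            '}' => {
                chars.next();
                tokens.push(Token::RBrace);
            }
            // Identifiers and keywords: consume a run of word characters.
            c if c.is_alphabetic() || c == '_' => {
                let mut word = String::new();
                while let Some(&d) = chars.peek() {
                    if d.is_alphanumeric() || d == '_' {
                        word.push(d);
                        chars.next();
                    } else {
                        break;
                    }
                }
                tokens.push(if word == "if" { Token::If } else { Token::Id(word) });
            }
            other => return Err(format!("unexpected character {:?}", other)),
        }
    }
    Ok(tokens)
}
```

Calling `scan("if (should_buy(goat_simulator)) { }")` yields the token list from the slide, minus the elided body. The only keyword this sketch knows is `if`; every other word becomes an `Id`.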
Rust’s Scanner
• Fully contained within libsyntax
• src/libsyntax/parse/lexer.rs
(another name for scanning is “lexical analysis”, ergo “lexer”)
Refer to the Reader trait
• src/libsyntax/parse/token.rs
Tokens and keywords defined here.
Parsers
• Nom on a token stream from the
scanner/lexer e.g.
[IF, LPAREN, ID("should_buy"), LPAREN, ID("goat_simulator"),
RPAREN, RPAREN, LBRACE, ..., RBRACE]
• Apply grammar rules to convert the token
stream into an Abstract Syntax Tree
(or some other representative data structure)
Abstract Syntax Trees
• Or “AST”
• Data structure representing the syntactic
structure of your source program.
• Abstract in that it omits unnecessary crap
(parentheses, quotes, etc.)
Abstract Syntax Trees
(cont.)
If(
  Call(
    Id("should_buy"),
    [Id("goat_simulator")]),
  [...])
example AST for input
"if (should_buy(goat_simulator)) { ... }"
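As a rough illustration of how a parser turns tokens into a tree like the one above, here is a small recursive-descent sketch. The `Token`, `Expr` and `Parser` types are hypothetical stand-ins written for this example and are far simpler than what lives in libsyntax's ast.rs and parser.rs. As a simplifying assumption, calls take exactly one argument.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    If,
    LParen,
    RParen,
    LBrace,
    RBrace,
    Id(String),
}

/// A toy AST with the three node kinds used in the slide's example.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Id(String),
    Call(Box<Expr>, Vec<Expr>),
    If(Box<Expr>, Vec<Expr>),
}

pub struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    pub fn new(tokens: Vec<Token>) -> Self {
        Parser { tokens, pos: 0 }
    }

    fn peek(&self) -> Option<&Token> {
        self.tokens.get(self.pos)
    }

    /// Consume the next token if it equals `want`, otherwise report an error.
    fn expect(&mut self, want: Token) -> Result<(), String> {
        match self.tokens.get(self.pos) {
            Some(t) if *t == want => {
                self.pos += 1;
                Ok(())
            }
            other => Err(format!("expected {:?}, found {:?}", want, other)),
        }
    }

    /// expr := "if" "(" expr ")" "{" expr* "}"
    ///       | ID "(" expr ")"
    ///       | ID
    pub fn parse_expr(&mut self) -> Result<Expr, String> {
        match self.peek().cloned() {
            Some(Token::If) => {
                self.pos += 1;
                self.expect(Token::LParen)?;
                let cond = self.parse_expr()?;
                self.expect(Token::RParen)?;
                self.expect(Token::LBrace)?;
                let mut body = Vec::new();
                while self.peek() != Some(&Token::RBrace) {
                    body.push(self.parse_expr()?);
                }
                self.expect(Token::RBrace)?;
                Ok(Expr::If(Box::new(cond), body))
            }
            Some(Token::Id(name)) => {
                self.pos += 1;
                if self.peek() == Some(&Token::LParen) {
                    self.pos += 1;
                    let arg = self.parse_expr()?;
                    self.expect(Token::RParen)?;
                    Ok(Expr::Call(Box::new(Expr::Id(name)), vec![arg]))
                } else {
                    Ok(Expr::Id(name))
                }
            }
            other => Err(format!("unexpected token {:?}", other)),
        }
    }
}
```

Note that the parentheses and braces are consumed by `expect` but never stored in the tree, which is the "abstract" part: only the structure survives.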
Rust’s Parser and AST
• Also fully contained within libsyntax
• src/libsyntax/ast.rs
the Expr_ enum is an interesting starting point, containing the
AST representations of most Rust expressions.
• src/libsyntax/parse/mod.rs
see parse_crate_from_file
• src/libsyntax/parse/parser.rs
Most of the interesting stuff is in impl<'a> Parser<'a>.
Maybe check out parse_while_expr, for example.
Semantic Analysis
• Language- & implementation-specific, but
there are common themes.
• Typically performed by analyzing and/or
annotating the AST (directly or indirectly).
• Statically typed languages often do type
checking etc. here.
Semantic Analysis in
Rust
• Here we apply all the weird & wonderful
rules that make Rust unique.
• Mostly handled by src/librustc/middle/*.rs
• Name resolution (resolve.rs)
• Type checking (typeck/*.rs)
• Much, much more...
see phase_3_run_analysis_passes in compile_input
for the full details
Semantic Analysis in Rust:
Name Resolution
• src/librustc/middle/resolve.rs
• Resolve names
“what does this name mean in this context?”
• Type? Function? Local variable?
• Rust has two namespaces: types and values
this is why you can e.g. refer to the str type and the str
module at the same time
• resolve_item seems to be the real
workhorse here.
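The two-namespace rule can be seen from user code. The sketch below is written in present-day Rust syntax, and `Celsius` is a made-up name: a braced struct only occupies the type namespace, so a function with the same name can sit beside it in the value namespace.

```rust
/// Lives in the type namespace only: a braced struct defines no value.
pub struct Celsius {
    pub degrees: f64,
}

/// Lives in the value namespace, so it may share the struct's name.
/// The lint attribute only silences the naming-style warning.
#[allow(non_snake_case)]
pub fn Celsius(degrees: f64) -> Celsius {
    // In the return type and the struct literal the name is looked up as a
    // type; at a call site such as `Celsius(21.5)` it is looked up as a value.
    Celsius { degrees }
}
```

Writing `let c: Celsius = Celsius(21.5);` then uses both meanings of the name in a single statement, and name resolution picks the right one from the syntactic position.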
Semantic Analysis in Rust:
Type Checking
• src/librustc/middle/typeck/mod.rs
see check_crate
• Infer and unify types.
• Using inferred & explicit type info, ensure
that the input program satisfies all of Rust’s
type rules.
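"Infer and unify" can be sketched with a toy unifier. Everything here (`Ty`, `Subst`, `resolve`, `unify`) is a hypothetical miniature written for this example; rustc's typeck is far more elaborate, so treat this only as an illustration of the general idea of solving for unknown type variables.

```rust
use std::collections::HashMap;

/// A toy type language: two base types, inference variables, and functions.
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    Int,
    Bool,
    Var(u32),
    Fn(Box<Ty>, Box<Ty>),
}

/// Maps inference variables to the types they have been unified with.
pub type Subst = HashMap<u32, Ty>;

/// Replace every bound variable in `ty` with what it is bound to.
pub fn resolve(ty: &Ty, subst: &Subst) -> Ty {
    match ty {
        Ty::Var(v) => match subst.get(v) {
            Some(bound) => resolve(bound, subst),
            None => ty.clone(),
        },
        Ty::Fn(a, r) => Ty::Fn(Box::new(resolve(a, subst)), Box::new(resolve(r, subst))),
        _ => ty.clone(),
    }
}

/// True if variable `v` appears anywhere inside `ty`.
fn occurs(v: u32, ty: &Ty) -> bool {
    match ty {
        Ty::Var(w) => *w == v,
        Ty::Fn(a, r) => occurs(v, a) || occurs(v, r),
        _ => false,
    }
}

/// Make `a` and `b` equal by binding variables, or report a mismatch.
pub fn unify(a: &Ty, b: &Ty, subst: &mut Subst) -> Result<(), String> {
    let (a, b) = (resolve(a, subst), resolve(b, subst));
    match (&a, &b) {
        (Ty::Var(x), Ty::Var(y)) if x == y => Ok(()),
        (Ty::Var(v), other) | (other, Ty::Var(v)) => {
            // Refuse to build an infinite type such as T = fn(T) -> Int.
            if occurs(*v, other) {
                return Err(format!("infinite type: {:?} occurs in {:?}", v, other));
            }
            subst.insert(*v, other.clone());
            Ok(())
        }
        (Ty::Fn(a1, r1), Ty::Fn(a2, r2)) => {
            unify(a1, a2, subst)?;
            unify(r1, r2, subst)
        }
        _ if a == b => Ok(()),
        _ => Err(format!("type mismatch: {:?} vs {:?}", a, b)),
    }
}
```

Unifying `Fn(Var(0), Bool)` with `Fn(Int, Var(1))` binds variable 0 to `Int` and variable 1 to `Bool`, while unifying `Int` with `Bool` fails. That failure is the toy analogue of a type error.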
Semantic Analysis in Rust:
Rust-y Stuff
• A borrow checking pass enforces
memory safety rules.
see src/librustc/middle/borrowck/doc.rs for details
• An effect checking pass ensures that
unsafe operations occur only in unsafe contexts.
see src/librustc/middle/effect.rs
• A kind checking pass enforces special
rules for built-in traits like Send and Drop.
see src/librustc/middle/kind.rs
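From the user's side, two of these checks look roughly like the sketch below. It is written in present-day Rust, whose compiler internals differ from the 2014 passes named above, and the function names are invented for this example. It shows an unsafe operation confined to an unsafe context, and a `Send` requirement imposed by `std::thread::spawn`.

```rust
use std::thread;

/// Dereferencing a raw pointer is an unsafe operation, so it must sit
/// inside an unsafe context.
///
/// Safety: `ptr` must point to a valid, initialized `i32`.
pub unsafe fn read_raw(ptr: *const i32) -> i32 {
    unsafe { *ptr }
}

/// A safe wrapper: the reference guarantees the pointer is valid.
pub fn read_checked(value: &i32) -> i32 {
    // Calling `read_raw` without the `unsafe` block is a compile error.
    unsafe { read_raw(value as *const i32) }
}

/// `thread::spawn` requires everything the closure captures to be `Send`.
/// `Vec<i32>` qualifies; capturing an `Rc<i32>` here would be rejected at
/// compile time because `Rc` is not `Send`.
pub fn sum_on_thread(data: Vec<i32>) -> i32 {
    thread::spawn(move || data.iter().sum::<i32>())
        .join()
        .expect("worker thread panicked")
}
```

Neither rule costs anything at run time: the programs that would violate them simply fail to compile.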
Semantic Analysis in Rust:
More Rust-y Stuff
• A compute moves pass determines
whether the use of a value will result in a
move in a given expression.
Important for enforcing rules on non-copyable (“linear”) types.
see src/librustc/middle/moves.rs
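The move-versus-copy distinction is visible in ordinary code. In this small sketch (present-day Rust, with made-up `Point` and `Goat` types), passing a `Copy` value leaves the original usable, while passing a non-`Copy` value moves it.

```rust
/// Plain data that opts in to `Copy`: using it never moves it.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

/// Owns a heap-allocated `String`, so it cannot be `Copy`: using it by
/// value moves it.
#[derive(Debug, PartialEq)]
pub struct Goat {
    pub name: String,
}

pub fn consume_point(p: Point) -> i32 {
    p.x + p.y
}

pub fn consume_goat(g: Goat) -> String {
    g.name
}

pub fn demo() -> (i32, i32, String) {
    let p = Point { x: 1, y: 2 };
    let first = consume_point(p); // `p` is copied into the call
    let second = consume_point(p); // so `p` is still usable here

    let g = Goat { name: String::from("Billy") };
    let name = consume_goat(g); // `g` is moved into the call
    // consume_goat(g); // would not compile: use of moved value `g`
    (first, second, name)
}
```

Deciding which of these two things each use of a value means is the job the slide attributes to the compute moves pass.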
Code Generators
• Takes an AST as input e.g.
If(Call(Id("should_buy"), [Id("goat_simulator")]), [...])
• Emits some sort of target code e.g.
(some made up bytecode)
LOAD goat_simulator
CALL should_buy
JMPIF if_stmt_body_addr
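A code generator for the made-up bytecode can be sketched as a recursive walk over the tree. The `Expr` and `Emitter` types below are hypothetical. One deliberate difference from the slide: this sketch emits an invented `JMPIFNOT` instruction with numbered end labels (rather than `JMPIF` to the body address) so that the `if` body can be laid out inline. `CALL_INDIRECT` is likewise invented for callees that are not plain names.

```rust
/// The same toy AST shape as the slide's example.
pub enum Expr {
    Id(String),
    Call(Box<Expr>, Vec<Expr>),
    If(Box<Expr>, Vec<Expr>),
}

/// Collects emitted instructions and hands out unique label numbers.
pub struct Emitter {
    pub code: Vec<String>,
    next_label: usize,
}

impl Emitter {
    pub fn new() -> Self {
        Emitter { code: Vec::new(), next_label: 0 }
    }

    /// Walk the tree and append instructions for `expr` to `self.code`.
    pub fn emit(&mut self, expr: &Expr) {
        match expr {
            Expr::Id(name) => self.code.push(format!("LOAD {}", name)),
            Expr::Call(callee, args) => {
                // Arguments are evaluated first, then the call itself.
                for arg in args {
                    self.emit(arg);
                }
                match &**callee {
                    Expr::Id(name) => self.code.push(format!("CALL {}", name)),
                    other => {
                        self.emit(other);
                        self.code.push("CALL_INDIRECT".to_string());
                    }
                }
            }
            Expr::If(cond, body) => {
                let end = format!("if_end_{}", self.next_label);
                self.next_label += 1;
                self.emit(cond);
                // Skip the body when the condition is false.
                self.code.push(format!("JMPIFNOT {}", end));
                for stmt in body {
                    self.emit(stmt);
                }
                self.code.push(format!("{}:", end));
            }
        }
    }
}
```

For the slide's example with an empty body, the emitter produces `LOAD goat_simulator`, `CALL should_buy`, `JMPIFNOT if_end_0`, and the label `if_end_0:`.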
Rust’s Code Generator
• First, Rust translates the analyzed, type-
checked AST into an LLVM module.
This is phase_4_translate_to_llvm
• src/librustc/middle/trans/base.rs
trans_crate is a good place to start
Rust’s Code Generator
(cont.)
• src/librustc/back/link.rs
• Passes are run over the LLVM module to
write the target code to disk
this is phase_5_run_llvm_passes in driver.rs,
which calls the appropriate stuff on rustc::back::link
• We can tweak the output format using
command line options: assembly code,
LLVM bitcode files, object files, etc.
see build_session_options and the OutputType*
variants as used in driver.rs
Rust’s Code Generator
(cont.)
• If you’re trying to build a native executable,
the previous step will produce object files...
• ... but LLVM won’t link our object files into
a(n ELF/PE) binary.
this is phase_6_link_output
• Rust calls out to the system’s cc program
to do the link step.
see link_binary, link_natively and get_cc_prog
in src/librustc/back/link.rs
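The idea of "calling out to cc" can be sketched with the standard library's `std::process::Command`. The `link_command` helper below is invented for this example and only builds the command line. It is not how link.rs is written, and the real link step passes many more arguments (libraries, search paths, platform-specific flags).

```rust
use std::path::Path;
use std::process::Command;

/// Build, but do not run, a command that asks the C compiler driver to link
/// `objects` into `output`. `cc` is the program name, e.g. "cc".
pub fn link_command(cc: &str, objects: &[&Path], output: &Path) -> Command {
    let mut cmd = Command::new(cc);
    // Object files first, then the conventional `-o <file>` output flag.
    cmd.args(objects).arg("-o").arg(output);
    cmd
}
```

A caller would then invoke `.status()` on the returned command and treat a non-zero exit code as a link failure. Delegating to `cc` means the compiler does not need to know the details of each platform's linker.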