Python Compiler Internals. Thomas Lee, Shine Technologies, Melbourne
My Contributions to Python: try/except/finally syntax for 2.5; AST to bytecode compilation from Python; “optimization” at the AST level
What will you get out of this? Compiler development != rocket science. No magic: it's just code.
Compiler? Isn't Python interpreted? Well, yes and no.
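
The "yes and no" can be seen from Python itself. A minimal sketch, assuming a current CPython 3 interpreter (the talk predates it): source text is compiled to a code object first, and the interpreter then runs that bytecode. The variable names are illustrative only.

```python
import dis

# Source text is first compiled to a code object holding bytecode ...
source = "answer = 6 * 7"
code = compile(source, "<demo>", "exec")

# ... and that bytecode is what the interpreter loop then executes.
namespace = {}
exec(code, namespace)

print(type(code).__name__)   # code
print(namespace["answer"])   # 42
dis.dis(code)                # instruction listing; details vary by version
```

The exact instructions printed by `dis.dis` differ between CPython releases, so no listing is shown here.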
WTF It ... It's Java?
Aaanyway Let's screw with the compiler.
Erm ... Before we start ... A patch to fix a quirk in the build process. This has been reported on the bug tracker!
An overview of the compiler. Stages: Tokenizer, Parser, AST Translation, Symtable Construction, Bytecode Generation, Bytecode Optimization, Execution. Data passed between stages: Tokens, Parse Tree, AST, AST + Symtable, Bytecode + Data.
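
Several of these stages can be driven one at a time from the standard library. The following is a rough sketch, assuming CPython 3 and its `ast` and `symtable` modules; it is not the code shown in the talk, and the sample function `double` is made up for illustration.

```python
import ast
import symtable

source = "def double(x):\n    return x * 2\nresult = double(21)\n"

# Tokenizer + parser + AST translation, exposed as a single call.
tree = ast.parse(source, filename="<demo>")

# Symtable construction: which names live in which scope.
table = symtable.symtable(source, "<demo>", "exec")
func_scope = table.get_children()[0]      # the scope of double()
print(func_scope.get_name(), list(func_scope.get_parameters()))

# Bytecode generation (and optimization) starting from the AST.
code = compile(tree, "<demo>", "exec")

# Execution.
namespace = {}
exec(code, namespace)
print(namespace["result"])   # 42
```

Passing the AST to `compile()` rather than the source string makes the hand-off between the front end and the code generator explicit.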
New construct: the “unless” statement

    unless may_give_you_up:
        print "never gonna give you up..."

Semantics of “unless”: works just like “if not” ...

    if not would_consider_letting_you_down:
        print "never gonna let you down..."

Implementing “unless”: modify the Grammar; change the AST definition; generate bytecode for “unless”.
So what's first? Modify the Grammar; change the AST definition; generate bytecode for “unless”.
What does this change affect? (The compiler overview diagram again: Tokenizer, Parser, AST Translation, Symtable Construction, Bytecode Generation, Bytecode Optimization, Execution.)
WTF is a “grammar”? Defines the syntactic structure of the language.
Why a tokenizer? Makes parsing easier.
So what does a tokenizer do, then? Generates a stream of events for the parser.
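
To see what such a stream looks like, here is a small sketch using the standard-library `tokenize` module, assuming CPython 3. It stands in for the C tokenizer discussed in the talk and is not the same code; the sample source is invented.

```python
import io
import tokenize

source = "if not ready:\n    print('waiting')\n"

# generate_tokens() takes a readline callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # tok_name maps the numeric token type to a readable name.
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

Keywords such as `if` and `not` come out as plain NAME tokens; it is the parser, guided by the grammar, that gives them meaning.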
What does the parser do? Organises tokens into the structure defined by the grammar. This is the “parse tree”.
What's next? Modify the Grammar; change the AST definition; generate bytecode for “unless”.
What's affected by this change? (The compiler overview diagram again: Tokenizer, Parser, AST Translation, Symtable Construction, Bytecode Generation, Bytecode Optimization, Execution.)
What's an AST? “Abstract Syntax Tree”. Similar to a parse tree, but not bound to the syntax.
Why does Python use an AST? Much easier to operate upon an AST than a parse tree.
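
A short sketch of what "easier to operate upon" can mean in practice, assuming CPython 3's `ast` module (the names `ready` and `launch` are placeholders): the tree for an “if not” statement is a handful of typed nodes that can be inspected or walked directly.

```python
import ast

tree = ast.parse("if not ready:\n    launch()\n")
stmt = tree.body[0]

print(type(stmt).__name__)           # If
print(type(stmt.test).__name__)      # UnaryOp
print(type(stmt.test.op).__name__)   # Not

# Walking the whole tree needs no knowledge of the concrete syntax.
names = sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})
print(names)                         # ['launch', 'ready']
```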
AST reuse: new constructs can sometimes be expressed using existing AST nodes ...
AST reuse: ... so we could implement “unless” using “if” and “not” AST nodes: Unless(test, body) = If(Not(test), body). But then I wouldn't be able to show you the code generator ...
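
The rewrite Unless(test, body) = If(Not(test), body) can be tried out with today's `ast` module. This is only a sketch under assumptions: `make_unless` is a hypothetical helper, there is no real `Unless` node or “unless” keyword in CPython, and the talk's actual change was made in C inside the compiler.

```python
import ast

def make_unless(test, body):
    """Hypothetical helper: Unless(test, body) = If(Not(test), body)."""
    return ast.If(
        test=ast.UnaryOp(op=ast.Not(), operand=test),
        body=body,
        orelse=[],
    )

# Borrow the test expression and the body from ordinary parsed code.
test = ast.parse("may_give_you_up", mode="eval").body
body = ast.parse("message = 'never gonna give you up...'").body

module = ast.Module(body=[make_unless(test, body)], type_ignores=[])
ast.fix_missing_locations(module)   # synthesized nodes lack line numbers

namespace = {"may_give_you_up": False}
exec(compile(module, "<unless>", "exec"), namespace)
print(namespace["message"])         # never gonna give you up...
```

Because only existing node types are used, the code generator needs no changes at all, which is exactly why the talk takes the other route.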
What's next? Modify the Grammar; change the AST definition; generate bytecode for “unless”.
What's affected by this change? (The compiler overview diagram again: Tokenizer, Parser, AST Translation, Symtable Construction, Bytecode Generation, Bytecode Optimization, Execution.)
Generating bytecode for “unless”: add a hook for our new AST node; generate bytecode implementing “unless” logic; use basicblocks as labels for jumps.
What about the rest of the compiler? (The compiler overview diagram again: Tokenizer, Parser, AST Translation, Symtable Construction, Bytecode Generation, Bytecode Optimization, Execution.)
“unless” in action: let's see if this stuff works ...
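
The code generator itself is C code inside CPython and is not reproduced here. As a rough illustration of the jump logic an “unless” would need, the sketch below (assuming CPython 3; `flag` and `hit` are made-up names) inspects what the existing compiler emits for the equivalent “if not”: a conditional jump that skips the body when the test is true.

```python
import dis

source = "if not flag:\n    hit = True\n"
code = compile(source, "<demo>", "exec")

# Keep only the jump instructions; opcode names differ between versions.
jumps = [ins for ins in dis.get_instructions(code) if "JUMP" in ins.opname]
for ins in jumps:
    # argval is the bytecode offset the jump lands on.
    print(ins.opname, "->", ins.argval)

# The body runs only when the test is false.
namespace = {"flag": False}
exec(code, namespace)
print(namespace["hit"])   # True
```

In the compiler, each jump target is the start of a basic block, which is why blocks can serve as labels while the bytecode is being assembled.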
Where to go from here? Use the Source. Don't try to grok it all at once. Don't be afraid to be wrong. Annoy the python-dev mailing list, like I do!
Thanks!
Questions?
