Python Compiler Internals Thomas Lee Shine Technologies, Melbourne
Python Compiler Internals Presentation Slides

My OSDC 2008 presentation on the internals of the Python programming language.

  1. Python Compiler Internals. Thomas Lee, Shine Technologies, Melbourne
  2. My Contributions to Python
     - try/except/finally syntax for 2.5
     - AST to bytecode compilation from Python
     - “optimization” at the AST level
  3. What will you get out of this? Compiler development != rocket science. No magic: it's just code.
  4. Compiler? Isn't Python interpreted? Well, yes and no.
  5. WTF It ... It's Java?
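To make the “yes and no” concrete: CPython first compiles source text into bytecode, and a virtual machine then interprets that bytecode, which is roughly the model Java follows. A minimal sketch using only the standard library; the one-line `source` string is an assumption made up for illustration, not something from the talk.

```python
import dis

# Illustrative one-line program (an assumption, not from the slides).
source = "x = 1 + 2\n"

# Step 1: compile source text into a code object that holds bytecode.
code = compile(source, "<example>", "exec")

# Step 2: inspect the bytecode; the exact opcodes vary between Python versions.
dis.dis(code)

# Step 3: hand the code object to the interpreter for execution.
namespace = {}
exec(code, namespace)
print(namespace["x"])
```

The disassembly is printed rather than asserted on because opcode names change from release to release.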
  6. Aaanyway. Let's screw with the compiler.
  7. Erm ... Before we start ... A patch to fix a quirk in the build process. This has been reported on the bug tracker!
  8. An overview of the compiler (stages, with the data passed between them in brackets): Tokenizer -> [Tokens] -> Parser -> [Parse Tree] -> AST Translation -> [AST] -> Symtable Construction -> [AST + Symtable] -> Bytecode Generation -> [Bytecode + Data] -> Bytecode Optimization -> Execution
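Most of these stages can be poked at from Python itself. The sketch below walks one made-up line of source through the standard-library modules that roughly correspond to each stage. It is an approximation under stated assumptions: the `source` string is illustrative, and current CPython no longer exposes the intermediate parse tree to Python code, so that step is folded into `ast.parse`.

```python
import ast
import io
import symtable
import tokenize

source = "answer = 6 * 7\n"  # illustrative input, not from the talk

# Tokenizer: source text -> tokens
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Parser + AST translation: source text -> AST
tree = ast.parse(source, filename="<example>")

# Symtable construction: which names exist and how they are used
table = symtable.symtable(source, "<example>", "exec")

# Bytecode generation (and optimization): AST -> code object
code = compile(tree, "<example>", "exec")

# Execution: run the code object in a fresh namespace
namespace = {}
exec(code, namespace)

print(len(tokens), type(tree).__name__, namespace["answer"])
```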
  9. New construct: the “unless” statement
         unless may_give_you_up:
             print "never gonna give you up..."
  10. Semantics of “unless”: works just like “if not” ...
         if not would_consider_letting_you_down:
             print "never gonna let you down..."
  11. Implementing “unless”
     - Modify the Grammar
     - Change the AST definition
     - Generate bytecode for “unless”
  12. So what's first?
     - Modify the Grammar
     - Change the AST definition
     - Generate bytecode for “unless”
  13. What does this change affect? Tokenizer -> [Tokens] -> Parser -> [Parse Tree] -> AST Translation -> [AST] -> Symtable Construction -> [AST + Symtable] -> Bytecode Generation -> [Bytecode + Data] -> Bytecode Optimization -> Execution
  14. WTF is a “grammar”? Defines the syntactic structure of the language.
  15. Why a tokenizer? Makes parsing easier.
  16. So what does a tokenizer do, then? Generates a stream of events for the parser.
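The stream a tokenizer produces can be seen with the standard-library `tokenize` module, used here as a stand-in for the tokenizer inside the compiler. A small sketch: note that tokenizing is purely lexical, so the not-yet-valid `unless` line tokenizes fine, and `unless` comes out as an ordinary NAME token. Turning it into a keyword is the grammar's job.

```python
import io
import tokenize

# The "unless" statement from the slides; "pass" replaces the print for brevity.
source = "unless may_give_you_up:\n    pass\n"

# Collect (token type name, token text) pairs from the token stream.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

for name, text in tokens:
    print(name, repr(text))
```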
  17. What does the parser do? Organises tokens into the structure defined by the grammar. This is the “parse tree”.
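As a rough illustration of “organising tokens into a structure”, here is a toy hand-written parser. Everything in it is an assumption for the sake of the example: the one-line rule `unless_stmt: 'unless' NAME ':' NAME`, the helper names `expect` and `parse_unless_stmt`, and the nested-tuple tree shape. CPython's real parser is generated from the grammar file and looks nothing like this.

```python
import io
import tokenize


def expect(tokens, pos, type_, string=None):
    """Check the token at `pos` and return (its text, next position)."""
    tok = tokens[pos]
    if tok.type != type_ or (string is not None and tok.string != string):
        raise SyntaxError(f"unexpected token {tok.string!r}")
    return tok.string, pos + 1


def parse_unless_stmt(source):
    """Parse the hypothetical rule: unless_stmt: 'unless' NAME ':' NAME"""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    pos = 0
    _, pos = expect(tokens, pos, tokenize.NAME, "unless")
    test, pos = expect(tokens, pos, tokenize.NAME)
    _, pos = expect(tokens, pos, tokenize.OP, ":")
    body, pos = expect(tokens, pos, tokenize.NAME)
    # The "parse tree": nested tuples mirroring the shape of the rule.
    return ("unless_stmt", ("test", test), ("suite", body))


tree = parse_unless_stmt("unless may_give_you_up: sing\n")
print(tree)
```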
  18. What's next?
     - Modify the Grammar
     - Change the AST definition
     - Generate bytecode for “unless”
  19. What's affected by this change? Tokenizer -> [Tokens] -> Parser -> [Parse Tree] -> AST Translation -> [AST] -> Symtable Construction -> [AST + Symtable] -> Bytecode Generation -> [Bytecode + Data] -> Bytecode Optimization -> Execution
  20. What's an AST? “Abstract Syntax Tree”. Similar to a parse tree, but not bound to the syntax.
  21. Why does Python use an AST? Much easier to operate upon an AST than a parse tree.
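One way to see why the AST is pleasant to work with: the standard-library `ast` module gives each construct a typed node, and `ast.NodeVisitor` dispatches on node type. The sketch below collects every name used in the “if not” example; the `NameCollector` class is a hypothetical helper written for this illustration, and the print is in function-call form so the source parses on Python 3.

```python
import ast

source = (
    "if not would_consider_letting_you_down:\n"
    "    print('never gonna let you down...')\n"
)
tree = ast.parse(source)


class NameCollector(ast.NodeVisitor):
    """Hypothetical helper: record every Name node in visiting order."""

    def __init__(self):
        self.names = []

    def visit_Name(self, node):
        self.names.append(node.id)
        self.generic_visit(node)


collector = NameCollector()
collector.visit(tree)
print(collector.names)
```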
  22. AST reuse. New constructs can sometimes be expressed using existing AST nodes ...
  23. AST reuse ... so we could implement “unless” using “if” and “not” AST nodes. But then I wouldn't be able to show you the code generator ...
         Unless(test, body) = If(Not(test), body)
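The reuse idea can be tried without touching the compiler at all, by building the `If(Not(test), body)` tree by hand with the `ast` module. A sketch only: `make_unless` and `run_unless` are hypothetical helpers, and the body assigns to a variable instead of printing so the outcome can be checked.

```python
import ast


def make_unless(test, body):
    """Hypothetical helper: Unless(test, body) expressed as If(Not(test), body)."""
    negated = ast.UnaryOp(op=ast.Not(), operand=test)
    return ast.If(test=negated, body=body, orelse=[])


def run_unless(flag):
    """Build, compile and run `unless may_give_you_up: result = ...`."""
    test = ast.parse("may_give_you_up", mode="eval").body
    body = ast.parse("result = 'never gonna give you up...'").body
    module = ast.Module(body=[make_unless(test, body)], type_ignores=[])
    ast.fix_missing_locations(module)  # hand-built nodes lack line numbers
    namespace = {"may_give_you_up": flag}
    exec(compile(module, "<unless>", "exec"), namespace)
    return namespace.get("result")


print(run_unless(False))  # body runs: the test is false
print(run_unless(True))   # body skipped: the test is true
```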
  24. What's next?
     - Modify the Grammar
     - Change the AST definition
     - Generate bytecode for “unless”
  25. What's affected by this change? Tokenizer -> [Tokens] -> Parser -> [Parse Tree] -> AST Translation -> [AST] -> Symtable Construction -> [AST + Symtable] -> Bytecode Generation -> [Bytecode + Data] -> Bytecode Optimization -> Execution
  26. Generating bytecode for “unless”
     - Add a hook for our new AST node
     - Generate bytecode implementing “unless” logic
     - Use basic blocks as labels for jumps
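The slides do not include the C code added to the code generator, so as a stand-in this sketch disassembles the equivalent `if not` statement. An `unless` node would presumably need bytecode of the same shape: evaluate the test, then jump past the body when the test is true. The jump target plays the role of the basic-block label. Opcode names differ between Python versions, so the example only filters for instructions with `JUMP` in their name.

```python
import dis

# Equivalent of "unless flag: result = 'ran'", written with "if not".
source = "if not flag:\n    result = 'ran'\n"
code = compile(source, "<example>", "exec")

# Keep only the jump instructions; argval is the bytecode offset jumped to.
jumps = [ins for ins in dis.get_instructions(code) if "JUMP" in ins.opname]
for ins in jumps:
    print(ins.opname, "->", ins.argval)

# Run the compiled code both ways to confirm the "unless" semantics.
taken = {"flag": False}
exec(code, taken)
skipped = {"flag": True}
exec(code, skipped)
print(taken.get("result"), skipped.get("result"))
```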
  27. What about the rest of the compiler? Tokenizer -> [Tokens] -> Parser -> [Parse Tree] -> AST Translation -> [AST] -> Symtable Construction -> [AST + Symtable] -> Bytecode Generation -> [Bytecode + Data] -> Bytecode Optimization -> Execution
  28. “unless” in action. Let's see if this stuff works ...
  29. Where to go from here?
     - Use the Source.
     - Don't try to grok it all at once.
     - Don't be afraid to be wrong.
     - Annoy the python-dev mailing list, like I do!
  30. Thanks!
  31. Questions?