Introduction to Compiler Construction


Compilation is the translation of a program written in a source language into a semantically equivalent program written in a target language.
The compiler also reports to its users the presence of errors in the source program.

Published in: Education, Technology

  • Code improvement - unclear slide
  • preliminary storage map => prepare symbol table? shape code for the back end => same as produce IR?
  • The character string value for a token is a lexeme. Typical tokens: id, number, do, end … The key issue is speed.

    1. Introduction to Compiler Construction (Lecture 1)
    2. Natural Languages • What are natural languages? • How do you understand a language? • If you know multiple languages, how do you recognize each of them? • How do you know which sentence is correct and which is incorrect?
    3. Programming Languages • What are programming languages? • How do you understand a programming language? • If you know multiple programming languages, how do you recognize each of them? • How do you know which syntax is correct and which is incorrect?
    4. Compilers and Interpreters • “Compilation” – Translation of a program written in a source language into a semantically equivalent program written in a target language – The compiler also reports to its users the presence of errors in the source program – C++ is an example of a compiled language [Diagram: Source Program → Compiler → Target Program, with the compiler reporting error messages; the target program then maps Input to Output]
    5. Compilers and Interpreters • “Interpretation” – An interpreter is a program that reads an executable program and produces the results of running that program. In other words: – Instead of producing a target program as a translation, an interpreter performs the operations implied by the source program – GW-BASIC is an example of an interpreter [Diagram: Source Program and Input → Interpreter → Output, with the interpreter reporting error messages]
    6. Why study compilers? • Application of a wide range of theoretical techniques – Data structures – Theory of computation – Algorithms – Computer architecture • Good software engineering experience • Better understanding of programming languages
    7. Features of compilers • Correctness – preserve the meaning of the code • Speed of target code • Recognize legal and illegal programs • Speed of compilation • Good error reporting/handling • Cooperation with the debugger • Manage storage of all variables and code • Support for separate compilation
    8. Introduction to Compiler Construction (Lecture 2)
    9. Classification of Compilers 1. Single-pass compilers 2. Two-pass compilers 3. Multipass compilers
    10. Single-Pass Compiler • Source code is transformed directly into machine code – For example, Pascal [Diagram: source code → Compiler → target code]
    11. Two-Pass Compiler • Uses an intermediate representation (IR) – Why? [Diagram: source code → Front End → IR → Back End → target code]
    12. Two-pass compiler • intermediate representation (IR) • front end maps legal code into IR • back end maps IR onto target machine • simplifies retargeting • allows multiple front ends • multiple passes ⇒ better code
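The front end / IR / back end split on the slide above can be sketched in a few lines. This is only an illustration under stated assumptions: the postfix IR, the function names (`front_end`, `stack_back_end`, `three_address_back_end`), and both target notations are hypothetical and not from the slides, and Python's own `ast` module stands in for a hand-written parser.

```python
import ast

# Hypothetical IR opcodes for the three operators this sketch accepts.
BINARY_OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}

def front_end(source):
    """Map a legal arithmetic expression to a postfix IR; reject anything else."""
    ir = []

    def visit(node):
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            visit(node.left)
            visit(node.right)
            ir.append((BINARY_OPS[type(node.op)],))
        elif isinstance(node, ast.Constant) and isinstance(node.value, int):
            ir.append(("push", node.value))
        elif isinstance(node, ast.Name):
            ir.append(("load", node.id))
        else:
            raise SyntaxError(f"unsupported construct: {type(node).__name__}")

    visit(ast.parse(source, mode="eval").body)
    return ir

def stack_back_end(ir):
    """Back end 1: emit text for a hypothetical stack machine."""
    return [" ".join([instr[0].upper(), *map(str, instr[1:])]) for instr in ir]

def three_address_back_end(ir):
    """Back end 2: emit three-address code for a hypothetical register machine."""
    symbols = {"add": "+", "sub": "-", "mul": "*"}
    code, stack = [], []
    for instr in ir:
        if instr[0] in ("push", "load"):
            stack.append(str(instr[1]))
        else:
            right, left = stack.pop(), stack.pop()
            temp = f"t{len(code)}"  # fresh temporary for each operation
            code.append(f"{temp} = {left} {symbols[instr[0]]} {right}")
            stack.append(temp)
    return code

ir = front_end("initial + rate * 60")
print(stack_back_end(ir))
print(three_address_back_end(ir))
```

Both back ends consume the same IR, which is the sense in which an IR simplifies retargeting: supporting a new target means writing a new back end only.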
    13. Multipass compiler (© Oscar Nierstrasz) • analyzes and changes IR • goal is to reduce runtime • must preserve values
    14. Comparison • One-pass compilers generally compile faster than multipass compilers • A multipass compiler is split into several small passes, and the correctness of each small pass is easier to establish than the correctness of one large program; multiple passes also allow higher-quality code
    15. Lecture 3
    16. Front end • recognize legal code • report errors • produce IR • preliminary storage map • shape code for the back end
    17. Scanner • Breaks the source code text into small pieces called tokens • Also known as the lexical analyzer
    18. Scanner / Lexical Analyser • maps characters to tokens • the character string value for a token is a lexeme • eliminates white space • Example: x = x + y becomes <id,x> = <id,x> + <id,y>
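A minimal sketch of such a scanner, using Python's `re` module. The token classes (`number`, `id`, `op`) and the function name `scan` are assumptions made for illustration; the slide only shows `<id,…>` tokens and bare operators.

```python
import re

# Hypothetical token classes; "skip" matches white space, which is discarded.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[=+\-*/]"),
    ("skip", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(source):
    """Return a list of (token class, lexeme) pairs for the source text."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "skip":  # eliminate white space
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(scan("x = x + y"))
# [('id', 'x'), ('op', '='), ('id', 'x'), ('op', '+'), ('id', 'y')]
```

Each pair holds the token class and its lexeme, mirroring the `<id,x>` notation on the slide.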
    19. Syntactic Analysis – Parsing • Example sentence: “Majid ate the apple”
    20. Front end – Analysis – Machine independent • The front end consists of those phases that depend primarily on the source language and are largely independent of the target machine.
    21. Parser • recognize context-free syntax • guide context-sensitive analysis • construct IR(s) • produce meaningful error messages • attempt error correction
    22. Back end • Synthesis process • Machine dependent • The back end includes those portions of the compiler that depend on the target machine; generally, these portions do not depend on the source language.
    23. Back end • translate IR into target machine code • choose instructions for each IR operation • decide what to keep in registers at each point • ensure conformance with system interfaces
    24. Compiler Structure • Front end – Maps legal code into IR – Recognizes legal/illegal programs • reports/handles errors – Generates IR – The process can be automated • Back end – Translates IR into target code • instruction selection • register allocation • instruction scheduling
    25. Lecture 4
    26. The Analysis-Synthesis Model of Compilation • There are two parts to compilation: – Analysis determines the operations implied by the source program, which are recorded in a tree structure – Synthesis takes the tree structure and translates the operations therein into the target program
    27. Analysis procedure • During analysis, the operations implied by the source program are determined and recorded in a hierarchical structure called a tree. • Often a special kind of tree called a syntax tree is used, in which each node represents an operation and the children of a node represent the arguments of the operation.
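    28. Lexical Analyzer → Syntax Analyzer → Semantic Analyzer • Character stream: position = initial + rate * 60 • After the lexical analyzer: <id,1> <=> <id,2> <+> <id,3> <*> <60> • After the syntax analyzer (syntax tree): = ( <id,1>, + ( <id,2>, * ( <id,3>, 60 ) ) ) • After the semantic analyzer: = ( <id,1>, + ( <id,2>, * ( <id,3>, inttofloat ( 60 ) ) ) )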
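The semantic analyzer's step on the slide above, wrapping the integer 60 in `inttofloat`, can be sketched as a recursive pass over the syntax tree. The tuple encoding of tree nodes, the function name `check`, and the symbol table that declares all three identifiers as `float` are assumptions for illustration; the slide does not state the declared types or the conversion rules.

```python
def check(node, symbols):
    """Return (tree, type), wrapping an int operand in inttofloat where a float is needed."""
    kind = node[0]
    if kind == "id":
        return node, symbols[node[1]]
    if kind == "num":
        return node, "float" if isinstance(node[1], float) else "int"
    op, left, right = node
    left, left_type = check(left, symbols)
    right, right_type = check(right, symbols)
    if left_type == right_type:
        return (op, left, right), left_type
    # Assumed rule for this sketch: an assignment target is never converted.
    if op == "=" and left_type == "int":
        raise TypeError("cannot assign a float value to an int variable")
    if left_type == "int":
        left = ("inttofloat", left)
    else:
        right = ("inttofloat", right)
    return (op, left, right), "float"

# position = initial + rate * 60, with every identifier assumed to be a float
tree = ("=", ("id", "position"),
        ("+", ("id", "initial"),
         ("*", ("id", "rate"), ("num", 60))))
symbols = {"position": "float", "initial": "float", "rate": "float"}

typed_tree, result_type = check(tree, symbols)
print(typed_tree)
```

Under these assumptions only the literal 60 is wrapped, because it is the one int operand combined with a float.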
    29. Remember: the front end is responsible for the analysis process, while the back end is responsible for synthesis.
    30. Other Tools that Use the Analysis-Synthesis Model • Editors (syntax highlighting) • Pretty printers (e.g. Doxygen) • Static checkers (e.g. Lint and Splint) • Interpreters • Text formatters (e.g. TeX and LaTeX) • Silicon compilers (e.g. VHDL) • Query interpreters/compilers (Databases)
    31. Structure Editors • A structure editor takes as input a sequence of commands to build a source program. • The structure editor not only performs the text creation and modification functions of an ordinary text editor, but also analyzes the program text, putting an appropriate hierarchical structure on the source program. • Thus the structure editor can perform additional tasks that are useful in the preparation of programs.
    32. Structure Editors (cont.) • For example, it can check that the input is correctly formed and can supply keywords automatically (e.g. when the user types “while”, the editor supplies the matching “do” and reminds the user that a conditional must come between them).
    33. Pretty printers • A pretty printer analyzes a program and prints it in such a way that the structure of the program becomes clearly visible. • For example, comments may appear in a special font, and statements may appear with an amount of indentation proportional to the depth of their nesting in the hierarchical organization of the statements.
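The indentation rule on the slide above can be sketched as follows. The sketch assumes the analysis has already been done, so the program arrives as a nested structure; the encoding (a statement is either a string or a `(header, body)` pair) and the function name `pretty` are hypothetical.

```python
def pretty(statements, depth=0, indent="    "):
    """Return lines indented in proportion to each statement's nesting depth."""
    lines = []
    for stmt in statements:
        if isinstance(stmt, str):
            lines.append(indent * depth + stmt)
        else:
            header, body = stmt
            lines.append(indent * depth + header)
            lines.extend(pretty(body, depth + 1, indent))
    return lines

program = [
    "total := 0",
    ("while i < n do", [
        ("if a[i] > 0 then", ["total := total + a[i]"]),
        "i := i + 1",
    ]),
]
print("\n".join(pretty(program)))
```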
    34. Static Checkers • A static checker reads a program, analyzes it, and attempts to discover potential bugs without running the program. • A static checker may detect that parts of the source program can never be executed, or that a certain variable might be used before being defined. • In addition, by employing type-checking techniques, it can catch logical errors such as trying to use a real variable as a pointer.
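One of the checks named above, a variable used before being defined, can be sketched with Python's standard `ast` module. This is a deliberately narrow illustration: it assumes the program is a straight-line sequence of simple assignments written in Python syntax, and the function name `used_before_defined` is hypothetical. Real static checkers also handle branches, loops, and scopes.

```python
import ast

def used_before_defined(source):
    """Return (line number, name) pairs for names read before any assignment to them."""
    defined, problems = set(), []
    for stmt in ast.parse(source).body:
        if not isinstance(stmt, ast.Assign):
            raise ValueError("this sketch only handles simple assignments")
        # Check the right-hand side first: it is evaluated before the target is bound.
        for node in ast.walk(stmt.value):
            if isinstance(node, ast.Name) and node.id not in defined:
                problems.append((stmt.lineno, node.id))
        for target in stmt.targets:
            if isinstance(target, ast.Name):
                defined.add(target.id)
    return problems

program = "a = 1\nb = a + c\nc = 2\n"
print(used_before_defined(program))
# [(2, 'c')]
```

The program is never run; the report comes purely from analyzing its text, which is what makes the check static.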
    35. Interpreters • Instead of producing a target program as a translation, an interpreter performs the operations implied by the source program. • For example, for an assignment statement an interpreter might build a tree and then carry out the operations at the nodes as it “walks” the tree. [Tree for position := initial + rate * 60: := ( <id,1>, + ( <id,2>, * ( <id,3>, 60 ) ) )]
    36. Interpreters (cont.) • At the root it would discover it had an assignment to perform, so it would call a routine to evaluate the expression on the right, and then store the resulting value in the location associated with the identifier position. • At the right child of the root, the routine would discover it had to compute the sum of two expressions. • It would call itself recursively to compute the value of the expression rate * 60. • It would then add that value to the value of the variable initial.
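The walk described above can be sketched as a pair of small recursive routines. The tuple encoding of the tree, the names `execute` and `evaluate`, and the sample values for initial and rate are assumptions for illustration only.

```python
def evaluate(node, env):
    """Recursively compute the value of an expression tree."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "id":
        return env[node[1]]
    op, left, right = node
    if op == "+":
        return evaluate(left, env) + evaluate(right, env)
    if op == "*":
        return evaluate(left, env) * evaluate(right, env)
    raise ValueError(f"unknown operator {op!r}")

def execute(node, env):
    """At the root: evaluate the right-hand side, then store it under the target identifier."""
    op, target, expression = node
    if op != ":=":
        raise ValueError("this sketch only executes assignments")
    env[target[1]] = evaluate(expression, env)

# position := initial + rate * 60
tree = (":=", ("id", "position"),
        ("+", ("id", "initial"),
         ("*", ("id", "rate"), ("num", 60))))
env = {"initial": 10.0, "rate": 2.0}  # sample values, chosen arbitrarily
execute(tree, env)
print(env["position"])
# 130.0
```

No target program is produced at any point: the operations are carried out directly while the tree is being walked.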
    37. Text Formatters • A text formatter takes input that is a stream of characters, most of which is text to be typeset, but some of which are commands to indicate paragraphs, figures, or mathematical structures like subscripts and superscripts.
    38. Silicon compilers • A silicon compiler has a source language that is similar or identical to a conventional programming language. • However, the variables of the language represent not locations in memory but logical signals (0 or 1) or groups of signals in a switching circuit.
    39. Query interpreters • A query interpreter translates a predicate containing relational and Boolean operators into commands to search a database for records satisfying that predicate.
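A minimal sketch of that translation, with an in-memory list of dictionaries standing in for the database. The predicate encoding, the operator set, and the names `compile_predicate` and `search` are assumptions for illustration; a real query interpreter would emit commands for an actual storage engine.

```python
import operator

# Relational operators supported by this sketch.
RELATIONAL = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def compile_predicate(pred):
    """Translate a predicate tree into a function that tests one record."""
    op = pred[0]
    if op in RELATIONAL:
        _, field, value = pred
        compare = RELATIONAL[op]
        return lambda record: compare(record[field], value)
    if op == "and":
        left, right = compile_predicate(pred[1]), compile_predicate(pred[2])
        return lambda record: left(record) and right(record)
    if op == "or":
        left, right = compile_predicate(pred[1]), compile_predicate(pred[2])
        return lambda record: left(record) or right(record)
    if op == "not":
        inner = compile_predicate(pred[1])
        return lambda record: not inner(record)
    raise ValueError(f"unknown operator {op!r}")

def search(records, pred):
    """Return the records that satisfy the predicate."""
    matches = compile_predicate(pred)
    return [record for record in records if matches(record)]

records = [
    {"name": "Ali", "age": 21, "dept": "CS"},
    {"name": "Sara", "age": 25, "dept": "EE"},
    {"name": "Majid", "age": 30, "dept": "CS"},
]
# dept = "CS" and not (age < 25)
query = ("and", ("=", "dept", "CS"), ("not", ("<", "age", 25)))
print([record["name"] for record in search(records, query)])
# ['Majid']
```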
    40. JIT compilation