Interm codegen

Published in: Technology

  1. 1. Intermediate Code Generation. Sarfaraz Masood, Asst. Prof., Department of Computer Engineering, Jamia Millia University, New Delhi
  2. 2. CS 540 GMU Spring 2007 2 Compiler Architecture Scanner (lexical analysis) Parser (syntax analysis) Code Optimizer Semantic Analysis (IC generator) Code Generator Symbol Table Source language tokens Syntactic structure Intermediate Code Target language Intermediate Code Intermediate Code
  3. 3. Joey Paquet, 2000, 2002 3 Introduction to Code Generation • Front end: – Lexical Analysis – Syntactic Analysis – Intermediate Code Generation • Back end: – Intermediate Code Optimization – Object Code Generation • The front end is machine-independent, i.e. it can be reused to build compilers for different architectures • The back end is machine-dependent, i.e. these steps are related to the nature of the assembly or machine language of the target architecture
  4. 4. 08/31/13 4 Introduction to Code Generation Target-1 Code Generator Target-2 Code Generator Intermediate-code Optimizer Language-1 Front End Source program in Language-1 Language-2 Front End Source program in Language-2 Non-optimized Intermediate Code Optimized Intermediate Code Target-1 machine code Target-2 machine code
  5. 5. Joey Paquet, 2000, 2002 5 Introduction to Code Generation • After syntactic analysis, we have a number of options to choose from: – generate object code directly from the parse – generate intermediate code, and then generate object code from it – generate an intermediate abstract representation, and then generate code directly from it – generate an intermediate abstract representation, generate intermediate code, and then the object code • All these options have one thing in common: they are all based on syntactic information gathered in the semantic analysis
  6. 6. Joey Paquet, 2000, 2002 6 Introduction to Code Generation Syntactic Analyzer Object Code Syntactic Analyzer Intermediate Representation Object Code Lexical Analyzer Lexical Analyzer Lexical Analyzer Syntactic Analyzer Intermediate Representation Intermediate Code Object Code Syntactic Analyzer Intermediate Code Object Code Lexical Analyzer Front End Back End
  7. 7. Intermediate Representation (IR): a kind of abstract machine language that can express the target machine's operations without committing to too many machine details. • Why IR?
  8. 8. 08/31/13 8 Without IR C Pascal FORTRAN C++ SPARC HP PA x86 IBM PPC
  9. 9. 08/31/13 9 With IR C Pascal FORTRAN C++ SPARC HP PA x86 IBM PPC IR
  10. 10. 08/31/13 10 With IR C Pascal FORTRAN C++ IR Common Backend ?
  11. 11. 08/31/13 11 Advantages of Using an Intermediate Language 1. Retargeting - Build a compiler for a new machine by attaching a new code generator to an existing front-end. 2. Optimization - reuse intermediate code optimizers in compilers for different languages and different machines. Note: the terms “intermediate code”, “intermediate language”, and “intermediate representation” are all used interchangeably.
  12. 12. Issues in Designing an IR • Whether to use an existing IR – if the target machine architecture is similar – if the new language is similar • Whether the IR is appropriate for the kind of optimizations to be performed – e.g. speculation and predication – some transformations may take much longer than they would on a different IR
  13. 13. Issues in Designing an IR • Designing a new IR needs to consider – Level (how machine dependent it is) – Structure – Expressiveness – Appropriateness for general and special optimizations – Appropriateness for code generation – Whether multiple IRs should be used
  14. 14. What are the IRs in actual compilers? • gcc, a widely used compiler on many platforms, uses two IRs: AST (Abstract Syntax Tree) and RTL (Register Transfer Language); some development paths use Tree-SSA [SSA: Static Single Assignment, where each name is assigned once. We will talk about this later!] • A VM can be seen as a new type of IR: Java bytecode, .NET IL • Some programming languages have well-defined intermediate languages: – Java: Java Virtual Machine – Prolog: Warren Abstract Machine • In fact, there are byte-code emulators that execute instructions in these intermediate languages.
  15. 15. Intermediate Code Generation • Direct Translation – Using SDT scheme – Parse tree to Three-Address Instructions – Can be done while parsing in a single pass – Needs to be able to deal with Syntactic Errors and Recovery • Indirect Translation – First validate parsing constructing of AST – Uses SDT scheme to build AST – Traverse the AST and generate Three Address Instructions Intermediate Code Generation O(n) IR IR Three-Address Instructions ∞ regs Parse tree AST
  16. 16. 1. Syntax Tree vs DAG: syntax-directed definition to produce an AST for assignment statements
      production      semantic rules
      S → id := E     S.nptr := mknode('assign', mkleaf(id, id.entry), E.nptr)
      E → E1 + E2     E.nptr := mknode('+', E1.nptr, E2.nptr)
      E → E1 ∗ E2     E.nptr := mknode('∗', E1.nptr, E2.nptr)
      E → −E1         E.nptr := mkunode('uminus', E1.nptr)
      E → (E1)        E.nptr := E1.nptr
      E → id          E.nptr := mkleaf(id, id.entry)
  17. 17. Syntax Tree vs DAG. Syntax tree for a := (−b + c∗d) + c∗d: assign(a, +(+(uminus(b), ∗(c, d)), ∗(c, d)))
  18. 18. Syntax Tree vs DAG • If mknode returns a pointer to an existing node whenever possible, a DAG can be produced. For a := (−b + c∗d) + c∗d: (a) the syntax tree contains two separate c∗d subtrees; (b) the DAG contains a single ∗ node for c∗d, with leaves c and d, shared by both + nodes.
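The sharing described on this slide can be sketched in Python. This is only an illustration under stated assumptions: the `nodes` list, the `table` dictionary, and the tuple encoding of nodes are hypothetical choices, not part of the slides; only the names `mknode` and `mkleaf` come from the deck.

```python
# Hypothetical sketch: mknode looks up a table first, so structurally
# identical nodes are created once and shared (syntax tree -> DAG).
nodes = []   # node id -> (op, child ids or leaf name)
table = {}   # (op, children...) -> node id

def mknode(op, *children):
    key = (op, *children)
    if key not in table:          # build the node only the first time
        table[key] = len(nodes)
        nodes.append(key)
    return table[key]             # otherwise return the existing node

def mkleaf(name):
    return mknode("id", name)

# a := (-b + c*d) + c*d
cd1 = mknode("*", mkleaf("c"), mkleaf("d"))
left = mknode("+", mknode("uminus", mkleaf("b")), cd1)
cd2 = mknode("*", mkleaf("c"), mkleaf("d"))   # same node as cd1
root = mknode("assign", mkleaf("a"), mknode("+", left, cd2))

print(cd1 == cd2, len(nodes))
```

With sharing, the expression needs 9 nodes instead of the 12 nodes of the plain syntax tree.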
  19. 19. 2. Postfix Notation: a mathematical notation wherein every operator follows all of its operands. Form rules: 1. If E is a variable/constant, the PN of E is E itself. 2. If E is an expression of the form E1 op E2, the PN of E is E1′ E2′ op (E1′ and E2′ are the PN of E1 and E2, respectively). 3. If E is a parenthesized expression of the form (E1), the PN of E is the same as the PN of E1. The PN of the expression 9*(5+2) is 952+*. How about (a+b)/(c-d)? ab+cd-/
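The three form rules map directly onto a recursive function. The sketch below assumes expressions are given as nested tuples `(op, left, right)` with strings as leaves; that encoding and the name `to_postfix` are illustrative assumptions. Parentheses disappear because the tuple nesting already records grouping (rule 3).

```python
def to_postfix(e):
    """Return the postfix notation (PN) of an expression tree."""
    if isinstance(e, str):          # rule 1: variable or constant
        return e
    op, left, right = e             # rule 2: E1 op E2 -> E1' E2' op
    return to_postfix(left) + to_postfix(right) + op

print(to_postfix(("*", "9", ("+", "5", "2"))))               # 952+*
print(to_postfix(("/", ("+", "a", "b"), ("-", "c", "d"))))   # ab+cd-/
```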
  20. 20. 3. Static Single-Assignment Form • Static single-assignment form (SSA) is an intermediate representation that facilitates certain code optimizations. • Two distinct aspects distinguish SSA from three-address code. – All assignments in SSA are to variables with distinct names; hence the term static single-assignment. – SSA uses a φ-function to combine the definitions of a variable that arrive along different control-flow paths.
  21. 21. 3. Static Single-Assignment Form. Source: if (flag) x = -1; else x = 1; y = x * a; SSA form: if (flag) x1 = -1; else x2 = 1; x3 = φ(x1, x2); y = x3 * a;
  22. 22. 4. Three Address Instructions IR • Constructs mapped to Three-Address Instructions – Register-based IR for expression evaluation – Infinite number of virtual registers – Still independent of target architecture • Generic Statement Format: Label: x = y op z or if exp goto L – Statements can have symbolic labels – Compiler inserts temporary variables – Types and conversions are dealt with in other phases of the code generation
  23. 23. Types of Three-address Statements • Assignment – Binary: x := y op z – Unary: x := op y – “op” can be any reasonable arithmetic or logic operator. • Copy – Simple: x := y – Indexed: x := y[i] or x[i] := y – Address and pointer manipulation: • x := &y • x := * y • *x := y
  24. 24. Types of Three-address Statements • Jump – Unconditional: goto L – Conditional: if x relop y goto L1 [else goto L2], where relop is <, =, >, ≥, ≤, or ≠. • Procedure call – Call procedure P(X1, X2, ..., Xn): PARAM X1 PARAM X2 ... PARAM Xn CALL P, n
  25. 25. implementations of three-address statements • common implementations: – Quadruples – Triples – indirect triples Consider the code: a := b * -c + b * -c
  26. 26. Quadruples • A quadruple is a record structure with four fields: op, arg1, arg2, and result – The op field contains an internal code for an operator – Statements with unary operators do not use arg2 – Operators like param use neither arg2 nor result – The target label of a conditional or unconditional jump is stored in result • The contents of fields arg1, arg2, and result are typically pointers to symbol table entries – If so, temporaries must be entered into the symbol table as they are created – Constants need to be handled differently
  27. 27. Quadruples Example: a := b * -c + b * -c
           op      arg1  arg2  result
      (0)  uminus  c           t1
      (1)  *       b     t1    t2
      (2)  uminus  c           t3
      (3)  *       b     t3    t4
      (4)  +       t2    t4    t5
      (5)  :=      t5          a
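The quadruple table above can be reproduced with a short Python sketch. It assumes the expression is given as nested tuples, `(op, operand)` for unary and `(op, left, right)` for binary operators; the function name `gen_quads` and this encoding are hypothetical, and names stand in for the symbol-table pointers a real compiler would store.

```python
from itertools import count

def gen_quads(target, expr):
    """Return a list of (op, arg1, arg2, result) quadruples for target := expr."""
    quads = []
    temps = count(1)

    def walk(e):
        if isinstance(e, str):              # a name: no code needed
            return e
        if len(e) == 2:                     # unary operator: arg2 unused
            op, arg1, arg2 = e[0], walk(e[1]), None
        else:                               # binary operator
            op, arg1, arg2 = e[0], walk(e[1]), walk(e[2])
        result = f"t{next(temps)}"          # fresh temporary for the result
        quads.append((op, arg1, arg2, result))
        return result

    place = walk(expr)
    quads.append((":=", place, None, target))
    return quads

term = ("*", "b", ("uminus", "c"))
quads = gen_quads("a", ("+", term, term))
for i, q in enumerate(quads):
    print(f"({i})", *("" if f is None else f for f in q))
```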
  28. 28. Triples • Triples refer to a temporary value by the position of the statement that computes it – Statements can be represented by a record with only three fields: op, arg1, and arg2 – Avoids the need to enter temporary names into the symbol table • Contents of arg1 and arg2: – Pointer into symbol table (for programmer defined names) – Pointer into triple structure (for temporaries) – Of course, still need to handle constants differently
  29. 29. Triples Example: a := b * -c + b * -c (the result is implicit in triples)
           op      arg1  arg2
      (0)  uminus  c
      (1)  *       b     (0)
      (2)  uminus  c
      (3)  *       b     (2)
      (4)  +       (1)   (3)
      (5)  assign  a     (4)
  30. 30. An indexed assignment requires two triples: x[i] := y
           op    arg1  arg2
      (0)  []=   x     i
      (1)  :=    (0)   y
  31. 31. Indirect triples • Indirect triples add a list of pointers to triples, so that triples can be shared and moved easily. Example: a := b * -c + b * -c
      statement list        triples
      (0)  (14)                   op      arg1  arg2
      (1)  (15)             (14)  uminus  c
      (2)  (16)             (15)  *       b     (14)
      (3)  (17)             (16)  uminus  c
      (4)  (18)             (17)  *       b     (16)
      (5)  (19)             (18)  +       (15)  (17)
                            (19)  assign  a     (18)
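A small Python sketch can show why the extra statement list helps: the triples refer to each other by position, so reordering the statement list moves code without renumbering any operand. The encoding (integers for triple references, strings for names) and the toy interpreter `run` are assumptions made for illustration only.

```python
# Triples for a := b * -c + b * -c; an int operand refers to another triple.
triples = [
    ("uminus", "c", None),   # (0)
    ("*", "b", 0),           # (1)
    ("uminus", "c", None),   # (2)
    ("*", "b", 2),           # (3)
    ("+", 1, 3),             # (4)
    ("assign", "a", 4),      # (5)
]

def run(triples, order, env):
    """Interpret the triples in the order given by the statement list."""
    env = dict(env)
    values = {}                                  # triple index -> computed value
    val = lambda x: values[x] if isinstance(x, int) else env[x]
    for i in order:
        op, a1, a2 = triples[i]
        if op == "uminus":
            values[i] = -val(a1)
        elif op == "*":
            values[i] = val(a1) * val(a2)
        elif op == "+":
            values[i] = val(a1) + val(a2)
        elif op == "assign":
            env[a1] = val(a2)
    return env

env = {"b": 3, "c": 2}
original = run(triples, [0, 1, 2, 3, 4, 5], env)
reordered = run(triples, [2, 3, 0, 1, 4, 5], env)   # only the list changed
print(original["a"], reordered["a"])
```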
  32. 32. Syntax-directed translation into three-address code
      production      semantic rules
      S → id := E     S.code := E.code ‖ gen(id.place ':=' E.place)
      E → E1 + E2     E.place := newtemp; E.code := E1.code ‖ E2.code ‖ gen(E.place ':=' E1.place '+' E2.place)
      E → E1 ∗ E2     ............
      E → −E1         E.place := newtemp; E.code := E1.code ‖ gen(E.place ':=' 'uminus' E1.place)
      E → (E1)        E.place := E1.place; E.code := E1.code
      E → id          E.place := id.place; E.code := ''
  33. 33. Syntax-directed translation into three-address code
      production            semantic rules
      S → while E do S1     S.begin := newlabel; S.after := newlabel;
                            S.code := gen(S.begin ':') ‖ E.code ‖ gen('if' E.place '=' '0' 'goto' S.after) ‖ S1.code ‖ gen('goto' S.begin) ‖ gen(S.after ':')
  34. 34. Declarations • enter symbols in a symbol table • allocate space and record it in the symbol table • emit appropriate code
  35. 35. Declarations in a procedure • Computing types and relative addresses of names
      P → {offset := 0} D
      D → D ; D
      D → id : T                  {enter(id.name, T.type, offset); offset := offset + T.width}
      T → integer                 {T.type := integer; T.width := 4}
      T → real                    {T.type := real; T.width := 8}
      T → array [ num ] of T1     {T.type := array(num.val, T1.type); T.width := num.val × T1.width}
      T → ↑T1                     {T.type := pointer(T1.type); T.width := 4}
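The offset bookkeeping in these actions can be sketched in Python. The widths (4, 8, 4) are the ones on the slide; the tuple encoding of types and the function names `type_info` and `layout` are hypothetical.

```python
def type_info(t):
    """Return (type, width) for a type expression, following the slide's rules."""
    if t == "integer":
        return "integer", 4
    if t == "real":
        return "real", 8
    if isinstance(t, tuple) and t[0] == "array":      # ("array", num, T1)
        _, num, elem = t
        elem_type, elem_width = type_info(elem)
        return ("array", num, elem_type), num * elem_width
    if isinstance(t, tuple) and t[0] == "pointer":    # ("pointer", T1)
        elem_type, _ = type_info(t[1])
        return ("pointer", elem_type), 4
    raise ValueError(f"unknown type: {t!r}")

def layout(decls):
    """Enter each name with its type and relative address; return (table, total size)."""
    offset = 0
    table = {}
    for name, t in decls:
        typ, width = type_info(t)
        table[name] = (typ, offset)    # enter(id.name, T.type, offset)
        offset += width                # offset := offset + T.width
    return table, offset

table, size = layout([
    ("i", "integer"),
    ("x", "real"),
    ("a", ("array", 10, "integer")),
    ("p", ("pointer", "real")),
])
for name, (typ, off) in table.items():
    print(name, off, typ)
print("total", size)
```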
  36. 36. Syntax-Directed Translation to Three Address Code • Attributes for the non-terminals, say E and S – Location of the value of an expression: E.place – The code that evaluates the expression or statement: E.code – Markers for the beginning and end of sections of the code: S.begin, S.end • Semantic actions in productions of the grammar – Functions to create temporaries (newtemp) and labels (newlabel) – Auxiliary functions to enter symbols and consult the types corresponding to declarations in a side data structure that can be built as the code is being parsed: a symbol table – To generate the code we use the emit function gen, which creates a list of instructions to be emitted later and can generate symbolic labels corresponding to the next instruction of a list – Use of an append function on lists of instructions – Synthesized and inherited attributes
  37. 37. Assignment Statements: Grammar and Actions S → id = E { p = lookup(id.name); if (p != NULL) S.code = append(E.code, gen(p '=' E.place)); else { error; S.code = nulllist; } } E → E1 + E2 { E.place = newtemp(); E.code = append(E1.code, E2.code, gen(E.place '=' E1.place '+' E2.place)); } E → E1 * E2 { E.place = newtemp(); E.code = append(E1.code, E2.code, gen(E.place '=' E1.place '*' E2.place)); }
  38. 38. Assignment Statements: Grammar and Actions E → - E1 {E.place = newtemp(); E.code = append(E1.code,gen(E.place ‘=‘ ‘-’ E1.place)); } E → (E1) {E.place = E1.place; E.code = E1.code; } E → id {p = lookup(id.name); if (p != NULL) E.place = p; else error; E.code = nulllist; }
  39. 39. Assignment: Example x = a * b + c * d - e * f; [Figure: parse tree of the statement]
  40. 40. Assignment: Example x = a * b + c * d - e * f; Production: E → id { p = lookup(id.name); if (p != NULL) E.place = p; else error; E.code = null list; } Annotations: e: place = loc(e), code = null
  41. 41. Assignment: Example x = a * b + c * d - e * f; Production: E → id { p = lookup(id.name); if (p != NULL) E.place = p; else error; E.code = null list; } Annotations: e: place = loc(e), code = null; f: place = loc(f), code = null
  42. 42. Assignment: Example x = a * b + c * d - e * f; Production: E → E1 * E2 { E.place = newtemp(); E.code = gen(E.place '=' E1.place '*' E2.place); } Annotations: e: place = loc(e), code = null; f: place = loc(f), code = null; e * f: place = loc(t1), code = {t1 = e * f;}
  43. 43. Assignment: Example x = a * b + c * d - e * f; Production: E → E1 * E2 { E.place = newtemp(); E.code = gen(E.place '=' E1.place '*' E2.place); } Annotations: e: place = loc(e), code = null; f: place = loc(f), code = null; e * f: place = loc(t1), code = {t1 = e * f;}; c: place = loc(c), code = null; d: place = loc(d), code = null; c * d: place = loc(t2), code = {t2 = c * d;}
  44. 44. Assignment: Example x = a * b + c * d - e * f; Production: S → id = E { p = lookup(id.name); if (p != NULL) S.code = append(E.code, gen(p '=' E.place)); else error; } Annotations: e: place = loc(e), code = null; f: place = loc(f), code = null; e * f: place = loc(t1), code = {t1 = e * f;}; c: place = loc(c), code = null; d: place = loc(d), code = null; c * d: place = loc(t2), code = {t2 = c * d;}; c * d - e * f: place = loc(t3), code = {t1 = e * f; t2 = c * d; t3 = t2 - t1;}; a: place = loc(a), code = null; b: place = loc(b), code = null; a * b: place = loc(t4), code = {t4 = a * b;}; whole expression: place = loc(t5), code = {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3;}; x: place = loc(x), code = null; S: code = {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3; x = t5;}
  45. 45. Assignment: Example x = a * b + c * d - e * f; Generated code: t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3; x = t5;
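The grammar actions for assignments can be sketched as a recursive Python function that returns the `place` and `code` attributes of each subexpression. This is a sketch under stated assumptions: the tuple encoding of the tree, the function name `translate_assign`, and the use of a plain set as the symbol table are all hypothetical. The sketch visits operands left to right on the tree `(a*b + c*d) - e*f`, so its temporaries are numbered differently from the slides, which process the tree in another order.

```python
from itertools import count

def translate_assign(target, expr, symtab):
    """Return the three-address code for target = expr."""
    temps = count(1)

    def newtemp():
        return f"t{next(temps)}"

    def lookup(name):
        if name not in symtab:
            raise NameError(f"undeclared identifier: {name}")
        return name

    def translate(e):
        """Return (place, code) for expression e."""
        if isinstance(e, str):                     # E -> id
            return lookup(e), []
        if len(e) == 2:                            # E -> - E1
            p1, c1 = translate(e[1])
            place = newtemp()
            return place, c1 + [f"{place} = - {p1}"]
        op, e1, e2 = e                             # E -> E1 op E2
        p1, c1 = translate(e1)
        p2, c2 = translate(e2)
        place = newtemp()
        return place, c1 + c2 + [f"{place} = {p1} {op} {p2}"]

    p = lookup(target)                             # S -> id = E
    place, code = translate(expr)
    return code + [f"{p} = {place}"]

tree = ("-", ("+", ("*", "a", "b"), ("*", "c", "d")), ("*", "e", "f"))
code = translate_assign("x", tree, set("abcdefx"))
print("\n".join(code))
```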
  46. 46. Reusing Temporary Variables • Temporary Variables – Short lived – Used for Evaluation of Expressions – Clutter the Symbol Table • Change the newtemp Function – Keep track of when the value created in a temporary is used – Use a counter to keep track of the number of active temps – When a temporary is used in an expression decrement counter – When a temporary is generated by newtemp increment counter – Initialize counter to zero
  47. 47. Assignment: Example • Only 2 Registers Needed x = a * b + c * d - e * f; S id = E E* id a id b x E E E * id c id d E E E* id e id f E E + E - // c = 0 t1 = e * f; // c = 1 t2 = c * d; // c = 2 t1 = t2 - t1; // c = 1 t2 = a * b; // c = 2 t1 = t2 + t1; // c = 1 x = t1; // c = 0
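The counter scheme can be sketched in Python as follows. The class `TempPool` and its `live` set are hypothetical helpers; the sketch assumes temporaries are consumed in last-in, first-out order, which holds for expression evaluation, and that no source variable is named like a temporary. It evaluates `(a*b + c*d) - e*f` left to right, so the instruction order differs from the slide, but two temporaries still suffice.

```python
class TempPool:
    """Counter-based temporaries: increment on newtemp, decrement on use."""
    def __init__(self):
        self.c = 0
        self.live = set()          # temporaries holding a not-yet-used value

    def newtemp(self):
        self.c += 1
        name = f"t{self.c}"
        self.live.add(name)
        return name

    def use(self, place):
        if place in self.live:     # only temporaries affect the counter
            self.live.discard(place)
            self.c -= 1

def translate(e, pool, code):
    """Emit code for expression e and return the place holding its value."""
    if isinstance(e, str):
        return e
    op, e1, e2 = e
    p1 = translate(e1, pool, code)
    p2 = translate(e2, pool, code)
    pool.use(p1)                   # operands are consumed first...
    pool.use(p2)
    t = pool.newtemp()             # ...so the result can reuse their slot
    code.append(f"{t} = {p1} {op} {p2}")
    return t

pool, code = TempPool(), []
tree = ("-", ("+", ("*", "a", "b"), ("*", "c", "d")), ("*", "e", "f"))
place = translate(tree, pool, code)
pool.use(place)
code.append(f"x = {place}")
print("\n".join(code))
```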
  48. 48. Boolean & Relational Values How should the compiler represent them? • Answer depends on the target machine Two classic approaches • Numerical representation • Positional (implicit) representation Correct choice depends on both context and ISA
  49. 49. • Issue: Control Flow Introduces Complications – In Both Representations – Need to Know Address to Jump To in Some Cases • Solution: Two Additional Attributes – nextstat (Inherited) Indicates the next location to be generated – laststat (Synthesized) Indicates the last location filled – As code is generated the attributes are filled with the correct value SDT Scheme for Boolean Expressions
  50. 50. Boolean Expression: Grammar and Actions E → false {E.place = newtemp() E.code = {gen(E.place = 0)} E.laststat = E.nextstat + 1 } E → true {E.place = newtemp() E.code = {gen(E.place = 1)} E.laststat = E.nextstat + 1 }
  51. 51. Boolean Expression: Grammar and Actions E → (E1) {E.place = E1.place; E.code = E1.code; E1.nextstat = E.nextstat E.laststat = E1.laststat } E → not E1 {E.place = newtemp() E.code = append(E1.code,gen(E.place = not E1.place)) E1.nextstat = E.nextstat E.laststat = E1.laststat + 1 }
  52. 52. Boolean Expression: Grammar and Actions E → E1 or E2 { E.place = newtemp() E.code = append(E1.code, E2.code, gen(E.place = E1.place or E2.place)) E1.nextstat = E.nextstat E2.nextstat = E1.laststat E.laststat = E2.laststat + 1 }
  53. 53. Boolean Expression: Grammar and Actions E → E1 and E2 { E.place = newtemp() E.code = append(E1.code, E2.code, gen(E.place = E1.place and E2.place)) E1.nextstat = E.nextstat E2.nextstat = E1.laststat E.laststat = E2.laststat + 1 }
  54. 54. Boolean Expression: Grammar and Actions E → id1 relop id2 { E.place = newtemp() E.code = gen(if id1.place relop id2.place goto E.nextstat+3) E.code = append(E.code, gen(E.place = 0)) E.code = append(E.code, gen(goto E.nextstat+4)) E.code = append(E.code, gen(E.place = 1)) E.laststat = E.nextstat + 4 }
  55. 55. Boolean Expressions: Example a < b or c < d and e < f
      00: if a < b goto 03
      01: t1 = 0
      02: goto 04
      03: t1 = 1
      04: if c < d goto 07
      05: t2 = 0
      06: goto 08
      07: t2 = 1
      08: if e < f goto 11
      09: t3 = 0
      10: goto 12
      11: t3 = 1
      12: t4 = t2 and t3
      13: t5 = t1 or t4
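The numerical representation can be sketched in Python by appending instructions to a list, so that the list length plays the role of nextstat. The function name `gen_bool` and the tuple encoding of expressions are assumptions; the jump targets follow the example (the `if` jumps three statements ahead, the `goto` lands just past the four-statement block).

```python
from itertools import count

def gen_bool(e, code, temps):
    """Append code computing e as 0/1 into a temporary; return that temporary."""
    op = e[0]
    if op in ("and", "or"):
        p1 = gen_bool(e[1], code, temps)
        p2 = gen_bool(e[2], code, temps)
        t = f"t{next(temps)}"
        code.append(f"{t} = {p1} {op} {p2}")
        return t
    if op == "not":
        p1 = gen_bool(e[1], code, temps)
        t = f"t{next(temps)}"
        code.append(f"{t} = not {p1}")
        return t
    _, x, y = e                      # E -> id1 relop id2
    t = f"t{next(temps)}"
    n = len(code)                    # address of the next statement
    code += [
        f"if {x} {op} {y} goto {n + 3:02d}",
        f"{t} = 0",
        f"goto {n + 4:02d}",
        f"{t} = 1",
    ]
    return t

code = []
expr = ("or", ("<", "a", "b"), ("and", ("<", "c", "d"), ("<", "e", "f")))
result = gen_bool(expr, code, count(1))
for addr, instr in enumerate(code):
    print(f"{addr:02d}: {instr}")
```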
  56. 56. Control Flow Statements: Code Layout E.code S1.code S → if E then S1 S → if E then S1else S2 E.code S1.code S2.code goto S.next to E.true to E.false E.true: E.false: to E.true to E.false E.true: E.false: S.next: • Attributes: – E.true: the label to which control flows if E is true – E.false: the label to which control flows if E is false – S.next: an inherited attribute with the symbolic label of the code following S
  57. 57. Control Flow Statements: Code Layout S → while E do S1. Layout: S.begin: E.code (jumps to E.true or to E.false); E.true: S1.code; goto S.begin; E.false: (code following the loop) • Difficulty: need to know where to jump to – Introduce symbolic labels using the newlabel function – Use inherited attributes – Backpatch them later with the actual value (later…)
  58. 58. Control Flow Statements: Grammar and Actions S → if E then S1 { E.true = newlabel E.false = S.next S1.next = S.next S.code = append(E.code,gen(E.true:),S1.code) }
  59. 59. Control Flow Statements: Grammar and Actions S → if E then S1 else S2 { E.true = newlabel E.false = newlabel S1.next = S.next S2.next = S.next S.code = append(E.code,gen(E.true:),S1.code, gen(goto S.next),gen(E.false :),S2.code) }
  60. 60. Control Flow Statements: Grammar and Actions S → while E do S1 { S.begin = newlabel E.true = newlabel E.false = S.next S1.next = S.begin S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) }
  61. 61. Control Flow Translation of Boolean Expressions • Short-Circuit Evaluation – No Need to Evaluate portions of the expression if the outcome is already determined – Examples: • E1 or E2 need not evaluate E2 if E1 is known to be true. • E1 and E2 need not evaluate E2 if E1 is known to be false. • Use Control Flow – Jump over code that evaluates boolean terms of the expression – Use Inherited E.false and E.true attributes and link evaluation of E
  62. 62. Control Flow Translation of Boolean Expressions E → E1 or E2 { E1.true = E.true E1.false = newlabel E2.true = E.true E2.false = E.false E.code = append(E1.code,gen(E1.false:),E2.code) } E → E1 and E2 {E1.false = E.false E1.true = newlabel E2.true = E.true E2.false = E.false E.code = append(E1.code,gen(E1.true:),E2.code) }
  63. 63. Control Flow Translation of Boolean Expressions E → id1 relop id2 {E.code = append(gen(if id1.place relop id2.place goto E.true), gen(goto E.false)) } E → true {E.code = gen(goto E.true) } E → false {E.code = gen(goto E.false) } E → not E1 {E1.true = E.false E1.false = E.true E.code = E1.code } E → ( E1 ) { E1.true = E.true E1.false = E.false E.code = E1.code }
  64. 64. Boolean Expression: Short Circuit Evaluation a < b or c < d and e < f E E E id relop id E id relop id E a b c d e f<< < or andid relop id
  65. 65. Boolean Expression: Short Circuit Evaluation a < b or c < d and e < f E E E id relop id E id relop id E a b c d e f<< < or and E.true = Ltrue E.false = Lfalse E1.true = Ltrue E1.false = L1 id relop id E → id1 relop id2 ‖ E.code = append( gen(if id1.place relop id2.place goto E.true), gen(goto E.false)) E → E1 or E2 ‖ E1.true = E.true E1.false = newlabel E2.true = E.true E2.false = E.false E.code = append(E1.code, gen(E1.false:),E2.code) E2.true = Ltrue E2.false = Lfalse if a < b goto Ltrue goto L1 L1:
  66. 66. Boolean Expression: Short Circuit Evaluation a < b or c < d and e < f id relop id E → id1 relop id2 ‖ E.code = append( gen(if id1.place relop id2.place goto E.true), gen(goto E.false)) E → E1 and E2 ‖ E1.false = E.false E1.true = newlabel E2.true = E.true E2.false = E.false E.code = append(E1.code, gen(E1.true:),E2.code) if a < b goto Ltrue goto L1 E2.true = Ltrue E2.false = Lfalse L1: if c < d goto L2 goto Lfalse L2: E E E id relop id E id relop id E a b c d e f<< < or and E.true = Ltrue E.false = Lfalse E1.true = Ltrue E1.false = L1 E2.true = Ltrue E2.false = Lfalse E1.true = L2 E1.false = Lfalse
  67. 67. Boolean Expression: Short Circuit Evaluation a < b or c < d and e < f E E E id relop id E id relop id E a b c d e f<< < or and E.true = Ltrue E.false = Lfalse E1.true = Ltrue E1.false = L1 id relop id E → id1 relop id2 ‖ E.code = append( gen(if id1.place relop id2.place goto E.true), gen(goto E.false)) E → E1 and E2 ‖ E1.false = E.false E1.true = newlabel E2.true = E.true E2.false = E.false E.code = append(E1.code, gen(E1.true:),E2.code) E2.true = Ltrue E2.false = Lfalse if a < b goto Ltrue goto L1 E2.true = Ltrue E2.false = Lfalse E1.true = L2 E1.false = Lfalse L1: if c < d goto L2 goto Lfalse L2: if e < f goto Ltrue goto Lfalse
  68. 68. Boolean Expression: Short Circuit Evaluation a < b or c < d and e < f E E E id relop id E id relop id E a b c d e f<< < or andid relop id if a < b goto Ltrue goto L1 L1: if c < d goto L2 goto Lfalse L2: if e < f goto Ltrue goto Lfalse
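The short-circuit rules can be sketched in Python by passing the inherited E.true and E.false labels down as parameters. The function name `jumping_code` and the tuple encoding are assumptions made for this illustration; the label names Ltrue and Lfalse are the ones used in the example.

```python
from itertools import count

def jumping_code(e, true, false, labels):
    """Return jumping code for boolean expression e.

    `true` and `false` are the inherited labels E.true and E.false.
    """
    op = e[0]
    if op == "or":
        l = f"L{next(labels)}"                     # E1.false = newlabel
        return (jumping_code(e[1], true, l, labels) + [f"{l}:"]
                + jumping_code(e[2], true, false, labels))
    if op == "and":
        l = f"L{next(labels)}"                     # E1.true = newlabel
        return (jumping_code(e[1], l, false, labels) + [f"{l}:"]
                + jumping_code(e[2], true, false, labels))
    if op == "not":                                # swap the two targets
        return jumping_code(e[1], false, true, labels)
    if op == "true":
        return [f"goto {true}"]
    if op == "false":
        return [f"goto {false}"]
    _, x, y = e                                    # E -> id1 relop id2
    return [f"if {x} {op} {y} goto {true}", f"goto {false}"]

expr = ("or", ("<", "a", "b"), ("and", ("<", "c", "d"), ("<", "e", "f")))
code = jumping_code(expr, "Ltrue", "Lfalse", count(1))
print("\n".join(code))
```

No temporaries are needed: the value of the expression is encoded entirely in which label control reaches.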
  69. 69. Combining Boolean and Control Flow Statements while a < b do if c < d then x = y + z else x = y - z S → while E do S1 ‖ S.begin = newlabel E.true = newlabel E.false = S.next S1.next = S.begin S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) [Figure: parse tree of the statement]
  70. 70. Combining Boolean and Control Flow Statements while a < b do if c < d then x = y + z else x = y - z S → while E do S1 ‖ S.begin = newlabel E.true = newlabel E.false = S.next S1.next = S.begin S.code = append(gen(S.begin:),E.code, gen(E.true:),S1.code, gen(goto S.begin)) S S E id relop id S E b c d< < do if id relop id while a then Sthen S.next = Lnext S.begin = L1 E.true = L2 E.false = Lnext S.next = L1
  71. 71. Combining Boolean and Control Flow Statements while a < b do if c < d then x = y + z else x = y - z S → while E do S1 ‖ S.begin = newlabel E.true = newlabel E.false = S.next S1.next = S.begin S.code = append(gen(S.begin:),E.code, gen(E.true:),S1.code, gen(goto S.begin)) S S E id relop id S E b c d< < do if id relop id while a then Sthen S.next = Lnext S.begin = L1 E.true = L2 E.false = Lnext S.next = L1 L1: if a < b goto L2 goto Lnext L2:
  72. 72. Combining Boolean and Control Flow Statements while a < b do if c < d then x = y + z else x = y - z S → while E do S1 ‖ S.begin = newlabel E.true = newlabel E.false = S.next S1.next = S.begin S.code = append(gen(S.begin:),E.code, gen(E.true:),S1.code, gen(goto S.begin)) S S E id relop id S E b c d< < do if id relop id while a then Sthen S.next = Lnext S.begin = L1 E.true = L2 E.false = Lnext S.next = L1 L1: if a < b goto L2 goto Lnext L2: if c < d goto L3 goto L4 L3:E.true = L3 E.false = L4 S1.next = L1 S2.next = L1
  73. 73. Combining Boolean and Control Flow Statements while a < b do if c < d then x = y + z else x = y - z S → while E do S1 ‖ S.begin = newlabel E.true = newlabel E.false = S.next S1.next = S.begin S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) Attributes for the while: S.next = Lnext, S.begin = L1, E.true = L2, E.false = Lnext; for the inner if: S.next = L1, E.true = L3, E.false = L4, S1.next = L1, S2.next = L1 Code so far: L1: if a < b goto L2 goto Lnext L2: if c < d goto L3 goto L4 L3: t1 = y + z x = t1 goto L1 L4:
  74. 74. Combining Boolean and Control Flow Statements while a < b do if c < d then x = y + z else x = y - z S → while E do S1 ‖ S.begin = newlabel E.true = newlabel E.false = S.next S1.next = S.begin S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) Attributes for the while: S.next = Lnext, S.begin = L1, E.true = L2, E.false = Lnext; for the inner if: S.next = L1, E.true = L3, E.false = L4, S1.next = L1, S2.next = L1 Code so far: L1: if a < b goto L2 goto Lnext L2: if c < d goto L3 goto L4 L3: t1 = y + z x = t1 goto L1 L4: t2 = y - z x = t2 goto L1 Lnext:
  75. 75. Combining Boolean and Control Flow Statements while a < b do if c < d then x = y + z else x = y - z Generated code: L1: if a < b goto L2 goto Lnext L2: if c < d goto L3 goto L4 L3: t1 = y + z x = t1 goto L1 L4: t2 = y - z x = t2 goto L1 Lnext:
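The combined translation can be sketched in Python with S.next passed down as an inherited parameter. The names `gen_stmt` and `gen_cond` and the tuple encoding of statements are hypothetical; conditions are limited to a single relational test to keep the sketch short.

```python
from itertools import count

def gen_cond(e, true, false):
    """Jumping code for a single relational test E -> id1 relop id2."""
    op, x, y = e
    return [f"if {x} {op} {y} goto {true}", f"goto {false}"]

def gen_stmt(s, next_label, labels, temps):
    """Return code for statement s; next_label is the inherited S.next."""
    kind = s[0]
    if kind == "assign":                         # x = y op z
        _, x, (op, y, z) = s
        t = f"t{next(temps)}"
        return [f"{t} = {y} {op} {z}", f"{x} = {t}"]
    if kind == "if":                             # if E then S1 else S2
        _, cond, s1, s2 = s
        true, false = f"L{next(labels)}", f"L{next(labels)}"
        return (gen_cond(cond, true, false) + [f"{true}:"]
                + gen_stmt(s1, next_label, labels, temps)
                + [f"goto {next_label}", f"{false}:"]
                + gen_stmt(s2, next_label, labels, temps))
    if kind == "while":                          # while E do S1
        _, cond, s1 = s
        begin, true = f"L{next(labels)}", f"L{next(labels)}"
        return ([f"{begin}:"] + gen_cond(cond, true, next_label) + [f"{true}:"]
                + gen_stmt(s1, begin, labels, temps)    # S1.next = S.begin
                + [f"goto {begin}"])
    raise ValueError(f"unknown statement kind: {kind}")

program = ("while", ("<", "a", "b"),
           ("if", ("<", "c", "d"),
            ("assign", "x", ("+", "y", "z")),
            ("assign", "x", ("-", "y", "z"))))
code = gen_stmt(program, "Lnext", count(1), count(1)) + ["Lnext:"]
print("\n".join(code))
```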
  76. 76. Loop Constructs Loops • Evaluate condition before loop (if needed) • Evaluate condition after loop • Branch back to the top (if needed) Why this structure? • Merges test with last block of loop body • Pre-test block to hold loop-invariant code • Post-test for increment instructions and test while, for, do, & until all fit this basic model Pre-test Loop head Post-test Next block B1 B2
  77. 77. Break & Skip Statements Many modern programming languages include a break • Exits from the innermost control-flow statement – Out of the innermost loop – Out of a case statement Translates into a jump • Targets a statement outside the control-flow construct • Creates a multiple-exit construct • skip in a loop goes to the next iteration Only makes sense if the loop has > 1 block [Figure: loop with pre-test, loop head with blocks B1 and B2, post-test, and next block; break in B1, skip in B2]
  78. 78. Break and Skip Statements • Need to keep track of enclosing control-flow constructs • Harder to have a clean SDT scheme… – Keep a stack of control-flow constructs – Use the S.next on top of the stack as the target for the break statement – For skip statements, need to keep track of the label of the post-test block in order to advance to the next iteration. This is harder since that code has not been generated yet. • Backpatching helps – Use a breaklist and a skiplist to be patched later.
  79. 79. Backpatching • Single Pass Solution to Code Generation? – No more symbolic labels - symbolic addresses instead – Emit code directly into an array of instructions – Actions associated with Productions – Executed when Bottom-Up Parser “Reduces” a production • Problem – Need to know the labels for target branches before actually generating the code for them. • Solution – Leave Branches undefined and patch them later – Requires: carrying around a list of the places that need to be patched until the value to be patched with is known.
  80. 80. Boolean Expressions Revisited • Use Additional ε-Production – Just a Marker M – Label Value M.addr • Attributes: – E.truelist: code places that need to be filled-in corresponding to the evaluation of E as “true”. – E.falselist: same for “false” (1) E → E1 or M E2 (2) | E1 and M E2 (3) | not E1 (4) | ( E1 ) (5) | id1 relop id2 (6) | true (7) | false (8) M → ε
  81. 81. Boolean Expressions: Code Outline E1.code E2.code E1 and E2 false ? true false ?true E1.code E2.code E1 or E2 true false ? false true
  82. 82. Action (8) M → ε { M.Addr := nextAddr; } (1) E → E1 or M E2 { backpatch(E1.falselist,M.Addr); E.truelist := merge(E1.truelist,E2.truelist); E.falselist := E2.falselist; } (2) E → E1 and M E2 { backpatch(E1.truelist,M.Addr); E.truelist := E2.truelist; E.falselist := merge(E1.falselist, E2.falselist); }
  83. 83. More Actions (3) E → not E1 { E.truelist := E1.falselist; E.falselist := E1.truelist; } (4) E → ( E1 ) { E.truelist := E1.truelist; E.falselist := E1.falselist; } (6) E → true { E.truelist := makelist(nextquad); emit('goto _'); } (7) E → false { E.falselist := makelist(nextquad); emit('goto _'); }
  84. 84. Backpatching Example: a < b or M (c < d and M e < f)
      Actions executed:
      E → id1 relop id2: { E.truelist := makelist(nextquad()); E.falselist := makelist(nextquad() + 1); emit("if id1.place relop.op id2.place goto _"); emit("goto _"); }
      M → ε: { M.quad = nextquad(); }
      E → E1 and M E2: { backpatch(E1.truelist, M.quad); E.truelist := E2.truelist; E.falselist := merge(E1.falselist, E2.falselist); }
      E → E1 or M E2: { backpatch(E1.falselist, M.quad); E.truelist := merge(E1.truelist, E2.truelist); E.falselist := E2.falselist; }
      Attribute values:
      a < b: E.truelist = {100}, E.falselist = {101}
      M before c < d: M.addr = 102
      c < d: E.truelist = {102}, E.falselist = {103}
      M before e < f: M.addr = 104
      e < f: E.truelist = {104}, E.falselist = {105}
      c < d and e < f: E.truelist = {104}, E.falselist = {103, 105}
      whole expression: E.truelist = {100, 104}, E.falselist = {103, 105}
      Generated code after backpatching:
      100: if a < b goto _
      101: goto 102
      102: if c < d goto 104
      103: goto _
      104: if e < f goto _
      105: goto _
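The backpatching example can be replayed in Python by calling the actions in the order a bottom-up parser would reduce the productions. The helper names `makelist`, `merge`, `backpatch`, `emit`, and `nextquad` come from the slides; the dictionary encoding of attributes, the `BASE` address of 100, and the functions `relop`, `and_`, and `or_` are assumptions for this sketch.

```python
BASE = 100      # address of the first quad, as in the example
code = []       # instruction array; code[i] lives at address BASE + i

def nextquad():
    return BASE + len(code)

def emit(instr):
    code.append(instr)

def makelist(q):
    return [q]

def merge(l1, l2):
    return l1 + l2

def backpatch(quads, target):
    for q in quads:                                   # fill in each "goto _"
        code[q - BASE] = code[q - BASE][:-1] + str(target)

def relop(x, op, y):                                  # E -> id1 relop id2
    e = {"truelist": makelist(nextquad()),
         "falselist": makelist(nextquad() + 1)}
    emit(f"if {x} {op} {y} goto _")
    emit("goto _")
    return e

def and_(e1, m, e2):                                  # E -> E1 and M E2
    backpatch(e1["truelist"], m)
    return {"truelist": e2["truelist"],
            "falselist": merge(e1["falselist"], e2["falselist"])}

def or_(e1, m, e2):                                   # E -> E1 or M E2
    backpatch(e1["falselist"], m)
    return {"truelist": merge(e1["truelist"], e2["truelist"]),
            "falselist": e2["falselist"]}

# Reductions in bottom-up order for: a < b or M (c < d and M e < f)
e1 = relop("a", "<", "b")
m1 = nextquad()                                       # M -> epsilon
e2 = relop("c", "<", "d")
m2 = nextquad()                                       # M -> epsilon
e3 = relop("e", "<", "f")
e_and = and_(e2, m2, e3)
e_or = or_(e1, m1, e_and)

for addr, instr in enumerate(code, start=BASE):
    print(f"{addr}: {instr}")
print(e_or)
```

The jumps still ending in `_` are exactly the entries of the final truelist and falselist; they are patched once the enclosing statement knows its targets.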
  85. 85. Control Flow Code Structures . . . E.code S1.codeE.true: E.false: if E then S1 . . . E.code S1.codeE.true: E.false: if E then S1 else S2 S.next: S2.code goto S.next . . . E.code S1.codeE.true: E.false: while E do S1 goto S.begin S.begin:
