COMPILER CONSTRUCTION Principles and Practice Kenneth C. Louden
8.  Code Generation   PART ONE
<ul><li>Generate executable code for a target machine that is  a faithful representation  of the semantics of the source c...
Contents <ul><li>Part One </li></ul><ul><li>8.1 Intermediate Code and Data Structure for code Generation </li></ul><ul><li...
8.1 Intermediate Code and Data Structures for Code Generation
8.1.1 Three-Address Code
<ul><li>A data structure that represents the source program during translation is called an  intermediate representation, ...
 
<ul><li>Figure 8.1 Sample TINY program: </li></ul><ul><li>{ sample program  </li></ul><ul><li>in TINY language --  compute...
<ul><li>The Three-address codes for above TINY program </li></ul><ul><li>read x t1=x>0 </li></ul><ul><li>if_false t1 goto ...
8.1.2 Data Structures for the Implementation of Three-Address Code
<ul><li>The most common implementation is to implement three-address code as  quadruple , which means that four fields are...
<ul><li>Quadruple implementation for the three-address code of the previous example  </li></ul><ul><ul><li>(rd, x , _ , _ ...
<ul><li>C code defining data structures for the quadruples </li></ul><ul><li>Typedef enum { rd, gt, if_f, asn, lab, mul, s...
<ul><li>A representation of the three-address code of the previous example as triples </li></ul><ul><ul><li>(0)  (rd, x , ...
8.1.3 P-Code
<ul><li>It was designed to be the actual code for a   hypothetical stack machine , called the  P-machine , for which an in...
<ul><li>The P-machine consists of  a code memory,  an  unspecified data memory for named variables , and  a stack for temp...
<ul><li>Comparison of P-Code to Three-Address Code   P-code is in many respects  closer to actual machine code  than three...
8.2 Basic Code Generation Techniques
8.2.1 Intermediate Code or Target Code as a Synthesized Attribute
<ul><li>Intermediate code generation can be viewed as an attribute computation. </li></ul><ul><li>This code becomes a synt...
1) P-Code  The expression (x=x+3)+4 has the following P-Code attribute: Lda x Lod x Ldc 3 Adi Stn Ldc 4 Adi Factor     id...
T1=x+3  x=t1 t2=t1+4 2) Three-Address Code Factor     id  factor.namenum.strval; factor.tacode=” “ Factor    num  factor...
8.2.2 Practical Code Generation
<ul><li>The basic algorithm can be described as the following recursive procedure </li></ul><ul><li>Procedure gencode (T: ...
<ul><li>Typedef enum {plus, assign} optype; </li></ul><ul><li>Typedef enum {OpKind, ConstKind, IdKind} NodeKind; </li></ul...
 
<ul><li>A genCode procedure to generate P-code. </li></ul><ul><li>Void genCode (SyntaxTree t) </li></ul><ul><li>{ char cod...
<ul><li>  case Assign: </li></ul><ul><li>  sprintf(codestr, “%s %s”, “lda”, t->strval); </li></ul><ul><li>emitcode(codestr...
<ul><li>  case IdKind: </li></ul><ul><li>  sprintf(codestr,”%s %s”,”lod”,t->strval); </li></ul><ul><li>emitCode(codestr); ...
<ul><li>Yacc specification for the generation P-code according to the attribute grammar of Table 8.1 </li></ul><ul><li>%{ ...
<ul><li>| aexp </li></ul><ul><li>; </li></ul><ul><li>aexp : aexp ‘+’ factor {emitCode(“adi”);} </li></ul><ul><li>| factor ...
8.2.3 Generation of Target Code from Intermediate Code
<ul><li>Code generation from intermediate code involves either or both of  two standard techniques:   </li></ul><ul><ul><l...
<ul><li>Consider the expression (x=x+3) +4, translate the P-code into three-address code: </li></ul><ul><li>Lad x </li></u...
 
 
<ul><li>We now consider the case of translating from three-address code to P-code, by  simple macro expansion . A three-ad...
<ul><li>Then, the three-address code for the expression (x=x+3)+4: </li></ul><ul><li>T1 = x + 3 </li></ul><ul><li>X = t1 <...
 
End of Part One THANKS
Upcoming SlideShare
Loading in...5
×

Chapter Eight(1)

1,228

Published on

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
1,228
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
54
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Chapter Eight(1)

  1. 1. COMPILER CONSTRUCTION Principles and Practice Kenneth C. Louden
  2. 2. 8. Code Generation PART ONE
  3. 3. <ul><li>Generate executable code for a target machine that is a faithful representation of the semantics of the source code </li></ul><ul><li>Depends not only on the characteristics of the source language but also on detailed information about the target architecture , the structure of the runtime environment , and t he operating system running on the target machine </li></ul>
  4. 4. Contents <ul><li>Part One </li></ul><ul><li>8.1 Intermediate Code and Data Structure for code Generation </li></ul><ul><li>8.2 Basic Code Generation Techniques </li></ul><ul><li>Other Parts </li></ul><ul><li>8.3 Code Generation of Data Structure Reference </li></ul><ul><li>8.4 Code Generation of Control Statements and Logical Expression </li></ul><ul><li>8.5 Code Generation of Procedure and Function calls </li></ul><ul><li>8.6 Code Generation on Commercial Compilers: Two Case Studies </li></ul><ul><li>8.7 TM: A Simple Target Machine </li></ul><ul><li>8.8 A Code Generator for the TINY Language </li></ul><ul><li>8.9 A Survey of Code Optimization Techniques </li></ul><ul><li>8.10 Simple Optimizations for TINY Code Generator </li></ul>
  5. 5. 8.1 Intermediate Code and Data Structures for Code Generation
  6. 6. 8.1.1 Three-Address Code
  7. 7. <ul><li>A data structure that represents the source program during translation is called an intermediate representation, or IR , for short </li></ul><ul><li>Such an intermediate representation that resembles target code is called intermediate code </li></ul><ul><ul><li>Intermediate code is particularly useful when the goal of the compiler is to produce extremely efficient code ; </li></ul></ul><ul><ul><li>Intermediate code can also be useful in making a compiler more easily retarget-able . </li></ul></ul><ul><li>Study two popular forms of intermediate code: Three -Address code and P-code </li></ul><ul><li>The most basic instruction of three-address code is designed to represent the evaluation of arithmetic expressions and has the following general form: </li></ul><ul><ul><li>X=y op z </li></ul></ul>
  8. 9. <ul><li>Figure 8.1 Sample TINY program: </li></ul><ul><li>{ sample program </li></ul><ul><li>in TINY language -- computes factorial </li></ul><ul><li>} </li></ul><ul><li>read x ; { input an integer } </li></ul><ul><li>if 0 > x then { don’t compute if x <= 0 } </li></ul><ul><li>fact:=1; </li></ul><ul><li>repeat </li></ul><ul><li>fact:=fact*x; </li></ul><ul><li>x:=x-1 </li></ul><ul><li>until x=0; </li></ul><ul><li>write fact { output factorial of x } </li></ul><ul><li>ends </li></ul>
  9. 10. <ul><li>The Three-address codes for above TINY program </li></ul><ul><li>read x t1=x>0 </li></ul><ul><li>if_false t1 goto L1 </li></ul><ul><li>fact=1 </li></ul><ul><li>label L2 </li></ul><ul><li>t2=fact*x </li></ul><ul><li>fact=t2 </li></ul><ul><li>t3=x-1 </li></ul><ul><li>x=t3 </li></ul><ul><li>t4= x= =0 </li></ul><ul><li>if_false t4 goto L2 </li></ul><ul><li>write fact </li></ul><ul><li>label L1 </li></ul><ul><li>halt </li></ul>
  10. 11. 8.1.2 Data Structures for the Implementation of Three-Address Code
  11. 12. <ul><li>The most common implementation is to implement three-address code as quadruple , which means that four fields are necessary: </li></ul><ul><ul><li>One for the operation and three for the addresses </li></ul></ul><ul><li>A different implementation of three-address code is called a triple: </li></ul><ul><ul><li>Use the instructions themselves to represent the temporaries. </li></ul></ul><ul><li>It requires that each three-address instruction be reference-able, either as an index in an array or as a pointer in a linked list. </li></ul>
  12. 13. <ul><li>Quadruple implementation for the three-address code of the previous example </li></ul><ul><ul><li>(rd, x , _ , _ ) </li></ul></ul><ul><ul><li>(gt, x, 0, t1 ) </li></ul></ul><ul><ul><li>(if_f, t1, L1, _ ) </li></ul></ul><ul><ul><li>(asn, 1,fact, _ ) </li></ul></ul><ul><ul><li>(lab, L2, _ , _ ) </li></ul></ul><ul><ul><li>(mul, fact, x, t2 ) </li></ul></ul><ul><ul><li>(asn, t2, fact, _ ) </li></ul></ul><ul><ul><li>(sub, x, 1, t3 ) </li></ul></ul><ul><ul><li>(asn, t3, x, _ ) </li></ul></ul><ul><ul><li>(eq, x, 0, t4 ) </li></ul></ul><ul><ul><li>(if_f, t4, L2, _) </li></ul></ul><ul><ul><li>(wri, fact, _, _ ) </li></ul></ul><ul><ul><li>(lab, L1, _ , _ ) </li></ul></ul><ul><ul><li>(halt, _, _, _ ) </li></ul></ul>
  13. 14. <ul><li>C code defining data structures for the quadruples </li></ul><ul><li>Typedef enum { rd, gt, if_f, asn, lab, mul, sub, eq, wri, halt, …} </li></ul><ul><li>OpKind; </li></ul><ul><li>Typedef enum { Empty, IntConst, String } AddrKind; </li></ul><ul><li>Typedef struct </li></ul><ul><li>{ AddrKind kind; </li></ul><ul><li> Union </li></ul><ul><li>{ int val; </li></ul><ul><li> char * name; </li></ul><ul><li>} contents; </li></ul><ul><li>} Address </li></ul><ul><li>Typedef struct </li></ul><ul><li>{ OpKind op; </li></ul><ul><li> Address addr1, addr2, addr3; </li></ul><ul><li>} Quad </li></ul>
  14. 15. <ul><li>A representation of the three-address code of the previous example as triples </li></ul><ul><ul><li>(0) (rd, x , _ ) </li></ul></ul><ul><ul><li>(1) (gt, x, 0) </li></ul></ul><ul><ul><li>(2) (if_f, (1), (11) ) </li></ul></ul><ul><ul><li>(3) (asn, 1,fact ) </li></ul></ul><ul><ul><li>(4) (mul, fact, x) </li></ul></ul><ul><ul><li>(5) (asn, (4), fact ) </li></ul></ul><ul><ul><li>(6) (sub, x, 1 ) </li></ul></ul><ul><ul><li>(7) (asn, (6), x) </li></ul></ul><ul><ul><li>(8) (eq, x, 0 ) </li></ul></ul><ul><ul><li>(9) (if_f, (8), (4)) </li></ul></ul><ul><ul><li>(10) (wri, fact, _ ) </li></ul></ul><ul><ul><li>(11) (halt, _, _) </li></ul></ul>
  15. 16. 8.1.3 P-Code
  16. 17. <ul><li>It was designed to be the actual code for a hypothetical stack machine , called the P-machine , for which an interpreter was written on various actual machines </li></ul><ul><li>Since P-code was designed to be directly executable, it contains an implicit description of a particular runtime environment , including data sizes, as well as a great deal of information specific to the P-machines. </li></ul>
  17. 18. <ul><li>The P-machine consists of a code memory, an unspecified data memory for named variables , and a stack for temporary data , together with whatever registers are needed to maintain the stack and support execution </li></ul><ul><ul><li>2*a+(b-3) </li></ul></ul><ul><ul><li>ldc 2 ; load constant 2 </li></ul></ul><ul><ul><li>lod a ; load value of variable a </li></ul></ul><ul><ul><li>mpi ; integer multiplication </li></ul></ul><ul><ul><li>lod b ; load value of variable b </li></ul></ul><ul><ul><li>ldc 3 ; load constant 3 </li></ul></ul><ul><ul><li>sbi ; integer subtraction </li></ul></ul><ul><ul><li>adi ; integer addition </li></ul></ul>
  18. 19. <ul><li>Comparison of P-Code to Three-Address Code P-code is in many respects closer to actual machine code than three-address code. </li></ul><ul><li>P-code instructions also require fewer addresses : </li></ul><ul><li>One the other hand, P-code is less compact than three-address code in terms of numbers of instructions, and P-code is not “ self-contained ” in that the instructions operate implicitly on a stack. </li></ul><ul><li>2) Implementation of P-Code Historically, P-code has largely been generated as a text file, but the previous descriptions of internal data structure implementations for three-address code will also work with appropriate modification for P-code. </li></ul>
  19. 20. 8.2 Basic Code Generation Techniques
  20. 21. 8.2.1 Intermediate Code or Target Code as a Synthesized Attribute
  21. 22. <ul><li>Intermediate code generation can be viewed as an attribute computation. </li></ul><ul><li>This code becomes a synthesized attribute that can be defined using an attribute grammar and generated either directly during parsing or by a post-order traversal of the syntax tree. </li></ul><ul><li>For an example: a small subset of C expressions: </li></ul><ul><ul><li>Exp  id = exp | aexp </li></ul></ul><ul><ul><li>Aexp  aexp + factor | factor </li></ul></ul><ul><ul><li>Factor  (exp) | num | id </li></ul></ul>
  22. 23. 1) P-Code The expression (x=x+3)+4 has the following P-Code attribute: Lda x Lod x Ldc 3 Adi Stn Ldc 4 Adi Factor  id factor.pcode=”lod”||id.strval Factor  num factor.pcode=”ldc”||num.strval Factor  (exp) factor.pcode=exp.pcode Aexp  factor aexp.pcode=factor.pcode Aexp1  aexp2 + factor aexp1.pcode=aexp2.pcode++factor.pcode++”adi” Exp  aexp exp.pcode=aexp.pcode Exp1  id = exp2 exp1.pcode=”lda”|| id.strval++exp2.pcode++”stn” Grammar Rule Semantic Rules
  23. 24. T1=x+3 x=t1 t2=t1+4 2) Three-Address Code Factor  id factor.namenum.strval; factor.tacode=” “ Factor  num factor.namenum.strval; factor.tacode=” “ Factor  (exp) factor.name=exp.name; factor.tacode=exp.tacode Aexp  factor aexp.name=factor.name; aexp.tacode=factor.tacode Aexp1  aexp2 + factor aexp1.name=newtemp( ) aexp1.tacode=aexp2.tacode++factor.tacode++ aexp1.name||”=”||aexp2.name||”+”||factor.name Exp  aexp exp.name=aexp.name; exp.tacode=aexp.tacode Exp1  id = exp2 exp1.name=exp2.name Exp1.tacode=exp2.tacode++ id.strval||”=”||exp2.name Grammar Rule Semantic Rules
  24. 25. 8.2.2 Practical Code Generation
  25. 26. <ul><li>The basic algorithm can be described as the following recursive procedure </li></ul><ul><li>Procedure gencode (T: treenode); </li></ul><ul><li>Begin </li></ul><ul><li>If T is not nil then </li></ul><ul><li>Generate code to prepare for code of left child of T; </li></ul><ul><li>Gencode(left child of T); </li></ul><ul><li>Generate code to prepare for code of right child of T; </li></ul><ul><li>Gencode(right child of T); </li></ul><ul><li>Generate code to implement the action of T; </li></ul><ul><li>End; </li></ul>
  26. 27. <ul><li>Typedef enum {plus, assign} optype; </li></ul><ul><li>Typedef enum {OpKind, ConstKind, IdKind} NodeKind; </li></ul><ul><li>Typedef struct streenode </li></ul><ul><li>{ NodeKind kind; </li></ul><ul><li> Optype op; /* used with OpKind */ </li></ul><ul><li> Struct streenod *lchild, *rchild; </li></ul><ul><li> Int val; /* used with ConstKind */ </li></ul><ul><li> Char * strval; </li></ul><ul><li> /* used for identifiers and numbers */ </li></ul><ul><li>} streenode; </li></ul><ul><li>typedef streenode *syntaxtree; </li></ul>
  27. 29. <ul><li>A genCode procedure to generate P-code. </li></ul><ul><li>Void genCode (SyntaxTree t) </li></ul><ul><li>{ char codestr[CODESIZE]; </li></ul><ul><li>/* CODESIZE = max length of 1 line of P-code */ </li></ul><ul><li>if (t !=NULL) </li></ul><ul><li>{ switch (t->kind) </li></ul><ul><li>{ case opKind: </li></ul><ul><li>switch(t->op) </li></ul><ul><li>{ case Plus: </li></ul><ul><li>genCode(t->lchild); </li></ul><ul><li>genCode(t->rchild); </li></ul><ul><li>emitCode(“adi”); </li></ul><ul><li>break; </li></ul>
  28. 30. <ul><li> case Assign: </li></ul><ul><li> sprintf(codestr, “%s %s”, “lda”, t->strval); </li></ul><ul><li>emitcode(codestr); </li></ul><ul><li>genCode(t->lchild); </li></ul><ul><li>emitCode(“stn”); </li></ul><ul><li>break; </li></ul><ul><li>} </li></ul><ul><li>break; </li></ul><ul><li> case ConstKind: </li></ul><ul><li>sprintf(codestr,”%s %s”,”ldc”,t->strval); </li></ul><ul><li>emitCode(codestr); </li></ul><ul><li>break; </li></ul>
  29. 31. <ul><li> case IdKind: </li></ul><ul><li> sprintf(codestr,”%s %s”,”lod”,t->strval); </li></ul><ul><li>emitCode(codestr); </li></ul><ul><li>break; </li></ul><ul><li> default: </li></ul><ul><li>emitCode(“error”); </li></ul><ul><li>break; </li></ul><ul><li>} </li></ul><ul><li>} </li></ul><ul><li>} </li></ul>
  30. 32. <ul><li>Yacc specification for the generation P-code according to the attribute grammar of Table 8.1 </li></ul><ul><li>%{ </li></ul><ul><li>#define YYSTYPE char * </li></ul><ul><li>/* make Yacc use strings as values */ </li></ul><ul><li>/* other inclusion code … */ </li></ul><ul><li>%} </li></ul><ul><li>%token NUM ID </li></ul><ul><li>%% </li></ul><ul><li>exp : ID </li></ul><ul><li>{ sprintf (codestr, “%s %s”, “lda”, $1); </li></ul><ul><li> emitCode ( codestr); } </li></ul><ul><li>‘ = ‘ exp </li></ul><ul><li>{ emitCode (“stn”); } </li></ul>
  31. 33. <ul><li>| aexp </li></ul><ul><li>; </li></ul><ul><li>aexp : aexp ‘+’ factor {emitCode(“adi”);} </li></ul><ul><li>| factor </li></ul><ul><li>; </li></ul><ul><li>factor : ‘(‘ exp ‘)’ </li></ul><ul><li>| NUM {sprintf(codestr, “%s %s”,”ldc”,$1); </li></ul><ul><li>emitCode(codestr);} </li></ul><ul><li>| ID {sprintf(codestr,”%s %s”, “lod”,$1); </li></ul><ul><li>emitCode(codestr);} </li></ul><ul><li>; </li></ul><ul><li>%% </li></ul><ul><li>/*utility functions… */ </li></ul>
  32. 34. 8.2.3 Generation of Target Code from Intermediate Code
  33. 35. <ul><li>Code generation from intermediate code involves either or both of two standard techniques: </li></ul><ul><ul><li>Macro expansion and Static simulation </li></ul></ul><ul><li>Macro expansion involves replacing each kind of intermediate code instruction with an equivalent sequence of target code instructions. </li></ul><ul><li>Static simulation involves a straight-line simulation of the effects of the intermediate code and generating target code to match these effects. </li></ul>
  34. 36. <ul><li>Consider the expression (x=x+3) +4, translate the P-code into three-address code: </li></ul><ul><li>Lad x </li></ul><ul><li>Lod x </li></ul><ul><li>Ldc 3 </li></ul><ul><li>Adi t1=x+3 </li></ul><ul><li>Stn x=t1 </li></ul><ul><li>Ldc 4 </li></ul><ul><li>Adi t2=t1+4 </li></ul><ul><li>We perform a static simulation of the P-machine stack to find three-address equivalence for the given code </li></ul>
  35. 39. <ul><li>We now consider the case of translating from three-address code to P-code, by simple macro expansion . A three-address instruction: </li></ul><ul><li>a = b + c </li></ul><ul><li>Can always be translated into the P-code sequence </li></ul><ul><li>lda a </li></ul><ul><li>lod b </li></ul><ul><li>lod c </li></ul><ul><li>adi </li></ul><ul><li>sto </li></ul>
  36. 40. <ul><li>Then, the three-address code for the expression (x=x+3)+4: </li></ul><ul><li>T1 = x + 3 </li></ul><ul><li>X = t1 </li></ul><ul><li>T2 = t1 + 4 </li></ul><ul><li>Can be translated into the following P-code: </li></ul><ul><li>Lda t1 </li></ul><ul><li>Lod x </li></ul><ul><li>Ldc 3 </li></ul><ul><li>Adi </li></ul><ul><li>Sto </li></ul><ul><li>Lad x </li></ul><ul><li>Lod t1 </li></ul><ul><li>Sto </li></ul><ul><li>Lda t2 </li></ul><ul><li>Lod t1 </li></ul><ul><li>Ldc 4 </li></ul><ul><li>Adi </li></ul><ul><li>Sto </li></ul>
  37. 42. End of Part One THANKS
  1. Gostou de algum slide específico?

    Recortar slides é uma maneira fácil de colecionar informações para acessar mais tarde.

×