Compiler Design QA

12,431 views
12,306 views

Published on

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
12,431
On SlideShare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
360
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Compiler Design QA

  1. 1. 7)Back patching: *.Back patching is a technique to solve the problem of replacing symbolic names into goto statements by the actual target addresses. *.This problem comes up because if some languages do not allow symbolic names in the braches. *.Idea: Maintain a list of branches that have the same target labels and replace the once they are defined. The easiest way to implement the Boolean expression is using two passes. (i)Construct a syntax tree for the input. (II)Walk the tree in depth –first order, Computing the translating given in the definition. The main problem with generating code for Boolean expressions and flow of control statements in a single pass, is that during one single pass we may not know the labels that control must go to at the time the jump statements are generated *.Back patching can be used to generate code for Boolean expressions and flow-of- control statements in one pass. *.During one pass, the following steps are expected. 1. Constrnct the syntax tree. 2. Depth-first order tree walk computing translations. 3. Generate jumps with unspecified targets(labels). 4. Keep a list of such statements. 5. Subsequently fill in the labels(back patching) To manipulate list of labels, we use following three functions: 1.makelist(i): Creates a new list containing only i,an index into the array of quadraples ; Make list returns a pointer to the list it has made. 2.merge(p1,p2):Concatenates the lists pointed to by p1,&p2 and returns a pointer to the concatenated list. 3.backpatch(p,i):Inserts i as the target labels for each of the statements on the list pointed to by P. Boolean expressions: The following grammar is used: (some syntax) *.Insert a mark non terminal M into the grammar to pick up index of next quadruple *.Attributes true list and false list are used to generate jump code for Boolean expressions. *.Incomplete jumps are placed on list pointed to by E.truelist and E falselist. *.Consider (symbole)and ME2 -if E1 is false then E is also false so statements in E1. falselist become part of E.falselist. - if E1 is true then E2 must be tested so target if E1.truelist is beginning of E2. -traget is obtained by marker M. -attribute M-quad records the number of the first list stmnt of E2.code -true list , false list are synthesized attributes. The variable next quad holds the index of the next quadrnple to follow. Translation sctreme is as follows. 1.(symbol):{backpatching(E1.falselist,M.quad); E.truelist:=merge(E1.truelist,E2.truelist);E.falselist:=E2.falselist}
  2. 2. 2.(symbol):{back patch(E1.truelist,M.quad); E.truelist:=, E2.truelist; E.falselist:= merge (E1.truelist, E2.truelist); 3.(symbol):{ E.truelist:= E1.falselist; E.falselist:= E1.truelist 4.(symbol):{ E.truelist:= E1.truelist; E.falselist= E1.falselist} 5.symbol);{E.truelist:=makelist(nextquad); E.falselist:=makelist (nextquad +1); Emit (‘if’id, place relop.op id2.place’goto’) Emit(‘goto-‘)} 6.(symbol):{E.truelist:=makelist(next quad); Emit (‘goto-‘)} 7.(symbol):{E.falselist:=makelist(next quad); Emit(‘goto-‘) 8(symbol):{M.quad:=next quad} 8) Explain the various data structures used during compilation (or)Explain the various data structures used for implementing the symbol table. The various data structures used for implementing the symbol on table are 1.Linear list,2.Binary tree ,3.Hash table 1. Linear list: *.This method involves arranging the symbols sequentially in memory by using a simple linear linked list *The new names are added to the table in order, of their arrivel. * Whenever the new name is to be added, first the table searched linearly or sequentially to check whether the name is already present or not Name1 Info1 Ram2 Info2 *if not then the record fro new name is created and added to the list at a position given by the available pointer. *To access a particular symbol, the table is searching sequentially from its beginning until it is found. *It will take on the average N/2 comparisons to find a particular symbol in a table containing N entries. *The insertion method is very fast,but the referencing is extremely slow. *In the case of few references,this method would be efficient.
  3. 3. *But in the case of many, some other method should be used. Eg:For 1000 entries,it would take 500 comparisons (average) to laeate a specified element. Advantages: 1.It takes less space. 2.Insertions to the table is simple. 2.Binary tree: A binary tree is constructed with typical nodes of the form Lptr Symbols Info Rptr Where Lptr represents left pointer Rptr represents right pointer Symbols represents identifier(or) variable name Info represents information about the identifier. The entries in the table are stored in alphabetical or numerically increasing order Name1 info Name2 info (one more table) *Whenever to review a particular symbol, the middle entry is located & its value is examined. *If its value is too high then the middle entry of the first half the table tried until the entry of the first half the table tried until the item is found. *If the value is too low then the second half the table 9middle entry) is tried and the procedure is repeated until the item is found *An average of log2 N comparisons is required in order to locate an entry. *for higher n this method has advantages over the linear list organization. (Whenever a name is to be added first the name is searched in the tree.If it dose not exit then a record for the name is created and added at the proper position in the search tree.) 3Hash table: The basic hashing scheme has two parts; a)A hash table consisting of a fixed array of m pointer to table entries. b)Table entries organized into m separated linked lists, called bucket. each record in the symbol table appears on exactly one of these list(diagram)
  4. 4. *To enter a name into symbol table ,we find out the hash value of the name by applying suitable hash function, which maps the name into an integer between 0 to m-1 *by using this value as a index of a hash table ,We search the list of symbol table records built on that hash index. *If the name is not present in the list, we create a record for a name and insert it at the head of the list built on that hash index. *The retrieval of the names information, first the hash value of the name is obtained and the list is searched for getting the information about the name. 9.Specification of tokens: *Regular expressions are an important notation for specifying patterns . *Each pattern matches a set of strings so regular expressions will save as names for sets of strings. Strings: *An alphabet or a character class is a finite set of symbols eg. For symbol are letters and characters. *The set{0,1}is the binary alphabet. *eg.for computer alphabets; ASCII, UNICOD, EBCDI, LATIN-1. *A string over an alphabet is a finite sequence of symbols from that alphabet. Strings are often called words or sentences. For eg:banana is a sequence of six symbols.(i.e string of length six) taken from ASCII computer alphabet. *The empty string denoted by E,is a special string with zero symbols (i.e,string length is 0) *|s| is the length of the string s. *If x and y are two strings, then the concatenation of x & y written as xy , is the string formed by appearing y to x. For eg: If x= computer ,y=science then xy= computer science. *the empty sting is the identify element Under concatenation i.e,SE=SE=S., *String exponentiation concatenates a string with itself a given number of times S2=SS or S.S S3=SSS or S.S.S By definition S0 is an empty string E and s1=s for eg: if x=ba, y=na, then xy2= banana.
  5. 5. More string terminology Term Definition Prefix of S A string obtain by removing zero or more trailing symbols of string s.eg:ban is a prefix of banana. Suffix of S A string formed by deleting zero or more of the leading symbols of S.eg ,nana is a suffix of banana. Sub string of S A string formed by deleting a prefix & a suffix from S. eg nan is a substring of banana. Proper prefix ,suffix or substring of S A string do not include E and S. subsequent of S Any string formed by deleting zero or more not necessarily contigons symbols from S. (eg) banana is a subsequence of banana. Languages: • A language is a set of strings over some fixed alphabet. It may contain finite or an infinite number of strings. • (phi) the empty set is a language. • {E} the set containing empty string is a language. • The set of all syntactically well from c-programs is a language • The set of all possible identifies is a language. (left one page) 10.What is three address code?what are its types ? how it is implemented? Three address code. It is a sequence of stmnts of the form x:y op Z Where x,y,z- names ,constantants or temps op--any operator. A string involves not more then three references ,it is called three address stmnt and a sequence of such stmnt is called three addres code. Eg for TAC: x+y*z is t1 :=y*z, t2:=x+t1 Sometimes a stmnt may also contain less than three refrences, Types of three address statements 1.Asignment stmnts of the form x:=y0pz, where op is a binary arithmetic or logical operation. 2.Assignment instrnctions of the form x:=opy, where op is a binary operation.
  6. 6. 3.Copy stmntrs of the form x:=y, where the value of y is assigned to x. 4. The unconditional jump goto L .The three address stmnts with lable L is the next to be executed 5.Conditional jumps such as if x relop y goto L 6.Param x and call P, n for procedure calls and return y,where y representing a returned value is optional.their typical use is as the sequence of three addres stmnts. Param x1 Param x2 . . . Param xn Call P,n Generated as part of a call of the procedure P(x1,x2,……xn) 7.Indexed assignments of the form x:=y[i] and x[i]:=y 8. Address & pointer assignments of the form X:=&y, x:=*y and *x:=y. Implementations of three address statements A three address stmnts is an abstract form of intermediate code .three types are 1.Quadraples 2.Triples 3.Indirect triples 1.quadraples:A quadruples is a record strnethre with four fields op arg1,arg2 and result .The op field contains an internal code for the operator Eg: x:=y op z is represented by, Op arg1 arg2 result (0) op y z x *stmnts with unary operators like x:=-y or x:=y do not use arg2. * operators like param use neither arg2 nor –result * Conditional and unconditional jumps put the target label in result.Eg a:=b*-c+b*-c OP arg1 arg2 result (0) 0 mints c t1 (1) * b t1 t2 (2)Lmints c t3 (3)* b t3 t4 (4)+ t2 t4 t5 (5) := t5 a 2. Triples: A triple is a record structure with three fields .op arg1 and barg2. Arg1,arg2, are either pointer to the symbol table (or) pointer into the triple structure. This method is used to avoid entering temporary names into the symbol table Temporary value is referred by the position of the stmnts that computes it. Eg: Triple representstion of a:=b*-c+b-c 1.What is an ambigons grammer? Give an eg: A grammer that produces more than one parse tree for some sentences is said to be (eg)E-E+E/E E|(E)|-E|id
  7. 7. Consider the sentence id +id *id Two parse tre for the above sentence are(dig) 2. Kernal items :It is a collection of all the items S’-.S and all the items Whose dots are not at the left most end of RAS of the rule. Non kernel items: It is the collection of all the items in which are at the left end of RHS of the rule. 3.What is a cross – compiler ? Give an example. There may be a compiler which run on one machine and produces the target code for another machine is called cross – compiler. Its from which platform independency is achived eg; for the first version of EQN compiler, the compilers written in c and the command are generated for TROF. (dig) The cross compiler for EQN can be obtained by running it on PDP –H through C Compiler which produces out put for POP-n (dig) 4.What are the methods of representing a syntax tree? (i)Array representation (ii) Linked list representation Op arg1 arg2 (0) luminus c (1)* b (0) (2) luminus c (3) * b (2) (4) + (1) (3) (5)= a (4) 3.Indirect triples: Listing pointer to triples rather than listing the triple themselves are called indirect triples Eg : Indirect triple representatioin of a:=b*=c+b*-c Statements (0) (10) (1) (11) (2) (12) (3) (13) (4) (14) (5) (15)
  8. 8. Op arg1 arg2 (10) luminus c (11)* b (10) (12) luminus c (13) * b (12) (14) + (11) (13) (15)= a (14) Advantage:Indirect triple can save some Space compared with quardaples if the same temporary value is used mor then once.

×