P code

SYSTEM SOFTWARE PCODE

R. V. COLLEGE OF ENGINEERING
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
(Autonomous Institution Affiliated to VTU)
Bangalore – 560 059

Assignment Report on P-Code Compiler

Bachelor of Engineering in Computer Science & Engineering

By Sandeep R.V, 1RV10CS089

(Academic Year: 2012-13)

1. INTRODUCTION

Interpretive compilers are translators for high-level languages. Such translators produce (as output) intermediate code (P-code, for example) that is intrinsically simple enough to satisfy the constraints imposed by a practical interpreter, even though it may still be quite a long way from the machine code of the system on which the original program is to be executed. Rather than continuing translation to the level of machine code, an alternative approach that may perform acceptably well is to use the intermediate code as part of the input to a specially written interpreter. This in turn "executes" the original algorithm by simulating a virtual machine for which the intermediate code effectively is the machine code.

[Fig. 1.1, with regions labelled Machine Independent and Machine Dependent]

2. P-CODE

P-Code (Portable Code) is an assembly language for a hypothetical stack machine. One kind of P-code is bytecode. Bytecodes are compact numeric codes, constants, and references (normally numeric addresses) that encode the result of parsing and semantic analysis of things like the type, scope, and nesting depth of program objects. They therefore allow much better performance than direct interpretation of source code. The name bytecode stems from instruction sets that have one-byte opcodes followed by optional parameters. Example: Java bytecode.

3. P-CODE COMPILER

• P-code compilers (also called bytecode compilers) are very similar in concept to interpreters.
• The source program is analyzed and converted into an intermediate form, which is then executed interpretively.
• With a P-code compiler, this intermediate form is the machine language for a hypothetical machine, often called a pseudo-machine.
• P-code object programs can be executed on any machine that has a P-code interpreter.
• The P-code object program is often much smaller than a corresponding machine code (native code) program would be.

[Fig 3.1 Translation and Execution Using a P-code Compiler]

4. P-CODE MACHINE

A portable code machine (P-code machine) is a virtual machine designed to execute p-code. The term is applied both generically to all such machines (such as the Java Virtual Machine and MATLAB precompiled code) and to specific implementations, the most famous being the p-Machine of the Pascal-P system, particularly the UCSD Pascal implementation.

4.1. Machine architecture

The P-code machine is similar to a conventional computer in that it consists of a processor and a memory. A major difference is that many operations performed by the processor involve the stack, which is part of memory. For example, a procedure call entails manipulating items such as parameters and return addresses, and these are held on the stack. The machine instructions, called P-code, are stored in the memory and accessed in the normal manner.

The processor has a defined instruction set as well as five registers, which have distinct functions for controlling the instructions and the stack areas within the memory.
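
The processor-plus-memory organisation described above can be sketched as a small fetch-and-execute loop. This is only an illustration under stated assumptions: the tuple encoding of instructions, the `run` function, and keeping the code in a separate Python list (rather than in the same memory as the data) are choices made here for brevity. The mnemonics LDC, ADI, UJP and STP follow the usual P-code naming, but only four of them are modelled.

```python
# Hypothetical miniature stack machine. Instructions are (opcode, argument)
# tuples held in a separate list; the data store holds the stack.

def run(code, store_size=16):
    store = [0] * store_size   # data store; the stack lives inside it
    pc = 0                     # program counter: index of the next instruction
    sp = -1                    # stack pointer: index of the topmost used cell
    while True:
        op, arg = code[pc]     # fetch
        pc += 1
        if op == "LDC":        # load constant: push arg
            sp += 1
            store[sp] = arg
        elif op == "ADI":      # add integers: replace the top two cells by their sum
            store[sp - 1] = store[sp - 1] + store[sp]
            sp -= 1
        elif op == "UJP":      # unconditional jump to instruction index arg
            pc = arg
        elif op == "STP":      # stop and report the value on top of the stack
            return store[sp]
        else:
            raise ValueError(f"unknown opcode {op!r}")

program = [("LDC", 2), ("LDC", 3), ("ADI", None), ("STP", None)]
result = run(program)          # 2 + 3 evaluated on the stack
```

Every operand passes through the stack, so the instructions themselves need no register names, which is one reason P-code stays compact.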

The registers are:

• PC, the program counter;
• SP, the stack pointer;
• MP, the mark stack pointer;
• NP, the new pointer;
• EP, the extreme stack pointer.

The program counter, PC, is a pointer to the current instruction being executed. The other registers are used to control the remaining area of memory, the data store. The data store is divided into three areas, as shown in Fig. 4.1.1.

[Fig 4.1.1]

The constants area is generated by the assembler and accessed by the code.

The heap grows towards the low-numbered locations of store, and the NP pointer points at free heap space. Data is put onto the heap during a call to the new procedure and removed by using mark and release.

The stack area has a further internal structure and is used to hold stack frames. A stack frame is generated and placed on the stack each time a procedure or function is called (in the following section, except where explicitly stated, references to procedures also include functions). The first stack frame is the exception in that it belongs to the program block. The stack frames on the stack are shown in Fig. 4.1.2.

[Fig 4.1.2]

4.2. The Structure of Stack Frames

The stack frame format with its associated registers is shown in Fig 4.2.1.

[Fig 4.2.1]

The mark stack carries the administrative details for executing and returning from a procedure and for maintaining the necessary links to access variables in outer levels:

• The function value is simply an area used for returning the function value to the calling block and so is only used by functions; in practice this area must be capable of holding the largest possible value that may be returned by a function;
• The static link is used as a pointer to access variables in outer blocks; it points to the base of the stack frame of the procedure that textually encloses it;
• The dynamic link is the previous MP value, so that the current stack frame may be removed when the execution of the called procedure is complete;
• The previous EP value is similarly used for resetting the EP register when this procedure returns;
• The return address enables program control to be returned to the calling procedure.

The following example program contains a nested recursive procedure, and the schematic diagram shows the static and dynamic links on the stack at the time the program halts. The static link of a procedure will always point to the stack frame of the textually enclosing procedure (q will always point to p, p will always point to the program); the dynamic link will always point to the stack frame of the calling procedure (in this example, q will point sometimes to p and sometimes to q):

    program main(input, output);
    var j: integer;

      procedure p;
      var i: integer;

        procedure q;
        begin
          if i > 0 then
          begin
            i := i - 1;
            j := j + 1;
            q   (* procedure q is called recursively here *)
          end
          else
            write(input, 0)   (* execution error to halt program *)
        end;

      begin
        i := 2;
        q
      end;

    begin
      j := 0;
      p
    end.
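
The difference between the two kinds of link in the example program can be traced with a short simulation. This is a sketch, not the machine's actual mechanism: the `Frame` class, the `ENCLOSING` table and the `call` helper are invented for illustration, frames are identified by their index in a Python list rather than by store addresses, and the static link is found by searching for the most recent frame of the enclosing block, which is sufficient for this particular call chain.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Frame:
    name: str
    static: Optional[int]    # index of the textually enclosing block's frame
    dynamic: Optional[int]   # index of the caller's frame

ENCLOSING = {"p": "main", "q": "p"}   # textual nesting of the example program

def call(stack, name):
    """Push a frame for `name`, called from the frame currently on top."""
    caller = len(stack) - 1
    # Static link: most recent frame of the textually enclosing block.
    static = max(i for i, f in enumerate(stack) if f.name == ENCLOSING[name])
    stack.append(Frame(name, static, caller))

stack = [Frame("main", None, None)]
call(stack, "p")            # main calls p
for _ in range(3):          # p calls q, then q calls itself twice (i = 2, 1, 0)
    call(stack, "q")

links = [(f.name, f.static, f.dynamic) for f in stack]
```

At the point where the program halts, all three frames of q carry the same static link (the frame of p), while their dynamic links form a chain back through the callers.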

[Fig 4.2.2]

If p were called recursively before calling q, the static links of q would then point to the most recent stack frame of p.

The extreme stack pointer, EP, points to the top of the current stack frame. It guards the stack frame against the heap, whose boundary is NP, so that SP does not have to be checked each time it is incremented. It also marks the point where a new stack frame can be placed should a procedure be called from the current one. The maximum possible size of each stack frame is known at compile time, and this is reflected by the value of EP.

The local stack is used for holding temporary values during the execution of a procedure. For example, the SBI instruction takes the two values from the top of this stack, subtracts them, and returns the result to the stack. The stack pointer is adjusted to reflect the fact that there is now one value fewer on the stack.
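
The effect of SBI on the local stack can be written out in a few lines. This sketch assumes that SP holds the index of the topmost used cell and that the operand order is "second from top minus top"; the `sbi` function and the sample store contents are illustrative only.

```python
def sbi(store, sp):
    """Subtract integers: second-from-top minus top; the result replaces both."""
    store[sp - 1] = store[sp - 1] - store[sp]
    return sp - 1            # one value fewer on the stack

store = [0, 0, 10, 4, 0]     # cells 2 and 3 hold the two operands
sp = 3                       # SP points at the topmost used cell
sp = sbi(store, sp)          # store[2] now holds 10 - 4
```

No bounds check is made here, in line with the text: the frame size was fixed at compile time, so SP cannot pass EP during normal execution.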

4.3. Instruction set

A typical P-Code assembly instruction looks like

    OPC [T] [P] [Q]

• OPC is a three-letter opcode mnemonic.
• T is a letter from {A, B, C, I, R, S, etc.} for type address, boolean, character, integer, real, set, etc.
• P is usually an integer specifying a static level for addressing.
• Q is an integer or label.

P-Code instructions contain four fields, although not all are always used. The OP field must be at least 7 bits long to contain the opcode. The T field must be at least 4 bits to contain the operand type. The P field is at least 4 bits; usually it contains a static nesting level for describing operand location. The Q field contains enough bits to address all of STORE or CODE. It usually contains an absolute address, a data segment displacement, a jump destination address, or a short constant.

4.3.1. Instructions for calling procedures and functions

There are three P-code instructions for building stack frames: MST and CUP are used before entering the procedure, and two ENT instructions are the first two instructions within the procedure. The detailed operation of these instructions is as follows.

A typical calling sequence is

    MST 0
    [code to load parameters]
    CUP 0 L 3

The operand of the MST (mark stack) instruction is an indication of the depth of nesting of the given procedure and is defined as one plus the level of the calling procedure minus the level of the called procedure. This is used for calculating the static link.

MST 0 means the static link is the stack frame that called you; MST 1 means the static link is the static link of the stack frame that called you; MST 2 would use the static link of that frame, and so on.

The execution of MST creates values for the static and dynamic links and saves the EP. As just explained, it assigns the value of the static link to point to the stack frame of the procedure that textually encloses this one. The dynamic link is assigned the current value of the mark stack pointer, and the current value of the extreme stack pointer is stored. The stack pointer is incremented so as to point at the parameter area. This is in readiness for subsequent instructions, which may load parameters.
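
The behaviour of MST described above can be sketched as follows. The cell offsets `SL`, `DL` and `PREV_EP`, the five-cell mark size, the `regs` dictionary and the convention that SP points at the topmost used cell are assumptions made for this illustration; the helper names `base` and `mst` are likewise hypothetical. Setting MP by hand between the calls stands in for the CUP instruction, which is not modelled.

```python
SL, DL, PREV_EP = 1, 2, 3      # assumed offsets of the link cells in a frame
MARK_SIZE = 5                  # function value, SL, DL, previous EP, return address

def base(store, mp, level_diff):
    """Follow `level_diff` static links, starting from the frame at `mp`."""
    frame = mp
    for _ in range(level_diff):
        frame = store[frame + SL]
    return frame

def mst(store, regs, level_diff):
    """Mark the stack: fill in the links of a new frame just above SP."""
    new = regs["SP"] + 1                          # first cell of the new frame
    store[new + SL] = base(store, regs["MP"], level_diff)
    store[new + DL] = regs["MP"]                  # the caller's frame
    store[new + PREV_EP] = regs["EP"]             # saved for the return
    regs["SP"] += MARK_SIZE                       # parameters are pushed next
    return new

store = [0] * 32
regs = {"MP": 0, "SP": 4, "EP": 8}     # frame of the main program at cell 0
p_frame = mst(store, regs, 0)          # main calls p: MST 0
regs["MP"] = p_frame                   # normally done by CUP
q_frame = mst(store, regs, 0)          # p calls q: MST 0
regs["MP"] = q_frame
q2_frame = mst(store, regs, 1)         # q calls itself: MST 1
```

With MST 1 the recursive call of q picks up the static link of its caller, so the new frame points at p for its static link and at the calling q for its dynamic link.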

The CUP (call user procedure) instruction sets the new value of MP and the link for the return address, and finally causes the jump to the procedure concerned.

At the start of the procedure there are two ENT instructions, which define the overall size of the stack frame and adjust SP and EP accordingly.

    ENT 1 L 7
    ENT 2 L 8

The labels L 7 and L 8 point to the values to be used by the ENT instructions.

The procedure execution may now proceed. At the conclusion of the procedure, the RET (return) instruction provides the mechanism for returning to the calling procedure and removing stack frames.

The main program block is handled in the same way as procedures as regards the stack, except that there are four locations reserved above the mark stack for files. These locations have fixed addresses, as shown in Fig. 4.3.1.1, and are manipulated in the same way as global variables. The files may be regarded as parameters to the program.

[Fig 4.3.1.1]

The P-Code Instruction Set

[Fig 4.3.1.2]

Key to effect on stack:

    a  address
    b  boolean
    c  character
    i  integer
    r  real
    s  set
    x  any of the above types

The c in instruction names is a single character denoting one of the primitive types A, B, C, I, R, S, and matches the x in the effect column.
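
The four-field instruction format from section 4.3 (OP of at least 7 bits, T and P of at least 4 bits each, and Q wide enough for an address) can be illustrated by packing the fields into a single integer. The field order, the 17-bit width chosen for Q, the numeric codes assigned to the type letters and the opcode number used in the example are all assumptions made here; the source does not specify a concrete layout.

```python
TYPES = "abcirs"                 # type letters from the key; numeric codes are assumed
OP_BITS, T_BITS, P_BITS, Q_BITS = 7, 4, 4, 17   # Q width is an arbitrary choice

def pack(op, t, p, q):
    """Pack the four fields into one integer word, with OP in the high bits."""
    for value, bits in ((op, OP_BITS), (t, T_BITS), (p, P_BITS), (q, Q_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field out of range")
    word = op
    word = (word << T_BITS) | t
    word = (word << P_BITS) | p
    word = (word << Q_BITS) | q
    return word

def unpack(word):
    """Recover (op, t, p, q) from a word produced by pack."""
    q = word & ((1 << Q_BITS) - 1)
    word >>= Q_BITS
    p = word & ((1 << P_BITS) - 1)
    word >>= P_BITS
    t = word & ((1 << T_BITS) - 1)
    op = word >> T_BITS
    return op, t, p, q

# Hypothetical opcode number 12, integer type, static level 1, displacement 5.
word = pack(12, TYPES.index("i"), 1, 5)
fields = unpack(word)
```

With these widths an instruction fits in 32 bits; a machine with a larger store would simply widen Q.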

5. UCSD p-Machine

5.1. Architecture

Like many other p-code machines, the UCSD p-Machine is a stack machine, which means that most instructions take their operands from the stack and place results back on the stack. Thus, the add instruction replaces the two topmost elements of the stack with their sum. A few instructions take an immediate argument. Like Pascal, the p-code is strongly typed, supporting boolean (b), character (c), integer (i), real (r), set (s), and pointer (a) types natively.

Some simple instructions:

[Fig 5.1.1]

5.2. Environment

Unlike other stack-based environments (such as the Java Virtual Machine), the p-System has only one stack, shared by procedure stack frames (providing return address, etc.) and the arguments to local instructions. Three of the machine's registers point into the stack (which grows upwards):

• SP points to the top of the stack (the stack pointer).
• MP marks the beginning of the active stack frame (the frame pointer).
• EP points to the highest stack location used in the current procedure.

Also present is a constant area and, below that, the heap growing down towards the stack. The NP register points to the top (lowest used address) of the heap. When EP gets greater than NP, the machine's memory is exhausted.

The fifth register, PC, points at the current instruction in the code area.

5.3. Calling conventions

Stack frames look like this:

[Fig 5.3.1]

The procedure calling sequence works as follows. The call is introduced with

    mst n

where n specifies the difference in nesting levels. This instruction marks the stack, i.e. reserves the first five cells of the above stack frame, and initialises the previous EP, the dynamic link, and the static link. The caller then computes and pushes any parameters for the procedure, and then issues

    cup n, p

to call a user procedure (n being the number of parameters, p the procedure's address). This saves the PC in the return address cell and sets the procedure's address as the new PC.

User procedures begin with the two instructions

    ent 1, i
    ent 2, j

The first sets SP to MP + i, the second sets EP to SP + j. So i essentially specifies the space reserved for locals (plus the number of parameters plus 5), and j gives the number of entries needed locally for the stack. Memory exhaustion is checked at this point.

Returning to the caller is accomplished via

    retC

with C giving the return type (i, r, c, b, a as above, and p for no return value). The return value has to be stored in the appropriate cell beforehand. On all types except p, returning will leave this value on the stack.

Instead of calling a user procedure (cup), a standard procedure q can be called with

    csp q

These standard procedures are Pascal procedures like readln() (csp rln), sin() (csp sin), etc.
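
The steps after mst (cup, the two ent instructions, and a return without a value) can be sketched as methods on a small machine object. This is an illustration under assumptions: the `Machine` class, the cell offsets, the initial register values, and the convention that SP points at the topmost used cell are not taken from the source, and the marked frame is written into the store by hand to stand in for a preceding mst.

```python
DL, PREV_EP, RET_ADDR = 2, 3, 4    # assumed cell offsets within the 5-cell mark

class Machine:
    def __init__(self, size=64):
        self.store = [0] * size
        self.pc = 0
        self.mp = 0             # frame of the main program at cell 0
        self.sp = 4
        self.ep = 8
        self.np = size          # empty heap: NP sits just above the store

    def cup(self, n_params, address):
        self.mp = self.sp - (n_params + 4)        # base of the marked frame
        self.store[self.mp + RET_ADDR] = self.pc  # save the return address
        self.pc = address

    def ent(self, which, amount):
        if which == 1:
            self.sp = self.mp + amount            # ent 1, i
        else:
            self.ep = self.sp + amount            # ent 2, j
            if self.ep > self.np:                 # memory exhaustion check
                raise MemoryError("stack would run into the heap")

    def retp(self):
        """Return from a procedure; no value is left on the stack."""
        self.sp = self.mp - 1
        self.pc = self.store[self.mp + RET_ADDR]
        self.ep = self.store[self.mp + PREV_EP]
        self.mp = self.store[self.mp + DL]

m = Machine()
# What a preceding mst would have left behind: a marked frame at cell 5.
m.store[5 + DL] = m.mp
m.store[5 + PREV_EP] = m.ep
m.sp += 5
m.sp += 1; m.store[m.sp] = 42   # push one parameter
m.pc = 100                      # address the caller resumes at
m.cup(1, 200)                   # cup 1, 200
m.ent(1, 8)                     # 5 mark cells + 1 parameter + 2 locals
m.ent(2, 4)                     # up to 4 temporaries on the local stack
inside = (m.pc, m.mp, m.sp, m.ep)
m.retp()
after = (m.pc, m.mp, m.sp, m.ep)
```

After `retp` the registers are back at the values they had before the call sequence began, which is the point of saving the previous EP and the dynamic link in the frame.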

Advantages of P-code compiler:

• Portability of software: it is not necessary for the compiler to generate different code for different computers.
• The source version of the compiler can itself be compiled into p-code; this p-code can then be interpreted on another computer.
• It is easy to provide a compiler for each different machine, since only a P-code interpreter has to be written for it.
• The p-code object program is much smaller than a corresponding machine code program.

Disadvantages:

• The interpretive execution of a P-code program may be much slower than the execution of the equivalent machine code.

REFERENCES:

• http://www.decompiler-vb.net/blog/post/The-truth-about-P-Code.aspx
• Pascal Implementation, by Steven Pemberton and Martin Daniels, Chapter 10. http://homepages.cwi.nl/~steven/pascal/book/10pcode.html
• http://pascal-central.com/pcode.html#What%20Is%20P-Code
• http://www.itechtalk.com/thread135.html
• A Pascal P-Code Interpreter for the Stanford Emmy, by Donald Alpert, Technical Note No. 164.
