VIDYAVAHINI FIRST GRADE COLLEGE
SYSTEM PROGRAMMING
6th Sem BCA
Vidyavahini First Grade College
Near Puttanjaneya Temple, Kuvempunagar, Tumkur – 572103.
E-Mail: vvfgc.bca@gmail.com
Website: www.vidyavahini.org/bca
Contact No: 0816 – 2261130
2. System Programming Vidyavahini First Grade College
Department of BCA Page 2
Chapter-1
Introduction
Software
Software is a set of instructions or programs written to carry out certain tasks on digital
computers.
Types of software
1. System software
2. Application software
System software
System software consists of a variety of programs that support the operation of a
computer.
Ex: OS, compiler, loader, linker, assembler, macro processor.
System software acts as an intermediary between the users and the hardware. It creates a
virtual environment for the user that hides the actual computer architecture.
Virtual machine is a set of services and resources created by the system software and
seen by the user.
Application software
An application is a program, or group of programs, that is designed for the end user.
These special purpose programs are also known as packages.
Ex: database programs, word processors, Web browsers, spreadsheets, library
management systems.
Difference between system software and application software
   System software                                      Application software
1. It is used in the operation of the computer.         It is used to perform some user task.
2. It is machine dependent software.                    It is machine independent software.
3. The programmer should know the architecture          It is not necessary to know the architecture
   of the computer.                                     of the computer.
4. It is not meant to be run by the end user.           It can be run by the end user.
5. Ex: compiler, loader, OS, assembler, etc.            Ex: inventory, payroll preparation, banking system, etc.
The role of system software
Components of system software
The various components of system software are:
1. Assemblers
2. Loaders
3. Macros
4. Compilers or interpreters
5. Operating system
Assemblers
Assemblers are the programs that translate the assembly language program (source code)
into machine language program (object code)
Loader
A loader is system software that places programs into memory and prepares them for
execution.
Macros:
Macro processor is a program that substitutes and specializes macro definitions for macro
calls.
Compilers:
The programs that translate the high level language program (source code) into machine
language program (object code)
Operating system
It consists of programs that manage the allocation of resources and
services such as memory, processor, devices and information.
Machine structure
Memory
Memory is the device where information is stored and retrieved.
Information is stored in the form of 1’s and 0’s.
Each 1/0 is a separate binary digit called bit.
Bits are grouped into words, characters or bytes.
o Nibble 4bits
o Byte 8bits
o Half word 16bits
o Word 32bits
o Double word 64bits
Basic unit of memory is a byte.
The contents of memory are
o DATA- values to be operated on
o INSTRUCTIONS-operations to be performed
Instructions and data share the same memory or storage medium
Processor
Processor is a device that performs a sequence of operations specified by instructions in
memory.
There are two types of processors,
1. Central processing units
2. Input and output processors
Central processing units
It is the brain of the computer
It controls all internal and external devices, performs arithmetic and logic operations
It also carries out the instructions of a computer program
It is concerned with manipulations of data stored in memory
Input and output processors
Input and output processors transfer data between memory and peripheral devices such
as disks, drums, printers etc.
An I/O processor executes I/O instructions when it is activated by a command from the
CPU.
Programming the I/O processor is called I/O programming
Evolution of the components of a programming system
Assembler
In the early days, computer programmers wrote programs using 0’s and
1’s. Programmers found it difficult to write programs in machine language.
Assembly language is a low level programming language that allows a user to write
programs using letters and symbols (mnemonics), which are more easily remembered.
Assembler is a system program supplied by the computer manufacturer
Assemblers are the programs that translate the assembly language program(source code)
into machine language program(object code)
Loaders
The purpose of a loader is to ensure that object programs are placed in memory in an
executable form. The assembler itself could place the object program directly in memory and
transfer control to it, after which the machine language program is executed. But this has
two disadvantages:
1. Wastage of memory: the assembler itself occupies space in memory during
execution.
2. Wastage of translation time: the program must be retranslated before each
execution.
To avoid these problems, a new piece of system software called the loader was introduced.
If a program is very large, it is subdivided into smaller routines called
subroutines.
There are two types of subroutines:
1. Closed subroutines
2. Open subroutines
The task of adjusting programs, so that they may be placed in arbitrary memory locations
is called relocation.
A relocating loader performs four functions
1. Allocate space in memory for the programs (allocation)
2. Resolve symbolic references between object decks(linking)
3. Adjust all address dependent locations(relocation)
4. Physically place the machine instructions and data into memory(loading)
Based on the loading function, loaders are divided into the following types:
1. Compile and go
2. Absolute loader
3. Relocating loader
4. Direct linking
5. Dynamic loading and linking
Macros
To eliminate the need to repeat identical parts of a program, operating systems
provide a macro processing facility.
A macro lets the programmer define an abbreviation for a part of a program and use the
abbreviation anywhere in the program. The macro processor treats the part of the program defined
by the abbreviation as a macro definition.
The macro processor substitutes the definition for every occurrence of the abbreviation in
the program.
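The substitution step can be illustrated with a minimal Python sketch. The macro name INCR, its body and the sample program are hypothetical and are not taken from these notes; a real macro processor also handles arguments and nested definitions.

```python
# Hypothetical macro table: abbreviation -> the statements it stands for.
MACROS = {
    "INCR": ["A 1,DATA", "A 2,DATA", "A 3,DATA"],
}

def expand(source_lines):
    """Replace every macro call with the body of its definition."""
    expanded = []
    for line in source_lines:
        name = line.strip()
        if name in MACROS:
            expanded.extend(MACROS[name])   # macro call: substitute the definition
        else:
            expanded.append(line)           # ordinary statement: copy unchanged
    return expanded

program = ["L 1,X", "INCR", "ST 1,Y", "INCR"]
for statement in expand(program):
    print(statement)
```

Each of the two INCR calls is replaced by the three statements of the definition, so the programmer writes them only once.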
Compiler
As users began to concentrate on problems in particular areas such as scientific, business and
statistical computing, high level languages were developed to express such problems more
easily.
COBOL, FORTRAN, PASCAL, ALGOL, C etc. are high level languages, which are
processed by compilers and interpreters.
Formal systems
A formal system is an uninterpreted calculus. It consists of
o An alphabet
o A set of words called axioms and
o A finite set of relations called rules of inference
Examples of formal systems are set theory, Boolean algebra and Post systems
Uses of formal systems
o Used in the design, implementation and study of programming languages
o Used to specify the syntax and the semantics of programming languages
o Used in syntax directed compilation, compiler verification and complexity studies of
languages
Operating system
OS is a program that controls the execution of an application program and acts as an
interface between user and computer hardware.
The main functions of operating system are
o Job sequencing
o Job scheduling
o Input/output programming
o Secondary storage management
o User interface
o Error-handling
Answer the following questions:
1. Differentiate between system software and application software. (5M, June 2013)
2. Differentiate between compiler and interpreter. (5M, June 2013)
3. Define loader. Mention the functions of loader. (3M, May 2012)
Chapter-2
Machine structure, Machine language and Assembler language
History of IBM System/360/370
IBM System 360 was announced in April 1964.
System/360 was the first line of processors with upward and downward compatibility.
System/360 was the first major product line designed for both business and scientific
needs.
IBM System 370 was announced in June 1970.
Addressing and virtual memory architectures were introduced, which enhanced the performance
of the 370.
System/370 had an execution time 3 to 5 times faster than System/360 model 50 and 60.
General machine structure
The CPU consists of
An instruction interpreter
A location counter
An instruction register
Various working registers and
General purpose registers
Instruction interpreter
It is a group of electrical circuits that performs the purpose of the instructions fetched
from the memory.
It is like a decoder that decodes the type of the instruction.
Location counters (LC)
It is also known as the program counter or instruction counter; it holds the location of the
current instruction being executed
Instruction registers (IR)
It contains a copy of the current instruction that is executed.
Working registers (WR)
WR are memory devices that serve as ‘scratch pads’ (a number of multibit
storage locations) for the instruction interpreter.
WR are general purpose registers.
General purpose register
GPR’S are used by the programmer as storage locations and for special functions.
Memory address registers (MAR)
It contains the address of memory location that is to be read from or stored into.
Memory buffer register (MBR)
It contains a copy of the contents of the memory location specified by the MAR, after a read or
write.
Memory controller
It is hardware that transfers data between the MBR and the core memory location whose
address is in the MAR
I/O channels
They may be thought of as separate computers which interpret special instructions for
inputting and outputting information from the memory.
Example:
ADD 2, 176
Where ADD is the operation code.
2 is the register number and
176 is the memory location.
This means, add the contents of general register 2 with the data stored in the memory location
176 and store the result in register 2.
The sequence of hardware operations required to execute this instruction is given in the
flowchart.
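As a rough illustration of how the MAR and MBR take part in ADD 2, 176, here is a minimal Python sketch. The initial values (10 in register 2, 25 in location 176) and the function name execute_add are assumptions made for the example; the sketch shows only the operand fetch and the addition, not the full sequence of hardware steps.

```python
# Assumed machine state for the illustration.
memory = {176: 25}          # memory location 176 holds 25
registers = [0] * 16
registers[2] = 10           # general register 2 holds 10

def execute_add(reg, address):
    """Carry out ADD reg, address using the MAR and MBR."""
    mar = address               # MAR receives the address of the operand
    mbr = memory[mar]           # memory controller copies that location into the MBR
    registers[reg] += mbr       # interpreter adds the MBR to the register
    return registers[reg]

result = execute_add(2, 176)
print(result)  # 35
```

The memory location itself is left unchanged; only register 2 receives the sum.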
Machine structure of IBM 360
Memory
Memory is the device where information is stored and retrieved.
Information is stored in the form of 1’s and 0’s.
Each 1/0 is a separate binary digit called bit.
Bits are grouped into words, characters or bytes.
o Nibble 4 bits
o Byte 8 bits (1 byte)
o Half word 16 bits (2 bytes)
o Word 32 bits (4 bytes)
o Double word 64 bits (8 bytes)
Size of the IBM 360 memory is up to 2^24 bytes
Address: Memory locations are specified by addresses, where each address identifies a
specific byte or word, the contents of which may be data or instructions. The addressing scheme
used in IBM 360 is
Address=value of an offset + contents of a base register + contents of an index register.
Registers
Types of register are
Name of the register          Number   Size
General purpose registers     16       32 bits each
Floating point registers      4        64 bits each
Program status word           1        64 bits
General purpose registers
IBM 360 machine has 16 general purpose registers denoted by R0 to R15.
The general purpose registers may be used for various arithmetic and logical operations.
These registers act as scratch pads and as base registers; as base registers they aid in the
formation of the address.
For example
A 1,901(2, 15)
Where A - add opcode
1 - Argument register number
901 - offset
2 - Index register
15 - Base register
The total number of bits for an add instruction would be 32
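The 32 bits of A 1,901(2, 15) can be made concrete with a short Python sketch. It assumes the standard RX layout (8-bit opcode, 4-bit register, 4-bit index, 4-bit base, 12-bit offset) and the System/360 opcode 5A for A; the function name encode_rx is hypothetical.

```python
def encode_rx(opcode, r1, x2, b2, d2):
    """Pack the five RX fields into one 32-bit word."""
    assert 0 <= opcode < 256 and 0 <= d2 < 4096
    assert all(0 <= field < 16 for field in (r1, x2, b2))
    return (opcode << 24) | (r1 << 20) | (x2 << 16) | (b2 << 12) | d2

word = encode_rx(0x5A, 1, 2, 15, 901)   # A 1,901(2,15)
print(f"{word:08X}")                    # 5A12F385
print(format(word, "032b"))             # the same word as 32 bits
```

The offset 901 is 385 in hexadecimal and fits in the last 12 bits; the base register number F and index register number 2 each take 4 bits.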
Advantages of base register
It helps in the relocation of a program, i.e., an entire program may be
moved from one series of locations to another by changing the contents of the
base register
Efficient addressing of core
Saves 8 bits per address reference.
Disadvantages of base register
Overhead associated with the formation of the address during execution
Difficult to reach the data.
If the base register were not used, addressing all 2^24 byte locations would
need 24 bits for every address. An RX instruction would then hold an 8-bit opcode, a 4-bit
register field, a 4-bit index register field and a 24-bit address,
i.e., 40 bits would be required for each instruction instead of 32. The extra bits
waste memory.
Program status word (PSW)
It contains the value of the location counter, protection information and interrupt status.
It is 64 bits in length.
Data formats
The different types of data formats present in IBM system 360 are
1. Short form fixed point numbers
2. Long form fixed point numbers
3. Packed decimal numbers
4. Unpacked decimal numbers.
5. Short form floating point number
6. Long form floating point numbers
7. Logical or character data
1. Short form fixed point numbers
In this format 2 bytes (16 bits) of memory are allocated to store the data.
The first bit is reserved for the sign and the next 15 bits are reserved for the
integer.
The sign bit is set to 0 or 1 depending on the given number.
If the given number is positive, the sign bit is zero.
If the given number is negative, the sign bit is one.
Format:
Example 1: Decimal number +257 is represented as
S(+)  binary equivalent
0     000 0001 0000 0001
(16 bits)
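The 16-bit pattern can be checked with a small Python sketch. The function name to_halfword is hypothetical. The sketch also relies on the fact that System/360 stores negative fixed point numbers in two's complement form, so the leftmost bit of a negative number is 1 but the remaining bits are not simply the magnitude.

```python
def to_halfword(n):
    """Return the 16-bit pattern of n as bits grouped in fours."""
    if not -2**15 <= n < 2**15:
        raise ValueError("value does not fit in 16 bits")
    bits = format(n & 0xFFFF, "016b")        # masking gives two's complement for negatives
    return " ".join(bits[i:i + 4] for i in range(0, 16, 4))

print(to_halfword(257))    # 0000 0001 0000 0001
print(to_halfword(-257))   # 1111 1110 1111 1111
```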
2. Long form fixed point numbers
In this format 4 bytes (32 bits) of memory are allocated to store the data.
The first bit is reserved for the sign and the next 31 bits are reserved for the integer.
The sign bit is set to 0 or 1 depending on the given number.
If the given number is positive, the sign bit is zero.
If the given number is negative, the sign bit is one.
Format:
Example:
Decimal number +267 is represented as follows.
Starting at byte address 1016, it occupies four locations (4 bytes = 32 bits).
S(+)
0 000 0000   0000 0000   0000 0001   0000 1011
1016         1017        1018        1019
Byte address
3. Packed decimal numbers
The size of this format is 1 to 16 bytes.
The last 4 bits are reserved for the sign.
The sign is one of the hexadecimal digits A, B, C, D, E and F.
The hex digits C, A, F and E indicate a positive number,
while D and B indicate a negative number.
Instead of being stored as a binary number, each decimal digit is stored in 4-bit binary coded
decimal (BCD) format.
It leads to efficient usage of memory.
Format:
Example: -021 is represented as
2 bytes
0000 0010 0001   1101
BCD of 021       sign (-)
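The packing of -021 can be reproduced with a minimal Python sketch. The function name pack_decimal is hypothetical, and the sketch assumes C is used for a positive sign and D for a negative sign, which are two of the sign codes listed above.

```python
def pack_decimal(digits, negative):
    """Pack a digit string into bytes: one 4-bit nibble per digit, sign nibble last."""
    nibbles = [int(d) for d in digits] + [0xD if negative else 0xC]
    if len(nibbles) % 2:                 # pad with a leading zero to fill whole bytes
        nibbles.insert(0, 0)
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

packed = pack_decimal("021", negative=True)
print(packed.hex().upper())   # 021D
```

The result 021D in hexadecimal is the bit pattern 0000 0010 0001 1101 shown in the example.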
4. Unpacked decimal numbers
Format:
It is similar to the packed decimal number format.
The size of this format is also 1 to 16 bytes.
The last byte (4+4 bits) holds the sign and the last digit.
Each remaining byte holds a 4-bit zone code (padding) followed by the 4-bit BCD code of one
digit.
The hex digits C, A, F and E indicate a positive number,
while D and B indicate a negative number.
5. Short form floating point number
In this format 4 bytes (32 bits) are allocated.
The first bit is reserved for the sign.
The floating point number is divided into an exponent part and a fractional part.
Format:
Example: -118.625
4 bytes (32 bits)
1 bit   7 bits     24 bits
1       1000010    0111 0110 1010 0000 0000 0000
S       Exponent   Fraction
1 bit: sign bit (1, since the number is negative).
7 bits: exponent. The exponent is a power of 16 and is stored in excess-64 form. Here
118.625 = 76.A in hexadecimal = 0.76A x 16^2, so the stored exponent is 64+2=66, which is
1000010 in binary.
24 bits: fraction. The six hexadecimal digits of the fraction, 76A000, are written in binary as
0111 0110 1010 0000 0000 0000.
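The short form encoding can be checked with a Python sketch. It assumes the System/360 convention of a base-16 exponent stored in excess-64 form with a 24-bit fraction; the function name encode_short_float is hypothetical, and the fraction is simply truncated rather than rounded.

```python
def encode_short_float(value):
    """Encode value as sign (1 bit), excess-64 exponent (7 bits), fraction (24 bits)."""
    sign = 1 if value < 0 else 0
    magnitude = abs(value)
    if magnitude == 0:
        return 0
    exponent = 0
    while magnitude >= 1:          # scale down until the fraction is below 1
        magnitude /= 16
        exponent += 1
    while magnitude < 1 / 16:      # scale up until the first hex digit is non-zero
        magnitude *= 16
        exponent -= 1
    stored_exponent = exponent + 64            # excess-64 notation
    if not 0 <= stored_exponent <= 127:
        raise OverflowError("exponent does not fit in 7 bits")
    fraction = int(magnitude * 16**6)          # keep six hexadecimal digits
    return (sign << 31) | (stored_exponent << 24) | fraction

word = encode_short_float(-118.625)
print(f"{word:08X}")          # C276A000
print(format(word, "032b"))   # sign 1, exponent 1000010, fraction 0111 0110 1010 ...
```

C276A000 in binary is 1 1000010 0111 0110 1010 0000 0000 0000, i.e. a negative sign, a stored exponent of 66 and the fraction 76A000.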
6. Long form floating point numbers
Format:
It is the same as the short form floating point number, but 64 bits (8 bytes) are
allocated, which hold the sign, the exponent and a longer
fractional part.
7. Logical or character data
In character data each character code is stored in 8 bits (1 byte), and the length of a field varies from
1 to 256 bytes.
Format:
Example: ‘A’ is represented in IBM 360 (EBCDIC code, hex C1) as
1 byte
1100 0001
Instructions
An instruction includes an opcode, specifying the operation to be performed, such as
“add contents of memory to register” and zero or more operands, which may specify
registers, memory locations or literal data. The operand may have addressing modes
determining their meaning or may be in fixed fields.
The size or length of an instruction varies from 2 bytes to 6 bytes depending on the
instruction formats
Instructions are of different types. They are
o Arithmetic instructions
o Control or transfer instructions.
o Special interrupt instruction
The different types of operands
o Register operands
o Storage operands
o Immediate operands
Register operands
Register operands refer to data stored in one of the 16 general purpose registers, which
are addressed by 4 bit field in the instruction.
Storage operands
Storage operand refers to data stored in core memory. The length of the operand depends
upon the specific data type.
Immediate operand
Immediate operands are single byte of data and are stored as part of the instruction.
The different instruction formats are
o RR format
o RX format
o RS format
o SI format
o SS format
RR instruction:
RR instruction denotes a register to register operation, i.e., both operands are registers.
The length of RR instruction is 2 bytes (16 bits)
The general format of RR is: OP (8 bits), R1 (4 bits), R2 (4 bits)
Example: the instruction is AR 3,4 (add the contents of register 4 to register 3)
RX instruction:
RX instruction denotes a register and indexed storage operation
The length of RX instruction is 4 bytes (32 bits)
The general format of RX instruction is: OP (8 bits), R1 (4 bits), X2 (4 bits), B2 (4 bits), D2 (12 bits)
Indexed storage operand refers to the data stored in core memory. The address of the
storage operand is calculated as follows
Address = value of an offset or displacement + contents of a base register + contents of an
index register
        = C(B2)+C(X2)+D2
Example: A 3,16(0,5)
Assume base register 5 contains the number 1000
The address of the storage operand = C(B2)+C(X2)+D2
                                   = C(5)+0+16
                                   = 1000+0+16
                                   = 1016
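The address calculation can be written as a small Python sketch. The function name effective_address is hypothetical. The sketch relies on the System/360 rule that register number 0 in the index or base field means "no register", so it contributes 0 to the address.

```python
def effective_address(registers, d2, x2=0, b2=0):
    """Return D2 + C(X2) + C(B2); register number 0 contributes nothing."""
    index = registers[x2] if x2 != 0 else 0
    base = registers[b2] if b2 != 0 else 0
    return d2 + index + base

registers = [0] * 16
registers[5] = 1000                                    # base register 5 contains 1000
print(effective_address(registers, 16, x2=0, b2=5))    # 1016
```

The same function covers the RS and SI formats, which have no index register, by leaving x2 at 0.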
RS instructions
RS instruction denotes a register and storage operation
The length of RS type instruction is 4 bytes (32 bits).
The general format of RS instruction is: OP (8 bits), R1 (4 bits), R3 (4 bits), B2 (4 bits), D2 (12 bits)
Address = value of an offset or displacement + contents of a base register
        = C(B2)+D2
Example 1: LM 1,3,16(5)
4 bytes
OP              R1     R3     B2     D2
1001 1000       0001   0011   0101   0000 0001 0000
Load multiple   1      3      5      16
Address = C(B2) + D2
        = C(5)+16
        = 1000+16
        = 1016
The load multiple instruction loads registers 1 through 3 with the consecutive full words starting at location 1016
SI instruction
SI instruction denotes a storage and immediate operand operation
Immediate operands are single bytes of data stored as part of the instruction
The length of SI type instruction is 4 bytes (32 bits).
The general format of SI instruction is: OP (8 bits), I2 (8 bits), B1 (4 bits), D1 (12 bits)
Address = value of an offset or displacement + contents of a base register
        = C(B1)+D1
Example: MVI 4(5),I2
Address of storage operand = C(B1)+D1
                           = C(5)+4
                           = 1000+4
                           = 1004
SS instruction
SS instruction denotes a storage to storage operation
The length of SS instruction is 6 bytes (48 bits)
The general format is: OP (8 bits), L (8 bits), B1 (4 bits), D1 (12 bits), B2 (4 bits), D2 (12 bits)
Example 1: MVC 32(79,5),300(5)
In SS format, the length field is always one less than the number of bytes moved, i.e., if L=0, move 1 byte.
Here L=79, therefore 80 bytes are moved. With register 5 containing 1000, the bytes at locations
1300 to 1379 (1300+79) are copied to locations 1032 to 1111 (1032+79), since MVC copies the
second operand to the first.
Instruction set
The various categories of instructions are
1. load-store registers instructions
2. Fixed point arithmetic
3. Logical instructions
4. Transfer instructions
5. Miscellaneous instructions
Machine level language
Machine language is the basic language of the computer, representing data as 1’s and 0’s.
Example1: L 2,924(0, 1)
OP R1, D2(X2, B2)
Advantages of machine language
Instructions of a machine language program are executed immediately; they require no
compilation or translation.
Machine language makes efficient use of storage
Disadvantages
Machine languages are machine dependent
Machine language is difficult to program, since the programmer has to know the
architecture of the system.
Writing, reading, correcting or modifying a machine language program is difficult.
Example program:
Write a program that will add the number 49 to the contents of 10 adjacent full words
(32bits or 4 bytes) in memory with the following assumptions:
1. The 10 numbers are contiguous full words beginning at absolute location 952.
2. The program is in core memory starting at absolute location 48.
3. The number 49 is a full word at absolute location 948.
4. Register 1 contains a 48.
Fig shows the core memory structure.
This program can be written in four different ways
1. Long way, no looping.
2. Address modification using instruction as data.
3. Address modification using index registers.
4. Looping
Long way, no looping.
Register 2 is used as an accumulator
The index register field is 0; therefore no index register is used and its contribution to the address is 0.
Address of the storage operand=offset + contents of the base register.
Content of register 1 is 48.
L 2,904(0, 1)
Load the first number into register 2
Address of the storage operand =904+contents of base register1
=904+48
=952
Contents of register 2=contents of memory location 952=Data1
A 2,900(0, 1)
Add 49 with data1
Address of the storage operand =900+contents of base register1
=900+48
=948
Contents of register 2 =contents of register2+contents of memory location 948
=Data1+49
ST 2,904(0, 1)
Store the result back in place of the first number
Address of storage operand =904+contents of base register1
=904+48
=952
Contents of memory location 952 =contents of register2
=Data1+49
L, A and ST are RX type instructions, whose size is 4 bytes. Therefore the absolute and relative
addresses are incremented by 4 for each instruction.
Advantages
Implementation is easy.
Disadvantages
Instructions are repeated for all the data items
It is impossible to access both the first data item and the last data item using register 1 as
the base
Wastage of memory
Need of relocation.
Instruction would overlap data in the core.
Address modification using instructions as data
In this approach the program consists of only those 3 instructions, followed by a
sequence of commands that change the offsets of the load and store instructions by
adding 4 to them.
In addition to the 4 assumptions given in the problem statement, we make
one more assumption.
Assumption 5: relative location 896 contains a 4.
Here instruction is treated as data. Therefore adding 4 to an instruction will update its
offset.
For example, if location 48 contains the instruction L 2,904(0, 1)
The instruction is stored as follows from byte number 48
Now when we add 4 to this instruction, the offset, which ends in the 4th byte, is treated as data
and is incremented by 4.
New offset of L = 904+4
=908
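The effect of adding 4 to an instruction can be seen in a short Python sketch. It assumes the standard RX layout with the 12-bit offset in the low-order bits of the word and the System/360 opcode 58 for L; the variable names are hypothetical.

```python
load = 0x58201388          # L 2,904(0,1): opcode 58, R1=2, X2=0, B2=1, D2=388 hex (904)
modified = load + 4        # the instruction word is treated as an ordinary number

offset = modified & 0xFFF  # the low 12 bits hold the offset D2
print(offset)              # 908
print(f"{modified:08X}")   # 5820138C
```

Only the offset field changes; the opcode, register, index and base fields stay the same, so the instruction now loads the next full word.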
Advantages
Saves memory
Address is modified easily using the instruction.
Disadvantages
Treating instructions as data is not a good programming practice
Separate instructions are used for increasing the displacement(offset) of load and store
Address modification using index registers
In this approach, we use the same 3 instructions, i.e., load, add and store. We simply loop
through these 3 instructions, updating the storage operand addresses of the load and store
instructions by adding 4 to the contents of the index register during each pass.
Register 4 is used as an index register.
SR 4, 4
Clear register 4 by subtracting the contents of register 4 from register 4.
The contents of register 4=0
L 2,904(4,1)
Load data element of array
Address of the storage operand=904+contents of index register 4+contents of base
register 1
=904+0+48
=952
Content of register 2 =contents of memory location 952
=data1.
A 2,900(0, 1)
Add 49
Address of the storage operand =900+contents of base register 1
=900+48
=948
Contents of register 2 =contents of register 2+contents of memory
=Data1+49
ST 2,904(4, 1)
Replace data element
Address of the storage operand =904+contents of index register 4+contents of base
register1
=904+0+48
=952
Contents of memory location 952 =contents of register2
=Data1+49
A 4,896(0,1)
Add 4 to index register
Address of the storage operand =896+contents of base register1
=896+48
=944
Contents of index register 4 = Contents of index register 4+contents of memory
location944
=0+4=4
Note
SR is RR type instruction, whose length is 2 bytes
L is RX type instruction, whose length is 4 bytes
A is RX type instruction, whose length is 4 bytes
ST is RX type instruction, whose length is 4 bytes
Contents of index register 4 will be 4,8,12 etc during the subsequent passes
Advantages
Easy to understand
Saves memory
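The index register method can be traced with a minimal Python simulation. Memory is modelled as a dictionary from byte address to full-word value, the ten data values are made up for the illustration, and a Python for loop stands in for the loop-control instructions, which are described separately.

```python
# Assumed memory contents: 4 at location 944, 49 at 948, ten numbers from 952.
memory = {944: 4, 948: 49}
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]      # hypothetical sample values
for i, value in enumerate(data):
    memory[952 + 4 * i] = value

reg = [0] * 16
reg[1] = 48                                  # base register 1 contains 48

reg[4] = reg[4] - reg[4]                     # SR 4,4       clear the index register
for _ in range(10):                          # stands in for the loop control
    reg[2] = memory[904 + reg[4] + reg[1]]   # L  2,904(4,1) load data element
    reg[2] += memory[900 + reg[1]]           # A  2,900(0,1) add 49
    memory[904 + reg[4] + reg[1]] = reg[2]   # ST 2,904(4,1) replace data element
    reg[4] += memory[896 + reg[1]]           # A  4,896(0,1) add 4 to index register

result = [memory[952 + 4 * i] for i in range(10)]
print(result)
```

After the tenth pass every number has been increased by 49 and the index register contains 40.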
Looping
The additional assumptions made for this method are
Assumption 6: relative location 892 contains a 10
Assumption 7: relative location 888 contains a 1.
L 3,892(0, 1)
Load data into register 3
Address of the storage operand =892+C(B1)
=892+48
=940.
Content of register 3 =contents of memory location 940
=10
S 3,888(0, 1)
Subtract 1
Address of the storage operand =888+contents of register 1
=888+48
=936
C(R3) =C(R3)-contents of memory location936
=10-1
=9
ST 3,892(0, 1)
Store temp
Address of the storage operand =892+C(R1)
=892+48
=940
Content of memory location 940 =contents of register 3
=9
BC 2, 2(0,1)
Branch if result is positive.
2 denotes a condition code
Address of the storage operand =2+C(B1)
=2+48
=50
Assembly language
Definition
Assembly language is a low level programming language that allows users to write programs
using mnemonics (symbols)
Advantages
1. It is mnemonic
2. Reading is easier
3. Addresses are symbolic
4. Introduction of data to program is easier
5. It can be modified more easily than machine language programs.
Disadvantages
1. An assembler is required to translate the source program into the object program
2. It is machine dependent.
3. Lack of portability of programs between computers of different makes.
Pseudo opcodes
Pseudo opcode is an assembly language instruction that specifies an operation of the
assembler.
USING
USING is a pseudo opcode that indicates to the assembler which general purpose register
to use as a base register and what value it will contain at execution time
Syntax
USING <content of base register><GPR to be used as base register>
Ex:
USING *,5
START:
Start is a pseudo opcode that tells the assembler where the beginning of the program is
and allows the user to give a name to the program
Ex:
START sum
Or
sum START
END:
End is a pseudo opcode that tells the assembler that the last statement of the program has
been reached
Ex:
END
EQU:
EQU is the pseudo opcode which allows the program to define variables
Ex:
BASE EQU 15
DC (data constant)/(define constant):
DC is a declarative pseudo opcode used to create a memory area to hold a constant value
Syntax:
<Label>DC ‘constants’
Ex:
FOUR DC F’4’
DS (data storage)
DS is the pseudo opcode that reserves storage for the data and gives them a name
Syntax
<Label> DS ‘size’
Ex:
FOUR A DS 1F
DROP:
Drop is a pseudo opcode which indicates an unavailable base register and its contents
Syntax:
DROP<BS register number>
Ex:
DROP 15
LTORG:
LTORG is a pseudo opcode which tells the assembler to place the encountered literals at
an earlier location
Machine opcodes
BALR:
BALR is a branch and link instruction. It instructs the computer to load a
register with the next address and branch to the address specified in the second field.
BALR is used to load the base register. It is an executable statement and an RR type instruction
whose length is 2 bytes.
Ex: BALR 15,0
BR:
BR is a machine opcode indicating branch to the location whose address is in general
register
Ex: BR 14
BCT:
BCT indicates branch on count. It is an RX type instruction whose size is 4 bytes.
Ex: BCT 3, loop
Decrements register 3 by 1; if the result is not 0, branches back to loop.
Chapter-3
Assembler
Assembler is system software which is used to translate assembly level language into machine
level language program code
Functions of assembler
It takes the source program as input and produces the machine instructions
It converts each symbolic (mnemonic) instruction into its machine instruction
It decides the proper instruction format
It converts the data constants to internal machine representations
It writes the object program and the assembly listing
Design procedure for assembler
Specify the problem statement.
Specify the data structures (databases).
Define the format of the data structures.
Specify the algorithm.
Look for modularity.
Description of the above phases:
First we analyze the problem statement. Next we decide which databases the design needs, then
define the format of each database, write the algorithm, and finally look over these steps
for parts that can be made into separate modules.
Here the assembler is designed as a two-pass assembler.
The first pass defines the symbols and literals; at this stage the offset values cannot yet be determined.
The second pass generates the instructions and their addresses, i.e., fills in the offset values.
Step1: statement of a problem
Consider the following source program which has to be converted into machine language (object
program)
JOHN  START  0
      USING  *,15
      L      1,FIVE
      A      1,FOUR
      ST     1,TEMP
FOUR  DC     F’4’
FIVE  DC     F’5’
TEMP  DS     1F
Intermediate steps in assembling the program
Pass 1                           Pass 2
Relative loc.  Instruction       Relative loc.  Instruction
0              L  1,-(0,15)      0              L  1,16(0,15)
4              A  1,-(0,15)      4              A  1,12(0,15)
8              ST 1,-(0,15)      8              ST 1,20(0,15)
12             4                 12             4
16             5                 16             5
20             -                 20             -
Note: the load, add and store instructions are in RX format (4 bytes each), so they
are placed at relative addresses 0, 4 and 8
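The two passes over this program can be sketched in Python. This is a simplified illustration, not the full assembler design: the names SOURCE, LENGTH, pass1 and pass2 are hypothetical, every DC and DS is assumed to occupy one full word (4 bytes), and base register 15 is assumed to hold relative address 0, so each offset equals the value of the symbol.

```python
SOURCE = [
    ("JOHN", "START", "0"),
    (None,   "USING", "*,15"),
    (None,   "L",     "1,FIVE"),
    (None,   "A",     "1,FOUR"),
    (None,   "ST",    "1,TEMP"),
    ("FOUR", "DC",    "F'4'"),
    ("FIVE", "DC",    "F'5'"),
    ("TEMP", "DS",    "1F"),
]
# Length in bytes of each statement; START and USING generate no code.
LENGTH = {"L": 4, "A": 4, "ST": 4, "DC": 4, "DS": 4}

def pass1(source):
    """Pass 1: assign a relative location to every label (symbol table)."""
    symbols, lc = {}, 0
    for label, op, _ in source:
        if label:
            symbols[label] = lc
        lc += LENGTH.get(op, 0)
    return symbols

def pass2(source, symbols, base_reg=15):
    """Pass 2: replace each symbolic operand with offset(index, base)."""
    listing, lc = [], 0
    for _, op, operand in source:
        if op in ("L", "A", "ST"):
            reg, symbol = operand.split(",")
            listing.append((lc, f"{op} {reg},{symbols[symbol]}(0,{base_reg})"))
        lc += LENGTH.get(op, 0)
    return listing

symbols = pass1(SOURCE)
print(symbols)
for location, instruction in pass2(SOURCE, symbols):
    print(location, instruction)
```

Pass 1 cannot fill in the offset of FIVE when it meets L 1,FIVE because FIVE is defined later; once the symbol table is complete, pass 2 fills in 16, 12 and 20.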
Step 2: Specify data structures (databases)
Pass 1: databases
1. Input source program
2. Location counter (LC): keeps track of the location of each instruction
3. MOT (machine operation table): gives the symbolic mnemonic of each
instruction and its length
4. POT (pseudo-operation table): gives the symbolic mnemonic of each pseudo-op and the
action to be taken for it
5. ST (symbol table): stores every symbol (label) used in the program and its
corresponding value
6. LT (literal table): stores every literal (constant) used in the program and its
assigned location
7. A copy of the input, to be used later by pass 2
Pass 2: databases
1. Copy of the source program that was input to pass 1
2. LC: keeps track of the location of each instruction
3. MOT: gives, for each mnemonic, its length, binary opcode and
instruction format
4. POT: gives, for each pseudo-op, the action to be taken in pass 2
5. ST (symbol table): prepared by pass 1; it contains each label and its value
6. BT (base table): indicates which registers are currently specified as base registers by
USING pseudo-ops and what the contents of these registers are
7. INST: a work space used to hold each instruction as its various parts are assembled
(e.g. binary opcode, register fields, length fields)
8. Print line: a work space used to produce the printed listing
9. Punch card: a work space used to convert the assembled instructions into the format
needed by the loader
Step 3: Format of databases
After the second step, the third step is to define the format of the databases mentioned above.
MOT table
The MOT is used in both pass 1 and pass 2; the two passes can share one MOT.
The size of the MOT is 6 bytes per entry.
Pass 1 uses the mnemonic opcode and instruction length fields.
Pass 2 also uses the binary opcode and instruction format fields.
Instruction length codes
01 = 1 halfword = 2 bytes
10 = 2 halfwords = 4 bytes
11 = 3 halfwords = 6 bytes
Instruction formats
000=RR(2 bytes)
001=RX(4 bytes)
010=RS(4 bytes)
011=SI(4 bytes)
100 =SS(6 bytes)
Format of the MOT table
POT table
This table is a fixed table.
The same table is used in pass 1 and pass 2.
It contains each pseudo-op and the address of the corresponding routine.
The size of this table is 8 bytes per entry.
Format
Symbol table
It is a variable table.
The same table is used in both pass 1 and pass 2.
The size of the table is 14 bytes per entry.
The length field indicates the length in bytes of the instruction or data item to which the symbol is
attached.
The relocation field marks a symbol as relative or absolute. Absolute means the value of the
symbol does not change if the program is moved in core.
Format
Literal table
It is also a variable table.
It contains each literal and its value.
It has the same layout as the symbol table, but stores literals instead of symbols.
The size of the table is 14 bytes per entry.
Format
Base table
It is a variable table
It records which registers are currently specified as base registers and what they contain.
It is used only in pass 2.
The size of this table is 4 bytes per entry.
Step 4: Specify the algorithm
After the third step, the fourth step of the design is to specify the algorithms and flowcharts of
the assembler.
PASS 1: SYMBOLS, FLOWCHART AND ALGORITHM
The main purpose of pass 1 is to assign a location to each instruction and data-defining
pseudo-op, and so to define values for the symbols appearing in the label field of the source program.
ALGORITHM:
1. Initialize the location counter (LC) to zero, the first relative location of the program. It is
then increased by the length of each instruction or data item processed.
2. Read the next statement of the source program.
3. If the operation is a pseudo-op, determine which one it is: USING, DROP, END, DS, DC, EQU
or LTORG. For EQU, evaluate the expression in the operand field and assign its value to the
symbol in the label field. DS and DC affect both the location counter and the definition of
symbols in pass 1.
4. If it is the LTORG pseudo-op, assign locations to the literals collected so far.
5. If it is the END pseudo-op, pass 1 terminates.
6. If it is not a pseudo-op, it is a machine opcode: search the MOT for a match with the
source opcode and find the instruction length.
7. Repeat these steps until the END pseudo-op is encountered.
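The pass 1 steps above can be sketched in Python for the JOHN program from Step 1. This is a minimal illustration, not the flowchart from the text: the names MOT_LENGTHS, SOURCE and pass1 are hypothetical, only fullword (F) constants are handled, and an END statement is appended to the program as an assumption so that the loop has a terminator.

```python
# Tiny stand-in for the MOT: mnemonic -> instruction length in bytes (all RX here).
MOT_LENGTHS = {"L": 4, "A": 4, "ST": 4}
FULLWORD = 4

# (label, operation, operand) triples; the END line is an assumption of this sketch.
SOURCE = [
    ("JOHN", "START", "0"),
    ("",     "USING", "*,15"),
    ("",     "L",     "1,FIVE"),
    ("",     "A",     "1,FOUR"),
    ("",     "ST",    "1,TEMP"),
    ("FOUR", "DC",    "F'4'"),
    ("FIVE", "DC",    "F'5'"),
    ("TEMP", "DS",    "1F"),
    ("",     "END",   ""),
]

def pass1(source):
    """Assign a location to every statement and build the symbol table."""
    lc = 0
    symtab = {}
    listing = []
    for label, op, operand in source:
        if op == "START":
            lc = int(operand)
            if label:
                symtab[label] = lc
            continue
        if op == "END":
            break
        if op in ("USING", "DROP"):
            continue                      # no storage; processed in pass 2
        if label:
            symtab[label] = lc            # define the symbol at the current LC
        if op in MOT_LENGTHS:
            length = MOT_LENGTHS[op]
        elif op == "DC":
            length = FULLWORD             # only F-type constants in this sketch
        elif op == "DS":
            length = int(operand.rstrip("F")) * FULLWORD
        else:
            raise ValueError(f"unknown operation {op!r}")
        listing.append((lc, op, operand))
        lc += length
    return symtab, listing

symtab, listing = pass1(SOURCE)
for location, op, operand in listing:
    print(location, op, operand)
print(symtab)
```

The locations printed match the relative addresses 0, 4, 8, 12, 16 and 20 of the worked example; the operand offsets are left for pass 2.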
PASS 2: SYMBOLS, FLOWCHART AND ALGORITHM
The purpose of pass 2 is to process each card and generate its code, including the offset values.
ALGORITHM
1. Initialize the location counter to zero.
2. Read the next statement from the copy of the source file written by pass 1.
3. Process the opcode field.
4. If it is a USING or DROP pseudo-op, additional processing is required
in pass 2 (the base table is updated).
5. If it is a DC pseudo-op, convert the constant and output it, updating the LC.
6. If it is a DS pseudo-op, update the LC value.
7. If it is the END pseudo-op, terminate pass 2. Before terminating, generate the code for
any remaining literals.
8. If it is a START or LTORG pseudo-op, just print the card in the listing.
9. If it is a machine opcode, search the MOT to find the length, binary
opcode and format of the instruction.
10. The format field, which was not needed in pass 1, is used here: it tells pass 2 how to
process the operand fields depending on whether the instruction is RR, RX, RS, SI or SS.
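The pass 2 steps can be sketched in the same way. The code below assumes the symbol table that pass 1 would produce for the JOHN program and fills in the displacement(index, base) form of each RX operand from a base table set up by USING. The names SYMTAB, SOURCE, find_base and pass2 are hypothetical, the END line is an assumption, and only the L, A and ST instructions with fullword constants are handled.

```python
SYMTAB = {"JOHN": 0, "FOUR": 12, "FIVE": 16, "TEMP": 20}   # as built by pass 1

SOURCE = [
    ("JOHN", "START", "0"),
    ("",     "USING", "*,15"),
    ("",     "L",     "1,FIVE"),
    ("",     "A",     "1,FOUR"),
    ("",     "ST",    "1,TEMP"),
    ("FOUR", "DC",    "F'4'"),
    ("FIVE", "DC",    "F'5'"),
    ("TEMP", "DS",    "1F"),
    ("",     "END",   ""),
]

def find_base(address, base_table):
    """Pick a base register whose contents give a displacement in 0..4095."""
    for register, contents in base_table.items():
        if 0 <= address - contents < 4096:
            return register, address - contents
    raise ValueError("no base register covers this address")

def pass2(source, symtab):
    lc = 0
    base_table = {}                       # register -> value it is assumed to hold
    output = []
    for label, op, operand in source:
        if op == "START":
            lc = int(operand)
        elif op == "USING":
            value, register = operand.split(",")
            base_table[int(register)] = lc if value == "*" else symtab[value]
        elif op == "DROP":
            base_table.pop(int(operand), None)
        elif op == "END":
            break
        elif op in ("L", "A", "ST"):      # RX instructions, 4 bytes each
            register, symbol = operand.split(",")
            base, displacement = find_base(symtab[symbol], base_table)
            output.append((lc, f"{op} {register},{displacement}(0,{base})"))
            lc += 4
        elif op == "DC":
            output.append((lc, operand[2:-1]))   # F'4' -> 4
            lc += 4
        elif op == "DS":
            output.append((lc, "-"))
            lc += 4 * int(operand.rstrip("F"))
    return output

for location, text in pass2(SOURCE, SYMTAB):
    print(location, text)
```

The printed lines reproduce the Pass 2 column of the worked example, for instance L 1,16(0,15) at relative address 0.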
Sorting
Definition: Sorting is the process of arranging the items in a table in some sequence (ascending or descending).
The different types of sorting techniques are
1. Interchange sort (also called sinking, bubble or sifting sort)
2. Shell sort
3. Bucket sort or radix sort
4. Address calculation sort
5. Radix exchange sort
1. Interchange sort (sinking, bubble or sifting sort)
1. The goal is to arrange a list of items in ascending order.
2. Compare each adjacent pair of items in the list in turn, swapping the items if they are out of
order and otherwise leaving them as they are.
3. The number of comparisons and the number of passes depend on the number of elements in
the table.
4. After each pass, the number of comparisons needed is reduced by one.
5. The total number of passes for N elements is at most N-1.
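A minimal Python sketch of the interchange sort described above; the function name interchange_sort is chosen only for illustration.

```python
def interchange_sort(items):
    """Return a new list sorted in ascending order by adjacent exchanges."""
    data = list(items)
    n = len(data)
    for completed in range(n - 1):            # at most N-1 passes
        for i in range(n - 1 - completed):    # one comparison fewer after each pass
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data

print(interchange_sort([19, 13, 5, 27, 1, 26]))
```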
2. Shell sort
Shell sort was invented in 1959 by Donald Shell.
Steps to sort
1. Calculate the distance value d.
2. The initial value is d1 = N/2, where N is the number of data elements.
3. In every pass, each item is compared with the one located d positions further along in the
array of items.
4. If the item in the higher position has the lower value, an exchange is made.
5. Continue this process with the next distance, d(i+1) = (d(i) + 1)/2 (integer part), until a pass with d = 1 has been completed.
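The Shell sort steps can be sketched as follows. The sketch assumes the distance sequence d = N/2, then (d + 1)/2 with integer division, ending after the pass with d = 1. Within each pass it uses a gapped insertion (items d apart are compared and moved until they are in order), which is one common way to make each pass complete; the name shell_sort is illustrative.

```python
def shell_sort(items):
    """Return a new ascending list using diminishing comparison distances."""
    data = list(items)
    d = len(data) // 2                     # initial distance d1 = N/2
    while d >= 1:
        for i in range(d, len(data)):
            item = data[i]
            j = i
            # Move larger items that are d positions back up by d.
            while j >= d and data[j - d] > item:
                data[j] = data[j - d]
                j -= d
            data[j] = item
        if d == 1:
            break                          # the pass with distance 1 is the last
        d = (d + 1) // 2                   # next distance
    return data

print(shell_sort([19, 13, 5, 27, 1, 26, 31, 16]))
```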
3. Bucket sort or radix sort
It is a simple, distributive sorting technique.
Steps to sort
1. Set up an array of initially empty buckets numbered 0 to 9.
2. Distribute the items into the buckets based on their least significant digit, then merge the bucket elements in order.
3. Distribute the merged elements from step 2 based on the next significant digit, merge the
buckets again, and so on until the most significant digit has been processed.
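A short Python sketch of the bucket (radix) sort steps, assuming non-negative decimal integers; the name radix_sort is illustrative.

```python
def radix_sort(items):
    """Return a new ascending list of non-negative integers."""
    data = list(items)
    if not data:
        return data
    place = 1                                   # 1 = units digit, 10 = tens digit, ...
    while max(data) // place > 0:
        buckets = [[] for _ in range(10)]       # buckets 0 to 9
        for value in data:
            buckets[(value // place) % 10].append(value)
        # Merge the buckets in order; earlier order within a bucket is preserved.
        data = [value for bucket in buckets for value in bucket]
        place *= 10
    return data

print(radix_sort([19, 13, 5, 27, 1, 26, 31, 16, 2, 9, 11, 21]))
```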
4. Address calculation sort
It can be one of the fastest sorting techniques when enough table space is available.
Steps to sort
1. Find the number of elements (the table size).
2. Find the highest number in the list.
3. Find the address divisor:
Address = highest number / number of elements
4. Find the calculated address of each item:
Calculated address = data item / address
Divide each table entry by the address divisor and take the integer part. If that position is empty,
place the element there. If there is a conflict, search linearly from that position and move
items down the list so that the new element can be inserted in order.
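The address calculation steps can be sketched as below. The sketch assumes non-negative integers and a work table of 2N + 1 slots so that displaced items always have room; the calculated address is computed as item * N // highest, which equals the integer part of item / (highest / N). The name address_calculation_sort is illustrative.

```python
def address_calculation_sort(items):
    """Return a new ascending list of non-negative integers."""
    data = list(items)
    n = len(data)
    if n == 0:
        return []
    highest = max(data)
    table = [None] * (2 * n + 1)               # extra slots absorb conflicts
    for item in data:
        # Calculated address: integer part of item / (highest / n).
        pos = item * n // highest if highest else 0
        # Conflict: step past smaller or equal items already placed.
        while table[pos] is not None and table[pos] <= item:
            pos += 1
        # Find the next empty slot and move the larger items down by one.
        end = pos
        while table[end] is not None:
            end += 1
        table[pos + 1:end + 1] = table[pos:end]
        table[pos] = item
    return [value for value in table if value is not None]

print(address_calculation_sort([19, 13, 5, 27, 1, 26, 31, 16, 2, 9, 11, 21]))
```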
5. Radix exchange sort
It is a distributive sorting technique.
Steps to sort
1. Convert the decimal numbers to binary.
2. Order the group on a given bit, starting with the most significant bit: scan down from the top of the
group for a one bit and up from the bottom for a zero bit.
3. Exchange these two items and continue the scan until the two scans meet; then sort each of the two resulting subgroups on the next bit.
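A Python sketch of radix exchange sort under the assumption that the keys are non-negative integers; the group is partitioned on one bit at a time, from the most significant bit downward. The name radix_exchange_sort is illustrative.

```python
def radix_exchange_sort(items):
    """Return a new ascending list of non-negative integers."""
    data = list(items)
    if not data:
        return data
    top_bit = max(data).bit_length() - 1       # most significant bit in use

    def sort_group(lo, hi, bit):
        if lo >= hi or bit < 0:
            return
        i, j = lo, hi
        while i <= j:
            while i <= j and not (data[i] >> bit) & 1:
                i += 1                         # scan down from the top for a one bit
            while i <= j and (data[j] >> bit) & 1:
                j -= 1                         # scan up from the bottom for a zero bit
            if i < j:
                data[i], data[j] = data[j], data[i]
                i += 1
                j -= 1
        sort_group(lo, j, bit - 1)             # items with a zero in this bit
        sort_group(i, hi, bit - 1)             # items with a one in this bit

    sort_group(0, len(data) - 1, top_bit)
    return data

print(radix_exchange_sort([19, 13, 5, 27, 1, 26, 31, 16]))
```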
Chapter-4
Macro language and the macro processor
In most programs we need to repeat the same block of code several times. In such
situations the macro facility is very useful.
"Macro instructions are single-line abbreviations for groups of instructions."
For every occurrence of a macro instruction in a program, the macro processing assembler
substitutes the entire block.
Advantages
The frequent use of macros can reduce programmer-induced errors.
They facilitate standardization.
Use of macro processors
The most common use of macro processors is in assembly language programming.
They can also be used with high-level programming languages.
They can also be used with operating system command languages.
General-purpose macro processors are not tied to any particular language.
Macro definition
"In this phase we attach a name to a sequence of instructions."
Structure
MACRO              start of definition
[ ]                macro name
------------
------------       sequence of instructions
MEND               end of macro definition
The macro definition starts with the MACRO pseudo-op, which indicates the beginning of
the macro definition.
The line following the MACRO pseudo-op gives the name of the macro.
After that comes the sequence of instructions being abbreviated.
The macro definition is terminated by the MEND pseudo-op.
Example
MACRO
INCR
A 1,DATA
A 2,DATA
A 3,DATA
MEND
DATA DC F'5'
Where
MACRO -> beginning of the macro definition
INCR -> name of the macro
MEND -> end of the macro definition
Between INCR and MEND we write the sequence of instructions that the macro stands for.
Macro call
Once the macro has been defined, the macro name can be used in place of the sequence of
instructions,
i.e. "In the program the repeated instructions are replaced by the macro name. This is called
a macro call or macro instruction."
Or
Each occurrence of the macro name in the source program is a macro call.
MACRO
INCR
A 1,DATA
A 2,DATA
A 3,DATA
MEND
.
.
INCR
.
.
INCR
.
.
DATA DC F ‘5’
.
.
Macro expansion
"The macro processor substitutes the macro definition in place of the macro call.
This is called macro expansion."
In the expanded source code, the MACRO and MEND lines and the macro name do not appear.
All the remaining lines appear unchanged.
Example
Source                  Expanded source
MACRO
INCR
A 1,DATA
A 2,DATA
A 3,DATA
MEND
.
.
INCR                    A 1,DATA
                        A 2,DATA
                        A 3,DATA
.
.
INCR                    A 1,DATA
                        A 2,DATA
                        A 3,DATA
.
.
DATA DC F'5'            DATA DC F'5'
.
.
Features of macro facility
The important features of the macro facility are
1. Macro instruction arguments
2. Conditional macro expansion
3. Macro calls within macros
4. Macro instructions defining macros
1. Macro instruction arguments
The macro facility presented so far lacks flexibility, because every call to
a given macro is replaced by an identical block. There is no way for a specific macro
call to modify the code that replaces it.
To overcome this problem we use macro instruction arguments, where arguments or
parameters appear in the macro calls.
A 1, DATA1
A 2, DATA1
A 3, DATA1
A 1, DATA2
A 2, DATA2
A 3, DATA2
DATA1 DC F ‘5’
DATA2 DC F ‘10’
In this program the two sequences of instructions are similar but not identical. The first
sequence performs the operation using DATA1 as operand and the second performs
it using DATA2 as operand, so the same operation is performed with different
parameters. In the macro definition such parameters are known as macro instruction arguments
or dummy arguments.
A macro instruction argument is specified on the macro name line with an '&' as its first
character.
The '&' symbol identifies a macro language symbol.
Source                  Expanded source
MACRO
INCR &arg
A 1,&arg
A 2,&arg
A 3,&arg
MEND
.
.
INCR DATA1              A 1,DATA1
                        A 2,DATA1
                        A 3,DATA1
.
.
INCR DATA2              A 1,DATA2
                        A 2,DATA2
                        A 3,DATA2
.
.
DATA1 DC F'5'           DATA1 DC F'5'
DATA2 DC F'10'          DATA2 DC F'10'
.
.
More than one argument in a macro call
It is possible to supply more than one argument in a macro call. The arguments are separated
by commas.
Source                           Expanded source
MACRO
INCR &arg1,&arg2,&arg3
A 1,&arg1
A 2,&arg2
A 3,&arg3
MEND
.
.
INCR DATA1,DATA2,DATA3           A 1,DATA1
                                 A 2,DATA2
                                 A 3,DATA3
.
.
INCR DATA3,DATA2,DATA1           A 1,DATA3
                                 A 2,DATA2
                                 A 3,DATA1
.
.
DATA1 DC F'5'                    DATA1 DC F'5'
DATA2 DC F'10'                   DATA2 DC F'10'
DATA3 DC F'15'                   DATA3 DC F'15'
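The substitution of call arguments for dummy arguments can be sketched in Python as below. This is a simplified illustration and not the macro processor design given later in the chapter: it assumes that macro calls carry no label, that arguments are matched by position only, and that at most nine dummy arguments are used; the names expand and SOURCE are hypothetical.

```python
def expand(source):
    """Expand macro calls, substituting call arguments by position."""
    macros = {}                                # name -> (dummy arguments, body lines)
    output = []
    lines = iter(source)
    for line in lines:
        fields = line.split(None, 1)
        if fields and fields[0] == "MACRO":
            # The line after MACRO holds the macro name and its dummy arguments.
            name, *rest = next(lines).split(None, 1)
            dummies = [d.strip() for d in rest[0].split(",")] if rest else []
            body = []
            for body_line in lines:
                if body_line.strip() == "MEND":
                    break
                body.append(body_line)
            macros[name] = (dummies, body)
        elif fields and fields[0] in macros:
            dummies, body = macros[fields[0]]
            args = [a.strip() for a in fields[1].split(",")] if len(fields) > 1 else []
            for body_line in body:
                for dummy, arg in zip(dummies, args):
                    body_line = body_line.replace(dummy, arg)
                output.append(body_line)
        else:
            output.append(line)
    return output

SOURCE = [
    "MACRO",
    "INCR &arg1,&arg2,&arg3",
    "A 1,&arg1",
    "A 2,&arg2",
    "A 3,&arg3",
    "MEND",
    "INCR DATA1,DATA2,DATA3",
    "INCR DATA3,DATA2,DATA1",
    "DATA1 DC F'5'",
    "DATA2 DC F'10'",
    "DATA3 DC F'15'",
]

for line in expand(SOURCE):
    print(line)
```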
Arguments to a macro call can be specified using two different methods:
1. Positional arguments
2. Keyword arguments
Positional arguments
Positional arguments are matched with the dummy arguments according to the order in
which they appear in the macro call.
Keyword arguments
Keyword arguments allow reference to dummy arguments by name as well as by
position.
2. Conditional macro expansion:
The sequence of statements produced by a macro expansion can be changed based on some condition. This is called
conditional macro expansion.
There are two important macro pseudo-ops:
1. AIF
2. AGO
AIF
It is a conditional branching pseudo-op.
The general format is
AIF (<expression>).<label name>
If the expression (condition) is true, control transfers to the statement with that label.
If it is false, the statement following the AIF is processed next.
Labels used in these statements start with a period (.) followed by the label name.
AIF statements and their labels do not appear in the expanded code.
AGO
It is an unconditional branching pseudo-op.
It is comparable to a goto statement.
Syntax
AGO <sequence label>
Example
Source                                  Expanded source
        MACRO
        VARY  &count,&arg1,&arg2,&arg3
        A     1,&arg1
        AIF   (&count EQ 1).FINI
        A     2,&arg2
        AIF   (&count EQ 2).FINI
        A     3,&arg3
.FINI   MEND
        .
LOOP1   VARY  3,data1,data2,data3       LOOP1   A  1,data1
                                                A  2,data2
                                                A  3,data3
        .
LOOP2   VARY  2,data3,data2             LOOP2   A  1,data3
                                                A  2,data2
        .
LOOP3   VARY  1,data1                   LOOP3   A  1,data1
        .
data1   DC    F'5'
data2   DC    F'10'
data3   DC    F'15'
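How AIF and AGO steer an expansion can be sketched in Python. The sketch interprets the body of the VARY macro for one call at a time; it assumes only EQ comparisons, labels that begin with a period, and calls without a label field. The names BODY, DUMMIES and expand_vary are hypothetical.

```python
import re

DUMMIES = ["&count", "&arg1", "&arg2", "&arg3"]
BODY = [
    "A 1,&arg1",
    "AIF (&count EQ 1).FINI",
    "A 2,&arg2",
    "AIF (&count EQ 2).FINI",
    "A 3,&arg3",
    ".FINI MEND",
]

def expand_vary(args):
    """Expand one VARY call; AIF, AGO and labels never reach the output."""
    binding = dict(zip(DUMMIES, args))

    def substitute(text):
        for dummy, value in binding.items():
            text = text.replace(dummy, value)
        return text

    # Map each sequence label (.NAME) to the index of its body line.
    labels = {line.split()[0]: i for i, line in enumerate(BODY) if line.startswith(".")}
    output = []
    index = 0
    while index < len(BODY):
        line = BODY[index]
        if line.startswith("."):
            line = line.split(None, 1)[1]       # drop the sequence label
        operation = line.split()[0]
        if operation == "MEND":
            break
        if operation == "AIF":
            match = re.fullmatch(r"AIF \((\S+) EQ (\S+)\)(\.\w+)", substitute(line))
            left, right, target = match.groups()
            index = labels[target] if left == right else index + 1
        elif operation == "AGO":
            index = labels[line.split()[1]]     # unconditional branch
        else:
            output.append(substitute(line))
            index += 1
    return output

print(expand_vary(["3", "data1", "data2", "data3"]))
print(expand_vary(["2", "data3", "data2"]))
print(expand_vary(["1", "data1"]))
```

The three calls produce three, two and one A instructions respectively, matching the expanded column of the example.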
3. Macro calls within macros
Sometimes one macro calls another macro. This is called a macro call within a
macro: a macro can be called within the definition of another macro.
Example
MACRO
ADD1 &arg
L 1,&arg
A 1,=F'1'
ST 1,&arg
MEND
MACRO
ADDS &arg1,&arg2,&arg3
ADD1 &arg1
ADD1 &arg2
ADD1 &arg3
MEND
Source                      Expansion of ADDS     Expansion of ADD1
ADDS Data1,Data2,Data3      ADD1 Data1            L  1,Data1
                                                  A  1,=F'1'
                                                  ST 1,Data1
                            ADD1 Data2            L  1,Data2
                                                  A  1,=F'1'
                                                  ST 1,Data2
                            ADD1 Data3            L  1,Data3
                                                  A  1,=F'1'
                                                  ST 1,Data3
.
Data1 DC F'5'
Data2 DC F'10'
Data3 DC F'15'
4. Macro instructions defining macros
A macro definition can itself contain another macro definition. This is sometimes called "macro definitions within macro
definitions".
The inner macro is not defined until after the outer macro has been
called.
Example
MACRO
DEFINE &FUN
MACRO
&FUN &A
…………….
…………….
…………….
MEND
MEND
DEFINE WEL
WEL DATA1
In the above example DEFINE is the name of the outer macro.
Within the outer macro, the inner macro is declared with a dummy macro name, &FUN. The call
DEFINE WEL defines a macro named WEL, which can then be called as WEL DATA1.
Implementation of macro processor
The macro processor takes as input an assembly language program that contains macro
definitions and macro calls.
It transforms this into another program in which all macro calls have been replaced
by the corresponding macro bodies.
The output of the macro processor is an assembly language program containing no
macros.
Statement of problem
There are four basic tasks or functions performed by macro processor.
They are
1. Recognize macro definitions
2. Save the definitions
3. Recognize calls
4. Expand calls and substitute arguments
Recognize macro definitions
Macro definitions are recognized by the MACRO and MEND pseudo-ops.
When macro definitions are nested, the macro processor must recognize the nesting and
correctly match the last (outer) MEND with the first MACRO.
Save the definitions
The macro processor stores all the macro instruction definitions in memory.
Recognize calls
It must recognize macro calls (i.e. macro names) that appear as operation mnemonics.
Expand calls and substitute arguments
The macro processor replaces the dummy arguments of the macro definition with the corresponding
arguments from the macro call. The resulting assembly language text is then substituted for the macro call.
Database specification
Pass 1 database:
1. The input macro source deck
2. The output macro source deck copy, for use by pass 2
3. The macro definition table (MDT), used to store the body of the macro definitions
4. The macro name table (MNT), used to store the names of the defined macros
5. The macro definition table counter (MDTC), used to indicate the next available entry in the MDT
6. The macro name table counter (MNTC), used to indicate the next available entry in the MNT
7. The argument list array (ALA), used to substitute index markers for dummy arguments before the definition is stored
Pass 2 database:
1. The copy of the input macro source deck
2. The output expanded source deck, to be used as input to the assembler
3. The macro definition table (MDT), created by pass 1
4. The macro name table (MNT), created by pass 1
5. The macro definition table pointer (MDTP), used to indicate the next line of text to be used during
macro expansion
6. The argument list array (ALA), used to substitute macro call arguments for the index markers in
the stored macro definition
Specification of database format
Argument list array (ALA)
The argument list array (ALA) maintains the details about the parameters. The ALA is used
during both pass 1 and pass 2, but its function is reversed between the two passes.
Pass 1: ALA
In pass 1, when the macro definitions are stored, the arguments in the macro definition
are replaced by index markers, where # is the index marker symbol.
Example
        MACRO
&LOOP1  INCR  &arg1,&arg2,&arg3
&LOOP1  A     1,&arg1
        A     2,&arg2
        A     3,&arg3
        MEND
        .
        .
LOOP1   INCR  DATA1,DATA2,DATA3
During pass 1 every argument inside the body of the macro definition is replaced by the
corresponding index marker:
&LOOP1  INCR  &arg1,&arg2,&arg3
#0      A     1,#1
        A     2,#2
        A     3,#3
        MEND
Pass 2: ALA
During pass 2, the arguments in the macro call are substituted for the index markers stored in
the macro definition.
For example, consider the macro call
LOOP1   INCR  DATA1,DATA2,DATA3
The macro call expander would prepare the following ALA:
Index   Argument
0       LOOP1
1       DATA1
2       DATA2
3       DATA3
Macro definition table (MDT)
The MDT is used to store the body of the macro definitions.
The size of the macro definition table is 80 bytes per entry. Every line of the macro
definition, except the MACRO line, is stored in the MDT.
Macro name table (MNT)
The MNT is used to store the names of the defined macros.
Each MNT entry consists of the macro name, whose size is 8 bytes, and the MDT
index, whose size is 4 bytes. Therefore the size of the MNT is 12 bytes per entry.
Two pass macro processor algorithm
Pass1
Step 1: Initialize the macro definition table counter (MDTC) and the macro name table counter (MNTC) to 1.
Step 2: Read one line from the source program.
Step 3.a: Check whether it is a MACRO pseudo-op.
If it is, a macro definition has been encountered, and the entire definition
that follows the MACRO pseudo-op must be stored in the macro definition table.
To do that:
i) Read the next line from the source program (the macro name line).
ii) Enter the macro name and the current value of MDTC into entry number MNTC of the
macro name table (MNT).
iii) Increment MNTC.
iv) Prepare the argument list array (ALA).
v) Enter the macro name line into the macro definition table (MDT).
vi) Increment MDTC.
vii) Read the next line from the source program.
viii) Substitute index notation for the arguments.
ix) Enter the line into the MDT, with index markers in place of the arguments.
x) Increment MDTC.
xi) Check whether the line is a MEND pseudo-op.
a) If it is a MEND pseudo-op, go to step 2.
b) If it is not a MEND pseudo-op, go to step 3.a.vii.
Step 3.b: If it is not a MACRO pseudo-op:
i) Write a copy of the source card.
ii) Check whether the line is an END pseudo-op.
a) If it is an END pseudo-op, the end of the program has been reached and all the macro
definitions have been processed. Therefore go to pass 2 to process the macro calls.
b) If it is not an END pseudo-op, go to step 2.
Pass 2
Step 1: Read the next line from the source program copied by pass 1.
Step 2.a: Search the macro name table (MNT) for a match with the operation code,
i.e. check whether a macro call has been encountered.
I. If a macro name is found, set the macro definition table pointer (MDTP) to the corresponding macro
definition stored in the macro definition table (MDT) by assigning the MDT index field of
the MNT entry to MDTP.
II. Prepare the argument list array (ALA).
III. Increment the macro definition table pointer (MDTP).
IV. Get the next line from the macro definition table (MDT).
V. Substitute the arguments from the macro call.
VI. Check whether the line is a MEND pseudo-op.
a. If it is a MEND pseudo-op, the macro expansion is over, so go to step 1 to continue scanning
the input file.
b. If it is not a MEND pseudo-op, write the expanded source card and go to step 2.a.III.
Step 2.b: If a macro call has not been encountered, i.e. no macro name is found, write the line into the
expanded source card file.
Step 2.c: Check whether the line is an END pseudo-op.
I. If it is an END pseudo-op, transfer the expanded source file to the assembler for further
processing.
II. If it is not an END pseudo-op, go to step 1.
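The two passes can be sketched together in Python. This is a simplified illustration under stated assumptions: tables are Python lists and dictionaries indexed from 0 (the algorithm above counts from 1), macro calls carry no label, nested definitions are not handled, and at most nine arguments are used. The names pass1, pass2 and SOURCE are hypothetical.

```python
def pass1(source):
    """Store macro definitions in the MDT/MNT and copy all other lines."""
    mdt, mnt, copy = [], {}, []
    lines = iter(source)
    for line in lines:
        if line.strip() != "MACRO":
            copy.append(line)                    # not a definition: copy for pass 2
            if line.split()[:1] == ["END"]:
                break
            continue
        prototype = next(lines)                  # macro name line
        name, _, arg_text = prototype.strip().partition(" ")
        ala = [a.strip() for a in arg_text.split(",") if a.strip()]
        mnt[name] = len(mdt)                     # MNT entry: name -> MDT index
        mdt.append(prototype)
        for body in lines:
            for index, dummy in enumerate(ala, start=1):
                body = body.replace(dummy, f"#{index}")   # index markers
            mdt.append(body)
            if body.strip() == "MEND":
                break
    return mdt, mnt, copy

def pass2(mdt, mnt, copy):
    """Replace every macro call in the copy by its expanded body."""
    expanded = []
    for line in copy:
        fields = line.split(None, 1)
        if fields and fields[0] in mnt:
            ala = [a.strip() for a in fields[1].split(",")] if len(fields) > 1 else []
            mdtp = mnt[fields[0]] + 1            # first line after the name line
            while mdt[mdtp].strip() != "MEND":
                text = mdt[mdtp]
                for index in range(len(ala), 0, -1):
                    text = text.replace(f"#{index}", ala[index - 1])
                expanded.append(text)
                mdtp += 1
        else:
            expanded.append(line)
    return expanded

SOURCE = [
    "MACRO",
    "INCR &arg1,&arg2,&arg3",
    "A 1,&arg1",
    "A 2,&arg2",
    "A 3,&arg3",
    "MEND",
    "INCR DATA1,DATA2,DATA3",
    "DATA1 DC F'5'",
    "DATA2 DC F'10'",
    "DATA3 DC F'15'",
    "END",
]

mdt, mnt, copy = pass1(SOURCE)
print(mdt)
print(mnt)
print(pass2(mdt, mnt, copy))
```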
A single pass algorithm
Suppose a macro contains a call to another macro or an inner macro definition. Defining the macros and
expanding them in two separate passes takes more time and more computation. To
avoid this, the macros can be both defined and expanded within a single pass.
The single-pass algorithm uses two additional variables:
1. A macro definition input (MDI) indicator
2. A macro definition level counter (MDLC)
The MDI and MDLC are used to keep track of macro calls and macro definitions.
The MDI is "ON" during macro call expansion and "OFF" at all other times.
The MDLC is incremented by 1 when a MACRO pseudo-op is encountered and
decremented by 1 when a MEND pseudo-op occurs.
Two separate argument list arrays (ALAs) are used: one for macro definitions
and another for macro call expansions.
Implementation of macro processor
For example
MACRO
DEFINE &SUB
MACRO
&SUB &Y
CNOP 0,4
BAL 1,*+8
DC A(&Y)
L 15,=V(&SUB)
BALR 14,15
MEND
MEND
MDT: macro definition table
Index   Card
1       DEFINE &SUB
2       MACRO
3       #1 &Y
4       CNOP 0,4
5       BAL 1,*+8
6       DC A(&Y)
7       L 15,=V(#1)
8       BALR 14,15
9       MEND
10      MEND
MNT: macro name table
Index   Name     MDT index
1       DEFINE   1
Advantages of implementing the macro processor within pass 1 of the assembler are
1. Functions common to both do not have to be implemented twice.
2. Less overhead: it is not necessary to create intermediate files as output from the macro processor
and input to the assembler.
3. More flexibility for the user.
Disadvantages
1. The algorithm is too large.
2. More complex
Answer the following questions:
1. What do you mean by Macro? (1M May/June 2011)
2. Explain one pass macroprocessor with its flow chart. (7M May/June 2011)
3. What is Macro-Processor? (1M April/May 2012)
Chapter-5
Loaders
Definition:
Loader is a system software program that performs the loading function.
Loading is the process of placing the program into memory for execution. The loader is
responsible for initiating the execution of the process.
Functions of loader:
1. Allocation: Space for the program is allocated in main memory, based on the
size of the program.
2. Linking: Combines two or more separate object programs and supplies the
information needed for references between them.
3. Relocation: Adjusts all address-dependent locations in the object program so that
it can be loaded at an address different from the location originally
specified.
4. Loading: Physically places the machine instructions and data into memory.
Loaders Scheme or types of Loader:
1. Compile and go loader or Assemble and go loader
2. General loader scheme
3. Absolute loader
4. Direct linking loader
5. Relocating loader
6. Dynamic linking loader
1 Compile and go loader: In this scheme the assembler itself places the assembled instructions
directly into their designated memory locations, so it also does the work of a link editor and program loader.
After the assembly process is complete, it assigns the starting address of the program to the
location counter (i.e. transfers control to it), so there is no stop between the compilation, link editing, loading and
execution of the program.
Advantages:
They are simple and easier to implement
Disadvantages:
This loader can handle only one object program at a time.
A portion of memory is wasted because the assembler itself occupies memory while
each program runs, and the assembler must retranslate the user program every
time it is run.
It is very difficult to handle multiple segments or subprograms.
2 General loader scheme:
The general loader overcomes the drawbacks of the previous scheme, the compile-and-go loader.
This loader accepts multiple object programs at a time.
The loader is generally assumed to be smaller than the assembler, so more memory is available to the
user.
[Figure: compile-and-go loader scheme (source program -> assembler / compile-and-go loader -> program loaded in memory) and general loader scheme (source programs -> translators -> object programs -> loader -> object program ready for execution, loaded in memory)]
Advantages:
In this scheme there is no need to retranslate each program every time it is run, because
the loader, not the assembler, is kept in memory.
3 Absolute loader: In this scheme the assembler outputs the machine language translation of the
source program.
The data is punched on cards instead of being placed directly in memory.
The loader in turn simply accepts the machine language text and places it into core at the location
prescribed by the assembler.
Ex:
The main program is assigned to locations 100 to 247 and the subroutine is assigned to locations
400 to 477. If a change to the main program increased its length to more than 300
bytes, it would overlap the subroutine, and relocation would then be necessary.
Main program                    SQRT subroutine
MAIN  START  100                SQRT  START  400
      ...                             ...
SQRT  DC     F'400'                   ...
      END                             END
[Figure: memory map produced by the absolute loader, with MAIN at locations 100 to 247 and SQRT at locations 400 to 477]
In an absolute loader scheme the loader functions are accomplished as follows:
1. Allocation by the programmer
2. Linking is also by the programmer
3. Relocation is by assembler
4. Loading by loader
Design of an absolute loader:
To design an absolute loader, two types of cards are required:
1. Text cards
2. Transfer cards
1 Text cards: This type of card is used to store instructions and data.
The capacity of this card is 80 bytes.
It must convey the machine instructions that the assembler has created, along with their assigned core
locations.
Card column   Content
1             Card type = 0 (indicates a text card)
2             Count of the number of bytes of information on the card
3-5           Address at which the information is to be loaded
6-7           Empty
8-72          Instructions and data to be loaded
73-80         Card sequence number
2 Transfer cards: These cards must convey the entry point of the program, which is where the
loader is to transfer control when all the instructions are loaded.
Card column   Content
1             Card type = 1 (indicates a transfer card)
2             Count = 0
3-5           Address of the entry point
6-72          Empty
73-80         Card sequence number
Algorithm for an absolute loader:
Statement 1: Start.
Statement 2: Read the header record (the first record or first line).
Statement 3: Obtain the program length.
Statement 4: Check whether the card is a text card or a transfer card.
If it is a text card, store the instructions and data at the address given on the card.
Else (it is a transfer card) transfer control to the entry point address.
Statement 5: If the code is in character form, convert it to its internal representation.
Statement 6: Read the next card of the object program.
Statement 7: End.
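The core of the algorithm can be sketched in Python. To keep the sketch short, each card is modelled as a tuple (card type, count, address, data) rather than as an 80-column card image, and the header record and character conversion steps are omitted; the names absolute_load and CARDS and the sample byte values are hypothetical.

```python
def absolute_load(cards, memory_size=1024):
    """Place text cards in memory; return (memory, entry point) at the transfer card."""
    memory = bytearray(memory_size)
    for card_type, count, address, data in cards:
        if card_type == 0:
            # Text card: store 'count' bytes at the address the assembler assigned.
            memory[address:address + count] = data[:count]
        elif card_type == 1:
            # Transfer card: loading is finished; control goes to this address.
            return memory, address
    raise ValueError("no transfer card found")

CARDS = [
    (0, 4, 100, bytes([0x58, 0x10, 0xF0, 0x10])),   # sample text card
    (0, 4, 104, bytes([0x5A, 0x10, 0xF0, 0x0C])),   # sample text card
    (1, 0, 100, b""),                               # transfer card, entry point 100
]

memory, entry_point = absolute_load(CARDS)
print(entry_point, memory[100:108].hex())
```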
Advantages:
1. It is very simple and easy to implement.
2. More memory available to the user.
3. Multiple segments can be allowed at a time
Disadvantages:
1. With this loader the programmer must adjust all intersegment addresses, so programmers
must know about memory management and the addresses of the programs.
2. If any modification is made to one segment, the starting addresses of other segments may also have to change.
3. If there are multiple segments, the programmer must remember the addresses
of all the subroutines.
Sub-routine linkages:
Sometimes a main program transfers control to a subprogram, and that subprogram in turn transfers control to another
program.
The assembler cannot resolve such symbolic references to other programs by itself and would report an error.
For this situation the assembler provides two pseudo-ops:
1 EXTRN
2 ENTRY
Through these the assembler informs the loader that the listed symbols may be referenced by other programs.
1 EXTRN:
The EXTRN pseudo-op is used to maintain references between two or more subroutines.
EXTRN followed by a list of symbols indicates that these
symbols are defined in other programs but referenced in the present program.
2 ENTRY: The ENTRY pseudo-op followed by a list of symbols indicates that
these symbols are defined in the present program and referenced in other programs.
The ENTRY pseudo-op is optional; it is used to define the entry locations of subroutines.
Ex:
A     START               B     START
      EXTRN  B                  USING  *,15
      ...                       ...
      L      15,=A(B)           ...
      BALR   14,15              BR     14
      ...
      END                       END
Relocating loaders
To avoid the reassembly required by an absolute loader, another type of loader, called a
relocating loader, was introduced.
The BSS [Binary Symbolic Subroutine] loader is one example of a relocating loader.
The BSS loader allows many procedure segments but only one data segment.
The assembler assembles each procedure segment independently and then passes to the loader the
text together with information about relocation and intersegment references.
The output of a relocating assembler using the BSS scheme is the object program plus information
about all the other programs it references.
For each source program the assembler outputs a text prefixed by a transfer vector, which consists
of addresses containing the names of the subroutines referenced by the source program.
The assembler also provides the loader with additional information: the length of the entire
program and the length of the transfer vector.
This BSS scheme uses the RX type instruction format:
OP R1 X D
Because the address portion of an instruction may have to be relocated, the assembler associates
a bit, called a relocation bit, with each instruction or address field. If the relocation bit is 1,
the corresponding address field must be relocated. If it is 0, the field is not relocated.
The relocation bits solve the problem of relocation, the transfer vector solves the problem of
linking, and the program length information solves the problem of allocation.
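The effect of the relocation bits can be sketched in a few lines of Python. This is only an illustration: the function name `relocate`, the sample address fields and the load address 1000 are assumptions made for the sketch and do not come from the BSS loader itself.

```python
def relocate(address_fields, relocation_bits, load_address):
    """Add load_address to every field whose relocation bit is 1."""
    if len(address_fields) != len(relocation_bits):
        raise ValueError("one relocation bit is needed per address field")
    return [field + load_address if bit == 1 else field
            for field, bit in zip(address_fields, relocation_bits)]

# Hypothetical address fields of a small program assembled relative to 0.
fields = [12, 5, 40]
bits = [1, 0, 1]          # 1 = relocate, 0 = leave unchanged
loaded = relocate(fields, bits, load_address=1000)
print(loaded)
```

Fields 12 and 40 are moved by the load address, while the field with relocation bit 0 (for example a constant or a register number) is left untouched.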
Advantages of relocating loader:
1. Reassembling is not necessary
2. All the functions of a loader (allocation, linking, relocation and loading) are performed by
the BSS loader itself
Disadvantages of relocating loader:
1. The transfer vector increases the size of the object program in memory.
2. There is no facility for accessing shared (common) data segments.
Direct Linking Loader
It is a general relocating loader and is the most popular loading scheme presently in use.
The main difference between a relocating loader and a direct linking loader is that a relocating
loader supports multiple procedure segments but only one data segment, whereas a direct linking
loader supports multiple procedure segments and also multiple data segments.
Linker: It is system software that links the object programs; its output is sent to the loader.
The object deck processed by a direct linking loader contains 4 types of cards. They are
1. ESD-External symbol dictionary
2. TXT-Text card
3. RLD-Relocation and linkage directory
4. END-End card
1 ESD card:
It contains information about all symbols that are defined in this program but may be referenced
elsewhere, and all symbols that are referenced in this program but defined elsewhere.
It contains:
Reference number
Symbol name
Type
ID
Relative location
Length
The entries on ESD cards are classified into 3 types, each with its own mnemonic. They are:
1. SD [Segment Definition]: It refers to the segment definition [01]
2. LD [Local Definition]: It refers to a symbol defined in this segment and named in the ENTRY
pseudo-op code [02]
3. ER [External Reference]: It refers to a symbol named in the EXTRN pseudo-op code [03]
2 TXT Card: It contains the actual text, i.e. the instructions and data that have already been
translated.
3 RLD Card: This card contains information about those locations in the program whose contents
depend on the address at which the program is placed.
Each RLD entry carries a flag, '+' or '-'. The flag indicates whether the value of the symbol is
to be added to ('+') or subtracted from ('-') the address constant at that location.
The format of RLD contains:
1. Reference number
2. Symbol
3. Flag
4. Length
5. Relative location
4 END Card: It indicates end of the object program.
Note: Each of the above 4 cards is 80 bytes long
Design of direct linking loader:
Here we take two programs, PG1 and PG2.
The cards produced from the relative addresses and source code of these two programs are shown
below.
ESD Cards:
The ESD cards contain the information necessary to build the external symbol dictionary or
symbol table.
In the source code the symbols are PG1, PG1ENT1, PG1ENT2, PG2 and PG2ENT1.
Format of ESD Card for PG1:
Source card reference   Name      Type   ID   Relative address   Length
1                       PG1       SD     01   0                  60
2                       PG1ENT1   LD     -    20                 -
2                       PG1ENT2   LD     -    30                 -
3                       PG2       ER     -    -                  -
3                       PG2ENT1   ER     -    -                  -
Here PG1 is the segment definition, i.e. the header of program 1, so its type is SD.
PG1ENT1 and PG1ENT2 are local definitions of program 1, so their type is LD.
PG2 and PG2ENT1 are named in the EXTRN pseudo-op code, so their type is ER.
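Text card for PG1:
The format of the card is:
Source card reference   Relative address   Content   Comments
6                       40-43              20
7                       44-47              45        =30+15
8                       48-51              7         =30-20-3
9                       52-55              0         Unknown to PG1
10                      56-59              -16       =-20+4
The contents are the address constants on source cards 6 to 10, evaluated with the relative
addresses known inside PG1 (symbols external to PG1 are taken as 0):
6 = A(PG1ENT1) = 20
7 = A(PG1ENT2+15) = 30+15 = 45
8 = A(PG1ENT2-PG1ENT1-3) = 30-20-3 = 7
9 = A(PG2) = 0
10 = A(PG2ENT1+PG2-PG1ENT1+4) = 0+0-20+4 = -16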
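The arithmetic behind the text card contents can be checked with a short Python sketch. The helper `address_constant` and the way each constant is written as a list of signed symbols are assumptions made for the illustration; the relative addresses are those of PG1 given above, with external symbols taken as 0.

```python
# Relative addresses known inside PG1; PG2 and PG2ENT1 are external, so 0 is used.
symbols = {"PG1ENT1": 20, "PG1ENT2": 30, "PG2": 0, "PG2ENT1": 0}

def address_constant(terms, offset=0):
    """terms: list of (sign, symbol) pairs; offset: the plain numeric part."""
    return sum(sign * symbols[name] for sign, name in terms) + offset

# One entry per source card reference (6 to 10).
cards = {
    6: address_constant([(+1, "PG1ENT1")]),
    7: address_constant([(+1, "PG1ENT2")], offset=15),
    8: address_constant([(+1, "PG1ENT2"), (-1, "PG1ENT1")], offset=-3),
    9: address_constant([(+1, "PG2")]),
    10: address_constant([(+1, "PG2ENT1"), (+1, "PG2"), (-1, "PG1ENT1")], offset=4),
}
for ref, value in cards.items():
    print(ref, value)
```

The printed values match the Content column of the text card; the loader corrects them later using the RLD entries.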
RLD Card Format for PG1:
Source card reference   ESD ID   Length [bytes]   Flag (+ or -)   Relative address
6                       02       4                +               40
7                       02       4                +               44
9                       03       4                +               52
10                      02       4                +               56
10                      03       4                +               56
10                      02       4                -               56
ESD Card for program2 [PG2]:
Source card reference   Name      Type   ID   Relative address   Length
12                      PG2       SD     01   0                  36
13                      PG2ENT1   LD     -    16                 -
14                      PG1ENT1   ER     03   -                  -
14                      PG1ENT2   ER     03   -                  -
Text Card for PG2:
Source card reference   Relative address   Content
16                      24-27              0
17                      28-31              15
18                      32-35              -3
16 = A(PG1ENT1) = 0
17 = A(PG1ENT2+15) = 0+15 = 15
18 = A(PG1ENT2-PG1ENT1-3) = 0-0-3 = -3
RLD Card for PG2:
Source card reference   ESD ID   Length [bytes]   Flag (+ or -)   Relative address
16                      01       4                +               24
17                      -        4                +               28
18                      03       4                +               32
18                      03       4                -               32
Specification of data structures:
1 Pass1 database:
1. Input object decks
2. The initial program load address [IPLA]: The IPLA is supplied by the programmer or the
operating system and specifies the address at which to load the first segment.
3. Program load address counter [PLA]: It is used to keep track of each segment's
assigned location.
4. Global external symbol table [GEST]: It is used to store each external symbol and its
corresponding assigned core address.
5. A copy of the input, to be used later by pass2
6. A printed listing (the load map) that specifies each external symbol and its assigned value
2 Pass2 database:
1. A copy of the object program, which is the input to pass2
2. The initial program load address [IPLA]
3. The program load address counter [PLA]
4. The global external symbol table [GEST], prepared by pass1
5. The local external symbol array [LESA]: It is used to establish a correspondence
between the ESD ID numbers used on ESD and RLD cards and the absolute address values of the
corresponding external symbols.
Format of data bases:
Object deck:
The object deck contains 4 types of cards
1 ESD Card format:
Source card reference   Name   Type   ID   Relative address   Length
The type is coded in hexadecimal:
Type   Hexadecimal
SD     01
LD     02
ER     03
2 TEXT Card:
Source card reference   Relative address   Content
3 RLD Card:
Source card reference   ESD ID   Length   Flag (+ or -)   Relative address
Note: The length of each card is 80 bytes
4 Global External Symbol Table [GEST]:
It is used to store each external symbol and its corresponding core address.
External symbol [8 bytes, character]   Assigned core address [4 bytes, decimal]
"PG1bbbbb"                             104
"PG1ENT1b"                             124
5 Local external symbol array [LESA]:
The external symbol to be used for relocation or linking is identified on the RLD cards by means
of an ID number rather than the symbol's name. The ID number must match an SD or ER entry on the
ESD cards.
The LESA is indexed by ID number, and each entry holds:
Assigned core address of corresponding symbol [4 bytes]
104
124
134
....
....
This technique saves space and also increases the processing speed.
Pass1 Algorithm and Flowchart:
The purpose of pass1 is to assign a location to each segment and so to find the values of all
external symbols.
1. The program load address [PLA] is set to the initial program load address [IPLA].
2. Read an object card.
3. Write a copy of the card for pass2.
4. The card can be any one of the following types:
i) Text card or RLD card: no processing is required during pass1, so the next card is read.
ii) ESD card: it is processed according to the type of the external symbol.
a) SD: the length field LENGTH from the card is temporarily saved in the variable SLENGTH.
The value VALUE to be assigned to this symbol is set to the current value of the PLA.
The symbol and its assigned value are then stored in the GEST. If the symbol already exists
in the GEST, this is an error. The symbol and its value are printed as part of the load map.
b) LD: the value to be assigned is set to the current PLA plus the relative address [ADDR]
indicated on the ESD card. The symbol is then stored in the GEST and printed in the same
way as an SD symbol.
c) ER: these symbols do not require any processing during pass1.
iii) END card: the program load address is incremented by the length of the segment saved in
SLENGTH, giving the PLA for the next segment.
iv) EOF card: pass1 is completed and control transfers to pass2.
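The pass1 steps above can be sketched in Python as follows. This is a minimal illustration, not the loader's actual implementation: cards are modelled as tuples, the IPLA of 104 is taken from the GEST example, segments are assumed to be placed back to back without alignment, and the load map listing is omitted.

```python
def pass1(cards, ipla):
    """Build the GEST (symbol -> assigned address) from a stream of cards."""
    pla = ipla            # program load address counter
    slength = 0           # length of the segment currently being read
    gest = {}
    for card in cards:
        kind = card[0]
        if kind == "ESD":
            _, name, sym_type, addr, length = card
            if sym_type == "SD":
                slength = length
                value = pla
            elif sym_type == "LD":
                value = pla + addr
            else:                      # ER: no processing in pass 1
                continue
            if name in gest:
                raise ValueError(f"duplicate external symbol {name}")
            gest[name] = value
        elif kind == "END":
            pla += slength             # PLA for the next segment
        # TXT and RLD cards need no processing in pass 1
    return gest

deck = [
    ("ESD", "PG1", "SD", 0, 60), ("ESD", "PG1ENT1", "LD", 20, None),
    ("ESD", "PG1ENT2", "LD", 30, None), ("ESD", "PG2", "ER", None, None),
    ("ESD", "PG2ENT1", "ER", None, None), ("TXT",), ("RLD",), ("END",),
    ("ESD", "PG2", "SD", 0, 36), ("ESD", "PG2ENT1", "LD", 16, None),
    ("ESD", "PG1ENT1", "ER", None, None), ("ESD", "PG1ENT2", "ER", None, None),
    ("TXT",), ("RLD",), ("END",),
]
gest = pass1(deck, ipla=104)
print(gest)
```

With these assumptions PG1 is placed at 104 and PG2 directly after it at 104+60=164.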
Pass2 Algorithm and Flowchart:
1. At the beginning of pass2 the program load address is initialized as in pass1 and the
execution start address [EXADDR] is set to IPLA.
2. If an address is specified on the END card, that address is used as the execution start
address; otherwise, execution begins at the first segment.
3. In pass2 the five types of cards are read one by one and processed as follows:
i. ESD card
a) SD type => the length of the segment is temporarily saved in the variable
SLENGTH. The LESA [ID] is set to the current value of the PLA.
b) LD type => does not require any processing during pass2.
c) ER type => the GEST is searched for a match with the ER symbol. If it is not
found, there is an error. If it is found in the GEST, its value is extracted and the
corresponding LESA entry is set to that value.
ii. Text card: the text is copied from the card to the appropriate relocated
memory location [PLA+ADDR].
iii. RLD card: the value to be used for relocation and linking is extracted from the
LESA [ID], as specified by the ID field. Depending on the flag, this value is either added to
or subtracted from the address constant.
iv. END card: if an execution start address is specified on the END card, it is saved
in the variable EXADDR. The PLA is incremented by the length of the segment saved in
SLENGTH, becoming the PLA for the next segment
[PLA=PLA+SLENGTH].
v. EOF card: the loader transfers control to the loaded program at the address
specified by the current contents of the execution address variable [EXADDR].
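A matching sketch of pass2 is given below, again in Python and only as an illustration. Memory is modelled as a dictionary, handling of EXADDR is left out, and the ESD ID numbers in the sample deck (1 for PG1, 2 for PG2, 3 for PG2ENT1) are assumptions chosen for the sketch; they may differ from the IDs printed in the tables above. The GEST values are those produced by a pass1 that starts at IPLA 104 and places PG2 directly after PG1.

```python
def pass2(cards, gest, ipla):
    """Load text into memory and apply the RLD entries using the LESA."""
    pla, slength = ipla, 0
    lesa = {}        # ESD ID -> absolute address
    memory = {}      # absolute address -> contents
    for card in cards:
        kind = card[0]
        if kind == "ESD":
            _, name, sym_type, esd_id, length = card
            if sym_type == "SD":
                slength = length
                lesa[esd_id] = pla
            elif sym_type == "ER":
                if name not in gest:
                    raise KeyError(f"undefined external symbol {name}")
                lesa[esd_id] = gest[name]
            # LD entries need no processing in pass 2
        elif kind == "TXT":
            _, addr, content = card
            memory[pla + addr] = content
        elif kind == "RLD":
            _, esd_id, flag, addr = card
            value = lesa[esd_id]
            memory[pla + addr] += value if flag == "+" else -value
        elif kind == "END":
            pla += slength
    return memory

gest = {"PG1": 104, "PG1ENT1": 124, "PG1ENT2": 134, "PG2": 164, "PG2ENT1": 180}
pg1_deck = [
    ("ESD", "PG1", "SD", 1, 60), ("ESD", "PG1ENT1", "LD", None, None),
    ("ESD", "PG1ENT2", "LD", None, None), ("ESD", "PG2", "ER", 2, None),
    ("ESD", "PG2ENT1", "ER", 3, None),
    ("TXT", 40, 20), ("TXT", 44, 45), ("TXT", 48, 7), ("TXT", 52, 0), ("TXT", 56, -16),
    ("RLD", 1, "+", 40), ("RLD", 1, "+", 44), ("RLD", 2, "+", 52),
    ("RLD", 3, "+", 56), ("RLD", 2, "+", 56), ("RLD", 1, "-", 56),
    ("END",),
]
memory = pass2(pg1_deck, gest, ipla=104)
print(memory)
```

After loading, location 144 holds 124, the absolute address of PG1ENT1, and location 160 holds 224, which equals PG2ENT1+PG2-PG1ENT1+4 computed with absolute addresses (180+164-124+4).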
Other loading schemes:
Binders:
To avoid the disadvantages of direct linking, the loading process is divided into 2 separate
programs:
1. A binder
2. A module loader
A binder is a program that performs the same functions as the direct linking loader in binding
subroutines together, but it outputs the text as a file or card deck rather than placing the
relocated and linked text directly into memory.
The output file is in a format ready to be loaded and is called a load module.
The module loader loads the module into memory.
The binder performs the functions of allocation, relocation and linking.
The module loader performs the function of loading.
There are 2 major classes of binders:
1. Core image builder
2. Linkage editor
1 Core image builder: A specific memory allocation of the program is performed at the time that
the subroutines are bound together. The resulting load module is called a core image module and
the corresponding binder is called a core image builder.
Advantages:
Simple to implement
Fast to execute
Disadvantages:
Memory allocation is fixed at binding time, so it is difficult to allocate and load the program
flexibly
2 Linkage editor: The linkage editor keeps track of the relocation information so that the
resulting load module can be further relocated. In this case the module loader must perform
additional allocation and relocation as well as loading, but it does not have to worry about the
problem of linking.
Advantages:
More flexible allocation and loading scheme
Disadvantages:
Implementation is more complex
Dynamic loading:
In all the loader schemes so far we have assumed that all of the subroutines needed are loaded
into memory at the same time.
If the total amount of memory required by all these subroutines exceeds the amount available,
as is common with large programs, it is not possible to load all the subroutines into memory at
the same time. There are several hardware techniques to solve this problem, such as paging and
segmentation.
Usually the subroutines of a program are needed at different times.
Ex:
Pass1 and Pass2 of an assembler are mutually exclusive. By explicitly recognizing which
subroutines call which other subroutines, it is possible to produce an overlay structure that
identifies the mutually exclusive subroutines.
Figure: Subroutine calls between procedures. Sizes of the five subprograms:
A 25k
B 15k
C 20k
D 30k
E 20k
The above figure illustrates a program consisting of five subprograms [A to E] that together
require 110k bytes of memory.
The arrows indicate that subprogram 'A' calls only 'B', 'D' and 'E', subprogram 'B' calls only
'C', and subprograms 'C' and 'E' do not call any other subprograms or routines.
Note that procedures 'B' and 'D' are never in use at the same time.
Overlay loading scheme:
With an overlay structure, 65k bytes of memory are enough at any one time. The portion of the
loader that loads the necessary procedure when it is called is known as the overlay supervisor or
simply the flipper. This scheme is called dynamic loading or load on call.
Dynamic linking:
The main disadvantage of all of the previous loading schemes is that if a subroutine is
referenced but never executed, the loader still incurs the overhead of linking that
subroutine.
In this mechanism the loading and linking of external references are postponed until execution
time. The loader loads only the main program. If the main program executes a transfer instruction
to an external address, or references an external variable, the loader is called, and only then is
the segment containing the external reference loaded.
Advantages:
No overhead is incurred unless the procedure to be called or referenced is actually
used
The system can be dynamically reconfigured
Saves memory
Disadvantages:
More complex, because most of the binding process is postponed until program execution
time.
Chapter-6
Compiler
A compiler is a system software component that accepts a program written in a high level language
and produces an object program.
The compiler must perform the following 4 tasks [functions]:
1. Recognize certain strings as basic elements or tokens, i.e., variables, operators,
keywords etc.
2. Recognize combinations of elements as syntactic units and interpret their meaning.
3. Allocate storage and assign locations for all variables in the program.
4. Generate the appropriate object code.
General model of a compiler:
Ex:
WCM: procedure (Rate, Start, finish);
Declare (Cost, Rate, Start, Finish) fixed binary (31) static;
Cost=Rate *(Start- Finish) +2*Rate*(Start-Finish-100);
Return (Cost);
End;
1 Recognizing basic elements or tokens:
Step1:
The source program is broken into pieces called tokens.
In the diagrammatic representation each token is shown as a rectangle.
Tokens are recognized as identifiers, literals (constants) or terminal symbols (operators or
keywords).
In the above example WCM, Rate, Start and Finish are identifiers.
Procedure is a keyword.
: ( , ) ; are terminal symbols.
Step 2:
The basic elements or tokens are entered into tables.
Each token is then represented by a uniform symbol, which consists of 2 fields:
1. Table class
2. Pointer
The uniform symbols are of fixed size. The pointer points to the table entry of the associated
basic element.
The table classes are IDN for identifiers, TRM for terminals and LIT for literals.
2 Recognizing syntactic units and interpreting their meaning:
Here, we have to perform 2 separate tasks:
1. Recognizing the phrases
2. Interpreting their meaning
Step1:
Recognizing the phrases (statements or syntactic constructions):
The compiler checks the validity of each phrase or statement.
If the statement is free of errors, it is accepted as a valid statement.
Otherwise, the compiler attempts some sort of recovery and continues the compilation with the
next statement.
Step 2:
Interpreting the meaning of the construction:
After the above step, the construction is available in its syntactic form, from which its
meaning is interpreted.
Step 3:
Intermediate form:
Once a syntactic construction has been determined, the compiler generates an intermediate form
of the source program for it. Object code is generated from this intermediate form later.
The intermediate form depends on the type of syntactic construction. The types are:
1. Arithmetic statement
2. Non-arithmetic statement
3. Non-executable statement
1 Arithmetic statements: One intermediate form of an arithmetic statement is a parse tree.
The rules for converting an arithmetic statement into a parse tree are:
a) Any variable is a terminal node of the tree.
b) Every operator forms a binary tree whose left branch is the tree for operand 1 and whose
right branch is the tree for operand 2.
Priority rules for constructing the tree:
1. Highest priority is given to expressions written in brackets
2. The * and / operators have the second priority
3. The + and - operators have the third priority
4. Operators of the same priority are evaluated from left to right
Another intermediate form is a linear representation of the parse tree called a matrix.
Ex: The matrix for the statement Cost=Rate*(Start-Finish)+2*Rate*(Start-Finish-100):
Matrix number   Operator   Operand1   Operand2
1               -          Start      Finish
2               *          Rate       M1
3               *          2          Rate
4               -          Start      Finish
5               -          M4         100
6               *          M3         M5
7               +          M2         M6
8               =          Cost       M7
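The way a matrix follows from the parse tree can be sketched in Python. As an assumption for the illustration, Python's own `ast` parser stands in for the compiler's syntax phase (its priority rules for brackets, * and /, + and - agree with the rules listed above); the function name `build_matrix` is hypothetical. Walking the tree bottom-up, left operand first, emits one matrix entry per operator.

```python
import ast

OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def build_matrix(statement):
    """Return matrix entries (operator, operand1, operand2) for one assignment."""
    assign = ast.parse(statement).body[0]      # the parsed assignment statement
    matrix = []

    def emit(op, left, right):
        matrix.append((op, left, right))
        return f"M{len(matrix)}"               # name of the entry just created

    def walk(node):
        if isinstance(node, ast.Name):         # variable: terminal node
            return node.id
        if isinstance(node, ast.Constant):     # literal: terminal node
            return str(node.value)
        if isinstance(node, ast.BinOp):        # operator: two branches
            left = walk(node.left)
            right = walk(node.right)
            return emit(OPERATORS[type(node.op)], left, right)
        raise ValueError("unsupported construct")

    result = walk(assign.value)
    emit("=", assign.targets[0].id, result)
    return matrix

entries = build_matrix("Cost = Rate*(Start-Finish) + 2*Rate*(Start-Finish-100)")
for number, entry in enumerate(entries, start=1):
    print(number, *entry)
```

For the Cost statement this produces the same eight entries as the matrix shown above.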
2 Non-arithmetic statements: DO, IF and GOTO are examples of non-arithmetic statements.
These statements can all be replaced by a sequential ordering of individual matrix entries.
Ex: Return (cost)
End
Matrix:
Operator   Operand1   Operand2
Return     Cost
End
3 Non-executable statements: Non-executable statements such as DECLARE give the compiler
information that clarifies the referencing or allocation of variables and associated storage.
The information contained in a non-executable statement is entered into tables.
Ex:
Declare (Cost, Rate, Start, Finish) fixed binary (31) static;
The table consists of four fields:
1. Variables -> Cost, Rate, Start, Finish
2. Data type -> fixed binary
3. Precision -> 31 bits
4. Storage class -> static
3) Storage allocation:
The proper amount of memory required by the program is reserved at some point in time.
Ex:
Declare (Cost, Rate, Start, Finish) fixed binary (31) static;
Identifier table for the above example:
Name     Base     Scale   Precision   Storage class   Relative address
Cost     Binary   Fixed   31          Static          0
Rate     Binary   Fixed   31          Static          4
Start    Binary   Fixed   31          Static          8
Finish   Binary   Fixed   31          Static          12
The identifier table consists of six fields:
1. Name: it specifies the name of the variable
2. Base: binary or decimal
3. Scale: fixed or float
4. Precision: the number of digits and, where applicable, a scale factor
5. Storage class
6. Relative address
Storage classes are:
1. Static
2. Automatic
3. Controlled
4. Based
The storage allocation routine scans the identifier table and assigns a location to each scalar.
Since each variable is fixed binary (31) and occupies 32 bits (4 bytes), relative location 0 is
assigned to the first variable, 4 to the second variable, 8 to the third variable and 12 to the
fourth variable.
In each 32-bit word the first bit is reserved for the sign and the remaining bits hold the data:
Sign | Data [binary or decimal]
The sign bit is allocated during load time.
These relative addresses are used by the later phases of the compiler for proper accessing.
Similarly, storage is also assigned for the temporary locations that will contain the
intermediate results of the matrix.
Ex: [M1, M2, M3, ... M7]
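The assignment of relative addresses can be illustrated with a very small Python sketch. The function name `assign_storage` and the fixed word size of 4 bytes are assumptions for the illustration; a real storage allocation routine would take the size of each identifier from its attributes.

```python
def assign_storage(names, word_size=4):
    """Give each scalar the next free relative address, one word per scalar."""
    table = {}
    location = 0
    for name in names:
        table[name] = location
        location += word_size
    return table

table = assign_storage(["Cost", "Rate", "Start", "Finish"])
print(table)
```

The same routine could be applied to the temporaries M1 to M7 to reserve space for the intermediate results.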
4 Code generation: The code generation phase takes the matrix as input and generates the
object code for each entry in it.
The object code associated with each matrix operator is defined in a table called the code
production table.
Ex:
Start - Finish
The operator in the matrix entry is treated as a macro call.
The operands Start and Finish are treated as macro arguments.
Code definition for the operator minus (&N is the number of the matrix entry):
Operator   operand1   operand2
L 1,&operand1
S 1,&operand2
ST 1,M&N
Using this code definition, the following code is generated for the above matrix entry
(entry 1):
L 1,start
S 1,finish
ST 1,M1
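The macro-style expansion can be sketched in Python as a table lookup followed by text substitution. The table `CODE_PRODUCTIONS` and the function `generate` are hypothetical names; the definition for minus follows the one above, while the definition for plus (using the add instruction A) is an assumption added only to show a second entry.

```python
# One code definition per matrix operator; {op1}, {op2} and {n} play the role
# of the macro arguments (&operand1, &operand2) and the matrix entry number (&N).
CODE_PRODUCTIONS = {
    "-": ["L 1,{op1}", "S 1,{op2}", "ST 1,M{n}"],
    "+": ["L 1,{op1}", "A 1,{op2}", "ST 1,M{n}"],   # assumed definition
}

def generate(matrix):
    """Expand each matrix entry (operator, operand1, operand2) into code lines."""
    code = []
    for n, (op, op1, op2) in enumerate(matrix, start=1):
        if op not in CODE_PRODUCTIONS:
            raise KeyError(f"no code production for operator {op!r}")
        for line in CODE_PRODUCTIONS[op]:
            code.append(line.format(op1=op1, op2=op2, n=n))
    return code

for line in generate([("-", "Start", "Finish")]):
    print(line)
```

Each matrix entry is handled independently, which is why the generated code is simple but not optimal; improving it is left to the optimization steps.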
Optimization [machine independent]:
Duplicate entries in the matrix are deleted and all references to the deleted entries are
modified.
Matrix with common subexpressions     Matrix after elimination of common subexpressions
M1  -  Start  Finish                  M1  -  Start  Finish
M2  *  Rate   M1                      M2  *  Rate   M1
M3  *  2      Rate                    M3  *  2      Rate
M4  -  Start  Finish                  M4  (deleted)
M5  -  M4     100                     M5  -  M1     100
M6  *  M3     M5                      M6  *  M3     M5
M7  +  M2     M6                      M7  +  M2     M6
M8  =  Cost   M7                      M8  =  Cost   M7
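The elimination of common subexpressions can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that no variable is assigned a new value between two identical entries, which holds for the Cost example. Deleted entries are kept as `None` so that the remaining entries keep their matrix numbers.

```python
def eliminate_common_subexpressions(matrix):
    """Delete duplicate entries and redirect references to the first copy."""
    seen = {}      # (operator, operand1, operand2) -> name of first entry
    alias = {}     # name of deleted entry -> name of the entry that replaces it
    result = []
    for n, (op, a, b) in enumerate(matrix, start=1):
        a, b = alias.get(a, a), alias.get(b, b)     # follow earlier deletions
        name = f"M{n}"
        if op != "=" and (op, a, b) in seen:
            alias[name] = seen[(op, a, b)]
            result.append(None)                      # entry deleted
        else:
            if op != "=":
                seen[(op, a, b)] = name
            result.append((op, a, b))
    return result

matrix = [
    ("-", "Start", "Finish"), ("*", "Rate", "M1"), ("*", "2", "Rate"),
    ("-", "Start", "Finish"), ("-", "M4", "100"), ("*", "M3", "M5"),
    ("+", "M2", "M6"), ("=", "Cost", "M7"),
]
optimized = eliminate_common_subexpressions(matrix)
for number, entry in enumerate(optimized, start=1):
    print(f"M{number}", entry)
```

Entry M4 is deleted and the reference to it in M5 is changed to M1, as in the right-hand matrix above.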
Optimization [machine dependent]:
This phase reduces both the memory space and the execution time of the object program.
Since both of these factors depend on the machine, this type of optimization is known as
machine dependent optimization.
Assembly phase:
The code generation phase produces assembly language; the process of generating the
actual machine code from it is known as the assembly phase.
The assembly phase must perform these operations:
1. Resolve label references
2. Calculate addresses
3. Generate binary machine instructions
4. Generate storage
5. Convert literals
General model of compiler:
A compiler has 7 distinct logical phases:
1. Lexical analysis
2. Syntax analysis
3. Interpretation
4. Machine independent optimization
5. Storage assignment
6. Code generation
7. Assembly and output
1 Lexical analysis: Recognition of basic elements or tokens and creation of the uniform symbol
table
2 Syntax analysis: Recognition of basic syntactic constructs through reductions
3 Interpretation: Definition of exact meaning, and creation of the matrix and tables by the
respective routines [action routines]
4 Machine independent optimization: Creation of a more optimal matrix [removes the duplicate
entries in the matrix]
5 Storage assignment: It makes entries in the matrix that allow code generation to create code
that allocates dynamic storage, and that also allow the assembly phase to reserve the proper
amount of STATIC storage
6 Code generation: A macro processor is used to produce more optimal assembly code
7 Assembly and output: It resolves symbolic addresses and generates the machine language
Phases 1 to 4 are machine independent and language dependent, because these phases determine
the syntax and meaning of each statement in the source program. Hence they depend on the
language and are independent of the machine.
Phases 5 to 7 are machine dependent and language independent, because these phases allocate
memory for literals and also generate the assembly code, which depends on the machine and is
independent of the language.
The databases used by the compiler are:
1 Source code: The program written by the user.
2 Uniform symbol table: It consists of the tokens or basic elements as they appear in the
program. It is created by the lexical analysis phase and given as input to the syntax analysis
and interpretation phases.
3 Terminal table: A permanent table that lists all the keywords and special symbols of the
language.
4 Identifier table: It contains all the variables in the program and temporary storage [Ex: M1,
M2, M3 ... M7], together with the information needed to reference or allocate storage for them.
This table is created by lexical analysis.
5 Literal table: It contains all the constants in the program.
6 Reductions: A permanent table of decision rules in the form of patterns for matching with
the uniform symbol table to discover syntactic structure.
7 Matrix: The intermediate form of the program, created by the action routines. It is optimized
and then used for code generation.
8 Code productions: A permanent table of definitions. There is one entry defining the code for
each matrix operator.
9 Assembly code: The assembly language version of the program, which is created by the code
generation phase and is input to the assembly phase.
10 Relocatable object code: The final output of the assembly phase, ready to be used as input to
the loader.
Phases of compiler
1 Lexical phase:
The lexical phase performs the following three tasks:
1. Recognize the basic elements or tokens present in the source code
2. Build a literal table and an identifier table
3. Build a uniform symbol table
Databases:
The lexical phase involves the manipulation of 5 databases:
1. Source program
2. Terminal table
3. Literal table
4. Identifier table
5. Uniform symbol table
1 Source program: The original form of the program created by the user
2 Terminal table: It is a permanent database. It consists of 3 fields:
Symbol: operators, keywords and separators [(, ;, :]
Indicator: the value is YES or NO
Yes => operators, separators
No => keywords
Precedence: used in later phases
Step   Symbol      Indicator   Precedence
1      :           Yes
2      ;           Yes
3      (           Yes
4      )           Yes
5      ,           Yes
6      *           Yes
7      Declare     No
8      Procedure   No
9      +           Yes
10     -           Yes
11     *           Yes
12     Rate        No
13     Start       No
14     finish      No
3 Literal table:
It describes all the literal constants used in the source program.
It consists of 6 fields:
Literal   Base   Scale   Precision   Other information   Address
Other information and address are filled in by later phases.
Ex:
Literal   Base      Scale   Precision   Other information   Address
31        Decimal   Fixed   2
2         Decimal   Fixed   1
100       Decimal   Fixed   3
4 Identifier table:
It describes all identifiers used in the source program. It consists of three fields:
1. Name
2. Data attribute
3. Address
Data attribute and address are filled in by later phases.
Ex:
Name     Data attribute   Address
WCM
RATE
START
FINISH
COST
5 Uniform symbol table:
The uniform symbol table represents the program as a string of tokens rather than of individual
characters. There is one uniform symbol for every token in the program.
Each uniform symbol consists of 2 fields: the table class and the index into that table.
Ex: (the token itself is shown only for reference)
Table class   Index   Token
IDN           1       WCM
TRM           1       :
TRM           8       Procedure
TRM           3       (
IDN           2       Rate
TRM           5       ,
IDN           3       Start
TRM           5       ,
IDN           4       Finish
TRM           4       )
TRM           2       ;
Algorithm:
Step1: The first task of the lexical analysis algorithm is to parse the input character string
into tokens
Step2: The second step is to make the appropriate entries in the tables
Implementation:
1 The input string is separated into tokens by break characters. Break characters are denoted by
the contents of a special field in the terminal table
2 Lexical analysis recognizes 3 types of tokens:
1. Terminal symbols [TRM]
2. Identifiers [IDN]
3. Literals [LIT]
3 Each token is classified as follows:
If the token is found in the TERMINAL table then
    Create a uniform symbol of type TRM
Else if the token is an identifier then
    Enter it in the IDENTIFIER table and create a uniform symbol of type IDN
Else
    Enter it in the LITERAL table and create a uniform symbol of type LIT
End if
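The classification above can be sketched in Python. This is a simplified illustration: the regular expression used to split the input, the function name `lexical_phase` and the ten-entry terminal table (copied from the first ten rows of the terminal table above) are assumptions, and keywords are matched without regard to case.

```python
import re

# Terminal table: position in the list + 1 is the index used in uniform symbols.
TERMINALS = [":", ";", "(", ")", ",", "*", "declare", "procedure", "+", "-"]

def lexical_phase(source):
    """Return (uniform symbol table, identifier table, literal table)."""
    identifiers, literals, uniform = [], [], []
    # A token is a name, a number, or any single non-blank character.
    for token in re.findall(r"[A-Za-z_]\w*|\d+|\S", source):
        if token.lower() in TERMINALS:
            uniform.append(("TRM", TERMINALS.index(token.lower()) + 1))
        elif token[0].isalpha() or token[0] == "_":
            if token not in identifiers:
                identifiers.append(token)
            uniform.append(("IDN", identifiers.index(token) + 1))
        elif token.isdigit():
            if token not in literals:
                literals.append(token)
            uniform.append(("LIT", literals.index(token) + 1))
        else:
            raise ValueError(f"unknown symbol {token!r}")
    return uniform, identifiers, literals

uniform, identifiers, literals = lexical_phase("WCM: procedure (Rate, Start, Finish);")
for table_class, index in uniform:
    print(table_class, index)
```

For the first line of the WCM program this yields the same table classes and indexes as the uniform symbol table shown earlier.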
2 Syntax Phase:
The functions of the syntax phase are:
1. To recognize the major constructs of the language
2. To call the appropriate action routines that will generate the intermediate form or matrix
for these constructs
Databases:
1 Uniform symbol table: The table created by the lexical phase.
The uniform symbols are the source of input to the stack, which is used by the syntax and
interpretation phases.
Table class   Index
2 Stack: The stack is a collection of the uniform symbols that are currently being worked on.
The stack is organized as LIFO (last in, first out).
3 Reduction table: The syntax rules of the source language are contained in the reduction table.
The general form of a reduction or rule is:
Label: old top of stack / action routines / new top of stack / next reduction
Algorithm:
Step1: Reductions are tested consecutively for a match between the old top of stack field and
the actual top of the stack, until a match is found
Step2: When a match is found, the action routines specified in the action field are executed in
order from left to right
Step3: When control returns to the syntax analyzer, it modifies the top of the stack to agree
with the new top of stack field
Step4: Step1 is repeated, starting with the reduction specified in the next reduction field
3. Interpretation Phase:
Databases:
1. Uniform symbol table
2. Stack
3. Identifier table
4. Matrix
For the above databases, refer to text book page no: 210.
4. Optimization Phase:
Optimization performed by a compiler is of 2 types. They are
1. Machine dependent optimization:
It is related to the machine instructions that get generated, so it is incorporated into the
code generation phase.
2. Machine independent optimization:
It is not related to the machine instructions. It is used to increase the efficiency of the
code and to reduce its size.
Databases:
Matrix
Identifier table
Literal table
For these, refer to text book page no: 217.
Machine independent code optimization:
Ex: A=2 * 276 / 92 * B
Refer to text book page no: 219.
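One machine independent optimization that an example of this kind lends itself to is computing operations on constants at compile time. The Python sketch below is an illustration under stated assumptions, not the text book's algorithm: the matrix layout, the function name `fold_constants` and the rule of folding a division only when it is exact are all choices made for the sketch.

```python
import operator

ARITHMETIC = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.floordiv}

def fold_constants(matrix):
    """Evaluate entries whose operands are both constants; None marks a deleted entry."""
    folded = {}     # name of folded entry -> constant value
    result = []
    for n, (op, a, b) in enumerate(matrix, start=1):
        a, b = folded.get(a, a), folded.get(b, b)   # substitute known constants
        if op in ARITHMETIC and isinstance(a, int) and isinstance(b, int):
            exact = op != "/" or (b != 0 and a % b == 0)
            if exact:
                folded[f"M{n}"] = ARITHMETIC[op](a, b)
                result.append(None)
                continue
        result.append((op, a, b))
    return result

# Matrix for A = 2 * 276 / 92 * B, evaluated from left to right.
matrix = [("*", 2, 276), ("/", "M1", 92), ("*", "M2", "B"), ("=", "A", "M3")]
print(fold_constants(matrix))
```

Here 2*276/92 is computed once at compile time, so only the single multiplication 6*B remains to be done at execution time.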
Machine dependent code optimization:
Ex: A= B + C + D
Refer to text book page no: 224.
6. Code generation:
The purpose of code generation is to produce the appropriate code. In this phase the matrix is
the input database.
Databases:
Matrix
Identifier table
Literal table
Code productions
Ex: Code generation with machine dependent optimization.
A = B + C + D
Refer to text book page no: 224.
Draw an overview flowchart of a compiler depicting the passes. (or)
Explain the passes of a compiler.
Passes of a compiler
A compiler can be organized as a sequence of N passes (N = 7 in the model described here).
Pass 1:
It corresponds to the lexical analysis phase of a compiler. It scans the source program and
creates the identifier, literal and uniform symbol tables.
Pass 2:
It corresponds to the syntax and interpretation phases. Pass 2 scans the uniform symbol table
and produces the matrix.
Pass 3 through Pass N-3 (Pass 4 when N = 7):
They correspond to the optimization phase.
Pass N-2 (Pass 5):
It corresponds to the storage assignment phase.
Pass N-1 (Pass 6):
It corresponds to the code generation phase. It scans the matrix.
Pass N (Pass 7):
It corresponds to the assembly and output phase.
What is Cross Compiler?
Def:
A cross compiler is a compiler capable of creating executable code for a platform other
than the one on which the compiler is running.
A cross compiler is necessary to compile for multiple platforms from one machine. A platform
could be infeasible for a compiler to run on, such as for the microcontroller of an embedded
system because those systems contain no operating system.
Cross compilers are not to be confused with source-to-source compilers. A cross compiler is for
cross-platform software development of binary code, while a source-to-source "compiler" just
translates from one programming language to another in text code.
Uses of cross compilers
Embedded computers
Compiling for multiple machines
Use of virtual machines
What is Linker and Functions of Linker?
The linker is the software program which binds many object modules to make a
single object program