4. 4
Assembler is a program which accepts
assembly language program as its
input and translate into equivalent
machine language program.
Assembly
Language
Program
Assembler
Machine
Language
Code
5. Mnemonic
Opcode (1)
Machine
Language
instruction are
replaced by
Mnemonic codes
e.g ADD
Symbolic
Operands (2)
Symbolic
names can be
used as
operands
E.g. A,RES
Data Declaration
(3)
Data is
declared in
variety of
form like
DS,DC,Literal
Readability (4)
Programs are
easy to
understand
and modify
Debugging (5)
Easy as
compared to
MLL.. Errors can
be detected and
corrected easily.
6. 6
• symbolic labeling of an assembler address .
• it is symbolic name for that statement.
Label
• Symbolic description of an operation
Mnemonic op code.
• Contains of variables or addresse if necessary
Operand Specification,[operand Specification]
Syntax of Assembly Language Statement :-
<Mnemomic> <Operand> Comments<Label>
AGAIN MULT BREG, TERM
MULT
BREG OR TERM
7. 7
Operand 1 :- Could be register /Condition Code.
Operand 2 :- Symbolic name of variable/ Label.
AGAIN MULT BREG, TERM
RES
• Refers to
memory word
with symbolic
name as “RES”.
RES+3
• Refers to
memory word
which is 3
words away
from word
“RES”.
RES(2)
• Content of
index register 2
RES RES RES+3
RES(2)
9. 9
Imperative
Statements:-
Declaration
Statements:-
They indicate an action to be performed during the execution of an
assembled program. Each imperative statement is translated into one
machine instruction.
[Label] DS <constant>[
statement reserves memory and associates names with them.
for ADD imperative statement (01) mchine
instructon is generated
A DS 1 ; reserves a memory area of 1
word, associa statement reserves memory
and associates names with them. Give the
name A to it
G DS 200 ; reserves a block of 200 words
and the name G is associated with G.
Label] DC '<value>’
◼ The DC (declare constant) statement constructs memory words
containing constants.
ONE DC '1’ ;
associates name one with a memory word
containing value 1
10. Assembler Directive
Statements:-
Assembler directives instruct the assembler to perform certain actions
during the assembly of a program.
1) START <constant>
This directive indicates that the first word of the
target program generated by the assembler
should be placed in the memory word having
address <constant>. E.g. START 100.
2) END [<operand spec>]
This directive indicates the end of the of the source
program. The optional <operand spec>
indicates the address of the instruction where
the execution of the program should begin.
11. Use of Constants
● An Assembly Program can use constants just like HLL, in two ways – as
immediate operands, and as literals.
• 1) Immediate operands can be used in an assembly statement only if the
architecture of the target machine includes the necessary features.
○ Ex: ADD AREG, 5
○ This is translated into an instruction from two operands – AREG and
the value '5' as an immediate operand
11
12. Use of Constants continue...
2) A literal is an operand with the syntax = '<value>'.
It differs from a constant because its location cannot be specified in the
assembly program.
Its value does not change during the execution of the program.
It differs from an immediate operand because no architectural provision is
needed to support its use.
ADD AREG, =‘5’ ➔ ADD AREG, FIVE
FIVE DC ‘5’
12
14. Advanced assembler directives (LTORG, ORIGIN, EQU)
14
ORIGIN
Provides the ability to
perform LC processing.
Useful when the target
program does not have
consecutive Memory.
ORIGIN LOOP
Reinitialize LC to
address of LOOP
EQU
Defines the symbol to
represent the given
Address
Simply associates name
with Address
A EQU LOOP
Assigns Address of
LOOP to A.
A= addr(LOOP)
LTORG
Treated as a constant.
Memory address or
word is used as
constant no the name
of it.
LTORG
=‘5’
No Need Memory
17. Design of assembler – Analysis Phase and Synthesis
Phase.
17
Analysis Phase
Synthesis
Phase
SP
Symbol
Table Creates
Uses
Mnemonic
Table
TP
18. Analysis Phase
● Primary Function :- Building the symbol
Table.
● For this, Memory Allocation :- it must
determine the addresses of symbols
[ directly or inferred].
● For this ,a Data structure is used Location
Counter [LC].
● Location Counter is made to contain the
address of the next memory word in the
target program.
● It is initialised to the constant given with
START
● E.g STRAT 101 then LC=101
18
LC=101
1.Build Symbol table.
2.Memory Allocation.
3.LC Processing.
19. 19
Mnemonic Opcode length
ADD 01 1
SUB 02 1
Symbol Address
N 113
AGAIN 104
Symbol Table
[SYMTAB] generated
by Analysis Phase.
Mnemonic Table
[Ready made] Analysis
Phase
20. LC Processing [giving memory to each element.]
● It is initialized to the constant specified at the START statement.
● LC is always made to contain the address of the next memory word in the
target program.
● When a LABEL is encountered,
○ it enters the LABEL and the contents of LC in a new entry of the
symbol table.
○
20
LABEL – e.g. N, AGAIN, SUM
Symbol Name Address Length
AGAIN 104 1
22. 22
Separate the label, mnemonic,op code &
operand of a statement
Build Symbol Table
Perform Memory Allocation
If label Present then
Add LC (Label) in the SymbolTable.
Check Mnemonic Valid?
[Using Mnemonic Table]
Perform LC Processing .
Assembler
Analysis
Steps….
READ N
27. See the difference between (a) After
Analysis Phase (b) Target Code to create
27
(a) (b)
28. 28
Obtain the Address of operand
From Symbol Table.
Obtain the machine code for
mnemonic from Mnemonic
Table
Construct Machine Instruction.
29. 29
MOVER BREG,ONE
Mnemonic Opcode length
MOVER 04 1
MOVEM 05 1
04
Symbol Address
N
113
ONE 115
TERM
116
AGAIN
104
RESULT
114
AREG 1
BREG 2
CREG 3
DREG 4
2 115
042115
30. 30
Symbol
Name
Address len
A 217 1
LOOP 202 1
B 218 1
NEXT 214 1
BACK 202 1
LAST 216 1
Symbol Table
Entr
y No
Literal Address
1 =‘5’ 211
2 =‘1’ 212
3 =‘1’ 219
Literal Table
Pool
No
Literal
no
1 #1
2 #3
Data Structure Entry for A given
Program
POOL Table
31. Need Of Pass
Structure
● Problem of Forward
Reference.
● i.e. ( Use and then Declare)
● So to Handle this problem
“Translation pass “ is
introduced.
31
Definition
Use
33. Backpatching 102)
● The problem of forward references is handled using a
process called back patching
1. Initially, the operand field of an instruction containing a forward reference is left blank
2. Ex: MOVER BREG, ONE can be only partially synthesized since ONE is a forward
reference
3. The instruction opcode and address of BREG will be assembled to reside in location
102.
4. To insert the second operand’s address later, an entry is added as Table of Incomplete
Instructions (TII)
5. The entry TII is a pair (<instruction address>, <symbol>) which is (102, ONE) here
MOVER BREG, ONE 04 2 ………
34. Back patching….
6. When END statement is processed, the symbol table would contain the addresses of all
symbols defined in the source program
7. So TII would contain information of all forward references
8. Now each entry in TII is processed to complete the instruction
9. Ex: the entry (102, ONE) would be processed by obtaining the address of ONE from
symbol table and inserting it in the operand field of the instruction with assembled
address 102.
10. Alternatively, when definition of some symbol L is encountered, all forward references
to L can be processed
Table of Incomplete
Instruction
(102, ONE)
35. Forward Reference :- A forward reference of an entity
is a reference to that entity which precedes its definition in the
program.
35
Pass:-A language processor pass is the processing of every
statement in a source statement (or its equivalent program) to
perform a language processing
Processing Performed
by the language
processor [Assembler]
36. Assembler Design
• Pass of a language processor – one complete scan of
the source program
• Assembler Design can be done in:
- Single pass
- Two pass
• Single Pass Assembler:
– Does everything in single pass
- Cannot resolve the forward referencing
• Two pass assembler:
– Does the work in two pass
– Resolves the forward references
39. 2-Pass Assembler Design
• First pass: [similar to Analysis Phase]
– Scan the code by separating the symbol, mnemonic op code and
operand fields
– Build the symbol table, Literal Table.
– Perform LC processing
– Construct intermediate representation (IR)
• Second Pass: [similar to Synthesis Phase]
– Obtain the Machine Code from Mnemonic Table.
– Solves forward references by obtaining addresses from SYMTAB.
– Constructs the machine instruction by processing IC generated by
Pass-I
IR = Data
Structure
[symtab , littab.. ]
+ Processed
form of Source
Program.
40. Data Structures in Pass I
1. OPTAB – a table of mnemonic op codes : Contains mnemonic op code, class and
mnemonic info
2. SYMTAB - Symbol Table:- Contains address and length
3. LOC - Location Counter
4. LITTAB – a table of literals used in the program , Contains literal and address. Literals
are allocated addresses starting with the current value in LC and LC is incremented,
appropriately.
Name Addr Len
N 101 1
Symtab
Mnemo
nic
Numeric
Opcode
Len
ADD 01 1
Optab
Name Addr
=‘5’ 101
Littab
No entry
No
1 #1
PoolTab
LC
41. 42
Two Pass
translation
One Pass
Translation
Pass
Types
One Pass Assembler [Single Pass
Assembler.
1.LC processing and Construction of Symbol Table is as in 2
Pass Assembler.
2.The Problem of forward Reference is handled by using Back -
patching
3. Use of Table of Incomplete Instruction [TII] or
Incomplete Instruction Table [IIT]
4.Explanation of Back patching.
5.By the END statement , SYMTAB would contain all addresses of all symbols and TII
contains information about all forward references.
42. 43
6. Assembler can now process each entry in the TII to complete the concerned
instruction and all forward references can be processed.
MOVER BREG ,ONE was 04 2 ……………………
Table of Incomplete
Instruction
(102, ONE)
Symbol Address
N 113
ONE 115
TERM 116
AGAIN 104
RESULT 114
102)
04 2 115
43. 44
Statements having Forward
References [ For Symbol B] :-
ADD BREG,B
MOVEM BREG ,B
PRINT B
Symbol with forward Ref
entered into TII
Table of Incomplete
Instruction
(103, B)
(104, B)
(105, B)
Symbol Address
A 113
Again 101
B 107
Symbol entered into Sym
table.
As we have reached at he
end statement
44. 45
Here single pass or One
Pass Assembler is
completed with the
partially converted form
of machine code.
With the help of TII
incomplete instructions
having LC as 103,104,105
are back patched with
address of Symbol ‘B’ i.e.
107 and at last we get this
output
46. Intermediate Code Forms [IC]
47
???? :- It is generated at the end of PASS-I of 2 Pass Assembler.
It expresses the Machine Operation.
Intermediate Code consists of a set of IC units , each unit consisting
of the 3 fields…
1. Address 2.Opcode 3.Operands.
START 200 (AD,01) (C,200)
IC
47. Generate IC For each part of Statement
except Label
48
LOOP MOVER AREG, N
IC1
IC2
IC3
48. 49
How to convert
Mnemonic Field?????
(Statement Class , Code)
IS
DL
AD
Use Numeric Op
codes.
DC 01
DS 02
START 01
END 02
ORIGIN 03
EQU 04
LTORG 05
MOVER AREG,N
(IS,04)
A DS 1
(DL ,01)
LTORG
(AD,05)
49. Generate IC For each part of Statement
except Label
50
LOOP MOVER AREG, N
(IS,04)
?
?
50. Two ways to construct IC:-Variant I and
Variant II
1. First operand is represented by
single digit number , which is a
51
code for
register(1-4 for
AREG-DREG).
code for Condition
Codes (1-6 for LT-
ANY ).
o
r
LOOP MOVER AREG , A
1ST Operand is : AREG
IC 1
BC GT, LOOP
1ST Operand is : GT
IC 4
o
r
51. Two ways to construct IC:-Variant I….
2. Second operand which is a
memory word [variable] is
represented by a pair of form..
52
(Operand Class , Code)
LOOP MOVER AREG , A
2nd Operand is A
IC (S,1)
SUB AREG , =‘1’
2nd Operand is =‘1’
IC (L,1)
o
r
C
S
L
Constant
Symbol
Label
Literal
Internal
Representation of
constant. (C,200)
Operands Entry No
in Symbol Table
Operands Entry No
in Literal Table
52. Two ways to construct IC:-Variant I..operand2 may
be defined or with forward reference
3. Second operand which is a
memory word [variable] is
53
LOOP MOVER AREG , A
2nd Operand is A
IC (S,125,1)
o
r
Without forward
reference [Memory is
given then using]
With forward
reference [ Memory
is not given but used
first]
(S ,125 ,1) (S ,1)
LOOP MOVER AREG , A
2nd Operand is A
IC (S,1)
53. Variant II differs from Variant I in operand
field.
First Operand
[REG/Condition
Code]
Second Operand
DO NOT process
C
Li
S
La
IC generation of Mnemonics is same as Variant I
[MOVER ] …. (IS,04)
(C,100)
(L,1)
S
Label