Loaders complete

Linkers and Loaders
Collected By: pallavi

Introduction
The execution of a program written in a particular programming
language involves the following steps:
Translation: conversion of the source program into object code.
Linking: combining two or more object programs and supplying
the information needed to allow references between them.
Relocation: modifying the object program so that it can be loaded
at an address different from the location originally specified.
Relocation is performed by the linker.
Loading: bringing the object program into memory for
execution.

Linking
• It is a process of combining different object modules
and resolving various external references.
• In general a program contains:
– Internal references to externally defined symbols.
– Internally defined symbols that may be used externally i.e.
public definition.
E.g. a program consists of a set of program units P1,
P2 and P3.
If P1 and P2 need to interact with each other, they must contain
public definitions and external references.

• Public definition: a symbol or variable
defined in a program unit as public so that it can be used
in other program units.
• External reference: a symbol or variable
not defined in a program unit but
referenced in that program unit.

Linkers
Computer programs typically comprise several parts or modules; all these
parts/modules need not be contained within a single object file, and in such
a case they refer to each other by means of symbols.
Typically, an object file can contain three kinds of symbols:
– Defined symbols, which allow it to be called by other modules,
– Undefined symbols, which call the other modules where these symbols are defined, and
– Local symbols, used internally within the object file to facilitate relocation.
Linkers can take objects from a collection called a library. Some linkers do not include the
whole library in the output; they only include its symbols that are referenced from other
object files or libraries. Libraries exist for diverse purposes, and one or more system
libraries are usually linked in by default.

Static and Dynamic Linking

In general, there are three types of linkers:
• Static Linker (or linkage editor) - linking prior to
load time.
• Implicit Dynamic Linker (or linking loader) -
linking at load time.
• Explicit Dynamic Linker - linking is performed at
execution time.

Static Linking
• The simplest form of linking is called static
linking.
• Static linking is used to combine multiple
functions into a single object module.
• These functions may be a part of your compiled
program modules, and/or a part of various
archive libraries that they use.
• The result of static linking is an executable
object.

Working of Static Linker
1. The linker is given your compiled code, containing many unresolved
references to library routines.
2. It also gets archive libraries containing each library routine as a separate
module.
3. The linker keeps working until there are no more unresolved references
and writes out a single file that combines your code and a jumbled mixture
of modules containing parts of several libraries.
4. This static linking process takes place once, when the executable module
is created. All internal references to functions within a module are resolved
at this time.

Advantages
• One linking for many executions.
• Calls into the library routines have a little less
overhead since they are linked together directly.
• Start-up time at program loading is reduced as there is
no need to locate and load the dynamic libraries

Disadvantages
• Physical memory wasted by duplicating the same library code in every static linked
process can be significant.
• A static-linked program contains a subset of the jumbled library routines. The library
cannot be tuned as a whole to put routines that call each other onto the same
memory page.
• During program development, the program has to be relinked before nearly every execution, because each change requires rebuilding the executable.
• Since the size of the executable object may be relatively large, the storage waste is
significant, especially in case of infrequently used program.

Dynamic Linking
The operating system provides facilities for creating and using
dynamically linked shared libraries.
With dynamic linking, external symbols referenced in user code
and defined in a shared library are resolved by the loader at load
time.
When you compile a program that uses shared libraries, they are
dynamically linked to your program by default.
The central property of the dynamic linking is that even if
many processes are using a particular, dynamically linked
function, only a SINGLE copy of this function appears in the
memory, and it is shared by all these processes.

Dynamic Linking
• Many operating system environments allow dynamic linking,
that is the postponing of the resolving of some undefined
symbols until a program is run.
• That means that the executable code still contains undefined
symbols, plus a list of objects or libraries that will provide
definitions for these.
• Loading the program will load these objects/libraries as well,
and perform a final linking.
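• With dynamic linking, this final linking step is carried out by the loader when the program is run, not by a separate static linker.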

• The shared library code is not present in the
executable image on disk, but is kept in a
separate library file.
• Shared code is loaded into memory once in the
shared library segment and shared by all
processes that reference it.

Dynamic Linking Process
Dynamically imported functions are functions that are called from within one module
but actually reside in another module.
The dynamic linking process consists of two steps:
1. First, the module that contains the external function must be located.
2. Second, the address of the function within this module must be found. Once this
address is found, the calling module can use this external function.
In order to import a function, the developer must provide the name of the module and the
name of the function. These items should uniquely identify the imported function.
Dynamic linking occurs when this information is used to find the imported function's
address.
If the specified module is found and the specified function is also found, the function's
address is dynamically placed in the calling module's code.
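The two lookup steps above can be sketched as a small simulation. This is only an illustration: the table `LOADED_MODULES`, the module and function names, and the addresses are all hypothetical, and a real operating system loader works on shared library files instead of Python dictionaries.

```python
# Hypothetical in-memory view of loaded modules:
# module name -> {exported function name -> address}.
LOADED_MODULES = {
    "mathlib": {"sqrt": 0x4000, "sin": 0x4040},
    "iolib": {"read": 0x8000},
}


def resolve_import(module_name, function_name, modules=LOADED_MODULES):
    """Return the address of an imported function, or raise LookupError."""
    # Step 1: locate the module that contains the external function.
    exports = modules.get(module_name)
    if exports is None:
        raise LookupError(f"module not found: {module_name}")
    # Step 2: find the address of the function within that module.
    address = exports.get(function_name)
    if address is None:
        raise LookupError(f"function not found: {module_name}.{function_name}")
    return address


def bind_imports(import_list, modules=LOADED_MODULES):
    """Build the calling module's import table: (module, function) -> address."""
    return {(m, f): resolve_import(m, f, modules) for (m, f) in import_list}
```

Here the pair (module name, function name) plays the role of the items that uniquely identify an imported function, and `bind_imports` stands in for placing the resolved addresses into the calling module.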

Advantages
1. Load time might be reduced because the shared library code might already be
in memory.
2. Run-time performance can be enhanced because the operating system is less
likely to page out shared library code that is being used by several
applications, or copies of an application, rather than code that is only being used by
a single application. As a result, fewer page faults occur.
3. The routines are not statically bound to the application but are dynamically bound
when the application is loaded. This permits applications to automatically inherit
changes to the shared libraries, without recompiling or rebinding.

Disadvantages
1. A more subtle effect is a reduction in "locality of reference." You may
be interested in only a few of the routines in a library, and these routines
may be scattered widely in the virtual address space of the library. Thus,
the total number of pages you need to touch to access all of your routines is
significantly higher than if these routines were all bound directly into your
executable program.
2. Dynamically linked programs are dependent on having a compatible
library. If a library is changed (for example, a new compiler release may
change a library), applications might have to be reworked to be made
compatible with the new version of the library. If a library is removed from
the system, programs using that library will no longer work.

Introduction
A loader is a program that accepts the object deck of a program, places it into memory,
prepares it for execution, and initiates execution.
Functions of a Loader:
• Allocates space in memory for the programs (allocation)
• Resolves symbolic references between object decks (linking)
• Adjusts all address-dependent locations, such as address constants, to correspond to
the allocated space (relocation)
• Physically places the machine instructions and data into memory (loading)
Loading is done by copying the file from secondary storage to primary or virtual
memory.

Loader schemes
• There are various loader schemes available and these are as
follows:
– Compile and go/ Assemble and go loader
– General loader
– Absolute loader
– Relocating loader
– Direct linking loader

Compile-and-Go Loader
• It executes an assembler in one part of memory and places the assembled machine
instructions and data directly into their assigned memory locations.
• As soon as assembly is over, the assembler transfers control to the starting
instruction of the program.
• Here the assembler simply places code into core, and the loader consists of one instruction that transfers
control to the beginning of the program.
• In this type of loader, assembling or compiling, linking, and loading happen in one step; as a
result it does not require any extra procedures.
• This type of loading scheme is used by WATFOR FORTRAN Compiler.

Advantages of Compile-and-Go Loader
• Easy to implement, because the assembler, after translating the
source program, places it directly into core where it is
executed.
• Does not involve extra procedures.

Disadvantages of Compile-and-Go Loader
• A portion of the memory is occupied by the assembler, which is not
available for the program.
• It is necessary to retranslate the user program deck every time it is run.
• It is very difficult to handle multiple segments, especially if they are in different
languages.
– E.g. one subroutine in assembly language and another written in
Fortran.
• It is difficult to produce orderly modular programs.

General Loader Scheme
• Outputting the instructions and data as they are assembled circumvents the problem
of wasting core for the assembler.
• The assembled output can be saved and loaded whenever the code needs to be
executed.
• The assembled program can be loaded into the same memory area that was earlier
used by the assembler, since assembly has been completed.
• This output, which may be on cards containing a coded form of the instructions, is called
an object deck.
• The use of an object deck as intermediate data avoids one disadvantage of the
compile-and-go scheme, but requires the addition of a new program to the system: a loader.

• The loader accepts the assembled code, data and information in object form and
places them in core in an executable form.
• The loader is assumed to be smaller than the assembler, which saves memory.
• Reassembly is not required each time.
• Source program can be in multiple languages.

Absolute Loaders
• It fits the general loading scheme.
• In absolute loader the assembler translates the source program, generates
the object code and writes these instructions and data in a file together with
their load address.
• The loader reads the file and places the code at the absolute address given
in the file.
• Here, no relocation information needs to be stored as part of the object
file, so the loader is termed absolute.

• Resolution of external references and linking of interdependent modules
is done by the programmer, on the assumption that the
programmer knows memory management.
• In this scheme, multiple segments are allowed. Logically, even
programs written in multiple languages are allowed, but the respective
assemblers have to take care of converting them into a common object
format.
• For this, the assembler must give the following information through
the object file:
– Starting address and name of each module.
– Length of each module

Absolute Loader (contd.)
• Produces object code almost identical to that of the compile-and-go loader.
• The only difference is that the object code is written on cards instead of into core.
• The loader then simply accepts the machine code and puts it in core at the location prescribed by the
assembler.
• Saves core.
• Simple to implement.
• The programmer must specify to the assembler the core location where the program is to be
loaded.
• If the program contains multiple subroutines, the programmer must remember each address
and specify these addresses explicitly in the subroutines to perform the linkage.
• The programmer has to be careful not to assign the same locations to more than one
subroutine.

Fig: Absolute Loader. The object decks for MAIN and SQRT pass through the absolute loader into core; the figure marks core locations 100, 248, 400 and 478.

Design of an Absolute Loader
With an absolute loader scheme the programmer and the assembler perform the
tasks of allocation, relocation, and linking.
Therefore it is only necessary for the loader to read the cards of the object deck and
move the text on the cards into the absolute locations specified by the assembler.
The assembler must provide two types of information to the loader:
• Machine instructions along with their assigned core locations (TEXT CARD).
• The entry point, i.e. where the loader is to transfer control when all instructions are
loaded (TRANSFER CARD).

When a card is read, it is stored in core as 80 contiguous
bytes:
Text Card (for instructions and data)
Card Column   Contents
1             Card type = 0 (text card identifier)
2             Count of number of bytes of information on card
3-5           Address at which data is to be put
6-7           Empty (could be used for validity check)
8-72          Instructions and data to be loaded
73-80         Card sequence number

When a card is read, it is stored in core as 80
contiguous bytes:
Transfer Card (to hold entry point to program)
Card Column   Contents
1             Card type = 1 (transfer card identifier)
2             Count = 0
3-5           Address of entry point
6-72          Empty
73-80         Card sequence number

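Absolute Loader (flowchart)
1. Initialize.
2. READ card.
3. Set CURLOC to location in char 3-5.
4. Set LNG to count in char 2.
5. Card type?
– Type 0 (text card): move LNG bytes of text from char 8-72 to location CURLOC, then return to step 2.
– Type 1 (transfer card): transfer to location CURLOC.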
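The card layouts and the loader flowchart can be sketched as a short simulation. It is a minimal sketch under assumptions not stated in the slides: core is modelled as a 4096-byte `bytearray`, the 3-byte address field is read as a big-endian integer, and the helper names (`make_text_card`, `make_transfer_card`, `absolute_load`) are hypothetical. Instead of actually branching to the entry point, the function returns it.

```python
MEMORY_SIZE = 4096  # assumed size of the simulated core


def make_text_card(address, data, seq=0):
    """Build an 80-byte text card (type 0) following the column layout."""
    if len(data) > 65:
        raise ValueError("columns 8-72 hold at most 65 bytes")
    card = bytearray(80)
    card[0] = 0                                   # column 1: card type
    card[1] = len(data)                           # column 2: byte count
    card[2:5] = address.to_bytes(3, "big")        # columns 3-5: load address
    card[7:7 + len(data)] = data                  # columns 8-72: text
    card[72:80] = str(seq).zfill(8).encode("ascii")  # columns 73-80
    return bytes(card)


def make_transfer_card(entry_point, seq=0):
    """Build an 80-byte transfer card (type 1) holding the entry point."""
    card = bytearray(80)
    card[0] = 1
    card[1] = 0
    card[2:5] = entry_point.to_bytes(3, "big")
    card[72:80] = str(seq).zfill(8).encode("ascii")
    return bytes(card)


def absolute_load(cards, memory_size=MEMORY_SIZE):
    """Place text cards into memory; return (memory, entry point)."""
    memory = bytearray(memory_size)               # initialize
    for card in cards:                            # READ card
        if len(card) != 80:
            raise ValueError("a card must be exactly 80 bytes")
        curloc = int.from_bytes(card[2:5], "big") # CURLOC from char 3-5
        lng = card[1]                             # LNG from char 2
        if card[0] == 0:                          # text card
            if curloc + lng > memory_size:
                raise ValueError("text does not fit in memory")
            memory[curloc:curloc + lng] = card[7:7 + lng]
        elif card[0] == 1:                        # transfer card
            return memory, curloc
        else:
            raise ValueError(f"unknown card type {card[0]}")
    raise ValueError("object deck has no transfer card")
```

For example, a deck built from `make_text_card(100, b"\x01\x02\x03")` followed by `make_transfer_card(100)` leaves those three bytes at locations 100 to 102 and reports 100 as the entry point.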

The four loading functions are
accomplished as follows in the
absolute loading scheme:
• Allocation – by programmer
• Linking – by programmer
• Relocation – by assembler
• Loading – by loader

Limitations
• Programmer has to specify to the assembler where to
load the program.
• In case of multiple subroutines, programmer has to
remember each address, and use it explicitly in other
subroutines.
• Change in one subroutine can cause a change to
other subroutines also.

Subroutine Linkage
• The problem of subroutine linkage is this: a main program A wishes to transfer to subprogram B.
The programmer in program A could write a transfer instruction (e.g. BAL 14,B) to
subprogram B.
• However, the assembler does not know the value of this symbol reference and will declare
it an error.
• This mechanism is typically implemented with a relocating or a direct-linking loader.
• The assembler pseudo-op EXTRN followed by a list of symbols indicates that these
symbols are defined in other programs but referenced in the present program.
• If a symbol is defined in one program and referenced by others, we insert it into a symbol list
following the pseudo-op ENTRY.
• In turn, the assembler will inform the loader that these symbols may be referenced by other
programs.

Subroutine Linkage
(Use of EXTRN)
MAIN     START
         EXTRN  SUBROUT
         ------
         ------
         L      15,=A(SUBROUT)
         BALR   14,15
         ……
         END
1) The above sequence of instructions first declares SUBROUT as an external
symbol, that is, a symbol referenced but not defined in this program.
2) The load instruction loads the address of that symbol into register 15.
3) The BALR instruction branches to the address held in register 15, which is the
address of SUBROUT, and leaves the address of the next instruction in register 14.

Uses of Multiple Entry Points are:
• Common coding
• Collecting together related routines for
convenience.
• Better or more convenient access to a common
database.

Relocation
• Relocation is the process of assigning load addresses to
various parts of a program and adjusting the code and data in
the program to reflect the assigned addresses.
• A linker usually performs relocation in conjunction with symbol
resolution, the process of searching files and libraries to
replace symbolic references or names of libraries with actual
usable addresses in memory before running a program.
• Although relocation is typically done by the linker at link time, it
can also be done at execution time by a relocating loader, or by
the running program itself.

Relocation Procedure
Relocation is typically done in two steps:
• Each object file has various sections like code, data, .bss etc.
– To combine all the objects to a single executable, the linker merges
all sections of similar type into a single section of that type.
– The linker then assigns run time addresses to each section and each
symbol.
– At this point, the code (functions) and data (global variables) will have
unique run time addresses.
• Each section refers to one or more symbols which should be
modified so that they point to the correct run time addresses
based on information stored in a relocation table in the object
file.
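The two relocation steps can be sketched as follows. This is an illustrative model, not a real object file format: each object is a dictionary with hypothetical keys `sections`, `symbols` and `relocs`, every word is assumed to occupy one address unit, and the base address `0x1000` is arbitrary.

```python
def link(objects, base=0x1000, order=(".text", ".data")):
    """Merge sections, assign run-time addresses, then patch references."""
    # Step 1a: merge all sections of the same type, remembering where
    # each input section starts inside the merged section.
    merged = {sec: [] for sec in order}
    placement = {}
    for i, obj in enumerate(objects):
        for sec in order:
            placement[(i, sec)] = len(merged[sec])
            merged[sec].extend(obj["sections"].get(sec, []))

    # Step 1b: assign run-time addresses to each section and each symbol.
    section_addr = {}
    addr = base
    for sec in order:
        section_addr[sec] = addr
        addr += len(merged[sec])
    symbols = {}
    for i, obj in enumerate(objects):
        for name, (sec, offset) in obj["symbols"].items():
            symbols[name] = section_addr[sec] + placement[(i, sec)] + offset

    # Step 2: use each object's relocation table to patch the words that
    # refer to symbols so that they hold the final run-time addresses.
    for i, obj in enumerate(objects):
        for sec, offset, name in obj["relocs"]:
            merged[sec][placement[(i, sec)] + offset] = symbols[name]

    image = [word for sec in order for word in merged[sec]]
    return image, symbols


# Two hypothetical object files; a 0 word marks a slot to be patched.
main_o = {
    "sections": {".text": [10, 0, 11], ".data": [7]},
    "symbols": {"main": (".text", 0)},
    "relocs": [(".text", 1, "helper")],
}
lib_o = {
    "sections": {".text": [20, 0], ".data": [42]},
    "symbols": {"helper": (".text", 0), "counter": (".data", 0)},
    "relocs": [(".text", 1, "counter")],
}
```

Calling `link([main_o, lib_o])` places the merged `.text` at `0x1000` and `.data` right after it, so `helper` resolves to `0x1003` and `counter` to `0x1006`.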

Relocating Loaders
To avoid possible reassembling of all subroutines when a single subroutine is changed, and to
perform the tasks of allocation and linking for the programmer, the general class of relocating
loaders was introduced.
Binary Symbolic Subroutine (BSS) loader:
• The BSS loader allows many procedure segments, yet only one data segment.
• The assembler assembles each procedure segment independently and passes on to the loader the
text and information as to relocation and inter-segment references.
Relocating Assembler:
• The output of a relocating assembler using the BSS scheme is the object program and
information about all other programs it references.
• For each source program the assembler outputs a text (machine translation of the
program) prefixed by a transfer vector that consists of addresses containing names of the
subroutines referenced by the source program.

• The assembler translates each procedure segment independently and passes
the assembled segments and information about relocation to the loader.
• The output of the relocating assembler is the object program and information
about all other programs it references.
• For each source program the assembler produces the machine language
equivalent of the program, called TEXT. This text is prefixed by a
TRANSFER VECTOR.
Transfer Vector
• Consists of addresses containing names of the subroutines
referenced by the source program.

• Assembler provides the following to the loader:
– Machine translation of a program called text.
– Transfer vector
– Information about all other programs it references.
– Relocation information
– The length of the entire program and length of transfer vector portion.
• The loader accepts all the information passed by the assembler
and performs the following actions:
o First, it loads the transfer vector and text into core.
o Then it loads each subroutine specified by the transfer vector.
o It then places a transfer instruction to the corresponding subroutine in each
entry in the transfer vector.

– Thus the transfer vector is used to solve the problem of
linking and program length information is used to solve the
problem of allocation.
– Thus the execution of the call SQRT statement would result in a branch
to the first location in the transfer vector, which would contain a
transfer instruction to the location of SQRT.
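The loader actions listed above can be sketched as a toy simulation. Everything here is an assumption made for illustration: programs are dictionaries with a `vector` (subroutine names) and a `text` (opaque words), memory is a dictionary keyed by address, a transfer instruction is modelled as the tuple `("BR", address)`, and a subroutine's entry point is taken to be the first word after its own transfer vector.

```python
def bss_load(programs, main, base=0):
    """Load `main` and every subroutine it needs; fill in transfer vectors."""
    memory = {}
    load_addr = {}
    next_free = base
    pending = [main]
    while pending:
        name = pending.pop(0)
        if name in load_addr:
            continue  # already loaded
        prog = programs[name]
        vector, text = prog["vector"], prog["text"]
        # Allocation: the program length tells the loader how much to reserve.
        load_addr[name] = next_free
        for i, callee in enumerate(vector):
            memory[next_free + i] = callee        # still a symbolic name
        for i, word in enumerate(text):
            memory[next_free + len(vector) + i] = word
        next_free += len(vector) + len(text)
        # Each name in the transfer vector identifies a subroutine to load.
        pending.extend(vector)

    # Linking: replace each name in a transfer vector with a transfer
    # instruction to the entry point of the corresponding subroutine.
    for name, start in load_addr.items():
        for i, callee in enumerate(programs[name]["vector"]):
            entry = load_addr[callee] + len(programs[callee]["vector"])
            memory[start + i] = ("BR", entry)
    return memory, load_addr


# Hypothetical programs: MAIN calls SQRT through vector slot 0.
PROGRAMS = {
    "MAIN": {"vector": ["SQRT"], "text": ["CALL slot 0", "HALT"]},
    "SQRT": {"vector": [], "text": ["COMPUTE", "RETURN"]},
}
```

With these inputs MAIN is loaded at 0 and SQRT at 3, and location 0 (the first transfer vector slot) ends up holding `("BR", 3)`, so a call through that slot reaches SQRT.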

Relocating Loader
Relocation Bits
• For a fixed-length, direct-address
instruction format, the assembler
associates a bit with each instruction or
address field.
• If this bit is 0: relocation not needed
• If this bit is 1: relocation needed
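A minimal sketch of how relocation bits might be applied, assuming one bit per word and assuming that relocating a word simply means adding the load address to it (the function name `relocate` is hypothetical):

```python
def relocate(words, relocation_bits, load_address):
    """Add load_address to every word whose relocation bit is 1."""
    if len(words) != len(relocation_bits):
        raise ValueError("need exactly one relocation bit per word")
    return [
        word + load_address if bit else word
        for word, bit in zip(words, relocation_bits)
    ]
```

For instance, `relocate([5, 12, 0], [0, 1, 1], 100)` leaves the first word unchanged and returns `[5, 112, 100]`.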

Functions of loader
• Relocation Bits: for relocation purpose
• Transfer Vector: for Linking Purpose
• Program Length Information: for allocation
purpose

Advantages of relocating loader
• It avoids reassembling of all subroutines
when a single subroutine changes.
• All 4 functions, i.e. allocation, linking,
relocation and loading, are performed by
the loader.
• It allows independent translation of each
module.

Disadvantages
• The use of transfer vector increases the
size of the object program.

Bootstrap Loader
• Alternatively referred to as bootstrapping, boot loader, or
boot program, a bootstrap loader is a program that resides in
the computer's EPROM, ROM, or other non-volatile memory
and is automatically executed by the processor when the computer
is turned on.
• The bootstrap loader reads the hard drive's boot sector to
continue the process of loading the computer's operating
system.

• It is a special type of absolute loader that is executed
first when computer is turned on.
• It is a program of small size loaded by the BIOS at
system start up.
• BIOS does not specify any information about the
environment an operating system needs, and thus is
not able to initialize a system.
• It is the responsibility of Bootstrap loader to load the
code and build an appropriate operating environment.

• Bootstrap programs are capable of performing
various tasks such as initializing some
hardware components, putting the processor into
advanced operating modes, or carrying out a
dedicated processing function.

Machine Independent Features of Loaders

Automatic Library Search
• Automatic library call
– The programmer does not need to take any action beyond mentioning
the subroutine names as external references
• Solution
1 Enter the symbols from each Refer record into ESTAB
2 When the definition is encountered (Define record), the address is
assigned
3 At the end of Pass 1, the symbols in ESTAB that remain undefined
represent unresolved external references
4 The loader searches the libraries specified (or standard) for undefined
symbols or subroutines

Automatic Library Search (Cont.)
• The library search process may be repeated
– Since the subroutines fetched from a library may themselves contain
external references
• Programmer defined subroutines have higher priority
– The programmer can override the standard subroutines in the library by
supplying their own routines
• Library structures
– Assembled or compiled versions of the subroutines in a library can be
structured using a directory that gives the name of each routine and a
pointer to its address within the library
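The search procedure, including the repeated pass for references made by fetched subroutines, can be sketched as follows. The data shapes are assumptions: ESTAB is modelled as a dictionary from symbol to address (`None` while undefined), and the library directory maps each routine name to a hypothetical record holding its `length` and the external names it `refers` to.

```python
def resolve_from_library(estab, library, next_free):
    """Fetch undefined ESTAB symbols from the library until none can be found."""
    estab = dict(estab)  # work on a copy
    while True:
        # Symbols still undefined after Pass 1 that the library can supply.
        # Symbols the programmer already defined are never looked up, so
        # programmer-supplied routines override library routines.
        fetchable = [s for s, a in estab.items() if a is None and s in library]
        if not fetchable:
            break
        for sym in fetchable:
            member = library[sym]
            estab[sym] = next_free          # assign a load address
            next_free += member["length"]
            # A fetched routine may itself contain external references,
            # which is why the search is repeated.
            for ref in member["refers"]:
                estab.setdefault(ref, None)
    unresolved = sorted(s for s, a in estab.items() if a is None)
    return estab, unresolved


# Hypothetical example: READ is supplied by the programmer, SQRT is not.
ESTAB = {"MAIN": 0x00, "SQRT": None, "READ": 0x50, "PLOT": None}
LIBRARY = {
    "SQRT": {"length": 0x20, "refers": ["ABS"]},
    "ABS": {"length": 0x10, "refers": []},
    "READ": {"length": 0x30, "refers": []},
}
```

In this example SQRT is fetched first, its reference to ABS triggers a second pass, READ keeps the programmer's address, and PLOT is reported as unresolved because the library has no such routine.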

Loader Options
• Many loaders have a special command language that is used to specify
options. The commands may be given in:
– a separate input file
– the source program
– the primary input stream, embedded between programs
• Command Language
– specifying alternative sources of input
• INCLUDE program-name(library-name)
– changing or deleting external reference
• DELETE name
• CHANGE symbol1, symbol2
– controlling the automatic library search
• LIBRARY MYLIB

Loader Options (cont.)
– specify that some references not be resolved
• NOCALL name
– specify the location at which execution is to begin
• Example
– If we would like to evaluate the use of READ and
WRITE instead of RDREC and WRREC, for a
temporary measure, we use the following loader
commands
• INCLUDE READ(UTLIB)
• INCLUDE WRITE(UTLIB)
• DELETE RDREC, WRREC
• CHANGE RDREC, READ
• CHANGE WRREC, WRITE
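The effect of these commands can be sketched with a small interpreter. This is only an illustration of the idea: the function name, the list-based model of loaded control sections and external references, and the main program name `COPY` are assumptions, and the library named in INCLUDE is parsed but not actually searched.

```python
def apply_loader_commands(commands, sections, references):
    """Apply INCLUDE / DELETE / CHANGE commands to a simplified program model."""
    sections = list(sections)       # control sections that will be loaded
    references = list(references)   # external references made by the program
    for line in commands:
        op, _, rest = line.strip().partition(" ")
        args = [a.strip() for a in rest.split(",")]
        if op == "INCLUDE":
            # INCLUDE program-name(library-name): add the named section.
            name, _, _library = args[0].partition("(")
            sections.append(name)
        elif op == "DELETE":
            # DELETE name[, name...]: drop the named sections.
            sections = [s for s in sections if s not in args]
        elif op == "CHANGE":
            # CHANGE symbol1, symbol2: redirect references to symbol1.
            old, new = args
            references = [new if r == old else r for r in references]
        else:
            raise ValueError(f"unsupported loader command: {op}")
    return sections, references


COMMANDS = [
    "INCLUDE READ(UTLIB)",
    "INCLUDE WRITE(UTLIB)",
    "DELETE RDREC, WRREC",
    "CHANGE RDREC, READ",
    "CHANGE WRREC, WRITE",
]
```

Starting from sections `["COPY", "RDREC", "WRREC"]` and references `["RDREC", "WRREC"]`, the commands leave the sections `["COPY", "READ", "WRITE"]` and the references `["READ", "WRITE"]`, without reassembling the main program.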

Machine Dependent Loader Features

Relocating Loaders
• Motivation
– Efficient sharing of a machine with a larger memory,
where several independent programs are to be run
together
– Support the use of subroutine libraries efficiently
• Two methods for specifying relocation
– Modification record
– Relocation bit
• Each instruction is associated with one relocation bit
• The relocation bits in a Text record are gathered into bit
masks

Program Linking
• Goal
– Resolve the problems with EXTREF and EXTDEF from
different control sections (sec 2.3.5)
• Example
– Program in Fig. 3.8 and object code in Fig. 3.9
– Use modification records for both relocation and linking
• Address constant
• External reference
