Distributed Computing
EG 3113 CT Diploma in Computer Engineering
5th Semester
Unit 2.1 Uni-Processor Architecture
Lecture by : Er. Ashish K.C(Khatri)
Uniprocessor Architecture:
• A uniprocessor is a system with a single processor, which has three major
components:
1. CPU (sketched in the code below):
- a set of general-purpose registers along with the program counter.
- a special-purpose CPU status register for storing the current state of the CPU and
the program under execution.
- one ALU and one local cache memory.
2. Main memory
3. Input/Output system
• In addition, there is a common synchronous bus architecture for communication
between the CPU, main memory and the I/O system.
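The CPU components listed above can be pictured as a small data structure. Below is a minimal illustrative sketch in C; the sizes and field names are arbitrary and chosen only for the example, not taken from any real processor.

```c
#include <stdint.h>
#include <stdio.h>

#define NUM_GPR     16          /* number of general-purpose registers (arbitrary) */
#define CACHE_LINES 64          /* number of local cache lines (arbitrary)         */

/* Illustrative model of the CPU components named above: register file,
 * program counter, status register, and a small local cache. */
struct cpu_state {
    uint32_t gpr[NUM_GPR];                /* general-purpose registers   */
    uint32_t pc;                          /* program counter             */
    uint32_t status;                      /* CPU status (flags) register */
    uint8_t  cache[CACHE_LINES][64];      /* local cache lines           */
};

/* The ALU is combinational logic; in a software model it is just a
 * function over two register operands. */
static uint32_t alu_add(uint32_t a, uint32_t b) { return a + b; }

int main(void)
{
    struct cpu_state cpu = {0};
    cpu.gpr[1] = 40;
    cpu.gpr[2] = 2;
    cpu.gpr[0] = alu_add(cpu.gpr[1], cpu.gpr[2]);   /* r0 = r1 + r2                 */
    cpu.pc += 4;                                    /* advance past the instruction */
    printf("r0 = %u, pc = %u\n", cpu.gpr[0], cpu.pc);
    return 0;
}
```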
Example: (figure shown on the original slide; not reproduced here)
Parallel processing mechanism:
• Parallelism in a uniprocessor means a system with a single processor performing
two or more tasks simultaneously.
• Parallelism can be achieved by two means: hardware and software.
• Parallelism increases efficiency and reduces processing time.
Hardware approach :
1. Multiplicity of functional units
2. Parallelism and pipelining within the CPU
3. Overlapped CPU and I/O operations
4. Use of a hierarchical memory system
5. Balancing of subsystem bandwidths
Multiplicity of functional units:
• In earlier computers, the CPU contained only one arithmetic logic unit, which could
perform only one operation at a time.
• This slows down the execution of a long sequence of arithmetic instructions.
• To overcome this, the number of functional units in the CPU can be increased so
that arithmetic operations are performed in parallel and simultaneously.
Parallelism and Pipelining within the CPU
• Parallel adders can be implemented using techniques such as carry-lookahead and
carry-save (a sketch of the carry-lookahead idea follows this list).
• A parallel adder is a digital circuit that adds two multi-bit binary numbers by
operating on all corresponding pairs of bits at the same time, rather than one bit
after another.
• The multiplier can be recoded to reduce the number of partial products and so
eliminate more complex calculations.
• The various instruction execution phases are pipelined, and techniques such as
instruction prefetch and data buffering are used to keep overlapped instruction
execution running smoothly.
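A minimal C sketch of the carry-lookahead idea mentioned above. It models a 4-bit adder in software, so the carries are computed in a loop here; real hardware produces all of them at once from the generate and propagate signals.

```c
#include <stdint.h>
#include <stdio.h>

/* 4-bit carry-lookahead adder model.  Generate (g) and propagate (p)
 * signals let every carry be derived from the inputs and the carry-in,
 * instead of rippling bit by bit through the adder. */
static uint8_t cla_add4(uint8_t a, uint8_t b, uint8_t cin, uint8_t *cout)
{
    uint8_t g = a & b;      /* generate:  g_i = a_i AND b_i */
    uint8_t p = a ^ b;      /* propagate: p_i = a_i XOR b_i */
    uint8_t c = cin & 1u;   /* carry into bit 0             */
    uint8_t sum = 0;

    for (int i = 0; i < 4; i++) {
        sum = (uint8_t)(sum | ((((p >> i) & 1u) ^ c) << i));
        /* c_{i+1} = g_i OR (p_i AND c_i); hardware evaluates all four of
         * these expressions in parallel from g, p and cin. */
        c = (uint8_t)(((g >> i) & 1u) | (((p >> i) & 1u) & c));
    }
    *cout = c;
    return (uint8_t)(sum & 0xF);
}

int main(void)
{
    uint8_t cout;
    uint8_t s = cla_add4(0xB, 0x6, 0, &cout);         /* 11 + 6 = 17          */
    printf("sum = 0x%X, carry out = %u\n", s, cout);  /* sum = 0x1, carry = 1 */
    return 0;
}
```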
Overlapped CPU and I/O Operation
• To execute I/O operations in parallel with CPU operations, I/O controllers or
dedicated I/O processors can be used.
• For direct data transfer between an I/O device and main memory, direct memory
access (DMA) can be used, leaving the CPU free to keep executing instructions
(a sketch follows).
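A hedged sketch of how a driver might program a DMA transfer. The register addresses and bit layout below are invented for the illustration; a real controller's datasheet defines the actual interface.

```c
#include <stdint.h>

/* Hypothetical memory-mapped DMA controller registers (made up for this
 * sketch; not from any real device). */
#define DMA_SRC   (*(volatile uint32_t *)0x40001000u)   /* device data port    */
#define DMA_DST   (*(volatile uint32_t *)0x40001004u)   /* main-memory address */
#define DMA_LEN   (*(volatile uint32_t *)0x40001008u)   /* bytes to transfer   */
#define DMA_CTRL  (*(volatile uint32_t *)0x4000100Cu)   /* bit 0 = start       */
#define DMA_STAT  (*(volatile uint32_t *)0x40001010u)   /* bit 0 = done        */

void read_block_via_dma(uint32_t device_port, void *buffer, uint32_t nbytes)
{
    DMA_SRC  = device_port;
    DMA_DST  = (uint32_t)(uintptr_t)buffer;
    DMA_LEN  = nbytes;
    DMA_CTRL = 1u;                      /* start the transfer */

    /* The controller now moves data into main memory on its own; the CPU
     * is free to run other instructions.  Completion is usually signalled
     * by an interrupt; this sketch simply polls the status bit. */
    while ((DMA_STAT & 1u) == 0u)
        ;                               /* ...or do useful work here instead */
}
```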
Use of a Hierarchical Memory System
• The processing speed of the CPU is far greater (often quoted as on the order of
1000 times faster) than the speed of memory access, and this gap slows overall
processing.
• To bridge this speed gap, a hierarchical memory system can be used.
• The fastest accessible storage is the registers in the CPU, followed by the cache
memory, which buffers data between the CPU and main memory (the effect is
illustrated below).
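A small C experiment that makes the point concrete: summing a matrix row by row keeps accesses in cache, while summing it column by column does not, so the second sweep is noticeably slower on typical machines (the exact numbers depend on the hardware).

```c
#include <stdio.h>
#include <time.h>

#define N 1024
static double a[N][N];

/* Row-major traversal touches consecutive addresses, so most accesses hit
 * the cache; column-major traversal jumps N*sizeof(double) bytes each step
 * and misses far more often.  The timing gap shows why the register ->
 * cache -> main memory hierarchy matters. */
static double sweep(int by_rows)
{
    double sum = 0.0;
    clock_t t0 = clock();
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += by_rows ? a[i][j] : a[j][i];
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    return secs + 0.0 * sum;   /* use 'sum' so the loops aren't optimized away */
}

int main(void)
{
    printf("row-major sweep:    %.3f s\n", sweep(1));
    printf("column-major sweep: %.3f s\n", sweep(0));
    return 0;
}
```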
Balancing of Subsystem Bandwidths
• The processing and access times of the CPU, main memory, and I/O devices are
different.
• Arranged in descending order, these times satisfy: td > tm > tp,
• where td is the access time of the I/O devices,
tm is the access time of the main memory, and
tp is the processing time of the CPU.
• The I/O devices are thus slower than both main memory and the processing unit;
the CPU is the fastest unit.
• To balance the speed of the CPU and memory, a fast cache memory can be used to
buffer information between them.
• To balance the bandwidth between memory and the I/O devices, I/O channels of
different speeds can be used between main memory and the I/O devices.
Software approach for parallelism:
1. Multiprogramming
2. Time-sharing
Multiprogramming:
• There may be multiple processes active in a computer, some competing for memory,
some for I/O devices, and some for the CPU.
• So, to balance these demands, program interleaving must be practiced.
• This boosts resource utilization by overlapping I/O and CPU operations.
• Program interleaving means that while a process P1 is engaged in an I/O operation,
the process scheduler can switch the CPU to run a process P2.
• This lets processes P1 and P2 make progress simultaneously (a sketch follows).
• This interleaving of CPU and I/O activity is called multiprogramming.
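A minimal sketch of the overlap described above, using two POSIX threads to stand in for P1 and P2: one activity waits on simulated I/O while the other keeps the CPU busy, so the total wall-clock time is roughly the longer of the two rather than their sum. The workload sizes are made-up values for illustration.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* P1: I/O-bound activity, modelled here by sleeping for 2 seconds. */
static void *io_bound(void *arg)
{
    (void)arg;
    sleep(2);
    return NULL;
}

/* P2: CPU-bound activity that runs while P1 is waiting on its "device". */
static void *cpu_bound(void *arg)
{
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 200000000UL; i++)
        x += i;
    (void)arg;
    return NULL;
}

int main(void)
{
    pthread_t p1, p2;
    pthread_create(&p1, NULL, io_bound, NULL);
    pthread_create(&p2, NULL, cpu_bound, NULL);
    pthread_join(p1, NULL);
    pthread_join(p2, NULL);
    /* Wall-clock time is roughly max(I/O wait, compute time), not their
     * sum: the CPU stays busy while the I/O-bound activity waits. */
    puts("both activities finished");
    return 0;
}
```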
Time-sharing:
• Multiprogramming is based on the concept of time-sharing.
• The CPU time is shared among multiple programs.
• Sometimes a high-priority program can hold the CPU for a long period, starving
the other processes in the computer.
• Time-sharing assigns a fixed or variable time slice of CPU time to each of the
processes in the computer (a round-robin sketch follows).
• This gives all the processes in the computer an equal opportunity to run.
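A toy round-robin simulation of the fixed time-slice idea described above. The process names, burst times, and quantum are made-up values for illustration.

```c
#include <stdio.h>

/* Toy round-robin simulation: each process receives a fixed time slice
 * (quantum) in turn until its remaining burst time reaches zero. */
int main(void)
{
    const char *name[] = {"P1", "P2", "P3"};
    int remaining[]    = {5, 3, 8};           /* CPU time still needed  */
    const int quantum  = 2;                   /* fixed slice per turn   */
    int n = 3, left = 3, clock = 0;

    while (left > 0) {
        for (int i = 0; i < n; i++) {
            if (remaining[i] <= 0)
                continue;
            int run = remaining[i] < quantum ? remaining[i] : quantum;
            printf("t=%2d: %s runs for %d unit(s)\n", clock, name[i], run);
            clock += run;
            remaining[i] -= run;
            if (remaining[i] == 0)
                left--;                       /* this process has finished */
        }
    }
    return 0;
}
```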
CISC architecture:
• CISC stands for Complex Instruction Set Computer.
• This processor provides a large collection of instructions, ranging from simple to
complex.
• These instructions are specified at the assembly-language level, and their
execution takes more time.
• A complex instruction set computer is a computer in which a single instruction can
perform numerous low-level operations (such as a load from memory, an arithmetic
operation, and a memory store) or can invoke multi-step processes or addressing
modes, as the name "Complex Instruction Set" suggests.
• So, this design aims to decrease the number of instructions per program while
disregarding the number of cycles taken by each instruction.
• CISC computers have small programs.
• They have a large number of compound instructions, which take a long time to
execute.
• A single instruction is executed in several steps; an instruction set may contain
more than 300 separate instructions.
• Most instructions finish in two to ten machine cycles.
• In CISC, instruction pipelining is not easily implemented.
Characteristics of CISC:
• Complex instructions, hence complex instruction decoding.
• Instructions are larger than one word.
• An instruction may take more than a single clock cycle to execute.
• Fewer general-purpose registers, since operations can be performed directly on
memory.
• Complex addressing modes.
• More data types.
Disadvantages of CISC:
• Typically only about 20% of the available instructions are actually used in a
program.
• Compared with a RISC processor, a CISC processor is slower at executing each
instruction cycle of a program.
• This processor uses more transistors than a RISC processor.
• Pipelined execution is difficult to implement in a CISC design.
• Machine performance suffers because of the lower clock speed.
RISC architecture:
• RISC stands for Reduced Instruction Set Computer.
• It is a CPU design approach based on simple instructions that execute quickly.
• It provides a small, reduced set of instructions.
• Each instruction is expected to accomplish only a very small job.
• The instructions are modest and simple, and more complex operations are composed
from them.
• Each instruction is of the same length; instructions are strung together to get
compound tasks done.
• Most instructions are completed in one machine cycle.
• Pipelining is a crucial technique used to speed up RISC machines (a comparison
with the CISC style is sketched below).
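A hedged illustration of the contrast with CISC described above: the same C statement and, in the comments, roughly how it might be encoded as a single memory-to-memory style CISC instruction versus a RISC load/operate/store sequence. The instruction spellings are illustrative, not taken from any particular compiler's output.

```c
/* The same work expressed in C; the comments sketch how the two styles
 * might encode it.  Instruction spellings below are illustrative only. */
void add_in_place(int *counter, int delta)
{
    *counter += delta;

    /* CISC style (x86-like): one instruction that reads memory, adds,
     * and writes the result back:
     *     add  [counter], delta_reg
     *
     * RISC style (RISC-V-like): explicit load/operate/store, each a
     * simple fixed-length instruction that typically takes one cycle:
     *     lw   t0, 0(a0)        ; load *counter into a register
     *     add  t0, t0, a1       ; add the delta
     *     sw   t0, 0(a0)        ; store the result back to memory
     */
}
```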
Characteristics of RISC:
• Simpler instructions, hence simple instruction decoding.
• Instructions fit within one word.
• An instruction typically takes a single clock cycle to execute.
• More general-purpose registers.
• Simple addressing modes.
• Fewer data types.
• Pipelining can be achieved easily.
Disadvantages of RISC:
• The performance of the processor may vary with the executed code, because a later
instruction may depend on an earlier instruction that has not yet completed its
cycle in the pipeline.
• Complex operations must be built up from many simple instructions by compilers and
programmers.
• These processors need very fast memory to keep instructions flowing, and so they
use a large cache memory to respond to instruction fetches quickly.
Differences between RISC and CISC:
(The comparison is presented as a table on the original slide; the key differences
follow from the CISC and RISC characteristics listed above.)
