This document discusses computer organization and architecture. It covers fundamental concepts such as the program counter, instruction register, and control signals, and describes the five steps of instruction processing: fetch, decode, execute, memory access, and writeback. The hardware components of a basic processing unit, such as the register file, ALU, and datapath, are explained, along with different instruction types: load, arithmetic, store, and branch. Finally, it introduces parallel computer architectures, Flynn's taxonomy, memory organization models, and parallelism techniques such as simultaneous multithreading and multicore processors.
The document discusses computer architecture and the Von Neumann architecture. It describes:
- The main components of a CPU including registers for temporary storage, buses for transmitting data/instructions, and functional units like the ALU.
- The fetch-execute cycle where the control unit fetches instructions and data from memory, decodes and executes the instructions using functional units, and writes results back to memory.
- The differences between RISC and CISC architectures, where RISC uses simpler instructions that can execute in one clock cycle while CISC incorporates complex operations into single instructions.
- Key components like the program counter, memory address register, and accumulator.
- The Von Neumann architecture, in which the CPU uses a single memory for both instructions and data.
The document describes the von Neumann architecture, including its main components: main memory, arithmetic logic unit (ALU), control unit, CPU registers, and I/O equipment. The CPU consists of registers like the program counter, instruction register, and memory address register. The control unit interprets instructions and causes them to execute. Main memory stores both instructions and data, while the ALU performs arithmetic operations. I/O equipment is controlled by the control unit to input and output data.
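The fetch-decode-execute cycle described above can be sketched as a tiny simulator. This is a minimal illustration, not any particular machine: the three-instruction ISA, opcode names, and memory layout below are invented for the example.

```python
# Minimal sketch of a von Neumann machine with an invented 3-instruction ISA.
# Memory holds both instructions and data; the loop below plays the role of
# the control unit: fetch via PC, decode the opcode, execute with the ALU
# and accumulator.

memory = [
    ("LOAD", 4),   # 0: ACC <- memory[4]
    ("ADD", 5),    # 1: ACC <- ACC + memory[5]
    ("STORE", 6),  # 2: memory[6] <- ACC
    ("HALT", 0),   # 3: stop
    7,             # 4: data
    35,            # 5: data
    0,             # 6: result is written here
]

pc, acc = 0, 0
while True:
    opcode, operand = memory[pc]   # fetch: instruction addressed by the PC
    pc += 1                        # PC now points at the next instruction
    if opcode == "LOAD":           # decode + execute
        acc = memory[operand]
    elif opcode == "ADD":
        acc += memory[operand]
    elif opcode == "STORE":
        memory[operand] = acc
    elif opcode == "HALT":
        break

print(memory[6])  # 42
```

Because instructions and data share one memory, a program could in principle overwrite its own instructions, which is exactly the stored-program property the summaries describe.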
Design & Simulation of RISC Processor using Hyper Pipelining Technique (IOSR Journals)
This hyper pipelining technique differs from the pipelined instruction decoding known from RISC processors. The point is that hyper pipelining can be applied on top of any sequential logic, for example a RISC processor, independent of its underlying functionality. A RISC processor with pipelined instruction decoding can automatically be hyper pipelined to generate CMF individual RISC processors. Hyper pipelining implements additional registers and can use register balancing for fine-grained timing optimization. The method is also called “C-slow retiming”. Its main benefit is multiplying the core's functionality merely by implementing registers, which is a great advantage for ASICs and especially attractive for FPGAs with their already existing registers.
Embedded system and computer architecture (madhulbtech23)
This document discusses a simple CPU design, including instruction fetch, addition, and store instructions. It describes the fetch cycle, in which the program counter addresses the instruction in memory and the instruction is copied to the instruction register. The addition instruction takes its operands from registers, performs the addition, and stores the result in a register. The store instruction uses the contents of one register as a memory address and copies the contents of another register to that memory location.
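The addition and store behaviours summarized above can be sketched as register-transfer steps. The register names, their contents, and the memory layout below are illustrative, not taken from the original design.

```python
# Illustrative register-transfer sketch of the two instructions above.
registers = {"R1": 10, "R2": 32, "R3": 0, "R4": 100}  # R4 holds an address
memory = {100: 0}

# ADD R3, R1, R2 -- operands come from registers, the sum goes to a register
registers["R3"] = registers["R1"] + registers["R2"]

# STORE R3, (R4) -- the contents of one register (R4) are used as the
# memory address, and the contents of another register (R3) are copied
# to that memory location
memory[registers["R4"]] = registers["R3"]

print(registers["R3"], memory[100])  # 42 42
```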
The document provides information about the structure and components of a computer system. It discusses the main parts including the processor, memory, input/output devices, and how they communicate via buses. It explains that the processor uses address, data, and control buses to read from and write to memory locations. The document also discusses different types of memory like registers, cache, main memory, and backing storage and how they differ in speed and capacity. It covers concepts like addressability that allow the processor to access any part of the computer.
Computer architecture and the fetch-execute cycle by ZAK (Tabsheer Hasan)
The document describes the Von Neumann architecture and the fetch-execute cycle of a CPU. It explains that Von Neumann introduced the concept of storing both instructions and data in memory. This allowed programs to be changed by modifying memory rather than rewiring the computer. It then outlines the registers involved in the fetch-execute cycle, including the program counter, instruction register, and accumulator. The cycle fetches an instruction from memory, decodes it, executes it, and then resets to fetch the next instruction, repeating continuously. The cycle provides the fundamental processing model for modern CPU design.
1.3.2 Computer architecture and the fetch-execute cycle by ZAK (Tabsheer Hasan)
The document describes the Von Neumann architecture and the fetch-execute cycle of a CPU. It explains that Von Neumann introduced the concept of storing both instructions and data in memory. This allowed programs to be changed by modifying memory rather than rewiring the computer. It then outlines the registers involved in the fetch-execute cycle, including the program counter, instruction register, and accumulator. The cycle fetches an instruction from memory, decodes it, executes it, and then resets to fetch the next instruction, repeating continuously. The cycle provides the fundamental operation of a CPU.
A Review on Analysis of 32-bit and 64-bit RISC Processors (IRJET Journal)
This document provides a review and comparison of 32-bit and 64-bit RISC processors. It discusses the system architectures of 32-bit and 64-bit RISC processors, including their instruction sets, registers, arithmetic logic units, control units, and flag registers. It also summarizes previous research comparing the performance of 16-bit and 32-bit RISC processors in terms of power consumption, operating frequency, and delay. The document aims to analyze and compare implementation models and operational elements such as acceleration and power dissipation between 32-bit and 64-bit RISC processors.
Here are the steps to launch Microsoft Visual Studio 2008 and create a project to edit the program P 2-1:
1. Launch Microsoft Visual Studio 2008.
2. Click "File" -> "New" -> "Project".
3. In the "New Project" dialog box, select "Visual C++" in the left pane and "Win32 Console Application" in the middle pane.
4. Click "OK".
5. In the "Win32 Application Wizard" dialog box, enter a name for the project (e.g. "AssemblyProgram") and click "OK".
6. This will create a new empty project in Visual Studio.
7. Right click on the
The CPU is the central processing unit of a computer and consists of three main parts - the control unit, register set, and ALU. The control unit directs operations between the register set and ALU. The register set stores intermediate data and the ALU performs arithmetic and logic operations. The CPU follows a fetch-execute cycle where it fetches instructions from memory and stores them in the instruction register before executing them. Common instruction types include processor-memory operations, I/O operations, data processing, and control operations.
An Enhanced FPGA Based Asynchronous Microprocessor Design Using VIVADO and ISIM (journal BEEI)
This paper presents the design and implementation of an asynchronous microprocessor using HDL in the Vivado tool; the design can handle I-Type, R-Type, and jump instructions with a multiplier instruction packet. It uses separate memories for instructions and data, whose contents can be changed at any time. The complete design has been synthesized and simulated using Vivado and targeted at a Xilinx Virtex-7 FPGA. The paper focuses on the use of the Vivado tool for advanced FPGA devices; Vivado provides enhanced analysis results for a better view of the routed and placed design.
The document discusses machine structure and system programming. It begins with an overview of system software components like assemblers, loaders, macros, compilers and formal systems. It then describes the general machine structure including CPU, memory and I/O channels. Specific details are provided about the IBM 360 machine structure including its memory, registers, data, instructions and special features. Machine language and different approaches to writing machine language programs are also summarized.
This document discusses the history and characteristics of CISC and RISC architectures. It describes how CISC architectures were developed in the 1950s-1970s to address hardware limitations at the time by allowing instructions to perform multiple operations. RISC architectures emerged in the late 1970s-1980s as hardware improved, focusing on simpler instructions that could be executed faster through pipelining. Common RISC and CISC processors used commercially are also outlined.
8-bit Microprocessor with Single Vectored Interrupt (Hardik Manocha)
The SoC consists of instruction memory, main memory, and a microprocessor unit. Instructions are fetched using the PC, and depending on the instruction, main memory and register memory are accessed. An 8-bit data bus is built, and programs are being developed to verify microprocessor operation.
Please send the answers to my email. Mirre06@hotmail.com Someone se.pdf (ebrahimbadushata00)
Please send the answers to my email. Mirre06@hotmail.com
Someone sent me wrong answers so please send me correct answers thanks.
1) What is a register? Be precise. Name at least two components in the LMC that meet the
qualifications for a register. Name several different kinds of values that a register might hold.
2) Suppose that the following instructions are found at the given locations in memory:

Location   Contents
20         LDA 50
21         ADD 51
50         724
51         006
a. Show the contents of the IR, the PC, the MAR, the MDR, and A at the conclusion of
instruction 20.
b. Show the contents of each register as each step of the fetch–execute cycle is performed for
instruction 21.
3) What is the purpose of the instruction register? What takes the place of the instruction
register in the LMC?
4) Why does programmed I/O not work very well when the I/O device is a hard disk or a
graphics display?
5) The x86 series is an example of a CPU architecture. As you are probably aware, there are a
number of different chips implementing the x86 architecture. What word describes the difference
between the various CPUs that share the same architecture? Name at least one different CPU
architecture.
Solution
1) The Little Man Computer (LMC) is an instructional model of a computer, created by Dr. Stuart
Madnick in 1965. The LMC is generally used to teach students because it models a simple von
Neumann architecture computer, which has all of the basic features of a modern computer. It
can be programmed in machine code (albeit in decimal rather than binary) or assembly code.
Register:
In a computer, a register is one of a small set of data-holding places that are part of a computer
processor. A register may hold a computer instruction, a storage address, or any kind of data
(such as a bit sequence or individual characters). Some instructions specify registers as part of
the instruction. For example, an instruction may specify that the contents of two defined registers
be added together and then placed in a specified register. A register must be large enough to hold
an instruction; for example, in a 32-bit instruction computer, a register must be 32 bits in length.
In some computer designs, there are smaller registers, for example half-registers, for shorter
instructions. Depending on the processor design and language rules, registers may be numbered
or have arbitrary names.
Registers are:
- Small, permanent storage locations within the CPU used for a particular purpose
- Manipulated directly by the Control Unit
- Wired for a specific function
- Sized in bits or bytes (not in MB like memory)
- Able to hold data, an address, or an instruction

Use of registers:
- Scratchpad for the currently executing program
- Holding data needed quickly or frequently
- Storing information about the status of the CPU and the currently executing program
- Holding the address of the next program instruction
- Holding signals from external devices

General-purpose registers:
- User-visible registers
- Hold intermediate results or data values
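The register trace asked for in question 2 can be worked through with a small fetch-execute sketch. The numeric opcodes assumed here (5xx = LDA, 1xx = ADD) follow the standard decimal LMC-style encoding, which is an assumption about the course's variant; the memory contents are those given in the question.

```python
# Fetch-execute trace for the question's memory image, assuming a decimal
# LMC-style encoding where LDA 50 = 550 and ADD 51 = 151.
memory = {20: 550, 21: 151, 50: 724, 51: 6}

pc, a = 20, 0
for _ in range(2):
    mar = pc                         # MAR <- PC
    mdr = memory[mar]                # MDR <- memory[MAR]
    ir = mdr                         # IR <- MDR
    pc += 1                          # PC is incremented during the fetch
    opcode, operand = divmod(ir, 100)
    mar = operand                    # operand address placed on the MAR
    mdr = memory[mar]                # operand read into the MDR
    if opcode == 5:                  # LDA: A <- memory[operand]
        a = mdr
    elif opcode == 1:                # ADD: A <- A + memory[operand]
        a += mdr
    print(f"IR={ir:03d} PC={pc} MAR={mar} MDR={mdr:03d} A={a}")
```

Under these assumptions the trace prints `IR=550 PC=21 MAR=50 MDR=724 A=724` at the end of instruction 20, and `IR=151 PC=22 MAR=51 MDR=006 A=730` at the end of instruction 21.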
MICROPROGRAMMED
-> CONTROL UNIT IMPLEMENTATION
-> What Is a Control Unit?
-> Hardwire
-> Microprogrammed
-> Hardwire Vs Microprogrammed
-> Microprogrammed Control Unit
-> Micro-Operations
-> Control Memory
-> Microprogrammed Control Organization
-> Microprogram Routines
-> Conditional Branching
-> Mapping Of Instruction
-> Address Sequencing
-> Micro-Program Example
-> Micro-Instruction Format
-> Microinstruction Fields
-> Symbolic Microinstruction
This document discusses different addressing modes and RISC and CISC microprocessors. It defines eight addressing modes: register, register indirect, immediate, direct, indirect, implicit, relative, and index addressing modes. It provides examples for each mode. The document also defines RISC and CISC architectures, noting that RISC uses simple instructions that perform in one clock cycle while CISC uses more complex instructions that can perform multiple operations. It compares the two approaches using multiplying two numbers as an example.
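The addressing modes listed above differ only in how the operand is resolved from the instruction's address field. The sketch below illustrates several of them; the register names, memory contents, and mode labels are invented for the example.

```python
# How an operand is resolved under several of the addressing modes above.
registers = {"R1": 30, "PC": 10}
memory = {20: 7, 30: 40, 40: 99, 15: 55}

def operand(mode, field):
    if mode == "immediate":          # the field itself is the operand
        return field
    if mode == "register":           # the operand is in a register
        return registers[field]
    if mode == "direct":             # the field is the operand's address
        return memory[field]
    if mode == "indirect":           # the field is the address of the address
        return memory[memory[field]]
    if mode == "register_indirect":  # a register holds the operand's address
        return memory[registers[field]]
    if mode == "relative":           # the field is an offset from the PC
        return memory[registers["PC"] + field]
    raise ValueError(f"unknown mode: {mode}")

print(operand("immediate", 7))            # 7
print(operand("direct", 20))              # 7
print(operand("indirect", 30))            # 99  (memory[30]=40, memory[40]=99)
print(operand("register_indirect", "R1")) # 40  (R1=30, memory[30]=40)
print(operand("relative", 5))             # 55  (memory[10+5])
```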
This document provides an overview of an embedded systems course that focuses on the LPC 2148 ARM processor. The objectives are to study the architecture and design aspects of the LPC 2148, including I/O and memory interfacing. The outcomes include designing and implementing programs on the LPC 2148 as well as studying communication interfaces and scheduling algorithms. The course is divided into 5 modules that cover the ARM instruction set, LPC 2148 architecture, peripherals, operating system overview, and the μC/OS-II real-time kernel. Learning resources include textbooks on embedded systems, ARM architecture, and real-time concepts.
4-bit PC report [cse 08-section-b2_group-02] (shibbirtanvin)
The document describes the design and implementation of a 4-bit very simple computer system as an assignment. Key aspects of the design include a 2-stage pipeline with separate fetch and execution units, Harvard architecture with separate instruction and data memory, and a microprogrammed control unit. The computer is designed to execute 28 instructions from an assigned instruction set in an efficient manner using as few clock cycles and chips as possible.
This document provides an overview of implementing a processor that executes a subset of the MIPS instruction set. It describes the basic components needed, including an instruction memory to store and fetch instructions, registers to hold data, an ALU to perform arithmetic and logical operations, multiplexers to direct data flow, and a program counter to keep track of the next instruction address. The implementation is built up incrementally, first explaining how instructions are fetched and the program counter updated. It then describes adding components for R-type instructions like arithmetic and logical operations. Finally, it discusses adding units to support load/store memory instructions by sign-extending offsets and calculating effective addresses. The goal is to explain at a high level how the MIPS processor executes instructions.
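The sign-extension and effective-address step described for load/store instructions can be sketched directly. The 16-bit immediate handling below follows the MIPS I-format; the register and memory contents are illustrative, and memory is keyed by full addresses for simplicity.

```python
# Sketch of the MIPS load address path: the 16-bit immediate from an
# I-format instruction is sign-extended to 32 bits and added to the base
# register to form the effective address.

def sign_extend_16(imm16):
    """Sign-extend a 16-bit value to a Python int (32-bit semantics)."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

registers = {"$s0": 0x1000}        # base register (illustrative contents)
memory = {0x0FFC: 42, 0x1008: 7}   # illustrative data memory

def lw(base_reg, imm16):
    """Effective address = base + sign-extended offset, wrapped to 32 bits."""
    addr = (registers[base_reg] + sign_extend_16(imm16)) & 0xFFFFFFFF
    return memory[addr]

print(lw("$s0", 0x0008))   # lw $t0, 8($s0)  -> memory[0x1008] = 7
print(lw("$s0", 0xFFFC))   # lw $t0, -4($s0) -> memory[0x0FFC] = 42
```

The second call shows why sign extension matters: the raw field 0xFFFC must be treated as -4, not 65532, for negative offsets to address below the base register.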
The document discusses computer architecture and the fetch-execute cycle. It describes the Von Neumann architecture, which uses a single processor that follows a linear sequence of fetching, decoding, and executing instructions. It then explains the fetch-execute cycle in more detail with the steps involved. Finally, it discusses parallel processor systems that can split up the fetching, decoding, and executing stages to improve efficiency.
The document discusses the functional requirements and design of a central processing unit (CPU). It describes the main components that must be included in the CPU design such as an instruction fetch unit, operand fetch unit, register file, instruction register, instruction decoder, and arithmetic logic unit. It then provides details on the register file design for the Intel 8086 processor including the segment and pointer registers used for memory addressing. Finally, it outlines the six addressing modes used by the Intel 8086 for accessing data in memory.
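The 8086's segment and pointer registers mentioned above combine into a 20-bit physical address as segment × 16 + offset. The register values in this sketch are illustrative; the address-formation rule itself is the standard 8086 real-mode scheme.

```python
# 8086 real-mode address formation: a 16-bit segment register is shifted
# left by 4 bits (multiplied by 16) and added to a 16-bit offset to form
# a 20-bit physical address.
def physical_address(segment, offset):
    return ((segment << 4) + offset) & 0xFFFFF  # wrap at the 1 MB boundary

# e.g. DS = 0x2000 with offset 0x0010 addresses physical 0x20010
print(hex(physical_address(0x2000, 0x0010)))  # 0x20010
print(hex(physical_address(0xFFFF, 0x0010)))  # 0x0 (wraps past 1 MB)
```

Note that many segment:offset pairs map to the same physical address, which is why the 8086's six addressing modes are all defined relative to a segment register.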
Realization of high performance run time loadable MIPS soft-core processor (eSAT Publishing House)
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of engineering and technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching, and research in the fields of engineering and technology. We bring together scientists, academicians, field engineers, scholars, and students of related fields of engineering and technology.
A 16-bit microprocessor I designed during my final semester (2005) of my Bachelor of Technology program. The microprocessor circuitry design was coded in VHDL and then configured in a Xilinx XC9572 PC84 CPLD kit. Most of the design, the architecture and the instruction set were taken from Computer System Architecture (3rd ed.) by M. Morris Mano. See https://github.com/susam/mano-cpu for VHDL source code and other related files.
The document outlines the basics of processor operation, including the instruction cycle, representation of machine instructions, and types of instructions. It discusses how the processor clock synchronizes activities and how the program counter increments to fetch each subsequent instruction from memory. The core instruction cycle stages are fetch, decode, and execute, where the processor fetches instructions and data from memory, decodes the operation, and executes it by performing the required operation.
This document discusses the organization and architecture of computers. It covers topics like instruction codes, computer registers, instruction cycles, and memory-referenced instructions. Specifically, it describes:
- Instruction codes are made up of operation codes and addresses/operands that instruct the computer to perform operations.
- Computer registers like the program counter, instruction register, and accumulator are needed to process instructions and data.
- The stored program concept allows instructions to be stored in memory and executed sequentially through an instruction cycle.
The document discusses computational geometry and algorithms for solving geometric problems. It focuses on algorithms in two dimensions that can determine properties of line segments, such as whether one segment is clockwise or counterclockwise from another. It also describes an algorithm using a sweeping technique that can determine if any two segments in a set of segments intersect in O(n log n) time.
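The clockwise/counterclockwise test described above reduces to the sign of a 2-D cross product. A minimal sketch, with invented point values:

```python
# Orientation test: the sign of the 2-D cross product of (p1 - p0) and
# (p2 - p0) tells whether p2 is clockwise (negative) or counterclockwise
# (positive) from p1 with respect to the common endpoint p0.
def cross(p0, p1, p2):
    return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])

p0 = (0, 0)
print(cross(p0, (1, 0), (0, 1)))  # 1  -> counterclockwise turn
print(cross(p0, (0, 1), (1, 0)))  # -1 -> clockwise turn
```

This constant-time primitive is what the O(n log n) sweep-line intersection algorithm uses to compare segments without ever computing intersection coordinates.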
The document discusses linear programming problems and methods for solving them. It defines a linear programming problem as optimizing a linear objective function subject to linear constraints. It describes how to convert a linear program into standard form, which involves changing it to a maximization problem, adding non-negativity constraints, and converting equalities to inequalities. It also describes converting a standard form problem into an equivalent slack form by introducing slack variables. The simplex method is then introduced as an approach for iteratively solving a linear program by moving from one basic feasible solution to another with a better objective value.
Here are the steps to launch Microsoft Visual Studio 2008 and create a project to edit the program P 2-1:
1. Launch Microsoft Visual Studio 2008.
2. Click "File" -> "New" -> "Project".
3. In the "New Project" dialog box, select "Visual C++" in the left pane and "Win32 Console Application" in the middle pane.
4. Click "OK".
5. In the "Win32 Application Wizard" dialog box, enter a name for the project (e.g. "AssemblyProgram") and click "OK".
6. This will create a new empty project in Visual Studio.
7. Right click on the
The CPU is the central processing unit of a computer and consists of three main parts - the control unit, register set, and ALU. The control unit directs operations between the register set and ALU. The register set stores intermediate data and the ALU performs arithmetic and logic operations. The CPU follows a fetch-execute cycle where it fetches instructions from memory and stores them in the instruction register before executing them. Common instruction types include processor-memory operations, I/O operations, data processing, and control operations.
An Enhanced FPGA Based Asynchronous Microprocessor Design Using VIVADO and ISIMjournalBEEI
This paper deals with the novel design and implementation of asynchronous microprocessor by using HDL on Vivado tool wherein it has the capability of handling even I-Type, R-Type and Jump instructions with multiplier instruction packet. Moreover, it uses separate memory for instructions and data read-write that can be changed at any time. The complete design has been synthesized and simulated using Vivado. The complete design is targeted on Xilinx Virtex-7 FPGA. This paper more focuses on the use of Vivado Tool for advanced FPGA device. By using Vivado we get enhaced analysis result for better view of properly Route & Placed design.
The document discusses machine structure and system programming. It begins with an overview of system software components like assemblers, loaders, macros, compilers and formal systems. It then describes the general machine structure including CPU, memory and I/O channels. Specific details are provided about the IBM 360 machine structure including its memory, registers, data, instructions and special features. Machine language and different approaches to writing machine language programs are also summarized.
This document discusses the history and characteristics of CISC and RISC architectures. It describes how CISC architectures were developed in the 1950s-1970s to address hardware limitations at the time by allowing instructions to perform multiple operations. RISC architectures emerged in the late 1970s-1980s as hardware improved, focusing on simpler instructions that could be executed faster through pipelining. Common RISC and CISC processors used commercially are also outlined.
8 bit Microprocessor with Single Vectored InterruptHardik Manocha
SoC consists of instruction memory, main memory and microprocessor unit. Instructions are fetched using PC and as per the instruction, main memory and register memory are accessed. 8 bit data bus is built. Working on developing programs to look for microprocessor operation.
Please send the answers to my email. Mirre06@hotmail.comSomeone se.pdfebrahimbadushata00
Please send the answers to my email. Mirre06@hotmail.com
Someone sent me wrong answers so please send me correct answers thanks.
1) What is a register? Be precise. Name at least two components in the LMC that meet the
qualications for a register. Name several different kinds of values that a register might hold.
Suppose that the following instructions are found at the given locations in memory:
20
LDA
50
21
ADD
51
50
724
51
006
a. Show the contents of the IR, the PC, the MAR, the MDR, and A at the conclusion of
instruction 20.
b. Show the contents of each register as each step of the fetch–execute cycle is performed for
instruction 21.
3) what is the purpose of the instructions register? What takes the place of the instruction
register in the LMC?
4) What is the explanation for the reasons why programmed IO does not work very well when
the IO device is a hard disk or a graphics display?
5) the x86 series is an example of a CPU architecture. as you are probably aware there are a
number of different chip including the x86 architecture? What word defines the difference
between the various CPUs that share the same architecture? Name at least one different CPU
architecture
20
LDA
50
21
ADD
51
50
724
51
006
Solution
1)The Little Man Computer (LMC) is an instructional model of a computer, created by Dr. Stuart
Madnick in 1965.The LMC is generally used to teach students, because it models a simple von
Neumann architecture computer - which has all of the basic features of a modern computer. It
can be programmed in machine code (albeit in decimal rather than binary) or assembly code.
Register:
In a computer, a register is one of a small set of data holding places that are part of a computer
processor . A register may hold a computer instruction , a storage address, or any kind of data
(such as a bit sequence or individual characters). Some instructions specify registers as part of
the instruction. For example, an instruction may specify that the contents of two defined registers
be added together and then placed in a specified register. A register must be large enough to hold
an instruction - for example, in a 32-bit instruction computer, a register must be 32 bits in length.
In some computer designs, there are smaller registers - for example, half-registers - for shorter
instructions. Depending on the processor design and language rules, registers may be numbered
or have arbitrary names.
Small, permanent storage locations within the CPU used for a particular purpose
Manipulated directly by the Control Unit
Wired for specific function
Size in bits or bytes (not in MB like memory)
Can hold data, an address or an instruction
Use of Registers
Scratchpad for currently executing program
Holds data needed quickly or frequently
Stores information about status of CPU and currently executing program
Address of next program instruction
Signals from external devices
General Purpose Registers
User-visible registers
Hold intermediate results or data values, e.g., l.
MICROPROGRAMMED
-> CONTROL UNIT IMPLEMENTATION
-> What Is Control Unit ??
-> Hardwire
-> Microprogrammed
-> Hardwire Vs Microprogrammed
-> Microprogrammed Control Unit
-> Micro-Operations
-> Control Memory
-> Microprogrammed Control Organization
-> Microprogram Routines
-> Conditional Branching
-> Mapping Of Instruction
-> Address Sequencing
-> Micro-Program Example
-> Micro-Instruction Format
-> Microinstruction Fields
-> Symbolic Microinstruction
1. Course – Computer Organization and Architecture
Course Instructor
Dr. Umadevi V
Department of CSE, BMSCE
26 October 2023 CSE, BMSCE
1
2. Unit-5
Basic Processing Unit: Some Fundamental Concepts, Instruction Execution, Hardware Components, Instruction Fetch and Execution Steps, Control Signals, Hardwired Control
Parallel Computer Architecture: Processor Architecture and Technology Trends, Flynn’s Taxonomy of Parallel Architectures, Memory Organization of Parallel Computers: Computers with Distributed Memory Organization, Computers with Shared Memory Organization, Thread-Level Parallelism: Simultaneous Multithreading, Multicore Processors
4. Fundamental Concepts
The processor fetches one instruction at a time and performs the operation specified.
Instructions are fetched from successive memory locations until a branch or a jump instruction is encountered.
The processor keeps track of the address of the memory location containing the next instruction to be fetched using the Program Counter (PC).
The Instruction Register (IR) holds the instruction that is currently being executed.
6. Fundamental Concepts (Contd…)
Main hardware components of a processor:
The PC provides the instruction address.
The instruction is fetched into the IR.
The instruction address generator updates the PC.
Control circuitry interprets the instruction and generates the control signals needed to perform the required actions.
7. Data Processing Hardware
Basic structure for data processing:
Contents of register A are processed and deposited in register B.
8. Data Processing Hardware (Contd…)
A hardware structure with multiple stages.
Processing moves from one stage to the next in each clock cycle.
Such a multi-stage system is known as a pipeline.
High-performance processors have a pipelined organization.
Pipelining enables the execution of successive instructions to be overlapped.
9. Five-step sequence of actions to fetch and execute an instruction
10. Instruction Execution
Load Instructions
Consider the instruction
Load R5, X(R7)
which uses the Index addressing mode to load a word of data from memory location X + [R7] into register R5.
Fetching and executing this instruction involves the following actions:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read the contents of register R7 in the register file.
3. Compute the effective address X + [R7].
4. Read the memory source operand.
5. Load the operand into the destination register, R5.
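The five steps above can be sketched as a minimal simulation in Python. The memory contents, register values, and the offset X below are illustrative assumptions, not values from the slides:

```python
# Minimal sketch of executing "Load R5, X(R7)" in five steps.
# Memory and register contents are made-up illustrative values.

memory = {100: 0x1234, 108: 0x5678}    # toy word-addressable memory
regs = [0] * 16                        # register file R0..R15
regs[7] = 100                          # base address held in R7
pc = 0
X = 8                                  # index offset encoded in the instruction

# Step 1: fetch the instruction and increment the PC (4-byte instructions assumed).
instruction = ("Load", 5, X, 7)        # Load R5, X(R7)
pc += 4

# Step 2: decode and read the source register from the register file.
op, rd, offset, rs = instruction
base = regs[rs]                        # [R7]

# Step 3: compute the effective address X + [R7].
ea = offset + base

# Step 4: read the memory source operand.
operand = memory[ea]

# Step 5: load the operand into the destination register, R5.
regs[rd] = operand

print(hex(regs[5]))                    # → 0x5678
```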
11. Instruction Execution (Contd…)
Arithmetic and Logic Instructions
A typical instruction of this type is
Add R3, R4, R5
Fetching and executing this instruction involves the following actions:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read registers R4 and R5.
3. Compute the sum [R4] + [R5].
4. No action (the memory stage is not needed).
5. Load the result into the destination register, R3.
12. Instruction Execution (Contd…)
Store Instruction
Store R6, X(R8)
stores the contents of register R6 into memory location X + [R8].
Fetching and executing this instruction involves the following actions:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read registers R6 and R8.
3. Compute the effective address X + [R8].
4. Store the contents of register R6 into memory location X + [R8].
5. No action (there is no result to write back).
13. Hardware Components
The discussion in the previous slides indicates that all instructions of a RISC-style processor can be executed using the five-step sequence. Hence, the processor hardware may be organized in five stages, such that each stage performs the actions needed in one of the steps.
Register File
A 2-port register file is needed to read the two source registers at the same time.
It may be implemented using a 2-port memory.
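A 2-port register file, with two read ports and one write port, can be sketched as follows. The class name, register count, and port names are illustrative assumptions:

```python
class RegisterFile:
    """Toy register file with 2 read ports and 1 write port (names illustrative)."""

    def __init__(self, num_regs=32):
        self.regs = [0] * num_regs

    def read(self, addr_a, addr_b):
        # Both source registers are read in the same cycle,
        # which is why two read ports are required.
        return self.regs[addr_a], self.regs[addr_b]

    def write(self, addr_c, value):
        # Destination register, written in the writeback stage.
        self.regs[addr_c] = value

rf = RegisterFile()
rf.write(4, 10)
rf.write(5, 32)
ra, rb = rf.read(4, 5)      # both operands fetched simultaneously
rf.write(3, ra + rb)        # e.g. Add R3, R4, R5
print(rf.regs[3])           # → 42
```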
14. Hardware Components: ALU (Contd…)
Two source operands are from registers.
Conceptual view of the hardware needed for computation.
Both source operands and the destination location are in the register file.
[RA] and [RB] denote the values of the registers identified by addresses A and B.
new [RC] denotes the result that is stored to the register identified by address C.
15. Hardware Components: ALU (Contd…)
One of the source operands is the immediate value in the IR.
Conceptual view of the hardware needed for computation.
16. Five-step sequence of actions to fetch and execute an instruction
17. Hardware Components: DataPath
A five-stage organization:
Instruction processing moves from stage to stage in every clock cycle, starting with fetch.
The instruction is decoded and the source registers are read in stage 2.
Computation takes place in the ALU in stage 3.
18. Hardware Components: DataPath
A five-stage organization (contd.):
If a memory operation is involved, it takes place in stage 4.
The result of the instruction is stored in the destination register in stage 5.
23. Five-step sequence of actions to fetch and execute an instruction
24. Hardware Components: DataPath
Datapath in a processor.
Sequence of actions needed to fetch and execute the instruction: Unconditional Branch Instruction
25. Hardware Components: DataPath
Datapath in a processor.
Sequence of actions needed to fetch and execute the instruction: Conditional Branch Instruction
Branch_if_[R5]=[R6] LOOP
26. Hardware Components: DataPath
Datapath in a processor.
Sequence of actions needed to fetch and execute the instruction: Subroutine Call Instructions
Call_Register R9
which calls a subroutine whose address is in register R9.
27. Control Signals
Control signals for the datapath:
Select multiplexer inputs to guide the flow of data.
Set the function performed by the ALU.
Determine when data are written into the PC, the IR, the register file, and the memory.
Inter-stage registers are always enabled, because their contents are relevant only in the cycles for which the stages connected to the register outputs are active.
28. Rough Slide: to Explain Control Signals
Register File Control Signals
29. Rough Slide: to Explain Control Signals
ALU Control Signals
30. Rough Slide: to Explain Control Signals
Result Selection Signals
31. Control signal generation
The actions needed to fetch and execute instructions have been described, along with the necessary control signals.
Circuitry must be implemented to generate the control signals so that the actions take place in the correct sequence and at the correct time.
There are two basic approaches: hardwired control and microprogramming.
32. Hardwired Control
Generation of the control signals.
Hardwired control involves implementing circuitry that considers the step counter, the IR, the ALU result, and external inputs.
The step counter keeps track of execution progress, one clock cycle for each of the five steps described earlier (unless a memory access takes longer than one cycle).
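As a rough illustration of hardwired control, the control signals can be viewed as a combinational function of the step counter and the decoded opcode. The signal names and the opcode encoding below are invented for illustration only:

```python
# Toy hardwired control: control signals as a pure function of the
# step counter and the opcode in the IR. All signal names are
# illustrative, not from any real processor.

def control_signals(step, opcode):
    signals = set()
    if step == 1:
        signals |= {"MEM_read", "IR_enable", "PC_increment"}
    elif step == 2:
        signals |= {"RF_read"}
    elif step == 3:
        signals |= {"ALU_compute"}
    elif step == 4 and opcode == "Load":
        signals |= {"MEM_read"}
    elif step == 4 and opcode == "Store":
        signals |= {"MEM_write"}
    elif step == 5 and opcode in ("Load", "Add"):
        signals |= {"RF_write"}
    return signals

# Step 4 of a Store asserts a memory write; a Store has no writeback,
# so step 5 asserts nothing.
print(control_signals(4, "Store"))   # → {'MEM_write'}
print(control_signals(5, "Store"))   # → set()
```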
33. Parallel Computer Architecture
Parallel Computer Architecture: Processor Architecture and Technology Trends, Flynn’s Taxonomy of Parallel Architectures, Memory Organization of Parallel Computers: Computers with Distributed Memory Organization, Computers with Shared Memory Organization, Thread-Level Parallelism: Simultaneous Multithreading, Multicore Processors
34. What is Parallel Computing?
Serial vs. Parallel Computing
35. Serial Computing
Traditionally, software has been written for serial computation:
A problem is broken into a discrete series of instructions.
Instructions are executed sequentially, one after another, on a single processor.
Only one instruction may execute at any moment in time.
36. Parallel Computing
In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
A problem is broken into discrete parts that can be solved concurrently.
Each part is further broken down into a series of instructions.
Instructions from each part execute simultaneously on different processors.
An overall control/coordination mechanism is employed.
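The decomposition above can be sketched in Python: a problem (summing a list, chosen here purely as an example) is split into parts handed to separate workers, with the executor acting as the coordination mechanism:

```python
# Sketch of the parallel-computing pattern: split a problem into parts,
# solve the parts concurrently, and combine the results. The problem
# (summing 1..100) is an illustrative choice, not from the slides.
from concurrent.futures import ThreadPoolExecutor

data = list(range(1, 101))                                  # the overall problem
chunks = [data[i:i + 25] for i in range(0, len(data), 25)]  # discrete parts

with ThreadPoolExecutor(max_workers=4) as pool:   # coordination mechanism
    partial_sums = list(pool.map(sum, chunks))    # parts solved concurrently

total = sum(partial_sums)             # combine the partial results
print(total)                          # → 5050
```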
38. Internal Parallelism Levels
1. Bit-level parallelism
2. Parallelism by pipelining
3. Parallelism by multiple functional units
4. Parallelism at process or thread level
39. Internal Parallelism Levels
1. Bit-level parallelism: 1970 to ~1985
4-bit, 8-bit, 16-bit, 32-bit, and 64-bit microprocessors.
2. Parallelism by pipelining: ~1985 through today
Instruction-level parallelism (ILP): pipelining at the instruction level overlaps the execution of multiple instructions.
The execution of each instruction is partitioned into several steps which are performed by dedicated hardware units (pipeline stages), one after another. A typical partitioning could result in the following steps:
(a) fetch: fetch the next instruction to be executed from memory;
(b) decode: decode the instruction fetched in step (a);
(c) execute: load the specified operands and execute the instruction;
(d) write-back: write the result into the target register/memory.
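The benefit of overlapping can be illustrated with a simple cycle count: with s pipeline stages and n instructions, an ideal pipeline needs n + s - 1 cycles instead of n * s. A minimal sketch, using the four stages listed above:

```python
# Cycle counts for sequential vs. ideally pipelined execution
# (no stalls or hazards, one cycle per stage).
STAGES = ["fetch", "decode", "execute", "write-back"]

def sequential_cycles(n, s=len(STAGES)):
    return n * s                     # each instruction runs all stages alone

def pipelined_cycles(n, s=len(STAGES)):
    return n + s - 1                 # stages of successive instructions overlap

n = 10
print(sequential_cycles(n))          # → 40
print(pipelined_cycles(n))           # → 13
```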
40. Internal Parallelism Levels (Contd…)
3. Parallelism by multiple functional units:
Many processors are multiple-issue processors.
They use multiple, independent functional units such as ALUs (arithmetic logic units), FPUs (floating-point units), load/store units, or branch units.
These units can work in parallel, i.e., different independent instructions can be executed in parallel by different functional units. Thus, the average execution rate of instructions can be increased.
4. Parallelism at process or thread level:
The three techniques described so far assume a single sequential control flow, which is provided by the compiler and which determines the execution order if there are dependencies between instructions.
An alternative approach is to add multiple, independent processor cores onto a single processor chip.
This approach has been used for typical desktop processors since 2005. The resulting processor chips are called multicore processors.
Examples: dual-core and quad-core processors.
41. Flynn’s Taxonomy of Parallel Computers
Question:
List and explain the different classifications of parallel computers according to Flynn’s Taxonomy.
42. Flynn’s Taxonomy of Parallel Computers
Flynn’s Taxonomy: classification according to important characteristics of a parallel computer.
Four categories are distinguished based on:
- How many instruction streams
- How many data streams
1. Single-Instruction, Single-Data (SISD)
2. Multiple-Instruction, Single-Data (MISD)
3. Single-Instruction, Multiple-Data (SIMD)
4. Multiple-Instruction, Multiple-Data (MIMD)
46. Flynn’s Taxonomy of Parallel Computers
Flynn’s Taxonomy: classification according to important characteristics of a parallel computer.
Four categories are distinguished based on:
- How many instruction streams
- How many data streams
1. Single-Instruction, Single-Data (SISD)
2. Multiple-Instruction, Single-Data (MISD)
3. Single-Instruction, Multiple-Data (SIMD)
4. Multiple-Instruction, Multiple-Data (MIMD)

Category   Instruction Streams   Data Streams
SISD       1                     1
MISD       >1                    1
SIMD       1                     >1
MIMD       >1                    >1
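The table maps directly to a small lookup function. This is a sketch; the string labels are the standard category names:

```python
# Classify a machine by its number of instruction and data streams,
# following Flynn's taxonomy table above.

def flynn_category(instruction_streams, data_streams):
    multi_i = instruction_streams > 1
    multi_d = data_streams > 1
    if not multi_i and not multi_d:
        return "SISD"
    if multi_i and not multi_d:
        return "MISD"
    if not multi_i and multi_d:
        return "SIMD"
    return "MIMD"

print(flynn_category(1, 1))    # → SISD (conventional von Neumann computer)
print(flynn_category(1, 16))   # → SIMD (one instruction, many data elements)
print(flynn_category(8, 8))    # → MIMD (e.g. a multicore processor)
```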
47. Flynn’s Taxonomy of Parallel Computers (Contd…)
1. Single-Instruction, Single-Data (SISD): There is one processing element which has access to a single program and data storage. In each step, the processing element loads an instruction and the corresponding data and executes the instruction. The result is stored back in the data storage. Thus, SISD is the conventional sequential computer according to the von Neumann model.
2. Multiple-Instruction, Single-Data (MISD): There are multiple processing elements, each of which has a private program memory, but there is only one common access to a single global data memory. In each step, each processing element obtains the same data element from the data memory and loads an instruction from its private program memory. These possibly different instructions are then executed in parallel by the processing elements using the previously obtained (identical) data element as operand. This execution model is very restrictive and no commercial parallel computer of this type has ever been built.
48. Flynn’s Taxonomy of Parallel Computers (Contd…)
3. Single-Instruction, Multiple-Data (SIMD): There are multiple processing elements, each of which has private access to a (shared or distributed) data memory. But there is only one program memory from which a special control processor fetches and dispatches instructions. In each step, each processing element obtains from the control processor the same instruction and loads a separate data element through its private data access on which the instruction is performed. Thus, the instruction is synchronously applied in parallel by all processing elements to different data elements.
For applications with a significant degree of data parallelism, the SIMD approach can be very efficient. Examples are multimedia applications or computer graphics algorithms to generate realistic three-dimensional views of computer-generated environments.
4. Multiple-Instruction, Multiple-Data (MIMD): There are multiple processing elements, each of which has a separate instruction and data access to a (shared or distributed) program and data memory. In each step, each processing element loads a separate instruction and a separate data element, applies the instruction to the data element, and stores a possible result back into the data storage. The processing elements work asynchronously with each other. Multicore processors or cluster systems are examples of the MIMD model.
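The SIMD idea described above (one instruction applied synchronously to many data elements) is what vectorized operations express in software. A minimal sketch, with illustrative data values:

```python
# SIMD-style data parallelism: the *same* operation is applied to every
# element of the data in lockstep. Each list position plays the role of
# one processing element's private data; the values are illustrative.

pixels = [10, 20, 30, 40]             # separate data elements
brightened = [p + 5 for p in pixels]  # one "instruction" (+5) applied to all

print(brightened)                     # → [15, 25, 35, 45]
```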
49. Memory Organization of Parallel Computers
A further classification of MIMD computers can be done according to their memory organization:
1. Computers with Distributed Memory Organization
2. Computers with Shared Memory Organization
50. Shared vs. Distributed Memory
The simplest and most useful way to classify modern parallel computers is by their memory model:
Shared memory and Distributed memory
51. Illustration of computers with distributed memory
Illustration of computers with distributed memory: (a) abstract structure, (b) computer with distributed memory and hypercube as interconnection structure, (c) DMA (direct memory access), (d) processor–memory node with router, and (e) interconnection network in the form of a mesh to connect the routers of the different processor–memory nodes.
52. Illustration of a computer with shared memory
Illustration of a computer with shared memory: (a) abstract view and (b) implementation of the shared memory with memory modules.
53. Illustration of the architecture of computers with shared memory
Illustration of the architecture of computers with shared memory: (a) SMP – symmetric multiprocessors, (b) NUMA – non-uniform memory access, (c) CC-NUMA – cache-coherent NUMA, and (d) COMA – cache-only memory access.
55. What is a Thread w.r.t. Computers?
Thread: a process (or program) with its own instructions and data.
Each thread has all the state (instructions, data, PC, register state, and so on) necessary to allow it to execute.
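Each thread carrying its own instructions and data can be sketched with Python's threading module. The worker function and its inputs are illustrative:

```python
# Each thread executes its own instruction sequence on its own data.
# The worker function and the input lists are illustrative examples.
import threading

results = {}

def worker(name, data):
    # This function body is the thread's "instructions";
    # `data` and the local variables are its private state.
    results[name] = sum(data)

t1 = threading.Thread(target=worker, args=("t1", [1, 2, 3]))
t2 = threading.Thread(target=worker, args=("t2", [10, 20]))
t1.start(); t2.start()
t1.join(); t2.join()

print(results["t1"], results["t2"])   # → 6 30
```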
58. Thread-Level Parallelism
Multi-Core Processors: placement of multiple independent execution cores, with all execution resources, onto a single processor chip.
Design choices for multicore chips:
Hierarchical design, Pipelined design, Network-based design