SYSTEM ON CHIP
Introduction to the Systems Approach
UNIT-1
• SYSTEM ARCHITECTURE: AN OVERVIEW
1. The past 40 years have seen amazing advances in silicon technology and resulting increases in
transistor density and performance.
2. In 1966, Fairchild Semiconductor [84] introduced a quad two-input NAND gate with about 10
transistors on a die. In 2008, the Intel quad-core Itanium processor had about 2 billion transistors.
3. A system-on-chip (SOC) architecture is an ensemble of processors, memories, and
interconnects tailored to an application domain.
4. A simple example of such an architecture is the Emotion Engine for the Sony PlayStation 2, which
has two main functions: behavior simulation and geometry translation. This system contains three
essential components: a main processor of the reduced instruction set computer (RISC) style
[118] and two vector processing units, VPU0 and VPU1, each of which contains four parallel
processors of the single instruction, multiple data (SIMD) stream style.
| Feature | Microprocessor (MPU) | Microcontroller (MCU) | System on Chip (SoC) |
| --- | --- | --- | --- |
| Definition | A CPU (Central Processing Unit) that requires external components for full functionality. | A compact integrated circuit with a CPU, memory, and peripherals. | A highly integrated chip containing a CPU, memory, GPU, connectivity, and more. |
| Components Included | Only the processor; needs external RAM, ROM, and I/O devices. | Processor, RAM, ROM, I/O ports, timers, and more in one chip. | Processor(s), memory, GPU, wireless modules, I/O, and sometimes AI accelerators. |
| Performance | High performance, used in complex computing tasks. | Lower performance, optimized for control tasks. | Varies widely, optimized for embedded and mobile applications. |
| Power Consumption | High power consumption, not suitable for battery-operated devices. | Low power consumption, ideal for embedded systems. | Optimized for power efficiency, common in mobile and IoT devices. |
| Cost | Expensive due to external components. | Cost-effective as it integrates everything needed. | Can be expensive but offers high integration and efficiency. |
| Usage | Used in computers, laptops, and high-end applications. | Used in embedded systems like washing machines, ATMs, and remote controls. | Used in smartphones, tablets, smart TVs, IoT devices, and AI applications. |
COMPONENTS OF THE SYSTEM: PROCESSORS,
MEMORIES, AND INTERCONNECTS
• The term architecture denotes the operational structure and the user's view of the system.
• It has evolved to include both the functional specification and the hardware implementation.
• The system architecture defines the system-level building blocks, such as processors and memories,
and the interconnection between them.
• The processor architecture determines the processor's instruction set, the associated programming
model, and its detailed implementation, which may include hidden registers, branch prediction circuits, and
specific details concerning the ALU (arithmetic logic unit).
• The implementation of a processor is also known as its microarchitecture.
• As an example, an SOC for a smartphone would need to support, in addition to audio input and output
capabilities for a traditional phone, Internet access functions and multimedia facilities for video
communication, document processing, and entertainment such as games and movies.
• If all the elements cannot be contained on a single chip, the implementation is probably best referred
to as a system on a board, but often is still called an SOC.
HARDWARE AND SOFTWARE
• A fundamental decision in SOC design is to choose which components in the system are to be implemented in
hardware and which in software. The major benefits and drawbacks of hardware and software implementations are
summarized in Table.
• A software implementation is usually executed on a general-purpose processor (GPP), which interprets instructions
at run time. This architecture offers flexibility and adaptability, and provides a way of sharing resources among
different applications.
• However, interpreting instructions in software is generally slower and more power hungry than implementing the
corresponding function directly in hardware, because of the overhead of fetching and decoding instructions.
• Given that hardware and software have complementary features, many SOC designs aim to combine the individual
benefits of the two. The obvious method is to implement the performance-critical parts of the application in
hardware, and the rest in software.
• Custom ASIC hardware and software on GPPs can be seen as two extremes in the technology spectrum with
different trade-offs in programmability and performance;
• there are various technologies that lie between these two extremes (Figure 1.6). The two more well-known ones
are application-specific instruction processors (ASIPs) and field-programmable gate arrays (FPGAs).
A simplified technology comparison: programmability
versus performance. GPP, general-purpose processor;
CGRA, coarse-grained reconfigurable architecture.
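The run-time interpretation overhead mentioned above can be made concrete with a toy sketch: the "software" path fetches and decodes each instruction of an invented two-opcode mini-ISA before executing it, while the "hardware" path performs the same computation directly. The opcodes and encoding here are hypothetical, for illustration only.

```python
# Toy illustration of interpretation overhead: a GPP-style software
# implementation interprets each instruction at run time (fetch, decode,
# dispatch), while a "direct" implementation just computes the result.
# The mini-ISA (ADD/MUL on an accumulator) is invented for this sketch.

def run_interpreted(program, x):
    """Interpret a list of (opcode, operand) pairs against accumulator x."""
    for opcode, operand in program:      # fetch the next instruction
        if opcode == "ADD":              # decode + dispatch
            x = x + operand              # execute
        elif opcode == "MUL":
            x = x * operand
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return x

def run_direct(x):
    """The same computation 'in hardware': no fetch/decode overhead."""
    return (x + 3) * 5

program = [("ADD", 3), ("MUL", 5)]
assert run_interpreted(program, 2) == run_direct(2) == 25
```

Both paths compute the same function; the interpreted one pays the per-instruction dispatch cost on every step, which is the overhead direct hardware avoids.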
Programmability vs performance
• 1. ASIC (Application-Specific Integrated Circuit)
• Definition: A custom-designed chip tailored for a specific application or
task.
• Flexibility: Not reprogrammable after manufacturing.
• Performance: Highest performance and power efficiency since it's
optimized for a specific function.
• Cost: High initial cost due to design and fabrication, but lower unit cost in
large volumes.
• Use Cases: Used in consumer electronics (smartphones, GPUs, AI
accelerators, Bitcoin mining).
FPGA (Field-Programmable Gate Array)
• Definition: A reconfigurable chip that can be programmed after
manufacturing using hardware description languages (HDLs).
• Flexibility: Highly flexible and reprogrammable for different applications.
• Performance: Slower than ASICs but faster than general-purpose processors
due to configurable hardware.
• Cost: Lower upfront cost (no fabrication needed), but higher per-unit cost
compared to ASICs in large volumes.
• Use Cases: Used in prototyping, real-time processing, telecom (5G),
aerospace, and data centers.
ASIP (Application-Specific Instruction-set Processor)
• Definition: A processor optimized for a specific domain with a specialized
instruction set.
• Flexibility: More flexible than ASICs but less than FPGAs, as it can be
programmed with software but has a fixed instruction set.
• Performance: Higher than general-purpose processors for targeted tasks but
lower than ASICs.
• Cost: Lower cost than ASICs for domain-specific applications, but development
can be complex.
• Use Cases: Used in signal processing, embedded systems, and IoT devices
requiring domain-specific processing.
Application-specific instruction processors (ASIPs) and
field-programmable gate arrays (FPGAs)
• An ASIP is a processor with an instruction set customized for a specific application or domain.
• Custom instructions efficiently implemented in hardware are often integrated into a base
processor with a basic instruction set.
• An FPGA typically contains an array of computation units, memories, and their interconnections,
and all three are usually programmable in the field by application builders.
• FPGA technology often offers a good compromise: it is faster than software while being more
flexible and having shorter development times than custom ASIC hardware implementations.
• Like GPPs, FPGAs are offered as off-the-shelf devices that can be programmed without going
through chip fabrication. Because of the growing demand for reducing the time to market and the
increasing cost of chip fabrication, FPGAs are becoming more popular for implementing digital
designs.
PROCESSOR ARCHITECTURES
• Typically, processors are characterized either by their application or by
their architecture
• The requirements space of an application is often large, and there is a range of implementation options. Thus, it
is usually difficult to associate a particular architecture with a particular application.
• In addition, some architectures combine different implementation approaches, as seen in the PlayStation example
of Section 1.1. There, the graphics processor consists of a four-element SIMD array of vector processing
functional units (FUs). Other SOC implementations consist of multiprocessors using very long instruction word
(VLIW) and/or superscalar processors.
• From the programmer's point of view, sequential processors execute one instruction at a time.
• Many processors, however, can execute several instructions concurrently in a manner that is transparent
to the programmer, through techniques such as pipelining, multiple execution units, and multiple cores.
• Pipelining is a powerful technique that is used in almost all current processor implementations.
• Instruction-level parallelism (ILP) means that multiple operations can be executed in parallel within a program.
ILP may be achieved with hardware, compiler, or operating system techniques.
Processor: A Functional View
• A functional view compares SOC designs by the processor used in each design. For these examples, we can
characterize them as general purpose, or special purpose with support for gaming or signal processing applications.
Processor: An Architectural View
• Simple Sequential Processor :
• Sequential processors directly implement the sequential execution model. These processors
process instructions sequentially from the instruction stream. The next instruction is not
processed until all execution for the current instruction is complete and its results have been
committed.
• 1. fetching the instruction into the instruction register (IF),
• 2. decoding the opcode of the instruction (ID),
• 3. generating the address in memory of any data item residing there (AG),
• 4. fetching data operands into executable registers (DF),
• 5. executing the specified operation (EX), and
• 6. writing back the result to the register file (WB).
Sequential processor model
• Working of a Sequential Processor
• Step 1: Instruction Fetch
• The processor fetches the instruction from memory at the address in the Program Counter (PC).
• The PC is then incremented to point to the next instruction.
• Step 2: Instruction Decode
• The fetched instruction is decoded by the Control Unit.
• The CU determines what operation the instruction represents.
• Step 3: Execute
• The ALU performs computations (if needed).
• Memory may be accessed for data load/store operations.
• Step 4: Write Back
• The result of the operation is written back to a register or memory.
• Step 5: Repeat
• The next instruction is fetched, and the cycle repeats.
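The steps above can be sketched as a minimal fetch-decode-execute loop. The three-operand instruction format and the register names are invented for illustration; this is a sketch of the sequential model, not a real ISA.

```python
# Minimal sketch of the sequential execution model: each instruction
# completes fully (fetch -> decode -> execute -> write back) before the
# next one begins. The two-opcode ISA here is hypothetical.

def run_sequential(program):
    regs = {"R0": 0, "R1": 0, "R2": 0}
    pc = 0                                # Program Counter
    while pc < len(program):
        instr = program[pc]               # Step 1: fetch at address in PC
        pc += 1                           #         PC now points at the next one
        op, dst, a, b = instr             # Step 2: decode
        if op == "ADDI":                  # Step 3: execute on the ALU
            result = regs[a] + b          #         (add immediate)
        elif op == "MUL":
            result = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode {op!r}")
        regs[dst] = result                # Step 4: write back to register file
    return regs                           # Step 5: loop repeats until done

prog = [("ADDI", "R0", "R0", 7),     # R0 = 0 + 7
        ("ADDI", "R1", "R1", 6),     # R1 = 0 + 6
        ("MUL",  "R2", "R0", "R1")]  # R2 = R0 * R1
```

Because each iteration finishes entirely before the next begins, there is no overlap between instructions, which is exactly what pipelining later removes.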
Sequential processor model
• Applications
• Simple Embedded Systems – Used in microcontrollers for basic
tasks.
• Educational Purposes – Teaching basic computer architecture
concepts.
Pipelined Processor
• Pipelining is a straightforward approach to exploiting parallelism that is based on concurrently
performing different phases (instruction fetch, decode, execution, etc.) of processing an instruction.
• Pipelining assumes that these phases are independent between different operations and can be
overlapped — when this condition does not hold, the processor stalls the downstream phases to enforce
the dependency.
• Thus, multiple operations can be processed simultaneously with each operation at a different phase of its
processing.
Pipeline processor
• A pipeline processor refers to a processing architecture where multiple stages (or steps) are arranged in a sequence to
handle a stream of data. Each stage performs a specific operation and passes the result to the next stage, allowing for
parallel execution and efficient processing.
• Advantages of Pipeline Processing
• Increased Throughput: Multiple operations can be executed in parallel.
• Efficient Resource Utilization: Hardware or software resources are used effectively.
• Faster Execution: Overall execution time is reduced due to concurrent processing.
• Key Features of a Pipelined Processor Model
• Instruction-Level Parallelism (ILP) – Multiple instructions are processed simultaneously in different pipeline stages.
• Stages of Execution – The instruction cycle is divided into stages such as Fetch, Decode, Execute, Memory Access, and
Write Back.
• Increased Throughput – While an individual instruction still takes the same time, multiple instructions are in progress at
once, reducing overall execution time.
• Potential Hazards – Pipeline execution introduces challenges like data hazards, control hazards, and structural
hazards.
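The stage-to-stage hand-off described above can be sketched with Python generators: each stage consumes a stream, transforms its items, and passes results downstream. This models the data flow between stages, not true cycle-level parallelism; the stage functions and instruction strings are invented for illustration.

```python
# Sketch of a stage pipeline: fetch -> decode -> execute, each stage a
# generator that transforms a stream and hands items to the next stage.

def fetch(instructions):
    """Stage 1: supply raw instruction text."""
    for ins in instructions:
        yield ins

def decode(stream):
    """Stage 2: split text into opcode and operand fields."""
    for ins in stream:
        yield ins.split()          # "ADD R1 R2" -> ["ADD", "R1", "R2"]

def execute(stream):
    """Stage 3: separate opcode from operands (a stand-in for the ALU)."""
    for op, *args in stream:
        yield (op, args)

pipeline = execute(decode(fetch(["ADD R1 R2", "SUB R3 R4"])))
results = list(pipeline)
# results == [("ADD", ["R1", "R2"]), ("SUB", ["R3", "R4"])]
```

Chaining generators this way mirrors how each pipeline stage only needs the output of the stage before it, which is the independence assumption that makes overlap possible.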
Pipelined processor model
| Cycle | IF | ID | EX | MEM | WB |
| --- | --- | --- | --- | --- | --- |
| 1 | I1 |    |    |     |    |
| 2 | I2 | I1 |    |     |    |
| 3 | I3 | I2 | I1 |     |    |
| 4 | I4 | I3 | I2 | I1  |    |
| 5 | I5 | I4 | I3 | I2  | I1 |
| 6 | I6 | I5 | I4 | I3  | I2 |
| ... | ... | ... | ... | ... | ... |
Example: Instruction Execution in a Pipeline
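The fill pattern in the table above follows directly from the stage depth: in an ideal 5-stage pipeline, instruction i (counting from 1) occupies stage s (counting from 0) during cycle i + s. A short sketch reproduces any row of the table:

```python
# Reproduce the ideal pipeline fill pattern: instruction i (1-based)
# occupies stage s (0-based) during cycle i + s, assuming no stalls.

STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_occupancy(cycle, n_instructions):
    """Return {stage name: instruction number} for a given 1-based cycle."""
    occupancy = {}
    for s, name in enumerate(STAGES):
        i = cycle - s               # which instruction sits in this stage
        if 1 <= i <= n_instructions:
            occupancy[name] = i     # numbered I1, I2, ... as in the table
    return occupancy
```

For example, `stage_occupancy(5, 6)` yields I5 in IF through I1 in WB, matching row 5 of the table; the model assumes no data, control, or structural hazards.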
Pipelined processor model
• Real-World Examples of Pipelined Processors
• RISC Processors – ARM, MIPS, PowerPC.
• Modern x86 CPUs – Intel and AMD processors use deep pipelining.
• GPU Pipelines – Used for parallel graphics processing
Superscalar Processor Model
• A superscalar processor is an advanced CPU architecture that
executes multiple instructions per clock cycle by using multiple
execution units. Unlike pipelined processors, which improve
instruction throughput by overlapping execution stages,
superscalar processors increase throughput by issuing multiple
instructions in parallel.
| Cycle | Scalar Pipeline (Single Issue) | Superscalar Pipeline (Dual Issue) |
| --- | --- | --- |
| 1 | IF (I1) | IF (I1, I2) |
| 2 | ID (I1) | ID (I1, I2) |
| 3 | EX (I1) | EX (I1, I2) |
| 4 | MEM (I1) | MEM (I1, I2) |
| 5 | WB (I1) | WB (I1, I2) |
Pipeline Comparison: Superscalar vs. Scalar
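A rough throughput model makes the scalar-versus-superscalar difference concrete: under ideal conditions (no hazards, perfect issue), a k-stage pipeline that issues w instructions per cycle finishes n instructions in about ceil(n / w) + k - 1 cycles. This back-of-the-envelope formula is an idealization and ignores dependences and structural limits.

```python
# Idealized cycle count for n instructions on a k-stage pipeline that
# can issue w instructions per cycle (no hazards assumed).

import math

def ideal_cycles(n, stages=5, issue_width=1):
    """ceil(n / w) issue groups, plus k - 1 cycles to drain the pipeline."""
    return math.ceil(n / issue_width) + stages - 1

scalar = ideal_cycles(100, issue_width=1)   # single issue: 104 cycles
dual   = ideal_cycles(100, issue_width=2)   # dual issue:    54 cycles
```

The dual-issue machine approaches twice the throughput for long instruction streams, which is why real superscalar speedups are bounded by how much independent work the program actually offers.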
Examples of Superscalar Processors
• Intel Pentium (First Superscalar x86 Processor)
• AMD Ryzen (Modern superscalar CPU)
• ARM Cortex-A Series (used in mobile devices)
• IBM PowerPC Processors (used in servers and gaming consoles)
• Apple M-Series Chips (highly optimized superscalar processors)
SIMD Architectures: Array and Vector Processors
• Single Instruction, Multiple Data (SIMD) architectures allow a single instruction to operate on
multiple data elements simultaneously. This approach enhances performance in applications
requiring high data parallelism, such as graphics processing, scientific computing, and machine
learning.
• In SIMD architectures:
• A single instruction is executed across multiple data elements simultaneously.
• It is highly efficient for operations on large datasets, such as matrix multiplications and image
processing.
• There are two main types of SIMD architectures:
• Array Processors
• Vector Processors
• Array Processors
• Array processors consist of multiple processing elements (PEs) that operate in parallel under the
control of a single instruction stream.
SIMD Architectures: Array and Vector Processors
• Vector Processors
• Vector processors process vectors (arrays of data) in a single operation, rather than working on
individual elements.
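The SIMD idea above can be sketched in plain Python: each function below plays the role of one SIMD instruction that operates on every lane of its inputs at once (in real hardware the lanes execute in parallel; here a comprehension stands in for them).

```python
# Sketch of SIMD semantics: one "instruction" applies the same operation
# to all data lanes. Four lanes echo the four parallel SIMD units in the
# PlayStation example (VPU0/VPU1).

def simd_add(a, b):
    """One SIMD instruction: element-wise add across all lanes."""
    return [x + y for x, y in zip(a, b)]

def simd_mul(a, b):
    """One SIMD instruction: element-wise multiply across all lanes."""
    return [x * y for x, y in zip(a, b)]

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
sums = simd_add(a, b)       # [11, 22, 33, 44] from a single "instruction"
products = simd_mul(a, b)   # [10, 40, 90, 160]
```

One instruction fetch and decode covers four element operations, which is exactly where SIMD's efficiency on large, regular datasets comes from.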
MEMORY AND ADDRESSING
SOC Memory Examples
• Table 1.12 shows a number of different SOC designs and their cache and memory configuration. It is
important for SOC designers to consider whether to put RAM and ROM on-die or off-die.
• 1. Real Memory (Physical Memory)
• Definition:
• Real memory refers to the actual physical RAM (Random Access Memory)
installed in a computer.
• It consists of hardware-based memory modules that store active data and
programs.
• Characteristics:
✔️ Limited in size (depends on installed RAM).
✔️ Directly accessed by the CPU.
✔️ Faster but expensive.
✔️ If a system runs out of real memory, it relies on virtual memory.
• 2. Virtual Memory
• Definition:
• Virtual memory is a technique that allows a computer to use a part of the
storage (HDD/SSD) as additional memory when RAM is full.
• Managed by the operating system using a page file (Windows) or swap space
(Linux/macOS).
• Characteristics:
✔️ Provides the illusion of a larger memory space.
✔️ Slower than real memory (because it uses the disk).
✔️ Helps run large programs with limited RAM.
✔️ Uses paging and swapping techniques.
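The paging mechanism behind virtual memory can be sketched as a simple address translation: split a virtual address into a virtual page number and an offset, look the page up in a page table, and rebuild the physical address. The page size, the page-table contents, and the fault behaviour below are illustrative, not those of any particular OS.

```python
# Sketch of virtual-to-physical address translation with paging.
# Real MMUs do this in hardware with multi-level tables and a TLB.

PAGE_SIZE = 4096  # bytes; a common but here merely illustrative page size

def translate(virtual_addr, page_table):
    """Map a virtual address to a physical one via a VPN -> frame dict."""
    vpn = virtual_addr // PAGE_SIZE        # virtual page number
    offset = virtual_addr % PAGE_SIZE      # offset survives translation
    if vpn not in page_table:
        # In a real system the OS would service this fault, possibly by
        # swapping the page in from disk.
        raise LookupError(f"page fault: VPN {vpn} not resident")
    return page_table[vpn] * PAGE_SIZE + offset

page_table = {0: 5, 1: 9}   # VPN -> physical frame number (illustrative)
```

For example, virtual address 4100 falls in page 1 at offset 4, so it maps to frame 9, i.e. physical address 9 * 4096 + 4.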
SYSTEM-LEVEL INTERCONNECTION
• SOC technology typically relies on the interconnection of predesigned circuit modules (known as
intellectual property [IP] blocks) to form a complete system, which can be integrated onto a single
chip.
• A well-designed interconnection scheme should have efficient communication protocols.
• This facilitates interoperability between IP blocks designed by different people from different
organizations and encourages design reuse.
• SOC interconnect methods can be classified into two main approaches: buses and network-on-chip.
Bus-Based Approach
• Bus-Based Interconnection in SoC
• In a System-on-Chip (SoC), components like the CPU, GPU, memory,
and peripherals must communicate efficiently. A bus-based
interconnection system enables this communication, ensuring data
transfer between these components.
• 1. What is a Bus in SoC?
• A bus is a communication pathway that transfers data, addresses, and
control signals between different components within the SoC. It allows
multiple components to share the same communication medium.
• Types of Signals in a Bus
• Data Bus → Transfers actual data between components.
• Address Bus → Carries memory addresses for data access.
Network-on-Chip Approach
• 1. What is Network-on-Chip (NoC)?
• NoC (Network-on-Chip) is an advanced interconnect architecture where
multiple processing elements (PEs), memory units, and peripherals
communicate via a structured network instead of a shared bus or
crossbar.
• 🔹 Think of NoC like the Internet inside an SoC!
• Instead of a single bus, components communicate via routers and links
like packets in a network.
• Eliminates bus contention, increases parallelism, and scales efficiently
for multi-core and AI SoCs.
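The packet-forwarding idea can be sketched with a common deterministic NoC scheme: XY routing on a 2D mesh, where a packet first travels along the X dimension until its column matches the destination, then along Y. The mesh coordinates below are illustrative; real NoC routers also handle arbitration, buffering, and flow control.

```python
# Sketch of deterministic XY routing on a 2D mesh NoC: correct the X
# coordinate first, then the Y coordinate, one router hop at a time.

def xy_route(src, dst):
    """Return the list of (x, y) router coordinates a packet visits."""
    x, y = src
    path = [src]
    while x != dst[0]:                  # travel along X first
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:                  # then along Y
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path
```

For example, a packet from router (0, 0) to (2, 1) visits (0, 0) → (1, 0) → (2, 0) → (2, 1); because every packet between a given pair takes the same path, XY routing is deadlock-free on a mesh at the cost of some load-balancing flexibility.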
AN APPROACH FOR SOC DESIGN
• Two important ideas in a design process are figuring out the requirements and specifications, and iterating
through different stages of design toward an efficient and effective completion.
• Requirements and Specifications
• Requirements and specifications are fundamental concepts in any system design situation. There must be a
thorough understanding of both before a design can begin.
• The system requirements are the largely externally generated criteria for the system. They may come from
competition, from sales insights, from customer requests, from product profitability analysis, or from a
combination. Typical sources include:
• compatibility with previous designs or published standards,
• reuse of previous designs,
• customer requests/complaints,
• sales reports,
• cost analysis,
• competitive equipment analysis, and
• trouble reports (reliability) of previous products and competitive products
• The designer can also introduce new requirements based on new technology, new ideas, or
new materials that have not been used in a similar systems environment.
• The specification does not complete any part of the design process; it initializes the
process. Now the design can begin with the selection of components and approaches, the
study of alternatives, and the optimization of the parts of the system.
Design Iteration
• Design is always an iterative process. So, the obvious question is how to get the very first, initial
design.
• Initial Design: This is the first design that shows promise in meeting the key requirements,
while other performance and cost criteria are not considered. For instance, processor or memory
or input/output (I/O) should be sized to meet high-priority real-time constraints. Promising
components and their parameters are selected and analyzed to provide an understanding of their
expected idealized performance and cost.

SYSTEM approach in system on chip architecture

  • 1.
  • 2.
    Introduction to theSystems Approach UNIT-1 • SYSTEM ARCHITECTURE: AN OVERVIEW 1. The past 40 years have seen amazing advances in silicon technology and resulting increases in transistor density and performance. 2. In 1966, Fairchild Semiconductor [84] introduced a quad two input NAND gate with about 10 transistors on a die. In 2008, the Intel quad - core Itanium processor has 2 billion transistors 3. A system - on - chip (SOC) architecture is an ensemble of processors, memories, and interconnects tailored to an application domain 4. A simple example of such an architecture is the Emotion Engine for the Sony PlayStation , which has two main functions: behavior simulation and geometry translation. This system contains three essential components: a main processor of the reduced instruction set computer (RISC) style [118] and two vector processing units, VPU0 and VPU1, each of which contains four parallel processors of the single instruction, multiple data (SIMD) stream style
  • 4.
    Feature Microprocessor (MPU)Microcontroller (MCU) System on Chip (SoC) Definition A CPU (Central Processing Unit) that requires external components for full functionality. A compact integrated circuit with a CPU, memory, and peripherals. A highly integrated chip containing a CPU, memory, GPU, connectivity, and more. Components Included Only the processor; needs external RAM, ROM, and I/O devices. Processor, RAM, ROM, I/O ports, timers, and more in one chip. Processor(s), memory, GPU, wireless modules, I/O, and sometimes AI accelerators. Performance High-performance, used in complex computing tasks. Lower performance, optimized for control tasks. Varies widely, optimized for embedded and mobile applications. Power Consumption High power consumption, not suitable for battery-operated devices. Low power consumption, ideal for embedded systems. Optimized for power efficiency, common in mobile and IoT devices. Cost Expensive due to external components. Cost-effective as it integrates everything needed. Can be expensive but offers high integration and efficiency. Usage Used in computers, laptops, and high-end applications. Used in embedded systems like washing machines, ATMs, and remote controls. Used in smartphones, tablets, smart TVs, IoT devices, and AI applications.
  • 5.
    COMPONENTS OF THESYSTEM: PROCESSORS, MEMORIES, AND INTERCONNECTS • The term architecture denotes the operational structure and the user ’ s view of the system. • it has evolved to include both the functional specification and the hardware implementation. • The system architecture defines the system - level building blocks, such as processors and memories, and the interconnection between them. • The processor architecture determines the processor ’ s instruction set, the associated programming model, its detailed implementation, which may include hidden registers, branch prediction circuits and specific details concerning the ALU (arithmetic logic unit). • The implementation of a processor is also known as microarchitecture • As an example, an SOC for a smart phone would need to support, in addition to audio input and output capabilities for a traditional phone, Internet access functions and multimedia facilities for video communication, document processing, and entertainment such as games and movies. • If all the elements cannot be contained on a single chip, the implementation is probably best referred to as a system on a board, but often is still called a SOC.
  • 8.
  • 9.
    HARDWARE AND SOFTWARE •A fundamental decision in SOC design is to choose which components in the system are to be implemented in hardware and in software. The major benefits and drawbacks of hardware and software implementations are summarized in Table . • A software implementation is usually executed on a general - purpose processor (GPP), which interprets instructions at run time. This architecture offers flexibility and adaptability, and provides a way of sharing resources among different applications; • the hardware implementation of the ISA is generally slower and more power hungry than implementing the corresponding function directly in hardware without the overhead of fetching and decoding instructions. • Given that hardware and software have complementary features, many SOC designs aim to combine the individual benefits of the two. The obvious method is to implement the performance - critical parts of the application in hardware, and the rest in software. • Custom ASIC hardware and software on GPPs can be seen as two extremes in the technology spectrum with different trade - offs in programmability and performance; • there are various technologies that lie between these two extremes (Figure 1.6 ). The two more well - known ones are application - specific instruction processors (ASIPs) and field - programmable gate arrays (FPGAs).
  • 10.
    A simplifi edtechnology comparison: programmability versus performance. GPP, general - purpose processor; CGRA, coarse - grained reconfi gurable architecture.
  • 11.
    Programmability vs performance •1. ASIC (Application-Specific Integrated Circuit) • Definition: A custom-designed chip tailored for a specific application or task. • Flexibility: Not reprogrammable after manufacturing. • Performance: Highest performance and power efficiency since it's optimized for a specific function. • Cost: High initial cost due to design and fabrication, but lower unit cost in large volumes. • Use Cases: Used in consumer electronics (smartphones, GPUs, AI accelerators, Bitcoin mining).
  • 12.
    FPGA (Field-Programmable GateArray) • Definition: A reconfigurable chip that can be programmed after manufacturing using hardware description languages (HDLs). • Flexibility: Highly flexible and reprogrammable for different applications. • Performance: Slower than ASICs but faster than general-purpose processors due to configurable hardware. • Cost: Lower upfront cost (no fabrication needed), but higher per-unit cost compared to ASICs in large volumes. • Use Cases: Used in prototyping, real-time processing, telecom (5G), aerospace, and data centers.
  • 13.
    ASIP (Application-Specific Instruction-setProcessor) • Definition: A processor optimized for a specific domain with a specialized instruction set. • Flexibility: More flexible than ASICs but less than FPGAs, as it can be programmed with software but has a fixed instruction set. • Performance: Higher than general-purpose processors for targeted tasks but lower than ASICs. • Cost: Lower cost than ASICs for domain-specific applications, but development can be complex. • Use Cases: Used in signal processing, embedded systems, and IoT devices requiring domain-specific processing.
  • 14.
    application - specificinstruction processors (ASIPs) and field - programmable gate arrays (FPGAs). • An ASIP is a processor with an instruction set customized for a specific application or domain. • Custom instructions efficiently implemented in hard ware are often integrated into a base processor with a basic instruction set. • An FPGA typically contains an array of computation units, memories, and their interconnections, and all three are usually programmable in the field by application builders. • FPGA technology often offers a good compromise: It is faster than software while being more flexible and having shorter development times than custom ASIC hardware implementations; • like GPPs, they are offered as off - the - shelf devices that can be programmed without going through chip fabrication. Because of the growing demand for reducing the time to market and the increasing cost of chip fabrication, FPGAs are becoming more popular for implementing digital designs.
  • 15.
    PROCESSOR ARCHITECTURES • Typically,processors are characterized either by their application or by their architecture
  • 16.
    PROCESSOR ARCHITECTURES • Therequirements space of an application is often large, and there is a range of implementation options. Thus, it is usually difficult to associate a particular architecture with a particular application. • In addition, some architectures combine different implementation approaches as seen in the PlayStation example of Section 1.1 . There, the graphics processor consists of a four - element SIMD array of vector processing functional units (FUs). Other SOC implementations consist of multiprocessors using very long instruction word (VLIW) and/or superscalar processors. • From the programmer ’ s point of view, sequential processors execute one instruction at a time. • many processors have the capability to execute several instructions concurrently in a manner that is transparent to the programmer, through techniques such as pipelining, multiple execution units, and multiple cores. Pipelining is a powerful technique that is used in almost all current processor implementations. • Pipelining is a powerful technique that is used in almost all current processor implementations. • Instruction - level parallelism (ILP) means that multiple operations can be executed in parallel within a program. ILP may be achieved with hardware, compiler, or operating system techniques.
  • 17.
    Processor: A FunctionalView • SOC designs and the processor used in each design. For these examples, we can characterize them as general purpose, or special purpose with support for gaming or signal processing applications.
  • 18.
    Processor: An ArchitecturalView • Simple Sequential Processor : • Sequential processors directly implement the sequential execution model. These processors process instructions sequentially from the instruction stream. The next instruction is not processed until all execution for the current instruction is complete and its results have been committed. • 1. fetching the instruction into the instruction register (IF), • 2. decoding the opcode of the instruction (ID), • 3. generating the address in memory of any data item residing there (AG), • 4. fetching data operands into executable registers (DF), • 5. executing the specifi ed operation (EX), and • 6. writing back the result to the register fi le (WB).
  • 19.
    Sequential processor model •Working of a Sequential Processor • Step 1: Instruction Fetch • The processor fetches the instruction from memory at the address in the Program Counter (PC). • The PC is then incremented to point to the next instruction. • Step 2: Instruction Decode • The fetched instruction is decoded by the Control Unit. • The CU determines what operation the instruction represents. • Step 3: Execute • The ALU performs computations (if needed). • Memory may be accessed for data load/store operations. • Step 4: Write Back • The result of the operation is written back to a register or memory. • Step 5: Repeat • The next instruction is fetched, and the cycle repeats.
  • 20.
    Sequential processor model •Applications • Simple Embedded Systems – Used in microcontrollers for basic tasks. • Educational Purposes – Teaching basic computer architecture concepts.
  • 21.
    Pipelined Processor • Pipeliningis a straightforward approach to exploiting parallelism that is based on concurrently performing different phases (instruction fetch, decode, execution, etc.) of processing an instruction. • Pipelining assumes that these phases are independent between different operations and can be overlapped — when this condition does not hold, the processor stalls the downstream phases to enforce the dependency. • Thus, multiple operations can be processed simultaneously with each operation at a different phase of its processing.
  • 23.
    Pipeline processor • Apipeline processor refers to a processing architecture where multiple stages (or steps) are arranged in a sequence to handle a stream of data. Each stage performs a specific operation and passes the result to the next stage, allowing for parallel execution and efficient processing. • Advantages of Pipeline Processing • Increased Throughput: Multiple operations can be executed in parallel. • Efficient Resource Utilization: Hardware or software resources are used effectively. • Faster Execution: Overall execution time is reduced due to concurrent processing. • Key Features of a Pipelined Processor Model • Instruction-Level Parallelism (ILP) – Multiple instructions are processed simultaneously in different pipeline stages. • Stages of Execution – The instruction cycle is divided into stages such as Fetch, Decode, Execute, Memory Access, and Write Back. • Increased Throughput – While an individual instruction still takes the same time, multiple instructions are in progress at once, reducing overall execution time. • Potential Hazards – Pipeline execution introduces challenges like data hazards, control hazards, and structural hazards.
Pipelined processor model — Example: Instruction Execution in a Pipeline

Cycle | IF | ID | EX | MEM | WB
  1   | I1 |    |    |     |
  2   | I2 | I1 |    |     |
  3   | I3 | I2 | I1 |     |
  4   | I4 | I3 | I2 | I1  |
  5   | I5 | I4 | I3 | I2  | I1
  6   | I6 | I5 | I4 | I3  | I2
 ...  | .. | .. | .. | ..  | ..
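The timing pattern in the table above has a simple closed form. As a sketch (assuming an ideal pipeline with no stalls), the cycle counts can be computed as:

```python
def pipelined_cycles(n, k):
    # k cycles for the first instruction to fill the k-stage pipeline,
    # then one instruction completes per cycle.
    return k + (n - 1)

def sequential_cycles(n, k):
    # Without pipelining, each instruction occupies all k stages in turn.
    return n * k

# 6 instructions on a 5-stage pipeline (IF ID EX MEM WB), as in the table:
print(pipelined_cycles(6, 5))   # 10 cycles
print(sequential_cycles(6, 5))  # 30 cycles
```

In the table, I1 completes WB at cycle 5 and each later instruction completes one cycle after its predecessor, which matches `k + (n - 1)`.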
    Pipelined processor model •Real-World Examples of Pipelined Processors • RISC Processors – ARM, MIPS, PowerPC. • Modern x86 CPUs – Intel and AMD processors use deep pipelining. • GPU Pipelines – Used for parallel graphics processing
    Superscalar Processor Model •A superscalar processor is an advanced CPU architecture that executes multiple instructions per clock cycle by using multiple execution units. Unlike pipelined processors, which improve instruction throughput by overlapping execution stages, superscalar processors increase throughput by issuing multiple instructions in parallel.
Pipeline Comparison: Superscalar vs. Scalar

Cycle | Scalar Pipeline (Single Issue) | Superscalar Pipeline (Dual Issue)
  1   | IF (I1)                        | IF (I1, I2)
  2   | ID (I1)                        | ID (I1, I2)
  3   | EX (I1)                        | EX (I1, I2)
  4   | MEM (I1)                       | MEM (I1, I2)
  5   | WB (I1)                        | WB (I1, I2)
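The comparison above generalizes to any issue width. A minimal sketch (assuming perfect instruction-level parallelism and no hazards, which real code rarely achieves):

```python
import math

def superscalar_cycles(n, k, w):
    # Instructions issue in groups of w; each group behaves like one
    # instruction in a k-stage pipeline.
    groups = math.ceil(n / w)
    return k + (groups - 1)

# 2 instructions, 5 stages:
print(superscalar_cycles(2, 5, 1))  # single issue (scalar): 6 cycles
print(superscalar_cycles(2, 5, 2))  # dual issue: both finish in 5 cycles
```

This matches the table: with dual issue, I1 and I2 both reach WB at cycle 5, whereas the scalar pipeline needs a sixth cycle for I2.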
Examples of Superscalar Processors • Intel Pentium (First Superscalar x86 Processor) • AMD Ryzen (Modern superscalar CPU) • ARM Cortex-A Series (used in mobile devices) • IBM PowerPC Processors (used in servers and gaming consoles) • Apple M-Series Chips (highly optimized superscalar processors)
SIMD Architectures: Array and Vector Processors
• Single Instruction, Multiple Data (SIMD) architectures allow a single instruction to operate on multiple data elements simultaneously. This approach enhances performance in applications requiring high data parallelism, such as graphics processing, scientific computing, and machine learning.
• In SIMD architectures:
• A single instruction is executed across multiple data elements simultaneously.
• It is highly efficient for operations on large datasets, such as matrix multiplications and image processing.
• There are two main types of SIMD architectures:
• Array Processors
• Vector Processors
• Array Processors: Array processors consist of multiple processing elements (PEs) that operate in parallel under the control of a single instruction stream.
SIMD Architectures: Array and Vector Processors
• Vector Processors: Vector processors process vectors (arrays of data) in a single operation, rather than working on individual elements.
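The scalar-versus-SIMD distinction can be sketched in software. This is only a conceptual model (Python lists stand in for vector registers; real SIMD hardware applies the operation to all lanes in one instruction):

```python
def scalar_add(a, b):
    # Scalar model: one element per loop iteration (one "instruction" each).
    out = []
    for i in range(len(a)):
        out.append(a[i] + b[i])
    return out

def simd_add(a, b):
    # SIMD model: one "instruction" (add) applied across all lanes at once;
    # each zip pair plays the role of a processing element's data.
    return [x + y for x, y in zip(a, b)]

print(simd_add([1, 2, 3, 4], [10, 20, 30, 40]))  # [11, 22, 33, 44]
```

Both functions compute the same result; the point is that the SIMD version expresses the whole vector operation as a single step, which is what array and vector processors exploit in hardware.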
SOC Memory Examples • Table 1.12 shows a number of different SOC designs and their cache and memory configuration. It is important for SOC designers to consider whether to put RAM and ROM on-die or off-die.
• 1. Real Memory (Physical Memory)
• Definition: Real memory refers to the actual physical RAM (Random Access Memory) installed in a computer. It consists of hardware-based memory modules that store active data and programs.
• Characteristics:
✔️ Limited in size (depends on installed RAM).
✔️ Directly accessed by the CPU.
✔️ Faster but expensive.
✔️ If a system runs out of real memory, it relies on virtual memory.
• 2. Virtual Memory
• Definition: Virtual memory is a technique that allows a computer to use a part of the storage (HDD/SSD) as additional memory when RAM is full. It is managed by the operating system using a page file (Windows) or swap space (Linux/macOS).
• Characteristics:
✔️ Provides the illusion of a larger memory space.
✔️ Slower than real memory (because it uses the disk).
✔️ Helps run large programs with limited RAM.
✔️ Uses paging and swapping techniques.
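The paging technique mentioned above can be sketched as an address translation. This is a simplified model (the 4 KB page size is a common choice, and the page-table contents below are invented for illustration):

```python
PAGE_SIZE = 4096  # bytes per page (a common choice)

# Page table: virtual page number -> physical frame number.
# None models a page currently swapped out to disk.
page_table = {0: 7, 1: 3, 2: None}

def translate(vaddr):
    vpn = vaddr // PAGE_SIZE     # virtual page number
    offset = vaddr % PAGE_SIZE   # offset within the page
    frame = page_table.get(vpn)
    if frame is None:
        # Page fault: the OS would swap the page in from disk here.
        raise RuntimeError("page fault")
    return frame * PAGE_SIZE + offset

print(hex(translate(0x1010)))  # page 1 maps to frame 3 -> 0x3010
```

Only the page number is translated; the offset within the page passes through unchanged, which is what makes the illusion of a large flat address space cheap to maintain.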
SYSTEM-LEVEL INTERCONNECTION
• SOC technology typically relies on the interconnection of predesigned circuit modules (known as intellectual property [IP] blocks) to form a complete system, which can be integrated onto a single chip.
• A well-designed interconnection scheme should have efficient communication protocols.
• This facilitates interoperability between IP blocks designed by different people from different organizations and encourages design reuse.
• SOC interconnect methods can be classified into two main approaches: buses and network-on-chip.
SYSTEM-LEVEL INTERCONNECTION • Bus-Based Approach
• Bus-Based Interconnection in SoC
• In a System-on-Chip (SoC), components like the CPU, GPU, memory, and peripherals must communicate efficiently. A bus-based interconnection system enables this communication, ensuring data transfer between these components.
• 1. What is a Bus in SoC?
• A bus is a communication pathway that transfers data, addresses, and control signals between different components within the SoC. It allows multiple components to share the same communication medium.
• Types of Signals in a Bus
• Data Bus → Transfers actual data between components.
• Address Bus → Carries memory addresses for data access.
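The role of the address bus can be sketched as address decoding: the address selects which IP block responds, and the data bus then carries the payload. The address map below is entirely hypothetical, invented for the example:

```python
# Hypothetical SoC memory map: (device, start address, end address).
ADDRESS_MAP = [
    ("RAM",  0x0000_0000, 0x0FFF_FFFF),
    ("GPU",  0x1000_0000, 0x1000_FFFF),
    ("UART", 0x2000_0000, 0x2000_00FF),
]

def decode(addr):
    # The bus's address decoder selects the one device whose range
    # contains the address; all others ignore the transaction.
    for name, lo, hi in ADDRESS_MAP:
        if lo <= addr <= hi:
            return name
    raise ValueError("bus error: no device mapped at this address")

print(decode(0x1000_0040))  # GPU
```

Because every master and slave shares this one medium, only one transaction can use the bus at a time; that contention is the limitation the network-on-chip approach below addresses.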
Network-on-Chip Approach
• 1. What is Network-on-Chip (NoC)?
• NoC (Network-on-Chip) is an advanced interconnect architecture where multiple processing elements (PEs), memory units, and peripherals communicate via a structured network instead of a shared bus or crossbar.
• 🔹 Think of NoC like the Internet inside an SoC!
• Instead of a single bus, components communicate via routers and links, like packets in a network.
• This eliminates bus contention, increases parallelism, and scales efficiently for multi-core and AI SoCs.
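To make the packets-over-routers idea concrete, here is a sketch of dimension-ordered (XY) routing, one common routing scheme for 2D-mesh NoCs (the mesh coordinates are invented for the example; real NoCs differ in topology and routing policy):

```python
def xy_route(src, dst):
    # XY routing: a packet first travels along the X dimension until the
    # destination column is reached, then along Y. One router hop per step.
    x, y = src
    hops = []
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        hops.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        hops.append((x, y))
    return hops

# Packet from the router at (0, 0) to the router at (2, 1):
print(xy_route((0, 0), (2, 1)))  # [(1, 0), (2, 0), (2, 1)]
```

Because every packet between the same pair of routers takes the same X-then-Y path, XY routing is deadlock-free on a mesh, which is one reason it is popular in simple NoC designs.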
AN APPROACH FOR SOC DESIGN
• Two important ideas in a design process are figuring out the requirements and specifications, and iterating through different stages of design toward an efficient and effective completion.
• Requirements and Specifications
• Requirements and specifications are fundamental concepts in any system design situation. There must be a thorough understanding of both before a design can begin.
• The system requirements are the largely externally generated criteria for the system. They may come from competition, from sales insights, from customer requests, from product profitability analysis, or from a combination. Typical sources include:
• compatibility with previous designs or published standards,
• reuse of previous designs,
AN APPROACH FOR SOC DESIGN
• customer requests/complaints,
• sales reports,
• cost analysis,
• competitive equipment analysis, and
• trouble reports (reliability) of previous products and competitive products.
• The designer can also introduce new requirements based on new technology, new ideas, or new materials that have not been used in a similar systems environment.
• The specification does not complete any part of the design process; it initializes the process. Now the design can begin with the selection of components and approaches, the study of alternatives, and the optimization of the parts of the system.
Design Iteration
• Design is always an iterative process. So, the obvious question is how to get the very first, initial design.
• Initial Design: This is the first design that shows promise in meeting the key requirements, while other performance and cost criteria are not considered. For instance, the processor, memory, or input/output (I/O) should be sized to meet high-priority real-time constraints. Promising components and their parameters are selected and analyzed to provide an understanding of their expected idealized performance and cost.