SlideShare a Scribd company logo
1 of 69
Download to read offline
Introduction to the Systems
Approach
Mr. A. B. Shinde
Assistant Professor,
Electronics Engineering,
PVPIT, Budhgaon, Sangli
shindesir.pvp@gmail.com
System Architecture
• The term architecture denotes:
– The operational structure and
– The user’s view of the system.
• Over time, it has evolved to include
– Functional specification and
– Hardware implementation.
• The system architecture defines the system - level building blocks
2
System Architecture: An Overview
3
The increasing transistor density on
a silicon die
The decrease of transistor cost
over the years
Components of the System
• Processors (core, multimedia & coprocessors)
• Memories (storing elements)
• Interconnects
4
A basic SOC system model
Hardware and Software
• Crucial decision to be made is:
1. Which components are to be implemented in hardware and
2. Which components are to be implemented in software.
• Software implementation usually uses general - purpose processor
(GPP), which interprets instructions at run time.
• This architecture offers flexibility and adaptability and provides a way
of sharing resources among different applications
• Hardware implementation of the ISA (Instruction Set Architecture) is
generally slower and consumes more power.
5
Hardware and Software
6
 Most software developers use high - level languages and tools that
enhance productivity, such as program development environments,
optimizing compilers and performance profilers.
 Direct implementation of applications using hardware (ASICs), provides
high performance at the expense of programmability, flexibility,
productivity and cost.
Benefits Drawbacks
Hardware
Fast,
low power consumption
Inflexible,
unadaptable,
complex to build and test
Software
Flexible,
Adaptable,
simple to build and test
Slow,
high power consumption
Hardware and Software
7
Programmability versus Performance
Coarse - Grained Reconfigurable Architecture
Application – Specific Instruction Processors
System Architecture: An Overview
• Custom ASIC hardware and software on GPPs can be seen as two
extremes in the technology spectrum with different trade – offs
• An ASIP is a processor with an instruction set customized for a
specific application or domain.
• An FPGA typically contains an array of computation units, memories,
and their interconnections, all are programmable in the field by user.
• Because of the growing demand for reducing the time to market and
the increasing cost of chip fabrication, FPGAs are becoming more
popular for implementing digital designs.
• Coarse - Grained Reconfigurable Architecture (CGRA) It contains
logic blocks that process byte - wide or multiple byte - wide data,
which can form building blocks of datapaths.
• Structured ASIC It allows application builders to customize the
resources before fabrication. While it offers performance close to that
of ASIC.
• Digital Signal Processors (DSPs) The organization and instruction
set for these devices are optimized for digital signal processing
applications.
8
Processor Architectures
• Processors are characterized either by their application or by their
architecture (or structure)
9
Processors Identified by Function
Processors Identified by Architecture
Processor Architectures
• Sequential processors execute one instruction at a time.
• Many processors have the capability to execute several instructions
concurrently by means of pipelining, multiple execution units, and
multiple cores.
• Pipelining is a powerful technique.
• Instruction - level parallelism (ILP) means that multiple operations
can be executed in parallel within a program.
• ILP may be achieved with hardware, compiler, or operating system,
provided that there is no data dependency
• In general, a computer architecture consists of one or more
interconnected processor elements (PEs) that operate concurrently,
solving a single overall problem.
10
Instruction - level parallelism (ILP)
• Instruction-level parallelism (ILP) is a measure of how many of the
operations in a computer program can be performed simultaneously.
Consider the following program:
• For Example:
1. e = a + b
2. f = c + d
3. g = e * f
Here, Operation 3 depends on the results of operations 1 and 2, so it
cannot be calculated until both of them are completed. As, operations
1 and 2 do not depend on any other operation, so they can be
calculated simultaneously.
If each operation is completed in one unit of time then three
instructions can be completed in two units of time, giving an ILP of 3/2.
11
Processor: A Functional View
• Table shows different SOC designs and the processor used in each
design.
• For these examples, we can characterize them as general purpose, or
special purpose.
12
Processor Models for Different SOC Examples
Processor: An Architectural View
• Simple Sequential Processor directly implement the sequential
execution model.
• The next instruction is not processed until all execution for the current
instruction is complete and its results have been committed.
13
Instruction execution sequence.
1. IF - fetching the instruction into the instruction register (IF),
2. ID - decoding the opcode of the instruction (ID),
3. AG - generating the address in memory of any data item residing there,
4. DF - fetching data operands into executable registers (DF),
5. EX - executing the specified operation (EX), and
6. WB - writing back the result to the register file (WB).
System Architecture: An Overview
• Sequential processor model
14
During execution, a sequential processor executes one or more
operations per clock cycle from the instruction stream.
An instruction is a container that represents the smallest execution
packet managed explicitly by the processor.
Scalar and superscalar processors consume one or more instructions per
cycle, where each instruction contains a single operation.
System Architecture: An Overview
• Pipelined Processor Pipelining is approach to exploit parallelism that
is based on concurrently performing different phases (instruction fetch,
decode, execution, etc.) of processing an instruction.
• Pipelining assumes that these phases are independent — when this
condition does not hold, the processor stalls the downstream phases
to enforce the dependency.
• Thus, multiple operations can be processed simultaneously with each
operation at a different phase of its processing.
15
Instruction timing in a pipelined processor.
System Architecture: An Overview
• Pipelined Processor
16
Figure shows the general form of a pipelined processor.
Static pipeline, requires the processor to go through all stages or
phases of the pipeline.
Dynamic pipeline allows the bypassing of one or more pipeline
stages, depending on the requirements of the instruction.
The more complex dynamic pipelines allow instructions to complete out of
(sequential) order, or even to initiate out of order.
System Architecture: An Overview
• When pipelining does not execute multiple instructions at exactly the
same time, there are other techniques to do the same.
• Most instructions consist of only a single operation, this kind of
parallelism has been named ILP (instruction level parallelism).
• Two architectures that exploit (implement) ILP are superscalar and
VLIW (Very Long Instruction Word) processors.
• A superscalar processor dynamically examines the instruction
stream to determine which operations are independent and can be
executed.
• A VLIW processor relies on the compiler to analyze the available
operations (OP) and to schedule independent operations into wide
instruction words.
17
System Architecture: An Overview
• Figure shows the instruction timing of a pipelined superscalar or
VLIW processor executing two instructions per cycle.
• In this case, all the instructions are independent so that they can be
executed in parallel.
18
System Architecture: An Overview
• Superscalar Processors:
• Dynamic pipelined processors has limitations to execute a single
operation per cycle.
• This limitation can be avoided with the addition of multiple
functional units and a dynamic scheduler to process more than one
instruction per cycle.
• Superscalar processors can achieve execution rates of several
instructions per cycle (usually limited to two, but more is possible
depending on the application).
19
System Architecture: An Overview
20
Superscalar processor model
System Architecture: An Overview
• Superscalar Processors:
• A superscalar processor adds a scheduling instruction window that
analyzes multiple instructions from the instruction stream in each
cycle.
• These instructions are treated in the same manner as in a pipelined
processor.
• Because of the complexity of the dynamic scheduling logic, high –
performance superscalar processors are limited to processing
four to six instructions per cycle.
• In Flynn's taxonomy, a single-core superscalar processor is
classified as an SISD processor whereas a multi-core superscalar
processor is classified as an MIMD processor
21
System Architecture: An Overview
• VLIW (Very Long Instruction Word) Processors
22
In contrast to dynamic analysis in hardware to determine which
operations can be executed in parallel, VLIW processors rely on static
analysis in the compiler
VLIW processor model
System Architecture: An Overview
• VLIW processors are less complex than superscalar processors and
have the potential for higher performance.
• A VLIW processor executes operations from statically scheduled
instructions.
• VLIW processors rely on the static analysis performed by the
compiler and are unable to take advantage of any dynamic
execution characteristics.
• A simple VLIW implementation results in high performance.
23
System Architecture: An Overview
• VLIW processors:
• VLIW instruction encodes multiple operations, at least one
operation for each execution unit of a device.
• For example, if a VLIW device has four execution units, then a
VLIW instruction for the device has four operation fields, each field
specifying what operation should be done on that corresponding
execution unit.
• To accommodate these operation fields, VLIW instructions are usually
at least 64 bits wide, and far wider on some architectures.
• For example:
• In one cycle, it does a floating-point multiply, a floating-point add, and
two auto increment loads. All of this fits in one 48-bit instruction:
f12 = f0 * f4, f8 = f8 + f12, f0 = dm(i0, m3), f4 = pm(i8, m9);
24
System Architecture: An Overview
• SIMD Architectures:
25
System Architecture: An Overview
• SIMD Architectures:
26
SIMD Processable Patterns SIMD Unprocesable Patterns
Example: Brightness Computation by SIMD Operations
System Architecture: An Overview
• SIMD Architectures: Array and Vector Processors:
• The SIMD class of processor architecture includes both array and
vector processors.
• From the view of an assembly - level programmer, programming
SIMD architecture appears to be similar to programming a simple
processor.
27
System Architecture: An Overview
• SIMD Architectures: Array and Vector Processors:
• The two popular types of SIMD processor are the array processor
and the vector processor.
• They differ both in their implementations and in their data
organizations.
• An array processor consists of many interconnected processor
elements, each having their own local memory space.
• A vector processor consists of a single processor that references a
global memory space and has special function units that operate
on vectors.
28
System Architecture: An Overview
• Array processor
29
Array processor model.
Array processor: Is a set of parallel processor elements connected via
one or more networks, possibly including local and global inter element
communications and control communications.
Each processor element (PE) has its own private memory.
Data is distributed across the elements in a regular fashion… i.e. dependent
on both the actual structure of the data and also the computations to be
performed on the data.
System Architecture: An Overview
• Array processor:
• Direct access to global memory or another processor element’s
local memory is expensive, so intermediate values are propagated
through the array through local inter-processor connections.
• Since instructions are broadcast, processor element cannot alter
the flow of the instruction stream;
however, individual processor elements can conditionally
disable instructions based on local status information i.e. processor
elements remains idle.
30
System Architecture: An Overview
• Array processor:
• The actual instruction stream consists of more than a fixed stream
of operations.
• An array processor is typically coupled to a general - purpose
control processor that provides both scalar operations as well as
array operations that are broadcasted to all processor elements in
the array.
• The control processor performs the scalar sections of the
application, interfaces with the outside world, and controls the
flow of execution;
• The array processor performs the array sections of the application
as directed by the control processor.
31
System Architecture: An Overview
• Array processor:
32
System Architecture: An Overview
• Vector Processors:
• A vector processor is a single processor that resembles a
traditional single stream processor, except that some of the functional
units (and registers) operate on vectors (sequences of data values).
• These functional units are deeply pipelined and have high clock
rates.
• The vector pipelines often have the rapid delivery of the input
vector data elements, together with the high clock rates, results in a
significant throughput.
33
System Architecture: An Overview
34
Vector processor model.
System Architecture: An Overview
• A typical vector processor configuration consists of:
a vector register file,
one vector addition unit,
one vector multiplication unit, and
one vector reciprocal unit (used along with the vector multiplication unit to
perform division)
35
Vector processor model.
System Architecture: An Overview
• In modern vector processors, vectors are loaded into special vector
registers and stored back into memory
• Vector processors have several features that enable them to achieve
high performance.
• One of the feature is, the ability to concurrently load and store
values between the vector register file and the main memory while
performing computations on values in the vector register file.
• If the length of vector register is limited and vectors longer than
the register length needs to be processed then they are processed
in segments.
36
System Architecture: An Overview
• Most vector processors support a result bypassing:
— that allows a follow - on computation to commence as soon as
the first value is available from the preceding computation.
Thus, instead of waiting for the entire vector to be processed, the
follow - on computation can be significantly overlapped with the
preceding computation that it is dependent on.
37
Memory and Addressing
• Memory requirements in SOC applications varies significantly.
• In one case, the memory program can reside entirely in an on-chip
ROM, with the data in on-chip RAM.
• In another case, the memory system might support an elaborate
operating system requiring a large off-chip memory (system on a
board), with a memory management unit and cache.
38
Memory and Addressing
• Why not simply include memory with the processor on the die?
• This has many attractions
(Benefits of embedding memory & processor on same die):
1. It improves the accessibility of memory, improving both memory
access time and bandwidth.
2. It reduces the need for large cache.
3. It improves performance for memory based applications.
39
Memory and Addressing
• Drawbacks of embedding memory & processor on same die:
• First problem is that DRAM memory process technology differs
from standard microprocessor process technology, and would
cause some sacrifice in achievable bit density.
• Second problem is more serious: If memory were restricted to the
processor die, its size would be correspondingly limited.
• Thus, the conventional processor die model has evolved to implement
multiple robust homogeneous processors sharing the higher levels of
a two or three-level cache structure with the main memory off-die, on
its own multi die module.
40
Memory and Addressing
• From a design complexity point of view, this has the advantage of
being a ― universal ‖ solution: One implementation fits all
applications, although not necessarily equally well.
• So, the production quantities can be large enough to justify the
costs.
41
Memory and Addressing
• An alternative to the previous approach is integrate the memory on
system die.
• For specific applications, whose memory size can be bounded, we can
implement an integrated memory SOC. This concept is illustrated in
above figure
42
Memory and Addressing
• SOC Memory Examples
• It is important for SOC designers to consider whether to put RAM and
ROM on-die or off-die.
• Table shows various examples of SOC embedded memory macro cell.
43
SOC Memory
Considerations
Memory and Addressing
• Addressing: The Architecture of Memory
• Addressing facilities are available to the programmer.
• Some facilities are available to the application programmer and
some to the operating system programmer.
• Virtual memory enables programs requiring larger storage than
the physical memory.
• When virtual addressing facilities are properly implemented and
programmed, memory can be efficiently and securely accessed.
44
Memory and Addressing
• Virtual memory is often supported by a memory management unit.
• The physical memory address is determined by a sequence of (at
least) three steps:
1. The application produces a process address, this together with the
process or user ID, defines the virtual address:
virtual address = offset + (program) base + index ,
where the offset is specified in the instruction while the base and
index values are in specified registers.
45
Memory and Addressing
• Virtual memory is often supported by a memory management unit.
• The physical memory address is determined by a sequence of (at
least) three steps:
2. Since multiple processes must cooperate in the same memory space,
the process addresses must be coordinated and relocated.
This is typically done by a segment table.
Upper bits of the virtual address are used to address a segment
table.
system address = virtual address + (process) base
where the system address must be less than the bound .
46
Memory and Addressing
• Virtual memory is often supported by a memory management unit.
• The physical memory address is determined by a sequence of (at
least) three steps:
3. Virtual versus real. For many SOC applications, the memory space
exceeds the available (real) implemented memory.
Memory space is implemented on disk and only the recently used
regions (pages) are brought into memory.
47
System - Level Interconnection
• SOC technology typically relies on the interconnection of
predesigned circuit modules (known as intellectual property (IP)
blocks) to form a complete system.
• In this way, the design task is raised from a circuit level to a system
level.
• Central to the system – level performance and the reliability of the
finished product is dependent on the method of interconnection used.
• A well - designed interconnection scheme should have efficient
communication protocols
48
System - Level Interconnection
• It should provide efficient communication between different modules
maximizing the degree of parallelism.
• SOC interconnect methods can be classified into two main
approaches: buses and network - on - chip,
49
System - Level Interconnection
• Bus - Based Approach
50
With the bus - based approach, IP blocks are designed to conform to published bus
standards (AMBA or CoreConnect).
Communication between modules is achieved through the sharing of the physical
connections of address, data, and control bus signals.
Usually, two or more buses are employed in a system, organized in a hierarchical
fashion.
To optimize system - level performance and cost, the bus closest to the CPU has the
highest bandwidth, and the bus farthest from the CPU has the lowest bandwidth.
System - Level Interconnection
• Network - on - Chip Approach
51
A network - on - chip system consists of an
array of switches, either dynamically
switched as in a crossbar or statically
switched as in a mesh.
The crossbar approach uses
asynchronous channels to connect
synchronous modules that can operate at
different clock frequencies.
This approach has the advantage of higher
throughput than a bus - based system
while making integration of a system with
multiple clock domains easier.
System - Level Interconnection
• Network - on - Chip Approach
52
This interconnect scheme is based on a
two - dimensional mesh topology.
All communications between switches
are conducted through data packets,
routed through the router interface circuit
within each node.
Since the interconnections between
switches have a fixed distance,
interconnect - related problems such as
wire delay and cross talk noise are much
reduced.
System - Level Interconnection
53
Interconnect Models for Different SOC Examples
An Approach For Soc Design
• Two important ideas in a design process are:
1. Figuring out the requirements and specifications, and
2. Iterating through different stages of design toward an efficient and
effective completion.
• Requirements and Specifications
• Design Iteration
– Initial Design
– Optimized Design
54
An Approach For Soc Design
• Requirements and Specifications :
• There must be a deep understanding before a starting of design
process.
• They are useful at the beginning and at the end of the design process:
– At the beginning, to clarify what needs to be achieved; and
– At the end, as a reference against which the completed design
can be evaluated.
55
An Approach For Soc Design
• Requirements and Specifications :
• The system requirements are the externally generated criteria for
the system.
• They may come from competition, from sales insights, from
customer requests, from product profitability analysis, or from a
combination of all.
• Requirements are rarely definitive of anything about the system.
Indeed, requirements can frequently be unrealistic:
“I want it fast, I want it cheap, and I want it now ”
56
An Approach For Soc Design
• Requirements and Specifications :
• It is important for the designer to analyze carefully the
requirements, expressions, and to spend sufficient time in
understanding the market situation to determine all the factors
expressed in the requirements.
57
An Approach For Soc Design
• Requirements and Specifications :
• Some of the factors the designer considers in determining
requirements include
• Compatibility with previous designs or published standards,
• Reuse of previous designs,
• Customer requests/complaints,
• Sales reports,
• Cost analysis,
• Competitive equipment analysis, and
• Trouble reports (reliability) of previous products and
competitive products.
58
An Approach For Soc Design
• Design Iteration:
• Design is always an iterative process.
• So, the obvious question is how to get the very first, initial design?
• This is the design that we can then iterate through and optimize
according to the design criteria.
59
An Approach For Soc Design
• Design Iteration:
• Initial Design:
• This is the first design that meets
the key requirements, while other
performance and cost criteria’s
are not considered.
• For instance, processor or
memory or input/output (I/O)
should be sized to meet all (high
– priority, real – time) constraints.
• Promising components and their
parameters are selected for
better performance.
• It is simplified model of the
expected computational capacity
or data bandwidth capability.
60
The SOC initial design process
An Approach For Soc Design
• Design Iteration:
• Optimized Design:
• Once the base performance (or area) requirements are met and the
base functionality is ensured, then the goal is to minimize:
– the cost (area) and/or
– the power consumption or
– the design effort required to complete the design.
• This is the iterative step of the process.
• The first steps of this process use higher - fidelity tools (simulations,
trial layouts, etc.) to ensure that the initial design actually does satisfy
the design specifications and requirements.
• The later steps refine, complete, and improve the design according to
the design criteria.
61
System Architecture and Complexity
• The basic difference between processor architecture and system
architecture is that the system adds another layer of complexity, and
the complexity of these systems limits the cost savings.
• Historically, the computer is a single processor plus a memory.
• As long as this is fixed (within broad tolerances), implementing that
processor on one or more silicon die does not change the design
complexity.
• Once die densities enable a scalar processor to fit on a chip, the
complexity issue changes.
62
System Architecture and Complexity
• As transistor densities significantly improve after this point, there are
obvious processor extension strategies to improve performance:
1. Additional Cache: Here we add cache storage and, as large
caches have slower access times, a second - level cache.
2. A More Advanced Processor: We implement a superscalar or a
VLIW processor that executes more than one instruction each cycle.
Additionally, we can speed up the execution units that affects the
critical path delay, especially the floating - point execution times.
3. Multiple Processors: Now we implement multiple (superscalar)
processors and their associated multilevel caches.
63
System Architecture and Complexity
• Instead of the 100,000
transistor processors, our
advanced processor has
millions of transistors.
• The way to manage this
complexity is to reuse
designs.
• So, reusing several simpler
processor designs
implemented on a die is
preferable to a new, more
advanced, single
processor.
• For this to work, we need a
good interconnection
mechanism to access the
various processors and
memory.
64
Complexity of design
System Architecture and Complexity
• So, when an application is well specified, the system-on-a-chip
approach includes
1. Multiple (usually) heterogeneous processors, each specialized for
specific parts of the application;
2. The main memory with (often) ROM for partial program storage;
3. A relatively simple, small (single - level) cache structure or
buffering schemes associated with each processor; and
4. A bus or switching mechanism for communications.
65
Product Economics And Implications For SoC
• Factors Affecting
Product Costs
• The basic cost and
profitability of a product
depend on many factors:
• its technical appeal,
• its cost,
• the market size, and
• the effect the product
has on future products.
• The issue of cost goes well
beyond the product’s
manufacturing cost.
66
Project cost components
Product Economics And Implications For SoC
• Factors Affecting Engineering Costs
67
Product Economics And Implications For SoC
• Modelling Product Economics and Technology Complexity
• To put all this into perspective, consider a general model of a product’s
average unit cost:
unit cost = (project cost) / (number of units).
• The product cost is simply the sum of all the fixed and variable costs.
• We represent the fixed cost as a constant, Kf .
• It is also clear that the variable costs are of the form Kv × n , where n is
the number of units.
68
Thank You…
This presentation is published only for Educational Purpose
69

More Related Content

What's hot

System On Chip
System On ChipSystem On Chip
System On Chip
anishgoel
 
System On Chip (SOC)
System On Chip (SOC)System On Chip (SOC)
System On Chip (SOC)
Shivam Gupta
 

What's hot (20)

SOC Chip Basics
SOC Chip BasicsSOC Chip Basics
SOC Chip Basics
 
Arm architecture chapter2_steve_furber
Arm architecture chapter2_steve_furberArm architecture chapter2_steve_furber
Arm architecture chapter2_steve_furber
 
System On Chip
System On ChipSystem On Chip
System On Chip
 
Superscalar and VLIW architectures
Superscalar and VLIW architecturesSuperscalar and VLIW architectures
Superscalar and VLIW architectures
 
System On Chip
System On ChipSystem On Chip
System On Chip
 
SoC Design
SoC DesignSoC Design
SoC Design
 
VLIW Processors
VLIW ProcessorsVLIW Processors
VLIW Processors
 
SOC design
SOC design SOC design
SOC design
 
Arm processor
Arm processorArm processor
Arm processor
 
UNIT 4 B.docx
UNIT 4 B.docxUNIT 4 B.docx
UNIT 4 B.docx
 
ARM architcture
ARM architcture ARM architcture
ARM architcture
 
RISC (reduced instruction set computer)
RISC (reduced instruction set computer)RISC (reduced instruction set computer)
RISC (reduced instruction set computer)
 
ARM Processors
ARM ProcessorsARM Processors
ARM Processors
 
ARM CORTEX M3 PPT
ARM CORTEX M3 PPTARM CORTEX M3 PPT
ARM CORTEX M3 PPT
 
System On Chip (SOC)
System On Chip (SOC)System On Chip (SOC)
System On Chip (SOC)
 
Vxworks
VxworksVxworks
Vxworks
 
Unit II arm 7 Instruction Set
Unit II arm 7 Instruction SetUnit II arm 7 Instruction Set
Unit II arm 7 Instruction Set
 
ARM - Advance RISC Machine
ARM - Advance RISC MachineARM - Advance RISC Machine
ARM - Advance RISC Machine
 
ARM Exception and interrupts
ARM Exception and interrupts ARM Exception and interrupts
ARM Exception and interrupts
 
Vx works RTOS
Vx works RTOSVx works RTOS
Vx works RTOS
 

Viewers also liked

System approach to management
System approach to managementSystem approach to management
System approach to management
17somya
 
organizational structure
organizational structureorganizational structure
organizational structure
Saqiba Nargis
 
Soc government
Soc  governmentSoc  government
Soc government
15muellt
 

Viewers also liked (20)

SOC Peripheral Components & SOC Tools
SOC Peripheral Components & SOC ToolsSOC Peripheral Components & SOC Tools
SOC Peripheral Components & SOC Tools
 
System approach to management
System approach to managementSystem approach to management
System approach to management
 
System approach
System approachSystem approach
System approach
 
Systems Approach to Management
Systems Approach to ManagementSystems Approach to Management
Systems Approach to Management
 
The system Approach of Management
The system Approach of ManagementThe system Approach of Management
The system Approach of Management
 
Spartan-II FPGA (xc2s30)
Spartan-II FPGA (xc2s30)Spartan-II FPGA (xc2s30)
Spartan-II FPGA (xc2s30)
 
Advanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILPAdvanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILP
 
How to Make Effective Presentation
How to Make Effective PresentationHow to Make Effective Presentation
How to Make Effective Presentation
 
System on chip buses
System on chip busesSystem on chip buses
System on chip buses
 
SoC: System On Chip
SoC: System On ChipSoC: System On Chip
SoC: System On Chip
 
Modern theories of management
Modern theories of managementModern theories of management
Modern theories of management
 
Image processing fundamentals
Image processing fundamentalsImage processing fundamentals
Image processing fundamentals
 
Security Operation Center - Design & Build
Security Operation Center - Design & BuildSecurity Operation Center - Design & Build
Security Operation Center - Design & Build
 
Slideshare ppt
Slideshare pptSlideshare ppt
Slideshare ppt
 
organizational structure
organizational structureorganizational structure
organizational structure
 
Soc government
Soc  governmentSoc  government
Soc government
 
Cpdd[1]
Cpdd[1]Cpdd[1]
Cpdd[1]
 
Sonia.Sharma
Sonia.SharmaSonia.Sharma
Sonia.Sharma
 
Chapter 08
Chapter 08Chapter 08
Chapter 08
 
SoC Power Reduction
SoC Power ReductionSoC Power Reduction
SoC Power Reduction
 

Similar to SOC System Design Approach

FIne Grain Multithreading
FIne Grain MultithreadingFIne Grain Multithreading
FIne Grain Multithreading
Dharmesh Tank
 
Module2 MultiThreads.ppt
Module2 MultiThreads.pptModule2 MultiThreads.ppt
Module2 MultiThreads.ppt
shreesha16
 
Digital signal processor architecture
Digital signal processor architectureDigital signal processor architecture
Digital signal processor architecture
komal mistry
 
Computer organization & architecture chapter-1
Computer organization & architecture chapter-1Computer organization & architecture chapter-1
Computer organization & architecture chapter-1
Shah Rukh Rayaz
 

Similar to SOC System Design Approach (20)

Advanced processor principles
Advanced processor principlesAdvanced processor principles
Advanced processor principles
 
Computer Organization: Introduction to Microprocessor and Microcontroller
Computer Organization: Introduction to Microprocessor and MicrocontrollerComputer Organization: Introduction to Microprocessor and Microcontroller
Computer Organization: Introduction to Microprocessor and Microcontroller
 
Basics of micro controllers for biginners
Basics of  micro controllers for biginnersBasics of  micro controllers for biginners
Basics of micro controllers for biginners
 
FIne Grain Multithreading
FIne Grain MultithreadingFIne Grain Multithreading
FIne Grain Multithreading
 
Parallel Computing
Parallel ComputingParallel Computing
Parallel Computing
 
Module2 MultiThreads.ppt
Module2 MultiThreads.pptModule2 MultiThreads.ppt
Module2 MultiThreads.ppt
 
Multithreaded processors ppt
Multithreaded processors pptMultithreaded processors ppt
Multithreaded processors ppt
 
Advanced computer architecture lesson 5 and 6
Advanced computer architecture lesson 5 and 6Advanced computer architecture lesson 5 and 6
Advanced computer architecture lesson 5 and 6
 
esunit1.pptx
esunit1.pptxesunit1.pptx
esunit1.pptx
 
MIPS Implementation and pipelining
MIPS Implementation and pipelining MIPS Implementation and pipelining
MIPS Implementation and pipelining
 
Unit 2 ca- control unit
Unit 2 ca- control unitUnit 2 ca- control unit
Unit 2 ca- control unit
 
lect4_SDNbasic_openflow.pptx
lect4_SDNbasic_openflow.pptxlect4_SDNbasic_openflow.pptx
lect4_SDNbasic_openflow.pptx
 
Chip Multithreading Systems Need a New Operating System Scheduler
Chip Multithreading Systems Need a New Operating System Scheduler Chip Multithreading Systems Need a New Operating System Scheduler
Chip Multithreading Systems Need a New Operating System Scheduler
 
Scope of parallelism
Scope of parallelismScope of parallelism
Scope of parallelism
 
EC 308 Embedded Systems Module 1 Notes APJKTU
EC 308 Embedded Systems Module 1 Notes APJKTUEC 308 Embedded Systems Module 1 Notes APJKTU
EC 308 Embedded Systems Module 1 Notes APJKTU
 
Digital signal processor architecture
Digital signal processor architectureDigital signal processor architecture
Digital signal processor architecture
 
5-Embedded processor technology-06-01-2024.pdf
5-Embedded processor technology-06-01-2024.pdf5-Embedded processor technology-06-01-2024.pdf
5-Embedded processor technology-06-01-2024.pdf
 
Reduced instruction set computers
Reduced instruction set computersReduced instruction set computers
Reduced instruction set computers
 
Computer organization & architecture chapter-1
Computer organization & architecture chapter-1Computer organization & architecture chapter-1
Computer organization & architecture chapter-1
 
RISC.ppt
RISC.pptRISC.ppt
RISC.ppt
 

More from A B Shinde

More from A B Shinde (20)

Communication System Basics
Communication System BasicsCommunication System Basics
Communication System Basics
 
MOSFETs: Single Stage IC Amplifier
MOSFETs: Single Stage IC AmplifierMOSFETs: Single Stage IC Amplifier
MOSFETs: Single Stage IC Amplifier
 
MOSFETs
MOSFETsMOSFETs
MOSFETs
 
Color Image Processing: Basics
Color Image Processing: BasicsColor Image Processing: Basics
Color Image Processing: Basics
 
Edge Detection and Segmentation
Edge Detection and SegmentationEdge Detection and Segmentation
Edge Detection and Segmentation
 
Image Processing: Spatial filters
Image Processing: Spatial filtersImage Processing: Spatial filters
Image Processing: Spatial filters
 
Image Enhancement in Spatial Domain
Image Enhancement in Spatial DomainImage Enhancement in Spatial Domain
Image Enhancement in Spatial Domain
 
Resume Format
Resume FormatResume Format
Resume Format
 
Digital Image Fundamentals
Digital Image FundamentalsDigital Image Fundamentals
Digital Image Fundamentals
 
Resume Writing
Resume WritingResume Writing
Resume Writing
 
Image Processing Basics
Image Processing BasicsImage Processing Basics
Image Processing Basics
 
Blooms Taxonomy in Engineering Education
Blooms Taxonomy in Engineering EducationBlooms Taxonomy in Engineering Education
Blooms Taxonomy in Engineering Education
 
ISE 7.1i Software
ISE 7.1i SoftwareISE 7.1i Software
ISE 7.1i Software
 
VHDL Coding Syntax
VHDL Coding SyntaxVHDL Coding Syntax
VHDL Coding Syntax
 
VHDL Programs
VHDL ProgramsVHDL Programs
VHDL Programs
 
VLSI Testing Techniques
VLSI Testing TechniquesVLSI Testing Techniques
VLSI Testing Techniques
 
Selecting Engineering Project
Selecting Engineering ProjectSelecting Engineering Project
Selecting Engineering Project
 
Interview Techniques
Interview TechniquesInterview Techniques
Interview Techniques
 
Semiconductors
SemiconductorsSemiconductors
Semiconductors
 
Diode Applications & Transistor Basics
Diode Applications & Transistor BasicsDiode Applications & Transistor Basics
Diode Applications & Transistor Basics
 

Recently uploaded

result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
Tonystark477637
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Christo Ananth
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
rknatarajan
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Dr.Costas Sachpazis
 

Recently uploaded (20)

result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and Properties
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Vivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design SpainVivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design Spain
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 

SOC System Design Approach

  • 1. Introduction to the Systems Approach Mr. A. B. Shinde Assistant Professor, Electronics Engineering, PVPIT, Budhgaon, Sangli shindesir.pvp@gmail.com
  • 2. System Architecture • The term architecture denotes: – The operational structure and – The user’s view of the system. • Over time, it has evolved to include – Functional specification and – Hardware implementation. • The system architecture defines the system - level building blocks 2
  • 3. System Architecture: An Overview 3 The increasing transistor density on a silicon die The decrease of transistor cost over the years
  • 4. Components of the System • Processors (core, multimedia & coprocessors) • Memories (storing elements) • Interconnects 4 A basic SOC system model
  • 5. Hardware and Software • Crucial decision to be made is: 1. Which components are to be implemented in hardware and 2. Which components are to be implemented in software. • Software implementation usually uses general - purpose processor (GPP), which interprets instructions at run time. • This architecture offers flexibility and adaptability and provides a way of sharing resources among different applications • Hardware implementation of the ISA (Instruction Set Architecture) is generally slower and consumes more power. 5
  • 6. Hardware and Software 6  Most software developers use high - level languages and tools that enhance productivity, such as program development environments, optimizing compilers and performance profilers.  Direct implementation of applications using hardware (ASICs), provides high performance at the expense of programmability, flexibility, productivity and cost. Benefits Drawbacks Hardware Fast, low power consumption Inflexible, unadaptable, complex to build and test Software Flexible, Adaptable, simple to build and test Slow, high power consumption
  • 7. Hardware and Software 7 Programmability versus Performance Coarse - Grained Reconfigurable Architecture Application – Specific Instruction Processors
  • 8. System Architecture: An Overview • Custom ASIC hardware and software on GPPs can be seen as two extremes in the technology spectrum with different trade – offs • An ASIP is a processor with an instruction set customized for a specific application or domain. • An FPGA typically contains an array of computation units, memories, and their interconnections, all are programmable in the field by user. • Because of the growing demand for reducing the time to market and the increasing cost of chip fabrication, FPGAs are becoming more popular for implementing digital designs. • Coarse - Grained Reconfigurable Architecture (CGRA) It contains logic blocks that process byte - wide or multiple byte - wide data, which can form building blocks of datapaths. • Structured ASIC It allows application builders to customize the resources before fabrication. While it offers performance close to that of ASIC. • Digital Signal Processors (DSPs) The organization and instruction set for these devices are optimized for digital signal processing applications. 8
  • 9. Processor Architectures • Processors are characterized either by their application or by their architecture (or structure) 9 Processors Identified by Function Processors Identified by Architecture
  • 10. Processor Architectures • Sequential processors execute one instruction at a time. • Many processors have the capability to execute several instructions concurrently by means of pipelining, multiple execution units, and multiple cores. • Pipelining is a powerful technique. • Instruction - level parallelism (ILP) means that multiple operations can be executed in parallel within a program. • ILP may be achieved with hardware, compiler, or operating system, provided that there is no data dependency • In general, a computer architecture consists of one or more interconnected processor elements (PEs) that operate concurrently, solving a single overall problem. 10
  • 11. Instruction - level parallelism (ILP) • Instruction-level parallelism (ILP) is a measure of how many of the operations in a computer program can be performed simultaneously. Consider the following program: • For Example: 1. e = a + b 2. f = c + d 3. g = e * f Here, Operation 3 depends on the results of operations 1 and 2, so it cannot be calculated until both of them are completed. As, operations 1 and 2 do not depend on any other operation, so they can be calculated simultaneously. If each operation is completed in one unit of time then three instructions can be completed in two units of time, giving an ILP of 3/2. 11
  • 12. Processor: A Functional View • Table shows different SOC designs and the processor used in each design. • For these examples, we can characterize them as general purpose, or special purpose. 12 Processor Models for Different SOC Examples
  • 13. Processor: An Architectural View • Simple Sequential Processor directly implement the sequential execution model. • The next instruction is not processed until all execution for the current instruction is complete and its results have been committed. 13 Instruction execution sequence. 1. IF - fetching the instruction into the instruction register (IF), 2. ID - decoding the opcode of the instruction (ID), 3. AG - generating the address in memory of any data item residing there, 4. DF - fetching data operands into executable registers (DF), 5. EX - executing the specified operation (EX), and 6. WB - writing back the result to the register file (WB).
  • 14. System Architecture: An Overview • Sequential processor model 14 During execution, a sequential processor executes one or more operations per clock cycle from the instruction stream. An instruction is a container that represents the smallest execution packet managed explicitly by the processor. Scalar and superscalar processors consume one or more instructions per cycle, where each instruction contains a single operation.
  • 15. System Architecture: An Overview • Pipelined Processor Pipelining is approach to exploit parallelism that is based on concurrently performing different phases (instruction fetch, decode, execution, etc.) of processing an instruction. • Pipelining assumes that these phases are independent — when this condition does not hold, the processor stalls the downstream phases to enforce the dependency. • Thus, multiple operations can be processed simultaneously with each operation at a different phase of its processing. 15 Instruction timing in a pipelined processor.
  • 16. System Architecture: An Overview • Pipelined Processor 16 Figure shows the general form of a pipelined processor. Static pipeline, requires the processor to go through all stages or phases of the pipeline. Dynamic pipeline allows the bypassing of one or more pipeline stages, depending on the requirements of the instruction. The more complex dynamic pipelines allow instructions to complete out of (sequential) order, or even to initiate out of order.
  • 17. System Architecture: An Overview • When pipelining does not execute multiple instructions at exactly the same time, there are other techniques to do the same. • Most instructions consist of only a single operation, this kind of parallelism has been named ILP (instruction level parallelism). • Two architectures that exploit (implement) ILP are superscalar and VLIW (Very Long Instruction Word) processors. • A superscalar processor dynamically examines the instruction stream to determine which operations are independent and can be executed. • A VLIW processor relies on the compiler to analyze the available operations (OP) and to schedule independent operations into wide instruction words. 17
  • 18. System Architecture: An Overview • Figure shows the instruction timing of a pipelined superscalar or VLIW processor executing two instructions per cycle. • In this case, all the instructions are independent so that they can be executed in parallel. 18
  • 19. System Architecture: An Overview • Superscalar Processors: • Dynamic pipelined processors has limitations to execute a single operation per cycle. • This limitation can be avoided with the addition of multiple functional units and a dynamic scheduler to process more than one instruction per cycle. • Superscalar processors can achieve execution rates of several instructions per cycle (usually limited to two, but more is possible depending on the application). 19
  • 20. System Architecture: An Overview 20 Superscalar processor model
  • 21. System Architecture: An Overview • Superscalar Processors: • A superscalar processor adds a scheduling instruction window that analyzes multiple instructions from the instruction stream in each cycle. • These instructions are treated in the same manner as in a pipelined processor. • Because of the complexity of the dynamic scheduling logic, high – performance superscalar processors are limited to processing four to six instructions per cycle. • In Flynn's taxonomy, a single-core superscalar processor is classified as an SISD processor whereas a multi-core superscalar processor is classified as an MIMD processor 21
  • 22. System Architecture: An Overview • VLIW (Very Long Instruction Word) Processors 22 In contrast to dynamic analysis in hardware to determine which operations can be executed in parallel, VLIW processors rely on static analysis in the compiler VLIW processor model
  • 23. System Architecture: An Overview • VLIW processors are less complex than superscalar processors and have the potential for higher performance. • A VLIW processor executes operations from statically scheduled instructions. • VLIW processors rely on the static analysis performed by the compiler and are unable to take advantage of any dynamic execution characteristics. • A simple VLIW implementation results in high performance. 23
  • 24. System Architecture: An Overview • VLIW processors: • VLIW instruction encodes multiple operations, at least one operation for each execution unit of a device. • For example, if a VLIW device has four execution units, then a VLIW instruction for the device has four operation fields, each field specifying what operation should be done on that corresponding execution unit. • To accommodate these operation fields, VLIW instructions are usually at least 64 bits wide, and far wider on some architectures. • For example: • In one cycle, it does a floating-point multiply, a floating-point add, and two auto increment loads. All of this fits in one 48-bit instruction: f12 = f0 * f4, f8 = f8 + f12, f0 = dm(i0, m3), f4 = pm(i8, m9); 24
  • 25. System Architecture: An Overview • SIMD Architectures: 25
  • 26. System Architecture: An Overview • SIMD Architectures: 26 SIMD Processable Patterns SIMD Unprocesable Patterns Example: Brightness Computation by SIMD Operations
  • 27. System Architecture: An Overview • SIMD Architectures: Array and Vector Processors: • The SIMD class of processor architecture includes both array and vector processors. • From the view of an assembly - level programmer, programming SIMD architecture appears to be similar to programming a simple processor. 27
  • 28. System Architecture: An Overview • SIMD Architectures: Array and Vector Processors: • The two popular types of SIMD processor are the array processor and the vector processor. • They differ both in their implementations and in their data organizations. • An array processor consists of many interconnected processor elements, each having their own local memory space. • A vector processor consists of a single processor that references a global memory space and has special function units that operate on vectors. 28
  • 29. System Architecture: An Overview • Array processor 29 Array processor model. Array processor: Is a set of parallel processor elements connected via one or more networks, possibly including local and global inter element communications and control communications. Each processor element (PE) has its own private memory. Data is distributed across the elements in a regular fashion… i.e. dependent on both the actual structure of the data and also the computations to be performed on the data.
  • 30. System Architecture: An Overview • Array processor: • Direct access to global memory or another processor element’s local memory is expensive, so intermediate values are propagated through the array through local inter-processor connections. • Since instructions are broadcast, processor element cannot alter the flow of the instruction stream; however, individual processor elements can conditionally disable instructions based on local status information i.e. processor elements remains idle. 30
  • 31. System Architecture: An Overview • Array processor: • The actual instruction stream consists of more than a fixed stream of operations. • An array processor is typically coupled to a general - purpose control processor that provides both scalar operations as well as array operations that are broadcasted to all processor elements in the array. • The control processor performs the scalar sections of the application, interfaces with the outside world, and controls the flow of execution; • The array processor performs the array sections of the application as directed by the control processor. 31
  • 32. System Architecture: An Overview • Array processor: 32
  • 33. System Architecture: An Overview • Vector Processors: • A vector processor is a single processor that resembles a traditional single stream processor, except that some of the functional units (and registers) operate on vectors (sequences of data values). • These functional units are deeply pipelined and have high clock rates. • The vector pipelines often have the rapid delivery of the input vector data elements, together with the high clock rates, results in a significant throughput. 33
  • 34. System Architecture: An Overview 34 Vector processor model.
  • 35. System Architecture: An Overview • A typical vector processor configuration consists of: a vector register file, one vector addition unit, one vector multiplication unit, and one vector reciprocal unit (used along with the vector multiplication unit to perform division) 35 Vector processor model.
  • 36. System Architecture: An Overview • In modern vector processors, vectors are loaded into special vector registers and stored back into memory • Vector processors have several features that enable them to achieve high performance. • One of the feature is, the ability to concurrently load and store values between the vector register file and the main memory while performing computations on values in the vector register file. • If the length of vector register is limited and vectors longer than the register length needs to be processed then they are processed in segments. 36
  • 37. System Architecture: An Overview • Most vector processors support a result bypassing: — that allows a follow - on computation to commence as soon as the first value is available from the preceding computation. Thus, instead of waiting for the entire vector to be processed, the follow - on computation can be significantly overlapped with the preceding computation that it is dependent on. 37
  • 38. Memory and Addressing • Memory requirements in SOC applications varies significantly. • In one case, the memory program can reside entirely in an on-chip ROM, with the data in on-chip RAM. • In another case, the memory system might support an elaborate operating system requiring a large off-chip memory (system on a board), with a memory management unit and cache. 38
  • 39. Memory and Addressing • Why not simply include memory with the processor on the die? • This has many attractions (Benefits of embedding memory & processor on same die): 1. It improves the accessibility of memory, improving both memory access time and bandwidth. 2. It reduces the need for large cache. 3. It improves performance for memory based applications. 39
  • 40. Memory and Addressing • Drawbacks of embedding memory & processor on same die: • First problem is that DRAM memory process technology differs from standard microprocessor process technology, and would cause some sacrifice in achievable bit density. • Second problem is more serious: If memory were restricted to the processor die, its size would be correspondingly limited. • Thus, the conventional processor die model has evolved to implement multiple robust homogeneous processors sharing the higher levels of a two or three-level cache structure with the main memory off-die, on its own multi die module. 40
  • 41. Memory and Addressing • From a design complexity point of view, this has the advantage of being a ― universal ‖ solution: One implementation fits all applications, although not necessarily equally well. • So, the production quantities can be large enough to justify the costs. 41
  • 42. Memory and Addressing • An alternative to the previous approach is integrate the memory on system die. • For specific applications, whose memory size can be bounded, we can implement an integrated memory SOC. This concept is illustrated in above figure 42
  • 43. Memory and Addressing • SOC Memory Examples • It is important for SOC designers to consider whether to put RAM and ROM on-die or off-die. • Table shows various examples of SOC embedded memory macro cell. 43 SOC Memory Considerations
  • 44. Memory and Addressing • Addressing: The Architecture of Memory • Addressing facilities are available to the programmer. • Some facilities are available to the application programmer and some to the operating system programmer. • Virtual memory enables programs requiring larger storage than the physical memory. • When virtual addressing facilities are properly implemented and programmed, memory can be efficiently and securely accessed. 44
  • 45. Memory and Addressing • Virtual memory is often supported by a memory management unit. • The physical memory address is determined by a sequence of (at least) three steps: 1. The application produces a process address, this together with the process or user ID, defines the virtual address: virtual address = offset + (program) base + index , where the offset is specified in the instruction while the base and index values are in specified registers. 45
  • 46. Memory and Addressing • Virtual memory is often supported by a memory management unit. • The physical memory address is determined by a sequence of (at least) three steps: 2. Since multiple processes must cooperate in the same memory space, the process addresses must be coordinated and relocated. This is typically done by a segment table. Upper bits of the virtual address are used to address a segment table. system address = virtual address + (process) base where the system address must be less than the bound . 46
  • 47. Memory and Addressing • Virtual memory is often supported by a memory management unit. • The physical memory address is determined by a sequence of (at least) three steps: 3. Virtual versus real. For many SOC applications, the memory space exceeds the available (real) implemented memory. Memory space is implemented on disk and only the recently used regions (pages) are brought into memory. 47
  • 48. System - Level Interconnection • SOC technology typically relies on the interconnection of predesigned circuit modules (known as intellectual property (IP) blocks) to form a complete system. • In this way, the design task is raised from a circuit level to a system level. • Central to the system – level performance and the reliability of the finished product is dependent on the method of interconnection used. • A well - designed interconnection scheme should have efficient communication protocols 48
  • 49. System - Level Interconnection • It should provide efficient communication between different modules maximizing the degree of parallelism. • SOC interconnect methods can be classified into two main approaches: buses and network - on - chip, 49
  • 50. System - Level Interconnection • Bus - Based Approach 50 With the bus - based approach, IP blocks are designed to conform to published bus standards (AMBA or CoreConnect). Communication between modules is achieved through the sharing of the physical connections of address, data, and control bus signals. Usually, two or more buses are employed in a system, organized in a hierarchical fashion. To optimize system - level performance and cost, the bus closest to the CPU has the highest bandwidth, and the bus farthest from the CPU has the lowest bandwidth.
  • 51. System - Level Interconnection • Network - on - Chip Approach 51 A network - on - chip system consists of an array of switches, either dynamically switched as in a crossbar or statically switched as in a mesh. The crossbar approach uses asynchronous channels to connect synchronous modules that can operate at different clock frequencies. This approach has the advantage of higher throughput than a bus - based system while making integration of a system with multiple clock domains easier.
  • 52. System - Level Interconnection • Network - on - Chip Approach 52 This interconnect scheme is based on a two - dimensional mesh topology. All communications between switches are conducted through data packets, routed through the router interface circuit within each node. Since the interconnections between switches have a fixed distance, interconnect - related problems such as wire delay and cross talk noise are much reduced.
  • 53. System - Level Interconnection 53 Interconnect Models for Different SOC Examples
  • 54. An Approach For Soc Design • Two important ideas in a design process are: 1. Figuring out the requirements and specifications, and 2. Iterating through different stages of design toward an efficient and effective completion. • Requirements and Specifications • Design Iteration – Initial Design – Optimized Design 54
  • 55. An Approach For Soc Design • Requirements and Specifications : • There must be a deep understanding before a starting of design process. • They are useful at the beginning and at the end of the design process: – At the beginning, to clarify what needs to be achieved; and – At the end, as a reference against which the completed design can be evaluated. 55
  • 56. An Approach For Soc Design • Requirements and Specifications : • The system requirements are the externally generated criteria for the system. • They may come from competition, from sales insights, from customer requests, from product profitability analysis, or from a combination of all. • Requirements are rarely definitive of anything about the system. Indeed, requirements can frequently be unrealistic: “I want it fast, I want it cheap, and I want it now ” 56
  • 57. An Approach For Soc Design • Requirements and Specifications : • It is important for the designer to analyze carefully the requirements, expressions, and to spend sufficient time in understanding the market situation to determine all the factors expressed in the requirements. 57
  • 58. An Approach For Soc Design • Requirements and Specifications : • Some of the factors the designer considers in determining requirements include • Compatibility with previous designs or published standards, • Reuse of previous designs, • Customer requests/complaints, • Sales reports, • Cost analysis, • Competitive equipment analysis, and • Trouble reports (reliability) of previous products and competitive products. 58
  • 59. An Approach For Soc Design • Design Iteration: • Design is always an iterative process. • So, the obvious question is how to get the very first, initial design? • This is the design that we can then iterate through and optimize according to the design criteria. 59
  • 60. An Approach For Soc Design • Design Iteration: • Initial Design: • This is the first design that meets the key requirements, while other performance and cost criteria’s are not considered. • For instance, processor or memory or input/output (I/O) should be sized to meet all (high – priority, real – time) constraints. • Promising components and their parameters are selected for better performance. • It is simplified model of the expected computational capacity or data bandwidth capability. 60 The SOC initial design process
  • 61. An Approach For Soc Design • Design Iteration: • Optimized Design: • Once the base performance (or area) requirements are met and the base functionality is ensured, then the goal is to minimize: – the cost (area) and/or – the power consumption or – the design effort required to complete the design. • This is the iterative step of the process. • The first steps of this process use higher - fidelity tools (simulations, trial layouts, etc.) to ensure that the initial design actually does satisfy the design specifications and requirements. • The later steps refine, complete, and improve the design according to the design criteria. 61
  • 62. System Architecture and Complexity • The basic difference between processor architecture and system architecture is that the system adds another layer of complexity, and the complexity of these systems limits the cost savings. • Historically, the computer is a single processor plus a memory. • As long as this is fixed (within broad tolerances), implementing that processor on one or more silicon die does not change the design complexity. • Once die densities enable a scalar processor to fit on a chip, the complexity issue changes. 62
  • 63. System Architecture and Complexity • As transistor densities significantly improve after this point, there are obvious processor extension strategies to improve performance: 1. Additional Cache: Here we add cache storage and, as large caches have slower access times, a second - level cache. 2. A More Advanced Processor: We implement a superscalar or a VLIW processor that executes more than one instruction each cycle. Additionally, we can speed up the execution units that affects the critical path delay, especially the floating - point execution times. 3. Multiple Processors: Now we implement multiple (superscalar) processors and their associated multilevel caches. 63
  • 64. System Architecture and Complexity • Instead of the 100,000 transistor processors, our advanced processor has millions of transistors. • The way to manage this complexity is to reuse designs. • So, reusing several simpler processor designs implemented on a die is preferable to a new, more advanced, single processor. • For this to work, we need a good interconnection mechanism to access the various processors and memory. 64 Complexity of design
  • 65. System Architecture and Complexity • So, when an application is well specified, the system-on-a-chip approach includes 1. Multiple (usually) heterogeneous processors, each specialized for specific parts of the application; 2. The main memory with (often) ROM for partial program storage; 3. A relatively simple, small (single - level) cache structure or buffering schemes associated with each processor; and 4. A bus or switching mechanism for communications. 65
  • 66. Product Economics And Implications For SoC • Factors Affecting Product Costs • The basic cost and profitability of a product depend on many factors: • its technical appeal, • its cost, • the market size, and • the effect the product has on future products. • The issue of cost goes well beyond the product’s manufacturing cost. 66 Project cost components
  • 67. Product Economics And Implications For SoC • Factors Affecting Engineering Costs 67
  • 68. Product Economics And Implications For SoC • Modelling Product Economics and Technology Complexity • To put all this into perspective, consider a general model of a product’s average unit cost: unit cost = (project cost) / (number of units). • The product cost is simply the sum of all the fixed and variable costs. • We represent the fixed cost as a constant, Kf . • It is also clear that the variable costs are of the form Kv × n , where n is the number of units. 68
  • 69. Thank You… This presentation is published only for Educational Purpose 69