An introduction to
DSPs
Examples of DSP applications
Why a DSP?
Characteristics of a DSP
Architectures
DSP example: mobile phone
DSP example: mobile phone with video camera
DSP: applications
Why a DSP?
It’s easy: we want an architecture optimized for Digital
Signal Processing
Some versions are further optimized for some specific
applications
- e.g. very low power consumption for mobile phones
What is the difference between a DSP and a
general-purpose processor? (1/4)
Memory architecture and bus
The first processors (in the 1940s) had a Harvard
architecture: separate memories for program and data
But it’s complex -> soon replaced by the Von Neumann
architecture: no real difference between program and
data, which share the same memory (an instruction has two fields: operation and operand)
Problem: the processor cannot access instructions and
data simultaneously
To improve performance: Harvard architecture again!
In particular
- separate memories and buses for program and data
- possibly, another separate bus for the DMA
What is the difference between a DSP and a
general-purpose processor? (2/4)
A DSP is often used to implement a linear filter
In discrete time, the convolution integral
is actually a sum:
$y_n = \sum_i x_{n-i}\, h_i$
- if the number of terms is finite: FIR filter (finite impulse
response),
- otherwise: IIR filter (infinite impulse response),
- which can be implemented using two finite sums:
$y_n = \sum_i x_{n-i}\, b_i + \sum_i y_{n-i}\, a_i$
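The two sums above can be sketched in a few lines of Python. This is only an illustration, not code from the slides: the function names are hypothetical, samples before the start of the signal are assumed to be zero, and the feedback sum is assumed to start at i = 1 (so `a[0]` multiplies the previous output).

```python
def fir_filter(x, h):
    """y[n] = sum_i x[n-i] * h[i], with x[k] taken as 0 for k < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i, h_i in enumerate(h):
            if n - i >= 0:
                acc += x[n - i] * h_i
        y.append(acc)
    return y


def iir_filter(x, b, a):
    """y[n] = sum_i x[n-i] * b[i] + sum_i y[n-i] * a[i].

    Assumption: the feedback index starts at 1, so a[0] multiplies y[n-1].
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i, b_i in enumerate(b):
            if n - i >= 0:
                acc += x[n - i] * b_i
        for i, a_i in enumerate(a, start=1):
            if n - i >= 0:
                acc += y[n - i] * a_i
        y.append(acc)
    return y
```

Feeding a unit impulse shows the difference: the FIR output returns to zero after `len(h)` samples, while the IIR output keeps decaying without ever reaching zero exactly.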
What is the difference between a DSP and a
general-purpose processor? (3/4)
A common operation in a FIR or IIR filter is A=BC+D: we
need
- a hardware multiplier (introduced in DSPs in the 1970s)
- a multiply and accumulate in only one clock cycle: MAC
instruction.
Actually, the MAC is in a loop: we also need a zero-overhead
loop:
- H/W for address generation (memory is accessed in a
regular pattern, not at random)
- loop management
- auto-increment; circular addressing
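A software model of the MAC loop with circular addressing may help here. It is a sketch under assumptions: the class name `CircularFir` is hypothetical, and what a DSP does in dedicated hardware (address wrap-around, auto-increment, loop counting) is spelled out as explicit Python statements.

```python
class CircularFir:
    """FIR filter whose delay line is a circular buffer (software model)."""

    def __init__(self, coeffs):
        self.h = list(coeffs)
        self.delay = [0.0] * len(self.h)  # the last len(h) input samples
        self.head = 0                     # index of the newest sample

    def step(self, sample):
        n = len(self.h)
        # Overwrite the oldest sample; the index wraps around
        # (this is the circular addressing).
        self.head = (self.head - 1) % n
        self.delay[self.head] = sample
        acc = 0.0
        idx = self.head
        for h_i in self.h:
            acc += self.delay[idx] * h_i  # one multiply-accumulate (MAC)
            idx = (idx + 1) % n           # auto-increment with wrap-around
        return acc
```

No sample is ever moved in memory: only the `head` index changes, which is why hardware support for modulo addressing removes the per-sample copying cost of a shifted delay line.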
Other possible H/W:
- H/W saturation
- Instructions to perform a division quickly
- Bit reversal for FFT
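Two of these hardware features are easy to model in software. The sketch below is illustrative only; the function names are hypothetical, and saturation is shown for signed 16-bit values as an assumed word size.

```python
def saturate16(value):
    """Clamp an integer to the signed 16-bit range instead of wrapping."""
    return max(-32768, min(32767, value))


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(n):
    """Bit-reversed index order for a radix-2 FFT of length n (a power of two)."""
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]
```

For an 8-point FFT, `bit_reversed_order(8)` gives `[0, 4, 2, 6, 1, 5, 3, 7]`; a DSP with bit-reversed addressing produces this access order at no extra cost in cycles.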
What is the difference between a DSP and a
general-purpose processor? (4/4)
Other possible features:
Often, data are 16- or 8-bit wide (e.g., audio or images)
- a 32-bit ALU can be split into two 16-bit ALUs or four
8-bit ALUs,
-> 2 or 4 operations in parallel
several ALUs which work in parallel
fixed-point ALUs, or 16-bit ALUs, to reduce power
consumption and cost
optimized versions:
- cost: for consumer applications
- power: for mobile applications
- for specific applications, e.g. electric motor control
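The split-ALU idea on this slide can be imitated in software: two 16-bit additions carried out inside one 32-bit word, without letting a carry cross from the low lane into the high lane. This is a sketch with hypothetical helper names; a real split ALU simply cuts the carry chain in hardware.

```python
MASK32 = 0xFFFFFFFF


def pack16(hi, lo):
    """Pack two 16-bit values into one 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def unpack16(word):
    """Split a 32-bit word into its (high, low) 16-bit lanes."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF


def add_2x16(a, b):
    """Add two 32-bit words as two independent 16-bit lanes (wrapping)."""
    # Add the low 15 bits of each lane so no carry crosses a lane boundary,
    # then restore the top bit of each lane with XOR.
    low = (a & 0x7FFF7FFF) + (b & 0x7FFF7FFF)
    return (low ^ ((a ^ b) & 0x80008000)) & MASK32
```

For example, adding the lanes (1, 0xFFFF) and (2, 1) gives (3, 0): the low lane wraps around on its own and the high lane is unaffected.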
Example: ‘C30
(Texas Instruments,
1988; the first TMS320 DSP dates from 1982)
Example: FIR filter
using a ‘C30
Note: several of these features, which originated in
DSPs, have been ported to general-purpose processors
E.g.: the cache in the
Pentium processor is
Harvard-like (separate instruction and data caches)
Another example: several units working in parallel, and
splittable ALUs (see the MMX extensions) in the Pentium 4
processor
Pipeline…
Example of a 4-stage pipeline (TI ‘C30)
each instruction takes 4 clock cycles to execute, but (normally)
it can be started just 1 cycle after the previous one (data are
needed only 3 cycles later)
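A quick back-of-the-envelope model of the gain, assuming an ideal pipeline in which a new instruction starts every cycle unless a stall is added explicitly (function names and the `stall_cycles` parameter are illustrative, not taken from the ‘C30 documentation):

```python
def pipeline_cycles(n_instructions, stages=4, stall_cycles=0):
    """Cycles to run n instructions on an ideal pipeline, plus given stalls."""
    if n_instructions == 0:
        return 0
    # The first instruction needs `stages` cycles; each following one
    # completes one cycle after its predecessor.
    return stages + (n_instructions - 1) + stall_cycles


def unpipelined_cycles(n_instructions, stages=4):
    """Cycles if each instruction must finish before the next one starts."""
    return n_instructions * stages
```

With 4 stages, 100 instructions take 103 cycles when pipelined against 400 without pipelining, so the throughput approaches one instruction per cycle.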
Pipeline: branch (e.g. on the ‘C30)
Standard branch: the pipeline is flushed to correctly handle
the PC -> 4 cycles
Delayed branch: the pipeline is not flushed, and the 3
following instructions are executed before the PC is modified
-> only 1 cycle needed!
BRD label ; delayed branch
MPYF ; executed
ADDF ; executed
SUBF ; executed
AND ; not executed
…
…
label MPYF ; fetched after SUBF
…
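The execution order in the listing above can be reproduced with a toy interpreter. It is a sketch only: instructions are reduced to their mnemonics, `run` and `EXAMPLE` are hypothetical names, and a branch placed inside the delay slots of another branch is simply ignored by this model.

```python
def run(program, delay_slots=3, max_steps=100):
    """Return the mnemonics executed, in order.

    program: list of (label_or_None, mnemonic, target_or_None) tuples.
    A "BRD" changes the PC only after `delay_slots` further instructions
    have been executed.
    """
    labels = {lab: i for i, (lab, _, _) in enumerate(program) if lab}
    pc, trace = 0, []
    pending = None  # (slots remaining, destination) of an in-flight branch
    while pc < len(program) and len(trace) < max_steps:
        _, op, target = program[pc]
        trace.append(op)
        pc += 1
        if pending is not None:
            slots, dest = pending
            slots -= 1
            if slots == 0:
                pc, pending = dest, None  # the branch finally takes effect
            else:
                pending = (slots, dest)
        elif op == "BRD":
            pending = (delay_slots, labels[target])
    return trace


EXAMPLE = [
    (None, "BRD", "label"),
    (None, "MPYF", None),
    (None, "ADDF", None),
    (None, "SUBF", None),
    (None, "AND", None),    # skipped: the branch has taken effect by now
    ("label", "MPYF", None),
]
```

Running `run(EXAMPLE)` yields BRD, MPYF, ADDF, SUBF and then the MPYF at `label`; the AND is never executed, as in the slide.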
Two architectures
To exploit instruction-level parallelism (ILP): two
possible architectures
- Superscalar: the parallelism is dynamically managed by
the hardware
- Very Long Instruction Word (VLIW): the parallelism is
statically managed by the compiler
What is the problem?
Dependences in data or control can generate conflicts
- on data (an instruction needs the result of a previous
instruction, but the result is not ready yet), or
- on control (conditional jump, but the condition is not ready
yet)
-> pipeline stall
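The data conflict described above (a read-after-write dependence) can be detected with a simple scan. This is a simplified sketch: the `Instr` tuple, the register names and the `window` parameter (how many preceding instructions may still be in flight) are assumptions made for illustration, and intermediate overwrites of a register are not tracked.

```python
from collections import namedtuple

# name: mnemonic, dest: register written, srcs: registers read
Instr = namedtuple("Instr", "name dest srcs")


def raw_hazards(instrs, window=3):
    """List (producer, consumer) pairs where the consumer reads a register
    written by one of the `window` instructions immediately before it."""
    hazards = []
    for j, consumer in enumerate(instrs):
        for i in range(max(0, j - window), j):
            producer = instrs[i]
            if producer.dest is not None and producer.dest in consumer.srcs:
                hazards.append((producer.name, consumer.name))
    return hazards
```

Each reported pair is a place where the pipeline may have to stall, unless the hardware or the compiler can fill the gap with an independent instruction.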
Superscalar
Independent instructions are identified dynamically
by the hardware (which is complex!)
Instructions can be executed out of order;
their completion (commit) is then done in order,
to correctly update the state of the CPU
VLIW
Very Long Instruction Word (VLIW): the parallelism is
statically managed by the compiler
Independent instructions are identified statically
during the compilation phase;
- the instructions which can be executed in parallel are
assembled into long instructions and sent to the various
functional units in order
Convenient solution for DSP programs (fixed-length
loops, few conditional operations); less convenient for
general-purpose applications
Simpler hardware! But a specific compilation for each
platform is needed
Deterministic behaviour -> exact computation of
execution times
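What the compiler does statically can be hinted at with a greedy packer. It is a deliberately naive sketch, not a real VLIW scheduler: `pack_bundles` and the instruction format are hypothetical, instructions are never reordered, functional-unit types are ignored, and only read-after-write and write-after-write conflicts inside a bundle are checked.

```python
def pack_bundles(instrs, width=4):
    """Greedy in-order packing of instructions into VLIW-style bundles.

    instrs: list of (name, dest, srcs) tuples. An instruction joins the
    current bundle unless the bundle is full, or it reads or rewrites a
    register written by an instruction already in the bundle.
    """
    bundles, current = [], []
    written = set()  # registers written inside the current bundle
    for name, dest, srcs in instrs:
        conflict = dest in written or any(s in written for s in srcs)
        if current and (len(current) == width or conflict):
            bundles.append(current)      # close the bundle, start a new one
            current, written = [], set()
        current.append(name)
        written.add(dest)
    if current:
        bundles.append(current)
    return bundles
```

Because the bundles are fixed before the program runs, the number of long instructions, and hence the execution time of a loop body, is known in advance; this is the deterministic behaviour mentioned on the slide.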
