CSL 718
Architecture of High Performance Systems
Minor Test I Solution
2008
1. Consider the following architectural changes in a non-pipelined
  processor that has a clock period of T ns, executes N instructions
  to run a particular benchmark with an average of C cycles per
  instruction.
    i) A new instruction is introduced which replaces a sequence of
      operations occurring at several places in that benchmark.
    ii) Pipelining is introduced.
    iii) The stage with maximum propagation delay is split into two
      stages.
  For each of these changes, indicate how N, T, C, T*C, and
  N*T*C are likely to change, giving reasons. If the new
  instruction in i) is able to replace 75% of the instructions executed,
  what is the upper bound on possible performance improvement by
  this change?
Solution:
 i) N will decrease because multiple instructions are being
     replaced by a single instruction. T and C are likely to go up
     because the new instruction performs a more complex task,
     which needs more cycles (raising C), longer cycles (raising T),
     or both.

    Assuming that the CPI of the instructions replaced and that of
    the instructions not replaced is the same, 25% of the execution
    time remains unaffected. Suppose the remaining 75% of the
    execution time is reduced by a factor k by using the new
    instruction. Then the overall speedup is

    $$\text{speedup} = \frac{1}{0.25 + \dfrac{0.75}{k}}$$

    As k grows without bound, this approaches 1/0.25 = 4, so the
    speedup can be at most 4.
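The bound can be checked numerically. The following is a minimal Python sketch of the speedup expression above; the function name `overall_speedup` and the sample values of k are illustrative choices, not part of the original solution.

```python
def overall_speedup(affected_fraction, k):
    """Overall speedup when `affected_fraction` of the time shrinks by a factor k."""
    unaffected = 1.0 - affected_fraction
    return 1.0 / (unaffected + affected_fraction / k)


if __name__ == "__main__":
    # Sample values of k chosen only for illustration.
    for k in (2, 4, 10, 100, 1_000_000):
        print(f"k = {k:>9}: speedup = {overall_speedup(0.75, k):.4f}")
```

However large k is made, the result stays below 4, because the unaffected 25% of the execution time is never reduced.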
ii) Pipelining will lead to overlapped execution of instructions.
    Therefore, C will decrease. Pipelining will ideally tend to make
    C = 1, but because of hazards, it would usually be more than 1.
    If pipeline stages correspond to the original break-up of
    instructions into cycles, T will remain unchanged. N will
    certainly remain unchanged as there is no change in the
    instruction set.

iii) The stage with maximum propagation delay determines the
     clock period T. Therefore, if this stage is split into two stages, T
     will decrease (provided that there was no other stage with the
     same propagation delay). This would also introduce an
     additional cycle for the affected instructions. Therefore, C will
     go up. N will remain unchanged as there is no change in the
     instruction set.
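The proviso in iii) about another stage having the same propagation delay can be illustrated with a small Python sketch. The stage delays below are hypothetical numbers, and the sketch assumes that the slowest stage splits into two equal halves and that latch overhead is ignored; neither assumption comes from the original solution.

```python
def clock_period(stage_delays):
    """The clock period is set by the slowest stage."""
    return max(stage_delays)


def split_slowest_stage(stage_delays):
    """Split the first slowest stage into two equal halves (idealised assumption)."""
    slowest = max(stage_delays)
    i = stage_delays.index(slowest)
    half = slowest / 2
    return stage_delays[:i] + [half, half] + stage_delays[i + 1:]


# Hypothetical stage delays in ns.
unique_max = [2.0, 3.0, 4.0, 2.0, 2.0]   # one stage is strictly the slowest
tied_max = [2.0, 4.0, 4.0, 2.0]          # two stages share the maximum delay

if __name__ == "__main__":
    for delays in (unique_max, tied_max):
        after = split_slowest_stage(delays)
        print(delays, "T =", clock_period(delays), "->", after, "T =", clock_period(after))
```

With a unique slowest stage, T drops to the delay of the next slowest stage; with a tie, T is unchanged while the pipeline still gains a stage.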
2. A processor has a non-linear pipeline with 4 stages A, B, C and
  D. Each instruction goes through different stages in the following
  order A B C B A D C. Find the bounds on the maximum
  instruction throughput in a static hazard free schedule.

Solution:
The reservation table for this pipeline is as follows.

                     1    2    3    4    5    6    7
                A    X                   X
                B         X         X
                C              X                   X
                D                             X
Intervals which cause collision are:
Row A: 4, Row B: 2, Row C: 4, Row D: none.
Therefore, the forbidden latencies are 2 and 4, and the initial
collision vector (written C6 C5 C4 C3 C2 C1) is 001010.
No. of 1’s in the initial collision vector = 2.
Therefore, minimum average latency ≤ 2+1 = 3
That is, maximum instruction throughput ≥ 1/3 instructions per
cycle.

Maximum number of marks (X) in any row of the reservation table = 2
Therefore, minimum average latency ≥ 2
That is, maximum instruction throughput ≤ 1/2 instructions per
cycle.
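The two bounds can be reproduced mechanically. The Python sketch below derives the forbidden latencies and the collision vector from the stage order, and then searches the state diagram for the cycle with the smallest mean latency. The exhaustive search is an addition for illustration and is not part of the original solution; all function names are hypothetical.

```python
from itertools import combinations

ORDER = "ABCBADC"  # stage used in cycles 1..7, taken from the problem statement


def stage_usage(order):
    """Map each stage to the list of cycles (1-based) in which it is used."""
    usage = {}
    for cycle, stage in enumerate(order, start=1):
        usage.setdefault(stage, []).append(cycle)
    return usage


def forbidden_latencies(usage):
    """Distances between any two marks in the same row of the reservation table."""
    return {b - a for cycles in usage.values() for a, b in combinations(cycles, 2)}


def collision_vector(forbidden, max_latency):
    """Bit string C_m ... C_1; C_i = 1 means latency i causes a collision."""
    return "".join("1" if i in forbidden else "0" for i in range(max_latency, 0, -1))


def minimum_average_latency(forbidden, max_latency):
    """Smallest mean latency over all simple cycles of the state diagram."""
    init = sum(1 << (i - 1) for i in forbidden)

    def successors(state):
        # Latency max_latency + 1 is always allowed and leads back to `init`.
        for p in range(1, max_latency + 2):
            if not (state >> (p - 1)) & 1:
                yield p, (state >> p) | init

    # Collect every state reachable from the initial collision vector.
    reachable, stack = {init}, [init]
    while stack:
        for _, nxt in successors(stack.pop()):
            if nxt not in reachable:
                reachable.add(nxt)
                stack.append(nxt)

    best = float("inf")

    def dfs(start, state, visited, total, count):
        nonlocal best
        for p, nxt in successors(state):
            if nxt == start:
                best = min(best, (total + p) / (count + 1))
            elif nxt not in visited:
                dfs(start, nxt, visited | {nxt}, total + p, count + 1)

    for s in reachable:
        dfs(s, s, {s}, 0, 0)
    return best


usage = stage_usage(ORDER)
forbidden = forbidden_latencies(usage)
max_latency = len(ORDER) - 1
vector = collision_vector(forbidden, max_latency)
lower_bound = max(len(cycles) for cycles in usage.values())  # most marks in a row
upper_bound = vector.count("1") + 1                          # number of 1's + 1
mal = minimum_average_latency(forbidden, max_latency)

if __name__ == "__main__":
    print("forbidden latencies:", sorted(forbidden))
    print("collision vector   :", vector)
    print("bounds on MAL      :", lower_bound, "<= MAL <=", upper_bound)
    print("MAL found by search:", mal)
```

For this reservation table the search finds a minimum average latency of 3 (for example the constant cycle of latency 3), so the upper bound on latency is attained and the best achievable throughput is 1/3 instructions per cycle.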
3. Compute the number of cycles lost due to a branch hazard in a
  pipelined processor with 5 stages – instruction fetch (IF), decode
  (D), execute (EX), memory access (M) and write back (WB).
  Assume that in a branch instruction, decision-making as well as
  address calculation are completed in EX stage and also assume
  that branches are taken 70% of the time. Consider the
  following cases –
i) there is no delayed branch and no branch prediction,
ii) there is one delayed branch slot which is filled with a useful
instruction,
iii) branch is statically predicted to be taken,
iv) there is a branch target address buffer which is looked up in the
IF stage itself and a hit (or miss) in this buffer (assume 80% hit) is
used for predicting the branch to be taken (or not taken).
Solution: Instruction N is the branch instruction and T is the target
instruction. Instructions wrongly started and abandoned are shown
in red and those executed correctly are shown in green. Time slots
in which an instruction is stalled are shown as ██.
i) No delayed branch slot, no branch prediction
(a) branch not taken
N        IF|D |EX
N+1         IF|██|D |EX|M |WB
N+2            ██|IF|D |EX|M |WB
delay = 1
(b) branch taken
N        IF|D |EX
N+1/T       IF|██|IF|D |EX|M |WB
T+1            ██|██|IF|D |EX|M |WB
delay = 2
Average delay = 1*0.3 + 2*0.7 = 1.7
ii) One delayed branch slot, filled with useful instruction N+1

(a) branch not taken
N        IF|D |EX
N+1         IF|D |EX|M |WB
N+2            ██|IF|D |EX|M |WB
delay = 1

(b) branch taken
N        IF|D |EX
N+1         IF|D |EX|M |WB
T              ██|IF|D |EX|M |WB
T+1               ██|IF|D |EX|M |WB
delay = 1

Average delay = 1*0.3 + 1*0.7 = 1.0
iii) Branch statically predicted to be taken

(a) branch not taken (prediction incorrect)
N        IF|D |EX
N+1/T       IF|██|IF
N+1            ██|IF|D |EX|M |WB
delay = 1

(b) branch taken (prediction correct)
N        IF|D |EX
N+1/T       IF|██|IF|D |EX|M |WB
T+1            ██|██|IF|D |EX|M |WB
delay = 2
Average delay = 1*0.3 + 2*0.7 = 1.7
Here branch prediction offers no advantage, because target address
calculation and decision making are happening in the same stage.
iv) Branch target address buffer with 80% hit

(a) hit and branch not taken (prediction incorrect)
N        IF|D |EX
T/N+1       IF|D |IF|D |EX|M |WB
T+1/N+2        IF|██|IF|D |EX|M |WB
delay = 2

(b) hit and branch taken (prediction correct)
N        IF|D |EX
T           IF|D |EX|M |WB
T+1            IF|D |EX|M |WB
delay = 0
(c) miss and branch not taken (prediction correct)
N        IF|D |EX
N+1         IF|D |EX|M |WB
N+2            IF|D |EX|M |WB
delay = 0

(d) miss and branch taken (prediction incorrect)
N        IF|D |EX
N+1/T       IF|D |IF|D |EX|M |WB
N+2/T+1        IF|██|IF|D |EX|M |WB
delay = 2

Average delay = 0.8*(2*0.3 + 0*0.7) + 0.2*(0*0.3 + 2*0.7) =
0.8*0.6 + 0.2*1.4 = 0.76
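The four averages are probability-weighted sums of the per-case delays read off the timing diagrams. A minimal Python sketch of that arithmetic follows; the names `P_TAKEN`, `P_HIT` and `average_delay` are illustrative, while the delays and probabilities are those given in the solution.

```python
P_TAKEN = 0.7  # branches are taken 70% of the time (problem statement)
P_HIT = 0.8    # branch target address buffer hit rate (problem statement)


def average_delay(delay_not_taken, delay_taken, p_taken=P_TAKEN):
    """Expected delay in cycles, weighted by the probability of a taken branch."""
    return (1 - p_taken) * delay_not_taken + p_taken * delay_taken


averages = {
    "i": average_delay(1, 2),    # no delayed branch, no prediction
    "ii": average_delay(1, 1),   # one delay slot filled usefully
    "iii": average_delay(1, 2),  # statically predicted taken
    # iv: a hit predicts taken, a miss predicts not taken
    "iv": P_HIT * average_delay(2, 0) + (1 - P_HIT) * average_delay(0, 2),
}

if __name__ == "__main__":
    for case, value in averages.items():
        print(f"case {case}: average delay = {value:.2f} cycles")
```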
4. A processor with dynamic scheduling and issue bound operand
  fetch has 3 execution units – one LOAD/STORE unit, one
  ADD/SUB unit and one MUL/DIV unit. It has a reservation
  station with 1 slot per execution unit and a single register file.
  Starting with the following instruction sequence in the instruction
  fetch buffer and empty reservation stations, for each instruction
  find the cycle in which it will be issued and the cycle in which it
  will write result.
   load R6, 34(R12)
   load R2, 45(R13)
   mul R0, R2, R4
   sub R8, R2, R6
   div R10, R0, R6
   add R6, R8, R2
  Assume out of order issue and out of order execution. Execute cycles
  taken by different instructions are:
   LOAD/STORE : 2
   ADD/SUB    : 1
   MUL        : 2
   DIV        : 4
Solution:
The following chart shows the execution of the given instruction
sequence cycle by cycle. The stages of instruction execution are
annotated as follows:
IF   Instruction fetch
D    Decode and issue
EX1  Execute in LOAD/STORE unit
EX2  Execute in ADD/SUB unit
EX3  Execute in MUL/DIV unit
WB   Write back into register file and reservation stations
Cycle no. ⇒    1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
Instr ⇓
load R6   IF   D   EX1 EX1 WB
load R2   IF   •   •   •   D   EX1 EX1 WB
mul       IF   D   ◦   ◦   ◦   ◦   ◦   ◦   EX3 EX3 WB
sub       IF   D   ◦   ◦   ◦   ◦   ◦   ◦   EX2 WB
div       IF   •   •   •   •   •   •   •   •   •   D   ◦   EX3 EX3 EX3 EX3 WB
add       IF   •   •   •   •   •   •   •   •   D   ◦   EX2 WB
Cycles in which an instruction is waiting for a reservation station
are marked as • and the cycles in which an instruction is waiting for
one or more operands are marked as ◦. As seen in the time chart,
the issue and write back cycles for various instructions are as
follows.
 Instruction   issue cycle   write back cycle
   load R6          1               4
   load R2          4               7
   mul              1              10
   sub              1               9
   div             10              16
   add              9              12

Cs718min1 2008soln View

  • 1. CSL 718 Architecture of High Performance Systems Minor Test I Solution 2008
  • 2. 1. Consider the following architectural changes in a non-pipelined processor that has a clock period of T ns, executes N instructions to run a particular benchmark with an average of C cycles per instruction. i) A new instruction is introduced which replaces a sequence of operations occurring at several places in that benchmark. ii) Pipelining is introduced. iii) The stage with maximum propagation delay is split into two stages. For each of these changes, indicate how are N, T, C, T*C, and N*T*C likely to change, giving reasons. Suppose the new instruction in i) is able to replace 75% of the instructions executed, what is the upper bound on possible performance improvement by this change?
  • 3. Solution: i) N will decrease because multiple instructions are being replaced by a single instruction. T and C are likely to go up because the new instruction has a more complex task to perform which would need more cycles and/or the cycles have to accommodate more work. Assuming that the CPI of the instructions replaced and that of the instructions not replaced is same, 25% of the execution time is remaining unaffected. Suppose the remaining execution time, which is 75%, reduces by a factor k by using the new instruction. Then the overall speedup is - 1 .75 .25 + k This can be at most 4.
  • 4. ii) Pipelining will lead to overlapped execution of instructions. Therefore, C will decrease. Pipelining will ideally tend to make C = 1, but because of hazards, it would usually be more than 1. If pipeline stages correspond to the original break-up of instructions into cycles, T will remain unchanged. N will certainly remain unchanged as there is no change in the instruction set. iii) The stage with maximum propagation delay determines the clock period T. Therefore, if this stage is split into two stages, T will decrease (provided that there was no other stage with the same propagation delay). This would also introduce an additional cycle for the affected instructions. Therefore, C will go up. N will remain unchanged as there is no change in the instruction set.
  • 5. 2. A processor has a non-linear pipeline with 4 stages A, B, C and D. Each instruction goes through different stages in the following order: A B C B A D C. Find the bounds on the maximum instruction throughput in a static hazard free schedule. Solution: The reservation table for this pipeline is as follows.

         1  2  3  4  5  6  7
      A  X           X
      B     X     X
      C        X           X
      D                 X

    Intervals which cause collision are: Row A: 4, Row B: 2, Row C: 4, Row D: none. Therefore, the initial collision vector is 001010.
  • 6. No. of 1’s in the initial collision vector = 2. Therefore, minimum average latency ≤ 2+1 = 3 That is, maximum instruction throughput ≥ 1/3 instructions per cycle. Maximum number of checks in a row of the reservation table = 2 Therefore, minimum average latency ≥ 2 That is, maximum instruction throughput ≤ 1/2 instructions per cycle.
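The steps of this solution (forbidden latencies, initial collision vector, and the two bounds on the minimum average latency) can be sketched as follows. The function names and the dictionary encoding of the reservation table are assumptions made for illustration; the bounds are the ones used in the solution above.

```python
from itertools import combinations

# Reservation table for the order A B C B A D C: stage -> cycles in which it is used.
RESERVATION_TABLE = {"A": [1, 5], "B": [2, 4], "C": [3, 7], "D": [6]}


def forbidden_latencies(table):
    """Latencies that make two initiations collide in some stage."""
    forbidden = set()
    for cycles in table.values():
        for first, second in combinations(sorted(cycles), 2):
            forbidden.add(second - first)
    return forbidden


def collision_vector(forbidden, n_bits):
    """Bit string with the largest latency on the left and latency 1 on the right."""
    return "".join("1" if lat in forbidden else "0" for lat in range(n_bits, 0, -1))


def mal_bounds(table):
    """(lower, upper) bounds on the minimum average latency."""
    lower = max(len(cycles) for cycles in table.values())  # max marks in a row
    upper = len(forbidden_latencies(table)) + 1            # number of 1s, plus 1
    return lower, upper


if __name__ == "__main__":
    n_columns = max(max(cycles) for cycles in RESERVATION_TABLE.values())
    print(collision_vector(forbidden_latencies(RESERVATION_TABLE), n_columns - 1))
    low, high = mal_bounds(RESERVATION_TABLE)
    print(f"{low} <= MAL <= {high}; throughput between 1/{high} and 1/{low}")
```

The sketch only reproduces the bounds; finding the exact minimum average latency would additionally require searching the state diagram for the cycle with the smallest average latency.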
  • 7. 3. Compute the number of cycles lost due to a branch hazard in a pipelined processor with 5 stages: instruction fetch (IF), decode (D), execute (EX), memory access (M) and write back (WB). Assume that in a branch instruction, decision-making as well as address calculation are completed in the EX stage, and also assume that branches are taken 70% of the time. Consider the following cases: i) there is no delayed branch and no branch prediction, ii) there is one delayed branch slot which is filled with a useful instruction, iii) the branch is statically predicted to be taken, iv) there is a branch target address buffer which is looked up in the IF stage itself, and a hit (or miss) in this buffer (assume 80% hit) is used for predicting the branch to be taken (or not taken).
  • 8. Solution: Instruction N is the branch instruction and T is the target instruction. Instructions wrongly started and abandoned are shown in red and those executed correctly are shown in green. Time slots in which an instruction is stalled are shown as ██.

    i) No delayed branch slot, no branch prediction

    (a) branch not taken
      N       IF|D |EX
      N+1        IF|██|D |EX|M |WB
      N+2           ██|IF|D |EX|M |WB
      delay = 1

    (b) branch taken
      N       IF|D |EX
      N+1/T      IF|██|IF|D |EX|M |WB
      T+1           ██|██|IF|D |EX|M |WB
      delay = 2

    Average delay = 1*0.3 + 2*0.7 = 1.7
  • 9. ii) One delayed branch slot, filled with useful instruction N+1

    (a) branch not taken
      N       IF|D |EX
      N+1        IF|D |EX|M |WB
      N+2           ██|IF|D |EX|M |WB
      delay = 1

    (b) branch taken
      N       IF|D |EX
      N+1        IF|D |EX|M |WB
      T             ██|IF|D |EX|M |WB
      T+1              ██|IF|D |EX|M |WB
      delay = 1

    Average delay = 1*0.3 + 1*0.7 = 1.0
  • 10. iii) Branch statically predicted to be taken

    (a) branch not taken (prediction incorrect)
      N       IF|D |EX
      N+1/T      IF|██|IF
      N+1           ██|IF|D |EX|M |WB
      delay = 1

    (b) branch taken (prediction correct)
      N       IF|D |EX
      N+1/T      IF|██|IF|D |EX|M |WB
      T+1           ██|██|IF|D |EX|M |WB
      delay = 2

    Average delay = 1*0.3 + 2*0.7 = 1.7

    Here branch prediction offers no advantage, because target address calculation and decision making are happening in the same stage.
  • 11. iv) Branch target address buffer with 80% hit

    (a) hit and branch not taken (prediction incorrect)
      N        IF|D |EX
      T/N+1       IF|D |IF|D |EX|M |WB
      T+1/N+2        IF|██|IF|D |EX|M |WB
      delay = 2

    (b) hit and branch taken (prediction correct)
      N        IF|D |EX
      T           IF|D |EX|M |WB
      T+1            IF|D |EX|M |WB
      delay = 0
  • 12. (c) miss and branch not taken (prediction correct)
      N        IF|D |EX
      N+1         IF|D |EX|M |WB
      N+2            IF|D |EX|M |WB
      delay = 0

    (d) miss and branch taken (prediction incorrect)
      N        IF|D |EX
      N+1/T       IF|D |IF|D |EX|M |WB
      N+2/T+1        IF|██|IF|D |EX|M |WB
      delay = 2

    Average delay = 0.8*(2*0.3 + 0*0.7) + 0.2*(0*0.3 + 2*0.7) = 0.8*0.6 + 0.2*1.4 = 0.76
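The four average delays in this problem are all probability-weighted sums of the per-case delays read off the timing diagrams. A minimal sketch of that arithmetic follows; the helper `expected_delay` and the dictionary layout are illustrative assumptions, and, as in the solution, buffer hits are treated as independent of the branch outcome.

```python
def expected_delay(cases):
    """Sum of probability * delay over (probability, delay) pairs."""
    return sum(p * d for p, d in cases)


P_TAKEN = 0.7            # from the problem statement
P_HIT = 0.8              # branch target address buffer hit rate
NOT_TAKEN = 1 - P_TAKEN
MISS = 1 - P_HIT

average_delay = {
    "i) no prediction": expected_delay([(NOT_TAKEN, 1), (P_TAKEN, 2)]),
    "ii) delayed branch slot": expected_delay([(NOT_TAKEN, 1), (P_TAKEN, 1)]),
    "iii) predict taken": expected_delay([(NOT_TAKEN, 1), (P_TAKEN, 2)]),
    "iv) target buffer": expected_delay([
        (P_HIT * NOT_TAKEN, 2), (P_HIT * P_TAKEN, 0),
        (MISS * NOT_TAKEN, 0), (MISS * P_TAKEN, 2),
    ]),
}

for case, delay in average_delay.items():
    print(f"{case}: {delay:.2f} cycles")
```

Changing `P_TAKEN` or `P_HIT` shows how sensitive each scheme is to the branch behaviour of the workload.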
  • 13. 4. A processor with dynamic scheduling and issue bound operand fetch has 3 execution units: one LOAD/STORE unit, one ADD/SUB unit and one MUL/DIV unit. It has a reservation station with 1 slot per execution unit and a single register file. Starting with the following instruction sequence in the instruction fetch buffer and empty reservation stations, for each instruction find the cycle in which it will be issued and the cycle in which it will write result. Assume out of order issue and out of order execution.

      load R6, 34(R12)
      load R2, 45(R13)
      mul  R0, R2, R4
      sub  R8, R2, R6
      div  R10, R0, R6
      add  R6, R8, R2

    Execute cycles taken by different instructions are:

      LOAD/STORE : 2
      ADD/SUB    : 1
      MUL        : 2
      DIV        : 4
  • 14. Solution: The following chart shows the execution of the given instruction sequence cycle by cycle. The stages of instruction execution are annotated as follows:

      IF   Instruction fetch
      D    Decode and issue
      EX1  Execute in LOAD/STORE unit
      EX2  Execute in ADD/SUB unit
      EX3  Execute in MUL/DIV unit
      WB   Write back into register file and reservation stations

    Cycle numbers run across the top; each row is one instruction.

      Instr        1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
      load R6  IF  D   EX1 EX1 WB
      load R2  IF  •   •   •   D   EX1 EX1 WB
      mul      IF  D                           EX3 EX3 WB
      sub      IF  D                           EX2 WB
      div      IF  •   •   •   •   •   •   •   •   •   D       EX3 EX3 EX3 EX3 WB
      add      IF  •   •   •   •   •   •   •   •   D       EX2 WB
  • 15. Cycles in which an instruction is waiting for a reservation station are marked as •; in the remaining cycles between issue and the start of execution, the instruction is waiting for one or more operands. As seen in the time chart, the issue and write back cycles for the various instructions are as follows.

      Instruction   issue cycle   write back cycle
      load R6       1             4
      load R2       4             7
      mul           1             10
      sub           1             9
      div           10            16
      add           9             12
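The tabulated cycles can be reproduced with a small scheduling sketch. The timing rules below are assumptions inferred so as to be consistent with the table, not rules stated in the solution: an instruction issues in cycle 1 if its reservation station is empty, otherwise in the cycle in which the previous occupant writes back; an operand written back after issue is captured in the write back cycle; an operand written back in the issue cycle itself is captured one cycle later; execution starts in the cycle after the last operand is captured. All names are illustrative.

```python
EXEC_CYCLES = {"load": 2, "add": 1, "sub": 1, "mul": 2, "div": 4}
UNIT = {"load": "LOAD/STORE", "add": "ADD/SUB", "sub": "ADD/SUB",
        "mul": "MUL/DIV", "div": "MUL/DIV"}

# (opcode, destination register, source registers) in program order
PROGRAM = [
    ("load", "R6", ["R12"]),
    ("load", "R2", ["R13"]),
    ("mul", "R0", ["R2", "R4"]),
    ("sub", "R8", ["R2", "R6"]),
    ("div", "R10", ["R0", "R6"]),
    ("add", "R6", ["R8", "R2"]),
]


def schedule(program):
    """Return one (issue_cycle, write_back_cycle) pair per instruction."""
    station_free_at = {}  # unit -> cycle in which its single station frees up
    last_writer = {}      # register -> index of the latest earlier writer
    result = []
    for op, dest, sources in program:
        unit = UNIT[op]
        issue = station_free_at.get(unit, 1)
        captured = issue  # cycle by which every operand is in the station
        for reg in sources:
            if reg not in last_writer:
                continue  # value is already in the register file
            producer_wb = result[last_writer[reg]][1]
            if producer_wb > issue:
                captured = max(captured, producer_wb)  # taken at write back
            elif producer_wb == issue:
                captured = max(captured, issue + 1)    # assumed one-cycle miss
        write_back = captured + EXEC_CYCLES[op] + 1
        result.append((issue, write_back))
        station_free_at[unit] = write_back
        last_writer[dest] = len(result) - 1
    return result


if __name__ == "__main__":
    for (op, dest, _), (issue, wb) in zip(PROGRAM, schedule(PROGRAM)):
        print(f"{op:<5}{dest:<4} issue {issue:>2}  write back {wb:>2}")
```

Because operands are fetched at issue, div reads R6 in cycle 10, before add overwrites it in cycle 12; the sketch captures this by resolving each source register against the latest earlier writer in program order.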