SlideShare a Scribd company logo
ECE 4100/6100
Advanced Computer Architecture
Lecture 7 Dynamic Scheduling (I)
Prof. Hsien-Hsin Sean Lee
School of Electrical and Computer Engineering
Georgia Institute of Technology
Data Flow Graph (DFG)
i1: r2 = 4(r22)
i2: r10 = 4(r25)
i3: r10 = r2 + r10
i4: 4(r26) = r10
i5: r14 = 8(r27)
i6: r6 = (r22)
i7: r5 = (r23)
i8: r5 = r6 – r5
i9: r4 = r14 * r5
i10: r15 = 12(r27)
i11: r7 = 4(r22)
i12: r8 = 4(r23)
i13: r8 = r7 – r8
i14: r8 = r15* r8
i15: r8 = r4 – r8
i16: (r28) = r8
i1i1 i2i2
i3i3
i4i4
i6i6 i7i7
i8i8
i5i5
i9i9
i11i11 i12i12
i13i13
i10i10
i14i14
i15i15
i16i16
Data Flow Graph (or Data Dependency Graph)
Data Flow Execution Model
• To exploit maximal ILP
• An instruction can be executed immediately after
– All source operands are ready
– Execution unit available
– Destination is ready (to be written)
Dynamic Scheduling
• Exploit ILP at run-time
• Execute instructions out-of-order by a restricted data flow
execution model (still use PC!)
• Hardware will
– Maintain true dependency (data flow manner)
– Maintain exception behavior
– Find ILP within an Instruction Window (pool)
• Need an accurate branch predictor
• Pros
– Scalable performance: allows code to be compiled on one platform,
but also run efficiently on another
– Handle cases where dependency is unknown at compile-time
• Cons
– Hardware complexity (main argument from the VLIW/EPIC camp)
Out-of-Order Execution
i1: r2 = 4(r22)
i2: r10 = 4(r25)
i3: r10 = r2 + r10
i4: 4(r26) = r10
i5: r14 = 8(r27)
i6: r6 = (r22)
i7: r5 = (r23)
i8: r5 = r6 – r5
i9: r4 = r14 * r5
i10: r15 = 12(r27)
i11: r7 = 4(r22)
i12: r8 = 4(r23)
i13: r8 = r7 – r8
i14: r8 = r15* r8
i15: r8 = r4 – r8
i16: (r28) = r8
i1i1 i2i2
i3i3
i4i4
i6i6 i7i7
i8i8
i5i5
i9i9
i11i11 i12i12
i13i13
i10i10
i14i14
i15i15
i16i16
OOO Execution
• OOO execution ≡ out-of-order completion
• OOO execution ≠ out-of-order retirement (commit)
• No (speculative) instruction allowed to retire until it
is confirmed on the right path
• Fetch, decode, issue (i.e., front-end) are still done in
the program order
CDC 6600 Scoreboard Algorithm
• Enable OOO Execution to address long-latency FP
instructions
• Use scoreboard tables to track
– Functional unit status
– Register update status
• Issue and execute instructions whenever
– No structural hazard
– No data hazard
• Cons
– Stop issue when WAW is detected
– Stop writeback when WAR is detected
CDC6600 Scoreboard
FunctionalUnits
Registers
FP MultFP Mult
FP MultFP Mult
FP DivideFP Divide
FP AddFP Add
IntegerInteger
Memory
SCOREBOARDSCOREBOARD
Data bus
Data bus
Data bus
Data bus
Control bus/Status
Int
Mult1
Mult2
Add
Div
Fu
1
1
0
1
1
Busy
Load
Mult
Sub
Div
Op
F1
F0
F8
F2
Dest
R3
F1
F6
F0
Src1
F4
F1
F6
Src2
Int
Mult1
Dep1
Int
Dep2
F0 F1 F2 .. .. .. F31
Mult1 Int Div .. .. .. xxxFU
FU Status Table
Register Update Table
IBM 360
• IBM 360 introduced
– 8-bit = 1 byte
– 32-bit = 1 word
– Byte-addressable memory
– Differentiate an “architecture” from an “implementation”
• IBM 360/91 FPU about 3 years after CDC 6600 (1966-7)
• Tomasulo algorithm
– Dynamic scheduling
– Register renaming
Tomasulo Algorithm
• Goal: High Performance without special compilers
– Dynamic scheduling done completely by HW
– We generally use “supercalar processor” for such category as
opposed to “VLIW” or “EPIC”
• Differences between IBM 360 and CDC 6600 ISA
– IBM has only 2 register specifiers per inst vs. 3 in CDC 6600
• Make WAW and WAR much worse
– IBM has 4 FP registers vs. 8 in CDC 6600
• Smaller number of architectural register, compiler is incapable of
exploiting better register allocation
– IBM has memory-to-register operations
• Why study? Lead to Pentium Pro/II/III/4, Core, Alpha 21264, MIPS
R10000, HP 8000, PowerPC 604
IBM 360/91 FPU w/ Tomasulo Algorithm
• To not stall floating point instructions due to long latency
– Two function units  FP Add + FP Mult/Div
– 360/91 FPU is not pipelined
• Three new Mechanisms
– Reservation StationsReservation Stations (RS)
• 3 in FP Add, 2 in FP mult/div
• Register name is discarded when issue to reservation station
– TagsTags
• 4-bit tag for one of the 11 possible sources (5 RSs + 6 FLB for loads)
• Written for unavailable sources whose results are being generated by
one of the sources (5 RS or 6 FLB)
• New tag assignment eliminates false dependency
– Common Data BusCommon Data Bus (CDB), driven by
• 11 Sources: 5 RS + 6 FLB
• 17 Destinations: 2*5 RS + 3 SDB + 4 FLR
Basic Principles
• Do not rely on a centralized register file !
• RS fetches and buffers an operand as soon as it is
available via CDB
– Eliminating the need to get it from a register (No WAR)
– Data Flow execution model
• Pending instructions designate the RS that will
provide their input (renaming and maintain RAW)
• Due to in-order issue, the register status table
always keeps the latest write (No WAW issue)
Key Representation
• Op  Operation to perform in the units
• Vj  Value of Source 1 (called SINK in 360/91)
• Vk  Value of Source 2 (called SOURCE in
360/91)
• Qj  The RS (tag) will produce source 1
• Qk  The RS (tag) will produce source 2
• A(ddress)  Hold info for the memory address
generation for a load or store
• Qi  Whose value should be stored into the
register
IBM 360/91 FPU w/ Tomasulo Algorithm
From Mem FP Registers (FLR)
ReservationReservation
StationsStations
Common Data Bus (CDB)Common Data Bus (CDB)
To Mem
FP operation stack (FLOS)
FP Load Buffers
(FLB)
Store Data
Buffers
(SDB)
6
5
4
3
2
1
FP AdderFP Adder FP Mult/DivFP Mult/Div
3
2
1
2
1
IBM 360/91 FPU w/ Tomasulo Algorithm
From Mem
FLR
ReservationReservation
StationsStations
Common Data Bus (CDB)Common Data Bus (CDB)
To Mem
FP operation stack (FLOS)
FLB
Store Data
Buffers
(SDB)
6
5
4
3
2
1
FP AdderFP Adder FP Mult/DivFP Mult/Div
3
2
1
2
1
Tag
(Qj)
Tag
(Qk)
Sink
(Vj)
Source
(Vk)
Control
Tags and other info in RSTags and other info in RS
Control
Tag
(Qi) Tags in FLBTags in FLB
Control Tag
Control
RAWRAW Example: i: R2i: R2 ←← R0 + R4 (2 clks)R0 + R4 (2 clks)
j: R8j: R8 ←← R0 + R2 (2 clks)R0 + R2 (2 clks)
RSRS Tag Sink Tag Src
11
22
33
Adder
FLR Busy Tag Data
0 6.0
2 3.5
4 10.0
8 7.8
Cycle #0:Cycle #0:
RSRS Tag Sink Tag Src
44
55
Multiplier/Divider
RSRS Tag Sink Tag Src
11 0 6.0 0 10.0
22
33
Adder
FLR Busy Tag Data
0 6.0
2 1 11 ---
4 10.0
8 7.8
Cycle #1: IssueCycle #1: Issue ii
RSRS Tag Sink Tag Src
44
55
Multiplier/Divider
RSRS Tag Sink Tag Src
11 0 6.0 0 10.0
22 0 6.0 11 ---
33
Adder
FLR Busy Tag Data
0 6.0
2 1 11 ---
4 10.0
8 1 22 ---
Cycle #2: IssueCycle #2: Issue jj
RSRS Tag Sink Tag Src
44
55
Multiplier/Divider
RAWRAW Example: i: R2i: R2 ←← R0 + R4 (2 clks)R0 + R4 (2 clks)
j: R8j: R8 ←← R0 + R2 (2 clks)R0 + R2 (2 clks)
RSRS Tag Sink Tag Src
11
22 0 6.0 0 16.0
33
Adder
FLR Busy Tag Data
0 6.0
2 16.0
4 10.0
8 1 22 ---
Cycle #3:Cycle #3: Broadcasts tag and result: CDB_a=<RS1,16.0>Broadcasts tag and result: CDB_a=<RS1,16.0>
RSRS Tag Sink Tag Src
44
55
Multiplier/Divider
RSRS Tag Sink Tag Src
11
22
33
Adder
FLR Busy Tag Data
0 6.0
2 16.0
4 10.0
8 22.0
Cycle #5:Cycle #5: Broadcasts tag and result: CDB_a=<RS2,22.0>Broadcasts tag and result: CDB_a=<RS2,22.0>
RSRS Tag Sink Tag Src
44
55
Multiplier/Divider
RSRS Tag Sink Tag Src
11
22
33
Adder
FLR Busy Tag Data
0 6.0
2 3.5
4 10.0
8 7.8
Cycle #0:Cycle #0:
RSRS Tag Sink Tag Src
44
55
Multiplier/Divider
RSRS Tag Sink Tag Src
11
22
33
Adder
FLR Busy Tag Data
0 6.0
2 3.5
4 1 44 ---
8 7.8
Cycle #1: IssueCycle #1: Issue ii
RSRS Tag Sink Tag Src
44 0 6.0 0 7.8
55
Multiplier/Divider
RSRS Tag Sink Tag Src
11
22
33
Adder
FLR Busy Tag Data
0 1 55 ---
2 3.5
4 1 44 ---
8 7.8
Cycle #2: IssueCycle #2: Issue jj
RSRS Tag Sink Tag Src
44 0 6.0 0 7.8
55 44 --- 0 3.5
Multiplier/Divider
WARWAR Example:
i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3)
j: R0j: R0 ←← R4 x R2 (3)R4 x R2 (3)
k: R2k: R2 ←← R2 + R8 (2)R2 + R8 (2)
Cycle #3: IssueCycle #3: Issue kk
WARWAR Example:
i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3)
j: R0j: R0 ←← R4 x R2 (3)R4 x R2 (3)
k: R2k: R2 ←← R2 + R8 (2)R2 + R8 (2)
RSRS Tag Sink Tag Src
11 0 3.5 0 7.8
22
33
Adder
FLR Busy Tag Data
0 1 55 ---
2 1 11 ---
4 1 44 ---
8 7.8
RSRS Tag Sink Tag Src
44 0 6.0 0 7.8
55 44 --- 0 3.5
Multiplier/Divider
RSRS Tag Sink Tag Src
11 0 3.5 0 7.8
22
33
Adder
FLR Busy Tag Data
0 1 55 ---
2 1 11 ---
4 46.8
8 7.8
Cycle #4:Cycle #4: Broadcasts CDB_m=<RS4,46.8>;Broadcasts CDB_m=<RS4,46.8>;
RSRS Tag Sink Tag Src
44
55 0 46.8 0 3.5
Multiplier/Divider
RSRS Tag Sink Tag Src
11
22
33
Adder
FLR Busy Tag Data
0 1 55 ---
2 11.3
4 46.8
8 7.8
Cycle #5:Cycle #5: Broadcasts CDB_a=<RS1,11.3>Broadcasts CDB_a=<RS1,11.3>
RSRS Tag Sink Tag Src
44
55 0 46.8 0 3.5
Multiplier/Divider
WARWAR Example:
i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3)
j: R0j: R0 ←← R4 x R2 (3)R4 x R2 (3)
k: R2k: R2 ←← R2 + R8 (2)R2 + R8 (2)
RSRS Tag Sink Tag Src
11
22
33
Adder
FLR Busy Tag Data
0 163.8
2 11.3
4 46.8
8 7.8
Cycle #7:Cycle #7: Broadcasts CDB_m=<RS5,163.8>Broadcasts CDB_m=<RS5,163.8>
RSRS Tag Sink Tag Src
44
55
Multiplier/Divider
RSRS Tag Sink Tag Src
11 0 6.0 44 ---
22
33
Adder
FLR Busy Tag Data
0 6.0
2 1 11 ---
4 1 44 ---
8 7.8
Cycle #2: IssueCycle #2: Issue jj
RSRS Tag Sink Tag Src
44 0 6.0 0 7.8
55
Multiplier/Divider
WAWWAW Example:
RSRS Tag Sink Tag Src
11
22
33
Adder
FLR Busy Tag Data
0 6.0
2 3.5
4 10.0
8 7.8
Cycle #0:Cycle #0:
RSRS Tag Sink Tag Src
44
55
Multiplier/Divider
RSRS Tag Sink Tag Src
11
22
33
Adder
FLR Busy Tag Data
0 6.0
2 3.5
4 1 44 ---
8 7.8
Cycle #1: IssueCycle #1: Issue ii
RSRS Tag Sink Tag Src
44 0 6.0 0 7.8
55
Multiplier/Divider
i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3)
j: R2j: R2 ←← R0 + R4 (2)R0 + R4 (2)
k: R4k: R4 ←← R0 + R8 (2)R0 + R8 (2)
RSRS Tag Sink Tag Src
11 0 6.0 44 ---
22 0 6.0 0 7.8
33
Adder
FLR Busy Tag Data
0 6.0
2 1 11 ---
4 1 22 ---
8 7.8
Cycle #3: IssueCycle #3: Issue kk
RSRS Tag Sink Tag Src
44 0 6.0 0 7.8
55
Multiplier/Divider
WAWWAW Example:
i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3)
j: R2j: R2 ←← R0 + R4 (2)R0 + R4 (2)
k: R4k: R4 ←← R0 + R8 (2)R0 + R8 (2)
RSRS Tag Sink Tag Src
11 0 6.0 0 46.8
22 0 6.0 0 7.8
33
Adder
FLR Busy Tag Data
0 6.0
2 1 11 ---
4 1 22 ---
8 7.8
Cycle #4:Cycle #4: Broadcasts CDB_m=<RS4,46.8>Broadcasts CDB_m=<RS4,46.8>
RSRS Tag Sink Tag Src
44
55
Multiplier/Divider
RSRS Tag Sink Tag Src
11 0 6.0 0 46.8
22
33
Adder
FLR Busy Tag Data
0 6.0
2 1 11 ---
4 13.8
8 7.8
Cycle #5:Cycle #5: Broadcasts CDB_a=<RS2,13.8>Broadcasts CDB_a=<RS2,13.8>
RSRS Tag Sink Tag Src
44
55
Multiplier/Divider
WAWWAW Example:
RSRS Tag Sink Tag Src
11
22
33
Adder
FLR Busy Tag Data
0 6.0
2 52.8
4 13.8
8 7.8
Cycle #6:Cycle #6: Broadcasts CDB_a=<RS1,52.8>Broadcasts CDB_a=<RS1,52.8>
RSRS Tag Sink Tag Src
44
55
Multiplier/Divider
i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3)
j: R2j: R2 ←← R0 + R4 (2)R0 + R4 (2)
k: R4k: R4 ←← R0 + R8 (2)R0 + R8 (2)
Tomasulo Example (H&P Text)
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 Load1 No
LD F2 45+ R3 Load2 No
MULTD F0 F2 F4 Load3 No
SUBD F8 F6 F2
DIVD F10 F0 F6
ADDD F6 F8 F2
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
Mult1 No
Mult2 No
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
0 Qi
These are RS,
we have only
“one FU” for each
type (MUL, ADD,
LD). We reduce
Load from 6 to 3 for
simplicity. SDB is
not shown either
Assumption
• INT (load)  1 cycle
• MULT  10 cycles
• ADD  2 cycles
• DIVIDE  40 cycles
Tomasulo Example Cycle 1
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 Load1 Yes 34+R2
LD F2 45+ R3 Load2 No
MULTD F0 F2 F4 Load3 No
SUBD F8 F6 F2
DIVD F10 F0 F6
ADDD F6 F8 F2
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
Mult1 No
Mult2 No
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
1 Qi Load1
Tomasulo Example Cycle 2
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 Load1 Yes 34+R2
LD F2 45+ R3 2 Load2 Yes 45+R3
MULTD F0 F2 F4 Load3 No
SUBD F8 F6 F2
DIVD F10 F0 F6
ADDD F6 F8 F2
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
Mult1 No
Mult2 No
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
2 Qi Load2 Load1
Note: Unlike CDC6600, RS enables multiple outstanding loads
Load is calculating the effective address
Tomasulo Example Cycle 3
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 Load1 Yes 34+R2
LD F2 45+ R3 2 Load2 Yes 45+R3
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2
DIVD F10 F0 F6
ADDD F6 F8 F2
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
Mult1 Yes MULTD R(F4) Load2
Mult2 No
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
3 Qi Mult1 Load2 Load1
• Note: registers names are removed (“renamed”) in Reservation Stations; MULT issued
vs. scoreboard
• Load1 completing; what is waiting for Load1?
Tomasulo Example Cycle 4
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 Load2 Yes 45+R3
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2 4
DIVD F10 F0 F6
ADDD F6 F8 F2
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 Yes SUBD M(A1) Load2
Add2 No
Add3 No
Mult1 Yes MULTD R(F4) Load2
Mult2 No
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
4 Qi Mult1 Load2 M(A1) Add1
• Load1 write to CDB; Load2 completing; what is waiting for Load2?
Tomasulo Example Cycle 5
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2 4
DIVD F10 F0 F6 5
ADDD F6 F8 F2
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
2 Add1 Yes SUBD M(A1) M(A2)
Add2 No
Add3 No
10 Mult1 Yes MULTD M(A2) R(F4)
Mult2 Yes DIVD R(F6) Mult1
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
5 Qi Mult1 M(A2) Add1 Mult2
Tomasulo Example Cycle 6
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2 4
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
1 Add1 Yes SUBD M(A1) M(A2)
Add2 Yes ADDD R(F2) Add1
Add3 No
9 Mult1 Yes MULTD M(A2) R(F4)
Mult2 Yes DIVD R(F6) Mult1
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
6 Qi Mult1 Add2 Add1 Mult2
• R(F6) was entered in Cycle 5
• Issue ADDD here vs. scoreboard?
Tomasulo Example Cycle 7
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2 4 7
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
0 Add1 Yes SUBD M(A1) M(A2)
Add2 Yes ADDD R(F2) Add1
Add3 No
8 Mult1 Yes MULTD M(A2) R(F4)
Mult2 Yes DIVD R(F6) Mult1
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
7 Qi Mult1 Add2 Add1 Mult2
• Add1 completing; what is waiting for it?
Tomasulo Example Cycle 8
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
2 Add2 Yes ADDD(M1-M2)R(F2)
Add3 No
7 Mult1 Yes MULTD M(A2) R(F4)
Mult2 Yes DIVD R(F6) Mult1
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
8 Qi Mult1 Add2 (M1-M2)Mult2
Tomasulo Example Cycle 9
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
1 Add2 Yes ADDD(M1-M2)R(F2)
Add3 No
6 Mult1 Yes MULTD M(A2) R(F4)
Mult2 Yes DIVD R(F6) Mult1
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
9 FU Mult1 Add2 Mult2
Tomasulo Example Cycle 10
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6 10
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
0 Add2 Yes ADDD(M1-M2)R(F2)
Add3 No
5 Mult1 Yes MULTD M(A2) R(F4)
Mult2 Yes DIVD R(F6) Mult1
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
10 FU Mult1 Add2 Mult2
• Add2 completing; what is waiting for it?
Tomasulo Example Cycle 11
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6 10 11
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
4 Mult1 Yes MULTD M(A2) R(F4)
Mult2 Yes DIVD R(F6) Mult1
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
11 FU Mult1 (M1-M2+M(A2)) Mult2
• Write result of ADDD here vs. scoreboard?
• All quick instructions complete in this cycle!
Tomasulo Example Cycle 12
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6 10 11
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
3 Mult1 Yes MULTD M(A2) R(F4)
Mult2 Yes DIVD R(F6) Mult1
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
12 FU Mult1 Mult2
Tomasulo Example Cycle 13
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6 10 11
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
2 Mult1 Yes MULTD M(A2) R(F4)
Mult2 Yes DIVD R(F6) Mult1
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
13 FU Mult1 Mult2
Tomasulo Example Cycle 14
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6 10 11
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
1 Mult1 Yes MULTD M(A2) R(F4)
Mult2 Yes DIVD R(F6) Mult1
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
14 FU Mult1 Mult2
Tomasulo Example Cycle 15
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 15 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6 10 11
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
0 Mult1 Yes MULTD M(A2) R(F4)
Mult2 Yes DIVD R(F6) Mult1
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
15 FU Mult1 Mult2
Tomasulo Example Cycle 16
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 15 16 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6 10 11
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
Mult1 No
40 Mult2 Yes DIVD M*F4 R(F6)
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
16 FU M*F4 Mult2
Faster than light computation
(skip a couple of cycles)
Tomasulo Example Cycle 55
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 15 16 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5
ADDD F6 F8 F2 6 10 11
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
Mult1 No
1 Mult2 Yes DIVD M*F4 R(F6)
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
55 FU Mult2
Tomasulo Example Cycle 56
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 15 16 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5 56
ADDD F6 F8 F2 6 10 11
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
Mult1 No
0 Mult2 Yes DIVD M*F4 R(F6)
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
56 FU Mult2
• Mult2 is completing; what is waiting for it?
Tomasulo Example Cycle 57
Instruction status: Exec Write
Instruction j k Issue Comp Result Busy Address
LD F6 34+ R2 1 3 4 Load1 No
LD F2 45+ R3 2 4 5 Load2 No
MULTD F0 F2 F4 3 15 16 Load3 No
SUBD F8 F6 F2 4 7 8
DIVD F10 F0 F6 5 56 57
ADDD F6 F8 F2 6 10 11
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
Mult1 No
Mult2 No
Register result status:
Clock F0 F2 F4 F6 F8 F10 F12 ... F30
57 FU Result
• Once again: In-order issue, out-of-order execution and
completion.
Compare to Scoreboard Cycle 62
Instruction status: Read Exec Write Exec Write
Instruction j k Issue Oper Comp Result Issue ComplResult
LD F6 34+ R2 1 2 3 4 1 3 4
LD F2 45+ R3 5 6 7 8 2 4 5
MULTD F0 F2 F4 6 9 19 20 3 15 16
SUBD F8 F6 F2 7 9 11 12 4 7 8
DIVD F10 F0 F6 8 21 61 62 5 56 57
ADDD F6 F8 F2 13 14 16 22 6 10 11
• Why take longer on scoreboard/6600?
• Structural Hazards
• Lack of forwarding
Issues in Tomasulo Algorithm
• CDB at high speed?
• Precise exception issues
• Speculative instructions
– Branch prediction enlarges instruction window
– How to rollback when mispredicted?

More Related Content

What's hot

Lec17 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Me...
Lec17 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Me...Lec17 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Me...
Lec17 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Me...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIWLec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Hsien-Hsin Sean Lee, Ph.D.
 
Lec19 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Pr...
Lec19 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Pr...Lec19 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Pr...
Lec19 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Pr...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec2 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- ILP
Lec2 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- ILPLec2 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- ILP
Lec2 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- ILP
Hsien-Hsin Sean Lee, Ph.D.
 
Lec8 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
Lec8 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...Lec8 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
Lec8 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec10 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part2
Lec10 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part2Lec10 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part2
Lec10 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part2
Hsien-Hsin Sean Lee, Ph.D.
 
Lec05
Lec05Lec05
Assembly Language Lecture 3
Assembly Language Lecture 3Assembly Language Lecture 3
Assembly Language Lecture 3
Motaz Saad
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Marina Kolpakova
 
Pragmatic Optimization in Modern Programming - Mastering Compiler Optimizations
Pragmatic Optimization in Modern Programming - Mastering Compiler OptimizationsPragmatic Optimization in Modern Programming - Mastering Compiler Optimizations
Pragmatic Optimization in Modern Programming - Mastering Compiler Optimizations
Marina Kolpakova
 
ARM instruction set
ARM instruction  setARM instruction  set
ARM instruction set
Karthik Vivek
 
Pipelining and co processor.
Pipelining and co processor.Pipelining and co processor.
Pipelining and co processor.
Piyush Rochwani
 
Code GPU with CUDA - Device code optimization principle
Code GPU with CUDA - Device code optimization principleCode GPU with CUDA - Device code optimization principle
Code GPU with CUDA - Device code optimization principle
Marina Kolpakova
 
Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...
Marina Kolpakova
 
Berkeley Packet Filters
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet Filters
Kernel TLV
 
Introduction to Assembly Language
Introduction to Assembly LanguageIntroduction to Assembly Language
Introduction to Assembly Language
Motaz Saad
 
Programming Protocol-Independent Packet Processors
Programming Protocol-Independent Packet ProcessorsProgramming Protocol-Independent Packet Processors
Programming Protocol-Independent Packet Processors
Open Networking Summits
 
Code GPU with CUDA - SIMT
Code GPU with CUDA - SIMTCode GPU with CUDA - SIMT
Code GPU with CUDA - SIMT
Marina Kolpakova
 
BPF - All your packets belong to me
BPF - All your packets belong to meBPF - All your packets belong to me
BPF - All your packets belong to me
_xhr_
 
Code GPU with CUDA - Memory Subsystem
Code GPU with CUDA - Memory SubsystemCode GPU with CUDA - Memory Subsystem
Code GPU with CUDA - Memory Subsystem
Marina Kolpakova
 

What's hot (20)

Lec17 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Me...
Lec17 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Me...Lec17 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Me...
Lec17 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Me...
 
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIWLec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
 
Lec19 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Pr...
Lec19 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Pr...Lec19 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Pr...
Lec19 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Pr...
 
Lec2 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- ILP
Lec2 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- ILPLec2 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- ILP
Lec2 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- ILP
 
Lec8 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
Lec8 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...Lec8 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
Lec8 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Sch...
 
Lec10 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part2
Lec10 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part2Lec10 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part2
Lec10 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part2
 
Lec05
Lec05Lec05
Lec05
 
Assembly Language Lecture 3
Assembly Language Lecture 3Assembly Language Lecture 3
Assembly Language Lecture 3
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
 
Pragmatic Optimization in Modern Programming - Mastering Compiler Optimizations
Pragmatic Optimization in Modern Programming - Mastering Compiler OptimizationsPragmatic Optimization in Modern Programming - Mastering Compiler Optimizations
Pragmatic Optimization in Modern Programming - Mastering Compiler Optimizations
 
ARM instruction set
ARM instruction  setARM instruction  set
ARM instruction set
 
Pipelining and co processor.
Pipelining and co processor.Pipelining and co processor.
Pipelining and co processor.
 
Code GPU with CUDA - Device code optimization principle
Code GPU with CUDA - Device code optimization principleCode GPU with CUDA - Device code optimization principle
Code GPU with CUDA - Device code optimization principle
 
Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...
 
Berkeley Packet Filters
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet Filters
 
Introduction to Assembly Language
Introduction to Assembly LanguageIntroduction to Assembly Language
Introduction to Assembly Language
 
Programming Protocol-Independent Packet Processors
Programming Protocol-Independent Packet ProcessorsProgramming Protocol-Independent Packet Processors
Programming Protocol-Independent Packet Processors
 
Code GPU with CUDA - SIMT
Code GPU with CUDA - SIMTCode GPU with CUDA - SIMT
Code GPU with CUDA - SIMT
 
BPF - All your packets belong to me
BPF - All your packets belong to meBPF - All your packets belong to me
BPF - All your packets belong to me
 
Code GPU with CUDA - Memory Subsystem
Code GPU with CUDA - Memory SubsystemCode GPU with CUDA - Memory Subsystem
Code GPU with CUDA - Memory Subsystem
 

Viewers also liked

Lec20 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Da...
Lec20 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Da...Lec20 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Da...
Lec20 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Da...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec6 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Can...
Lec6 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Can...Lec6 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Can...
Lec6 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Can...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec10 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Mu...
Lec10 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Mu...Lec10 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Mu...
Lec10 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Mu...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec8 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Qui...
Lec8 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Qui...Lec8 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Qui...
Lec8 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Qui...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec2 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Num...
Lec2 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Num...Lec2 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Num...
Lec2 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Num...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec14 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Se...
Lec14 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Se...Lec14 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Se...
Lec14 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Se...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- IntroLec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Hsien-Hsin Sean Lee, Ph.D.
 
Lec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Performance
Lec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- PerformanceLec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Performance
Lec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Performance
Hsien-Hsin Sean Lee, Ph.D.
 
Lec9 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Com...
Lec9 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Com...Lec9 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Com...
Lec9 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Com...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec3 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMO...
Lec3 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMO...Lec3 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMO...
Lec3 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMO...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec13 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Sh...
Lec13 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Sh...Lec13 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Sh...
Lec13 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Sh...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec16 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Fi...
Lec16 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Fi...Lec16 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Fi...
Lec16 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Fi...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec12 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Ad...
Lec12 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Ad...Lec12 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Ad...
Lec12 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Ad...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec4 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMOS
Lec4 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMOSLec4 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMOS
Lec4 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMOS
Hsien-Hsin Sean Lee, Ph.D.
 
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec11 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- De...
Lec11 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- De...Lec11 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- De...
Lec11 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- De...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec5 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Boo...
Lec5 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Boo...Lec5 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Boo...
Lec5 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Boo...
Hsien-Hsin Sean Lee, Ph.D.
 
Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1
Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1
Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1
Hsien-Hsin Sean Lee, Ph.D.
 
CMOS Topic 6 -_designing_combinational_logic_circuits
CMOS Topic 6 -_designing_combinational_logic_circuitsCMOS Topic 6 -_designing_combinational_logic_circuits
CMOS Topic 6 -_designing_combinational_logic_circuits
Ikhwan_Fakrudin
 

Viewers also liked (20)

Lec20 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Da...
Lec20 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Da...Lec20 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Da...
Lec20 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Da...
 
Lec6 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Can...
Lec6 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Can...Lec6 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Can...
Lec6 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Can...
 
Lec10 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Mu...
Lec10 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Mu...Lec10 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Mu...
Lec10 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Mu...
 
Lec8 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Qui...
Lec8 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Qui...Lec8 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Qui...
Lec8 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Qui...
 
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...
 
Lec2 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Num...
Lec2 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Num...Lec2 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Num...
Lec2 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Num...
 
Lec14 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Se...
Lec14 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Se...Lec14 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Se...
Lec14 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Se...
 
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- IntroLec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
 
Lec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Performance
Lec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- PerformanceLec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Performance
Lec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Performance
 
Lec9 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Com...
Lec9 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Com...Lec9 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Com...
Lec9 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Com...
 
Lec3 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMO...
Lec3 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMO...Lec3 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMO...
Lec3 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMO...
 
Lec13 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Sh...
Lec13 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Sh...Lec13 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Sh...
Lec13 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Sh...
 
Lec16 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Fi...
Lec16 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Fi...Lec16 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Fi...
Lec16 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Fi...
 
Lec12 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Ad...
Lec12 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Ad...Lec12 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Ad...
Lec12 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Ad...
 
Lec4 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMOS
Lec4 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMOSLec4 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMOS
Lec4 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- CMOS
 
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
Lec15 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Re...
 
Lec11 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- De...
Lec11 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- De...Lec11 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- De...
Lec11 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- De...
 
Lec5 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Boo...
Lec5 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Boo...Lec5 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Boo...
Lec5 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Boo...
 
Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1
Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1
Lec9 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part 1
 
CMOS Topic 6 -_designing_combinational_logic_circuits
CMOS Topic 6 -_designing_combinational_logic_circuitsCMOS Topic 6 -_designing_combinational_logic_circuits
CMOS Topic 6 -_designing_combinational_logic_circuits
 

Similar to Lec7 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Scheduling part 1

Oracle Basics and Architecture
Oracle Basics and ArchitectureOracle Basics and Architecture
Oracle Basics and Architecture
Sidney Chen
 
Pipelining1
Pipelining1Pipelining1
Debugging Ruby
Debugging RubyDebugging Ruby
Debugging Ruby
Aman Gupta
 
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Nelson Calero
 
Arm architecture
Arm architectureArm architecture
Debugging 2013- Jesper Brouer
Debugging 2013- Jesper BrouerDebugging 2013- Jesper Brouer
Debugging 2013- Jesper Brouer
Mediehuset Ingeniøren Live
 
Dynamic user trace
Dynamic user traceDynamic user trace
Dynamic user trace
Vipin Varghese
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
Brendan Gregg
 
8051 Microcontroller
8051 Microcontroller8051 Microcontroller
8051 Microcontroller
Dr. Ritula Thakur
 
Crash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_TizenCrash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_Tizen
Lex Yu
 
Linux 4.x Tracing: Performance Analysis with bcc/BPF
Linux 4.x Tracing: Performance Analysis with bcc/BPFLinux 4.x Tracing: Performance Analysis with bcc/BPF
Linux 4.x Tracing: Performance Analysis with bcc/BPF
Brendan Gregg
 
Debugging Ruby Systems
Debugging Ruby SystemsDebugging Ruby Systems
Debugging Ruby Systems
Engine Yard
 
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
Brendan Gregg
 
Unit II Arm 7 Introduction
Unit II Arm 7 IntroductionUnit II Arm 7 Introduction
Unit II Arm 7 Introduction
Dr. Pankaj Zope
 
Exploring the x64
Exploring the x64Exploring the x64
Exploring the x64
FFRI, Inc.
 
Lec04
Lec04Lec04
Lec04
Lec04Lec04
WAN SDN meet Segment Routing
WAN SDN meet Segment RoutingWAN SDN meet Segment Routing
WAN SDN meet Segment Routing
APNIC
 
May2010 hex-core-opt
May2010 hex-core-optMay2010 hex-core-opt
May2010 hex-core-opt
Jeff Larkin
 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
Vipin Varghese
 

Similar to Lec7 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Scheduling part 1 (20)

Oracle Basics and Architecture
Oracle Basics and ArchitectureOracle Basics and Architecture
Oracle Basics and Architecture
 
Pipelining1
Pipelining1Pipelining1
Pipelining1
 
Debugging Ruby
Debugging RubyDebugging Ruby
Debugging Ruby
 
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
 
Arm architecture
Arm architectureArm architecture
Arm architecture
 
Debugging 2013- Jesper Brouer
Debugging 2013- Jesper BrouerDebugging 2013- Jesper Brouer
Debugging 2013- Jesper Brouer
 
Dynamic user trace
Dynamic user traceDynamic user trace
Dynamic user trace
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
 
8051 Microcontroller
8051 Microcontroller8051 Microcontroller
8051 Microcontroller
 
Crash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_TizenCrash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_Tizen
 
Linux 4.x Tracing: Performance Analysis with bcc/BPF
Linux 4.x Tracing: Performance Analysis with bcc/BPFLinux 4.x Tracing: Performance Analysis with bcc/BPF
Linux 4.x Tracing: Performance Analysis with bcc/BPF
 
Debugging Ruby Systems
Debugging Ruby SystemsDebugging Ruby Systems
Debugging Ruby Systems
 
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
 
Unit II Arm 7 Introduction
Unit II Arm 7 IntroductionUnit II Arm 7 Introduction
Unit II Arm 7 Introduction
 
Exploring the x64
Exploring the x64Exploring the x64
Exploring the x64
 
Lec04
Lec04Lec04
Lec04
 
Lec04
Lec04Lec04
Lec04
 
WAN SDN meet Segment Routing
WAN SDN meet Segment RoutingWAN SDN meet Segment Routing
WAN SDN meet Segment Routing
 
May2010 hex-core-opt
May2010 hex-core-optMay2010 hex-core-opt
May2010 hex-core-opt
 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
 

Recently uploaded

欧洲杯投注-欧洲杯投注押注app-欧洲杯投注押注app官网|【​网址​🎉ac10.net🎉​】
欧洲杯投注-欧洲杯投注押注app-欧洲杯投注押注app官网|【​网址​🎉ac10.net🎉​】欧洲杯投注-欧洲杯投注押注app-欧洲杯投注押注app官网|【​网址​🎉ac10.net🎉​】
欧洲杯投注-欧洲杯投注押注app-欧洲杯投注押注app官网|【​网址​🎉ac10.net🎉​】
akrooshsaleem36
 
一比一原版不列颠哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版不列颠哥伦比亚大学毕业证(UBC毕业证书)学历如何办理一比一原版不列颠哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版不列颠哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
bttak
 
一比一原版西三一大学毕业证(TWU毕业证书)学历如何办理
一比一原版西三一大学毕业证(TWU毕业证书)学历如何办理一比一原版西三一大学毕业证(TWU毕业证书)学历如何办理
一比一原版西三一大学毕业证(TWU毕业证书)学历如何办理
bttak
 
欧洲杯赌钱-欧洲杯赌钱冠军-欧洲杯赌钱冠军赔率|【​网址​🎉ac10.net🎉​】
欧洲杯赌钱-欧洲杯赌钱冠军-欧洲杯赌钱冠军赔率|【​网址​🎉ac10.net🎉​】欧洲杯赌钱-欧洲杯赌钱冠军-欧洲杯赌钱冠军赔率|【​网址​🎉ac10.net🎉​】
欧洲杯赌钱-欧洲杯赌钱冠军-欧洲杯赌钱冠军赔率|【​网址​🎉ac10.net🎉​】
hanniaarias53
 
一比一原版(SBU毕业证书)肯特州立大学毕业证如何办理
一比一原版(SBU毕业证书)肯特州立大学毕业证如何办理一比一原版(SBU毕业证书)肯特州立大学毕业证如何办理
一比一原版(SBU毕业证书)肯特州立大学毕业证如何办理
mbawufebxi
 
一比一原版圣托马斯大学毕业证(UST毕业证书)学历如何办理
一比一原版圣托马斯大学毕业证(UST毕业证书)学历如何办理一比一原版圣托马斯大学毕业证(UST毕业证书)学历如何办理
一比一原版圣托马斯大学毕业证(UST毕业证书)学历如何办理
bttak
 
按照学校原版(UPenn文凭证书)宾夕法尼亚大学毕业证快速办理
按照学校原版(UPenn文凭证书)宾夕法尼亚大学毕业证快速办理按照学校原版(UPenn文凭证书)宾夕法尼亚大学毕业证快速办理
按照学校原版(UPenn文凭证书)宾夕法尼亚大学毕业证快速办理
uwoso
 
欧洲杯体彩-欧洲杯体彩比赛投注-欧洲杯体彩比赛投注官网|【​网址​🎉ac99.net🎉​】
欧洲杯体彩-欧洲杯体彩比赛投注-欧洲杯体彩比赛投注官网|【​网址​🎉ac99.net🎉​】欧洲杯体彩-欧洲杯体彩比赛投注-欧洲杯体彩比赛投注官网|【​网址​🎉ac99.net🎉​】
欧洲杯体彩-欧洲杯体彩比赛投注-欧洲杯体彩比赛投注官网|【​网址​🎉ac99.net🎉​】
lopezkatherina914
 
买(usyd毕业证书)澳洲悉尼大学毕业证研究生文凭证书原版一模一样
买(usyd毕业证书)澳洲悉尼大学毕业证研究生文凭证书原版一模一样买(usyd毕业证书)澳洲悉尼大学毕业证研究生文凭证书原版一模一样
买(usyd毕业证书)澳洲悉尼大学毕业证研究生文凭证书原版一模一样
nvoyobt
 
"IOS 18 CONTROL CENTRE REVAMP STREAMLINED IPHONE SHUTDOWN MADE EASIER"
"IOS 18 CONTROL CENTRE REVAMP STREAMLINED IPHONE SHUTDOWN MADE EASIER""IOS 18 CONTROL CENTRE REVAMP STREAMLINED IPHONE SHUTDOWN MADE EASIER"
"IOS 18 CONTROL CENTRE REVAMP STREAMLINED IPHONE SHUTDOWN MADE EASIER"
Emmanuel Onwumere
 

Recently uploaded (10)

欧洲杯投注-欧洲杯投注押注app-欧洲杯投注押注app官网|【​网址​🎉ac10.net🎉​】
欧洲杯投注-欧洲杯投注押注app-欧洲杯投注押注app官网|【​网址​🎉ac10.net🎉​】欧洲杯投注-欧洲杯投注押注app-欧洲杯投注押注app官网|【​网址​🎉ac10.net🎉​】
欧洲杯投注-欧洲杯投注押注app-欧洲杯投注押注app官网|【​网址​🎉ac10.net🎉​】
 
一比一原版不列颠哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版不列颠哥伦比亚大学毕业证(UBC毕业证书)学历如何办理一比一原版不列颠哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版不列颠哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
 
一比一原版西三一大学毕业证(TWU毕业证书)学历如何办理
一比一原版西三一大学毕业证(TWU毕业证书)学历如何办理一比一原版西三一大学毕业证(TWU毕业证书)学历如何办理
一比一原版西三一大学毕业证(TWU毕业证书)学历如何办理
 
欧洲杯赌钱-欧洲杯赌钱冠军-欧洲杯赌钱冠军赔率|【​网址​🎉ac10.net🎉​】
欧洲杯赌钱-欧洲杯赌钱冠军-欧洲杯赌钱冠军赔率|【​网址​🎉ac10.net🎉​】欧洲杯赌钱-欧洲杯赌钱冠军-欧洲杯赌钱冠军赔率|【​网址​🎉ac10.net🎉​】
欧洲杯赌钱-欧洲杯赌钱冠军-欧洲杯赌钱冠军赔率|【​网址​🎉ac10.net🎉​】
 
一比一原版(SBU毕业证书)肯特州立大学毕业证如何办理
一比一原版(SBU毕业证书)肯特州立大学毕业证如何办理一比一原版(SBU毕业证书)肯特州立大学毕业证如何办理
一比一原版(SBU毕业证书)肯特州立大学毕业证如何办理
 
一比一原版圣托马斯大学毕业证(UST毕业证书)学历如何办理
一比一原版圣托马斯大学毕业证(UST毕业证书)学历如何办理一比一原版圣托马斯大学毕业证(UST毕业证书)学历如何办理
一比一原版圣托马斯大学毕业证(UST毕业证书)学历如何办理
 
按照学校原版(UPenn文凭证书)宾夕法尼亚大学毕业证快速办理
按照学校原版(UPenn文凭证书)宾夕法尼亚大学毕业证快速办理按照学校原版(UPenn文凭证书)宾夕法尼亚大学毕业证快速办理
按照学校原版(UPenn文凭证书)宾夕法尼亚大学毕业证快速办理
 
欧洲杯体彩-欧洲杯体彩比赛投注-欧洲杯体彩比赛投注官网|【​网址​🎉ac99.net🎉​】
欧洲杯体彩-欧洲杯体彩比赛投注-欧洲杯体彩比赛投注官网|【​网址​🎉ac99.net🎉​】欧洲杯体彩-欧洲杯体彩比赛投注-欧洲杯体彩比赛投注官网|【​网址​🎉ac99.net🎉​】
欧洲杯体彩-欧洲杯体彩比赛投注-欧洲杯体彩比赛投注官网|【​网址​🎉ac99.net🎉​】
 
买(usyd毕业证书)澳洲悉尼大学毕业证研究生文凭证书原版一模一样
买(usyd毕业证书)澳洲悉尼大学毕业证研究生文凭证书原版一模一样买(usyd毕业证书)澳洲悉尼大学毕业证研究生文凭证书原版一模一样
买(usyd毕业证书)澳洲悉尼大学毕业证研究生文凭证书原版一模一样
 
"IOS 18 CONTROL CENTRE REVAMP STREAMLINED IPHONE SHUTDOWN MADE EASIER"
"IOS 18 CONTROL CENTRE REVAMP STREAMLINED IPHONE SHUTDOWN MADE EASIER""IOS 18 CONTROL CENTRE REVAMP STREAMLINED IPHONE SHUTDOWN MADE EASIER"
"IOS 18 CONTROL CENTRE REVAMP STREAMLINED IPHONE SHUTDOWN MADE EASIER"
 

Lec7 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Dynamic Scheduling part 1

  • 1. ECE 4100/6100 Advanced Computer Architecture Lecture 7 Dynamic Scheduling (I) Prof. Hsien-Hsin Sean Lee School of Electrical and Computer Engineering Georgia Institute of Technology
  • 2. Data Flow Graph (DFG) i1: r2 = 4(r22) i2: r10 = 4(r25) i3: r10 = r2 + r10 i4: 4(r26) = r10 i5: r14 = 8(r27) i6: r6 = (r22) i7: r5 = (r23) i8: r5 = r6 – r5 i9: r4 = r14 * r5 i10: r15 = 12(r27) i11: r7 = 4(r22) i12: r8 = 4(r23) i13: r8 = r7 – r8 i14: r8 = r15* r8 i15: r8 = r4 – r8 i16: (r28) = r8 i1i1 i2i2 i3i3 i4i4 i6i6 i7i7 i8i8 i5i5 i9i9 i11i11 i12i12 i13i13 i10i10 i14i14 i15i15 i16i16 Data Flow Graph (or Data Dependency Graph)
  • 3. Data Flow Execution Model • To exploit maximal ILP • An instruction can be executed immediately after – All source operands are ready – Execution unit available – Destination is ready (to be written)
  • 4. Dynamic Scheduling • Exploit ILP at run-time • Execute instructions out-of-order by a restricted data flow execution model (still use PC!) • Hardware will – Maintain true dependency (data flow manner) – Maintain exception behavior – Find ILP within an Instruction Window (pool) • Need an accurate branch predictor • Pros – Scalable performance: allows code to be compiled on one platform, but also run efficiently on another – Handle cases where dependency is unknown at compile-time • Cons – Hardware complexity (main argument from the VLIW/EPIC camp)
  • 5. Out-of-Order Execution i1: r2 = 4(r22) i2: r10 = 4(r25) i3: r10 = r2 + r10 i4: 4(r26) = r10 i5: r14 = 8(r27) i6: r6 = (r22) i7: r5 = (r23) i8: r5 = r6 – r5 i9: r4 = r14 * r5 i10: r15 = 12(r27) i11: r7 = 4(r22) i12: r8 = 4(r23) i13: r8 = r7 – r8 i14: r8 = r15* r8 i15: r8 = r4 – r8 i16: (r28) = r8 i1i1 i2i2 i3i3 i4i4 i6i6 i7i7 i8i8 i5i5 i9i9 i11i11 i12i12 i13i13 i10i10 i14i14 i15i15 i16i16
  • 6. OOO Execution • OOO execution ≡ out-of-order completion • OOO execution ≠ out-of-order retirement (commit) • No (speculative) instruction allowed to retire until it is confirmed on the right path • Fetch, decode, issue (i.e., front-end) are still done in the program order
  • 7. CDC 6600 Scoreboard Algorithm • Enable OOO Execution to address long-latency FP instructions • Use scoreboard tables to track – Functional unit status – Register update status • Issue and execute instructions whenever – No structural hazard – No data hazard • Cons – Stop issue when WAW is detected – Stop writeback when WAR is detected
  • 8. CDC6600 Scoreboard FunctionalUnits Registers FP MultFP Mult FP MultFP Mult FP DivideFP Divide FP AddFP Add IntegerInteger Memory SCOREBOARDSCOREBOARD Data bus Data bus Data bus Data bus Control bus/Status Int Mult1 Mult2 Add Div Fu 1 1 0 1 1 Busy Load Mult Sub Div Op F1 F0 F8 F2 Dest R3 F1 F6 F0 Src1 F4 F1 F6 Src2 Int Mult1 Dep1 Int Dep2 F0 F1 F2 .. .. .. F31 Mult1 Int Div .. .. .. xxxFU FU Status Table Register Update Table
  • 9. IBM 360 • IBM 360 introduced – 8-bit = 1 byte – 32-bit = 1 word – Byte-addressable memory – Differentiate an “architecture” from an “implementation” • IBM 360/91 FPU about 3 years after CDC 6600 (1966-7) • Tomasulo algorithm – Dynamic scheduling – Register renaming
  • 10. Tomasulo Algorithm • Goal: High Performance without special compilers – Dynamic scheduling done completely by HW – We generally use “supercalar processor” for such category as opposed to “VLIW” or “EPIC” • Differences between IBM 360 and CDC 6600 ISA – IBM has only 2 register specifiers per inst vs. 3 in CDC 6600 • Make WAW and WAR much worse – IBM has 4 FP registers vs. 8 in CDC 6600 • Smaller number of architectural register, compiler is incapable of exploiting better register allocation – IBM has memory-to-register operations • Why study? Lead to Pentium Pro/II/III/4, Core, Alpha 21264, MIPS R10000, HP 8000, PowerPC 604
  • 11. IBM 360/91 FPU w/ Tomasulo Algorithm • To not stall floating point instructions due to long latency – Two function units  FP Add + FP Mult/Div – 360/91 FPU is not pipelined • Three new Mechanisms – Reservation StationsReservation Stations (RS) • 3 in FP Add, 2 in FP mult/div • Register name is discarded when issue to reservation station – TagsTags • 4-bit tag for one of the 11 possible sources (5 RSs + 6 FLB for loads) • Written for unavailable sources whose results are being generated by one of the sources (5 RS or 6 FLB) • New tag assignment eliminates false dependency – Common Data BusCommon Data Bus (CDB), driven by • 11 Sources: 5 RS + 6 FLB • 17 Destinations: 2*5 RS + 3 SDB + 4 FLR
  • 12. Basic Principles • Do not rely on a centralized register file ! • RS fetches and buffers an operand as soon as it is available via CDB – Eliminating the need to get it from a register (No WAR) – Data Flow execution model • Pending instructions designate the RS that will provide their input (renaming and maintain RAW) • Due to in-order issue, the register status table always keeps the latest write (No WAW issue)
  • 13. Key Representation • Op  Operation to perform in the units • Vj  Value of Source 1 (called SINK in 360/91) • Vk  Value of Source 2 (called SOURCE in 360/91) • Qj  The RS (tag) will produce source 1 • Qk  The RS (tag) will produce source 2 • A(ddress)  Hold info for the memory address generation for a load or store • Qi  Whose value should be stored into the register
  • 14. IBM 360/91 FPU w/ Tomasulo Algorithm From Mem FP Registers (FLR) ReservationReservation StationsStations Common Data Bus (CDB)Common Data Bus (CDB) To Mem FP operation stack (FLOS) FP Load Buffers (FLB) Store Data Buffers (SDB) 6 5 4 3 2 1 FP AdderFP Adder FP Mult/DivFP Mult/Div 3 2 1 2 1
  • 15. IBM 360/91 FPU w/ Tomasulo Algorithm From Mem FLR ReservationReservation StationsStations Common Data Bus (CDB)Common Data Bus (CDB) To Mem FP operation stack (FLOS) FLB Store Data Buffers (SDB) 6 5 4 3 2 1 FP AdderFP Adder FP Mult/DivFP Mult/Div 3 2 1 2 1 Tag (Qj) Tag (Qk) Sink (Vj) Source (Vk) Control Tags and other info in RSTags and other info in RS Control Tag (Qi) Tags in FLBTags in FLB Control Tag Control
  • 16. RAWRAW Example: i: R2i: R2 ←← R0 + R4 (2 clks)R0 + R4 (2 clks) j: R8j: R8 ←← R0 + R2 (2 clks)R0 + R2 (2 clks) RSRS Tag Sink Tag Src 11 22 33 Adder FLR Busy Tag Data 0 6.0 2 3.5 4 10.0 8 7.8 Cycle #0:Cycle #0: RSRS Tag Sink Tag Src 44 55 Multiplier/Divider RSRS Tag Sink Tag Src 11 0 6.0 0 10.0 22 33 Adder FLR Busy Tag Data 0 6.0 2 1 11 --- 4 10.0 8 7.8 Cycle #1: IssueCycle #1: Issue ii RSRS Tag Sink Tag Src 44 55 Multiplier/Divider RSRS Tag Sink Tag Src 11 0 6.0 0 10.0 22 0 6.0 11 --- 33 Adder FLR Busy Tag Data 0 6.0 2 1 11 --- 4 10.0 8 1 22 --- Cycle #2: IssueCycle #2: Issue jj RSRS Tag Sink Tag Src 44 55 Multiplier/Divider
  • 17. RAWRAW Example: i: R2i: R2 ←← R0 + R4 (2 clks)R0 + R4 (2 clks) j: R8j: R8 ←← R0 + R2 (2 clks)R0 + R2 (2 clks) RSRS Tag Sink Tag Src 11 22 0 6.0 0 16.0 33 Adder FLR Busy Tag Data 0 6.0 2 16.0 4 10.0 8 1 22 --- Cycle #3:Cycle #3: Broadcasts tag and result: CDB_a=<RS1,16.0>Broadcasts tag and result: CDB_a=<RS1,16.0> RSRS Tag Sink Tag Src 44 55 Multiplier/Divider RSRS Tag Sink Tag Src 11 22 33 Adder FLR Busy Tag Data 0 6.0 2 16.0 4 10.0 8 22.0 Cycle #5:Cycle #5: Broadcasts tag and result: CDB_a=<RS2,22.0>Broadcasts tag and result: CDB_a=<RS2,22.0> RSRS Tag Sink Tag Src 44 55 Multiplier/Divider
  • 18. RSRS Tag Sink Tag Src 11 22 33 Adder FLR Busy Tag Data 0 6.0 2 3.5 4 10.0 8 7.8 Cycle #0:Cycle #0: RSRS Tag Sink Tag Src 44 55 Multiplier/Divider RSRS Tag Sink Tag Src 11 22 33 Adder FLR Busy Tag Data 0 6.0 2 3.5 4 1 44 --- 8 7.8 Cycle #1: IssueCycle #1: Issue ii RSRS Tag Sink Tag Src 44 0 6.0 0 7.8 55 Multiplier/Divider RSRS Tag Sink Tag Src 11 22 33 Adder FLR Busy Tag Data 0 1 55 --- 2 3.5 4 1 44 --- 8 7.8 Cycle #2: IssueCycle #2: Issue jj RSRS Tag Sink Tag Src 44 0 6.0 0 7.8 55 44 --- 0 3.5 Multiplier/Divider WARWAR Example: i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3) j: R0j: R0 ←← R4 x R2 (3)R4 x R2 (3) k: R2k: R2 ←← R2 + R8 (2)R2 + R8 (2)
  • 19. Cycle #3: IssueCycle #3: Issue kk WARWAR Example: i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3) j: R0j: R0 ←← R4 x R2 (3)R4 x R2 (3) k: R2k: R2 ←← R2 + R8 (2)R2 + R8 (2) RSRS Tag Sink Tag Src 11 0 3.5 0 7.8 22 33 Adder FLR Busy Tag Data 0 1 55 --- 2 1 11 --- 4 1 44 --- 8 7.8 RSRS Tag Sink Tag Src 44 0 6.0 0 7.8 55 44 --- 0 3.5 Multiplier/Divider RSRS Tag Sink Tag Src 11 0 3.5 0 7.8 22 33 Adder FLR Busy Tag Data 0 1 55 --- 2 1 11 --- 4 46.8 8 7.8 Cycle #4:Cycle #4: Broadcasts CDB_m=<RS4,46.8>;Broadcasts CDB_m=<RS4,46.8>; RSRS Tag Sink Tag Src 44 55 0 46.8 0 3.5 Multiplier/Divider RSRS Tag Sink Tag Src 11 22 33 Adder FLR Busy Tag Data 0 1 55 --- 2 11.3 4 46.8 8 7.8 Cycle #5:Cycle #5: Broadcasts CDB_a=<RS1,11.3>Broadcasts CDB_a=<RS1,11.3> RSRS Tag Sink Tag Src 44 55 0 46.8 0 3.5 Multiplier/Divider
  • 20. WARWAR Example: i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3) j: R0j: R0 ←← R4 x R2 (3)R4 x R2 (3) k: R2k: R2 ←← R2 + R8 (2)R2 + R8 (2) RSRS Tag Sink Tag Src 11 22 33 Adder FLR Busy Tag Data 0 163.8 2 11.3 4 46.8 8 7.8 Cycle #7:Cycle #7: Broadcasts CDB_m=<RS5,163.8>Broadcasts CDB_m=<RS5,163.8> RSRS Tag Sink Tag Src 44 55 Multiplier/Divider
  • 21. RSRS Tag Sink Tag Src 11 0 6.0 44 --- 22 33 Adder FLR Busy Tag Data 0 6.0 2 1 11 --- 4 1 44 --- 8 7.8 Cycle #2: IssueCycle #2: Issue jj RSRS Tag Sink Tag Src 44 0 6.0 0 7.8 55 Multiplier/Divider WAWWAW Example: RSRS Tag Sink Tag Src 11 22 33 Adder FLR Busy Tag Data 0 6.0 2 3.5 4 10.0 8 7.8 Cycle #0:Cycle #0: RSRS Tag Sink Tag Src 44 55 Multiplier/Divider RSRS Tag Sink Tag Src 11 22 33 Adder FLR Busy Tag Data 0 6.0 2 3.5 4 1 44 --- 8 7.8 Cycle #1: IssueCycle #1: Issue ii RSRS Tag Sink Tag Src 44 0 6.0 0 7.8 55 Multiplier/Divider i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3) j: R2j: R2 ←← R0 + R4 (2)R0 + R4 (2) k: R4k: R4 ←← R0 + R8 (2)R0 + R8 (2)
  • 22. RSRS Tag Sink Tag Src 11 0 6.0 44 --- 22 0 6.0 0 7.8 33 Adder FLR Busy Tag Data 0 6.0 2 1 11 --- 4 1 22 --- 8 7.8 Cycle #3: IssueCycle #3: Issue kk RSRS Tag Sink Tag Src 44 0 6.0 0 7.8 55 Multiplier/Divider WAWWAW Example: i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3) j: R2j: R2 ←← R0 + R4 (2)R0 + R4 (2) k: R4k: R4 ←← R0 + R8 (2)R0 + R8 (2) RSRS Tag Sink Tag Src 11 0 6.0 0 46.8 22 0 6.0 0 7.8 33 Adder FLR Busy Tag Data 0 6.0 2 1 11 --- 4 1 22 --- 8 7.8 Cycle #4:Cycle #4: Broadcasts CDB_m=<RS4,46.8>Broadcasts CDB_m=<RS4,46.8> RSRS Tag Sink Tag Src 44 55 Multiplier/Divider RSRS Tag Sink Tag Src 11 0 6.0 0 46.8 22 33 Adder FLR Busy Tag Data 0 6.0 2 1 11 --- 4 13.8 8 7.8 Cycle #5:Cycle #5: Broadcasts CDB_a=<RS2,13.8>Broadcasts CDB_a=<RS2,13.8> RSRS Tag Sink Tag Src 44 55 Multiplier/Divider
  • 23. WAWWAW Example: RSRS Tag Sink Tag Src 11 22 33 Adder FLR Busy Tag Data 0 6.0 2 52.8 4 13.8 8 7.8 Cycle #6:Cycle #6: Broadcasts CDB_a=<RS1,52.8>Broadcasts CDB_a=<RS1,52.8> RSRS Tag Sink Tag Src 44 55 Multiplier/Divider i: R4i: R4 ←← R0 x R8 (3)R0 x R8 (3) j: R2j: R2 ←← R0 + R4 (2)R0 + R4 (2) k: R4k: R4 ←← R0 + R8 (2)R0 + R8 (2)
  • 24. Tomasulo Example (H&P Text) Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 Load1 No LD F2 45+ R3 Load2 No MULTD F0 F2 F4 Load3 No SUBD F8 F6 F2 DIVD F10 F0 F6 ADDD F6 F8 F2 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No Mult1 No Mult2 No Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 0 Qi These are RS, we have only “one FU” for each type (MUL, ADD, LD). We reduce Load from 6 to 3 for simplicity. SDB is not shown either
  • 25. Assumption • INT (load)  1 cycle • MULT  10 cycles • ADD  2 cycles • DIVIDE  40 cycles
  • 26. Tomasulo Example Cycle 1 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 Load1 Yes 34+R2 LD F2 45+ R3 Load2 No MULTD F0 F2 F4 Load3 No SUBD F8 F6 F2 DIVD F10 F0 F6 ADDD F6 F8 F2 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No Mult1 No Mult2 No Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 1 Qi Load1
  • 27. Tomasulo Example Cycle 2 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 Load1 Yes 34+R2 LD F2 45+ R3 2 Load2 Yes 45+R3 MULTD F0 F2 F4 Load3 No SUBD F8 F6 F2 DIVD F10 F0 F6 ADDD F6 F8 F2 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No Mult1 No Mult2 No Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 2 Qi Load2 Load1 Note: Unlike CDC6600, RS enables multiple outstanding loads Load is calculating the effective address
  • 28. Tomasulo Example Cycle 3 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 Load1 Yes 34+R2 LD F2 45+ R3 2 Load2 Yes 45+R3 MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 DIVD F10 F0 F6 ADDD F6 F8 F2 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No Mult1 Yes MULTD R(F4) Load2 Mult2 No Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 3 Qi Mult1 Load2 Load1 • Note: registers names are removed (“renamed”) in Reservation Stations; MULT issued vs. scoreboard • Load1 completing; what is waiting for Load1?
  • 29. Tomasulo Example Cycle 4 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 Load2 Yes 45+R3 MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 4 DIVD F10 F0 F6 ADDD F6 F8 F2 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 Yes SUBD M(A1) Load2 Add2 No Add3 No Mult1 Yes MULTD R(F4) Load2 Mult2 No Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 4 Qi Mult1 Load2 M(A1) Add1 • Load1 write to CDB; Load2 completing; what is waiting for Load2?
  • 30. Tomasulo Example Cycle 5 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 4 DIVD F10 F0 F6 5 ADDD F6 F8 F2 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk 2 Add1 Yes SUBD M(A1) M(A2) Add2 No Add3 No 10 Mult1 Yes MULTD M(A2) R(F4) Mult2 Yes DIVD R(F6) Mult1 Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 5 Qi Mult1 M(A2) Add1 Mult2
  • 31. Tomasulo Example Cycle 6 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 4 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk 1 Add1 Yes SUBD M(A1) M(A2) Add2 Yes ADDD R(F2) Add1 Add3 No 9 Mult1 Yes MULTD M(A2) R(F4) Mult2 Yes DIVD R(F6) Mult1 Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 6 Qi Mult1 Add2 Add1 Mult2 • R(F6) was entered in Cycle 5 • Issue ADDD here vs. scoreboard?
  • 32. Tomasulo Example Cycle 7 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 4 7 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk 0 Add1 Yes SUBD M(A1) M(A2) Add2 Yes ADDD R(F2) Add1 Add3 No 8 Mult1 Yes MULTD M(A2) R(F4) Mult2 Yes DIVD R(F6) Mult1 Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 7 Qi Mult1 Add2 Add1 Mult2 • Add1 completing; what is waiting for it?
  • 33. Tomasulo Example Cycle 8 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No 2 Add2 Yes ADDD(M1-M2)R(F2) Add3 No 7 Mult1 Yes MULTD M(A2) R(F4) Mult2 Yes DIVD R(F6) Mult1 Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 8 Qi Mult1 Add2 (M1-M2)Mult2
  • 34. Tomasulo Example Cycle 9 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No 1 Add2 Yes ADDD(M1-M2)R(F2) Add3 No 6 Mult1 Yes MULTD M(A2) R(F4) Mult2 Yes DIVD R(F6) Mult1 Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 9 FU Mult1 Add2 Mult2
  • 35. Tomasulo Example Cycle 10 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 10 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No 0 Add2 Yes ADDD(M1-M2)R(F2) Add3 No 5 Mult1 Yes MULTD M(A2) R(F4) Mult2 Yes DIVD R(F6) Mult1 Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 10 FU Mult1 Add2 Mult2 • Add2 completing; what is waiting for it?
  • 36. Tomasulo Example Cycle 11 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 10 11 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No 4 Mult1 Yes MULTD M(A2) R(F4) Mult2 Yes DIVD R(F6) Mult1 Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 11 FU Mult1 (M1-M2+M(A2)) Mult2 • Write result of ADDD here vs. scoreboard? • All quick instructions complete in this cycle!
  • 37. Tomasulo Example Cycle 12 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 10 11 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No 3 Mult1 Yes MULTD M(A2) R(F4) Mult2 Yes DIVD R(F6) Mult1 Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 12 FU Mult1 Mult2
  • 38. Tomasulo Example Cycle 13 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 10 11 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No 2 Mult1 Yes MULTD M(A2) R(F4) Mult2 Yes DIVD R(F6) Mult1 Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 13 FU Mult1 Mult2
  • 39. Tomasulo Example Cycle 14 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 10 11 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No 1 Mult1 Yes MULTD M(A2) R(F4) Mult2 Yes DIVD R(F6) Mult1 Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 14 FU Mult1 Mult2
  • 40. Tomasulo Example Cycle 15 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 15 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 10 11 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No 0 Mult1 Yes MULTD M(A2) R(F4) Mult2 Yes DIVD R(F6) Mult1 Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 15 FU Mult1 Mult2
  • 41. Tomasulo Example Cycle 16 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 15 16 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 10 11 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No Mult1 No 40 Mult2 Yes DIVD M*F4 R(F6) Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 16 FU M*F4 Mult2
  • 42. Faster than light computation (skip a couple of cycles)
  • 43. Tomasulo Example Cycle 55 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 15 16 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 ADDD F6 F8 F2 6 10 11 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No Mult1 No 1 Mult2 Yes DIVD M*F4 R(F6) Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 55 FU Mult2
  • 44. Tomasulo Example Cycle 56 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 15 16 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 56 ADDD F6 F8 F2 6 10 11 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No Mult1 No 0 Mult2 Yes DIVD M*F4 R(F6) Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 56 FU Mult2 • Mult2 is completing; what is waiting for it?
  • 45. Tomasulo Example Cycle 57 Instruction status: Exec Write Instruction j k Issue Comp Result Busy Address LD F6 34+ R2 1 3 4 Load1 No LD F2 45+ R3 2 4 5 Load2 No MULTD F0 F2 F4 3 15 16 Load3 No SUBD F8 F6 F2 4 7 8 DIVD F10 F0 F6 5 56 57 ADDD F6 F8 F2 6 10 11 Reservation Stations: S1 S2 RS RS Time Name Busy Op Vj Vk Qj Qk Add1 No Add2 No Add3 No Mult1 No Mult2 No Register result status: Clock F0 F2 F4 F6 F8 F10 F12 ... F30 57 FU Result • Once again: In-order issue, out-of-order execution and completion.
  • 46. Compare to Scoreboard Cycle 62 Instruction status: Read Exec Write Exec Write Instruction j k Issue Oper Comp Result Issue ComplResult LD F6 34+ R2 1 2 3 4 1 3 4 LD F2 45+ R3 5 6 7 8 2 4 5 MULTD F0 F2 F4 6 9 19 20 3 15 16 SUBD F8 F6 F2 7 9 11 12 4 7 8 DIVD F10 F0 F6 8 21 61 62 5 56 57 ADDD F6 F8 F2 13 14 16 22 6 10 11 • Why take longer on scoreboard/6600? • Structural Hazards • Lack of forwarding
  • 47. Issues in Tomasulo Algorithm • CDB at high speed? • Precise exception issues • Speculative instructions – Branch prediction enlarges instruction window – How to rollback when mispredicted?

Editor's Notes

  1. &amp;lt;number&amp;gt;
  2. &amp;lt;number&amp;gt;
  3. &amp;lt;number&amp;gt;
  4. &amp;lt;number&amp;gt;
  5. &amp;lt;number&amp;gt;
  6. &amp;lt;number&amp;gt;
  7. &amp;lt;number&amp;gt;
  8. &amp;lt;number&amp;gt;
  9. Cycle 2 of LD is “Read Operand” which is not shown.