SlideShare a Scribd company logo
1
Copyright © 2019, Elsevier Inc. All rights reserved.
Chapter 1
Fundamentals of Quantitative
Design and Analysis
Computer Architecture
A Quantitative Approach, Sixth Edition
2
Computer Technology
 Performance improvements:
 Improvements in semiconductor technology
 Feature size, clock speed
 Improvements in computer architectures
 Enabled by HLL compilers, UNIX
 Lead to RISC architectures
 Together have enabled:
 Lightweight computers
 Productivity-based managed/interpreted
programming languages
Copyright © 2019, Elsevier Inc. All rights reserved.
Introduction
3
Single Processor Performance
Copyright © 2019, Elsevier Inc. All rights reserved.
Introduction
4
Copyright © 2019, Elsevier Inc. All rights reserved.
Current Trends in Architecture
 Cannot continue to leverage Instruction-Level
parallelism (ILP)
 Single processor performance improvement ended in
2003
 New models for performance:
 Data-level parallelism (DLP)
 Thread-level parallelism (TLP)
 Request-level parallelism (RLP)
 These require explicit restructuring of the
application
Introduction
5
Copyright © 2019, Elsevier Inc. All rights reserved.
Classes of Computers
 Personal Mobile Device (PMD)
 e.g. start phones, tablet computers
 Emphasis on energy efficiency and real-time
 Desktop Computing
 Emphasis on price-performance
 Servers
 Emphasis on availability, scalability, throughput
 Clusters / Warehouse Scale Computers
 Used for “Software as a Service (SaaS)”
 Emphasis on availability and price-performance
 Sub-class: Supercomputers, emphasis: floating-point
performance and fast internal networks
 Internet of Things/Embedded Computers
 Emphasis: price
Classes
of
Computers
6
Copyright © 2019, Elsevier Inc. All rights reserved.
Parallelism
 Classes of parallelism in applications:
 Data-Level Parallelism (DLP)
 Task-Level Parallelism (TLP)
 Classes of architectural parallelism:
 Instruction-Level Parallelism (ILP)
 Vector architectures/Graphic Processor Units (GPUs)
 Thread-Level Parallelism
 Request-Level Parallelism
Classes
of
Computers
7
Copyright © 2019, Elsevier Inc. All rights reserved.
Flynn’s Taxonomy
 Single instruction stream, single data stream (SISD)
 Single instruction stream, multiple data streams (SIMD)
 Vector architectures
 Multimedia extensions
 Graphics processor units
 Multiple instruction streams, single data stream (MISD)
 No commercial implementation
 Multiple instruction streams, multiple data streams
(MIMD)
 Tightly-coupled MIMD
 Loosely-coupled MIMD
Classes
of
Computers
8
Copyright © 2019, Elsevier Inc. All rights reserved.
Defining Computer Architecture
 “Old” view of computer architecture:
 Instruction Set Architecture (ISA) design
 i.e. decisions regarding:
 registers, memory addressing, addressing modes,
instruction operands, available operations, control flow
instructions, instruction encoding
 “Real” computer architecture:
 Specific requirements of the target machine
 Design to maximize performance within constraints:
cost, power, and availability
 Includes ISA, microarchitecture, hardware
Defining
Computer
Architecture
9
Instruction Set Architecture
 Class of ISA
 General-purpose registers
 Register-memory vs load-store
 RISC-V registers
 32 g.p., 32 f.p.
Copyright © 2019, Elsevier Inc. All rights reserved.
Defining
Computer
Architecture
Register Name Use Saver
x0 zero constant 0 n/a
x1 ra return addr caller
x2 sp stack ptr callee
x3 gp gbl ptr
x4 tp thread ptr
x5-x7 t0-t2 temporaries caller
x8 s0/fp saved/
frame ptr
callee
Register Name Use Saver
x9 s1 saved callee
x10-x17 a0-a7 arguments caller
x18-x27 s2-s11 saved callee
x28-x31 t3-t6 temporaries caller
f0-f7 ft0-ft7 FP temps caller
f8-f9 fs0-fs1 FP saved callee
f10-f17 fa0-fa7 FP
arguments
callee
f18-f27 fs2-fs21 FP saved callee
f28-f31 ft8-ft11 FP temps caller
10
Instruction Set Architecture
 Memory addressing
 RISC-V: byte addressed, aligned accesses faster
 Addressing modes
 RISC-V: Register, immediate, displacement
(base+offset)
 Other examples: autoincrement, indexed, PC-relative
 Types and size of operands
 RISC-V: 8-bit, 32-bit, 64-bit
Copyright © 2019, Elsevier Inc. All rights reserved.
Defining
Computer
Architecture
11
Instruction Set Architecture
 Operations
 RISC-V: data transfer, arithmetic, logical, control,
floating point
 See Fig. 1.5 in text
 Control flow instructions
 Use content of registers (RISC-V) vs. status bits (x86,
ARMv7, ARMv8)
 Return address in register (RISC-V, ARMv7, ARMv8)
vs. on stack (x86)
 Encoding
 Fixed (RISC-V, ARMv7/v8 except compact instruction
set) vs. variable length (x86)
Copyright © 2019, Elsevier Inc. All rights reserved.
Defining
Computer
Architecture
12
Copyright © 2019, Elsevier Inc. All rights reserved.
Trends in Technology
 Integrated circuit technology (Moore’s Law)
 Transistor density: 35%/year
 Die size: 10-20%/year
 Integration overall: 40-55%/year
 DRAM capacity: 25-40%/year (slowing)
 8 Gb (2014), 16 Gb (2019), possibly no 32 Gb
 Flash capacity: 50-60%/year
 8-10X cheaper/bit than DRAM
 Magnetic disk capacity: recently slowed to 5%/year
 Density increases may no longer be possible, maybe increase from 7 to
9 platters
 8-10X cheaper/bit then Flash
 200-300X cheaper/bit than DRAM
Trends
in
Technology
13
Copyright © 2019, Elsevier Inc. All rights reserved.
Bandwidth and Latency
 Bandwidth or throughput
 Total work done in a given time
 32,000-40,000X improvement for processors
 300-1200X improvement for memory and disks
 Latency or response time
 Time between start and completion of an event
 50-90X improvement for processors
 6-8X improvement for memory and disks
Trends
in
Technology
14
Copyright © 2019, Elsevier Inc. All rights reserved.
Bandwidth and Latency
Log-log plot of bandwidth and latency milestones
Trends
in
Technology
15
Copyright © 2019, Elsevier Inc. All rights reserved.
Transistors and Wires
 Feature size
 Minimum size of transistor or wire in x or y
dimension
 10 microns in 1971 to .011 microns in 2017
 Transistor performance scales linearly
 Wire delay does not improve with feature size!
 Integration density scales quadratically
Trends
in
Technology
16
Copyright © 2019, Elsevier Inc. All rights reserved.
Power and Energy
 Problem: Get power in, get power out
 Thermal Design Power (TDP)
 Characterizes sustained power consumption
 Used as target for power supply and cooling system
 Lower than peak power (1.5X higher), higher than
average power consumption
 Clock rate can be reduced dynamically to limit
power consumption
 Energy per task is often a better measurement
Trends
in
Power
and
Energy
17
Copyright © 2019, Elsevier Inc. All rights reserved.
Dynamic Energy and Power
 Dynamic energy
 Transistor switch from 0 -> 1 or 1 -> 0
 ½ x Capacitive load x Voltage2
 Dynamic power
 ½ x Capacitive load x Voltage2 x Frequency switched
 Reducing clock rate reduces power, not energy
Trends
in
Power
and
Energy
18
Copyright © 2019, Elsevier Inc. All rights reserved.
Power
 Intel 80386
consumed ~ 2 W
 3.3 GHz Intel
Core i7 consumes
130 W
 Heat must be
dissipated from
1.5 x 1.5 cm chip
 This is the limit of
what can be
cooled by air
Trends
in
Power
and
Energy
19
Copyright © 2019, Elsevier Inc. All rights reserved.
Reducing Power
 Techniques for reducing power:
 Do nothing well
 Dynamic Voltage-Frequency Scaling
 Low power state for DRAM, disks
 Overclocking, turning off cores
Trends
in
Power
and
Energy
20
Copyright © 2019, Elsevier Inc. All rights reserved.
Static Power
 Static power consumption
 25-50% of total power
 Currentstatic x Voltage
 Scales with number of transistors
 To reduce: power gating
Trends
in
Power
and
Energy
21
Copyright © 2019, Elsevier Inc. All rights reserved.
Trends in Cost
 Cost driven down by learning curve
 Yield
 DRAM: price closely tracks cost
 Microprocessors: price depends on
volume
 10% less for each doubling of volume
Trends
in
Cost
22
Copyright © 2019, Elsevier Inc. All rights reserved.
Integrated Circuit Cost
 Integrated circuit
 Bose-Einstein formula:
 Defects per unit area = 0.016-0.057 defects per square cm (2010)
 N = process-complexity factor = 11.5-15.5 (40 nm, 2010)
Trends
in
Cost
23
Copyright © 2019, Elsevier Inc. All rights reserved.
Dependability
 Module reliability
 Mean time to failure (MTTF)
 Mean time to repair (MTTR)
 Mean time between failures (MTBF) = MTTF + MTTR
 Availability = MTTF / MTBF
Dependability
24
Copyright © 2019, Elsevier Inc. All rights reserved.
Measuring Performance
 Typical performance metrics:
 Response time
 Throughput
 Speedup of X relative to Y
 Execution timeY / Execution timeX
 Execution time
 Wall clock time: includes all system overheads
 CPU time: only computation time
 Benchmarks
 Kernels (e.g. matrix multiply)
 Toy programs (e.g. sorting)
 Synthetic benchmarks (e.g. Dhrystone)
 Benchmark suites (e.g. SPEC06fp, TPC-C)
Measuring
Performance
25
Copyright © 2019, Elsevier Inc. All rights reserved.
Principles of Computer Design
 Take Advantage of Parallelism
 e.g. multiple processors, disks, memory banks,
pipelining, multiple functional units
 Principle of Locality
 Reuse of data and instructions
 Focus on the Common Case
 Amdahl’s Law
Principles
26
Copyright © 2019, Elsevier Inc. All rights reserved.
Principles of Computer Design
 The Processor Performance Equation
Principles
27
Copyright © 2019, Elsevier Inc. All rights reserved.
Principles of Computer Design
Principles
 Different instruction types having different
CPIs
28
Copyright © 2019, Elsevier Inc. All rights reserved.
Principles of Computer Design
Principles
 Different instruction types having different
CPIs
29
Fallacies and Pitfalls
 All exponential laws must come to an end
 Dennard scaling (constant power density)
 Stopped by threshold voltage
 Disk capacity
 30-100% per year to 5% per year
 Moore’s Law
 Most visible with DRAM capacity
 ITRS disbanded
 Only four foundries left producing state-of-the-art
logic chips
 11 nm, 3 nm might be the limit
Copyright © 2019, Elsevier Inc. All rights reserved.
30
Fallacies and Pitfalls
 Microprocessors are a silver bullet
 Performance is now a programmer’s burden
 Falling prey to Amdahl’s Law
 A single point of failure
 Hardware enhancements that increase
performance also improve energy
efficiency, or are at worst energy neutral
 Benchmarks remain valid indefinitely
 Compiler optimizations target benchmarks
Copyright © 2019, Elsevier Inc. All rights reserved.
31
Fallacies and Pitfalls
 The rated mean time to failure of disks is
1,200,000 hours or almost 140 years, so
disks practically never fail
 MTTF value from manufacturers assume
regular replacement
 Peak performance tracks observed
performance
 Fault detection can lower availability
 Not all operations are needed for correct
execution
Copyright © 2019, Elsevier Inc. All rights reserved.

More Related Content

Similar to Chapter 1.pptx

Heterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsHeterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of Systems
Anand Haridass
 
Cell Today and Tomorrow - IBM Systems and Technology Group
Cell Today and Tomorrow - IBM Systems and Technology GroupCell Today and Tomorrow - IBM Systems and Technology Group
Cell Today and Tomorrow - IBM Systems and Technology Group
Slide_N
 
Quad Core Processors - Technology Presentation
Quad Core Processors - Technology PresentationQuad Core Processors - Technology Presentation
Quad Core Processors - Technology Presentation
vinaya.hs
 
DATE 2020: Design, Automation and Test in Europe Conference
DATE 2020: Design, Automation and Test in Europe ConferenceDATE 2020: Design, Automation and Test in Europe Conference
DATE 2020: Design, Automation and Test in Europe Conference
LEGATO project
 
AMD Bridges the X86 and ARM Ecosystems for the Data Center
AMD Bridges the X86 and ARM Ecosystems for the Data Center AMD Bridges the X86 and ARM Ecosystems for the Data Center
AMD Bridges the X86 and ARM Ecosystems for the Data Center
AMD
 
The Cloud & Its Impact on IT
The Cloud & Its Impact on ITThe Cloud & Its Impact on IT
The Cloud & Its Impact on IT
Anand Haridass
 
Adaptable embedded systems
Adaptable embedded systemsAdaptable embedded systems
Adaptable embedded systems
Springer
 
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
Paul Hofmann
 
Michael Gschwind, Chip Multiprocessing and the Cell Broadband Engine
Michael Gschwind, Chip Multiprocessing and the Cell Broadband EngineMichael Gschwind, Chip Multiprocessing and the Cell Broadband Engine
Michael Gschwind, Chip Multiprocessing and the Cell Broadband Engine
Michael Gschwind
 
Chip Multiprocessing and the Cell Broadband Engine.pdf
Chip Multiprocessing and the Cell Broadband Engine.pdfChip Multiprocessing and the Cell Broadband Engine.pdf
Chip Multiprocessing and the Cell Broadband Engine.pdf
Slide_N
 
Distributed Computing
Distributed ComputingDistributed Computing
Distributed Computing
Sudarsun Santhiappan
 
Large-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC WorkloadsLarge-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC Workloads
inside-BigData.com
 
Future Trends in IT Storage
Future Trends in IT StorageFuture Trends in IT Storage
Future Trends in IT Storage
Tony Pearson
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computers
Ryousei Takano
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bk
IBM Switzerland
 
Michael Gschwind, Cell Broadband Engine: Exploiting multiple levels of parall...
Michael Gschwind, Cell Broadband Engine: Exploiting multiple levels of parall...Michael Gschwind, Cell Broadband Engine: Exploiting multiple levels of parall...
Michael Gschwind, Cell Broadband Engine: Exploiting multiple levels of parall...
Michael Gschwind
 
Genesys System - 8dec2010
Genesys System - 8dec2010Genesys System - 8dec2010
Genesys System - 8dec2010
Agora Group
 
FUNDAMENTALS OF COMPUTER DESIGN
FUNDAMENTALS OF COMPUTER DESIGNFUNDAMENTALS OF COMPUTER DESIGN
FUNDAMENTALS OF COMPUTER DESIGN
venkatraman227
 
08 Supercomputer Fugaku
08 Supercomputer Fugaku08 Supercomputer Fugaku
08 Supercomputer Fugaku
RCCSRENKEI
 
Performance modeling and simulation for accumulo applications
Performance modeling and simulation for accumulo applicationsPerformance modeling and simulation for accumulo applications
Performance modeling and simulation for accumulo applications
Accumulo Summit
 

Similar to Chapter 1.pptx (20)

Heterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsHeterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of Systems
 
Cell Today and Tomorrow - IBM Systems and Technology Group
Cell Today and Tomorrow - IBM Systems and Technology GroupCell Today and Tomorrow - IBM Systems and Technology Group
Cell Today and Tomorrow - IBM Systems and Technology Group
 
Quad Core Processors - Technology Presentation
Quad Core Processors - Technology PresentationQuad Core Processors - Technology Presentation
Quad Core Processors - Technology Presentation
 
DATE 2020: Design, Automation and Test in Europe Conference
DATE 2020: Design, Automation and Test in Europe ConferenceDATE 2020: Design, Automation and Test in Europe Conference
DATE 2020: Design, Automation and Test in Europe Conference
 
AMD Bridges the X86 and ARM Ecosystems for the Data Center
AMD Bridges the X86 and ARM Ecosystems for the Data Center AMD Bridges the X86 and ARM Ecosystems for the Data Center
AMD Bridges the X86 and ARM Ecosystems for the Data Center
 
The Cloud & Its Impact on IT
The Cloud & Its Impact on ITThe Cloud & Its Impact on IT
The Cloud & Its Impact on IT
 
Adaptable embedded systems
Adaptable embedded systemsAdaptable embedded systems
Adaptable embedded systems
 
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
 
Michael Gschwind, Chip Multiprocessing and the Cell Broadband Engine
Michael Gschwind, Chip Multiprocessing and the Cell Broadband EngineMichael Gschwind, Chip Multiprocessing and the Cell Broadband Engine
Michael Gschwind, Chip Multiprocessing and the Cell Broadband Engine
 
Chip Multiprocessing and the Cell Broadband Engine.pdf
Chip Multiprocessing and the Cell Broadband Engine.pdfChip Multiprocessing and the Cell Broadband Engine.pdf
Chip Multiprocessing and the Cell Broadband Engine.pdf
 
Distributed Computing
Distributed ComputingDistributed Computing
Distributed Computing
 
Large-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC WorkloadsLarge-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC Workloads
 
Future Trends in IT Storage
Future Trends in IT StorageFuture Trends in IT Storage
Future Trends in IT Storage
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computers
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bk
 
Michael Gschwind, Cell Broadband Engine: Exploiting multiple levels of parall...
Michael Gschwind, Cell Broadband Engine: Exploiting multiple levels of parall...Michael Gschwind, Cell Broadband Engine: Exploiting multiple levels of parall...
Michael Gschwind, Cell Broadband Engine: Exploiting multiple levels of parall...
 
Genesys System - 8dec2010
Genesys System - 8dec2010Genesys System - 8dec2010
Genesys System - 8dec2010
 
FUNDAMENTALS OF COMPUTER DESIGN
FUNDAMENTALS OF COMPUTER DESIGNFUNDAMENTALS OF COMPUTER DESIGN
FUNDAMENTALS OF COMPUTER DESIGN
 
08 Supercomputer Fugaku
08 Supercomputer Fugaku08 Supercomputer Fugaku
08 Supercomputer Fugaku
 
Performance modeling and simulation for accumulo applications
Performance modeling and simulation for accumulo applicationsPerformance modeling and simulation for accumulo applications
Performance modeling and simulation for accumulo applications
 

Recently uploaded

一比一原版亚利桑那大学毕业证(UA毕业证书)如何办理
一比一原版亚利桑那大学毕业证(UA毕业证书)如何办理一比一原版亚利桑那大学毕业证(UA毕业证书)如何办理
一比一原版亚利桑那大学毕业证(UA毕业证书)如何办理
21uul8se
 
一比一原版(USQ毕业证书)南昆士兰大学毕业证如何办理
一比一原版(USQ毕业证书)南昆士兰大学毕业证如何办理一比一原版(USQ毕业证书)南昆士兰大学毕业证如何办理
一比一原版(USQ毕业证书)南昆士兰大学毕业证如何办理
p74xokfq
 
一比一原版(Coventry毕业证)英国考文垂大学毕业证如何办理
一比一原版(Coventry毕业证)英国考文垂大学毕业证如何办理一比一原版(Coventry毕业证)英国考文垂大学毕业证如何办理
一比一原版(Coventry毕业证)英国考文垂大学毕业证如何办理
tobbk6s8
 
一比一原版(UWS毕业证)澳洲西悉尼大学毕业证如何办理
一比一原版(UWS毕业证)澳洲西悉尼大学毕业证如何办理一比一原版(UWS毕业证)澳洲西悉尼大学毕业证如何办理
一比一原版(UWS毕业证)澳洲西悉尼大学毕业证如何办理
t34zod9l
 
一比一原版(UoN毕业证书)纽卡斯尔大学毕业证如何办理
一比一原版(UoN毕业证书)纽卡斯尔大学毕业证如何办理一比一原版(UoN毕业证书)纽卡斯尔大学毕业证如何办理
一比一原版(UoN毕业证书)纽卡斯尔大学毕业证如何办理
f22b6g9c
 
TOWER DESIGN PROCEDURE TOWER DESIGN BASIS .pptx
TOWER DESIGN PROCEDURE TOWER DESIGN BASIS .pptxTOWER DESIGN PROCEDURE TOWER DESIGN BASIS .pptx
TOWER DESIGN PROCEDURE TOWER DESIGN BASIS .pptx
BAWAALEX1
 
一比一原版马里兰大学毕业证(UMD毕业证书)如何办理
一比一原版马里兰大学毕业证(UMD毕业证书)如何办理一比一原版马里兰大学毕业证(UMD毕业证书)如何办理
一比一原版马里兰大学毕业证(UMD毕业证书)如何办理
9lq7ultg
 
一比一原版(UoB毕业证)英国伯明翰大学毕业证如何办理
一比一原版(UoB毕业证)英国伯明翰大学毕业证如何办理一比一原版(UoB毕业证)英国伯明翰大学毕业证如何办理
一比一原版(UoB毕业证)英国伯明翰大学毕业证如何办理
zv943dhb
 
原版制作(MDIS毕业证书)新加坡管理发展学院毕业证学位证一模一样
原版制作(MDIS毕业证书)新加坡管理发展学院毕业证学位证一模一样原版制作(MDIS毕业证书)新加坡管理发展学院毕业证学位证一模一样
原版制作(MDIS毕业证书)新加坡管理发展学院毕业证学位证一模一样
hw2xf1m
 
一比一原版(LSE毕业证书)伦敦政治经济学院毕业证如何办理
一比一原版(LSE毕业证书)伦敦政治经济学院毕业证如何办理一比一原版(LSE毕业证书)伦敦政治经济学院毕业证如何办理
一比一原版(LSE毕业证书)伦敦政治经济学院毕业证如何办理
340qn0m1
 
一比一原版(LSBU毕业证书)伦敦南岸大学毕业证如何办理
一比一原版(LSBU毕业证书)伦敦南岸大学毕业证如何办理一比一原版(LSBU毕业证书)伦敦南岸大学毕业证如何办理
一比一原版(LSBU毕业证书)伦敦南岸大学毕业证如何办理
k7nm6tk
 
NHR Engineers Portfolio 2023 2024 NISHANT RATHI
NHR Engineers Portfolio 2023 2024 NISHANT RATHINHR Engineers Portfolio 2023 2024 NISHANT RATHI
NHR Engineers Portfolio 2023 2024 NISHANT RATHI
NishantRathi18
 
ADESGN3S_Case-Study-Municipal-Health-Center.pdf
ADESGN3S_Case-Study-Municipal-Health-Center.pdfADESGN3S_Case-Study-Municipal-Health-Center.pdf
ADESGN3S_Case-Study-Municipal-Health-Center.pdf
GregMichaelTapawan
 
一比一原版澳洲科廷科技大学毕业证(Curtin毕业证)如何办理
一比一原版澳洲科廷科技大学毕业证(Curtin毕业证)如何办理一比一原版澳洲科廷科技大学毕业证(Curtin毕业证)如何办理
一比一原版澳洲科廷科技大学毕业证(Curtin毕业证)如何办理
bz42w9z0
 
一比一原版(KPU毕业证)加拿大昆特兰理工大学毕业证如何办理
一比一原版(KPU毕业证)加拿大昆特兰理工大学毕业证如何办理一比一原版(KPU毕业证)加拿大昆特兰理工大学毕业证如何办理
一比一原版(KPU毕业证)加拿大昆特兰理工大学毕业证如何办理
kmzsy4kn
 
一比一原版澳洲查理斯特大学毕业证(CSU学位证)如何办理
一比一原版澳洲查理斯特大学毕业证(CSU学位证)如何办理一比一原版澳洲查理斯特大学毕业证(CSU学位证)如何办理
一比一原版澳洲查理斯特大学毕业证(CSU学位证)如何办理
qa8dk1wm
 
一比一原版(Brunel毕业证)英国布鲁内尔大学毕业证如何办理
一比一原版(Brunel毕业证)英国布鲁内尔大学毕业证如何办理一比一原版(Brunel毕业证)英国布鲁内尔大学毕业证如何办理
一比一原版(Brunel毕业证)英国布鲁内尔大学毕业证如何办理
ka3y2ukz
 
一比一原版美国旧金山大学毕业证(USF学位证)如何办理
一比一原版美国旧金山大学毕业证(USF学位证)如何办理一比一原版美国旧金山大学毕业证(USF学位证)如何办理
一比一原版美国旧金山大学毕业证(USF学位证)如何办理
xnhwr8v
 
Heuristics Evaluation - How to Guide.pdf
Heuristics Evaluation - How to Guide.pdfHeuristics Evaluation - How to Guide.pdf
Heuristics Evaluation - How to Guide.pdf
Jaime Brown
 
一比一原版(ututaustin毕业证书)美国德克萨斯大学奥斯汀分校毕业证如何办理
一比一原版(ututaustin毕业证书)美国德克萨斯大学奥斯汀分校毕业证如何办理一比一原版(ututaustin毕业证书)美国德克萨斯大学奥斯汀分校毕业证如何办理
一比一原版(ututaustin毕业证书)美国德克萨斯大学奥斯汀分校毕业证如何办理
yqyquge
 

Recently uploaded (20)

一比一原版亚利桑那大学毕业证(UA毕业证书)如何办理
一比一原版亚利桑那大学毕业证(UA毕业证书)如何办理一比一原版亚利桑那大学毕业证(UA毕业证书)如何办理
一比一原版亚利桑那大学毕业证(UA毕业证书)如何办理
 
一比一原版(USQ毕业证书)南昆士兰大学毕业证如何办理
一比一原版(USQ毕业证书)南昆士兰大学毕业证如何办理一比一原版(USQ毕业证书)南昆士兰大学毕业证如何办理
一比一原版(USQ毕业证书)南昆士兰大学毕业证如何办理
 
一比一原版(Coventry毕业证)英国考文垂大学毕业证如何办理
一比一原版(Coventry毕业证)英国考文垂大学毕业证如何办理一比一原版(Coventry毕业证)英国考文垂大学毕业证如何办理
一比一原版(Coventry毕业证)英国考文垂大学毕业证如何办理
 
一比一原版(UWS毕业证)澳洲西悉尼大学毕业证如何办理
一比一原版(UWS毕业证)澳洲西悉尼大学毕业证如何办理一比一原版(UWS毕业证)澳洲西悉尼大学毕业证如何办理
一比一原版(UWS毕业证)澳洲西悉尼大学毕业证如何办理
 
一比一原版(UoN毕业证书)纽卡斯尔大学毕业证如何办理
一比一原版(UoN毕业证书)纽卡斯尔大学毕业证如何办理一比一原版(UoN毕业证书)纽卡斯尔大学毕业证如何办理
一比一原版(UoN毕业证书)纽卡斯尔大学毕业证如何办理
 
TOWER DESIGN PROCEDURE TOWER DESIGN BASIS .pptx
TOWER DESIGN PROCEDURE TOWER DESIGN BASIS .pptxTOWER DESIGN PROCEDURE TOWER DESIGN BASIS .pptx
TOWER DESIGN PROCEDURE TOWER DESIGN BASIS .pptx
 
一比一原版马里兰大学毕业证(UMD毕业证书)如何办理
一比一原版马里兰大学毕业证(UMD毕业证书)如何办理一比一原版马里兰大学毕业证(UMD毕业证书)如何办理
一比一原版马里兰大学毕业证(UMD毕业证书)如何办理
 
一比一原版(UoB毕业证)英国伯明翰大学毕业证如何办理
一比一原版(UoB毕业证)英国伯明翰大学毕业证如何办理一比一原版(UoB毕业证)英国伯明翰大学毕业证如何办理
一比一原版(UoB毕业证)英国伯明翰大学毕业证如何办理
 
原版制作(MDIS毕业证书)新加坡管理发展学院毕业证学位证一模一样
原版制作(MDIS毕业证书)新加坡管理发展学院毕业证学位证一模一样原版制作(MDIS毕业证书)新加坡管理发展学院毕业证学位证一模一样
原版制作(MDIS毕业证书)新加坡管理发展学院毕业证学位证一模一样
 
一比一原版(LSE毕业证书)伦敦政治经济学院毕业证如何办理
一比一原版(LSE毕业证书)伦敦政治经济学院毕业证如何办理一比一原版(LSE毕业证书)伦敦政治经济学院毕业证如何办理
一比一原版(LSE毕业证书)伦敦政治经济学院毕业证如何办理
 
一比一原版(LSBU毕业证书)伦敦南岸大学毕业证如何办理
一比一原版(LSBU毕业证书)伦敦南岸大学毕业证如何办理一比一原版(LSBU毕业证书)伦敦南岸大学毕业证如何办理
一比一原版(LSBU毕业证书)伦敦南岸大学毕业证如何办理
 
NHR Engineers Portfolio 2023 2024 NISHANT RATHI
NHR Engineers Portfolio 2023 2024 NISHANT RATHINHR Engineers Portfolio 2023 2024 NISHANT RATHI
NHR Engineers Portfolio 2023 2024 NISHANT RATHI
 
ADESGN3S_Case-Study-Municipal-Health-Center.pdf
ADESGN3S_Case-Study-Municipal-Health-Center.pdfADESGN3S_Case-Study-Municipal-Health-Center.pdf
ADESGN3S_Case-Study-Municipal-Health-Center.pdf
 
一比一原版澳洲科廷科技大学毕业证(Curtin毕业证)如何办理
一比一原版澳洲科廷科技大学毕业证(Curtin毕业证)如何办理一比一原版澳洲科廷科技大学毕业证(Curtin毕业证)如何办理
一比一原版澳洲科廷科技大学毕业证(Curtin毕业证)如何办理
 
一比一原版(KPU毕业证)加拿大昆特兰理工大学毕业证如何办理
一比一原版(KPU毕业证)加拿大昆特兰理工大学毕业证如何办理一比一原版(KPU毕业证)加拿大昆特兰理工大学毕业证如何办理
一比一原版(KPU毕业证)加拿大昆特兰理工大学毕业证如何办理
 
一比一原版澳洲查理斯特大学毕业证(CSU学位证)如何办理
一比一原版澳洲查理斯特大学毕业证(CSU学位证)如何办理一比一原版澳洲查理斯特大学毕业证(CSU学位证)如何办理
一比一原版澳洲查理斯特大学毕业证(CSU学位证)如何办理
 
一比一原版(Brunel毕业证)英国布鲁内尔大学毕业证如何办理
一比一原版(Brunel毕业证)英国布鲁内尔大学毕业证如何办理一比一原版(Brunel毕业证)英国布鲁内尔大学毕业证如何办理
一比一原版(Brunel毕业证)英国布鲁内尔大学毕业证如何办理
 
一比一原版美国旧金山大学毕业证(USF学位证)如何办理
一比一原版美国旧金山大学毕业证(USF学位证)如何办理一比一原版美国旧金山大学毕业证(USF学位证)如何办理
一比一原版美国旧金山大学毕业证(USF学位证)如何办理
 
Heuristics Evaluation - How to Guide.pdf
Heuristics Evaluation - How to Guide.pdfHeuristics Evaluation - How to Guide.pdf
Heuristics Evaluation - How to Guide.pdf
 
一比一原版(ututaustin毕业证书)美国德克萨斯大学奥斯汀分校毕业证如何办理
一比一原版(ututaustin毕业证书)美国德克萨斯大学奥斯汀分校毕业证如何办理一比一原版(ututaustin毕业证书)美国德克萨斯大学奥斯汀分校毕业证如何办理
一比一原版(ututaustin毕业证书)美国德克萨斯大学奥斯汀分校毕业证如何办理
 

Chapter 1.pptx

  • 1. 1 Copyright © 2019, Elsevier Inc. All rights reserved. Chapter 1 Fundamentals of Quantitative Design and Analysis Computer Architecture A Quantitative Approach, Sixth Edition
  • 2. 2 Computer Technology  Performance improvements:  Improvements in semiconductor technology  Feature size, clock speed  Improvements in computer architectures  Enabled by HLL compilers, UNIX  Lead to RISC architectures  Together have enabled:  Lightweight computers  Productivity-based managed/interpreted programming languages Copyright © 2019, Elsevier Inc. All rights reserved. Introduction
  • 3. 3 Single Processor Performance Copyright © 2019, Elsevier Inc. All rights reserved. Introduction
  • 4. 4 Copyright © 2019, Elsevier Inc. All rights reserved. Current Trends in Architecture  Cannot continue to leverage Instruction-Level parallelism (ILP)  Single processor performance improvement ended in 2003  New models for performance:  Data-level parallelism (DLP)  Thread-level parallelism (TLP)  Request-level parallelism (RLP)  These require explicit restructuring of the application Introduction
  • 5. 5 Copyright © 2019, Elsevier Inc. All rights reserved. Classes of Computers  Personal Mobile Device (PMD)  e.g. start phones, tablet computers  Emphasis on energy efficiency and real-time  Desktop Computing  Emphasis on price-performance  Servers  Emphasis on availability, scalability, throughput  Clusters / Warehouse Scale Computers  Used for “Software as a Service (SaaS)”  Emphasis on availability and price-performance  Sub-class: Supercomputers, emphasis: floating-point performance and fast internal networks  Internet of Things/Embedded Computers  Emphasis: price Classes of Computers
  • 6. 6 Copyright © 2019, Elsevier Inc. All rights reserved. Parallelism  Classes of parallelism in applications:  Data-Level Parallelism (DLP)  Task-Level Parallelism (TLP)  Classes of architectural parallelism:  Instruction-Level Parallelism (ILP)  Vector architectures/Graphic Processor Units (GPUs)  Thread-Level Parallelism  Request-Level Parallelism Classes of Computers
  • 7. 7 Copyright © 2019, Elsevier Inc. All rights reserved. Flynn’s Taxonomy  Single instruction stream, single data stream (SISD)  Single instruction stream, multiple data streams (SIMD)  Vector architectures  Multimedia extensions  Graphics processor units  Multiple instruction streams, single data stream (MISD)  No commercial implementation  Multiple instruction streams, multiple data streams (MIMD)  Tightly-coupled MIMD  Loosely-coupled MIMD Classes of Computers
  • 8. 8 Copyright © 2019, Elsevier Inc. All rights reserved. Defining Computer Architecture  “Old” view of computer architecture:  Instruction Set Architecture (ISA) design  i.e. decisions regarding:  registers, memory addressing, addressing modes, instruction operands, available operations, control flow instructions, instruction encoding  “Real” computer architecture:  Specific requirements of the target machine  Design to maximize performance within constraints: cost, power, and availability  Includes ISA, microarchitecture, hardware Defining Computer Architecture
  • 9. 9 Instruction Set Architecture  Class of ISA  General-purpose registers  Register-memory vs load-store  RISC-V registers  32 g.p., 32 f.p. Copyright © 2019, Elsevier Inc. All rights reserved. Defining Computer Architecture Register Name Use Saver x0 zero constant 0 n/a x1 ra return addr caller x2 sp stack ptr callee x3 gp gbl ptr x4 tp thread ptr x5-x7 t0-t2 temporaries caller x8 s0/fp saved/ frame ptr callee Register Name Use Saver x9 s1 saved callee x10-x17 a0-a7 arguments caller x18-x27 s2-s11 saved callee x28-x31 t3-t6 temporaries caller f0-f7 ft0-ft7 FP temps caller f8-f9 fs0-fs1 FP saved callee f10-f17 fa0-fa7 FP arguments callee f18-f27 fs2-fs21 FP saved callee f28-f31 ft8-ft11 FP temps caller
  • 10. 10 Instruction Set Architecture  Memory addressing  RISC-V: byte addressed, aligned accesses faster  Addressing modes  RISC-V: Register, immediate, displacement (base+offset)  Other examples: autoincrement, indexed, PC-relative  Types and size of operands  RISC-V: 8-bit, 32-bit, 64-bit Copyright © 2019, Elsevier Inc. All rights reserved. Defining Computer Architecture
  • 11. 11 Instruction Set Architecture  Operations  RISC-V: data transfer, arithmetic, logical, control, floating point  See Fig. 1.5 in text  Control flow instructions  Use content of registers (RISC-V) vs. status bits (x86, ARMv7, ARMv8)  Return address in register (RISC-V, ARMv7, ARMv8) vs. on stack (x86)  Encoding  Fixed (RISC-V, ARMv7/v8 except compact instruction set) vs. variable length (x86) Copyright © 2019, Elsevier Inc. All rights reserved. Defining Computer Architecture
  • 12. 12 Copyright © 2019, Elsevier Inc. All rights reserved. Trends in Technology  Integrated circuit technology (Moore’s Law)  Transistor density: 35%/year  Die size: 10-20%/year  Integration overall: 40-55%/year  DRAM capacity: 25-40%/year (slowing)  8 Gb (2014), 16 Gb (2019), possibly no 32 Gb  Flash capacity: 50-60%/year  8-10X cheaper/bit than DRAM  Magnetic disk capacity: recently slowed to 5%/year  Density increases may no longer be possible, maybe increase from 7 to 9 platters  8-10X cheaper/bit then Flash  200-300X cheaper/bit than DRAM Trends in Technology
  • 13. 13 Copyright © 2019, Elsevier Inc. All rights reserved. Bandwidth and Latency  Bandwidth or throughput  Total work done in a given time  32,000-40,000X improvement for processors  300-1200X improvement for memory and disks  Latency or response time  Time between start and completion of an event  50-90X improvement for processors  6-8X improvement for memory and disks Trends in Technology
  • 14. 14 Copyright © 2019, Elsevier Inc. All rights reserved. Bandwidth and Latency Log-log plot of bandwidth and latency milestones Trends in Technology
  • 15. 15 Copyright © 2019, Elsevier Inc. All rights reserved. Transistors and Wires  Feature size  Minimum size of transistor or wire in x or y dimension  10 microns in 1971 to .011 microns in 2017  Transistor performance scales linearly  Wire delay does not improve with feature size!  Integration density scales quadratically Trends in Technology
  • 16. 16 Copyright © 2019, Elsevier Inc. All rights reserved. Power and Energy  Problem: Get power in, get power out  Thermal Design Power (TDP)  Characterizes sustained power consumption  Used as target for power supply and cooling system  Lower than peak power (1.5X higher), higher than average power consumption  Clock rate can be reduced dynamically to limit power consumption  Energy per task is often a better measurement Trends in Power and Energy
  • 17. 17 Copyright © 2019, Elsevier Inc. All rights reserved. Dynamic Energy and Power  Dynamic energy  Transistor switch from 0 -> 1 or 1 -> 0  ½ x Capacitive load x Voltage2  Dynamic power  ½ x Capacitive load x Voltage2 x Frequency switched  Reducing clock rate reduces power, not energy Trends in Power and Energy
  • 18. 18 Copyright © 2019, Elsevier Inc. All rights reserved. Power  Intel 80386 consumed ~ 2 W  3.3 GHz Intel Core i7 consumes 130 W  Heat must be dissipated from 1.5 x 1.5 cm chip  This is the limit of what can be cooled by air Trends in Power and Energy
  • 19. 19 Copyright © 2019, Elsevier Inc. All rights reserved. Reducing Power  Techniques for reducing power:  Do nothing well  Dynamic Voltage-Frequency Scaling  Low power state for DRAM, disks  Overclocking, turning off cores Trends in Power and Energy
  • 20. 20 Copyright © 2019, Elsevier Inc. All rights reserved. Static Power  Static power consumption  25-50% of total power  Currentstatic x Voltage  Scales with number of transistors  To reduce: power gating Trends in Power and Energy
  • 21. 21 Copyright © 2019, Elsevier Inc. All rights reserved. Trends in Cost  Cost driven down by learning curve  Yield  DRAM: price closely tracks cost  Microprocessors: price depends on volume  10% less for each doubling of volume Trends in Cost
  • 22. 22 Copyright © 2019, Elsevier Inc. All rights reserved. Integrated Circuit Cost  Integrated circuit  Bose-Einstein formula:  Defects per unit area = 0.016-0.057 defects per square cm (2010)  N = process-complexity factor = 11.5-15.5 (40 nm, 2010) Trends in Cost
  • 23. 23 Copyright © 2019, Elsevier Inc. All rights reserved. Dependability  Module reliability  Mean time to failure (MTTF)  Mean time to repair (MTTR)  Mean time between failures (MTBF) = MTTF + MTTR  Availability = MTTF / MTBF Dependability
  • 24. 24 Copyright © 2019, Elsevier Inc. All rights reserved. Measuring Performance  Typical performance metrics:  Response time  Throughput  Speedup of X relative to Y  Execution timeY / Execution timeX  Execution time  Wall clock time: includes all system overheads  CPU time: only computation time  Benchmarks  Kernels (e.g. matrix multiply)  Toy programs (e.g. sorting)  Synthetic benchmarks (e.g. Dhrystone)  Benchmark suites (e.g. SPEC06fp, TPC-C) Measuring Performance
  • 25. 25 Copyright © 2019, Elsevier Inc. All rights reserved. Principles of Computer Design  Take Advantage of Parallelism  e.g. multiple processors, disks, memory banks, pipelining, multiple functional units  Principle of Locality  Reuse of data and instructions  Focus on the Common Case  Amdahl’s Law Principles
  • 26. 26 Copyright © 2019, Elsevier Inc. All rights reserved. Principles of Computer Design  The Processor Performance Equation Principles
  • 27. 27 Copyright © 2019, Elsevier Inc. All rights reserved. Principles of Computer Design Principles  Different instruction types having different CPIs
  • 28. 28 Copyright © 2019, Elsevier Inc. All rights reserved. Principles of Computer Design Principles  Different instruction types having different CPIs
  • 29. 29 Fallacies and Pitfalls  All exponential laws must come to an end  Dennard scaling (constant power density)  Stopped by threshold voltage  Disk capacity  30-100% per year to 5% per year  Moore’s Law  Most visible with DRAM capacity  ITRS disbanded  Only four foundries left producing state-of-the-art logic chips  11 nm, 3 nm might be the limit Copyright © 2019, Elsevier Inc. All rights reserved.
  • 30. 30 Fallacies and Pitfalls  Microprocessors are a silver bullet  Performance is now a programmer’s burden  Falling prey to Amdahl’s Law  A single point of failure  Hardware enhancements that increase performance also improve energy efficiency, or are at worst energy neutral  Benchmarks remain valid indefinitely  Compiler optimizations target benchmarks Copyright © 2019, Elsevier Inc. All rights reserved.
  • 31. 31 Fallacies and Pitfalls  The rated mean time to failure of disks is 1,200,000 hours or almost 140 years, so disks practically never fail  MTTF value from manufacturers assume regular replacement  Peak performance tracks observed performance  Fault detection can lower availability  Not all operations are needed for correct execution Copyright © 2019, Elsevier Inc. All rights reserved.

Editor's Notes

  1. 21 October 2023
  2. 21 October 2023
  3. 21 October 2023
  4. 21 October 2023
  5. 21 October 2023
  6. 21 October 2023
  7. 21 October 2023
  8. 21 October 2023
  9. 21 October 2023
  10. 21 October 2023
  11. 21 October 2023
  12. 21 October 2023
  13. 21 October 2023
  14. 21 October 2023
  15. 21 October 2023
  16. 21 October 2023
  17. 21 October 2023
  18. 21 October 2023
  19. 21 October 2023
  20. 21 October 2023
  21. 21 October 2023
  22. 21 October 2023
  23. 21 October 2023