SlideShare a Scribd company logo
1 of 56
UG Program
School of Electrical and Computer Engineering
Chapter 2:Computer Evolution and
Performance
Computer Architecture
and Organization
(ECEG-3163 )
Sem. I, 2014/15
Generations of Computer
2
Computer Architecture and Organization
 Vacuum tube - 1946-1957
 Transistor - 1958-1964
 Small scale integration - 1965 on
 Up to 100 devices on a chip
 Medium scale integration - to 1971
 100-3,000 devices on a chip
 Large scale integration - 1971-1977
 3,000 - 100,000 devices on a chip
 Very large scale integration - 1978 -1991
 100,000 - 100,000,000 devices on a chip
 Ultra large scale integration – 1991 -
 Over 100,000,000 devices on a chip
Sem. I, 2014/15
ENIAC - background
3
Computer Architecture and Organization
 Electronic Numerical
Integrator And Computer
 Eckert and Mauchly
 University of Pennsylvania
 Trajectory tables for weapons
 Started 1943
 Finished 1946
Too late for war effort
 Used until 1955
Sem. I, 2014/15
ENIAC - details
 Decimal (not binary)
 20 accumulators of 10 digits
 Programmed manually by switches
 18,000 vacuum tubes
 30 tons
 15,000 square feet
 140 kW power consumption
 5,000 additions per second
 Showed the general purpose of a computer
4
Computer Architecture and Organization
Sem. I, 2014/15
Von Neumann/Turing
 Stored Program concept
 Main memory storing programs and data
 ALU operating on binary data
 Control unit interpreting instructions from memory and executing
 Input and output equipment operated by control unit
 Princeton Institute for Advanced Studies
 IAS
 Completed 1952
 The first publication of the idea was in a 1945 proposal by von
Neumann for a new computer, the EDVAC (Electronic Discrete
Variable Computer)
5
Computer Architecture and Organization
Sem. I, 2014/15
Structure of von Neumann machine
 A main memory, which
stores both data and
instructions
 An arithmetic and logic
unit (ALU)capable of
operating on binary data
 A control unit, which
interprets the instructions in
memory and causes them to
be executed
 Input/output(I/O)equipment
operated by the control unit
6
Computer Architecture and Organization
Sem. I, 2014/15
Brief Description of the IAS computer
 The memory of the IAS consists of 1000 storage locations, called
words, of 40 binary digits (bits) each.
 Both data and instructions are stored there.
 Numbers are represented in binary form, and each instruction is a
binary code.
7
Computer Architecture and Organization
Sem. I, 2014/15
IAS Memory Format
8
Computer Architecture and Organization
 Each number is represented by a sign bit and a 39-bit value.
 A word may also contain two 20-bit instructions, with each instruction
 consisting of an 8-bit operation code (opcode)
 12-bit address
Sem. I, 2014/15
IAS Main Registers
9
Computer Architecture and Organization
 Memory buffer register (MBR):Contains a word to be stored in
memory or sent to the I/O unit, or is used to receive a word from
memory or from the I/O unit.
 Memory address register (MAR):Specifies the address in memory of
the word to be written from or read into the MBR.
 Instruction register (IR):Contains the 8-bit opcode instruction being
executed.
 Instruction buffer register (IBR):Employed to hold temporarily the
right hand instruction from a word in memory.
 Program counter (PC):Contains the address of the next instruction
pair to be fetched from memory.
 Accumulator (AC) and multiplier quotient (MQ): Employed to hold
temporarily operands and results of ALU operations. For example, the
result of multiplying two 40-bit numbers is an 80-bit number; the most
significant 40 bits are stored in the AC and the least significant in the
MQ.
Sem. I, 2014/15
Expanded Structure of IAS Computer
10
Computer Architecture and Organization
Sem. I, 2014/15
IAS operation
 Operates by repetitively performing an Instruction cycle.
 Each instruction cycle consists of two subcycles. Instruction Fetch and
Instruction Execute Cycle
 In fetch cycle the opcode of the next instruction is loaded into the IR and the
address portion is loaded into the MAR.
 This instruction may be taken from the IBR or from memory by loading a word
into the MBR, and then down to the IBR, IR, and MAR.
 There is only one register that is used to specify the address in memory for a
read or write and One register is used for the source or destination.
 Once the opcode is in the IR, the execute cycle is performed.
 Control circuitry interprets the opcode and executes the instruction.
11
Computer Architecture and Organization
Sem. I, 2014/15
IAS Operation Partial Flow Chart
12
Computer Architecture and Organization
Sem. I, 2014/15
IAS Instruction Set
The IAS computer had a total of 21 instructions, which can be grouped as follows:
 Data transfer: Move data between memory and ALU registers or between two
ALU registers.
 Unconditional branch: Normally, the control unit executes instructions in
sequence from memory. This sequence can be changed by a branch
instruction, which facilitates repetitive operations.
 Conditional branch: The branch can be made dependent on a condition, thus
allowing decision points.
 Arithmetic: Operations performed by the ALU.
 Address modify: Permits addresses to be computed in the ALU and then
inserted into instructions stored in memory. This allows a program considerable
addressing flexibility.
13
Computer Architecture and Organization
Sem. I, 2014/15
IAS Instruction Set Details
14
Computer Architecture and Organization
Sem. I, 2014/15
Commercial Computers
 1947 - Eckert-Mauchly Computer Corporation
 UNIVAC I (Universal Automatic Computer)
 US Bureau of Census 1950 calculations
 Became part of Sperry-Rand Corporation
 Late 1950s - UNIVAC II
 Faster
 More memory
15
Computer Architecture and Organization
Sem. I, 2014/15
IBM
 Punched-card processing equipment
 1953 - the 701
 IBM’s first Electronic stored program computer
 Scientific calculations
 1955 - the 702
 Business applications
 Lead to 700/7000 series
16
Computer Architecture and Organization
Sem. I, 2014/15
Transistors
 Replaced vacuum tubes
 Smaller
 Cheaper
 Less heat dissipation
 Solid State device
 Made from Silicon (Sand)
 Invented 1947 at Bell Labs
 William Shockley et al.
17
Computer Architecture and Organization
Sem. I, 2014/15
Second Generation : Transistors
 Transistor Based Computers
 More complex arithmetic and logic units and control units,
 The use high-level programming languages and software provided the ability to
load programs,(beginning of OSes)
 Data channels:-independent I/O module with its own processor & Instruction
set. -relieves the CPU of a considerable processing burden.
 Multiplexor :- termination point for data channels, the CPU, and memory.
- schedules access to the memory from the CPU and data channels
 NCR & RCA produced small transistor machines
 IBM 7000
 DEC - 1957
 Produced PDP-1
18
Computer Architecture and Organization
Sem. I, 2014/15
Microelectronics
 Literally - “small electronics”
 A computer is made up of gates, memory cells and
interconnections
 These can be manufactured on a semiconductor material.
e.g. silicon wafer
 Jack Kilby invented the integrated circuit at Texas
Instruments in 1958 creating a transistor , resistor and
capacitor from a piece of Germanium
 Robert Noyce also
independently invented
IC
19
Computer Architecture and Organization
Sem. I, 2014/15
Microelectronics- Modern IC Manufacturing
 Using technique of Photolithography
20
Computer Architecture and Organization
Sem. I, 2014/15
Moore’s Law
 Gordon Moore – co-founder of Intel
 Number of transistors on a chip will double every year
 Since 1970’s development has slowed a little
 Number of transistors doubles every 18 months
 Cost of a chip has remained almost unchanged
 Consequence of Moore’s Law
 Higher packing density means shorter electrical paths, giving
higher performance
 Smaller size gives increased flexibility
 Reduced power and cooling requirements
 Fewer interconnections increases reliability
21
Computer Architecture and Organization
Sem. I, 2014/15
Growth in Transistor Count on Integrated Circuits
22
Computer Architecture and Organization
Sem. I, 2014/15
IBM 360 series
 1964
 Replaced (& not compatible with) 7000 series
 First planned “family” of computers
 Similar or identical instruction sets
 Similar or identical O/S
 Increasing speed
 Increasing number of I/O ports (i.e. more terminals)
 Increased memory size
 Increased cost
23
Computer Architecture and Organization
Sem. I, 2014/15
DEC PDP-8
 1964
 First minicomputer
(after miniskirt!)
 Small enough to
sit on a lab bench
 $16,000
 $100k+ for IBM 360
24
Computer Architecture and Organization
Sem. I, 2014/15
Intel
 1971 - 4004
 First microprocessor
 All CPU components
on a single chip
 4 bit
25
Computer Architecture and Organization
Sem. I, 2014/15
Intel …
 Followed in 1972 by 8008
 8 bit
 Both designed for specific applications
 1974 - 8080
 Intel’s first general purpose microprocessor
 IBM(1981) produced PC using Intel 8088
microprocessor
 8086: A 16-bit machine
 80386: Intel’s first 32-bit machine.
26
Computer Architecture and Organization
Sem. I, 2014/15
Semiconductor Memory
 1970
 Fairchild
 Size of a single core
 i.e. 1 bit of magnetic core storage
 Holds 256 bits
 Non-destructive read
 Much faster than core
 Capacity approximately doubles each year
27
Computer Architecture and Organization
Sem. I, 2014/15
Microprocessor Speed
 Raw speed of the microprocessor is not used unless constant work is feed to
the CPU
 Among the techniques used are
 Pipelining: With pipelining, a processor can simultaneously work on multiple
instructions. I.e while one instruction is being executed, the computer is
decoding the next instruction.
 Branch prediction: The processor looks ahead in the instruction code fetched
from memory and predicts which branches, or groups of instructions, are likely
to be processed next. The more sophisticated examples of this strategy predict
not just the next branch but multiple branches ahead.
28
Computer Architecture and Organization
Sem. I, 2014/15
Microprocessor Speed …
 Data flow analysis: The processor analyzes which instructions are dependent
on each other’s results, or data, to create an optimized schedule of
instructions. In fact, instructions are scheduled to be executed when ready,
independent of the original program order. This prevents unnecessary delay.
 Speculative execution: Using branch prediction and data flow analysis, some
processors speculatively execute instructions ahead of their actual appearance
in the program execution, holding the results in temporary locations. This
enables the processor to keep its execution engines as busy as possible by
executing instructions that are likely to be needed.These and other
sophisticated techniques are made necessary by the sheer power
29
Computer Architecture and Organization
Sem. I, 2014/15
Speeding it up
 By increasing hardware speed of microprocessor
 Fundamentally due to shrinking logic gate size
 More gates, packed more tightly, reduce propagation time for signals
which intern increase clock speed
 chipmakers fabricate chips of greater density and then
speed and the processor designers come up with
techniques that use the higher speed
30
Computer Architecture and Organization
Sem. I, 2014/15
Performance Balance
 Processor speed increased
 Memory capacity increased
 Memory speed lags behind processor speed
31
Computer Architecture and Organization
Sem. I, 2014/15
Logic and Memory Performance Gap
32
Computer Architecture and Organization
Sem. I, 2014/15
Increased Cache Capacity
 Increase number of bits retrieved at one time
 Make DRAM “wider” rather than “deeper”
 Change DRAM interface
 Cache
 Reduce frequency of memory access
 More complex cache and cache on chip
 Increase interconnection bandwidth
 High speed buses
 Hierarchy of buses
33
Computer Architecture and Organization
Sem. I, 2014/15
Problems with Clock Speed and Logic Density
 Power
 Power density increases with density of logic and clock speed
 Dissipating large amount of heat and controlling the heat is
becoming difficult which will lead to permanent damage of
transistor
 RC delay
 Speed at which electrons flow limited by resistance and
capacitance of metal wires connecting them
 Delay increases as RC product increases
 Wire interconnects thinner, increasing resistance
 Wires closer together, increasing capacitance
 Solution:
 More emphasis on other organizational and architectural
approaches
34
Computer Architecture and Organization
Sem. I, 2014/15
Increased Cache Capacity
 Typically two or three levels of cache between processor
and main memory
 Chip density increased
 More cache memory on chip
 Faster cache access
 Pentium chip devoted about 10% of chip area to cache
 Pentium 4 devotes about 50%
35
Computer Architecture and Organization
Sem. I, 2014/15
More Complex Execution Logic
 Enable parallel execution of instructions
 Pipeline works like assembly line
 Different stages of execution of different instructions at same time
along pipeline
 Superscalar allows multiple pipelines within single
processor
 Instructions that do not depend on one another can be executed in
parallel
36
Computer Architecture and Organization
Sem. I, 2014/15
Diminishing Returns
 As processors getting complex further significant increases
in parallelism likely to be relatively modest
 Designing different techniques become difficult
 Benefits from cache are reaching limit
37
Computer Architecture and Organization
Sem. I, 2014/15
New Approach – Multiple Cores
 Multiple processors on single chip
 Large shared cache
 If software can use multiple processors, doubling number
of processors almost doubles performance
 So, use two simpler processors on the chip rather than one
more complex processor
 With two processors, larger caches are justified
 Power consumption of memory logic less than processing logic
38
Computer Architecture and Organization
Sem. I, 2014/15
Intel Microprocessor Performance
39
Computer Architecture and Organization
Sem. I, 2014/15
x86 Evolution (1)
 8080
 first general purpose microprocessor
 8 bit data path
 Used in first personal computer – Altair
 8086 – 5MHz – 29,000 transistors
 much more powerful
 16 bit
 instruction cache, prefetch few instructions
 8088 (8 bit external data bus) used in first IBM PC
 80286
 16 Mbyte memory addressable
 up from 1Mb
 80386
 32 bit
 Support for multitasking
 80486
 sophisticated powerful cache and instruction pipelining
 built in maths co-processor
40
Computer Architecture and Organization
Sem. I, 2014/15
x86 Evolution (2)
 Pentium
 Superscalar
 Multiple instructions executed in parallel
 Pentium Pro
 Increased superscalar organization
 Aggressive register renaming
 branch prediction
 data flow analysis
 speculative execution
 Pentium II
 MMX technology
 graphics, video & audio processing
 Pentium III
 Additional floating point instructions for 3D graphics
41
Computer Architecture and Organization
Sem. I, 2014/15
x86 Evolution (3)
 Pentium 4
 Further floating point and multimedia enhancements
 Core
 First x86 with dual core
 Core 2
 64 bit architecture
 Core 2 Quad – 3GHz – 820 million transistors
 Four processors on chip
 x86 architecture dominant outside embedded systems
 Organization and technology changed dramatically
 Instruction set architecture evolved with backwards compatibility
 ~1 instruction per month added
 500 instructions available
42
Computer Architecture and Organization
Sem. I, 2014/15
Performance Assessment Clock Speed
 Key parameters of computer assessment
 Performance, cost, size, security, reliability, power consumption
 System clock speed is given in terms of Hz or multiples of
clock rate, clock cycle, clock tick, cycle time
 Signals in CPU take time to settle down to 1 or 0 and the
worst time taken by a signal inside the CPU is taken to be
the clock speed
 Instruction execution in discrete steps
 Fetch, decode, load and store, arithmetic or logical
 Usually require multiple clock cycles per instruction
 Pipelining gives simultaneous execution of instructions
 So, clock speed is not the whole story in performance
assessment
43
Computer Architecture and Organization
Sem. I, 2014/15
Instruction Execution Rate
 A processor is driven by a clock with a constant frequency for,
equivalently, a constant cycle time t, where t =1/f.
 instruction count, Ic
 An important parameter is the average cycles per instruction (CPI) for a
program.
 If all instructions required the same number of clock cycles, then CPI
would be a constant value for a processor.
 the number of clock cycles required varies for different types of
instructions, such as load, store, branch, and so on.
 Let CPIi be the number of cycles required for instruction type I and Ii
be the number of executed instructions of type I for a given program.
44
Computer Architecture and Organization
Sem. I, 2014/15
Instruction Execution Rate
 The processor time T needed to execute a given program can be expressed as
 We can refine this formulation by recognizing that during the execution of an instruction,
part of the work is done by the processor, and part of the time a word is being transferred
to or from memory. In this latter case, the time to transfer depends on the memory cycle
time, which may be greater than the processor cycle time.
 where p is the number of processor cycles needed to decode and execute the
instruction, m is the number of memory references needed, and k is the ratio between
them.
 The five performance factors in the preceding equation (Ic, p, m, k, t) are influenced by
four system attributes: the design of the instruction set (known as instruction set
architecture), compiler technology (how effective the compiler is in producing an efficient
machine language program from a high-level language program), processor
implementation, and cache
 and memory hierarchy.
45
Computer Architecture and Organization
Sem. I, 2014/15
Instruction Execution Rate …
46
Computer Architecture and Organization
Sem. I, 2014/15
Instruction Execution Rate Example
 For example, consider the execution of a program that results in the execution
of 2 million instructions on a 400-MHz processor. The program consists of four
major types of instructions. The instruction mix and the CPI for each instruction
type are given below based on the result of a program trace experiment:
 Calculate the Average CPI and MIPS ?
47
Computer Architecture and Organization
Sem. I, 2014/15
Instruction Execution Rate Example
48
Computer Architecture and Organization
The average CPI when the program is executed on a uniprocessor
with
the above trace results is
CPI =0.6 +(2 X0.18) +(4X0.12) +(8X0.1) =2.24.
The corresponding MIPS rate is
Sem. I, 2014/15
Benchmarks
 Programs designed to test performance
 Written in high level language
 Portable
 Represents style of task
 Systems, numerical, commercial
 Easily measured
 Widely distributed
 E.g. System Performance Evaluation Corporation (SPEC)
 CPU2006 for computation bound
 17 floating point programs in C, C++, Fortran
 12 integer programs in C, C++
 3 million lines of code
 Speed and rate metrics
 Single task and throughput
49
Computer Architecture and Organization
Sem. I, 2014/15
SPEC Speed Metric
50
Computer Architecture and Organization
 Single task
 Base runtime defined for each benchmark using
reference machine
 Results are reported as ratio of reference time to system
run time
 Trefi execution time for benchmark i on reference machine
 Tsuti execution time of benchmark i on test system
• Overall performance calculated by averaging
ratios for all 12 integer benchmarks
—Use geometric mean
– Appropriate for normalized numbers such as ratios
Sem. I, 2014/15
SPEC Speed Metric
51
Computer Architecture and Organization
The Sun system executes this program in 934 seconds. The reference
implementation requires 22,135 seconds. The ratio is calculated as:
22136/934 =23.7.
Sem. I, 2014/15
SPEC Rate Metric
 Measures throughput or rate of a machine carrying out a number of
tasks
 Multiple copies of benchmarks run simultaneously
 Typically, same as number of processors
 Ratio is calculated as follows:
 Trefi reference execution time for benchmark i
 N number of copies run simultaneously
 Tsuti elapsed time from start of execution of program on all N processors
until completion of all copies of program
 Again, a geometric mean is calculated
52
Computer Architecture and Organization
Sem. I, 2014/15
Amdahl’s Law
 Gene Amdahl [AMDA67]
 Speedup in one aspect of the technology or design does
not result in a corresponding improvement in performance
 Potential speed up of program using multiple processors
 Concluded that:
 Code needs to be parallelizable
 Speed up is bound, giving diminishing returns for more processors
 Task dependent
 Servers gain by maintaining multiple connections on multiple
processors
 Databases can be split into parallel tasks
53
Computer Architecture and Organization
Sem. I, 2014/15
Amdahl’s Law Formula
54
Computer Architecture and Organization
 Conclusions
 f small, parallel processors has little effect
 N ->∞, speedup bound by 1/(1 – f)
 Diminishing returns for using more processors
• For program running on single processor
—Fraction f of code infinitely parallelizable with no
scheduling overhead
—Fraction (1-f) of code inherently serial
—T is total execution time for program on single processor
—N is number of processors that fully exploit parralle
portions of code
Sem. I, 2014/15
Amdahl’s Law Formula …
55
Computer Architecture and Organization
UG Program
School of Electrical and Computer Engineering
End of Chapter 2
Computer Architecture
and Organization
(ECEG-3202 )

More Related Content

Similar to Computer_Architecture&O_ECEG_3163_02_computer_evolution_performance.pptx

Organisasi dan arsitektur komputer 2
Organisasi dan arsitektur komputer   2Organisasi dan arsitektur komputer   2
Organisasi dan arsitektur komputer 2Ajeng Savitri
 
Computer programming1
Computer programming1Computer programming1
Computer programming1A A
 
Lecturer1 introduction to computer architecture (ca)
Lecturer1   introduction to computer architecture (ca)Lecturer1   introduction to computer architecture (ca)
Lecturer1 introduction to computer architecture (ca)ADEOLA ADISA
 
M.sc I-sem-8086 notes
M.sc  I-sem-8086 notesM.sc  I-sem-8086 notes
M.sc I-sem-8086 notesDr.YNM
 
M.sc I-sem-8086 notes
M.sc  I-sem-8086 notesM.sc  I-sem-8086 notes
M.sc I-sem-8086 notesDr.YNM
 
02 Computer Evolution And Performance
02  Computer  Evolution And  Performance02  Computer  Evolution And  Performance
02 Computer Evolution And PerformanceJeanie Delos Arcos
 
Computer Organization & Architecture
Computer Organization & ArchitectureComputer Organization & Architecture
Computer Organization & ArchitectureAnupmaMunshi
 
VTU University Micro Controllers-06ES42 lecturer Notes
VTU University Micro Controllers-06ES42 lecturer NotesVTU University Micro Controllers-06ES42 lecturer Notes
VTU University Micro Controllers-06ES42 lecturer Notes24x7house
 
Microprocessor and Microcontroller Based Systems.ppt
Microprocessor and Microcontroller Based Systems.pptMicroprocessor and Microcontroller Based Systems.ppt
Microprocessor and Microcontroller Based Systems.pptTALHARIAZ46
 
Comparison between computers of past and present
Comparison between computers of past and presentComparison between computers of past and present
Comparison between computers of past and presentMuhammad Danish Badar
 
Coa module1
Coa module1Coa module1
Coa module1cs19club
 
Lecture 2 computer evolution and performance
Lecture 2 computer evolution and performanceLecture 2 computer evolution and performance
Lecture 2 computer evolution and performanceWajahat HuxaIn
 
coa-module1-170527034116.pdf
coa-module1-170527034116.pdfcoa-module1-170527034116.pdf
coa-module1-170527034116.pdfSnehithaKurimelli
 
cs305-170108135047.pptx
cs305-170108135047.pptxcs305-170108135047.pptx
cs305-170108135047.pptxLyka Gumatay
 

Similar to Computer_Architecture&O_ECEG_3163_02_computer_evolution_performance.pptx (20)

abc
abcabc
abc
 
Organisasi dan arsitektur komputer 2
Organisasi dan arsitektur komputer   2Organisasi dan arsitektur komputer   2
Organisasi dan arsitektur komputer 2
 
computer system organization basics
computer system organization basicscomputer system organization basics
computer system organization basics
 
Computer programming1
Computer programming1Computer programming1
Computer programming1
 
Lecturer1 introduction to computer architecture (ca)
Lecturer1   introduction to computer architecture (ca)Lecturer1   introduction to computer architecture (ca)
Lecturer1 introduction to computer architecture (ca)
 
M.sc I-sem-8086 notes
M.sc  I-sem-8086 notesM.sc  I-sem-8086 notes
M.sc I-sem-8086 notes
 
M.sc I-sem-8086 notes
M.sc  I-sem-8086 notesM.sc  I-sem-8086 notes
M.sc I-sem-8086 notes
 
Computer basic and cpu
Computer basic and cpuComputer basic and cpu
Computer basic and cpu
 
02 Computer Evolution And Performance
02  Computer  Evolution And  Performance02  Computer  Evolution And  Performance
02 Computer Evolution And Performance
 
Computer Organization & Architecture
Computer Organization & ArchitectureComputer Organization & Architecture
Computer Organization & Architecture
 
VTU University Micro Controllers-06ES42 lecturer Notes
VTU University Micro Controllers-06ES42 lecturer NotesVTU University Micro Controllers-06ES42 lecturer Notes
VTU University Micro Controllers-06ES42 lecturer Notes
 
Microprocessor and Microcontroller Based Systems.ppt
Microprocessor and Microcontroller Based Systems.pptMicroprocessor and Microcontroller Based Systems.ppt
Microprocessor and Microcontroller Based Systems.ppt
 
Pipe line
Pipe linePipe line
Pipe line
 
Comparison between computers of past and present
Comparison between computers of past and presentComparison between computers of past and present
Comparison between computers of past and present
 
Principles of computer design
Principles of computer designPrinciples of computer design
Principles of computer design
 
COA.pptx
COA.pptxCOA.pptx
COA.pptx
 
Coa module1
Coa module1Coa module1
Coa module1
 
Lecture 2 computer evolution and performance
Lecture 2 computer evolution and performanceLecture 2 computer evolution and performance
Lecture 2 computer evolution and performance
 
coa-module1-170527034116.pdf
coa-module1-170527034116.pdfcoa-module1-170527034116.pdf
coa-module1-170527034116.pdf
 
cs305-170108135047.pptx
cs305-170108135047.pptxcs305-170108135047.pptx
cs305-170108135047.pptx
 

Recently uploaded

Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...Call girls in Ahmedabad High profile
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 

Recently uploaded (20)

Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 

Computer_Architecture&O_ECEG_3163_02_computer_evolution_performance.pptx

  • 1. UG Program School of Electrical and Computer Engineering Chapter 2:Computer Evolution and Performance Computer Architecture and Organization (ECEG-3163 )
  • 2. Sem. I, 2014/15 Generations of Computer 2 Computer Architecture and Organization  Vacuum tube - 1946-1957  Transistor - 1958-1964  Small scale integration - 1965 on  Up to 100 devices on a chip  Medium scale integration - to 1971  100-3,000 devices on a chip  Large scale integration - 1971-1977  3,000 - 100,000 devices on a chip  Very large scale integration - 1978 -1991  100,000 - 100,000,000 devices on a chip  Ultra large scale integration – 1991 -  Over 100,000,000 devices on a chip
  • 3. Sem. I, 2014/15 ENIAC - background 3 Computer Architecture and Organization  Electronic Numerical Integrator And Computer  Eckert and Mauchly  University of Pennsylvania  Trajectory tables for weapons  Started 1943  Finished 1946 Too late for war effort  Used until 1955
  • 4. Sem. I, 2014/15 ENIAC - details  Decimal (not binary)  20 accumulators of 10 digits  Programmed manually by switches  18,000 vacuum tubes  30 tons  15,000 square feet  140 kW power consumption  5,000 additions per second  Showed the general purpose of a computer 4 Computer Architecture and Organization
  • 5. Sem. I, 2014/15 Von Neumann/Turing  Stored Program concept  Main memory storing programs and data  ALU operating on binary data  Control unit interpreting instructions from memory and executing  Input and output equipment operated by control unit  Princeton Institute for Advanced Studies  IAS  Completed 1952  The first publication of the idea was in a 1945 proposal by von Neumann for a new computer, the EDVAC (Electronic Discrete Variable Computer) 5 Computer Architecture and Organization
  • 6. Sem. I, 2014/15 Structure of von Neumann machine  A main memory, which stores both data and instructions  An arithmetic and logic unit (ALU)capable of operating on binary data  A control unit, which interprets the instructions in memory and causes them to be executed  Input/output(I/O)equipment operated by the control unit 6 Computer Architecture and Organization
  • 7. Sem. I, 2014/15 Brief Description of the IAS computer  The memory of the IAS consists of 1000 storage locations, called words, of 40 binary digits (bits) each.  Both data and instructions are stored there.  Numbers are represented in binary form, and each instruction is a binary code. 7 Computer Architecture and Organization
  • 8. Sem. I, 2014/15 IAS Memory Format 8 Computer Architecture and Organization  Each number is represented by a sign bit and a 39-bit value.  A word may also contain two 20-bit instructions, with each instruction  consisting of an 8-bit operation code (opcode)  12-bit address
  • 9. Sem. I, 2014/15 IAS Main Registers 9 Computer Architecture and Organization  Memory buffer register (MBR):Contains a word to be stored in memory or sent to the I/O unit, or is used to receive a word from memory or from the I/O unit.  Memory address register (MAR):Specifies the address in memory of the word to be written from or read into the MBR.  Instruction register (IR):Contains the 8-bit opcode instruction being executed.  Instruction buffer register (IBR):Employed to hold temporarily the right hand instruction from a word in memory.  Program counter (PC):Contains the address of the next instruction pair to be fetched from memory.  Accumulator (AC) and multiplier quotient (MQ): Employed to hold temporarily operands and results of ALU operations. For example, the result of multiplying two 40-bit numbers is an 80-bit number; the most significant 40 bits are stored in the AC and the least significant in the MQ.
  • 10. Sem. I, 2014/15 Expanded Structure of IAS Computer 10 Computer Architecture and Organization
  • 11. Sem. I, 2014/15 IAS operation  Operates by repetitively performing an Instruction cycle.  Each instruction cycle consists of two subcycles. Instruction Fetch and Instruction Execute Cycle  In fetch cycle the opcode of the next instruction is loaded into the IR and the address portion is loaded into the MAR.  This instruction may be taken from the IBR or from memory by loading a word into the MBR, and then down to the IBR, IR, and MAR.  There is only one register that is used to specify the address in memory for a read or write and One register is used for the source or destination.  Once the opcode is in the IR, the execute cycle is performed.  Control circuitry interprets the opcode and executes the instruction. 11 Computer Architecture and Organization
  • 12. Sem. I, 2014/15 IAS Operation Partial Flow Chart 12 Computer Architecture and Organization
  • 13. Sem. I, 2014/15 IAS Instruction Set The IAS computer had a total of 21 instructions, which can be grouped as follows:  Data transfer: Move data between memory and ALU registers or between two ALU registers.  Unconditional branch: Normally, the control unit executes instructions in sequence from memory. This sequence can be changed by a branch instruction, which facilitates repetitive operations.  Conditional branch: The branch can be made dependent on a condition, thus allowing decision points.  Arithmetic: Operations performed by the ALU.  Address modify: Permits addresses to be computed in the ALU and then inserted into instructions stored in memory. This allows a program considerable addressing flexibility. 13 Computer Architecture and Organization
  • 14. Sem. I, 2014/15 IAS Instruction Set Details 14 Computer Architecture and Organization
  • 15. Sem. I, 2014/15 Commercial Computers  1947 - Eckert-Mauchly Computer Corporation  UNIVAC I (Universal Automatic Computer)  US Bureau of Census 1950 calculations  Became part of Sperry-Rand Corporation  Late 1950s - UNIVAC II  Faster  More memory 15 Computer Architecture and Organization
  • 16. Sem. I, 2014/15 IBM  Punched-card processing equipment  1953 - the 701  IBM’s first Electronic stored program computer  Scientific calculations  1955 - the 702  Business applications  Lead to 700/7000 series 16 Computer Architecture and Organization
  • 17. Sem. I, 2014/15 Transistors  Replaced vacuum tubes  Smaller  Cheaper  Less heat dissipation  Solid State device  Made from Silicon (Sand)  Invented 1947 at Bell Labs  William Shockley et al. 17 Computer Architecture and Organization
  • 18. Sem. I, 2014/15 Second Generation : Transistors  Transistor Based Computers  More complex arithmetic and logic units and control units,  The use high-level programming languages and software provided the ability to load programs,(beginning of OSes)  Data channels:-independent I/O module with its own processor & Instruction set. -relieves the CPU of a considerable processing burden.  Multiplexor :- termination point for data channels, the CPU, and memory. - schedules access to the memory from the CPU and data channels  NCR & RCA produced small transistor machines  IBM 7000  DEC - 1957  Produced PDP-1 18 Computer Architecture and Organization
  • 19. Sem. I, 2014/15 Microelectronics  Literally - “small electronics”  A computer is made up of gates, memory cells and interconnections  These can be manufactured on a semiconductor material. e.g. silicon wafer  Jack Kilby invented the integrated circuit at Texas Instruments in 1958 creating a transistor , resistor and capacitor from a piece of Germanium  Robert Noyce also independently invented IC 19 Computer Architecture and Organization
  • 20. Sem. I, 2014/15 Microelectronics- Modern IC Manufacturing  Using technique of Photolithography 20 Computer Architecture and Organization
  • 21. Sem. I, 2014/15 Moore’s Law  Gordon Moore – co-founder of Intel  Number of transistors on a chip will double every year  Since 1970’s development has slowed a little  Number of transistors doubles every 18 months  Cost of a chip has remained almost unchanged  Consequence of Moore’s Law  Higher packing density means shorter electrical paths, giving higher performance  Smaller size gives increased flexibility  Reduced power and cooling requirements  Fewer interconnections increases reliability 21 Computer Architecture and Organization
  • 22. Sem. I, 2014/15 Growth in Transistor Count on Integrated Circuits 22 Computer Architecture and Organization
  • 23. Sem. I, 2014/15 IBM 360 series  1964  Replaced (& not compatible with) 7000 series  First planned “family” of computers  Similar or identical instruction sets  Similar or identical O/S  Increasing speed  Increasing number of I/O ports (i.e. more terminals)  Increased memory size  Increased cost 23 Computer Architecture and Organization
  • 24. Sem. I, 2014/15 DEC PDP-8  1964  First minicomputer (after miniskirt!)  Small enough to sit on a lab bench  $16,000  $100k+ for IBM 360 24 Computer Architecture and Organization
  • 25. Sem. I, 2014/15 Intel  1971 - 4004  First microprocessor  All CPU components on a single chip  4 bit 25 Computer Architecture and Organization
  • 26. Sem. I, 2014/15 Intel …  Followed in 1972 by 8008  8 bit  Both designed for specific applications  1974 - 8080  Intel’s first general purpose microprocessor  IBM(1981) produced PC using Intel 8088 microprocessor  8086: A 16-bit machine  80386: Intel’s first 32-bit machine. 26 Computer Architecture and Organization
  • 27. Sem. I, 2014/15 Semiconductor Memory  1970  Fairchild  Size of a single core  i.e. 1 bit of magnetic core storage  Holds 256 bits  Non-destructive read  Much faster than core  Capacity approximately doubles each year 27 Computer Architecture and Organization
  • 28. Sem. I, 2014/15 Microprocessor Speed  Raw speed of the microprocessor is not used unless constant work is feed to the CPU  Among the techniques used are  Pipelining: With pipelining, a processor can simultaneously work on multiple instructions. I.e while one instruction is being executed, the computer is decoding the next instruction.  Branch prediction: The processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next. The more sophisticated examples of this strategy predict not just the next branch but multiple branches ahead. 28 Computer Architecture and Organization
  • 29. Sem. I, 2014/15 Microprocessor Speed …  Data flow analysis: The processor analyzes which instructions are dependent on each other’s results, or data, to create an optimized schedule of instructions. In fact, instructions are scheduled to be executed when ready, independent of the original program order. This prevents unnecessary delay.  Speculative execution: Using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations. This enables the processor to keep its execution engines as busy as possible by executing instructions that are likely to be needed.These and other sophisticated techniques are made necessary by the sheer power 29 Computer Architecture and Organization
  • 30. Sem. I, 2014/15 Speeding it up  By increasing hardware speed of microprocessor  Fundamentally due to shrinking logic gate size  More gates, packed more tightly, reduce propagation time for signals which intern increase clock speed  chipmakers fabricate chips of greater density and then speed and the processor designers come up with techniques that use the higher speed 30 Computer Architecture and Organization
  • 31. Sem. I, 2014/15 Performance Balance  Processor speed increased  Memory capacity increased  Memory speed lags behind processor speed 31 Computer Architecture and Organization
  • 32. Sem. I, 2014/15 Logic and Memory Performance Gap 32 Computer Architecture and Organization
  • 33. Sem. I, 2014/15 Increased Cache Capacity  Increase number of bits retrieved at one time  Make DRAM “wider” rather than “deeper”  Change DRAM interface  Cache  Reduce frequency of memory access  More complex cache and cache on chip  Increase interconnection bandwidth  High speed buses  Hierarchy of buses 33 Computer Architecture and Organization
  • 34. Sem. I, 2014/15 Problems with Clock Speed and Logic Density  Power  Power density increases with density of logic and clock speed  Dissipating large amount of heat and controlling the heat is becoming difficult which will lead to permanent damage of transistor  RC delay  Speed at which electrons flow limited by resistance and capacitance of metal wires connecting them  Delay increases as RC product increases  Wire interconnects thinner, increasing resistance  Wires closer together, increasing capacitance  Solution:  More emphasis on other organizational and architectural approaches 34 Computer Architecture and Organization
  • 35. Sem. I, 2014/15 Increased Cache Capacity  Typically two or three levels of cache between processor and main memory  Chip density increased  More cache memory on chip  Faster cache access  Pentium chip devoted about 10% of chip area to cache  Pentium 4 devotes about 50% 35 Computer Architecture and Organization
  • 36. Sem. I, 2014/15 More Complex Execution Logic  Enable parallel execution of instructions  Pipeline works like assembly line  Different stages of execution of different instructions at same time along pipeline  Superscalar allows multiple pipelines within single processor  Instructions that do not depend on one another can be executed in parallel 36 Computer Architecture and Organization
  • 37. Sem. I, 2014/15 Diminishing Returns  As processors getting complex further significant increases in parallelism likely to be relatively modest  Designing different techniques become difficult  Benefits from cache are reaching limit 37 Computer Architecture and Organization
  • 38. Sem. I, 2014/15 New Approach – Multiple Cores  Multiple processors on single chip  Large shared cache  If software can use multiple processors, doubling number of processors almost doubles performance  So, use two simpler processors on the chip rather than one more complex processor  With two processors, larger caches are justified  Power consumption of memory logic less than processing logic 38 Computer Architecture and Organization
  • 39. Sem. I, 2014/15 Intel Microprocessor Performance 39 Computer Architecture and Organization
  • 40. Sem. I, 2014/15 x86 Evolution (1)  8080  first general purpose microprocessor  8 bit data path  Used in first personal computer – Altair  8086 – 5MHz – 29,000 transistors  much more powerful  16 bit  instruction cache, prefetch few instructions  8088 (8 bit external data bus) used in first IBM PC  80286  16 Mbyte memory addressable  up from 1Mb  80386  32 bit  Support for multitasking  80486  sophisticated powerful cache and instruction pipelining  built in maths co-processor 40 Computer Architecture and Organization
  • 41. Sem. I, 2014/15 x86 Evolution (2)  Pentium  Superscalar  Multiple instructions executed in parallel  Pentium Pro  Increased superscalar organization  Aggressive register renaming  branch prediction  data flow analysis  speculative execution  Pentium II  MMX technology  graphics, video & audio processing  Pentium III  Additional floating point instructions for 3D graphics 41 Computer Architecture and Organization
  • 42. Sem. I, 2014/15 x86 Evolution (3)  Pentium 4  Further floating point and multimedia enhancements  Core  First x86 with dual core  Core 2  64 bit architecture  Core 2 Quad – 3GHz – 820 million transistors  Four processors on chip  x86 architecture dominant outside embedded systems  Organization and technology changed dramatically  Instruction set architecture evolved with backwards compatibility  ~1 instruction per month added  500 instructions available 42 Computer Architecture and Organization
  • 43. Sem. I, 2014/15 Performance Assessment Clock Speed  Key parameters of computer assessment  Performance, cost, size, security, reliability, power consumption  System clock speed is given in terms of Hz or multiples of clock rate, clock cycle, clock tick, cycle time  Signals in CPU take time to settle down to 1 or 0 and the worst time taken by a signal inside the CPU is taken to be the clock speed  Instruction execution in discrete steps  Fetch, decode, load and store, arithmetic or logical  Usually require multiple clock cycles per instruction  Pipelining gives simultaneous execution of instructions  So, clock speed is not the whole story in performance assessment 43 Computer Architecture and Organization
  • 44. Sem. I, 2014/15 Instruction Execution Rate  A processor is driven by a clock with a constant frequency for, equivalently, a constant cycle time t, where t =1/f.  instruction count, Ic  An important parameter is the average cycles per instruction (CPI) for a program.  If all instructions required the same number of clock cycles, then CPI would be a constant value for a processor.  the number of clock cycles required varies for different types of instructions, such as load, store, branch, and so on.  Let CPIi be the number of cycles required for instruction type I and Ii be the number of executed instructions of type I for a given program. 44 Computer Architecture and Organization
  • 45. Sem. I, 2014/15 Instruction Execution Rate  The processor time T needed to execute a given program can be expressed as  We can refine this formulation by recognizing that during the execution of an instruction, part of the work is done by the processor, and part of the time a word is being transferred to or from memory. In this latter case, the time to transfer depends on the memory cycle time, which may be greater than the processor cycle time.  where p is the number of processor cycles needed to decode and execute the instruction, m is the number of memory references needed, and k is the ratio between them.  The five performance factors in the preceding equation (Ic, p, m, k, t) are influenced by four system attributes: the design of the instruction set (known as instruction set architecture), compiler technology (how effective the compiler is in producing an efficient machine language program from a high-level language program), processor implementation, and cache  and memory hierarchy. 45 Computer Architecture and Organization
  • 46. Sem. I, 2014/15 Instruction Execution Rate … 46 Computer Architecture and Organization
  • 47. Sem. I, 2014/15 Instruction Execution Rate Example  For example, consider the execution of a program that results in the execution of 2 million instructions on a 400-MHz processor. The program consists of four major types of instructions. The instruction mix and the CPI for each instruction type are given below based on the result of a program trace experiment:  Calculate the Average CPI and MIPS ? 47 Computer Architecture and Organization
  • 48. Sem. I, 2014/15 Instruction Execution Rate Example 48 Computer Architecture and Organization The average CPI when the program is executed on a uniprocessor with the above trace results is CPI =0.6 +(2 X0.18) +(4X0.12) +(8X0.1) =2.24. The corresponding MIPS rate is
  • 49. Sem. I, 2014/15 Benchmarks  Programs designed to test performance  Written in high level language  Portable  Represents style of task  Systems, numerical, commercial  Easily measured  Widely distributed  E.g. System Performance Evaluation Corporation (SPEC)  CPU2006 for computation bound  17 floating point programs in C, C++, Fortran  12 integer programs in C, C++  3 million lines of code  Speed and rate metrics  Single task and throughput 49 Computer Architecture and Organization
  • 50. Sem. I, 2014/15 SPEC Speed Metric 50 Computer Architecture and Organization  Single task  Base runtime defined for each benchmark using reference machine  Results are reported as ratio of reference time to system run time  Trefi execution time for benchmark i on reference machine  Tsuti execution time of benchmark i on test system • Overall performance calculated by averaging ratios for all 12 integer benchmarks —Use geometric mean – Appropriate for normalized numbers such as ratios
  • 51. Sem. I, 2014/15 SPEC Speed Metric 51 Computer Architecture and Organization The Sun system executes this program in 934 seconds. The reference implementation requires 22,135 seconds. The ratio is calculated as: 22136/934 =23.7.
  • 52. Sem. I, 2014/15 SPEC Rate Metric  Measures throughput or rate of a machine carrying out a number of tasks  Multiple copies of benchmarks run simultaneously  Typically, same as number of processors  Ratio is calculated as follows:  Trefi reference execution time for benchmark i  N number of copies run simultaneously  Tsuti elapsed time from start of execution of program on all N processors until completion of all copies of program  Again, a geometric mean is calculated 52 Computer Architecture and Organization
  • 53. Sem. I, 2014/15 Amdahl’s Law  Gene Amdahl [AMDA67]  Speedup in one aspect of the technology or design does not result in a corresponding improvement in performance  Potential speed up of program using multiple processors  Concluded that:  Code needs to be parallelizable  Speed up is bound, giving diminishing returns for more processors  Task dependent  Servers gain by maintaining multiple connections on multiple processors  Databases can be split into parallel tasks 53 Computer Architecture and Organization
  • 54. Sem. I, 2014/15 Amdahl’s Law Formula 54 Computer Architecture and Organization  Conclusions  f small, parallel processors has little effect  N ->∞, speedup bound by 1/(1 – f)  Diminishing returns for using more processors • For program running on single processor —Fraction f of code infinitely parallelizable with no scheduling overhead —Fraction (1-f) of code inherently serial —T is total execution time for program on single processor —N is number of processors that fully exploit parralle portions of code
  • 55. Sem. I, 2014/15 Amdahl’s Law Formula … 55 Computer Architecture and Organization
  • 56. UG Program School of Electrical and Computer Engineering End of Chapter 2 Computer Architecture and Organization (ECEG-3202 )