What Next in Computer Architecture: Problems
and Research Direction
Jaynarayan Tudu
Computer Science and Engineering
Indian Institute of Technology Tirupati, India
August 20, 2025
Jaynarayan Tudu
What Next in Computer Architecture: Problems and Research D
VLSI Past and Lesson Learnt
It all started with Moore’s Law
The corollary of Moore’s Law
Performance
Power, Energy and Temperature (the Dennard Scaling)
Reliability
Complexity of the Design
Evolution in computing
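The Dennard scaling item above can be illustrated with a small sketch. It assumes the standard dynamic power model P = a * C * V^2 * f and the ideal scaling rules (capacitance and voltage shrink by 1/k, frequency grows by k, area shrinks by 1/k^2). The starting values and the function names are hypothetical and chosen only for illustration.

```python
def dynamic_power(capacitance, voltage, frequency, activity=1.0):
    """Dynamic switching power: P = activity * C * V^2 * f."""
    return activity * capacitance * voltage ** 2 * frequency


def power_density(capacitance, voltage, frequency, area):
    """Dynamic power per unit area."""
    return dynamic_power(capacitance, voltage, frequency) / area


def dennard_scale(capacitance, voltage, frequency, area, k, scale_voltage=True):
    """Scale a device by factor k (> 1).

    Ideal Dennard scaling shrinks capacitance and voltage by 1/k,
    raises frequency by k, and shrinks area by 1/k^2.
    With scale_voltage=False the supply voltage stays fixed, which
    models the situation after voltage scaling stalled.
    """
    new_voltage = voltage / k if scale_voltage else voltage
    return capacitance / k, new_voltage, frequency * k, area / (k * k)


# Hypothetical starting point: (C in farads, V in volts, f in Hz, area in m^2).
base = (1e-15, 1.0, 1e9, 1e-12)
k = 1.4  # roughly one technology generation (assumed value)

ideal = dennard_scale(*base, k)
fixed_v = dennard_scale(*base, k, scale_voltage=False)

print("base power density      :", power_density(*base))
print("ideal Dennard scaling   :", power_density(*ideal))
print("voltage no longer scaled:", power_density(*fixed_v))
```

Under the ideal rules the power density stays constant from one generation to the next. When the voltage stops scaling, the same sketch shows the density growing by k^2 per generation, which is the root of the power and temperature problem.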
CMOS Transistor
Figure: nMOS and pMOS transistors
Taken from: CMOS VLSI Design, Weste and Harris
Transistor and IC History
Figure: The first transistor, built at AT&T Bell Labs, and the first IC
Taken from: CMOS VLSI Design, Weste and Harris
Transistor Scaling: Moore’s Law
Manufacturing Cost: Moore’s Law
Transistor count
Figure: It is now in the billion era
Transistor Production
Figure: It is growing every year [IEEE Spectrum]
Transistor Technology Trend
Figure: Scaling-driven technology
Trends in Functionality in Microprocessor
Figure: Microprocessor Functionality
The highest to date: IBM P6 at 5.0 GHz.
Frequency [ITRS 2013]
Figure: the rate has slowed down
Intel Microprocessor Trend: Clock frequency
[Figure residue: Intel processor timeline poster, from the Intel 4004 (introduced 1971; 108 kHz initial clock speed, 2,300 transistors, 10 µm manufacturing technology) to the 45 nm Penryn family (introduced 2007; > 3 GHz, 820,000,000 transistors). For each processor the poster lists the introduction year, initial clock speed, number of transistors, and manufacturing technology. Note: the number of transistors is an approximate number.]
Figure: Clock frequency has been consistently on the rise!
Performance Growth: 1978 to 2018
Figure: Growth of performance over the last 40 years (where is Moore's
Law?)
1978 to 1986: Doubling every 3.5 years
1988 to 2002: Doubling every 2 years
2002 to 2010: Doubling every 4 years
2010 to 2014: Doubling every 8 years
2014 to 2018: Doubling every 20 years
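The doubling periods listed above can be converted into equivalent annual growth rates, which makes the slowdown easier to compare. This is a minimal sketch, assuming steady exponential growth within each interval; the function name and the table variable are illustrative and not taken from the slides.

```python
def annual_growth_rate(doubling_years):
    """Annual growth rate implied by a doubling period.

    If performance doubles every T years, then (1 + r)^T = 2,
    so r = 2^(1/T) - 1.
    """
    return 2.0 ** (1.0 / doubling_years) - 1.0


# Doubling periods as listed on the slide.
DOUBLING_PERIODS = [
    ("1978 to 1986", 3.5),
    ("1988 to 2002", 2.0),
    ("2002 to 2010", 4.0),
    ("2010 to 2014", 8.0),
    ("2014 to 2018", 20.0),
]

for interval, years in DOUBLING_PERIODS:
    rate = annual_growth_rate(years)
    print(f"{interval}: doubling every {years} years = {rate * 100:.1f}% per year")
```

Doubling every 2 years corresponds to about 41% growth per year, while doubling every 20 years corresponds to only about 3.5% per year.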
Power Density Trend [ITRS 2015 — A Kahng, UCSD ]
Figure: Peak power density trend
Power Consumption [Borkar et al]
Figure: Processor power consumption
Heat Flux [IBM — ITRS 2015 meeting]
Figure: Heat flux (watts per unit area) by device technology: bipolar vs. CMOS
A new device technology is needed to tackle the heat flux!
Exascale Computing: Trends in Top500.org
Power and energy will pose a major challenge [Borkar 2011].
Gigawatt-scale power consumption is expected.
Figure: Performance development graph of Top500 supercomputers
And it is projected to go beyond an exaflop! Do we need it?
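A back-of-the-envelope sketch shows why power dominates the exascale discussion: the power a machine draws is its sustained performance divided by its energy efficiency. The efficiency values below are hypothetical round numbers picked for illustration, not measurements from the Top500 list.

```python
EXAFLOP = 1e18  # floating-point operations per second


def system_power_watts(performance_flops, efficiency_flops_per_watt):
    """Power needed to sustain a given performance at a given efficiency."""
    return performance_flops / efficiency_flops_per_watt


# Hypothetical efficiencies in GFLOPS per watt (assumed values).
HYPOTHETICAL_EFFICIENCIES = [1, 10, 50]

for gflops_per_watt in HYPOTHETICAL_EFFICIENCIES:
    watts = system_power_watts(EXAFLOP, gflops_per_watt * 1e9)
    print(f"{gflops_per_watt:>3} GFLOPS/W -> {watts / 1e6:,.0f} MW for 1 exaflop")
```

At 1 GFLOPS/W an exaflop machine would need about a gigawatt; only at tens of GFLOPS/W does the requirement fall to tens of megawatts.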
Power Trend in Top500
Figure: Nothing comes free... more performance consumes more power!
The result is a recent one: November 2024.
Performance per Watt Trend [top500.org]
Memory Wall: Hindrance in performance
The old story of the memory wall between CPU and memory!
Figure: Observe the flat CPU performance post 2005...
The real issue is the gap!
Memory Wall: Hindrance in performance
Here is a more recent result examining the memory wall problem
for CPU, GPU, and TPU!
Figure: The gap is still growing exponentially.
Taken from: Gholami et al., AI and Memory Wall, arXiv:2403.14123v1 [cs.LG], 21 Mar 2024
Emerging Memory Technology [ITRS 2015 meeting]
Solution: The idea is to introduce a memory hierarchy!
Figure: Focus is on non-volatile memory
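The benefit of a hierarchy can be sketched with the standard average memory access time (AMAT) formula: AMAT = hit time + miss rate * miss penalty, applied recursively from the last level back to the first. The latencies and miss rates below are hypothetical values chosen only to show the effect; they do not come from the slides.

```python
def average_access_time(levels, memory_latency):
    """Average memory access time for a cache hierarchy.

    levels: list of (hit_time, miss_rate) tuples ordered from the level
            closest to the processor to the level closest to memory.
    memory_latency: access time of the backing memory, in the same
                    unit as the hit times (for example, cycles).
    """
    penalty = memory_latency
    # Work backwards: the miss penalty of each level is the AMAT
    # of everything behind it.
    for hit_time, miss_rate in reversed(levels):
        penalty = hit_time + miss_rate * penalty
    return penalty


MEMORY_LATENCY = 200  # cycles, hypothetical

no_cache = average_access_time([], MEMORY_LATENCY)
one_level = average_access_time([(1, 0.05)], MEMORY_LATENCY)
two_levels = average_access_time([(1, 0.05), (10, 0.20)], MEMORY_LATENCY)

print("no cache  :", no_cache, "cycles")
print("L1 only   :", one_level, "cycles")
print("L1 and L2 :", two_levels, "cycles")
```

With these assumed numbers each added level cuts the average access time sharply, which is why adding levels, including new non-volatile ones, is a natural response to the memory wall.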
On-die Cache [Borkar et al.]
Figure: It will increase to mitigate the memory wall problem
- The upcoming technology is system in package (SiP), moving
away from the existing SoC
- Addressing the problem of memory wall: memory bandwidth and
Emerging Architecture
von Neumann
The traditional microprocessor
Non von Neumann
Cellular Automata
Co-located memory-logic [processor-in-memory,
memory-in-logic, nonvolatile logic]
Reconfigurable computing
Cognitive computing [neuromorphic, machine learning]
Statistical and stochastic computing [statistical inference,
approximate computing]
Emerging Architecture: Data and learning
In design or test production
GPU architecture (NVIDIA)
Tensor Processing Unit [Google ML processing for Search and
Language]
Microsoft Azure with Intel Stratix 10 FPGA
XPU by Xilinx for machine learning
IPU (Intelligence Processing Unit) by Graphcore
DPU (dataflow computing) by Wave Computing
DLU by Fujitsu
Trends in IoT and AI
Figure: IoT in use, market capital indicates user trends and applicability
- Find out the trend for AI applications and computing!
Summary and Action Items
Going beyond Moore’s Law
Sustainable computing - Power and energy remain a challenge
Increasing clock frequency in CPU as well as GPU
Increasing number of cores in CPU and GPU
Exploring accelerator level parallelism
NVIDIA:
What Is Energy-Efficient Computing? Sustainable computing is an
end-to-end approach that spans the entire lifecycle of technology,
designed to enhance energy efficiency and boost carbon mitigation.
From weather forecasting to chip manufacturing, accelerated
computing holds the potential to realize green computing
objectives, while saving costs.
Summary and Action Items
Figure: World Economic Forum: Global impact of sustainable computing -
energy is a problem
Reference and Reading Materials
J. L. Hennessy and D. A. Patterson, Computer Architecture: A
Quantitative Approach, Morgan Kaufmann Publishers, 5th
Edition.
J. P. Shen and M. H. Lipasti, Modern Processor Design, McGraw-Hill,
Crawfordsville, 2005
Current literature (from ISCA, MICRO, HPCA, ICCD, IEEE
Trans. on Computers, and IEEE Computer Architecture Letters)
Other resources:
http://pages.cs.wisc.edu/~arch/www/
Prof. Onur Mutlu: https://users.ece.cmu.edu/~omutlu/
Almost all universities have very strong research groups
SPEC Benchmark: https://www.spec.org/benchmarks.html
Simulators: SimpleScalar, Sniper, gem5, GPGPU-Sim, Tejas
DBLP: https://dblp.uni-trier.de/
Research in India
IISc Bangalore: Prof. Matthew Jacob, Prof. R. Govindarajan,
Dr. Uday Bondhugula, Dr. Arkaprava Basu, Prof. S. K. Nandy.
IIT Bombay: Prof. Virendra Singh, Prof. Sachin Patkar,
Prof. Madhav Desai, and Dr. Biswabandan Panda.
IIT Madras: Prof. V. Kamakoti, Dr. Rupesh Nasre, Prof.
Madhu Mutyam
IIT Delhi: Prof. Preeti Ranjan Panda, Prof. Smruti Sarangi
IIT Kanpur: Prof. Mainak Chaudhuri
IIT/IIIT Hyderabad: Dr. Mittal, Dr. Lavanya
Thank You
