Perry Lea
4.17.15
Architectural Failures
Hybrid & Heterogeneity
Typical SOC design
Emerging Trends
Heterogeneous Compute is the Future
What are the trends in industrial adoption of new computing elements?
Hewlett Packard
Micron
21 Years
IEEE Senior Member
ACM Senior Member
Embedded Computing
Distinguished Technologist
Chief Architect
Imaging and Embedded Systems
ASIC and SOC Design
Software Engineer
Firmware Engineer
Advanced Memory Systems Architect
Patents
ARM
MIPS
BS: Computer Science
MS: Computer Engineering
PD: Electrical Engineering
Columbia University
Heterogeneous Computing
Hardware Architect
Publications
Transmeta
Intel
HP Labs: The Machine
memristor
Talking about something as a failure will lead to
debate. I’m fine debating.
What I ask is we define failure as:
Why didn’t these architectures stand
alone?
Architecture, cost, market, timing?
•Thinking Machines: founded on Danny Hillis’s PhD work in 1983, Cambridge, MA.
•Array of bit-serial processors.
•Hypercube interconnect.
•Near memory (4K bit) microprocessors.
•Not commercially viable until they added an OTS FPU.
•Originally programmed in LISP and later C.
•Later CM5 moved to MIMD architecture and a fat tree network.
•Peak performance: 20 Gflops
•Max revenue: $65M
•MasPar: founded by Jeff Kalb (DEC), 1987 in Sunnyvale.
•Based on SIMD architecture
•1K to 16K processors on MP1
•Custom logic, fab’d by HP and TI.
•Required a front end VAX 11/780
•Originally programmed in Fortran and C on
front end and MPL to run on MasPar
•Marketed as “a general purpose computer system”
•Peak performance: 1.2 Gflops
•Max revenue: $20M
•Multiflow: founded by Josh Fisher, 1984 in New Haven, CT.
•Based on VLIW architecture and trace scheduling
•Ran Unix
•7 - 32 bit parallel ops
•125 units sold
•Peak performance: 100 Mflops
•Max revenue: $15M
Company            Primary Architecture           Start  Chapter 11 Filing Date  Where are they now?
Thinking Machines  Massively parallel bit-serial  1983   1994                    SUN/Oracle
nCube              Distributed MIMD hypercube     1983   ~2005                   IP used for video on demand
Meiko              Mesh of Inmos Transputers      1985   2003 / 2009             Bought by Quadrics - defunct
Multiflow          VLIW                           1984   1990                    HP ST
MasPar             Massive SIMD array             1987   1996                    Remnants in data mining
1. The 80’s were a good time to be a computer architect.
2. Novel architectures to solve all the world’s problems probably won’t last.
17 unique designs & architectures.
5 unique designs & architectures.
“Heterogeneous System Architecture is a type of
computer processor architecture that integrates central
processing units and graphics processors on the same
bus …”
While that is true, I contend that a true heterogeneous architecture matches
the right core/ISA/processor to the job.
A GPU/CPU blend only gets you so far.
Bus contention and
saturation
Inter-processor
communication
MPI may not exist.
Fixed function ownership
Deadlock propagation
Example of bus saturation in a
heterogeneous SOC
•Unpredictable code flow paths
•Legacy code base and application
•Naturally aligns to fine grain parallelism
•Cost is less of a factor
•Code has real-time requirements
•Dataflow is naturally streaming in application
•Design unproven, needs programmatic flexibility
•Cost somewhat a factor
•Code is embarrassingly parallel
•Code aligns well with SMT
•Cost not an object
•Code must perform small kernels of execution
in very low power
•Code and data have security implications
•Die area concerns
CPU
SMP Traditional
DSP / VLIW
uController
GPU
•Code has real-time requirements
•Dataflow is naturally streaming in application
•Fixed IO dependencies
•Design can be hardened
•Ultimate performance
•NRE high, risk high
Fixed Function
SI
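The trait lists above can be read as a rough selection guide. The sketch below is only one plausible reading of the slide: the pairing of each trait group with a compute element is inferred, the trait names are hypothetical labels, and cost factors are left out.

```python
# Hypothetical trait labels; the pairing of traits to compute elements is an
# inferred reading of the slide, not a rule stated by the author.
ELEMENT_TRAITS = {
    "CPU (SMP, traditional)": {
        "unpredictable_control_flow", "legacy_code", "fine_grain_parallelism",
    },
    "DSP / VLIW": {
        "real_time", "streaming_dataflow", "needs_programmability",
    },
    "GPU": {
        "embarrassingly_parallel", "smt_friendly",
    },
    "uController": {
        "small_low_power_kernels", "security_sensitive", "die_area_limited",
    },
    "Fixed function": {
        "real_time", "streaming_dataflow", "fixed_io", "design_can_be_hardened",
    },
}


def rank_elements(workload_traits):
    """Rank compute elements by how many of the workload's traits they match."""
    scores = {
        name: len(traits & workload_traits)
        for name, traits in ELEMENT_TRAITS.items()
    }
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    workload = {"real_time", "streaming_dataflow", "fixed_io"}
    for name, score in rank_elements(workload):
        print(f"{name}: {score} matching traits")
```

A real SOC partitioning decision would also weigh NRE, risk, and die area, which a simple trait count does not capture.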
SOCs and ASSPs amortize
system cost (power, board
area, die) into a single die and
package
Push as much functionality
into a single package as
possible.
SOC data flow optimized for
particularly narrow use cases.
Heterogeneous is more than a
CPU + GPU
Typical SOCs currently blend together
many cores, much fixed function
silicon, many OSes, and many code
bases.
VMs are adopted to ease software
migration
2 Symmetric or asymmetric CPUs
running some RTOS
Vivante GC 2D/3D GPU running
multiple threads.
DSP running unique RTOS
Others…
Marvell Armada 1500
Courtesy Marvell
Actually 8 cores in SOC
Potentially 6 operating systems
NAND: RTOS
DSP: small DSP OS (STOS)
Secure Core: TEE
Front Panel: uKernel
ARM 1: Green Hills RTOS
ARM 2: Linux
TS Processor: ThreadX
Is it the 1980’s again?
What emerging architectures may
find their way into an SOC?
Collapsed Memory Stacks
Neuromorphic Computing
Managed Language Accelerators
Computing Memories
Computational RAM
The Machine
Combination of:
Collapsed memory stack,
Near memory SOC
compute (Moonshot-like)
Photonic interconnect.
Pervasive from IoT/embedded
markets to exascale.
New OS being created (Linux++)
with university collaboration
Designed from the ground up to
address performance, security,
and data locality
Claims:
Problem Size: 2^42 edges vs Blue
Gene Q @ 2^40 edges
Performance: 16 GTEPS vs Blue Gene
Q @ 15.3 GTEPS
Power: 400 kW vs Blue Gene Q at
7,900 kW
Cores/Racks: 122K Cores/20 Racks vs
Blue Gene Q @ 1.6M Cores/96 Racks
Utilization < 70%
The memristor is optional in this
architecture.
PCM may be a backup option.
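The claimed figures can be turned into efficiency ratios with simple arithmetic. The sketch below uses only the numbers quoted in the claims; the dictionary layout and function names are illustrative, and the figures are claims from the slide, not measurements.

```python
# Figures as quoted on the slide (claims, not independent measurements).
CLAIMS = {
    "The Machine": {"gteps": 16.0, "power_kw": 400.0, "cores": 122_000, "racks": 20},
    "Blue Gene Q": {"gteps": 15.3, "power_kw": 7_900.0, "cores": 1_600_000, "racks": 96},
}


def mteps_per_kw(system):
    """Millions of traversed edges per second delivered per kilowatt."""
    return system["gteps"] * 1000.0 / system["power_kw"]


def mteps_per_core(system):
    """Millions of traversed edges per second delivered per core."""
    return system["gteps"] * 1000.0 / system["cores"]


if __name__ == "__main__":
    machine = CLAIMS["The Machine"]
    blue_gene = CLAIMS["Blue Gene Q"]
    for name, system in CLAIMS.items():
        print(f"{name}: {mteps_per_kw(system):.2f} MTEPS/kW, "
              f"{mteps_per_core(system):.4f} MTEPS/core")
    print(f"Energy-efficiency ratio: "
          f"{mteps_per_kw(machine) / mteps_per_kw(blue_gene):.1f}x")
    print(f"Problem-size ratio: {2**42 // 2**40}x")
```

Taken at face value, the claims work out to 40 MTEPS/kW against roughly 1.9 MTEPS/kW, about a 20x difference in energy efficiency at near-equal raw throughput.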
2 types of computing memory with data stored
as resistance.
Memristor crossbar
Material Implication “IMP” architecture
p implies q…
If p then q
Adding memristor IMP components would
double the size of the die, but yield a 1000x
performance improvement.
Still need silicon gate to drive current.
Memristors don’t “drive” anything.
Synthesizing bulk memristors with any
form of yield is still unproven.
Truth table of boolean logic built with memristor IMP
Courtesy Shahsavari 2010
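Material implication is enough to build general Boolean logic once a constant FALSE is available. The sketch below models IMP in software and derives NOT, NAND, and OR from it; it is a logic illustration only and says nothing about how a memristor circuit would be wired or timed.

```python
from itertools import product


def imp(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def not_(p):
    """NOT p, built as p IMP FALSE."""
    return imp(p, False)


def nand(p, q):
    """p NAND q, built as p IMP (q IMP FALSE)."""
    return imp(p, imp(q, False))


def or_(p, q):
    """p OR q, built as (p IMP FALSE) IMP q."""
    return imp(imp(p, False), q)


if __name__ == "__main__":
    print("p     q     IMP   NAND  OR")
    for p, q in product([False, True], repeat=2):
        row = [p, q, imp(p, q), nand(p, q), or_(p, q)]
        print("  ".join(f"{str(value):<5}" for value in row).rstrip())
```

Because NAND alone is functionally complete, any combinational function can in principle be composed from IMP steps plus a FALSE constant.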
Automata Processor
Micron fabricated DRAM
Non von Neumann dataflow
48K state transition elements
per chip
6.6T path decisions/s
4W max TDP
DDR3 RAM interface
State Machine Compiler and Unique Tools
Pattern matching on von Neumann CPU: O(n^2)
Pattern matching on Automata O(1)
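The contrast in the two complexity figures comes from how state transitions are evaluated. The sketch below is a plain software NFA simulator, not Micron's toolchain or API: a CPU visits the active states one after another for each input symbol, whereas hardware with one element per state could evaluate them all within the same symbol cycle. The transition-table format and function name are assumptions made for illustration.

```python
def nfa_match_positions(transitions, start_states, accept_states, text):
    """Return the end indices in text at which the NFA reaches an accept state.

    transitions maps (state, symbol) to the set of next states.
    """
    hits = []
    active = set()
    for index, symbol in enumerate(text):
        # Re-arm the start states at every position so a match may begin anywhere.
        candidates = active | set(start_states)
        # Software walks the candidate states sequentially; this inner loop is
        # the work that per-state hardware elements would do in parallel.
        active = set()
        for state in candidates:
            active |= transitions.get((state, symbol), set())
        if active & set(accept_states):
            hits.append(index)
    return hits


if __name__ == "__main__":
    # Hypothetical four-state automaton recognizing the literal "abc".
    abc = {(0, "a"): {1}, (1, "b"): {2}, (2, "c"): {3}}
    print(nfa_match_positions(abc, {0}, {3}, "xxabcabc"))
```

In software the cost per symbol grows with the number of simultaneously active states, which is the overhead the slide's comparison is pointing at.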
Automata Processor to full ALU support
Berkeley iRAM
VIRAM
CRAM Experiments during IRAM era
Courtesy Elliot, 1999
Direct Bytecode Execution (DBX)
No need to JIT
Improves startup time.
Reduces code inflation with JIT process
(~8x)
ARM Jazelle
140 Java instructions are directly
executed
94 are emulated in short bursts of ARM
instructions
12K silicon gates
Dalvik VM put it out of business.
No JIT
Register based
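The stack versus register distinction can be shown with two toy interpreters. The sketch below evaluates a + b * c on each; the opcodes are invented for illustration and are not real Java or Dalvik bytecodes.

```python
def run_stack(program, variables):
    """Execute a toy stack-machine program and return the top of stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(variables[args[0]])
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


def run_register(program, variables, result):
    """Execute a toy three-address register program and return one register."""
    regs = dict(variables)
    for op, dest, src1, src2 in program:
        if op == "mul":
            regs[dest] = regs[src1] * regs[src2]
        elif op == "add":
            regs[dest] = regs[src1] + regs[src2]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs[result]


# a + b * c expressed for each machine (hypothetical opcodes).
STACK_PROGRAM = [("push", "a"), ("push", "b"), ("push", "c"), ("mul",), ("add",)]
REGISTER_PROGRAM = [("mul", "t", "b", "c"), ("add", "t", "a", "t")]

if __name__ == "__main__":
    values = {"a": 2, "b": 3, "c": 4}
    print(run_stack(STACK_PROGRAM, values), len(STACK_PROGRAM), "instructions")
    print(run_register(REGISTER_PROGRAM, values, "t"),
          len(REGISTER_PROGRAM), "instructions")
```

The register form needs fewer dispatched instructions for the same expression, which matters most when every instruction is interpreted rather than compiled.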
Back in 2004 this data looked phenomenal
Build a non-Von Neumann
architecture based on “leaky
integrate and fire” CMOS topology.
Each core (4096) models 256
“neurons” with 256 “synapses” on a
5B-transistor (28 nm) die.
System clock is very slow.
Memory units tightly coupled to
“neurons”. IBM refers to this as the
TrueNorth architecture.
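A leaky integrate-and-fire neuron is simple to model in discrete time. The sketch below is a generic textbook-style model, not IBM's circuit or programming interface; the leak, threshold, and reset values are arbitrary assumptions. It also multiplies out the core and neuron counts quoted above.

```python
CORES = 4096             # figure quoted on the slide
NEURONS_PER_CORE = 256   # figure quoted on the slide


def simulate_lif(input_current, leak=0.1, threshold=1.0, reset=0.0):
    """Return a 0/1 spike train for a discrete-time leaky integrate-and-fire neuron."""
    potential = reset
    spikes = []
    for current in input_current:
        # Leak a fraction of the stored potential, then integrate the new input.
        potential = potential * (1.0 - leak) + current
        if potential >= threshold:
            # Fire and return to the resting potential.
            spikes.append(1)
            potential = reset
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    print("Total neurons:", CORES * NEURONS_PER_CORE)
    print("Spike train:", simulate_lif([0.5, 0.5, 0.5, 0.5]))
```

With the default leak, two half-strength inputs are not quite enough to fire, so the neuron spikes on the third input instead; this is the "leaky" behavior that distinguishes it from a plain accumulator.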
Goals:
10B cores, 100T connections
(synapses)
1KW (45pJ per compute)
2 liters of space
TrueNorth Architecture
Uses simple leaky integrate and fire CMOS.
1st parts fabricated at HRL Labs
Total current funding: $102M since 2009
Timeline
Phase 2 (2013): multi-core synaptic
processor based on TrueNorth. 1M
neurons (256 neurons per core – 4000
cores)
Phase 3 (??): fabricate 10M core.
Simulate mouse.
Phase 4 (2017): 100M core
Technology: The Machine
Challenges: Large OS hurdle. Persistent memory yields. Doubtful SOC benefits.
Prediction: Select customer adoption. Low penetration in mobile or embedded.

Technology: Computing Memory
Challenges: Difficult to program.
Prediction: Uncertain application. Crossbar may have genesis in FPGA alternative.

Technology: Computational RAM
Challenges: Toolchain. Automata programming. Application limits.
Prediction: Possible HPC and mobile penetration as technology matures.

Technology: Managed Language Accelerators
Challenges: Modern JIT engines changing rapidly.
Prediction: Low acceptance.

Technology: Neuromorphic Engines
Challenges: Huge die size and cost. Programming difficulties.
Prediction: Acceptance in research and limited HPC application.

Industrial trends in heterogeneous and esoteric compute

  • 2.
  • 3. Architectural Failures Hybrid & Heterogeneity Typical SOC design Emerging Trends Heterogeneous Compute is the Future What are the trends in industrial adoption of new computing elements?
  • 4. Hewlett Packard Micron 21 Years IEEE Senior Member ACM Senior Member Embedded Computing Distinguished TechnologistChief Architect Imaging and Embedded Systems ASIC and SOC Design Software Engineer Firmware Engineer Advanced Memory Systems Architect Patents ARM MIPS BS: Computer Science MS: Computer EngineeringPD: Electrical Engineering Columbia University Heterogeneous Computing Hardware Architect Publications Transmeta Intel HP Labs: The Machine memristor
  • 5.
  • 6.
  • 7. Talking about something as a failure will lead to debate. I’m fine debating. What I ask is we define failure as: Why didn’t these architectures stand alone? Architecture, cost, market, timing?
  • 8. •Founded by Danny Hillis’s PhD work in 1983, Cambridge MA.. •Array of bit-serial processors. •Hypercube interconnect. •Near memory (4K bit) microprocessors. •Not commercially viable until they added an OTS FPU. •Originally programmed in LISP and later C. •Later CM5 moved to MIMD architecture and a fat tree network. •Peak performance: 20 Gflops •Max revenue: $65M
  • 9. •Founded by Jeff Kalb (DEC), 1987 in Sunnyvale. •Based on SIMD architecture •1K to 16K processors on MP1 •Custom logic, fab’d by HP and TI. •Required a front end VAX 11/780 •Originally programmed in Fortran and C on front end and MPL to run on MasPar •Marketed as “a general purpose computer system” •Peak performance: 1.2 Gflops •Max revenue: $20M
  • 10. •Founded by Josh Fisher, 1984 in New Haven, CT. •Based on VLIW architecture and trace scheduling •Ran Unix •7 - 32 bit parallel ops •125 units sold •Peak performance: 100 Mflops •Max revenue: $15M
  • 11. Company Primary Architecture Start Chapter 11 Filing Date Where are they now? Thinking Machine Massively Parallel bit-serial 1983 1994 SUN/Oracle nCube Distribute MIMD Hypercube 1983 ~2005 IP used for video on demand Meiko Mesh of Inmos Transputers 1985 2003 / 2009 Bought by Quadrics - defunct Multiflow VLIW 1984 1990 HP  ST Maspar Massive SIMD Array 1987 1996 Remnants in data mining 1. The 80’s were a good time to be a computer architect. 2. Novel architectures to solve all the world’s problems probably won’t last.
  • 12.
  • 14. “Heterogeneous System Architecture is a type of computer processor architecture that integrates central processing units and graphics processors on the same bus …” While that is true, I contend that a true heterogeneous architecture blends the right core/ISA/processor to the job. A GPU/CPU blend only gets you so far..
  • 15. Bus contention and saturation Inter-processor communication MPI may not exist. Fixed function ownership Deadlock propagation Example of bus saturation in a heterogeneous SOC
  • 16. •Unpredictable code flow paths •Legacy code base and application •Naturally aligns to fine grain parallelism •Cost is less of a factor •Code has real-time requirements •Dataflow is naturally streaming in application •Design unproven, needs programmatic flexibility •Cost somewhat a factor •Code is embarrassingly parallel •Code aligns will with SMT •Cost not an object •Code must perform small kernels of execution in very low power •Code and data have security implications •Die area concerns CPU SMP Traditional DSPVLIW uController GPU •Code has real-time requirements •Dataflow is naturally streaming in application •Fixed IO dependencies •Design can be hardened •Ultimate performance •NRE high, risk high Fixed Function SI
  • 17. SOCs and ASSPs amortize system cost (power, board area, die) into a single die and package. Push as much functionality into a single package as possible. SOC data flow is optimized for particularly narrow use cases. Heterogeneous is more than a CPU + GPU.
  • 18. Typical SOCs currently blend together many cores, much fixed function silicon, many OSes, and many code bases. VMs are adopted to ease software migration. Example: Marvell Armada 1500 (courtesy Marvell): 2 symmetric or asymmetric CPUs running some RTOS; Vivante GC 2D/3D GPU running multiple threads; DSP running a unique RTOS; others… Actually 8 cores in the SOC and potentially 6 operating systems. NAND: RTOS. DSP: small DSP OS (STOS). Secure Core: TEE. Front Panel: uKernel. ARM 1: Green Hills RTOS. ARM 2: Linux. TS Processor: ThreadX.
  • 19. Is it the 1980s again? What emerging architectures may find their way into an SOC?
  • 20. Collapsed Memory Stacks; Neuromorphic Computing; Managed Language Accelerators; Computing Memories; Computational RAM
  • 21. The Machine. Combination of: collapsed memory stack, near memory SOC compute (Moonshot-like), photonic interconnect. Pervasive from IoT/embedded markets to exascale. New OS being created (Linux++) with university collaboration. Designed from the ground up to address performance, security, and data locality.
  • 22. Claims: Problem size: 2^42 edges vs. Blue Gene Q @ 2^40 edges. Performance: 16 GTEPS vs. Blue Gene Q @ 15.3 GTEPS. Power: 400 kW vs. Blue Gene Q @ 7,900 kW. Cores/racks: 122K cores/20 racks vs. Blue Gene Q @ 1.6M cores/96 racks. Utilization < 70%. The memristor is optional in this architecture. PCM may be a backup option.
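The claimed figures are easier to compare once normalized. The sketch below takes the numbers quoted on the slide at face value and derives throughput per kilowatt; the dictionary layout and the `mteps_per_kw` helper are illustrative names, not anything from the original deck, and the result is only as good as the claims themselves.

```python
# Figures copied from the slide's claims; the per-kW metric is a derived ratio.
machine = {"gteps": 16.0, "power_kw": 400.0}
blue_gene_q = {"gteps": 15.3, "power_kw": 7_900.0}


def mteps_per_kw(system):
    """Millions of traversed edges per second delivered per kilowatt."""
    return system["gteps"] * 1000.0 / system["power_kw"]


machine_eff = mteps_per_kw(machine)        # 40.0 MTEPS/kW
bgq_eff = mteps_per_kw(blue_gene_q)        # roughly 1.94 MTEPS/kW
efficiency_ratio = machine_eff / bgq_eff   # roughly 20.7x in favor of the claim
problem_size_ratio = 2**42 // 2**40        # 4x more edges

print(f"The Machine (claimed): {machine_eff:.1f} MTEPS/kW")
print(f"Blue Gene Q:           {bgq_eff:.2f} MTEPS/kW")
print(f"Efficiency ratio:      {efficiency_ratio:.1f}x")
print(f"Problem size ratio:    {problem_size_ratio}x")
```

If the claims hold, nearly all of the advantage is in power, since raw GTEPS differs by less than 5%.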
  • 23. 2 types of computive memory with data stored as resistance: memristor crossbar, and material implication (“IMP”) architecture (p implies q… if p then q). Adding memristor IMP components would double the size of the die, but yield a 1000x performance improvement. Still need a silicon gate to drive current. Memristors don’t “drive” anything. Still unproven to synthesize bulk memristors with any form of yield. Figure: truth table of boolean logic built with memristor IMP (courtesy Shahsavari 2010).
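Material implication can be made concrete in a few lines. The sketch below models IMP purely as boolean logic (no device physics) and shows the standard construction of NAND from two IMP steps plus a reset, which is why IMP together with FALSE is enough to build any boolean function. The function names and the work cell `s` are hypothetical, chosen only for illustration.

```python
def imp(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def nand_from_imp(p, q):
    """NAND built from a reset plus two IMP steps on a work cell s."""
    s = False        # reset the work cell to FALSE
    s = imp(q, s)    # s now holds NOT q
    s = imp(p, s)    # s now holds (NOT p) OR (NOT q), i.e. NAND(p, q)
    return s


# Truth table for p IMP q, in the order (p, q, result).
truth_table = [(p, q, imp(p, q)) for p in (False, True) for q in (False, True)]

for p, q, result in truth_table:
    print(f"p={p!s:5} q={q!s:5} p IMP q={result!s:5} NAND={nand_from_imp(p, q)}")
```

Each IMP step overwrites the target cell, which mirrors the idea that the memory element both stores the operand and receives the result.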
  • 24. Automata Processor. Micron fabricated DRAM. Non von Neumann dataflow. 48K state transition elements per chip. 6.6T path decisions/s. 4W max TDP. DDR3 RAM interface. State machine compiler and unique tools. Pattern matching on von Neumann CPU: O(n^2). Pattern matching on Automata: O(1).
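A software analogy may help with the idea of many state transition elements evaluating the same input symbol at once. The sketch below is the classic bit-parallel Shift-And matcher, where each bit of an integer stands in for one state and all states advance in a single step per input character. It is not Micron's toolchain or programming model; `build_masks` and `find_matches` are hypothetical names, and the analogy only holds for simple literal patterns.

```python
def build_masks(pattern):
    """For each symbol, set bit i if pattern[i] accepts that symbol."""
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    return masks


def find_matches(pattern, text):
    """Return the end index of every occurrence of pattern in text."""
    if not pattern:
        return []
    masks = build_masks(pattern)
    accept = 1 << (len(pattern) - 1)   # bit of the final state
    active = 0                         # bit i set means state i is active
    ends = []
    for pos, ch in enumerate(text):
        # Every active state advances at once; state 0 is always re-armed.
        active = ((active << 1) | 1) & masks.get(ch, 0)
        if active & accept:
            ends.append(pos)
    return ends


print(find_matches("aba", "ababa"))
```

The work per input symbol is constant as long as the states fit in a machine word, which is the software counterpart of the per-symbol behavior the slide attributes to the hardware.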
  • 25.
  • 26. Automata Processor to full ALU support: Berkeley IRAM, VIRAM, CRAM. Experiments during the IRAM era. Courtesy Elliot, 1999.
  • 27. Direct Bytecode Execution (DBX): no need to JIT. Improves startup time. Reduces code inflation from the JIT process (~8x). ARM Jazelle: 140 Java instructions are directly executed; 94 are emulated in short bursts of ARM instructions. 12K silicon gates. Dalvik VM put it out of business: no JIT, register based. Back in 2004 this data looked phenomenal.
  • 28. Build a non-von Neumann architecture based on “leaky integrate and fire” CMOS topology. Each core (4096) models 256 “neurons” with 256 “synapses” on a 5B transistor (28nm) die. System clock is very slow. Memory units tightly coupled to “neurons”. IBM refers to this as the TrueNorth architecture. Goals: 10B cores, 100T connections (synapses), 1 kW (45pJ per compute), 2 liters of space. Figure: TrueNorth architecture, which uses simple leaky integrate and fire CMOS.
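For readers unfamiliar with "leaky integrate and fire", here is a minimal discrete-time sketch of a single neuron, followed by the chip-level totals implied by the slide's numbers. The leak, threshold, and weights are made-up illustrative values, not TrueNorth parameters, and reading "256 synapses" as per-neuron is an assumption.

```python
def simulate_lif(input_spikes, weights, leak=1, threshold=4):
    """Simulate one leaky integrate-and-fire neuron.

    input_spikes: one list per clock tick holding a 0/1 spike per synapse.
    Returns one boolean per tick: True if the neuron fired on that tick.
    """
    potential = 0
    fired = []
    for tick_inputs in input_spikes:
        # Integrate: add the weight of every synapse that spiked this tick.
        potential += sum(w * s for w, s in zip(weights, tick_inputs))
        # Leak: potential decays each tick, but never below zero.
        potential = max(potential - leak, 0)
        # Fire: emit a spike and reset when the threshold is reached.
        if potential >= threshold:
            fired.append(True)
            potential = 0
        else:
            fired.append(False)
    return fired


spikes = simulate_lif([[1, 1], [1, 1], [0, 0], [1, 0]], weights=[2, 1])
print(spikes)

# Totals implied by the slide (assumes 256 synapses per neuron).
CORES, NEURONS_PER_CORE, SYNAPSES_PER_NEURON = 4096, 256, 256
total_neurons = CORES * NEURONS_PER_CORE
total_synapses = total_neurons * SYNAPSES_PER_NEURON
print(total_neurons, total_synapses)
```

Because state only changes when spikes arrive, a slow system clock is not the handicap it would be for a conventional processor.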
  • 29. 1st parts fabricated at HRL Labs. Total current funding: $102M since 2009. Timeline: Phase 2 (2013): multi-core synaptic processor based on TrueNorth, 1M neurons (256 neurons per core, 4000 cores). Phase 3 (??): fabricate 10M core. Simulate mouse. Phase 4 (2017): 100M core.
  • 30. Technology | Challenges | Prediction
      The Machine | Large OS hurdle. Persistent memory yields. Doubtful SOC benefits. | Select customer adoption. Low penetration in mobile or embedded.
      Computive Memory | Difficult to program. Uncertain application. | Crossbar may have genesis in FPGA alternative.
      Computational RAM | Toolchain. Automata programming. Application limits. | Possible HPC and mobile penetration as technology matures.
      Managed Language Accelerators | Modern JIT engines changing rapidly. | Low acceptance.
      Neuromorphic Engines | Huge die size and cost. Programming difficulties. | Acceptance in research and limited HPC application.