A Breakthrough New CPU Architecture Revives IPC Scaling
Mohammad Abdallah, Founder, President and CTO
Linley Processor Conference 
October 23, 2014
Introducing Soft Machines™
• Emerging from stealth mode
• Developed new VISC™ Architecture
• 7 years, $125M R&D
• ~250 employees, 75+ patents filed
©Copyright 2014, All Rights Reserved
The Death of CPU Scaling
“The failure of CPU scaling after 30 years of continual improvements may have slammed the door on the easiest and most common type of performance scaling…” (“The Death of CPU Scaling”, ExtremeTech, 2012)
Microprocessor Scaling Realities after 2004
• Transistor scaling continues
• Clock speed flat
• Power budget flat
• Perf/clock flat
Source: “The Free Lunch is Over”, Herb Sutter
Industry Response: Multi-Core
[Diagram. Labels: Core1; Core2; Thread1; Thread2]
Advantages:
- Utilizes growing transistor budget
- Performance scaling for parallel code
- Improves throughput
Challenges:
- Single-thread (ST) performance doesn’t scale
- Threading/multicore coding complexity
- Amdahl’s Law of diminishing returns
- Dark silicon
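
The "Amdahl’s Law of diminishing returns" challenge listed on the multi-core slide can be illustrated with a short calculation. The sketch below is not from the presentation; the function name and the 80% parallel fraction are hypothetical values chosen only for illustration.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the work scales with cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: 80% of the run time parallelizes perfectly.
    for cores in (1, 2, 4, 8, 16):
        print(f"{cores:2d} cores -> {amdahl_speedup(0.8, cores):.2f}x")
```

Under this assumption the speedup can never exceed 5x, however many cores are added, because the serial 20% stays fixed.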
CPU Architecture Challenge
• Revive CPU performance scaling
• Utilize Moore’s Law transistor scaling
• Mitigate dark silicon
• Liberate ISA dependency
VISC™ Architecture Wave
Architecture wave | Era | Software Scalability/Productivity | Device Physics Scalability
CISC (IBM/Intel) | 1970s – early 1980s | Assembly | Short Pipeline; Code Memory size
RISC (MIPS) | Late 1980s – 2010s | Compilation | Deep OoO Pipeline; Processor Speed
VISC (Soft Machines) | 2010s | Concurrency Extraction | Virtual Cores/Threads; Processor Power
VISC Architecture scales on both physical and software productivity layers
VISC™ Processor Block Diagram
[Block diagram. Labels: Sequential Code; SW Single Thread; Global Front End; Virtual HW Threads (HW threadlets); Virtual Cores (Virtual Core1 to Virtual Core4); Core1 to Core4, each with an L1 D$; L2$ & Memory]
VISC™ CPU Usage Example
• VISC dynamically allocates resources across virtual cores based on individual application needs
• Performance/watt balanced for both single & multi-thread applications
[Diagram. Two alternative configurations of Core1 and Core2 (each with an L1 D$), joined by "or": one showing Virtual Core1 and Virtual Core2, the other showing Virtual Core1 only. Other labels: Dual SW Threads; Single SW Thread; Heavy App; Light App; Virtual Cores; Virtual HW Threads/Threadlets]
VISC™ Architecture Prototype Pipeline
Pipeline of Virtual Threads Across the Virtual Cores
[Pipeline diagram. Stage labels: Fetch; Allocate/Dispatch; EXE; Mem/long latency Execution; RF read; Virtual Thread Formation. Block labels: SW Single Thread; Global Front End; Virtual HW Threads (HW threadlets); Virtual Cores (Virtual Core1, Virtual Core2); Core1 and Core2, each with an L1 D$; L2$ & Memory]
VISC™ Revives IPC Curve
CPU | Compiled Code | Cache | Pipeline | IPC (SPEC 2006)*
ARM A15 1C | 32-bit | 1M | Moderate | 0.71
Intel Atom 1C | 32-bit | 2M | Moderate | 0.69
Soft Machines 2VC Proto | 32-bit | 1M | Shallow | 2.1
Apple A7 1C | 32-bit | 1M+4M | Moderate | 1.0
ARM A57 1C | 32-bit | 2M | Moderate | 0.87
Intel Haswell 1C | 64-bit | 2M | Deep | 1.39
* Company conducted benchmark tests and projections, using industry-standard Compiler GCC 4.6 or equivalent
Mobile CPU designs are pursuing higher ARCH/μARCH complexity
Year | Class | CPU | Width
2006 | The Basic | A8 | 2-way
2009 | The Simple | A9 | 2-way OoO
2011 | The Moderate | A15 | 3-way
2013 | The Big | Apple A7 | 6-way
2014 | The Ultimate | Haswell | 8-way
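
To put the IPC figures on this slide in relative terms, the sketch below divides the Soft Machines 2VC prototype figure by each of the other CPUs' figures. The numbers are the company-reported values from the slide; the dictionary and function names are hypothetical and only serve the illustration.

```python
# IPC (SPEC 2006) figures as listed on the slide (company-reported).
OTHER_CPU_IPC = {
    "ARM A15": 0.71,
    "Intel Atom": 0.69,
    "Apple A7": 1.0,
    "ARM A57": 0.87,
    "Intel Haswell": 1.39,
}
VISC_PROTO_IPC = 2.1  # Soft Machines 2VC prototype


def ipc_ratios(reference: float, others: dict[str, float]) -> dict[str, float]:
    """Return reference IPC divided by each other CPU's IPC."""
    return {name: reference / ipc for name, ipc in others.items()}


if __name__ == "__main__":
    for name, ratio in ipc_ratios(VISC_PROTO_IPC, OTHER_CPU_IPC).items():
        print(f"VISC 2VC proto vs {name}: {ratio:.2f}x")
```

These are simple ratios of the listed values and say nothing about clock frequency, so they compare per-clock throughput only.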
VISC™ Concurrency Extraction: Linear vs. Quadratic Complexity
• Extracting ILP has significant complexity
• OoO complexity increases quadratically with machine width
• VISC complexity increases linearly with number of virtual cores
• VISC Performance/Watt utilizes linear scaling
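
The linear-versus-quadratic claim above can be made concrete with a toy cost model. This is a minimal sketch under stated assumptions, not the presenter's model: it assumes the cost of an out-of-order core is exactly the square of its issue width, and that several narrow cores simply add up. Both function names are hypothetical.

```python
def monolithic_ooo_cost(width: int) -> int:
    """Toy cost of one wide out-of-order core: grows with the square of issue width."""
    return width ** 2


def virtual_core_cost(num_virtual_cores: int, width_per_core: int) -> int:
    """Toy cost of several narrow cores: each costs width^2, so the total is linear in core count."""
    return num_virtual_cores * width_per_core ** 2


if __name__ == "__main__":
    # Same aggregate width (8) built two ways, in arbitrary cost units.
    print("one 8-wide core:    ", monolithic_ooo_cost(8))
    print("four 2-wide cores:  ", virtual_core_cost(4, 2))
    # Doubling the core count doubles the cost; doubling the width quadruples it.
    print("eight 2-wide cores: ", virtual_core_cost(8, 2))
    print("one 16-wide core:   ", monolithic_ooo_cost(16))
```

The model ignores the cost of coordinating the narrow cores, which a real design would have to pay for.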
System Energy Approach: DRVFS
Virtual Cores – DRVFS
• DRVFS: linear increase in power
• P ∝ No. of virtual core resources
• Higher Perf/MHz enables DVFS scaling DOWN
Physical Cores – DVFS
• DVFS: quadratic increase in power
• P ∝ V² × F
• Lower Perf/MHz requires DVFS scaling UP
Use Case: Rush to low power mode (boosting performance or response time)
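
The two power relations on this slide can be compared numerically. The sketch below takes the slide's proportionalities at face value (power proportional to V² × F for DVFS, and to the number of virtual core resources for DRVFS) and works in relative units; the function names and the 20% scaling figures are hypothetical.

```python
def dvfs_relative_power(voltage_scale: float, frequency_scale: float) -> float:
    """Relative dynamic power under DVFS, using P proportional to V^2 * F."""
    return voltage_scale ** 2 * frequency_scale


def drvfs_relative_power(resource_scale: float) -> float:
    """Relative power under the slide's linear model: P proportional to virtual core resources."""
    return resource_scale


if __name__ == "__main__":
    # Hypothetical boost: raise both voltage and frequency by 20%,
    # versus allocating 20% more virtual core resources.
    print(f"DVFS,  V and F +20%:   {dvfs_relative_power(1.2, 1.2):.3f}x power")
    print(f"DRVFS, resources +20%: {drvfs_relative_power(1.2):.3f}x power")
```

The example does not model how much performance each option buys; it only shows how differently power grows under the two relations.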
VISC™ Single Thread SPEC/Watt
Same performance at 1/4 to 1/3 of the power, or 1.7-2.2x performance at the same power*
[Chart: Single Thread Performance vs. Power for Mobile and Server, comparing a 1C App CPU with 1VC (2C) and 1VC (4C); annotated values: 1.7x, 1/3, 1/4, 2.1x, 1.8x, 2.2x]
* Company conducted benchmark tests and projections for 28nm
VISC™ Dual Thread SPEC/Watt
Same performance at 0.4x to 0.5x of the power, or 1.4-1.9x performance at the same power*
[Chart: Dual Thread Performance vs. Power for Mobile and Server, comparing a 2C App CPU with 2VC (2C) and 2VC (4C); annotated values: 1.4x, 1.5x, 1/2, 0.4x, 1.8x, 1.9x]
* Company conducted benchmark tests and projections for 28nm
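
The "same performance at a fraction of the power" figures on the single-thread and dual-thread SPEC/Watt slides translate directly into performance-per-watt gains. The sketch below does that arithmetic with exact fractions; the function name is hypothetical, and the inputs are the company-reported fractions from those two slides.

```python
from fractions import Fraction


def perf_per_watt_gain(power_fraction: Fraction) -> Fraction:
    """Performance/watt gain when the same performance needs `power_fraction` of the baseline power."""
    if power_fraction <= 0:
        raise ValueError("power_fraction must be positive")
    return 1 / power_fraction


if __name__ == "__main__":
    cases = {
        "single thread, 1/3 power": Fraction(1, 3),
        "single thread, 1/4 power": Fraction(1, 4),
        "dual thread, 0.5x power": Fraction(1, 2),
        "dual thread, 0.4x power": Fraction(2, 5),
    }
    for label, fraction in cases.items():
        print(f"{label}: {float(perf_per_watt_gain(fraction)):.1f}x perf/watt")
```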
VISC™ Technology Prototype
Working Silicon
• VISC Processor Proof-of-Concept Prototype
- IPC scalability
- VISC architecture
- Software efficiency
• Full Platform
- VISC Dual Virtual Core Processor
- SoC with 3D, Video, DRAM controller, HD video…
• Full System functionality
- Linux OS
- UEFI BIOS
- Benchmarks running on Linux
- Android ICS booting
Silicon Results: Performance/MHz
[Chart: Dual Virtual Core/A15 IPC Ratio]
VISC™ Architecture
[Diagram. Labels: Guest Sequential Code; Single Thread; OS & Hypervisor; Guest ISA; Virtual SW layer; Virtual ISA; Global Front End; Virtual HW Threads/Threadlets; Virtual Cores (Virtual Core1 to Virtual Core4); Core1 to Core4, each with an L1 D$; L2$ & Memory]
VISC™ Run-time SW Architecture
[Diagram. Labels: Guest Code (ARM, X86); High level Virtual Machine; Low level Virtual Machine; Converter; Dynamic optimization; Hot Pass; Guest/VM to native mapping; Native Code; SMI API; VISC™ Processor]
Summary
• Silicon-proven VISC™ architecture delivers 3-4x IPC advantage on single and multi-threaded applications without software changes
- Resulting in ~2-4x performance/watt advantage
• VISC architecture is scalable from IoT to mobile to servers due to its modularity and symmetry
- Number of virtual cores, virtual threads, and virtual instruction layer
• VISC virtual instruction layer provides ISA-agnostic and optimized run-time platform capabilities
