FalconeXt Semiconductor has developed a new parallel processing architecture called FALCON that can provide significant improvements in power reduction, performance increase, size reduction, and cost reduction compared to CPUs, GPUs, and other parallel processors. FALCON aims to enable low power, low cost IoT devices by optimizing silicon size and power consumption through its distributed processing approach and memory efficiency. FalconeXt believes FALCON can increase performance by up to 100x while reducing power by 50x and silicon size/cost also by 50x, addressing the needs of markets like smartphones, cloud computing, graphics/video, IoT, and medical imaging.
2. FalconeXt Semiconductor Confidential
FalconeXt Semiconductor, The Company
FalconeXt is a new IP startup that invented a new, revolutionary parallel processor architecture that enables up to 50x improvement over multi-core CPUs, DSPs, or GPUs in:
Power reduction
Performance increase
Size reduction
Cost reduction
[Diagram: FALCon Processing Units enable Size, Power, Cost, Performance]
3. FalconeXt Semiconductor Confidential
FalconeXt Semiconductor, The Market
Falcon markets:
Smartphones and mobile devices: decrease the size, reduce cost by 50%
Cloud (data center) computing: increase efficiency by reducing size, power, and heat
Graphics and video: increase the number of GPUs, and with it computing power, by 3x
Internet of Things and wearables: smaller, very low-power devices
Medical imaging: faster analysis on the same silicon size
* Total Parallel Processing Market: $50B/Year
5. FalconeXt Semiconductor Confidential
Silicon Size Optimization
Processing Unit Silicon Size Reduction
Memory size per processing unit is reduced by at least 48x
Based on a third-party Falcon-based design, a Falcon unit's silicon size is about 7x smaller than that of a competing parallel processor (GPU)
[Diagram: Processing Unit compared with Falcon Processing Unit]
Application Optimization
Reduction in the number of processing units required to implement the application
A simple example of an application optimization algorithm is the symmetric FIR filter, which gives a 2x optimization factor
More than 10 application optimization algorithms are built into the Falcon hardware
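The 2x factor for a symmetric FIR filter comes from folding: because the coefficients mirror around the center (h[k] = h[N-1-k]), the two input samples that share a coefficient can be added first and multiplied once. The sketch below is a minimal illustration in plain Python, not Falcon's hardware implementation; the function names, input samples, and coefficients are hypothetical.

```python
def fir_direct(x, h):
    """Direct-form FIR: one multiplication per tap per output sample."""
    n_taps = len(h)
    out, mults = [], 0
    for n in range(n_taps - 1, len(x)):
        acc = 0
        for k in range(n_taps):
            acc += h[k] * x[n - k]
            mults += 1
        out.append(acc)
    return out, mults


def fir_symmetric(x, h):
    """Folded FIR for symmetric coefficients (h[k] == h[N-1-k]).

    Samples that share a coefficient are pre-added, so an even-length
    filter needs N/2 multiplications per output instead of N.
    """
    n_taps = len(h)
    assert all(h[k] == h[n_taps - 1 - k] for k in range(n_taps)), "h must be symmetric"
    half = n_taps // 2
    out, mults = [], 0
    for n in range(n_taps - 1, len(x)):
        acc = 0
        for k in range(half):
            # Pre-adder: sum the two samples that use coefficient h[k].
            acc += h[k] * (x[n - k] + x[n - (n_taps - 1 - k)])
            mults += 1
        if n_taps % 2:
            # Odd length: the center tap has no partner sample.
            acc += h[half] * x[n - half]
            mults += 1
        out.append(acc)
    return out, mults


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # hypothetical input samples
h = [1, 3, 3, 1]                     # hypothetical symmetric coefficients
y_direct, m_direct = fir_direct(x, h)
y_folded, m_folded = fir_symmetric(x, h)
print(y_direct == y_folded, m_direct, m_folded)
```

Both functions produce the same output, but the folded version performs half as many multiplications for an even number of taps, which is the sense in which the optimization factor is 2x.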
How is it done? Size Reduction
1/8
Efficient memory use
Algorithm Optimization
Distributed Processing
Instruction-less Architecture
Function Level Processor
6. FalconeXt Semiconductor Confidential
Flexible Adaptive Low Cost & Power Consumption Parallel Processing Units
[Diagram labels: ARM/x86 System (Serial) Processor; Serdes; PCIe; FALC; Wireless, Video, Smartphone, … Device; I/O Switch Box; Parameter Registers]
Falcon Parallel Processor Architecture
[Diagram labels: Processing Elements; Program Memory; Data Memory; Smart Routing; Parameter Registers; I/O Switch Box; DDR; ADC/DAC; Color Coding]
Size Optimized Falcon Programmable Accelerator
Think Outside the Core (Box)
The Falcon parallel processor implements the heavy parallel processing tasks/algorithms, while the system processor software takes care of the control and serial tasks
The system processor controls the Falcon parallel processor through function-level instructions
Fast application switch time
8. FalconeXt Semiconductor Confidential
Parallel Processing Optimization Example
Apple A6 Multi-Processor Floor-plan
Most of the silicon area is consumed by:
ARM Cores
GPU
Accelerators (H.264, ECC, …)
Memories
I/O cores (PCI Express, SRIO, …)
[Figure label: Optimized by Falcon]
9. FalconeXt Semiconductor Confidential
Multi-Core Implementation Tradeoffs (A6 example)
Most of the silicon area is consumed by:
ARM Cores
GPU
Accelerators (H.264, ECC, …)
Memories
I/O cores (PCI Express, SRIO, …)
[Figure labels: Optimized by Falcon; Falcon Processing Units]
Falcon Example
The area optimized by Falcon is assumed to be 50% of the device
A conservative 4x Falcon size reduction is assumed
Silicon area reduction of 37.5%
Power consumption reduction of 30%-50%
Device cost reduction of ~30%
Or
Performance increase of 4X
[Chart: Size 37%, Power 40%, Cost 30%, Performance 400%]
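The 37.5% silicon area figure follows from the two stated assumptions: half of the device shrinks by 4x, so the saved area is 0.5 x (1 - 1/4). A short sketch of that arithmetic, with hypothetical variable names (the power and cost figures are quoted by the slide and are not derived here):

```python
optimized_fraction = 0.5  # share of the device assumed to be optimized by Falcon
size_reduction = 4        # conservative Falcon size-reduction factor

# The optimized half shrinks to 1/4 of its size; the rest is unchanged.
new_area = (1 - optimized_fraction) + optimized_fraction / size_reduction
area_reduction = 1 - new_area

print(f"Remaining area: {new_area:.1%}")
print(f"Area reduction: {area_reduction:.1%}")
```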
10. FalconeXt Semiconductor Confidential
[Chart: Memory and Proc. Unit size for TI C66x, Fermi, Tilera, Adapteva, Tensilica, and Falcon; axis scale 0 to 20]
Size Reduction - Memory per Processing Unit
Company        Device Name    L1p      L1d      L2        SIMD factor   Total per processing unit
ARM15          -              128 KB   128 KB   4096 KB   1             4352 KB
TI C66x        TI KeyStone    32 KB    32 KB    1024 KB   32            34 KB
Nvidia (GPU)   Fermi          16 KB    48 KB    768 KB    32            26 KB
Tilera         Tile Gx-8072   32 KB    32 KB    256 KB    16            20 KB
Adapteva       Epiphany       32 KB    -        -         1             16 KB
Tensilica      IVP-EP         128 KB   256 KB   -         32            12 KB
FalconeXt      FALC-256       ¼ KB     -        -         1             ¼ KB
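Most of the totals listed above appear consistent with dividing the summed memory sizes by the SIMD factor, and the smallest competitor total (12 KB) divided by Falcon's ¼ KB gives the "at least 48x" memory reduction claimed earlier in the deck. The sketch below checks that reading. The formula is an inference from the numbers, not something the slides state, and the assignment of the blank L1d/L2 cells to particular vendors is an assumption.

```python
# (company, L1p KB, L1d KB, L2 KB, SIMD factor, listed total KB)
# Cells that are blank in the source are entered as 0 (assumption).
rows = [
    ("ARM15",        128, 128, 4096,  1, 4352),
    ("TI C66x",       32,  32, 1024, 32,   34),
    ("Nvidia (GPU)",  16,  48,  768, 32,   26),
    ("Tilera",        32,  32,  256, 16,   20),
    ("Adapteva",      32,   0,    0,  1,   16),
    ("Tensilica",    128, 256,    0, 32,   12),
]
falcon_total_kb = 0.25  # listed Falcon total per processing unit

mismatches = []
for name, l1p, l1d, l2, simd, listed in rows:
    # Inferred formula: memory is shared across the SIMD lanes.
    computed = (l1p + l1d + l2) / simd
    ratio = listed / falcon_total_kb
    if computed != listed:
        mismatches.append(name)
    print(f"{name:13s} computed {computed:7.1f} KB  listed {listed:5d} KB  {ratio:7.0f}x Falcon")

# Smallest advantage over any listed competitor.
min_ratio = min(row[5] / falcon_total_kb for row in rows)
print("Smallest ratio:", min_ratio)
print("Rows where the listed total differs from the formula:", mismatches)
```

Under these assumptions the Adapteva row is the only one whose listed total does not match the inferred formula; the listed value is used for the ratio in every case.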
Processor Unit - Memory Size Efficiency Evolution
First Parallel Processor Revolution: SIMD + Application Optimization
Second Parallel Processor Revolution: Function Level Processor + Distributed Processing
[Diagram labels: Memories (Program, Data, Coefficients); Register files; Processing Elements (Multiplier, Accumulator, Pre-adder, …)]
11. FalconeXt Semiconductor Confidential
Method for low power
Falcon reuses silicon area, which leads to low power and low cost
VT low-power optimization
Shutdown of inactive silicon area
Technology Nodes
Falcon technology's low-power optimization can be used in parallel with other methods used by Internet of Things devices.
12. FalconeXt Semiconductor Confidential
Conclusion
• FalconeXt Semiconductor has a unique and revolutionary parallel processor technology that can:
  • Increase IC performance by up to 100x
  • Reduce power consumption by up to 50x
  • Reduce silicon size/cost by up to 50x
• Falcon technology enables customers to enjoy the combined advantages of ASIC/ASSP/FPGA/Processor with minimal overhead.
• The Falcon software design flow is simple and non-disruptive. It is based on existing processor software development tools.
• FalconeXt has an estimated high 5-year ROI of at least 40x.
• Internet of Things devices that use Falcon technology will benefit from lower power consumption, smaller device size, and higher performance.