www.flextiles.eu 
FlexTiles 
Runtime Mapping of Hardware Accelerators on the Embedded FPGA Layer 
FPL’14, FlexTiles Workshop September 1st 2014 
Olivier SENTIEYS★, Christophe HURIAUX, Antoine COURTAY  University of Rennes 1 
★ Inria
The information contained in this document and any attachments are the property of FlexTiles consortium. You are hereby notified that any review, dissemination, distribution, copying or otherwise use of this document must be done in accordance with the CA of the project (TRT/DJ/624412785.2011). Template version 1.0 
University of Rennes 1 – FPL’14 FlexTiles Workshop 
The Multicore Era is Hitting the Utilization Wall 
The multicore era has been a reality since 2005-2008, but what comes next? 
Energy efficiency is not scaling along with integration capacity 
Transistor and power budgets no longer balanced 
Scaling rule           Classical scaling    Leakage-limited scaling
Device count           S²                   S²
Device frequency       S                    S
Device power (cap)     1/S                  1/S
Device power (Vdd)     1/S²                 ~1
Utilization            1                    1/S²
Power of core i: $P_i = a_i f_i C_i V_{dd,i}^2$
[Venkatesh et al., ASPLOS’10]
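The two columns of the scaling table can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes a fixed chip power budget normalized to 1, and the function name `utilization` is hypothetical.

```python
def utilization(s: float, leakage_limited: bool) -> float:
    """Fraction of the chip that can switch at full frequency under a fixed power budget.

    s is the scaling factor of one process generation (assumption: the power
    budget is normalized to 1 before scaling).
    """
    device_count = s ** 2                      # S² more devices
    frequency = s                              # each device runs S times faster
    cap_term = 1 / s                           # capacitance contribution: 1/S
    # Vdd² contribution: 1/S² under classical scaling, ~1 once Vdd stops scaling
    vdd_term = 1.0 if leakage_limited else 1 / s ** 2
    power_per_device = frequency * cap_term * vdd_term
    full_chip_power = device_count * power_per_device
    return min(1.0, 1 / full_chip_power)


if __name__ == "__main__":
    for s in (1.4, 2.0):
        print(s, utilization(s, False), utilization(s, True))
```

With S = 2, the classical case keeps the whole chip usable, while the leakage-limited case leaves only a quarter of it (1/S²) active.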
The Utilization Wall 
With each successive process generation, the percentage of a chip that can switch at full frequency drops exponentially due to power constraints 
8nm in 2018: best-case average 3.7x speedup, 14% per year (highly parallel codes and optimal per-benchmark) 
[Esmaeilzadeh et al., ISCA’11]
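The "14% per year" figure is consistent with a 3.7x speedup compounded over roughly ten years. The slide only gives the end point (2018), so the ten-year span in this sketch is an assumption.

```python
def annual_rate(total_speedup: float, years: float) -> float:
    """Compound annual growth rate that yields total_speedup after the given number of years."""
    return total_speedup ** (1 / years) - 1


if __name__ == "__main__":
    YEARS = 10  # assumption: span ending in 2018; not stated on the slide
    print(f"{annual_rate(3.7, YEARS):.1%} per year")
```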
Multicore and Dark Silicon 
[Figure: speedup (axis 0 to 20) across technology nodes 45nm, 32nm, 22nm, 16nm, 11nm and 8nm for historical scaling (18x), ITRS scaling (7.9x) and realistic scaling (3.7x); dark silicon percentages shown: 47%, 36%, 71%, 51%, 62%, 40%, 17%, 1%; time markers 2014, >2016, >2018] 
[Doug Burger, HiPEAC’13] 
The Efficiency of Specialization 
100-1000X gap in efficiency … but specialization comes with penalties in programmability 
[Figure: efficiency comparison, with ASICs and FPGAs labelled] 
* Source: Ning Zhang and Bob Brodersen, ISSCC data 
Heterogeneous Multicores 
Different cores on a single chip 
GPPs, HW accelerators, memory, network-on-chip 
Reconfigurable HW accelerators keep flexibility while increasing area and energy efficiency 
Self-adapting devices 
Dynamically adapt the hardware to the application and to changing environments 
[Figure: heterogeneous multicore with cores, a processor, reconfigurable HW, memory and a HW accelerator] 
Can 3D Stacking Help? 
3D-Stacked Reconfigurable Accelerators 
Improved bandwidth/latency between cores and accelerators 
Improved resource usage 
Improved performance and energy efficiency 
[Figure: a reconfigurable layer stacked on a multicore layer] 
Outline 
eFPGA Reconfigurable Fabric 
General architecture overview 
Expected features 
Task migration in FPGA vs. task migration in eFPGA 
Virtual Bit-Stream 
Coping with Heterogeneous Blocks 
Development Flow 
Achievements & Conclusion
FlexTiles Architecture Overview 
3D interface to the NoC 
DSP blocks 
Memory blocks
Expected Features of the Reconfigurable Layer 
Main expected features 
Low reconfiguration time (and power) overhead 
Double-context configuration memory 
Low complexity reconfiguration control 
Easier resource sharing/distribution, simplified task migration 
No predefined configuration domains 
Bit-stream independent from task location 
Smaller bit-stream size in configuration memory 
→ Virtual Bit-Stream (VBS)
Task Allocation & Migration in an FPGA 
Predefined reconfigurable regions 
Bit-stream depends on task location 
[Figure: FPGA surrounded by an I/O ring with predefined reconfigurable regions; HW Accelerator #1 needs a different bit-stream (BS #1, BS #2) in each region] 
Task Migration in eFPGA 
[Figure: eFPGA fabric with 3D NI and RAM blocks; HW Accelerator #1 (BS #1) and HW Accelerator #2 (BS #2) placed in the fabric] 
Outline 
eFPGA Reconfigurable Fabric 
Virtual Bit-Stream 
Concept 
Abstraction of routing details 
Results 
Coping with Heterogeneous Fabric 
Development Flow 
Achievements & Conclusion
Concept of Virtual Bit-Stream 
A task is synthesized and placed & routed into a Virtual Bit-Stream (VBS) 
- Hides some routing details that are architecture dependent 
- Removes details coming from the task's physical location in the fabric 
- No predefined configuration domains 
The final bit-stream is generated at run time 
- Resource sharing/distribution becomes easier, task migration is simplified 
Quartus II
Interconnection Architecture 
Hiding routing details 
Full BS is 129 bits 
Could be reduced by giving fewer details 
[Figure: interconnection block with ports CLBIN[0] to CLBIN[3] and CLBOUT, and routing endpoints numbered 0 to 20]
Virtual Bit Stream 
Hiding routing details 
List of I/O and connections 
20  8 
1  9 
5  18 
4 5 6 7 
12 13 14 15 
0 1 2 3 
8 9 10 11 
16 
17 
18 
19 20
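To see why a connection list can be much smaller than the raw 129-bit configuration, consider a fixed-width encoding of the three connections listed for this block. The slides do not describe the real VBS encoding, so the format below (two endpoint indices per connection, no header) is purely hypothetical.

```python
FULL_BS_BITS = 129                            # full bit-stream size for this block (from the slide)
NUM_ENDPOINTS = 21                            # endpoints are numbered 0 to 20 in the figure
CONNECTIONS = [(20, 8), (1, 9), (5, 18)]      # connection list shown on the slide


def connection_list_bits(connections, num_endpoints):
    """Size of a hypothetical encoding storing each connection as two endpoint indices."""
    bits_per_endpoint = (num_endpoints - 1).bit_length()   # 5 bits cover indices 0..20
    return len(connections) * 2 * bits_per_endpoint


if __name__ == "__main__":
    vbs_bits = connection_list_bits(CONNECTIONS, NUM_ENDPOINTS)
    print(vbs_bits, "bits instead of", FULL_BS_BITS)
```

Under this assumed encoding the three connections take 30 bits. The price is that the routing inside the block has to be resolved when the final bit-stream is generated.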
Results 
VBS is independent of task location with a smaller size than BS 
[Figure: BS size and VBS size in kilo-bits (axis 0 to 1600) and compression ratio (axis 0% to 100%) for benchmarks tseng, diffeq, apex4, des, ex5p and misex3; compression ratios shown: 44.4%, 49.2%, 47.2%, 55.2%, 49.7%, 29.5%, 27.4%, 26.6%] 
3-4 times smaller for large bit-streams
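The "3-4 times smaller" statement follows from the lowest compression ratios in the chart. This sketch assumes the ratio is VBS size divided by BS size, which the slide does not state explicitly.

```python
# Compression ratios read from the chart, in percent, in the order they appear
RATIOS_PERCENT = [44.4, 49.2, 47.2, 55.2, 49.7, 29.5, 27.4, 26.6]


def reduction_factor(ratio_percent: float) -> float:
    """How many times smaller the VBS is, assuming ratio = VBS size / BS size."""
    return 100.0 / ratio_percent


if __name__ == "__main__":
    for r in RATIOS_PERCENT:
        print(f"{r:5.1f}% -> {reduction_factor(r):.2f}x smaller")
```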
eFPGA Architecture using VBS 
Reconfiguration controller 
On request from the GPP: can place, duplicate and migrate tasks 
Finalizes VBS 
[Figure: reconfiguration controller reading VBS 1 to VBS N from external memory through a buffer memory, with data and control paths (steps 1 and 2)]
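The finalization step can be pictured as turning location-independent entries into absolute ones. The sketch below is a simplified model, not the FlexTiles controller: `VirtualEntry`, `finalize` and the per-cell integer `config` are hypothetical, and routing expansion is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualEntry:
    dx: int        # column offset relative to the task origin
    dy: int        # row offset relative to the task origin
    config: int    # opaque configuration word for that cell (hypothetical)


def finalize(vbs, origin, fabric_size, occupied=frozenset()):
    """Map a location-independent VBS to absolute cells at the chosen origin."""
    ox, oy = origin
    width, height = fabric_size
    final = {}
    for entry in vbs:
        x, y = ox + entry.dx, oy + entry.dy
        if not (0 <= x < width and 0 <= y < height):
            raise ValueError(f"cell ({x}, {y}) is outside the fabric")
        if (x, y) in occupied:
            raise ValueError(f"cell ({x}, {y}) is already in use")
        final[(x, y)] = entry.config
    return final


if __name__ == "__main__":
    vbs = [VirtualEntry(0, 0, 0xA), VirtualEntry(1, 0, 0xB), VirtualEntry(0, 1, 0xC)]
    first = finalize(vbs, (0, 0), (8, 8))
    # Duplicating or migrating the task reuses the same VBS with another origin
    second = finalize(vbs, (3, 2), (8, 8), occupied=frozenset(first))
    print(sorted(first), sorted(second))
```

Placing, duplicating and migrating then differ only in the origin passed to the same stored VBS.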
Outline 
eFPGA Reconfigurable Fabric 
Virtual Bit-Stream 
Coping with Heterogeneous Fabric 
Heterogeneous Blocks 
Task placement in a Homogeneous context 
Task placement in a Heterogeneous context 
Development Flow 
Achievements & Conclusion
Heterogeneous Blocks 
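Logic Elements 
- Cluster of four 6-input LUTs 
- 3309 mm2 
Arithmetic Elements 
- 18x18 multiplier, 48-bit adder/subtractor 
- 4351 mm2 
[Figure: logic element with four LUTs between CLBIN and CLBOUT; arithmetic element with 18-bit inputs A and B, a 36-bit product and a 48-bit adder/subtractor]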
Heterogeneous Blocks 
Memories 
- 1024 x 16-bit word SRAM 
- 6570 mm2 
3D TSV and Accelerator Interface 
[Figure: floorplan with the reconfiguration controller, reconfiguration RAM, 3D blocks and 3DNI blocks, annotated 26.95mm²] 
NoC Link (400 I/O): 
Pitch   X    Y    Size X   Size Y   Area (mm²) 
40      20   20   800      800      0.64 
Work In Progress
eFPGA Floorplan (heterogeneous) 
[Figure: heterogeneous floorplan; legend: Logic Block, Arithmetic Accelerator, Memories, Accelerator Interface]
Task Placement & Migration 
Homogeneous case 
No constraint on task placement 
Regular routing architecture 
Easy! (thanks to the Virtual Bit-Stream) 
Coping with heterogeneity 
RAM, DSP, 3D I/Os 
Migration is limited 
- vertically to the same column 
- to the next column containing the same complex blocks 
[Figure legend: Task, Configured LE, Logic Element (LE)]
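The horizontal constraint can be illustrated by matching the column types a task uses against the column types of the fabric. This is a minimal sketch with a hypothetical column layout; the type names and `valid_column_origins` are assumptions, and the vertical constraint is not modelled.

```python
def valid_column_origins(fabric_columns, task_columns):
    """Return every start column where the task's column types line up with the fabric."""
    span = len(task_columns)
    return [
        start
        for start in range(len(fabric_columns) - span + 1)
        if fabric_columns[start:start + span] == task_columns
    ]


if __name__ == "__main__":
    # Hypothetical fabric: mostly logic elements, with RAM and DSP columns
    fabric = ["LE", "RAM", "LE", "LE", "DSP", "LE", "RAM", "LE", "LE"]
    # A pure-logic task can start at any pair of adjacent LE columns
    print(valid_column_origins(fabric, ["LE", "LE"]))
    # A task using a RAM column can only move to another RAM column
    print(valid_column_origins(fabric, ["RAM", "LE"]))
```

A task that uses only logic elements has many candidate positions, while one that includes a RAM column can only land where a RAM column appears again.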
eFPGA: Handling of Complex Blocks 
Heterogeneous blocks routing is abstracted from logic routing 
Long lines allow a trade-off between placement flexibility and routing complexity 
A two-level routing is performed at runtime: 
Logic routing (as in the homogeneous case) 
Heterogeneous block routing through long lines
eFPGA: Handling of Complex Blocks 
Delay depends on final placement 
Only worst-case delay can be estimated offline 
Flexibility is still limited along the vertical axis 
Multiple of block height 
The length of long lines and the connections between long lines and routing resources should be limited 
Area overhead, but slight delay penalty 
(see our paper at FPL’14 on Wednesday)
Outline 
eFPGA Reconfigurable Fabric 
Virtual Bit-Stream 
Coping with Heterogeneous Fabric 
Development Flow 
Achievements & Conclusion
Development Flow 
Custom development flow from C to Virtual Bit-Stream 
[Figure: flow diagram. Stages: High-level Synthesis, HDL Synthesis, Technology mapping, Placer, Router, Bitstream generation. Artifacts: High-level task description, RTL task description, HDL task description, Flat logic netlist, Mapped logic netlist, Placement data, Routing data, Arch. description, Arch. netlist, Virtual bit-stream] 
- Integrated within the FlexTiles development flow 
- Generates VBS from a C description or an HDL description
Development Flow 
Custom development flow from C to Virtual Bit-Stream 
Relies on Catapult C from Calypto Design Systems 
High-level synthesis from C to VHDL
Development Flow 
Custom development flow from C to Virtual Bit-Stream 
Uses the Verilog To Routing (VTR) academic tool flow to generate netlist and routing data from Verilog 
[Figure: VTR part of the flow, from the RTL/HDL task description through HDL synthesis, technology mapping, placement and routing, driven by the architecture description]
Development Flow 
Custom development flow from C to Virtual Bit-Stream 
A custom back-end generates the VBS from the data generated by VTR 
The VBS can be loaded on the FlexTiles platform
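The flow described over the last few slides can be summarized as a chain of stages, each consuming the artifact produced by the previous one. The sketch below only records that chain and checks that it is connected. The stage grouping and artifact names are a simplification of the diagram, and no real tool is invoked.

```python
# (stage, input artifact, output artifact); grouping is a simplification of the slides
STAGES = [
    ("High-level synthesis (Catapult C)", "C task description", "HDL task description"),
    ("HDL synthesis and technology mapping (VTR)", "HDL task description", "Mapped logic netlist"),
    ("Placement and routing (VTR)", "Mapped logic netlist", "Placement and routing data"),
    ("Custom back-end", "Placement and routing data", "Virtual Bit-Stream"),
]


def check_chain(stages):
    """Verify that each stage consumes what the previous one produced; return (source, result)."""
    for (prev_name, _, produced), (name, consumed, _) in zip(stages, stages[1:]):
        if produced != consumed:
            raise ValueError(f"{name!r} expects {consumed!r} but {prev_name!r} produces {produced!r}")
    return stages[0][1], stages[-1][2]


if __name__ == "__main__":
    source, result = check_chain(STAGES)
    print(f"{source} -> {result} in {len(STAGES)} stages")
```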
Conclusions 
Overall results and achievements 
3-D stacked embedded FPGA coupled to a processor layer 
Flexible resource allocation/sharing 
Seamless task migration 
Virtual Bit-Stream 
VBS also reduces bitstream size 
eFPGA Chip “Proof of Concept” 
65nm CMOS 
Homogeneous Fabric of LBs 
I/O Ring (not 3D…) 
External Reconfiguration Controller
Results 
Thank you for your attention
Where do the energy savings come from? 
MIPS baseline, 91 pJ/instr.: I-cache 23%, Fetch/Decode 19%, Reg. File 14%, Datapath 38%, D-cache 6% 
Specialized core, 8 pJ/instr.: Datapath 3%, D-cache 6%, Energy Saved 91% 
[Goulding et al., Hot Chips’10]
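The 91% saving is the relative difference between the two per-instruction energies, as this short check shows (variable names are illustrative):

```python
BASELINE_PJ = 91.0      # MIPS baseline, pJ per instruction (from the slide)
SPECIALIZED_PJ = 8.0    # specialized core, pJ per instruction (from the slide)


def energy_saved(baseline: float, specialized: float) -> float:
    """Fraction of the baseline energy that the specialized core avoids."""
    return 1.0 - specialized / baseline


if __name__ == "__main__":
    print(f"energy saved: {energy_saved(BASELINE_PJ, SPECIALIZED_PJ):.0%}")
```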
Energy per operation: 45nm CMOS, 40nm V6 FPGA 
HW operators (45nm) 
- 32-bit addition: 0.5pJ 
- 16-bit multiply: 2.2pJ 
- 64-bit FPU: 50pJ/op 
40nm V6 FPGA 
- 16/32-bit multiply and add: 114pJ (DSP blocks), 170pJ (LUT) 
- 32-bit I/O access: 1.47nJ 
- 32-bit memory read: 660 pJ 
- 32-bit register R/W: 1.12 pJ 
Embedded RISC Processor (45nm) 
- 32-bit register R/W: 0.33pJ 
- 32-bit cache R/W: 3.5pJ 
- add instruction⋆⋆: 5.32 pJ 
⋆⋆ add instruction (best case) = fetch, decode, read 2 operands from RF, execute, write back (into local reg. first, then copy into RF) 
[Dally et al., Computer, 2010] 
[Bonamy et al., 2013]
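A quick ratio shows how much of a processor's add instruction goes to overhead rather than to the addition itself. Both values are taken from the lists above (45nm); the comparison is only indicative, since the numbers come from different sources.

```python
ADDER_PJ = 0.5              # 32-bit addition as a bare hardware operator
ADD_INSTRUCTION_PJ = 5.32   # best-case add instruction on the embedded RISC processor


def overhead_factor(instruction_pj: float, operator_pj: float) -> float:
    """How many times more energy the full instruction costs than the bare operation."""
    return instruction_pj / operator_pj


if __name__ == "__main__":
    factor = overhead_factor(ADD_INSTRUCTION_PJ, ADDER_PJ)
    print(f"an add instruction costs about {factor:.1f}x the addition itself")
```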
The Energy Cost of Data Movement 
Fetching operands costs more than computing 
Energy cost of cache coherence is huge! 
[Figure: energy costs in 28nm CMOS: 64-bit DP operation 20pJ; 256-bit buses 26 pJ, 256 pJ and 1 nJ; 256-bit access to an 8 kB SRAM 50 pJ; efficient off-chip link 500 pJ; DRAM Rd/Wr 16 nJ] 
[Dally, IPDPS’11]
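Expressing the data-movement costs relative to one 64-bit double-precision operation makes the point of this slide concrete. Only the values with unambiguous labels are used below.

```python
DP_OP_PJ = 20.0   # one 64-bit double-precision operation

# Data-movement costs in pJ, as labelled on the slide
MOVEMENT_PJ = {
    "256-bit access to 8 kB SRAM": 50.0,
    "efficient off-chip link": 500.0,
    "DRAM read/write": 16_000.0,
}


def relative_cost(movement_pj: float, op_pj: float = DP_OP_PJ) -> float:
    """Cost of a data movement expressed in number of double-precision operations."""
    return movement_pj / op_pj


if __name__ == "__main__":
    for name, pj in MOVEMENT_PJ.items():
        print(f"{name}: {relative_cost(pj):g} DP operations")
```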
Efficient Hardware Task Swapping 
Hiding reconfiguration time with computing 
Single-context memory 
Double-context memory 
eFPGA will use double-context memory 
Gain in dynamic reconfiguration efficiency 
At the cost of ~50% overhead 
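[Figure: timelines for single-context and double-context memory showing Cfg. 1, Task 1, Cfg. 2 and Task 2; double-context configuration cell built from a latch and a flip-flop (FF) with ConfClk and ConfEn signals. CB: one configuration bit]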
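The benefit of a double-context memory can be estimated with a simple timeline model: with a single context, each configuration must finish before its task starts; with two contexts, the next configuration loads while the current task runs. The durations below are hypothetical, and the model ignores context-switch time.

```python
def single_context_time(tasks):
    """Total time when every configuration blocks execution. tasks: list of (cfg, run)."""
    return sum(cfg + run for cfg, run in tasks)


def double_context_time(tasks):
    """Total time when the next configuration is loaded while the current task runs."""
    if not tasks:
        return 0
    total = tasks[0][0]                      # the first configuration cannot be hidden
    for i, (_, run) in enumerate(tasks):
        next_cfg = tasks[i + 1][0] if i + 1 < len(tasks) else 0
        total += max(run, next_cfg)          # wait for whichever finishes last
    return total


if __name__ == "__main__":
    tasks = [(2, 5), (4, 3)]                 # hypothetical (configuration, execution) times
    print(single_context_time(tasks), double_context_time(tasks))
```

When a task runs longer than the next configuration takes to load, the reconfiguration time is hidden completely; otherwise only part of it is.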
eFPGA(V1) Architecture 
[Figure: logic block (LUT fed by CLBIN, FF, output mux to CLBOUT, clk and rstb) and switch block (NORTH(i), SOUTH(i), EAST(i), WEST(i)); configuration bits (CB) chained from ScanIn to ScanOut]
eFPGA Architecture 
Interconnection Block 
[Figure: interconnection block linking CLBIN[0] to CLBIN[3] and CLBOUT with NORTH, SOUTH, EAST and WEST tracks 0 to 3]
eFPGA Architecture 
eFPGA macro 
[Figure: eFPGA macro at position (i,j) with CLB(i,j), SB(i,j), CHANX(i,j) and CHANY(i,j), together with the neighbouring CLBs, switch blocks and channels and their CLBIN/CLBOUT connections]
eFPGA Floorplan 
eFPGA Floorplan

More Related Content

What's hot

Xilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsXilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsGanesan Narayanasamy
 
NNSA Explorations: ARM for Supercomputing
NNSA Explorations: ARM for SupercomputingNNSA Explorations: ARM for Supercomputing
NNSA Explorations: ARM for Supercomputinginside-BigData.com
 
RFGen News. Dara Hamlet (Gibbs)
RFGen News. Dara Hamlet (Gibbs)RFGen News. Dara Hamlet (Gibbs)
RFGen News. Dara Hamlet (Gibbs)Dara Gibbs
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
 
Hardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLHardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLinside-BigData.com
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuninginside-BigData.com
 
CINECA for HCP and e-infrastructures infrastructures
CINECA for HCP and e-infrastructures infrastructuresCINECA for HCP and e-infrastructures infrastructures
CINECA for HCP and e-infrastructures infrastructuresCineca
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...inside-BigData.com
 
Trends in Systems and How to Get Efficient Performance
Trends in Systems and How to Get Efficient PerformanceTrends in Systems and How to Get Efficient Performance
Trends in Systems and How to Get Efficient Performanceinside-BigData.com
 
Conference on Adaptive Hardware and Systems (AHS'14) - FlexTiles Introductions
Conference on Adaptive Hardware and Systems (AHS'14) - FlexTiles IntroductionsConference on Adaptive Hardware and Systems (AHS'14) - FlexTiles Introductions
Conference on Adaptive Hardware and Systems (AHS'14) - FlexTiles IntroductionsFlexTiles Team
 
powerpoint
powerpointpowerpoint
powerpointVideoguy
 
Beginning of the end for big iron ATE?
Beginning of the end for big iron ATE?Beginning of the end for big iron ATE?
Beginning of the end for big iron ATE?Hank Lydick
 
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialSCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialGanesan Narayanasamy
 
BXI: Bull eXascale Interconnect
BXI: Bull eXascale InterconnectBXI: Bull eXascale Interconnect
BXI: Bull eXascale Interconnectinside-BigData.com
 

What's hot (20)

Xilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsXilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systems
 
NNSA Explorations: ARM for Supercomputing
NNSA Explorations: ARM for SupercomputingNNSA Explorations: ARM for Supercomputing
NNSA Explorations: ARM for Supercomputing
 
Session29 Arc
Session29 ArcSession29 Arc
Session29 Arc
 
RFGen News. Dara Hamlet (Gibbs)
RFGen News. Dara Hamlet (Gibbs)RFGen News. Dara Hamlet (Gibbs)
RFGen News. Dara Hamlet (Gibbs)
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnects
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 

FPL'2014 - FlexTiles Workshop - 6 - FlexTiles Embedded FPGA Accelerators

  • 1. www.flextiles.eu FlexTiles: Runtime Mapping of Hardware Accelerators on the Embedded FPGA Layer. FPL’14, FlexTiles Workshop, September 1st 2014. Olivier SENTIEYS★, Christophe HURIAUX, Antoine COURTAY, University of Rennes 1 (★ Inria).
  • 2. The information contained in this document and any attachments are the property of FlexTiles consortium. You are hereby notified that any review, dissemination, distribution, copying or otherwise use of this document must be done in accordance with the CA of the project (TRT/DJ/624412785.2011). Template version 1.0. University of Rennes 1 – FPL’14 FlexTiles Workshop. The Multicore Era is Hitting the Utilization Wall. The multicore era has been a reality since 2005-2008, but what comes next? Energy efficiency is not scaling along with integration capacity: transistor and power budgets are no longer balanced. Classical scaling: device count $S^2$, device frequency $S$, device power (cap) $1/S$, device power (Vdd) $1/S^2$, utilization $1$. Leakage-limited scaling: device count $S^2$, device frequency $S$, device power (cap) $1/S$, device power (Vdd) $\sim 1$, utilization $1/S^2$. Power of core $i$: $P_i = a_i f_i C_i V_{dd,i}^2$. [Venkatesh et al., ASPLOS’10]
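The utilization row of the scaling table above follows from multiplying the other four rows. The sketch below reproduces that arithmetic for a linear shrink factor `s` per process generation. The function name, the dictionary keys, and the fixed-power-budget reading of "utilization" are assumptions made for illustration; they are not taken from the slides.

```python
def scaling_factors(s: float, leakage_limited: bool) -> dict:
    """Per-generation scaling factors for a linear shrink factor s.

    Assumption: utilization is the fraction of devices that can switch at
    full frequency while total chip power stays at the previous budget.
    """
    device_count = s ** 2           # more devices in the same area
    frequency = s                   # each device switches faster
    power_cap = 1.0 / s             # capacitance term of P = a * f * C * Vdd^2
    # Vdd^2 term: shrinks as 1/s^2 classically, stays roughly constant when
    # leakage prevents further supply-voltage reduction.
    power_vdd = 1.0 if leakage_limited else 1.0 / s ** 2
    # Power needed to run every device at full frequency, relative to the
    # previous generation.
    full_chip_power = device_count * frequency * power_cap * power_vdd
    utilization = min(1.0, 1.0 / full_chip_power)
    return {
        "device_count": device_count,
        "frequency": frequency,
        "full_chip_power": full_chip_power,
        "utilization": utilization,
    }


if __name__ == "__main__":
    for limited in (False, True):
        print("leakage-limited" if limited else "classical",
              scaling_factors(2.0, limited))
```

With classical scaling the four factors cancel and utilization stays at 1; with leakage-limited scaling the product grows as $S^2$, so utilization falls as $1/S^2$.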
  • 3. The Utilization Wall. With each successive process generation, the percentage of a chip that can switch at full frequency drops exponentially due to power constraints. 8nm in 2018: best-case average 3.7x speedup, 14% per year (highly parallel codes and optimal per-benchmark). [Esmaeilzadeh et al., ISCA’11]
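The two figures on this slide, a 3.7x total speedup and 14% per year, are related by compound growth. The sketch below checks that relation. The ten-year window is an assumption made for illustration (the slide gives only the 2018 end point), and both function names are hypothetical.

```python
def compound_speedup(annual_rate: float, years: int) -> float:
    """Total speedup after `years` of growth at `annual_rate` per year."""
    return (1.0 + annual_rate) ** years


def annualized_rate(total_speedup: float, years: int) -> float:
    """Constant yearly growth rate that yields `total_speedup` over `years`."""
    return total_speedup ** (1.0 / years) - 1.0


if __name__ == "__main__":
    YEARS = 10  # assumed window, not stated on the slide
    print(f"14% per year over {YEARS} years: {compound_speedup(0.14, YEARS):.2f}x")
    print(f"3.7x over {YEARS} years: {annualized_rate(3.7, YEARS):.1%} per year")
```

Under that assumed window the two numbers agree to within rounding.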
  • 4. Multicore and Dark Silicon [Doug Burger, HiPEAC’13]. Figure: speedup (axis 0 to 20) versus technology node (45nm, 32nm, 22nm, 16nm, 11nm, 8nm) for Historical Scaling (18x), ITRS Scaling (7.9x) and Realistic Scaling (3.7x). Dark silicon labels on the plot: 47%, 36%, 71%, 51%, 62%, 40%, 17%, 1%. Timeline markers: 2014, >2016, >2018.
  • 5. The Efficiency of Specialization. 100-1000X gap in efficiency, but specialization comes with penalties in programmability. Figure labels: ASICs, FPGAs. Source: Ning Zhang and Bob Brodersen, ISSCC data.
  • 6. Heterogeneous Multicores. Different cores on a single chip: GPPs, HW accelerators, memory, network-on-chip. Reconfigurable HW accelerators keep flexibility while increasing area and energy efficiency. Self-adapting devices: dynamically adapt the hardware to the application and to changing environments. Figure: grid of cores with blocks labeled Proc., Reconf. HW, Mem., HW Acc.
  • 7. Can 3D Stacking Help? 3D-stacked reconfigurable accelerators: improved bandwidth/latency between cores and accelerators; improved resource usage; improved performance and energy efficiency. Figure: a reconfigurable layer and a multicore layer of cores.
  • 8. Outline. eFPGA Reconfigurable Fabric (general architecture overview; expected features; task migration in FPGA vs. task migration in eFPGA). Virtual Bit-Stream. Coping with Heterogeneous Blocks. Development Flow. Achievements & Conclusion.
  • 9. FlexTiles Architecture Overview. Figure labels: 3D interface to the NoC, DSP blocks, memory blocks.
  • 10. Expected Features of the Reconfigurable Layer. Main expected features: low reconfiguration time (and power) overhead (double-context configuration memory, low-complexity reconfiguration control); easier resource sharing/distribution and simplified task migration (no predefined configuration domains, bit-stream independent from task location); smaller bit-stream size in configuration memory. → Virtual Bit-Stream (VBS)
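A double-context configuration memory, as listed above, is commonly understood as two configuration planes: one drives the fabric while the other is loaded in the background, and a fast swap makes the new configuration active. The sketch below models that idea in software only. The class and method names are hypothetical, and the slides do not describe how FlexTiles implements it.

```python
class DoubleContextConfigMemory:
    """Toy model of a two-plane configuration memory (illustrative only)."""

    def __init__(self, size_bytes: int) -> None:
        # Two contexts of equal size; index 0 is active at start.
        self._contexts = [bytearray(size_bytes), bytearray(size_bytes)]
        self._active = 0

    def load_shadow(self, bitstream: bytes) -> None:
        """Write a bit-stream into the inactive context.

        The active context is untouched, so the running task is not disturbed.
        """
        shadow = self._contexts[1 - self._active]
        if len(bitstream) != len(shadow):
            raise ValueError("bit-stream size does not match the context size")
        shadow[:] = bitstream

    def swap(self) -> None:
        """Make the shadow context active (models the fast context switch)."""
        self._active = 1 - self._active

    def active_config(self) -> bytes:
        """Return a copy of the configuration currently driving the fabric."""
        return bytes(self._contexts[self._active])


if __name__ == "__main__":
    mem = DoubleContextConfigMemory(size_bytes=4)
    mem.load_shadow(b"\x01\x02\x03\x04")   # background load
    print(mem.active_config())             # still the old configuration
    mem.swap()
    print(mem.active_config())             # new configuration is now active
```

In this model the reconfiguration cost seen by the running task is the swap alone, since loading happens while the previous configuration is still in use.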
  • 11. Task Allocation & Migration in an FPGA. Predefined reconfigurable regions; bit-stream depends on task location. Figure: FPGA fabric with I/O blocks, showing HW Accelerator #1 with bit-stream BS #1 in one region and with bit-stream BS #2 in another.
  • 12. Task Migration in eFPGA. [Figure: fabric with 3D NI and RAM blocks; HW Accelerator #1 (BS #1) and HW Accelerator #2 (BS #2)]
  • 13. Outline: eFPGA Reconfigurable Fabric; Virtual Bit-Stream (concept; abstraction of routing details; results); Coping with Heterogeneous Fabric; Development Flow; Achievements & Conclusion
  • 14. Concept of Virtual Bit-Stream: a task is synthesized, then placed and routed, into a Virtual Bit-Stream (VBS). The VBS hides some architecture-dependent routing details, removes details tied to the task's physical location in the fabric, and requires no predefined configuration domains. The final bit-stream is generated at run time, so resource sharing/distribution becomes easier and task migration is simplified. [Figure label: Quartus II]
  • 15. Interconnection Architecture (hiding routing details): the full BS is 129 bits; it could be reduced by giving fewer details. [Figure: interconnection block with ports CLBIN[0] to CLBIN[3] and CLBOUT; nodes numbered 0 to 20]
  • 16. Virtual Bit Stream (hiding routing details): list of I/O and connections: 20 → 8, 1 → 9, 5 → 18. [Figure: interconnection nodes numbered 0 to 20]
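The size benefit of storing a connection list instead of every configuration bit can be illustrated with a small sketch. The 129-bit figure and the three connections come from slides 15 and 16; the fixed-width encoding of node indices is an assumption made here for illustration, since the slides do not describe the actual VBS encoding.

```python
from math import ceil, log2

FULL_BS_BITS = 129   # full bit-stream size quoted on slide 15
NUM_NODES = 21       # nodes are numbered 0..20 in the figure

def connection_list_bits(connections, num_nodes=NUM_NODES):
    """Bits needed to store (source, destination) pairs as fixed-width
    node indices. This encoding is hypothetical, not the FlexTiles format."""
    bits_per_node = ceil(log2(num_nodes))
    return len(connections) * 2 * bits_per_node

# Connections listed on slide 16.
connections = [(20, 8), (1, 9), (5, 18)]
vbs_bits = connection_list_bits(connections)
print(f"connection list: {vbs_bits} bits, full bit-stream: {FULL_BS_BITS} bits")
```

Under this assumed encoding the list grows with the number of connections actually used, whereas the full bit-stream has a fixed size per interconnection block.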
  • 17. Results: the VBS is independent of task location and smaller than the BS; 3 to 4 times smaller for large bit-streams. [Figure: bar chart of BS size and VBS size in kilo-bits (axis 0 to 1600) with compression ratio (axis 0% to 100%); benchmark labels: tseng, tseng, diffeq, diffeq, apex4, des, ex5p, misex3; compression ratio labels: 44.4%, 49.2%, 47.2%, 55.2%, 49.7%, 29.5%, 27.4%, 26.6%]
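As a quick check of the "3 to 4 times smaller" claim, the sketch below converts the compression ratios printed on slide 17 into size-reduction factors. It assumes that the compression ratio means VBS size divided by BS size, and it does not attempt to pair each percentage with a benchmark name, since that pairing is not recoverable from the transcript.

```python
# Compression ratio labels from slide 17, in the order they appear.
ratios_percent = [44.4, 49.2, 47.2, 55.2, 49.7, 29.5, 27.4, 26.6]

def reduction_factor(ratio_percent):
    """How many times smaller the VBS is, assuming ratio = VBS size / BS size."""
    return 100.0 / ratio_percent

factors = [round(reduction_factor(r), 2) for r in ratios_percent]
for ratio, factor in zip(ratios_percent, factors):
    print(f"{ratio:5.1f}% -> {factor:.2f}x smaller")
```

With this reading, the three lowest ratios correspond to factors between 3 and 4, which matches the slide's summary if those bars are the large bit-streams.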
  • 18. eFPGA Architecture using VBS. Reconfiguration controller: on GPP request, it can place, duplicate and migrate tasks; it finalizes the VBS. [Figure labels: reconfiguration controller; external memory holding VBS 1, VBS 2, VBS 3, ..., VBS N; buffer memory; data and control paths; steps 1 and 2]
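One part of what "finalizing" a location-independent VBS could involve is relocation: turning task-relative coordinates into absolute fabric coordinates once the controller has chosen a placement. The sketch below shows only that step. The data layout, the `finalize` function and the coordinate scheme are hypothetical, and the run-time reconstruction of routing details mentioned on the slides is not modelled.

```python
from typing import List, Tuple

Coord = Tuple[int, int]               # (column, row) of a logic block
Connection = Tuple[Coord, Coord]      # (source block, destination block)

def finalize(vbs: List[Connection], origin: Coord) -> List[Connection]:
    """Translate task-relative connections to absolute fabric coordinates.
    Hypothetical model: the stored VBS is unchanged, only the origin differs."""
    ox, oy = origin
    return [((sx + ox, sy + oy), (dx + ox, dy + oy))
            for (sx, sy), (dx, dy) in vbs]

# The same stored VBS placed at two locations (placement, then migration).
vbs = [((0, 0), (1, 0)), ((1, 0), (1, 1))]
first_placement = finalize(vbs, origin=(3, 2))
after_migration = finalize(vbs, origin=(7, 2))
print(first_placement)
print(after_migration)
```

Because the stored description never contains absolute positions, duplicating or migrating a task in this model only requires calling `finalize` with a different origin.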
  • 19. Outline: eFPGA Reconfigurable Fabric; Virtual Bit-Stream; Coping with Heterogeneous Fabric (heterogeneous blocks; task placement in a homogeneous context; task placement in a heterogeneous context); Development Flow; Achievements & Conclusion
  • 20. Heterogeneous Blocks. Logic Elements: cluster of four 6-input LUTs, 3309 µm². Arithmetic Elements: 18x18 multiplier and 48-bit adder/subtractor, 4351 µm². [Figure labels: CLBIN, CLBOUT, four LUTs; multiplier inputs A and B (18 bits each), 36-bit product, 48-bit adder/subtractor]
  • 21. Heterogeneous Blocks (work in progress). Memories: 1024 x 16-bit word SRAM, 6570 µm². 3D TSV and Accelerator Interface. Reconfiguration Controller. NoC link (400 I/O): pitch 40; X 20; Y 20; size X 800; size Y 800; area 0.64 mm². [Figure labels: 3D, 3DNI, Reconfiguration RAM; 26.95 mm²]
  • 22. eFPGA Floorplan (heterogeneous). [Figure legend: Logic Block Arithmetic Accelerator Memories Accelerator Interface]
  • 23. Task Placement & Migration. Homogeneous case: no constraint on task placement; regular routing architecture; easy, thanks to the Virtual Bit-Stream. Coping with heterogeneity (RAM, DSP, 3D I/Os): migration is limited, either vertically to the same column, or to the next column containing the same complex blocks. [Figure legend: Task; Configured LE; Logic Element (LE)]
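The column constraint described on slide 23 can be sketched as a pattern match over column types: a task may move sideways only to positions where the same sequence of column types recurs. The fabric layout, the type names and the function `valid_column_offsets` below are illustrative assumptions, not the FlexTiles placement algorithm.

```python
def valid_column_offsets(fabric_columns, task_start, task_width):
    """Return the non-zero column offsets at which the task's sequence of
    column types (LE, RAM, DSP, ...) appears again in the fabric."""
    pattern = fabric_columns[task_start:task_start + task_width]
    offsets = []
    for start in range(len(fabric_columns) - task_width + 1):
        window = fabric_columns[start:start + task_width]
        if start != task_start and window == pattern:
            offsets.append(start - task_start)
    return offsets

# Hypothetical heterogeneous fabric: one block type per column.
fabric = ["LE", "LE", "RAM", "LE", "LE", "DSP", "LE", "LE", "RAM", "LE"]
# A task spanning columns 1..3 (LE, RAM, LE) can only move to the next RAM column.
print(valid_column_offsets(fabric, task_start=1, task_width=3))

# Homogeneous fabric: every position is a valid target.
print(valid_column_offsets(["LE"] * 5, task_start=0, task_width=2))
```

In the homogeneous case every offset is valid, which reflects the "no constraint on task placement" bullet; heterogeneous columns shrink the set of targets to the positions where the complex blocks line up.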
  • 24. eFPGA: Handling of Complex Blocks. Heterogeneous-block routing is abstracted from logic routing. Long lines allow a trade-off between placement flexibility and routing complexity. A two-level routing is performed at runtime: logic routing (as in the homogeneous case), then heterogeneous-block routing through long lines.
  • 25. eFPGA: Handling of Complex Blocks. Delay depends on final placement: only the worst-case delay can be estimated offline. Flexibility is still limited in the vertical axis, to multiples of the block height. The length of long lines, and the connections between long lines and routing resources, should be limited. There is an area overhead, but only a slight delay penalty (see our paper at FPL’14 on Wednesday).
  • 26. Outline: eFPGA Reconfigurable Fabric; Virtual Bit-Stream; Coping with Heterogeneous Fabric; Development Flow; Achievements & Conclusion
  • 27. Development Flow: custom development flow from C to Virtual Bit-Stream. Integrated within the FlexTiles development flow; generates a VBS from a C description or an HDL description. [Figure: flow diagram. Tool stages: High-level Synthesis, HDL Synthesis, Technology mapping, Placer, Router, Bitstream generation. Data: high-level task description, RTL task description, HDL task description, flat logic netlist, mapped logic netlist, placement data, routing data, arch. netlist, arch. description, virtual bit-stream]
  • 28. Development Flow: custom development flow from C to Virtual Bit-Stream. Relies on Catapult C from Calypto Design Systems for high-level synthesis from C to VHDL.
  • 29. Development Flow: custom development flow from C to Virtual Bit-Stream. Uses the Verilog To Routing (VTR) academic tool flow to generate netlist and routing data from Verilog. [Figure: flow diagram. Tool stages: HDL Synthesis, Technology mapping, Placer, Router. Data: RTL task description, HDL task description, flat logic netlist, mapped logic netlist, placement data, routing data, arch. netlist, arch. description]
  • 30. Development Flow: custom development flow from C to Virtual Bit-Stream. A custom back-end generates the VBS from the data generated by VTR. The VBS can be loaded on the FlexTiles platform.
  • 31. Conclusions: overall results and achievements. 3-D stacked embedded FPGA coupled to a processor layer: flexible resource allocation/sharing; seamless task migration; Virtual Bit-Stream (the VBS also reduces bitstream size). eFPGA chip “proof of concept”: 65nm CMOS; homogeneous fabric of LBs; I/O ring (not 3D…); external reconfiguration controller.
  • 32. Results. Thank you for your attention
  • 33. Where do the energy savings come from? MIPS baseline, 91 pJ/instr.: D-cache 6%, Datapath 38%, Reg. File 14%, Fetch/Decode 19%, I-cache 23%. Specialized core, 8 pJ/instr.: D-cache 6%, Datapath 3%, Energy Saved 91%. [Goulding et al., Hot Chips’10]
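The 91% saving on slide 33 follows directly from the two per-instruction energies. The short sketch below redoes that arithmetic and converts the baseline percentages into picojoules; the dictionary layout is an illustrative choice, and the per-component values are derived here rather than quoted from the slide.

```python
BASELINE_PJ = 91.0      # MIPS baseline, pJ per instruction (slide 33)
SPECIALIZED_PJ = 8.0    # specialized core, pJ per instruction (slide 33)

# Baseline energy breakdown in percent, as labelled on the slide.
baseline_breakdown = {
    "D-cache": 6, "Datapath": 38, "Reg. File": 14,
    "Fetch/Decode": 19, "I-cache": 23,
}

saved_fraction = 1.0 - SPECIALIZED_PJ / BASELINE_PJ
baseline_pj = {name: pct / 100.0 * BASELINE_PJ
               for name, pct in baseline_breakdown.items()}

print(f"energy saved: {saved_fraction:.1%}")
for name, pj in baseline_pj.items():
    print(f"{name:13s} {pj:6.2f} pJ/instr.")
```

The remaining 9% of the baseline energy is what the slide attributes to the D-cache (6%) and the datapath (3%) of the specialized core.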
  • 34. Energy per operation: 45nm CMOS, 40nm V6 FPGA. HW operators (45nm): 32-bit addition 0.5 pJ; 16-bit multiply 2.2 pJ; 64-bit FPU 50 pJ/op. 40nm V6 FPGA: 16/32-bit multiply and add 114 pJ (DSP blocks), 170 pJ (LUT); 32-bit I/O access 1.47 nJ; 32-bit memory read 660 pJ; 32-bit register R/W 1.12 pJ. Embedded RISC processor (45nm): 32-bit register R/W 0.33 pJ; 32-bit cache R/W 3.5 pJ; add instruction⋆⋆ 5.32 pJ. ⋆⋆ add instruction (best case) = fetch, decode, read 2 operands from RF, execute, write back (into local reg. first, then copy into RF). [Dally et al., Computer, 2010] [Bonamy et al., 2013]
  • 35. The Energy Cost of Data Movement: fetching operands costs more than computing; the energy cost of cache coherence is huge! [Figure labels, 28nm CMOS: 64-bit DP 20 pJ; 256-bit buses 26 pJ, 256 pJ, 1 nJ; 256-bit access to 8 kB SRAM 50 pJ; efficient off-chip link 500 pJ; DRAM Rd/Wr 16 nJ] [Dally, IPDPS’11]
  • 36. Efficient Hardware Task Swapping: hiding reconfiguration time with computing. Single-context memory vs. double-context memory: the eFPGA will use double-context memory, which gains dynamic reconfiguration efficiency at the cost of ~50% overhead. [Figure: timelines of Cfg. 1, Task 1, Cfg. 2, Task 2 for each memory type; configuration cell with FF, ConfClk, Latch, ConfEn; CB: one configuration bit]
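The benefit of a double-context configuration memory can be sketched with a simple timing model: with a single context, every configuration must finish before its task runs; with two contexts, the next configuration is loaded into the inactive context while the current task executes. The model below is an assumption made for illustration (one configuration load at a time, loading of the next task starts when the current task starts) and the durations are made-up numbers, not measurements from the slides.

```python
def single_context_time(tasks):
    """Total time when each (config, run) pair executes strictly in sequence."""
    return sum(cfg + run for cfg, run in tasks)

def double_context_time(tasks):
    """Total time when the next configuration loads during the current run.
    Only the first configuration cannot be hidden behind computation."""
    if not tasks:
        return 0
    total = tasks[0][0]
    for i, (_, run) in enumerate(tasks):
        next_cfg = tasks[i + 1][0] if i + 1 < len(tasks) else 0
        # The next task starts once both the current run and its own
        # configuration load have finished.
        total += max(run, next_cfg)
    return total

# Hypothetical (configuration time, execution time) pairs in arbitrary units.
tasks = [(2, 5), (3, 4)]
print(single_context_time(tasks), double_context_time(tasks))
```

In this model the reconfiguration time is fully hidden whenever a task runs at least as long as the next configuration takes to load; otherwise only part of it is hidden.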
  • 37. eFPGA (V1) Architecture. [Figure: Logic Block with LUT, CLBIN, FF, mux, CB, ScanIn, ScanOut, CLBOUT, clk, rstb; Switch Block with four CB cells, NORTH(i), SOUTH(i), EAST(i), WEST(i), ScanIn, ScanOut]
  • 38. eFPGA Architecture: Interconnection Block. [Figure: ports CLBIN[0] to CLBIN[3] and CLBOUT; NORTH, SOUTH, WEST and EAST tracks numbered 0 to 3]
  • 39. eFPGA Architecture: eFPGA macro. [Figure labels: CLB (i,j), CLB (i+1,j), CLB (i,j+1); SB (i,j), SB (i-1,j), SB (i,j-1); CHANX (i,j), CHANX (i+1,j); CHANY (i,j), CHANY (i,j+1); ports CLBIN[0] to CLBIN[3] and CLBOUT]
  • 40. eFPGA Floorplan