Thanks for your attention
Questions?
marco.bacis@mail.polimi.it
stefania.deligia@mail.polimi.it
serena.farina@mail.polimi.it
Please follow us:
https://twitter.com/nneuramaatnecst
https://www.facebook.com/pg/NNEURaMAatNECST/
https://www.slideshare.net/NNEURaMAProject
Editor's Notes
Hello from the NNEURaMA project. In today's video we will explain the reasons that led us to use an FPGA chip in our project.
First, let's have a brief recap of our project: NNEURaMA aims at recognizing the presence of microaneurysms in retinal images.
To perform the classification, we chose Convolutional Neural Networks, as they are the kind of algorithm best suited for image recognition and classification.
In particular, a CNN is composed of different layers, which extract different features from a picture and combine them. All these operations are carried out using a set of trained weights.
The most compute-intensive operations in CNNs are convolutions and pooling, so we designed our architecture around accelerating them.
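As a rough illustration of the two operations just mentioned, here is a minimal pure-Python sketch of a convolution and a 2x2 max-pooling step. This is only a reference model of the math, not our FPGA implementation; the function names and sizes are our own choices for the example.

```python
def conv2d(image, kernel):
    """Valid 2-D convolution: slide the kernel over the image,
    computing a multiply-and-accumulate (MAC) at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            acc = 0.0
            for r in range(kh):
                for c in range(kw):
                    acc += image[i + r][j + c] * kernel[r][c]
            out[i][j] = acc
    return out

def max_pool2x2(fmap):
    """2x2 max pooling with stride 2: keep the largest value
    in each 2x2 window of the feature map."""
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]
```

Every output pixel of `conv2d` is one MAC chain, which is exactly the loop nest the hardware accelerates.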
FPGAs are not the only architecture available for implementing our application: standard CPUs, GPUs and Application-Specific Integrated Circuits (ASICs) are also options. We will now illustrate the characteristics that led us away from these competitors.
The main operation in a CNN is the MAC (Multiply and Accumulate). A CPU usually performs these operations sequentially, which leads to low performance when the number of operations is high, as in a CNN. With a custom hardware design, higher parallelization is possible: many operations can be performed in parallel, provided they are independent. This is the case for convolution, as the MAC operations within the same layer depend only on the previous layer, not on one another. In this way, the hardware is used more efficiently, yielding higher performance.
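The independence argument above can be sketched in a few lines: each output of a layer is one MAC over the previous layer's values, and since no output depends on a sibling output, they can all be computed concurrently. Here a thread pool merely stands in for the hardware parallelism of an FPGA; the names are illustrative, not from our design.

```python
from concurrent.futures import ThreadPoolExecutor

def mac(inputs, weights):
    """One multiply-and-accumulate: the core CNN operation."""
    acc = 0.0
    for x, w in zip(inputs, weights):
        acc += x * w
    return acc

def layer_parallel(prev_layer, weight_rows):
    """Each output's MAC reads only prev_layer, never a sibling
    output, so all MACs can safely run at the same time."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda w: mac(prev_layer, w), weight_rows))
```

On a CPU the MACs in `pool.map` still time-share a few cores; on custom hardware each one can become its own multiply-add unit, which is where the speedup comes from.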
Given that a custom hardware design seems to be the best option for implementing our algorithm, we face the choice of implementing it on an FPGA or designing a custom digital circuit (an ASIC: Application-Specific Integrated Circuit). ASICs deliver the best achievable performance for a given application, but they come with high costs and a fixed design. In the case of neural networks (and CNNs), the parameters and topology of the network are improved iteratively, meaning that the design is continuously updated. The network keeps changing even over long periods, as new and better methods and techniques appear. This prevents us from choosing a fixed circuit design for our application, as the performance does not justify the high cost and low adaptability. FPGAs, by contrast, offer a better tradeoff between cost, performance and flexibility, thanks to their reconfigurability.
Apart from video processing, GPUs can be used to accelerate algorithms through their large set of parallel processors. Unfortunately, they require high power to operate, and even in large datacenters power consumption has become an issue, in addition to heat dissipation.
FPGAs, on the other hand, have lower power consumption and, under certain conditions, can reach the same performance as GPUs.
Based on the previous explanations, we can draw our conclusions: CPUs lack the required throughput, ASICs perform well but are not reconfigurable, and GPUs are too power-hungry.
This has led us to choose FPGAs for the implementation of our CNN-based algorithm.
Thanks for watching! Like and share this video. You can also find us on Facebook and Twitter, and if you have any questions, feel free to contact us!