VISpro Vision Processing Lab
Literature Summary
2
List
1. Implementation of a RISC-V Processor with Hardware Accelerator (Bachelor's thesis, Chalmers University, 2019)
2. A Reconfigurable Convolutional Neural Network-Accelerated Coprocessor Based on RISC-V Instruction Set (MDPI Electronics 2020)
3. Open-Source RISC-V Processor IP Cores for FPGAs – Overview and Evaluation (MECO 2019)
4. Extending a Soft-Core RISC-V Processor to Accelerate CNN Inference (MDPI Electronics 2020)
5. XpulpNN: Accelerating Quantized Neural Networks on RISC-V Processors Through ISA Extensions (DATE 2020)
6. RISC-V Barrel Processor for Deep Neural Network Acceleration (ISCAS 2021)
7. Work-in-Progress: The RISC-V Instruction Set Architecture Optimization and Fixed-point Math Library Co-design (CODES+ISSS 2021)
3
Implementation of a RISC-V Processor with Hardware Accelerator
■ Designs a hardware accelerator for matrix operations and implements it together with a
RISC-V processor.
■ The accelerator supports a set of matrix operations.
■ It is evaluated and compared against the RISC-V core performing the same operations in software.
■ It is also compared with alternative ways of performing equivalent matrix calculations.
4
A Reconfigurable Convolutional Neural Network-Accelerated Coprocessor
Based on RISC-V Instruction Set
■ The proposed CNN accelerator is connected to the E203 core as a coprocessor,
and corresponding custom coprocessor instructions are designed.
■ The coprocessor implements four basic algorithms: convolution, pooling, ReLU, and matrix
addition.
■ A compiler environment and library functions are established for the designed
instructions, the hardware and software design of the coprocessor is completed,
and the implementation of common IoT algorithms on the coprocessor is described.
■ Resource consumption and performance of the coprocessor are evaluated on a
Xilinx FPGA.
■ Convolution runs 6.27 times faster than with the standard instruction set.
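As a point of reference for the four operations named above, the following is a minimal software sketch of what each one computes. It is not the coprocessor's design: the function names, the stride-1 unpadded convolution, and the non-overlapping 2x2 max pooling are all assumptions chosen for illustration.

```python
def conv2d(image, kernel):
    """Unpadded, stride-1 2-D convolution in the cross-correlation form
    commonly used by CNNs (assumed variant, not taken from the paper)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling with a size x size window (assumed)."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


def relu(fmap):
    """Element-wise max(0, x)."""
    return [[max(0, v) for v in row] for row in fmap]


def mat_add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
```

A software baseline of this kind, compiled to the standard instruction set, is the sort of reference against which a speedup such as the reported 6.27x would be measured.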
5
Open-Source RISC-V Processor IP Cores for FPGAs – Overview and Evaluation
■ Comparison of open-source processor IP cores. Only open-source cores with a free ISA, 32-bit data width, an available standard bus interface,
synchronous design, a tool- and technology-independent code base, and existing compiler support were considered here. The first four cores are
listed for comparison only.
6
Extending a Soft-Core RISC-V Processor to Accelerate CNN Inference
■ Investigates the potential of extending the RISC-V ISA to accelerate CNN inference
using in-pipeline hardware blocks and custom instructions.
■ Preliminary designs have a small footprint and minimal impact on the maximum
core frequency.
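To make the idea of a custom instruction concrete, here is a small sketch that builds the 32-bit encoding of an R-type instruction in the `custom-0` major opcode, which the RISC-V specification reserves for custom extensions. The helper name `encode_r_type` and the choice of `funct3`/`funct7` values are hypothetical; the slide does not say which encodings the paper uses.

```python
CUSTOM0_OPCODE = 0b0001011  # "custom-0" major opcode in the RISC-V base encoding


def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode=CUSTOM0_OPCODE):
    """Pack R-type fields into a 32-bit instruction word.

    Layout: funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0]
    """
    fields = (
        ("funct7", funct7, 7), ("rs2", rs2, 5), ("rs1", rs1, 5),
        ("funct3", funct3, 3), ("rd", rd, 5), ("opcode", opcode, 7),
    )
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


# Hypothetical custom operation: rd = x10 (a0), rs1 = x11 (a1), rs2 = x12 (a2)
word = encode_r_type(funct7=0, rs2=12, rs1=11, funct3=0, rd=10)
print(f"{word:#010x}")
```

A word built this way can be placed in an assembly source with a `.word` directive when the toolchain has not yet been taught the new mnemonic.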
7
XpulpNN: Accelerating Quantized Neural Networks on RISC-V Processors
Through ISA Extensions
■ Proposes a set of extensions to the RISC-V ISA aimed at boosting the energy efficiency
of low-bit-width QNNs on low-power microcontroller-class cores.
■ Integrates the proposed extensions into the RTL description and GCC toolchain of an
open-source RISC-V processor targeting energy efficiency.
■ Implements a full microcontroller system (based on the open-source PULPissimo).
■ The extended processor has 11.1% area overhead, 5.9% power overhead, and negligible
timing overhead compared to the original one.
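The general idea behind sub-byte SIMD extensions for QNNs can be sketched in software: several low-bit-width operands share one 32-bit register, and a single operation multiplies them pairwise and accumulates the sum. The model below assumes signed 4-bit elements, eight per word, with element 0 in the low bits; the function names are hypothetical and this is not a description of XpulpNN's actual instruction semantics.

```python
def pack_nibbles(values):
    """Pack eight signed 4-bit integers (-8..7) into one 32-bit word."""
    if len(values) != 8:
        raise ValueError("expected exactly eight values")
    word = 0
    for i, v in enumerate(values):
        if not -8 <= v <= 7:
            raise ValueError(f"{v} does not fit in a signed 4-bit field")
        word |= (v & 0xF) << (4 * i)  # two's-complement nibble, element 0 lowest
    return word


def unpack_nibbles(word):
    """Inverse of pack_nibbles: sign-extend each 4-bit field."""
    out = []
    for i in range(8):
        n = (word >> (4 * i)) & 0xF
        out.append(n - 16 if n >= 8 else n)
    return out


def packed_dot(word_a, word_b, acc=0):
    """Model of a sum-of-dot-products step: acc + sum(a[i] * b[i])."""
    return acc + sum(a * b for a, b in
                     zip(unpack_nibbles(word_a), unpack_nibbles(word_b)))
```

Without such an operation, a core would have to unpack each element with shifts and masks before multiplying, which is the overhead this class of extension aims to remove.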
8
RISC-V Barrel Processor for Deep Neural Network Acceleration
■ Designs a barrel RISC-V processor to control NN processing elements (PEs).
■ For lower area, the processor implements RV32I plus custom CSRs for
controlling the PEs.
■ To demonstrate the capabilities of the design, a GEMV operation with
an input matrix size of 8 by 128 and a weight matrix size of 128 by 128 at
two-bit precision was computed in only 16 clock cycles.
(ISCAS 2021)
CSRs: Control and Status Registers. Each PE performs arbitrary-precision GEneral Matrix Vector (GEMV) operations.
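The reported figure can be turned into an implied throughput with a little arithmetic. The sketch below assumes a dense computation in which each of the 8 x 128 output elements needs a full 128-term dot product; the slide does not state the number of PEs or how the work is distributed, so the result is only a back-of-the-envelope estimate.

```python
def gemv_macs(rows, inner, cols):
    """Multiply-accumulate count for a dense (rows x inner) by (inner x cols) product."""
    return rows * inner * cols


def macs_per_cycle(rows, inner, cols, cycles):
    """Average multiply-accumulates completed per clock cycle."""
    return gemv_macs(rows, inner, cols) / cycles


total = gemv_macs(8, 128, 128)             # 131072 MACs
rate = macs_per_cycle(8, 128, 128, 16)     # 8192.0 MACs per cycle
print(total, rate)
```

Under these assumptions the design would sustain roughly 8192 two-bit multiply-accumulates per cycle.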
9
Work-in-Progress: The RISC-V Instruction Set Architecture Optimization and
Fixed-point Math Library Co-design
■ Finding: the current RISC-V instruction set has room for improvement for lightweight
mathematical operations.
■ Introduces a set of custom RISC-V ISA designs for fixed-point math libraries.
■ Performs performance evaluations to demonstrate the optimization results.
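For readers unfamiliar with fixed-point arithmetic, the sketch below shows the kind of lightweight operation such a library provides. It assumes a Q16.16 format (16 integer bits, 16 fractional bits); the format and the function names are illustrative assumptions, since the slide does not say which formats the paper's library uses.

```python
FRAC_BITS = 16            # assumed Q16.16 format
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0


def to_fixed(x):
    """Convert a float to Q16.16 by scaling and rounding."""
    return int(round(x * ONE))


def to_float(q):
    """Convert a Q16.16 value back to a float."""
    return q / ONE


def fx_mul(a, b):
    """Multiply two Q16.16 values.

    The raw product carries 32 fractional bits, so it is shifted right
    by FRAC_BITS to return to Q16.16 (arithmetic shift, rounds toward -inf).
    """
    return (a * b) >> FRAC_BITS


print(to_float(fx_mul(to_fixed(1.5), to_fixed(2.0))))
```

On a 32-bit core the intermediate product needs 64 bits, so with the standard M extension this multiply takes a `mul`/`mulh` pair plus shifts to recombine the halves. Multi-instruction sequences like this are a natural target for custom instructions.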
End


Editor's Notes

  1. In this presentation we will discuss our understanding of the MatMul Subsystem developed by the UCI+Xcelerium team.
  2. Bachelor's thesis, Chalmers University, Sweden, 2019; MDPI Electronics 9, 1005 (2020); MECO (Mediterranean Conference on Embedded Computing) 2019; MDPI Electronics, 5 May 2020; Design, Automation and Test in Europe (DATE) 2020; 2021 IEEE International Symposium on Circuits and Systems (ISCAS); 2021 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).
  3. Designs a reconfigurable CNN-accelerated coprocessor based on the RISC-V instruction set