Francesco Sgherzi
Alberto Parravicini
Approximate
Personalized PageRank
on FPGA
Francesco Sgherzi
Alberto Parravicini
29 July 2020
Approximating Personalized
PageRank (PPR) on FPGA
PPR is a building block of recommender systems
in e-commerce and social networks
● For an input node, find the most similar nodes
● We need real-time results, with
high energy efficiency
No need for 100% accuracy
● The ranking is what matters!
● Use reduced-precision fixed-point arithmetic
for better performance at no accuracy cost
● FPGAs provide high performance
and energy efficiency
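To make the reduced-precision idea concrete, here is a minimal Python sketch. The slides do not give the exact number format or rounding mode, so the unsigned layout (1 integer bit, the rest fractional), the truncation, and the helper names `to_fixed`, `from_fixed` and `ranking` are all illustrative assumptions.

```python
def to_fixed(x: float, total_bits: int = 26, int_bits: int = 1) -> int:
    """Quantize a non-negative value to an unsigned fixed-point integer.

    Assumed format: `int_bits` integer bits, the remaining bits fractional.
    Values are truncated toward zero and saturated to the representable range.
    """
    frac_bits = total_bits - int_bits
    max_raw = (1 << total_bits) - 1
    raw = int(x * (1 << frac_bits))
    return min(max(raw, 0), max_raw)


def from_fixed(raw: int, total_bits: int = 26, int_bits: int = 1) -> float:
    """Convert the raw fixed-point integer back to a float."""
    frac_bits = total_bits - int_bits
    return raw / (1 << frac_bits)


def ranking(scores):
    """Node indices sorted by decreasing score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical PPR scores, quantized to 20 bits and converted back.
scores = [0.40, 0.05, 0.25, 0.30]
quantized = [from_fixed(to_fixed(s, total_bits=20), total_bits=20) for s in scores]
```

In this toy case the quantized values differ slightly from the originals, but `ranking(scores)` and `ranking(quantized)` agree, which is the property the slide emphasizes.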
Problem Definition
Graphs can be seen as sparse matrices
● E.g. COO format, list of edges/non-zero
matrix entries
● The PPR equation becomes a repeated
SpMV (Sparse matrix-vector multiplication)
● The SpMV is the biggest bottleneck
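The points above can be sketched in plain Python. The slides do not show the PPR equation, so this sketch assumes the standard power-iteration form p ← α·M·p + (1 − α)·e_source, with M the column-normalized adjacency matrix stored as COO triples. The names `spmv_coo` and `personalized_pagerank`, the damping factor 0.85, the iteration count and the toy graph are assumptions for illustration only.

```python
def spmv_coo(rows, cols, vals, x, n):
    """Compute y = A @ x for a sparse matrix A given as COO triples."""
    y = [0.0] * n
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y


def personalized_pagerank(edges, n, source, alpha=0.85, iters=50):
    """Power iteration: p <- alpha * M p + (1 - alpha) * e_source.

    edges: list of (src, dst) pairs; M[dst][src] = 1 / out_degree(src).
    Assumes every node has at least one outgoing edge (no dangling nodes).
    """
    out_deg = [0] * n
    for s, _ in edges:
        out_deg[s] += 1

    # COO representation: one (row, col, value) triple per edge.
    rows = [d for _, d in edges]
    cols = [s for s, _ in edges]
    vals = [1.0 / out_deg[s] for s, _ in edges]

    p = [0.0] * n
    p[source] = 1.0
    for _ in range(iters):
        y = spmv_coo(rows, cols, vals, p, n)  # one SpMV per iteration
        p = [alpha * y_i for y_i in y]
        p[source] += 1.0 - alpha              # personalization term
    return p


# Toy graph with 4 nodes; node 0 is the input node.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
ppr = personalized_pagerank(edges, n=4, source=0)
```

Each iteration touches every edge exactly once inside `spmv_coo`, which is why the SpMV dominates the running time.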
SpMV Architecture
The heart of our implementation is a
data-flow reduced-precision SpMV core
● Implemented on a
Xilinx Alveo U200
Accelerator Card
● Use UltraRAM for fast
random accesses
● Use DRAM for burst
sequential reads, at
peak bandwidth
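One way to read the memory split above is that the edge list is consumed strictly in order (a good fit for burst reads), while the vector entries are looked up at arbitrary indices (a good fit for fast on-chip memory). The Python model below only mimics that access pattern in software; it is not the hardware design, and `edge_bursts`, `streaming_spmv` and the burst size are hypothetical.

```python
def edge_bursts(edges, burst_size):
    """Yield consecutive fixed-size chunks of the edge list (sequential reads)."""
    for start in range(0, len(edges), burst_size):
        yield edges[start:start + burst_size]


def streaming_spmv(edges, x, n, burst_size=4):
    """COO SpMV that streams edges in bursts and accesses x and y by index.

    edges: list of (row, col, value) triples, read once in order.
    x:     dense input vector, read at arbitrary positions.
    """
    y = [0.0] * n  # stands in for an on-chip result buffer
    for burst in edge_bursts(edges, burst_size):
        # Stage 1: multiply each edge value by the matching vector entry.
        products = [(r, v * x[c]) for r, c, v in burst]
        # Stage 2: accumulate the products into the destination rows.
        for r, prod in products:
            y[r] += prod
    return y


edges = [(0, 1, 0.5), (1, 0, 1.0), (2, 0, 0.5), (2, 1, 0.5), (0, 2, 1.0)]
x = [1.0, 2.0, 3.0]
y = streaming_spmv(edges, x, n=3, burst_size=2)
```

The result does not depend on `burst_size`; the chunking only changes how the edge stream is read.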
Experimental Setup
We compare against a multi-threaded SoA CPU implementation
and a floating-point FPGA implementation
Test with 8 different real and synthetic graphs
● Between 500K and 2M edges
● Size similar to real community and co-purchasing graphs
Different fixed-point sizes, from 20 to 26 bits
● Lower bit-width gives better clock speed and lower resource usage
CPU: 2x Intel Xeon E5-2680 v2, 384GB RAM
FPGA: Xilinx Alveo U200
Results - Speedup
Up to 6x speedup w.r.t. CPU and floating-point FPGA
● 42x higher energy efficiency w.r.t. CPU and up to 6x w.r.t. floating-point FPGA
Results - Ranking
Accuracy of results tested with many metrics
● Num. of errors, NDCG, Precision, Kendall’s 𝜏, ...
● 26 bits provide no accuracy loss
Fixed-point arithmetic provides 2x faster convergence
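For reference, the ranking metrics listed above can be computed as in the sketch below. The slides do not define them, so these are common textbook definitions chosen as assumptions: errors are counted position by position in the top-k, Kendall's τ is the tau-a variant, and NDCG uses the reference scores as relevance. All function names and the sample scores are hypothetical.

```python
import math
from itertools import combinations


def top_k(scores, k):
    """Indices of the k highest-scoring nodes, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def num_errors(ref, approx, k):
    """Number of top-k positions where the two rankings name different nodes."""
    return sum(a != b for a, b in zip(top_k(ref, k), top_k(approx, k)))


def precision_at_k(ref, approx, k):
    """Fraction of the reference top-k that also appears in the approximate top-k."""
    return len(set(top_k(ref, k)) & set(top_k(approx, k))) / k


def kendall_tau(ref, approx):
    """Kendall's tau-a: (concordant - discordant) / total pairs; ties count as neither."""
    concordant = discordant = 0
    for i, j in combinations(range(len(ref)), 2):
        sign = (ref[i] - ref[j]) * (approx[i] - approx[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(ref) * (len(ref) - 1) // 2
    return (concordant - discordant) / pairs


def ndcg_at_k(ref, approx, k):
    """NDCG@k of the approximate ranking, with reference scores as relevance."""
    def dcg(order):
        return sum(ref[node] / math.log2(pos + 2) for pos, node in enumerate(order))
    return dcg(top_k(approx, k)) / dcg(top_k(ref, k))


ref = [0.45, 0.19, 0.36, 0.00]      # hypothetical floating-point scores
same = [0.44, 0.20, 0.35, 0.01]     # perturbed values, identical ranking
swapped = [0.36, 0.19, 0.45, 0.00]  # top two nodes exchanged
```

Comparing `ref` with `same` gives a perfect score on every metric even though the values differ, while `swapped` keeps precision at 1.0 for k = 2 but lowers Kendall's τ and NDCG, which is why several metrics are useful together.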
● High-Performance FPGA implementation of Approximate PPR
● Use fixed-point data-flow SpMV for maximum throughput
● Up to 6.8x faster than CPU and 42x higher energy efficiency
● No accuracy loss and 2x faster convergence
Thank you!
Approximate Personalized PageRank on FPGA
Francesco Sgherzi - francesco1.sgherzi@mail.polimi.it
Alberto Parravicini - alberto.parravicini@polimi.it
