1
Operating Systems for FPGAs – Why and How?
Dirk Koch and a great team,
University of Manchester, UK
dirk.koch@manchester.ac.uk
2
FOS – the FPGA Operating System
3
FOS – the FPGA Operating System
ZUCL: pre-built shells for Zynq UltraScale+ with middleware for running OpenCL
• Comes with compile scripts (RTL/HLS → relocatable bitstream)
• UltraZed, Ultra96, ZCU102
4
Resource Elastic FPGA Virtualization (for OpenCL)
• Using aggressive reconfiguration to keep utilization high
• Virtualization in the space domain (time-domain fallback, if needed)
FOS – the FPGA Operating System
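A minimal sketch of what space-domain sharing with a time-domain fallback could look like. It is not the FOS scheduling policy, which the slide does not describe. The function name `allocate`, the equal-sized slot model, and the even-split rule are all assumptions made for illustration.

```python
def allocate(num_slots, tasks):
    """Split slots among tasks (hypothetical policy, not taken from FOS).

    Returns (space, queued):
      space  - dict mapping each running task to its number of slots
      queued - tasks that do not fit and must wait for a time slice
    """
    running = tasks[:num_slots]   # at most one task per slot in the space domain
    queued = tasks[num_slots:]    # time-domain fallback for the rest
    if not running:
        return {}, queued
    base, extra = divmod(num_slots, len(running))
    # Hand out leftover slots to the first tasks so that no slot stays idle.
    space = {t: base + (1 if i < extra else 0) for i, t in enumerate(running)}
    return space, queued


if __name__ == "__main__":
    print(allocate(4, ["a"]))                      # one task grows to fill the fabric
    print(allocate(4, ["a", "b", "c"]))            # tasks shrink as others arrive
    print(allocate(4, ["a", "b", "c", "d", "e"]))  # the fifth task has to time-share
```

Re-running `allocate` whenever a task arrives or leaves mimics elasticity: an accelerator is replicated into free slots while the fabric is underused and gives them back under load.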
5
How to deploy FPGAs in datacenters?
Several commercial settings:
6
How to deploy FPGAs in datacenters
• Different development tools, tool versions, and programming models (HLS or RTL likely not portable)
• Different shell infrastructures (even if cloud vendors use the same FPGA, memory, or board)
• Different HALs and middleware
• No real abstraction (shell updates → accelerator updates)
• No real cloud principles (elasticity? resource pooling?)
• Hardware security? (applies to single and multi-tenancy)
7
How NOT to deploy FPGAs in a datacenter!
Example AWS:
• Shell interface
  • Logical interface is fixed (AXI buses)
  • Physical interface is not fixed (exact layout of the shell and routing)
• Creates dependency
  • Modules run only with a specific shell
  • Requires shell management
[Figure: FPGA floorplan with labels shell, memory (x4), die1, die2, die3]
8
How to deploy FPGAs in datacenters!
Example PCIeHLS [FSP 2017]:
• Shell with shared memory and all I/O interfaces
• Fixed internal AXI interfaces (clock-domain crossing)
• Modules of different size
• Module bitstream relocation
  • BitMan [DATE 2017]
• Module replication
• Independence of shell and accelerator modules
• Can share memory, I/O bandwidth, FPGA resources
[Figure: FPGA floorplan with labels PCIe & 2x DDR3, slot1 to slot4, slotX]
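To make the combination of "modules of different size" and "module bitstream relocation" concrete, here is a small sketch of first-fit placement into a row of slots. It assumes that a module occupies a run of adjacent slots and that a relocatable bitstream can be loaded at any such run. The names `first_fit` and `place` are hypothetical and are not part of PCIeHLS or BitMan.

```python
def first_fit(slots, width):
    """Return the start index of the first run of `width` free slots, or None."""
    if width < 1:
        raise ValueError("width must be at least 1")
    run = 0
    for i, owner in enumerate(slots):
        run = run + 1 if owner is None else 0   # length of the current free run
        if run == width:
            return i - width + 1
    return None


def place(slots, name, width):
    """Mark `width` adjacent slots as owned by `name`; return the start or None."""
    start = first_fit(slots, width)
    if start is not None:
        slots[start:start + width] = [name] * width
    return start


if __name__ == "__main__":
    fabric = [None] * 4
    print(place(fabric, "fir", 1), fabric)
    print(place(fabric, "aes", 2), fabric)
    print(place(fabric, "fft", 2), fabric)   # no room: only one slot is left
```

Because the shell and the modules are independent, the same `place` call would work for any shell that exposes the same slot interface, which is the point the slide makes.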
9
How to deploy FPGAs in datacenters!
[Figure: two floorplans. First set of labels: shell, memory (x4), die1, die2, die3, PCIe. Second set of labels: PCIe & 2x DDR3, slot1 to slot4, slotX. Caption: Alveoshell]
10
• Resource elasticity in a networked system
• Load balancing and load distribution
• System keeps track of input data and committed results
• Implements a distributed checkpointing scheme (tailored to OpenCL) for resilience
• Also useful for maintenance/updates
Live Migration
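The idea of tracking input data and committed results can be sketched as follows. This is an assumption-laden illustration, not the checkpointing scheme from the talk: `JobLog`, `run`, and the `fail_at` parameter are hypothetical names, and a real system would keep the log in storage that survives the loss of a node.

```python
class JobLog:
    """Records which work items of a job already have committed results."""

    def __init__(self, inputs):
        self.inputs = list(inputs)
        self.committed = {}          # item index -> result

    def commit(self, index, result):
        self.committed[index] = result

    def pending(self):
        return [i for i in range(len(self.inputs)) if i not in self.committed]


def run(log, kernel, fail_at=None):
    """Process pending items; stop at `fail_at` to simulate a node going away."""
    for i in log.pending():
        if i == fail_at:
            return False             # results committed so far stay in the log
        log.commit(i, kernel(log.inputs[i]))
    return True


if __name__ == "__main__":
    log = JobLog([1, 2, 3, 4])
    run(log, lambda x: x * x, fail_at=2)   # first node stops before item 2
    run(log, lambda x: x * x)              # another node resumes the pending items
    print([log.committed[i] for i in range(4)])
```

Only the uncommitted items are recomputed after the move, so the same mechanism covers both recovery from a failure and a planned migration for maintenance.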
11
FPGAs for Datacenters (H2020 ECOSCALE)
ECOSCALE demonstrator: fully populated 1U blade with 32 x Zynq UltraScale+ (ZU9EG with 16 GB/node or 512 GB/blade)
www.ecoscale.eu
12
FPGAs for Datacenters (H2020 ECOSCALE)
Partial reconfiguration allows moving compute to data → huge energy savings
13
FPGAs for Datacenters (H2020 EuroEXA)
• 2-D integrated chips with custom ASIC (64-bit ARM) and 2x VU9P FPGAs
• 300+ VU9Ps (will provide ~1B LUTs, 2M DSPs)
www.euroexa.eu
14
FPGADefender: Virus Scanning for FPGAs
• Detects probably any kind of self-oscillating circuit
• Scans bitstream encoding (short circuits), high fan-out nets, wire tapping, module bounding boxes, glitch amplification, interfaces to the shell, … (all at bitstream (netlist) level)
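The slide does not say how self-oscillating circuits are detected. One generic check on a netlist is to look for combinational loops, since a ring oscillator is a cycle of logic with no register in it. The sketch below is a plain depth-first cycle search, not FPGADefender's method. It assumes the netlist is given as a dictionary from a node name to the nodes it drives, with edges through registers already left out.

```python
def find_combinational_loop(netlist):
    """Return one cycle as a list of node names, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2     # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in netlist.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:        # back edge: nxt is already on the path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(netlist):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    ring = {"inv1": ["inv2"], "inv2": ["inv3"], "inv3": ["inv1", "out"]}
    print(find_combinational_loop(ring))
    print(find_combinational_loop({"a": ["b"], "b": ["c"]}))
```

The recursion keeps the example short. A netlist with very long logic paths would need an iterative search to stay within Python's recursion limit.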
15
People:
• Tuan Minh La {tuan.la@postgrad.manchester.ac.uk}
• Anuj Vaishnav {anuj.vaishnav@manchester.ac.uk}
• Khoa Dang Pham {khoa.pham@manchester.ac.uk}
• Kaspar Matas {kaspar.matas@manchester.ac.uk}
• Nikola Grunchevski {nikola.grunchevski@manchester.ac.uk}
• Dirk Koch {dirk.koch@manchester.ac.uk}
FPGADefender
Secure and Virtualized FPGA Management for FPGAs
