SlideShare a Scribd company logo

FPGA Undervolting and Checkpointing for Energy-Efficiency and Error-Resiliency

LEGATO project
LEGATO project
LEGATO projectLEGATO project

Tutorial by Behzad Salami, Osman Unsal and Leonardo Bautista at 30th International Conference on Field-Programmable Logic and Applications (FPL2020), 3 September 2020

FPGA Undervolting and Checkpointing for Energy-Efficiency and Error-Resiliency

1 of 51
Download to read offline
FPGA Undervolting for Energy-Efficiency
30th International Conference on Field-Programmable Logic and Applications (FPL).
3th September, 2020.
Behzad Salami
Barcelona Supercomputing Center (BSC)
2
Outline
• Motivation and Background
• Methodology and Results
- Undervolting FPGA On-Chip Memories
- Undervolting FPGA Internal Components
• More Information
3
Aggressive Undervolting
• Aggressive undervolting- Underscaling the supply voltage below the
nominal and safe level:
 Power/Energy Efficiency: Reduces dynamic and static power quadratically
and linearly, respectively.
 Reliability: Increases the circuit delay and in turn, causes timing faults.
• Dual/Multi-Vdd, DVS, and DVFS: Similar but different mechanisms to
aggressive undervolting:
 Similarity: Underscaling the supply voltage.
 Difference: Undervolting is until a certain safe level, usually constrained by
vendors.
Reliability
Power/Energy
Efficiency
4
State-of-the-art
• Aggressive undervolting has shown significant efficiency
to reduce the energy consumption.
 Devices:
 CPUs: Itanium II (ISCA2014), X86 (IOLTS2017), ARM
(HPCA2017)
 GPUs: NVidia (Micro2015)
 DRAMs: Multiple Brands (Sigmetrics2017)
 FPGA: This work
 Focus of the previous works:
 Voltage guardband
 Minimum safe voltage, i.e., Vmin prediction
 Fault characterization and mitigation
 Chip-to-chip, core-to-core, and workload-to-workload variation
 ….
• More straightforward and more parameters
but less precise
 ASIC DNN: Minerva (Micro2016), Thundervolt (DAC2018)
 CPU: Bravo (HPCA2017 )
 Network On-Chip (HPCA2014)
Real hardware:
Simulation-based studies:
5
Undervolting on FPGAs: Motivation
Contribution of FPGAs in large data centers is growing, expected
to be in 30% of datacenter servers by 2020 (Top500 news).
• In comparison to ASICs,
energy efficiency of FPGAs
is a serious concern, i.e.,
10X-100X less-efficient.
• Nominal voltage reduction
of FPGAs is naturally
applied for different
generations.
Undervolting
[Intel/Altera]
[Xilinx]
6
Outline
• Motivation and Background
• Methodology and Results
- Undervolting FPGA On-Chip Memories
- Undervolting FPGA Internal Components
• More Information

Recommended

Comprehensive evaluation of supply voltage underscaling in FPGA on chip memories
Comprehensive evaluation of supply voltage underscaling in FPGA on chip memoriesComprehensive evaluation of supply voltage underscaling in FPGA on chip memories
Comprehensive evaluation of supply voltage underscaling in FPGA on chip memoriesLEGATO project
 
Evaluating Built-In ECC of FPGA On-Chip Memories for the Mitigation of Underv...
Evaluating Built-In ECC of FPGA On-Chip Memories for the Mitigation of Underv...Evaluating Built-In ECC of FPGA On-Chip Memories for the Mitigation of Underv...
Evaluating Built-In ECC of FPGA On-Chip Memories for the Mitigation of Underv...bscdissemination
 
FPGAVolt: Low-power FPGA-based DNN Accelerator through Aggressive Undervolting
FPGAVolt: Low-power FPGA-based DNN Accelerator through Aggressive Undervolting  FPGAVolt: Low-power FPGA-based DNN Accelerator through Aggressive Undervolting
FPGAVolt: Low-power FPGA-based DNN Accelerator through Aggressive Undervolting LEGATO project
 
IOLTS 2019: Agressive Undervolting of FPGAs: Power and Reliability Trade-offs
IOLTS 2019: Agressive Undervolting of FPGAs: Power and Reliability Trade-offsIOLTS 2019: Agressive Undervolting of FPGAs: Power and Reliability Trade-offs
IOLTS 2019: Agressive Undervolting of FPGAs: Power and Reliability Trade-offsLEGATO project
 
INTELLIGENT POWER MODULE Model No : PEC16DSMO1
INTELLIGENT POWER MODULE Model No : PEC16DSMO1 INTELLIGENT POWER MODULE Model No : PEC16DSMO1
INTELLIGENT POWER MODULE Model No : PEC16DSMO1 MayankSunhare
 

More Related Content

What's hot

Conversor a d mcp3201
Conversor a d mcp3201Conversor a d mcp3201
Conversor a d mcp3201Samuel Borges
 
Linear Isolators with Analog Devices iCoupler Technology
Linear Isolators with Analog Devices iCoupler TechnologyLinear Isolators with Analog Devices iCoupler Technology
Linear Isolators with Analog Devices iCoupler TechnologyAnalog Devices, Inc.
 
Constant Current Switching Regulator for LEDs with ON/OFF Function: NCP3066
Constant Current Switching Regulator for LEDs with ON/OFF Function: NCP3066Constant Current Switching Regulator for LEDs with ON/OFF Function: NCP3066
Constant Current Switching Regulator for LEDs with ON/OFF Function: NCP3066Premier Farnell
 
TechShanghai2016 - Reliable automotive-grade Isolators for new energy vehicles
TechShanghai2016 - Reliable automotive-grade Isolators for new energy vehiclesTechShanghai2016 - Reliable automotive-grade Isolators for new energy vehicles
TechShanghai2016 - Reliable automotive-grade Isolators for new energy vehiclesHardway Hou
 
Drvg_HB_LED_HP Ind_Light Fix
Drvg_HB_LED_HP Ind_Light FixDrvg_HB_LED_HP Ind_Light Fix
Drvg_HB_LED_HP Ind_Light FixSteve Mappus
 
Datasheet sensor temperatura mcp9700
Datasheet sensor temperatura mcp9700Datasheet sensor temperatura mcp9700
Datasheet sensor temperatura mcp9700ADELIUS
 
Original Power Factor Correction IC UCC28061DR 28061 SOP-16 New Texas Instrum...
Original Power Factor Correction IC UCC28061DR 28061 SOP-16 New Texas Instrum...Original Power Factor Correction IC UCC28061DR 28061 SOP-16 New Texas Instrum...
Original Power Factor Correction IC UCC28061DR 28061 SOP-16 New Texas Instrum...AUTHELECTRONIC
 
EC/Bios Interaction Laptop Repair Course
EC/Bios Interaction Laptop Repair CourseEC/Bios Interaction Laptop Repair Course
EC/Bios Interaction Laptop Repair CourseVikas Deoarshi
 
High-integrated Green-mode PWM Controller SG6841
High-integrated Green-mode PWM Controller SG6841High-integrated Green-mode PWM Controller SG6841
High-integrated Green-mode PWM Controller SG6841Bruno Camargo
 
Digital Potentiometers Replace Mechanical Potentiometers
Digital Potentiometers Replace Mechanical PotentiometersDigital Potentiometers Replace Mechanical Potentiometers
Digital Potentiometers Replace Mechanical PotentiometersPremier Farnell
 
LED Streetlight APEC Demo Performance_SMappus 03062013 AC 12 Mar 2013
LED Streetlight APEC Demo Performance_SMappus 03062013 AC 12 Mar 2013LED Streetlight APEC Demo Performance_SMappus 03062013 AC 12 Mar 2013
LED Streetlight APEC Demo Performance_SMappus 03062013 AC 12 Mar 2013Steve Mappus
 
Digital potentiometer ds1804 010+
Digital potentiometer ds1804 010+Digital potentiometer ds1804 010+
Digital potentiometer ds1804 010+robertoestrella
 
Ds012846
Ds012846Ds012846
Ds012846____
 
Non-Dimmable Lower Power LED Solutions
Non-Dimmable Lower Power LED SolutionsNon-Dimmable Lower Power LED Solutions
Non-Dimmable Lower Power LED SolutionsON Semiconductor
 
FEBFAN7688_I00250A
FEBFAN7688_I00250AFEBFAN7688_I00250A
FEBFAN7688_I00250ASteve Mappus
 

What's hot (20)

Conversor a d mcp3201
Conversor a d mcp3201Conversor a d mcp3201
Conversor a d mcp3201
 
Linear Isolators with Analog Devices iCoupler Technology
Linear Isolators with Analog Devices iCoupler TechnologyLinear Isolators with Analog Devices iCoupler Technology
Linear Isolators with Analog Devices iCoupler Technology
 
Constant Current Switching Regulator for LEDs with ON/OFF Function: NCP3066
Constant Current Switching Regulator for LEDs with ON/OFF Function: NCP3066Constant Current Switching Regulator for LEDs with ON/OFF Function: NCP3066
Constant Current Switching Regulator for LEDs with ON/OFF Function: NCP3066
 
TechShanghai2016 - Reliable automotive-grade Isolators for new energy vehicles
TechShanghai2016 - Reliable automotive-grade Isolators for new energy vehiclesTechShanghai2016 - Reliable automotive-grade Isolators for new energy vehicles
TechShanghai2016 - Reliable automotive-grade Isolators for new energy vehicles
 
Datasheet
DatasheetDatasheet
Datasheet
 
Drvg_HB_LED_HP Ind_Light Fix
Drvg_HB_LED_HP Ind_Light FixDrvg_HB_LED_HP Ind_Light Fix
Drvg_HB_LED_HP Ind_Light Fix
 
Datasheet sensor temperatura mcp9700
Datasheet sensor temperatura mcp9700Datasheet sensor temperatura mcp9700
Datasheet sensor temperatura mcp9700
 
Pdiusbd11
Pdiusbd11Pdiusbd11
Pdiusbd11
 
Original Power Factor Correction IC UCC28061DR 28061 SOP-16 New Texas Instrum...
Original Power Factor Correction IC UCC28061DR 28061 SOP-16 New Texas Instrum...Original Power Factor Correction IC UCC28061DR 28061 SOP-16 New Texas Instrum...
Original Power Factor Correction IC UCC28061DR 28061 SOP-16 New Texas Instrum...
 
EC/Bios Interaction Laptop Repair Course
EC/Bios Interaction Laptop Repair CourseEC/Bios Interaction Laptop Repair Course
EC/Bios Interaction Laptop Repair Course
 
Uno 2.0-2.5
Uno 2.0-2.5Uno 2.0-2.5
Uno 2.0-2.5
 
High-integrated Green-mode PWM Controller SG6841
High-integrated Green-mode PWM Controller SG6841High-integrated Green-mode PWM Controller SG6841
High-integrated Green-mode PWM Controller SG6841
 
Digital Potentiometers Replace Mechanical Potentiometers
Digital Potentiometers Replace Mechanical PotentiometersDigital Potentiometers Replace Mechanical Potentiometers
Digital Potentiometers Replace Mechanical Potentiometers
 
Mp8126 r1.03 1384507
Mp8126 r1.03 1384507Mp8126 r1.03 1384507
Mp8126 r1.03 1384507
 
LED Streetlight APEC Demo Performance_SMappus 03062013 AC 12 Mar 2013
LED Streetlight APEC Demo Performance_SMappus 03062013 AC 12 Mar 2013LED Streetlight APEC Demo Performance_SMappus 03062013 AC 12 Mar 2013
LED Streetlight APEC Demo Performance_SMappus 03062013 AC 12 Mar 2013
 
Digital potentiometer ds1804 010+
Digital potentiometer ds1804 010+Digital potentiometer ds1804 010+
Digital potentiometer ds1804 010+
 
Ds012846
Ds012846Ds012846
Ds012846
 
Non-Dimmable Lower Power LED Solutions
Non-Dimmable Lower Power LED SolutionsNon-Dimmable Lower Power LED Solutions
Non-Dimmable Lower Power LED Solutions
 
Low Power VLSI Design
Low Power VLSI DesignLow Power VLSI Design
Low Power VLSI Design
 
FEBFAN7688_I00250A
FEBFAN7688_I00250AFEBFAN7688_I00250A
FEBFAN7688_I00250A
 

Similar to FPGA Undervolting and Checkpointing for Energy-Efficiency and Error-Resiliency

Understanding the Reliability and Power-Efficiency Trade-offs of Modern FPGAs...
Understanding the Reliability and Power-Efficiency Trade-offs of Modern FPGAs...Understanding the Reliability and Power-Efficiency Trade-offs of Modern FPGAs...
Understanding the Reliability and Power-Efficiency Trade-offs of Modern FPGAs...Behzad Salami
 
ISCA2021 Tutorial-Methods for Characterization and Analysis of Voltage Margin...
ISCA2021 Tutorial-Methods for Characterization and Analysis of Voltage Margin...ISCA2021 Tutorial-Methods for Characterization and Analysis of Voltage Margin...
ISCA2021 Tutorial-Methods for Characterization and Analysis of Voltage Margin...Behzad Salami
 
HC24.29.625-IA-23-Wide-Ruhl-Intel_2012_NTV_iA
HC24.29.625-IA-23-Wide-Ruhl-Intel_2012_NTV_iAHC24.29.625-IA-23-Wide-Ruhl-Intel_2012_NTV_iA
HC24.29.625-IA-23-Wide-Ruhl-Intel_2012_NTV_iASaurabh Dighe
 
ACS37800-Datasheet.pdf
ACS37800-Datasheet.pdfACS37800-Datasheet.pdf
ACS37800-Datasheet.pdfSvenSong
 
How to protect electronic systems against esd
How to protect electronic systems against esdHow to protect electronic systems against esd
How to protect electronic systems against esdMohamed Saadna
 
FPL 2018: Fault Characterization Through FPGAs Undervolting
FPL 2018: Fault Characterization Through FPGAs UndervoltingFPL 2018: Fault Characterization Through FPGAs Undervolting
FPL 2018: Fault Characterization Through FPGAs UndervoltingLEGATO project
 
Automatic power factor controller by microcontroller
Automatic power factor controller by microcontrollerAutomatic power factor controller by microcontroller
Automatic power factor controller by microcontrollerSanket Shitole
 
Power system-protection-presentation-dated-03-10-2013-integrated-protection-c...
Power system-protection-presentation-dated-03-10-2013-integrated-protection-c...Power system-protection-presentation-dated-03-10-2013-integrated-protection-c...
Power system-protection-presentation-dated-03-10-2013-integrated-protection-c...jbpatel7290
 
under grund fault ppt (1).pptx
under grund fault ppt (1).pptxunder grund fault ppt (1).pptx
under grund fault ppt (1).pptxPoojaG86
 
Vlsi Design of Low Transition Low Power Test Pattern Generator Using Fault Co...
Vlsi Design of Low Transition Low Power Test Pattern Generator Using Fault Co...Vlsi Design of Low Transition Low Power Test Pattern Generator Using Fault Co...
Vlsi Design of Low Transition Low Power Test Pattern Generator Using Fault Co...iosrjce
 
QuickSilver Controls QCI-DS032 QCI-N2-MX
QuickSilver Controls QCI-DS032 QCI-N2-MXQuickSilver Controls QCI-DS032 QCI-N2-MX
QuickSilver Controls QCI-DS032 QCI-N2-MXElectromate
 
Mini Power Station Product Brochure - Salevo Pty Ltd
Mini Power Station Product Brochure - Salevo Pty LtdMini Power Station Product Brochure - Salevo Pty Ltd
Mini Power Station Product Brochure - Salevo Pty LtdChristopher Panopoulos
 
Low Power Design Techniques for ASIC / SOC Design
Low Power Design Techniques for ASIC / SOC DesignLow Power Design Techniques for ASIC / SOC Design
Low Power Design Techniques for ASIC / SOC DesignRajesh_navandar
 
W1 Lab Introduction to Process Control LabInstrumentation Mea.docx
W1 Lab Introduction to Process Control LabInstrumentation Mea.docxW1 Lab Introduction to Process Control LabInstrumentation Mea.docx
W1 Lab Introduction to Process Control LabInstrumentation Mea.docxrociobradford
 
Steper Motor Control Through Wireless
Steper Motor Control Through WirelessSteper Motor Control Through Wireless
Steper Motor Control Through WirelessPawan Bahuguna
 
three phase fault analysis with auto reset for temporary fault and trip for p...
three phase fault analysis with auto reset for temporary fault and trip for p...three phase fault analysis with auto reset for temporary fault and trip for p...
three phase fault analysis with auto reset for temporary fault and trip for p...Vikram Rawani
 

Similar to FPGA Undervolting and Checkpointing for Energy-Efficiency and Error-Resiliency (20)

Understanding the Reliability and Power-Efficiency Trade-offs of Modern FPGAs...
Understanding the Reliability and Power-Efficiency Trade-offs of Modern FPGAs...Understanding the Reliability and Power-Efficiency Trade-offs of Modern FPGAs...
Understanding the Reliability and Power-Efficiency Trade-offs of Modern FPGAs...
 
ISCA2021 Tutorial-Methods for Characterization and Analysis of Voltage Margin...
ISCA2021 Tutorial-Methods for Characterization and Analysis of Voltage Margin...ISCA2021 Tutorial-Methods for Characterization and Analysis of Voltage Margin...
ISCA2021 Tutorial-Methods for Characterization and Analysis of Voltage Margin...
 
HC24.29.625-IA-23-Wide-Ruhl-Intel_2012_NTV_iA
HC24.29.625-IA-23-Wide-Ruhl-Intel_2012_NTV_iAHC24.29.625-IA-23-Wide-Ruhl-Intel_2012_NTV_iA
HC24.29.625-IA-23-Wide-Ruhl-Intel_2012_NTV_iA
 
ACS37800-Datasheet.pdf
ACS37800-Datasheet.pdfACS37800-Datasheet.pdf
ACS37800-Datasheet.pdf
 
How to protect electronic systems against esd
How to protect electronic systems against esdHow to protect electronic systems against esd
How to protect electronic systems against esd
 
FPL 2018: Fault Characterization Through FPGAs Undervolting
FPL 2018: Fault Characterization Through FPGAs UndervoltingFPL 2018: Fault Characterization Through FPGAs Undervolting
FPL 2018: Fault Characterization Through FPGAs Undervolting
 
About Sine Pulse Width Modulation
About Sine Pulse Width Modulation About Sine Pulse Width Modulation
About Sine Pulse Width Modulation
 
Automatic power factor controller by microcontroller
Automatic power factor controller by microcontrollerAutomatic power factor controller by microcontroller
Automatic power factor controller by microcontroller
 
Power system-protection-presentation-dated-03-10-2013-integrated-protection-c...
Power system-protection-presentation-dated-03-10-2013-integrated-protection-c...Power system-protection-presentation-dated-03-10-2013-integrated-protection-c...
Power system-protection-presentation-dated-03-10-2013-integrated-protection-c...
 
under grund fault ppt (1).pptx
under grund fault ppt (1).pptxunder grund fault ppt (1).pptx
under grund fault ppt (1).pptx
 
Abb uno-6
Abb uno-6Abb uno-6
Abb uno-6
 
H010613642
H010613642H010613642
H010613642
 
Vlsi Design of Low Transition Low Power Test Pattern Generator Using Fault Co...
Vlsi Design of Low Transition Low Power Test Pattern Generator Using Fault Co...Vlsi Design of Low Transition Low Power Test Pattern Generator Using Fault Co...
Vlsi Design of Low Transition Low Power Test Pattern Generator Using Fault Co...
 
QuickSilver Controls QCI-DS032 QCI-N2-MX
QuickSilver Controls QCI-DS032 QCI-N2-MXQuickSilver Controls QCI-DS032 QCI-N2-MX
QuickSilver Controls QCI-DS032 QCI-N2-MX
 
Mini Power Station Product Brochure - Salevo Pty Ltd
Mini Power Station Product Brochure - Salevo Pty LtdMini Power Station Product Brochure - Salevo Pty Ltd
Mini Power Station Product Brochure - Salevo Pty Ltd
 
5 FINAL PROJECT REPORT
5 FINAL PROJECT REPORT5 FINAL PROJECT REPORT
5 FINAL PROJECT REPORT
 
Low Power Design Techniques for ASIC / SOC Design
Low Power Design Techniques for ASIC / SOC DesignLow Power Design Techniques for ASIC / SOC Design
Low Power Design Techniques for ASIC / SOC Design
 
W1 Lab Introduction to Process Control LabInstrumentation Mea.docx
W1 Lab Introduction to Process Control LabInstrumentation Mea.docxW1 Lab Introduction to Process Control LabInstrumentation Mea.docx
W1 Lab Introduction to Process Control LabInstrumentation Mea.docx
 
Steper Motor Control Through Wireless
Steper Motor Control Through WirelessSteper Motor Control Through Wireless
Steper Motor Control Through Wireless
 
three phase fault analysis with auto reset for temporary fault and trip for p...
three phase fault analysis with auto reset for temporary fault and trip for p...three phase fault analysis with auto reset for temporary fault and trip for p...
three phase fault analysis with auto reset for temporary fault and trip for p...
 

More from LEGATO project

Scrooge Attack: Undervolting ARM Processors for Profit
Scrooge Attack: Undervolting ARM Processors for ProfitScrooge Attack: Undervolting ARM Processors for Profit
Scrooge Attack: Undervolting ARM Processors for ProfitLEGATO project
 
A practical approach for updating an integrity-enforced operating system
A practical approach for updating an integrity-enforced operating systemA practical approach for updating an integrity-enforced operating system
A practical approach for updating an integrity-enforced operating systemLEGATO project
 
TEEMon: A continuous performance monitoring framework for TEEs
TEEMon: A continuous performance monitoring framework for TEEsTEEMon: A continuous performance monitoring framework for TEEs
TEEMon: A continuous performance monitoring framework for TEEsLEGATO project
 
secureTF: A Secure TensorFlow Framework
secureTF: A Secure TensorFlow FrameworksecureTF: A Secure TensorFlow Framework
secureTF: A Secure TensorFlow FrameworkLEGATO project
 
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...LEGATO project
 
LEGaTO: Machine Learning Use Case
LEGaTO: Machine Learning Use CaseLEGaTO: Machine Learning Use Case
LEGaTO: Machine Learning Use CaseLEGATO project
 
Smart Home AI at the edge
Smart Home AI at the edgeSmart Home AI at the edge
Smart Home AI at the edgeLEGATO project
 
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the projectLEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the projectLEGATO project
 
LEGaTO: Software Stack Programming Models
LEGaTO: Software Stack Programming ModelsLEGaTO: Software Stack Programming Models
LEGaTO: Software Stack Programming ModelsLEGATO project
 
LEGaTO: Software Stack Runtimes
LEGaTO: Software Stack RuntimesLEGaTO: Software Stack Runtimes
LEGaTO: Software Stack RuntimesLEGATO project
 
LEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous HardwareLEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous HardwareLEGATO project
 
LEGaTO: Low-Energy Heterogeneous Computing Workshop
LEGaTO: Low-Energy Heterogeneous Computing WorkshopLEGaTO: Low-Energy Heterogeneous Computing Workshop
LEGaTO: Low-Energy Heterogeneous Computing WorkshopLEGATO project
 
TZ4Fabric: Executing Smart Contracts with ARM TrustZone
TZ4Fabric: Executing Smart Contracts with ARM TrustZoneTZ4Fabric: Executing Smart Contracts with ARM TrustZone
TZ4Fabric: Executing Smart Contracts with ARM TrustZoneLEGATO project
 
Infection Research with Maxeler Dataflow Computing
Infection Research with Maxeler Dataflow ComputingInfection Research with Maxeler Dataflow Computing
Infection Research with Maxeler Dataflow ComputingLEGATO project
 
Smart Home - AI at the edge
Smart Home - AI at the edgeSmart Home - AI at the edge
Smart Home - AI at the edgeLEGATO project
 
Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...LEGATO project
 
Scheduling Task-parallel Applications in Dynamically Asymmetric Environments
Scheduling Task-parallel Applications in Dynamically Asymmetric EnvironmentsScheduling Task-parallel Applications in Dynamically Asymmetric Environments
Scheduling Task-parallel Applications in Dynamically Asymmetric EnvironmentsLEGATO project
 
RECS – Cloud to Edge Microserver Platform for Energy-Efficient Computing
RECS – Cloud to Edge Microserver Platform for Energy-Efficient ComputingRECS – Cloud to Edge Microserver Platform for Energy-Efficient Computing
RECS – Cloud to Edge Microserver Platform for Energy-Efficient ComputingLEGATO project
 

More from LEGATO project (20)

Scrooge Attack: Undervolting ARM Processors for Profit
Scrooge Attack: Undervolting ARM Processors for ProfitScrooge Attack: Undervolting ARM Processors for Profit
Scrooge Attack: Undervolting ARM Processors for Profit
 
A practical approach for updating an integrity-enforced operating system
A practical approach for updating an integrity-enforced operating systemA practical approach for updating an integrity-enforced operating system
A practical approach for updating an integrity-enforced operating system
 
TEEMon: A continuous performance monitoring framework for TEEs
TEEMon: A continuous performance monitoring framework for TEEsTEEMon: A continuous performance monitoring framework for TEEs
TEEMon: A continuous performance monitoring framework for TEEs
 
secureTF: A Secure TensorFlow Framework
secureTF: A Secure TensorFlow FrameworksecureTF: A Secure TensorFlow Framework
secureTF: A Secure TensorFlow Framework
 
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
PipeTune: Pipeline Parallelism of Hyper and System Parameters Tuning for Deep...
 
LEGaTO: Machine Learning Use Case
LEGaTO: Machine Learning Use CaseLEGaTO: Machine Learning Use Case
LEGaTO: Machine Learning Use Case
 
Smart Home AI at the edge
Smart Home AI at the edgeSmart Home AI at the edge
Smart Home AI at the edge
 
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the projectLEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
LEGaTO: Low-Energy Heterogeneous Computing Use of AI in the project
 
LEGaTO Integration
LEGaTO IntegrationLEGaTO Integration
LEGaTO Integration
 
LEGaTO: Use cases
LEGaTO: Use casesLEGaTO: Use cases
LEGaTO: Use cases
 
LEGaTO: Software Stack Programming Models
LEGaTO: Software Stack Programming ModelsLEGaTO: Software Stack Programming Models
LEGaTO: Software Stack Programming Models
 
LEGaTO: Software Stack Runtimes
LEGaTO: Software Stack RuntimesLEGaTO: Software Stack Runtimes
LEGaTO: Software Stack Runtimes
 
LEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous HardwareLEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous Hardware
 
LEGaTO: Low-Energy Heterogeneous Computing Workshop
LEGaTO: Low-Energy Heterogeneous Computing WorkshopLEGaTO: Low-Energy Heterogeneous Computing Workshop
LEGaTO: Low-Energy Heterogeneous Computing Workshop
 
TZ4Fabric: Executing Smart Contracts with ARM TrustZone
TZ4Fabric: Executing Smart Contracts with ARM TrustZoneTZ4Fabric: Executing Smart Contracts with ARM TrustZone
TZ4Fabric: Executing Smart Contracts with ARM TrustZone
 
Infection Research with Maxeler Dataflow Computing
Infection Research with Maxeler Dataflow ComputingInfection Research with Maxeler Dataflow Computing
Infection Research with Maxeler Dataflow Computing
 
Smart Home - AI at the edge
Smart Home - AI at the edgeSmart Home - AI at the edge
Smart Home - AI at the edge
 
Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...
 
Scheduling Task-parallel Applications in Dynamically Asymmetric Environments
Scheduling Task-parallel Applications in Dynamically Asymmetric EnvironmentsScheduling Task-parallel Applications in Dynamically Asymmetric Environments
Scheduling Task-parallel Applications in Dynamically Asymmetric Environments
 
RECS – Cloud to Edge Microserver Platform for Energy-Efficient Computing
RECS – Cloud to Edge Microserver Platform for Energy-Efficient ComputingRECS – Cloud to Edge Microserver Platform for Energy-Efficient Computing
RECS – Cloud to Edge Microserver Platform for Energy-Efficient Computing
 

Recently uploaded

An Introduction to Quantum Programming Languages
An Introduction to Quantum Programming LanguagesAn Introduction to Quantum Programming Languages
An Introduction to Quantum Programming LanguagesDavid Yonge-Mallo
 
Earth and Planetary Science | Volume 01 | Issue 01 | April 2022
Earth and Planetary Science | Volume 01 | Issue 01 | April 2022Earth and Planetary Science | Volume 01 | Issue 01 | April 2022
Earth and Planetary Science | Volume 01 | Issue 01 | April 2022Nan Yang Academy of Sciences
 
Agroecology as an approach to design sustainable Food Systems
Agroecology as an approach to design sustainable Food SystemsAgroecology as an approach to design sustainable Food Systems
Agroecology as an approach to design sustainable Food SystemsSIANI
 
Construction of Magic Squares by Swapping Rows and Columns.pdf
Construction of Magic Squares by Swapping Rows and Columns.pdfConstruction of Magic Squares by Swapping Rows and Columns.pdf
Construction of Magic Squares by Swapping Rows and Columns.pdfLossian Barbosa Bacelar Miranda
 
Analytical Coursework - Molly Winterbottom.pdf
Analytical Coursework - Molly Winterbottom.pdfAnalytical Coursework - Molly Winterbottom.pdf
Analytical Coursework - Molly Winterbottom.pdfMollyWinterbottom
 
CW 2 - Frustrated Lewis Pair - Molly winterbottom.pdf
CW 2 - Frustrated Lewis Pair - Molly winterbottom.pdfCW 2 - Frustrated Lewis Pair - Molly winterbottom.pdf
CW 2 - Frustrated Lewis Pair - Molly winterbottom.pdfMollyWinterbottom
 
IM-1 Press Kit - Kit de Imprensa do Lançmento da Missão IM-1
IM-1 Press Kit - Kit de Imprensa do Lançmento da Missão IM-1IM-1 Press Kit - Kit de Imprensa do Lançmento da Missão IM-1
IM-1 Press Kit - Kit de Imprensa do Lançmento da Missão IM-1Sérgio Sacani
 
Carpal tunnel Syndrom Wesam Aljabali -1.pdf
Carpal tunnel Syndrom Wesam Aljabali -1.pdfCarpal tunnel Syndrom Wesam Aljabali -1.pdf
Carpal tunnel Syndrom Wesam Aljabali -1.pdfMsm_mo
 
ELK ELISA Kits Manufacturer in Singapore
ELK ELISA Kits Manufacturer in SingaporeELK ELISA Kits Manufacturer in Singapore
ELK ELISA Kits Manufacturer in SingaporeGaia Science Pte Ltd
 
Introduction to the research of stem cells
Introduction to the research of stem cellsIntroduction to the research of stem cells
Introduction to the research of stem cellsAlaaOraby6
 
Grade 8, Quarter 3.pdf lesson plan third
Grade 8, Quarter 3.pdf lesson plan thirdGrade 8, Quarter 3.pdf lesson plan third
Grade 8, Quarter 3.pdf lesson plan thirdgmail227828
 
A review of volcanic electrification of the atmosphere and volcanic lightning
A review of volcanic electrification of the atmosphere and volcanic lightningA review of volcanic electrification of the atmosphere and volcanic lightning
A review of volcanic electrification of the atmosphere and volcanic lightningSérgio Sacani
 
Exploring Artificial Intelligence_ Revolutionizing Tomorrow's World.pptx
Exploring Artificial Intelligence_ Revolutionizing Tomorrow's World.pptxExploring Artificial Intelligence_ Revolutionizing Tomorrow's World.pptx
Exploring Artificial Intelligence_ Revolutionizing Tomorrow's World.pptxSamrat Tayade
 
From Leaf to Lab: Uncovering the Molecular Mysteries of Cannabis
From Leaf to Lab: Uncovering the Molecular Mysteries of CannabisFrom Leaf to Lab: Uncovering the Molecular Mysteries of Cannabis
From Leaf to Lab: Uncovering the Molecular Mysteries of CannabisMarkus Roggen
 
Fair and just food systems enabling local midstream businesses? What does it ...
Fair and just food systems enabling local midstream businesses? What does it ...Fair and just food systems enabling local midstream businesses? What does it ...
Fair and just food systems enabling local midstream businesses? What does it ...SIANI
 
Kavita Punekar: Illuminating Minds and Igniting Passion in Science Education
Kavita Punekar: Illuminating Minds and Igniting Passion in Science EducationKavita Punekar: Illuminating Minds and Igniting Passion in Science Education
Kavita Punekar: Illuminating Minds and Igniting Passion in Science Educationdsnow9802
 
Duchenne Muscular Dystrophy or DMD .pptx
Duchenne Muscular Dystrophy or DMD .pptxDuchenne Muscular Dystrophy or DMD .pptx
Duchenne Muscular Dystrophy or DMD .pptxNavanidhan.M
 
transgenics_17b.pptx
transgenics_17b.pptxtransgenics_17b.pptx
transgenics_17b.pptxridhi124788
 

Recently uploaded (20)

An Introduction to Quantum Programming Languages
An Introduction to Quantum Programming LanguagesAn Introduction to Quantum Programming Languages
An Introduction to Quantum Programming Languages
 
Earth and Planetary Science | Volume 01 | Issue 01 | April 2022
Earth and Planetary Science | Volume 01 | Issue 01 | April 2022Earth and Planetary Science | Volume 01 | Issue 01 | April 2022
Earth and Planetary Science | Volume 01 | Issue 01 | April 2022
 
Agroecology as an approach to design sustainable Food Systems
Agroecology as an approach to design sustainable Food SystemsAgroecology as an approach to design sustainable Food Systems
Agroecology as an approach to design sustainable Food Systems
 
Construction of Magic Squares by Swapping Rows and Columns.pdf
Construction of Magic Squares by Swapping Rows and Columns.pdfConstruction of Magic Squares by Swapping Rows and Columns.pdf
Construction of Magic Squares by Swapping Rows and Columns.pdf
 
Analytical Coursework - Molly Winterbottom.pdf
Analytical Coursework - Molly Winterbottom.pdfAnalytical Coursework - Molly Winterbottom.pdf
Analytical Coursework - Molly Winterbottom.pdf
 
CW 2 - Frustrated Lewis Pair - Molly winterbottom.pdf
CW 2 - Frustrated Lewis Pair - Molly winterbottom.pdfCW 2 - Frustrated Lewis Pair - Molly winterbottom.pdf
CW 2 - Frustrated Lewis Pair - Molly winterbottom.pdf
 
IM-1 Press Kit - Kit de Imprensa do Lançmento da Missão IM-1
IM-1 Press Kit - Kit de Imprensa do Lançmento da Missão IM-1IM-1 Press Kit - Kit de Imprensa do Lançmento da Missão IM-1
IM-1 Press Kit - Kit de Imprensa do Lançmento da Missão IM-1
 
Carpal tunnel Syndrom Wesam Aljabali -1.pdf
Carpal tunnel Syndrom Wesam Aljabali -1.pdfCarpal tunnel Syndrom Wesam Aljabali -1.pdf
Carpal tunnel Syndrom Wesam Aljabali -1.pdf
 
ELK ELISA Kits Manufacturer in Singapore
ELK ELISA Kits Manufacturer in SingaporeELK ELISA Kits Manufacturer in Singapore
ELK ELISA Kits Manufacturer in Singapore
 
INTRODUCTION TO PLANT TAXONOMY WITH DIVERSE TAXONOMIC APPROACHES
INTRODUCTION TO PLANT TAXONOMY WITH DIVERSE TAXONOMIC APPROACHESINTRODUCTION TO PLANT TAXONOMY WITH DIVERSE TAXONOMIC APPROACHES
INTRODUCTION TO PLANT TAXONOMY WITH DIVERSE TAXONOMIC APPROACHES
 
Introduction to the research of stem cells
Introduction to the research of stem cellsIntroduction to the research of stem cells
Introduction to the research of stem cells
 
Grade 8, Quarter 3.pdf lesson plan third
Grade 8, Quarter 3.pdf lesson plan thirdGrade 8, Quarter 3.pdf lesson plan third
Grade 8, Quarter 3.pdf lesson plan third
 
Research methods in ethnobotany- Exploring Traditional Wisdom
Research methods in ethnobotany- Exploring Traditional WisdomResearch methods in ethnobotany- Exploring Traditional Wisdom
Research methods in ethnobotany- Exploring Traditional Wisdom
 
A review of volcanic electrification of the atmosphere and volcanic lightning
A review of volcanic electrification of the atmosphere and volcanic lightningA review of volcanic electrification of the atmosphere and volcanic lightning
A review of volcanic electrification of the atmosphere and volcanic lightning
 
Exploring Artificial Intelligence_ Revolutionizing Tomorrow's World.pptx
Exploring Artificial Intelligence_ Revolutionizing Tomorrow's World.pptxExploring Artificial Intelligence_ Revolutionizing Tomorrow's World.pptx
Exploring Artificial Intelligence_ Revolutionizing Tomorrow's World.pptx
 
From Leaf to Lab: Uncovering the Molecular Mysteries of Cannabis
From Leaf to Lab: Uncovering the Molecular Mysteries of CannabisFrom Leaf to Lab: Uncovering the Molecular Mysteries of Cannabis
From Leaf to Lab: Uncovering the Molecular Mysteries of Cannabis
 
Fair and just food systems enabling local midstream businesses? What does it ...
Fair and just food systems enabling local midstream businesses? What does it ...Fair and just food systems enabling local midstream businesses? What does it ...
Fair and just food systems enabling local midstream businesses? What does it ...
 
Kavita Punekar: Illuminating Minds and Igniting Passion in Science Education
Kavita Punekar: Illuminating Minds and Igniting Passion in Science EducationKavita Punekar: Illuminating Minds and Igniting Passion in Science Education
Kavita Punekar: Illuminating Minds and Igniting Passion in Science Education
 
Duchenne Muscular Dystrophy or DMD .pptx
Duchenne Muscular Dystrophy or DMD .pptxDuchenne Muscular Dystrophy or DMD .pptx
Duchenne Muscular Dystrophy or DMD .pptx
 
transgenics_17b.pptx
transgenics_17b.pptxtransgenics_17b.pptx
transgenics_17b.pptx
 

FPGA Undervolting and Checkpointing for Energy-Efficiency and Error-Resiliency

  • 1. FPGA Undervolting for Energy-Efficiency 30th International Conference on Field-Programmable Logic and Applications (FPL). 3th September, 2020. Behzad Salami Barcelona Supercomputing Center (BSC)
  • 2. 2 Outline • Motivation and Background • Methodology and Results - Undervolting FPGA On-Chip Memories - Undervolting FPGA Internal Components • More Information
  • 3. 3 Aggressive Undervolting • Aggressive undervolting- Underscaling the supply voltage below the nominal and safe level:  Power/Energy Efficiency: Reduces dynamic and static power quadratically and linearly, respectively.  Reliability: Increases the circuit delay and in turn, causes timing faults. • Dual/Multi-Vdd, DVS, and DVFS: Similar but different mechanisms to aggressive undervolting:  Similarity: Underscaling the supply voltage.  Difference: Undervolting is until a certain safe level, usually constrained by vendors. Reliability Power/Energy Efficiency
  • 4. 4 State-of-the-art • Aggressive undervolting has shown significant efficiency to reduce the energy consumption.  Devices:  CPUs: Itanium II (ISCA2014), X86 (IOLTS2017), ARM (HPCA2017)  GPUs: NVidia (Micro2015)  DRAMs: Multiple Brands (Sigmetrics2017)  FPGA: This work  Focus of the previous works:  Voltage guardband  Minimum safe voltage, i.e., Vmin prediction  Fault characterization and mitigation  Chip-to-chip, core-to-core, and workload-to-workload variation  …. • More straightforward and more parameters but less precise  ASIC DNN: Minerva (Micro2016), Thundervolt (DAC2018)  CPU: Bravo (HPCA2017 )  Network On-Chip (HPCA2014) Real hardware: Simulation-based studies:
  • 5. 5 Undervolting on FPGAs: Motivation Contribution of FPGAs in large data centers is growing, expected to be in 30% of datacenter servers by 2020 (Top500 news). • In comparison to ASICs, energy efficiency of FPGAs is a serious concern, i.e., 10X-100X less-efficient. • Nominal voltage reduction of FPGAs is naturally applied for different generations. Undervolting [Intel/Altera] [Xilinx]
  • 6. 6 Outline • Motivation and Background • Methodology and Results - Undervolting FPGA On-Chip Memories - Undervolting FPGA Internal Components • More Information
  • 7. 7 Undervolting FPGA On-Chip Memories 1. Undervolting FPGAs  Voltage guardband  Overall power and reliability trade-off 2. Fault characterization in FPGA on-chip memories  Fault type, location, and rate  Temperature, Chip 3. Low-voltage FPGA-based Neural Network (NN)  Power consumption and NN accuracy characterization  Fault mitigation techniques  Application-aware technique  Built-in ECC
  • 8. 8 Voltage Scaling Capability in Xilinx VC707: performance-efficient design KC705: power-efficient design (A & B) Evaluated Xilinx platforms VC707 Voltage distribution on Xilinx platforms Voltage regulator  Power Management Bus (PMBus).  Hardwired to the host. ZC702: ARM integrated with FPGA VCCINT VCCBRAM
  • 9. 9 Overall Voltage Behavior • FPGA stops operating below Vcrash, min operating voltageCRASH • No observable fault • Voltage Guardband below Vnom SAFE • Faults manifest • Below Vmin, min safe voltage CRITICAL • Voltage guardband: to ensure the worst-case environmental and process technologies. • Experimental conditions: At ambient temperature and maximum operating frequency. Vnom Vmin Vcrash 0 0.2 0.4 0.6 0.8 1 VC707 ZC702 KC705-A KC705-B VCCBRAM(V) Platform GUARDBAND CRITICAL CRASH
  • 10. 10 Floorplan of VC707 Experimental Methodology • FPGA BRAMs:  Hierarchy of set of bit-cells  distributed over the chip.  Size of each BRAM: 16-kbits • Experimental Methodology:  HW: Transfer content of BRAMs to the host.  SW: Analyze data, and adjust voltage of BRAMs. (2060 BRAMs)
  • 11. 11 0 200 400 600 800 0 1 2 3 1 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 0.55 FaultRate(per1Mbit) BRAMPower(Watts) VCCBRAM (V) BRAM Power Fault Rate Vmin=0.61V Vcrash=0.54V 0 400 800 0 0.1 0.2 0.3 0.4 0.61 0.6 0.59 0.58 0.57 0.56 0.55 0.54 per1Mbit Watts Vnom=1V Overall Trade-offs on BRAMs- Power & Reliability VC707
  • 12. 12 0 150 300 0 0.05 0.1 0.15 0.61 0.6 0.59 0.58 0.57 0.56 0.55 0.54 per1Mbit Watts Overall Trade-offs on BRAMs- Multiple Platforms 0 200 400 600 800 0 1 2 3 1 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 0.55 FaultRate (per1Mbit) BRAMPower (Watts) VCCBRAM (V) Vnom=1V Vmin=0.61V 0 400 800 0 0.2 0.4 0.61 0.6 0.59 0.58 0.57 0.56 0.55 0.54 per1Mbit Watts VC707 0 50 100 150 200 0 10 20 30 1 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 0.55 FaultRate (per1Mbit) BRAMPower (mWatts) VCCBRAM (V) Vnom=1V Vcrash=0.53V 0 100 200 0 2 4 0.59 0.58 0.57 0.56 0.55 0.54 0.53 per1Mbit mWatts ZC702 0 100 200 300 0 1 2 3 1 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 0.55 FaultRate (per1Mbit) BRAMPower (Watts) VCCBRAM (V) Vnom=1V Vcrash=0.54V Vmin=0.61V KC705-A 0 20 40 60 80 0 1 2 3 1 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 0.55 FaultRate (per1Mbit) BRAMPower (Watts) VCCBRAM (V) Vnom=1V Vmin=0.57V Vcrash=0.54V 0 40 80 0 0.05 0.1 0.15 0.57 0.56 0.55 0.54 per1Mbit Watts KC705-B Vmin=0.59V Vcrash=0.54V
  • 13. 13 Contributions 1. Undervolting FPGAs  Voltage guardband  Overall power and reliability trade-off 2. Fault characterization in FPGA on-chip memories  Fault type, location, and rate  Temperature, Chip 3. Low-voltage FPGA-based Neural Network (NN)  Power consumption and NN accuracy characterization  Fault mitigation techniques  Application-aware technique  Built-in ECC
  • 14. 14 Fault Characterization at CRITICAL Region • Fully non-uniform fault distribution. • Majority of BRAMs do not experience many faults. Fault variability among FPGA BRAMs: Fully non-uniform fault distribution VC707 (2060 BRAMs) VCCBRAM@ Vcrash= 0.54V Temperature@ Ambient 0.0% 0.3% 0.6% 0.9% 1.2% 1.5% BRAMFaultRate(%) %BRAMs Average Fault Rate (%) 1.8% 0.86% High-vulnerable 9.4% 0.24% Mid-vulnerable 52.3% 0.03% Low-vulnerable 36.3% 0.0% Zero-vulnerable K-means clustering
  • 15. 15 Fault Characterization at CRITICAL Region Type of undervolting faults: Permanent faults at specific voltage • There is no considerable change on the rate and location of faults over time. • Validated by repeating experiments for 100 times. • The physical location of BRAMs is extracted using Vivado. • Fault Variation Map (FVM): Fault rate mapped to the physical location of BRAMs. FVM can be potentially used in fault mitigation techniques! FPGA x-axis FPGAy-axis BRAMFaultRate(%) FVM @ (VCCBRAM @Vcrash, T= ambient, chip= VC707) 1 10 20 30 40 50 60 70 80 90 100 0 200 400 600 800 1000 1 11 21 31 41 51 61 71 81 91 FaultRate(per1Mbit) #Run Individual Run Cumulative Median Three parameters orthogonally have significant impact on the rate and location of faults: 1. Voltage 2. Temperature 3. Chip
  • 16. 16 Fault Characterization (Voltage Impacts) Location of undervolting faults: Fault Inclusion Property (FIP) • FIP: A corrupted bit at a specific voltage stays faulty in lower voltages as well. • FIP can be used in mitigation techniques. 0.1 1 10 100 1000 10000 0.61 0.6 0.59 0.58 0.57 0.56 0.55 0.54 FaultRate(per1Mbit) logscale VCCBRAM (V) Illustration of FIP FIP shown as fault rate for VC707
  • 17. 17 Fault Characterization (Temperature Impacts) • Methodology: Adjusting environmental temperature, monitoring on- board temperature via PMBus. • Experimental Observation:  At higher temperatures, fault rate is significantly reduced. • Inverse Temperature Dependency (𝑰𝑻𝑫) 𝟏:  For nano-scale technology nodes, under ultra low-voltage operations, the circuit delay reduces at higher temperatures since supply voltage approaches the threshold voltage. * x-axis: VCCBRAM (V). * y-axis: fault rate (per 1Mbit). 𝑇 = 50 0 𝐶 𝑇 = 60 0 𝐶 𝑇 = 70 0 𝐶 𝑇 = 80 0 𝐶 Practical confirmation of Inverse Temperature Dependency (ITD) (1) Neshatpour, K., Burleson, W., Khajeh, A., & Homayoun, H. (2018). Enhancing Power, Performance, and Energy Efficiency in Chip Multiprocessors Exploiting Inverse Thermal Dependence. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, (4), 778-791.
  • 18. 18 Fault Characterization (Chip Impacts) KC705-BKC705-A • Methodology: Repeating experiments on identical samples of KC705 (A&B). • • Observations:  Fault rates significantly vary, more than 4X.  Fault Variation Maps (FVMs) are entirely different. Fault location Fault location @VCCBRAM= Vcrash @VCCBRAM= Vcrash Even identical samples of same chips have totally different reliability behavior, due to the process variation/aging effects. Fault rate 0 100 200 300 0.57 0.56 0.55 0.54 Per1Mbit VCCBRAM (V) Fault rate 0 100 200 300 0.61 0.6 0.59 0.58 0.57 0.56 0.55 0.54 Per1Mbit VCCBRAM (V)
  • 19. 19 Contributions 1. Undervolting FPGAs  Voltage guardband  Overall power and reliability trade-off 2. Fault characterization in FPGA on-chip memories  Fault type, location, and rate  Temperature, Chip 3. Low-voltage FPGA-based Neural Network (NN)  Power consumption and NN accuracy characterization  Fault mitigation techniques  Application-aware technique  Built-in ECC
  • 20. 20 Experimental Methodology Neural Network (NN) Type Fully-connected classifier Total number of weights ~1.5 millions Activation function Logsig (logarithmic sigmoid) Major benchmark Name-type MNIST- handwritten digit images Number of images Training: 60000, Classification: 10000 Number of pixels per image 28*28=256 Number of output classes 10 Additional benchmarks Names Forest and Reuters Data representation model Type 16-bits fixed-point Precision Minimum sign and digit per layer An example implementation on VC707 Frequency 100 Mhz BRAM usage (total: 2060) 70.8%
  • 21. 21 NN Implementation on FPGA • Input data: off-chip DDR memory. • Weights: on-chip FPGA BRAM. • Computation: Streaming data onto DSPs and LUTs. • We undervolt VCCBRAM:  Weights of the NN are potentially affected. FPGA Implementation
  • 22. 22 Low-Voltage FPGA-based NN • Significant power reduction until the minimum safe voltage, i.e., Vmin (By eliminating the voltage guardband). • Additional 40% power reduction below the voltage guardband. • The NN classification error exponentially increases from 2.56% (inherent classification error) to 6.74% through undervolting BRAMs beyond Vmin. • Fault mitigation techniques to prevent the accuracy loss:  Application-aware mechanism  Built-in ECC Power saving NN accuracy loss 2.39 0.25 0.15 6.47 6.47 6.47 0 2 4 6 8 10 Vnom= 1 V Vmin= 0.61V Vcrash= 0.54V On-chipPower(Watts) BRAM Rest
  • 23. 23 Intelligently-Constrained BRAM placement (ICBP) • Below voltage guardband level at CRITICAL voltage region, we present ICBP to prevent NN classification error rate loss. • Core Idea: Map most-sensitive weights to faults into robust BRAMs.  Q: Which are the most-sensitive NN weights? A: Deeper Layers. ICBP-Additional ConstraintsintheFPGA placementstage 1 1.4 2.1 3 5.7 LAYER0 LAYER1 LAYER2 LAYER3 LAYER4 Normalized Vulnerability NN Layers
  • 24. 24 ICBP Evaluation • Pros:  Significant accuracy loss prevention.  No power and performance overhead. • Cons:  Needs the FVM as a pre-process step  Built-in ECC is evaluated without having this cost. 0 0.1 0.2 0.3 0.4 0% 2% 4% 6% 8% 10% 0.61 0.6 0.59 0.58 0.57 0.56 0.55 0.54 BRAMsPower(Watt) NNClassificationError(%) VCCBRAM (V) NN Error by Default Placement NN Error by ICBP BRAM Power Inherent NN Error: 2.56%
  • 25. 25 Built-in ECC • Built-in ECC of FPGA BRAMs:  Hamming-code.  Two (2) additional bits per row are reserved as parities.  SECDED (Single-Error Correction and Double-Error Detection). • Experimental Methodology:  Activate built-in ECC under low-voltage read operations. • Experimental Observations:  >90% fault correction  >7% fault detection (not correction) 0 200 400 600 800 0.61 0.6 0.59 0.58 0.57 0.56 0.55 0.54 Faultrate(per1Mbit) VCCBRAM (V) Without ECC With ECC Parity Bits single-bit double-bit multiple-bit
  • 26. 26 ECC for NN Accelerator 0% 2% 4% ClassificationError(%) VCCBRAM (V) Without ECC With ECC Inherent NN Error: 2.56% Area Utilization (%) BRAM LUT FF Without ECC 96% 3% 0.25% With ECC 100% 12% 0.25% BRAM Power (W) Vnom= 1V Vmin= 0.61V Vcrash= 0.54V Without ECC 2.4 0.31 0.198 With ECC ---- ---- 0.211ECC efficiency to prevent NN accuracy loss ECC area and power costs • Pros:  Significant accuracy loss prevention.  Negligible power and performance overhead. • Cons:  Requires larger data rows/lines.  Not all FPGAs are equipped with this technique.
  • 27. 27 Outline • Motivation and Background • Methodology and Results - Undervolting FPGA On-Chip Memories - Undervolting FPGA Internal Components • More Information
  • 28. 28 Executive Summary • Motivation: Power consumption of neural networks is a main concern  Hardware acceleration: GPUs, FPGAs, and ASICs • Problem: FPGAs are at least 10X less power-efficient than equivalent ASICs • Goal: Bridge the power-efficiency gap between ASIC- and FPGA-based neural networks by Undervolting below nominal level • Evaluation Setup  5 Image classification workloads  3 Xilinx UltraScale+ ZCU102 platforms  On-chip voltage rail for internal FPGA components • Main Results  Large voltage guardband (i.e., 33%)  >3X power-efficiency gain
  • 29. 29 Outline • Motivation and Background • Our Goal • Methodology • Results - Overall Voltage Behavior - Power-Reliability Trade-off - Frequency Underscaling - Environmental Temperature • Prior Works • Summary, Conclusion, and Future Works
  • 30. 30 Motivation and Background • Motivation  Power consumption of neural networks is a main concern  Hardware acceleration: GPUs, FPGAs, and ASICs  FPGAs: Getting popular but less power-efficient than equivalent ASICs  Large voltage guardbands (12-35%) for CPUs, GPUs, DRAMs  Any potential of “Undervolting FPGAs” for power-efficiency of neural networks? • Background  Neural Networks: Widely deployed with an inherent resilience to errors  FPGAs: Higher throughput than GPUs and better flexibility than ASICs  Undervolting: Reduces power cons., may incur reliability or performance issues
  • 31. 31 Outline • Motivation and Background • Our Goal • Methodology • Results - Overall Voltage Behavior - Power-Reliability Trade-off - Frequency Underscaling - Environmental Temperature • Prior Works • Summary, Conclusion, and Future Works
  • 32. 32 Our Goal • Primary Goal  Bridge the power-efficiency gap between ASIC- and FPGA-based neural networks by:  Undervolting (i.e., underscaling voltage below nominal level) • Secondary Goals  Study the voltage behavior of real FPGAs (e.g., guardband)  Study the power-efficiency gain of undervolting for neural networks  Study the reliability overhead  Study the frequency underscaling to prevent the accuracy loss  Study the effect of environmental temperature
  • 33. 33 Outline • Motivation and Background • Our Goal • Methodology • Results - Overall Voltage Behavior - Power-Reliability Trade-off - Frequency Underscaling - Environmental Temperature • Prior Works • Summary, Conclusion, and Future Works
  • 34. 34 Overall Methodology • 5 CNN image classification workloads, i.e., VGGNet, GoogleNet, AlexNet, ResNet50, Inception. • Xilinx DNNDK to map CNN into FPGA  By default optimized for INT8 • 3 identical samples of Xilinx ZCU102  ZYNQ Ultrscale+ architecture  Hard-core ARM for data orchestration  FPGA for CNN acceleration • 1 on-chip voltage rails, via PMBus  𝑉𝐶𝐶𝐼𝑁𝑇: DSPs, LUTs, buffers, …  𝑉𝑛𝑜𝑚= 850mV (set by manufacturer) Vast majority (>99.9%) of the power is dissipated on 𝑉𝐶𝐶𝐼𝑁𝑇
  • 35. 35 Outline • Motivation and Background • Our Goal • Methodology • Results - Overall Voltage Behavior - Power-Reliability Trade-off - Frequency Underscaling - Environmental Temperature • Prior Works • Summary, Conclusion, and Future Works
  • 36. 36 Overall Voltage Behavior Slight variation of voltage behavior across platforms and benchmarks  FPGA stops operatingCrash • Guardband: Large region below nominal level (𝑽 𝒏𝒐𝒎 = 𝟖𝟓𝟎𝒎𝑽) • Critical: Narrower region below guardband (𝑽 𝒎𝒊𝒏 = 𝟓𝟕𝟎𝒎𝑽) • Crash: FPGA crashes below critical region (𝑽 𝒄𝒓𝒂𝒔𝒉 = 𝟓𝟒𝟎𝒎𝑽)  No performance or reliability loss  Added by the vendor to ensure the worst-case conditions  Large guardband, average of 33% Guard band  A narrow voltage region  Neural network accuracy collapse Critical
  • 37. 37 Outline • Motivation and Background • Our Goal • Methodology • Results - Overall Voltage Behavior - Power-Reliability Trade-off - Frequency Underscaling - Environmental Temperature • Prior Works • Summary, Conclusion, and Future Works
  • 38. 38 Power-Reliability Trade-off Power-efficiency (GOPs/W) gain • >3X power saving (2.6X by eliminating guardband and further 43% in critical region) Reliability overhead (i.e., CNN accuracy loss) VGGNet GoogleNet AlexNet ResNet Inception • Slight variation across 3 platforms and 5 workloads • No accuracy loss in the guardband, accuracy collapse in the critical region • Slight variation across 3 platforms and 5 workloads
  • 39. 39 Outline • Motivation and Background • Our Goal • Methodology • Results - Overall Voltage Behavior - Power-Reliability Trade-off - Frequency Underscaling - Environmental Temperature • Prior Works • Summary, Conclusion, and Future Works
  • 40. 40 VCCINT (mV) Fmax (Mhz) GOPs (Norm) Power (W) Norm) GOPs/W (Norm) GOPs/J (Norm) 570 333 1 1 1 1 565 300 0.94 0.97 0.97 0.87 560 250 0.83 0.84 0.99 0.75 555 250 0.83 0.78 1.06 0.8 550 250 0.83 0.75 1.1 0.83 545 250 0.83 0.74 1.12 0.84 540 200 0.7 0.56 1.25 0.75 Frequency Underscaling • Simultaneous frequency underscaling to prevent CNN accuracy collapse in the critical voltage region • For each voltage level below 𝑽 𝒎𝒊𝒏, we found the 𝑭 𝒎𝒂𝒙, the maximum operating frequency at which there is no accuracy loss • Leads to performance and energy-efficiency loss Best setting for High-performance and Energy-efficiency Best setting for Power-efficiency (Voltage steps= 5mV, Frequency steps= 50Mhz)- shown for GoogleNet
  • 41. 41 Outline • Motivation and Background • Our Goal • Methodology • Results - Overall Voltage Behavior - Power-Reliability Trade-off - Frequency Underscaling - Environmental Temperature • Prior Works • Summary, Conclusion, and Future Works
  • 42. 42 Environmental Temperature • Effects of environmental temperature on power-reliability  Use fan speed to test temperature in [34 ℃, 50 ℃]  On-board temperature monitored by PMBus • Temperature effects on power consumption  ↓ 𝑇𝑒𝑚𝑝 → ↓ 𝑃𝑜𝑤𝑒𝑟 (direct relation of power and temp)  By undervolting, the impact of temperature on power consumption reduces. • Temperature effects on reliability  ↓ 𝑇𝑒𝑚𝑝 → ↑ 𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 𝑙𝑜𝑠𝑠 (indirect relation of reliability and temp)  In our temperature range, 𝑉 𝑚𝑖𝑛 and 𝑉𝑐𝑟𝑎𝑠ℎdo not change significantly. GoogleNet
  • 43. 43 Outline • Motivation and Background • Our Goal • Methodology • Results - Overall Voltage Behavior - Power-Reliability Trade-off - Frequency Underscaling - Environmental Temperature • Prior Works • Summary, Conclusion, and Future Works
  • 44. 44 Prior Works • Undervolting  Studies for off-the-shelf real CPUs, GPUs, ASICs, DRAMs  Large voltage guardband (from 12% to 35%) for many devices  This work extends such studies for off-the-shelf FPGAs especially for neural network acceleration and confirms large guardbands (i.e., 33%) • Power-Efficient Neural Networks  Studies on architectural-, hardware-, and software-level techniques  Undervolting in neural network ASIC accelerator (e.g., GreenTPU-DAC’19)  This work proposes a hardware-level undervolting for further power-saving (>3X) in FPGAs. • Reliability in Neural Networks  Analytical and simulation-based studies (e.g., Thundervolt-DAC’18)  Some studies on real hardware (e.g., EDEN-MICRO’19)  This work studies the reliability of neural networks on real FPGAs when operating at reduced voltage levels.
  • 45. 45 Outline • Motivation and Background • Our Goal • Methodology • Results - Overall Voltage Behavior - Power-Reliability Trade-off - Frequency Underscaling - Environmental Temperature • Prior Works • Summary, Conclusion, and Future Works
  • 46. 46 Summary, Conclusion, and Future Works • Summary  We improve the power-efficiency (>3X) of off-the-shelf FPGAs via undervolting for neural network accelerators:  2.6X by eliminating the guardband (i.e., 33%) without any cost  43% by further undervolting below the guardband with the cost of  either accuracy loss, when the frequency is not underscaled  or performance loss, when the frequency is underscaled • Conclusion  Undervolting is an effective way to achieve significant power-saving for FPGA-based neural network accelerators • Future Works  HW & SW extension of our undervolting for FPGA clusters and other neural network models and tools
  • 47. 47 Outline • Motivation and Background • Methodology and Results - Undervolting FPGA On-Chip Memories - Undervolting FPGA Internal Components • More Information
  • 48. 48 References • B. Salami, et al., "An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration," in 50th IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2020. • B. Salami, et al., "Comprehensive Evaluation of Supply Voltage Underscaling in FPGA on- chip Memories.", in 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO ), 2018. • B. Salami, et al., “Evaluating Built-in ECC of FPGA on-chip Memories for the Mitigation of Undervolting Faults," in 27th Euromicro International Conference on Parallel, Distributed, and Network-based Processing (PDP), 2019. • B. Salami, et al., "Fault Characterization Through FPGAs Undervolting.", in 28th International Conference on Field Programmable Logic & Applications (FPL), 2018. • B. Salami, et al., “On the Resilience of RTL NN Accelerators: Fault Characterization and Mitigation.", in 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2018.
  • 49. 49 Ongoing and Future Extensions • Circuit-level simulation for validating the results • Expansion for more number of FPGAs (cluster), more workloads (DNN and non-DNN) • Heterogeneous systems including hw-sw systems, more voltage rails • Design voltage-optimized FPGA components • Integration with error handling systems like check- pointing
  • 50. 50 Acknowledgment • Adrian Cristal • Osman Unsal • Fahrettin Koc • Baturay Onural • Ismail Emir Yuksel
  • 51. FPGA Undervolting for Energy-Efficiency 30th International Conference on Field-Programmable Logic and Applications (FPL). 3th September, 2020. Behzad Salami Barcelona Supercomputing Center (BSC) behzad.salami@bsc.es