IMPROVING GPU FREQUENCY SCALING FOR GPU WORKLOADS
TYPICAL DVFS BASED GPU BOOST MECHANISM
• GPU frequency boosting is wired through the devfreq governor
• The governor monitors GPU busyness and tries to keep the current load
under a given target load by adjusting the GPU frequency, using tunables
such as settling time, bias, damp and rampdown_delay
• Roughly: boost_freq = bias * freq * (load - target) / target
• Well suited to sustained loads and to burstiness within a high-load window
• Overly aggressive tuning makes the governor more reactive
• However, it also leaves the GPU constantly overpowered
• For example, a target_load that is too low or a rampdown_delay that is too high
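The boost formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the governor's actual code: the function names, the clamping step, and the `fmin_mhz`/`fmax_mhz` defaults are assumptions added for the example.

```python
def boost_freq(freq_mhz: float, load: float, target: float, bias: float = 1.0) -> float:
    """Frequency delta from the slide's formula: bias * freq * (load - target) / target."""
    if target <= 0:
        raise ValueError("target must be positive")
    return bias * freq_mhz * (load - target) / target


def next_freq(freq_mhz: float, load: float, target: float, bias: float = 1.0,
              fmin_mhz: float = 100.0, fmax_mhz: float = 1000.0) -> float:
    """Hypothetical update step: apply the delta, then clamp to an assumed range."""
    candidate = freq_mhz + boost_freq(freq_mhz, load, target, bias)
    return max(fmin_mhz, min(fmax_mhz, candidate))
```

With load above target the delta is positive and the frequency rises. With load below target the delta is negative, which is how a low target_load keeps the frequency high.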
PROBLEM
• Low-latency VR use cases typically present repetitive and bursty GPU
workloads
• They need guaranteed GPU horsepower exactly when the workload
gets scheduled
• The load dies down quickly (but is highly likely to repeat), so the
frequency needs to fall quickly and ramp back up just as quickly
• Typical use cases exhibiting this kind of burstiness are camera
post-processing, edge detection, atw...
• The current governor's slower response in ramping up frequency
clearly shows up as low overall perf/watt
JUST IN (SUBMIT) TIME FREQ SCALING
• Density of work submission (per unit time) forms the basis of GPU load
• There is a delay (on the order of ms) between a submit and the governor’s
visibility of the resulting load
• This translates to latency in the effective GPU frequency boost
• A short boost pulse in the submit code path takes care of the ramp-up latency
• This inherently makes the frequency follow the workload
• It increases the chances of the governor now seeing a lower load and
pulling the frequency down
• Effective GPU frequency comes down to fmax@vmin for the profiled use cases
(giving better perf/watt)
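A minimal sketch of the submit-time boost pulse follows, assuming a simple model in which each submit holds the frequency at a boost level for a short window and the governor's choice applies otherwise. The class name, its parameters, and the millisecond timestamps are hypothetical and are not taken from any driver.

```python
class SubmitBoostPulse:
    """Hypothetical model of a short frequency boost triggered from the submit path."""

    def __init__(self, boost_freq_mhz: float, pulse_ms: float) -> None:
        self.boost_freq_mhz = boost_freq_mhz
        self.pulse_ms = pulse_ms
        self._boost_until_ms = float("-inf")  # no boost active initially

    def on_submit(self, now_ms: float) -> None:
        # Each submit (re)arms the pulse; overlapping submits extend the window.
        self._boost_until_ms = max(self._boost_until_ms, now_ms + self.pulse_ms)

    def effective_freq(self, now_ms: float, governor_freq_mhz: float) -> float:
        # Inside the pulse, never run below the boost frequency;
        # outside it, the governor's decision stands.
        if now_ms < self._boost_until_ms:
            return max(governor_freq_mhz, self.boost_freq_mhz)
        return governor_freq_mhz
```

In this model the boost does not wait for the governor to observe the load, and once the pulse expires the frequency falls back to whatever the governor selected.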
PERF/POWER DATA ACROSS USE CASES
Use case                                   GPU intensive            Section    Avg GPU   Avg GPU     Avg GPU     Avg total power  Perf/watt
                                           section                  time (ms)  busyness  freq (MHz)  power (mW)  (VDD_IN, mW)     increase (%)
Pupil Detection (with JIT scaling)         Edge Detection           11.004     34        497         471         5488             99.623182
Pupil Detection (with default scaling)     Edge Detection           21.158     182       293         421         5286
Passthrough camera (with JIT scaling)      Camera to Display (e2e)  40.599     219       596         856         7591             4.5763017
Passthrough camera (with default scaling)  Camera to Display (e2e)  45.466     590       283         837         8129
Passthrough camera (with max gpu)          Camera to Display (e2e)  40.025     153       1331        1377        8677
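One common way to compare two configurations on perf/watt is sketched below. It assumes that performance is the inverse of the GPU-intensive section time and that power is the average total (VDD_IN) power. Both are assumptions: the deck does not state how its perf/watt column was derived, so results computed this way need not match the reported percentages. The function name is hypothetical.

```python
def perf_per_watt_gain_pct(base_ms: float, base_mw: float,
                           new_ms: float, new_mw: float) -> float:
    """Percent change in perf/watt, taking perf = 1 / latency (an assumption)."""
    base = 1.0 / (base_ms * base_mw)   # baseline perf per mW
    new = 1.0 / (new_ms * new_mw)      # candidate perf per mW
    return (new / base - 1.0) * 100.0


# Inputs taken from the table: default scaling as baseline, JIT scaling as candidate.
pupil_gain = perf_per_watt_gain_pct(21.158, 5286, 11.004, 5488)
passthrough_gain = perf_per_watt_gain_pct(45.466, 8129, 40.599, 7591)
```

Under this definition a configuration gains when it shortens the section time, lowers the total power, or both.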
PUPIL DETECTION WITH CURRENT FREQ SCALING
Metric                               Avg      Max      Min
GPU intensive code latency (in ms)   21.158   843.41   7.289
GPU Busyness                         182      401      57
GPU frequency (in MHz)               293      595      109
GPU Power (in mW)                    421      534      152
PUPIL DETECTION WITH JIT FREQ SCALING
Metric                               Avg      Max      Min
GPU intensive code latency (in ms)   11.004   957.52   5890
GPU Busyness                         34       504      10
GPU frequency (in MHz)               497      790      109
GPU Power (in mW)                    471      610      152
PASSTHROUGH WITH DEFAULT FREQ SCALING
Metric                               Avg      Max      Min
GPU intensive code latency (in ms)   45.466   82.425   33.461
GPU Busyness                         590      946      173
GPU frequency (in MHz)               283      693      109
GPU Power (in mW)                    837      838      761
PASSTHROUGH WITH JIT FREQ SCALING
Metric                               Avg      Max      Min
GPU intensive code latency (in ms)   40.599   63.626   31.690
GPU Busyness                         219      390      67
GPU frequency (in MHz)               596      790      303
GPU Power (in mW)                    856      914      762
