Post-compiler Software Optimization for Reducing Energy
Eric Schulte, Jonathan Dorn, et al.
Presented By: Abhishek Abhyankar
MS Computer Science Virginia Tech
08-May-15 Computer Architecture CS 5504 Spring 2015 1
Traditional Way of doing things
• Make a case for reducing energy consumption.
• Traditionally, energy optimization is handled in hardware.
  • Voltage scaling, heterogeneous cores, specialized cores, and many others.
• On the software side, optimization mainly targets increasing speed and reducing the size of the compiled code.
  • Extracting instruction-, thread-, and data-level parallelism.
Post Compile Software Optimization
• Handle the optimizations at the software level.
• Take the compiled code output from a standard compiler.
• How can this be achieved? One approach is:
“A Genetic Optimization Algorithm that uses concepts from evolutionary computation to stochastically mutate the software in search of an optimal implementation, all while preserving strict functional semantics.”
Background Concepts
• Functional vs. non-functional requirements.
  • Ongoing debate between the two.
  • Functional requirements: adherence to specifications, correctness of the code.
  • Non-functional requirements: memory utilization, energy consumption.
• Stochastic methods.
  • Used heavily in evolutionary computation.
  • Randomly try out different combinations.
Background Concepts .. continued
• Profile-guided optimization.
  • The program is profiled by running it and gathering run-time data.
  • Call graph generation.
  • Enforcing a “nearest is the best” policy.
• Software robustness even after mutation.
  • Random mutations of the software often preserve its semantic meaning.
  • Many implementations are possible that lead to the same semantic goal.
Background Concepts .. continued
• Evolutionary computation.
  • Darwinian principles.
  • Generally applied as a black-box approach.
• Steady-state algorithms.
  • After each iteration, candidates are simply inserted back into the population.
  • The best among them is selected, or rather the worst is deleted.
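
The steady-state update described above can be sketched in a few lines of Python. This is an illustrative toy, not the authors' implementation: the tournament size, the integer "individuals", and the `steady_state_step` helper are all assumptions made for the example, and lower fitness is treated as better.

```python
import random

def steady_state_step(population, fitness, mutate, rng, tournament_size=2):
    """One steady-state iteration: select a parent, vary it, insert, evict."""
    # Tournament selection: the fittest (lowest score) of a random sample.
    parent = min(rng.sample(population, tournament_size), key=fitness)
    child = mutate(parent, rng)
    population.append(child)
    # Negative tournament: evict the worst of another random sample,
    # so the population size stays constant.
    loser = max(rng.sample(population, tournament_size), key=fitness)
    population.remove(loser)
    return population

# Hypothetical toy problem: individuals are integers, lower is better.
rng = random.Random(0)
pop = [rng.randint(50, 100) for _ in range(8)]
start_best = min(pop)
for _ in range(200):
    steady_state_step(
        pop,
        fitness=lambda x: x,
        mutate=lambda x, r: x + r.choice([-1, 1]),
        rng=rng,
    )
print(start_best, min(pop), len(pop))
```

Because only the loser of a two-way comparison is ever evicted, the best individual is never lost, which is why there is no separate "generation" step.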
Genetic Optimization Algorithm (GOA)
• “A Genetic Optimization Algorithm that uses concepts from evolutionary computation to stochastically mutate the software in search of an optimal implementation, all while preserving strict functional semantics.”
• Takes three inputs to start:
  • Benchmark applications or kernels.
  • Test suites that validate each mutation.
  • A fitness function.
High-level working of GOA
GOA Working .. continued
• Take the program.
• Create many random variants of the program by reordering, deleting, and editing instructions.
• Test each new variant against the supplied test suites.
• If a variant passes, check whether it improves the non-functional requirements (fitness) function.
• If so, emit the assembly code as the optimized program after applying the minimization technique.
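
The test-then-score step in this workflow might look like the sketch below. It is a stand-in under stated assumptions: real GOA variants are assembly programs that are linked and executed, whereas here the variants are plain Python callables, and the `modeled_cost` table is a made-up substitute for the energy model.

```python
import math

def evaluate(variant, test_suite, cost):
    """Return the modeled cost of a variant, or infinity if any test fails."""
    for args, expected in test_suite:
        try:
            if variant(*args) != expected:
                return math.inf
        except Exception:
            return math.inf  # crashing variants are rejected as well
    return cost(variant)

# Hypothetical test suite: (arguments, expected output) pairs.
tests = [((2, 3), 6), ((0, 5), 0), ((4, 1), 4)]

def mul_loop(a, b):      # stand-in for the original program
    total = 0
    for _ in range(b):
        total += a
    return total

def mul_direct(a, b):    # functionally equivalent, cheaper variant
    return a * b

def mul_broken(a, b):    # variant that breaks the semantics
    return a + b

# Made-up costs; the broken variant is "cheapest" but must never win.
modeled_cost = {mul_loop: 10.0, mul_direct: 1.0, mul_broken: 0.5}
scores = {f.__name__: evaluate(f, tests, modeled_cost.get) for f in modeled_cost}
best = min(scores, key=scores.get)
print(scores, best)
```

Scoring a failing variant as infinitely expensive keeps it from ever being selected, no matter how cheap it looks to the cost model.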
Representation of Assembly code
• A very simple strategy is adopted to represent the assembly code.
• Each line has a cell in an array.
• One line can also be broken down into multiple cells.
• Augmented instructions are avoided.
  • This limits the search space.
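
A minimal sketch of this array representation follows, with three simple mutation operators on it. The operator set (delete, swap, copy), the sample assembly, and all function names are assumptions for illustration; the slides only say that instructions are reordered, deleted, and edited. Cells are treated as opaque strings, so operands are never modified.

```python
import random

# Hypothetical assembly fragment used only for the demonstration.
ASM = """\
    movl $0, %eax
    addl $1, %eax
    addl $2, %eax
    ret
"""

def parse(asm_text):
    """One array cell per non-empty line of assembly."""
    return [line for line in asm_text.splitlines() if line.strip()]

def delete(genome, i):
    """Remove the cell at index i."""
    return genome[:i] + genome[i + 1:]

def swap(genome, i, j):
    """Exchange the cells at indices i and j."""
    out = list(genome)
    out[i], out[j] = out[j], out[i]
    return out

def copy(genome, src, dst):
    """Insert a copy of cell src before position dst."""
    return genome[:dst] + [genome[src]] + genome[dst:]

def mutate(genome, rng):
    """Apply one randomly chosen operator (genome must be non-empty)."""
    op = rng.choice(["delete", "swap", "copy"])
    i, j = rng.randrange(len(genome)), rng.randrange(len(genome))
    if op == "delete":
        return delete(genome, i)
    if op == "swap":
        return swap(genome, i, j)
    return copy(genome, i, j)

genome = parse(ASM)
variant = mutate(genome, random.Random(1))
print("\n".join(variant))
```

Each operator returns a new list, so the original program is left untouched and can be diffed against a variant later.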
Experimental Setup and Benchmark Kernels:
• An Intel machine is used as an example of a desktop computer.
  • i7, 4 cores, 8 GB RAM
• An AMD machine is used as an example of a server-scale machine.
  • 48 cores, 128 GB RAM
• 8 kernels from the PARSEC benchmark suite are used.
  • blackscholes, bodytrack, ferret, fluidanimate, freqmine, swaptions, vips, and x264
• Each should keep the underlying architecture running for at least 1 second and produce output.
Input Test Suites
• Comprehensive test suites for each kernel.
• The smallest input size of the test suite is considered.
• These only validate the requirements specification; stress or boundary testing is not needed at this point.
Fitness Function
• GOA proposes a linear, scalar energy model.
• Hardware counters are captured using the “perf” utility on Linux.
• Tightly coupled with the underlying architecture and fine-grained.
• Heavily dependent on the time factor.
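$$\text{power} = C_{\text{const}} + C_{\text{ins}}\,\frac{\text{ins}}{\text{cycle}} + C_{\text{fpos}}\,\frac{\text{fpos}}{\text{cycle}} + C_{\text{tca}}\,\frac{\text{tca}}{\text{cycle}} + C_{\text{mem}}\,\frac{\text{mem}}{\text{cycle}}$$
$$\text{energy} = \text{seconds} \times \text{power}$$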
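
As a rough illustration, the linear model can be evaluated as in the sketch below. The counter names (`ins`, `fpos`, `tca`, `mem`) are copied from the slide; the coefficient values and the sample counter readings are placeholders invented for the example, not the empirically derived constants.

```python
from dataclasses import dataclass

@dataclass
class Counters:
    """Hardware counter totals for one run (names as on the slide)."""
    seconds: float
    cycles: float
    ins: float
    fpos: float
    tca: float
    mem: float

# Placeholder coefficients for illustration only, NOT measured values.
COEFFS = {"const": 30.0, "ins": 20.0, "fpos": 10.0, "tca": 5.0, "mem": 1000.0}

def power(c, k=COEFFS):
    """Linear power model: a constant plus weighted per-cycle counter rates."""
    return (
        k["const"]
        + k["ins"] * c.ins / c.cycles
        + k["fpos"] * c.fpos / c.cycles
        + k["tca"] * c.tca / c.cycles
        + k["mem"] * c.mem / c.cycles
    )

def energy(c, k=COEFFS):
    """Energy is run time multiplied by modeled power."""
    return c.seconds * power(c, k)

# Made-up counter readings for a two-second run.
sample = Counters(seconds=2.0, cycles=1e9, ins=1e9, fpos=0.0, tca=2e8, mem=1e6)
print(power(sample), energy(sample))
```

Since energy scales directly with `seconds`, a variant that simply runs faster scores better even if its per-cycle rates are unchanged, which matches the slide's remark that the model depends heavily on time.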
Constants Derived from Empirical Study
Minimization Technique
• Iterations tend to create redundant patterns of code.
• The goal is to get the best energy efficiency with the fewest changes.
• Delta debugging is used to compare and remove redundant, non-influential changes.
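
The minimization step can be illustrated with a compact delta-debugging routine over a list of edits. This is a sketch, assuming the edits are unique and that a hypothetical predicate `keeps_gain` reports whether applying a subset of edits still passes the tests and retains the energy improvement; in the toy below the gain is assumed to depend only on edits `e2` and `e5`.

```python
def ddmin(items, test):
    """Return a 1-minimal sublist of items for which test(sublist) holds.

    Assumes the items are unique and that test(items) is True.
    """
    assert test(items)
    n = 2  # current number of chunks
    while len(items) >= 2:
        chunk = max(len(items) // n, 1)
        subsets = [items[i:i + chunk] for i in range(0, len(items), chunk)]
        reduced = False
        # First try to keep a single chunk.
        for s in subsets:
            if test(s):
                items, n, reduced = s, 2, True
                break
        # Otherwise try to drop a single chunk.
        if not reduced:
            for s in subsets:
                complement = [x for x in items if x not in s]
                if test(complement):
                    items, n, reduced = complement, max(n - 1, 2), True
                    break
        # Otherwise refine the granularity, or stop at single items.
        if not reduced:
            if n >= len(items):
                break
            n = min(n * 2, len(items))
    return items

# Hypothetical edits between the original and the best variant.
edits = [f"e{i}" for i in range(8)]

def keeps_gain(subset):
    """Toy oracle: the improvement survives iff e2 and e5 are both applied."""
    return {"e2", "e5"} <= set(subset)

minimal = ddmin(edits, keeps_gain)
print(minimal)
```

The result is 1-minimal: removing any single remaining edit would lose the improvement, so the six redundant edits are discarded.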
Code Example of GOA
Post processing the optimized code
• Execute the original code with a held-out test suite.
  • Obtain real wall-socket measurements.
• Execute the optimized code with the held-out test suite.
  • Obtain real wall-socket measurements.
• Compare the two results to find which patterns saw improvements and the percentage improvement in energy consumption.
Results
• In the blackscholes kernel, GOA caught the induced repetition loop and found a way around it.
• In the swaptions kernel, GOA gave 42% energy savings.
  • This has to be taken with a pinch of salt, though.
• In the vips kernel, cache misses actually increased while the number of instruction lines decreased, and hence a 20% improvement was observed.
Interesting Observations
• A 7% average error is found in most prediction models, and likewise in GOA's.
  • GOA still works fine with it.
• Empirical studies show that GOA might be better suited to finding efficient sequences of assembly instructions than efficient memory access patterns.
• The energy reduction percentage is consistently higher on the AMD machine.
  • This is mainly because the bigger machine offers more opportunities.
QoS dependent Optimization
• “Relaxed” preservation of semantics and more emphasis on QoS.
• The plug-and-play test suite policy gives the developer the option of making GOA strict or loose on semantics.
• Relaxed functional requirements allow much greater energy efficiency, but the developer takes on the risk of ensuring the program semantics do not break.
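
One way to picture the strict versus relaxed choice is as two interchangeable output checks plugged into the test suite. The sketch below is an assumption for illustration: the 1% relative tolerance, the numeric output lists, and the function names are hypothetical and not taken from the paper.

```python
import math

def strict_ok(expected, actual):
    """Strict semantics: outputs must match exactly."""
    return expected == actual

def relaxed_ok(expected, actual, rel_tol=0.01):
    """Relaxed semantics: each value may deviate by a relative tolerance."""
    return len(expected) == len(actual) and all(
        math.isclose(e, a, rel_tol=rel_tol) for e, a in zip(expected, actual)
    )

# Hypothetical numeric outputs of the original program and two variants.
reference = [10.0, 20.0, 30.0]
close_variant = [10.05, 19.9, 30.0]
far_variant = [10.0, 25.0, 30.0]

print(strict_ok(reference, close_variant))   # exact comparison
print(relaxed_ok(reference, close_variant))  # tolerance-based comparison
print(relaxed_ok(reference, far_variant))
```

Swapping `strict_ok` for `relaxed_ok` widens the set of variants that count as correct, which is where the extra energy savings come from, and also where the developer's risk lies.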
Key contributions
• The Genetic Optimization Algorithm (GOA) combines insights from profile-guided optimization, superoptimization, evolutionary computation, and mutational robustness.
• This technique gave 20% average energy savings across all benchmarks.
• Very simple, and mostly leverages already available techniques.
Drawbacks of GOA
• Energy constants are obtained empirically over repeated runs on specific hardware.
  • Introducing GOA on a new architecture will take a considerable amount of work.
• The non-deterministic approach makes it almost impossible to restore an earlier code path after the software is changed even slightly.
  • Must provide indexing of the code paths and remember them.
• Very high-quality test suites are “required”.
  • Failure to provide them might result in over-optimized code that only appears to work.
Proposed Future Work
• Currently only applied to x86.
• A matrix implementation is proposed as a solution to this problem.
• Indirect selection can optimize one parameter at the cost of worsening another.
• Should be generalized to Java bytecode and ARM.
• Instead of a compiler that takes a predefined “agreed” path, code should be compiled with multiple compilers using multiple paths, and the best result should then be selected.
Questions / Discussion