Optimizing Code using Parallel Genetic Algorithm (93/3/5)
Outline: Introduction, Background, Methodology, Experimental results
Compiler optimization
• Compiler optimization is the technique of minimizing or maximizing some properties of executable code by tuning the output of a compiler.
• Modern compilers support many different optimization phases; each phase must analyze the code and produce semantically equivalent, performance-enhanced code.
• The vital parameters defining performance enhancement include:
  - Execution time
  - Size of code
The phase ordering
• Compiler optimization phase ordering poses challenges not only to compiler developers but also to multithreaded programmers aiming to enhance the performance of multicore systems.
• Many compilers apply their numerous optimization techniques in a predetermined order.
• This fixed ordering of optimization techniques may not always produce optimal code.
[Figure: the search space of phase orderings for a piece of code]
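The size of this search space grows factorially with the number of phases, which is what motivates heuristic search such as a genetic algorithm. A minimal sketch (the pass names below are placeholders, not actual GCC pass identifiers):

```python
from itertools import permutations
from math import factorial

# Hypothetical optimization passes; real GCC has many more.
passes = ["const_prop", "dce", "inline", "unroll", "sched"]

# Exhaustive enumeration is feasible only for tiny pass sets.
orderings = list(permutations(passes))
print(len(orderings))  # 5! = 120 orderings for just five passes

# With 12 passes the space is already far too large to enumerate:
print(factorial(12))   # 479001600
```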
Optimization flags
• The best optimization phase ordering varies with the application being compiled, the architecture of the machine on which it runs, and the compiler implementation.
• Many compilers allow users to set optimization flags.
• Turning on optimization flags makes the compiler attempt to improve performance and code size at the expense of compilation time.
GNU Compiler Collection
• The GNU Compiler Collection (GCC) includes front ends for C, C++, Objective-C, Fortran, Java, Ada, and Go, as well as libraries for these languages.
• To control compilation time and compiler memory usage, and the trade-offs between speed and space for the resulting executable, GCC provides a range of general optimization levels, numbered 0-3, as well as individual options for specific types of optimization.
[Figure: the optimization levels -O1, -O2, -O3]
Optimization levels
• The impact of the different optimization levels on the input code is described below:
• -O0 (or no -O, the default): no optimization; the easiest level for debugging and bug elimination.
• -O1 (or -O): a lot of simple optimizations that eliminate redundancy, giving less compile time and smaller, faster executable code.
• -O2: O1 plus additional optimizations such as instruction scheduling, giving maximum optimization without increasing the executable size. Only optimizations that do not require any speed-space tradeoffs are used, so the executable should not increase in size. It costs more compile time and more memory usage, and is the best choice for deployment of a program.
• -O3: O1 and O2 plus more expensive optimizations such as function inlining and maximum loop optimization, producing faster but bulkier executable code.
The challenge
• Given a sequential quicksort and a parallel quicksort (which adds the overhead of inter-process communication), which optimization level should be chosen?
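The question above can be made concrete with toy versions of the two benchmark variants; a minimal sketch, with Python's multiprocessing standing in for the OpenMP parallelism used in the study:

```python
from multiprocessing import Pool

def quicksort(xs):
    # Plain sequential quicksort.
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return quicksort(left) + [pivot] + quicksort(right)

def parallel_quicksort(xs, workers=2):
    # Sort the two top-level partitions in separate processes; the
    # process start-up and result transfer are the inter-process
    # communication overhead the slide refers to.
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    with Pool(workers) as pool:
        sorted_left, sorted_right = pool.map(quicksort, [left, right])
    return sorted_left + [pivot] + sorted_right

if __name__ == "__main__":
    data = [5, 3, 8, 1, 9, 2, 7]
    print(quicksort(data))           # [1, 2, 3, 5, 7, 8, 9]
    print(parallel_quicksort(data))  # [1, 2, 3, 5, 7, 8, 9]
```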
Genetic algorithm
[Figure: the GA cycle - an initial population undergoes selection into an intermediate population (mating pool), then crossover and mutation, and finally replacement to form the next population]
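The cycle in the figure can be sketched as a minimal generational GA. The objective here (maximizing set bits in a bit string), the rates, and the population size are illustrative assumptions, not the paper's settings:

```python
import random

random.seed(0)

GENES, POP, GENERATIONS = 12, 6, 50

def fitness(chrom):
    # Toy objective: count of 1-bits (stand-in for a real fitness measure).
    return sum(chrom)

def select(pop):
    # Roulette-wheel selection: probability proportional to fitness.
    weights = [fitness(c) + 1e-9 for c in pop]
    return random.choices(pop, weights=weights, k=2)

def crossover(a, b):
    # Two-point crossover: exchange the middle segment.
    i, j = sorted(random.sample(range(GENES), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def mutate(chrom, rate=0.05):
    # Flip each gene with a small probability.
    return [1 - g if random.random() < rate else g for g in chrom]

pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]
for _ in range(GENERATIONS):
    nxt = []
    while len(nxt) < POP:
        a, b = select(pop)              # mating pool
        c, d = crossover(a, b)          # crossover
        nxt += [mutate(c), mutate(d)]   # mutation
    pop = nxt                           # replacement
best = max(pop, key=fitness)
print(fitness(best))  # best fitness found
```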
PGA for Compiler Optimization
• This research uses the GCC 4.8 compiler on Ubuntu 12.04 with the OpenMP 3.0 library.
The master-slave model
• In the master-slave model the master runs the evolutionary algorithm, controls the slaves, and distributes the work.
• The slaves take batches of individuals from the master, evaluate them, and finally send the calculated fitness values back to the master.
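A minimal sketch of this division of labor, using a Python process pool as the slaves (the fitness function is a placeholder, not the paper's):

```python
from multiprocessing import Pool

def evaluate(individual):
    # Slave-side work: compute the fitness of one individual.
    # Placeholder objective; the real system measures execution time.
    return sum(individual)

def master(population, workers=4):
    # Master distributes individuals to slaves and collects fitnesses.
    with Pool(workers) as pool:
        return pool.map(evaluate, population)

if __name__ == "__main__":
    population = [[1, 0, 1], [0, 0, 1], [1, 1, 1]]
    print(master(population))  # [2, 1, 3]
```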
Fitness function
• In the proposed system the PGA works with a population of six chromosomes on an eight-core machine, and the fitness function is computed at the master core:
Fitness = |exe_with_flag_i - exe_without_flag_i|,  i ∈ {1, 2, …, 12}
where exe_with_flag_i and exe_without_flag_i denote execution with and without the i-th optimization flag.
[Figure: the master node generates a random population and evaluates all individuals; the slave nodes run the algorithm; the process stops after 200 generations]
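A sketch of this fitness computation, assuming the two arrays hold measured execution times in seconds (the numbers are made up for illustration):

```python
# Hypothetical measured execution times (seconds) for the 12 flags.
exe_with_flag = [1.8, 2.1, 1.5, 1.9, 2.0, 1.7, 1.6, 2.2, 1.4, 1.9, 1.8, 1.5]
exe_without_flag = [2.5] * 12  # baseline runs without the flag

def fitness(i):
    # Fitness of flag i: absolute change in execution time it causes.
    return abs(exe_with_flag[i] - exe_without_flag[i])

print(round(fitness(0), 2))  # 0.7
```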
Algorithm for Slave Nodes
Step 1: Receive all the chromosomes, with their fitness values, from the master node.
Step 2: The slave cores apply the roulette-wheel, Stochastic Universal Sampling, and elitism selection methods, respectively, in parallel.
Step 3: Create the next generation by applying two-point crossover.
Step 4: Apply mutation using the two-position-interchange method, producing two new offspring chromosomes.
Step 5: Send both chromosomes back to the master node. (The master collects chromosomes from all slaves.)
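Step 2's roulette-wheel selection can be sketched as follows (SUS and elitism are omitted; the chromosomes and fitness values are illustrative):

```python
import random

random.seed(1)

def roulette_wheel(population, fitnesses, k=2):
    # Each chromosome is selected with probability proportional
    # to its share of the total fitness.
    return random.choices(population, weights=fitnesses, k=k)

population = ["chrom_a", "chrom_b", "chrom_c", "chrom_d"]
fitnesses = [0.9, 0.1, 2.0, 1.0]  # hypothetical fitness values

parents = roulette_wheel(population, fitnesses)
print(parents)  # two picks, biased toward the fitter chromosomes
```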
Crossover and mutation
[Figure: two-point crossover and swap mutation applied to chromosomes]
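A minimal sketch of the two operators on 12-gene flag chromosomes (the cut points and swap positions are fixed here for clarity; in the GA they are chosen at random):

```python
def two_point_crossover(a, b, i, j):
    # Exchange the segment between cut points i and j.
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def swap_mutation(chrom, i, j):
    # Two-position interchange: swap the genes at positions i and j.
    chrom = chrom[:]
    chrom[i], chrom[j] = chrom[j], chrom[i]
    return chrom

a = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
b = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

c, d = two_point_crossover(a, b, 3, 9)
print(c)  # [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
print(swap_mutation(c, 0, 3))  # [0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0]
```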
Benchmarks
• All the benchmark programs are parallelized using the OpenMP library to reap the benefits of the PGA.
Performance analysis
• As the figures show, the results after applying the PGA (WGAO) present a major improvement over both random optimization (WRO) and compiling the code without optimization (WOO).
Conclusion
• In compiler optimization research, phase ordering is an important performance-enhancement problem.
• This study indicates that, with the PGA, increasing the number of cores increases the performance of the benchmark programs.
• The major concern in the experiment is the master core's waiting time while collecting values from the slaves, which is primarily due to the use of synchronized communication between the master and slave cores.
• Further, it may be noted that, apart from Prim's algorithm on the 8-core system, all the other benchmarks exhibit better average performance.