EvoSoft 2016 @ GECCO 2016
Denver, Colorado, USA
July 20-24, 2016
elephant56
Design and Implementation of a
Parallel Genetic Algorithms Framework
on Hadoop MapReduce
PASQUALE SALZA1, FILOMENA FERRUCCI1, FEDERICA SARRO2
1University of Salerno, 2University College London
https://github.com/pasqualesalza/elephant56
• An open source framework based on Hadoop
MapReduce
• Allows developers to focus on Genetic
Algorithms definition only
• Manages the interaction with Hadoop
MapReduce
56elephant
https://github.com/pasqualesalza/elephant56
elephant
the number of
chromosomes of the
elephant is 56
the mascot of Hadoop
is an elephant
56
Why parallelize?
• Genetic Algorithms are intensive time and
resource consuming
• Luckily they are naturally parallelizable!
Hadoop
• Created by Doug Cutting in 2006 when he was
in Yahoo!
• Nowadays, it is part of YARN (Yet Another
Resource Negotiator)
Why Hadoop?
• Large clusters of machines
• Scalability and reliability of computation
processes and storage
• Wide diffusion in industries (BigData)
HDFS
• Hadoop Distributed File System
• Multiple data nodes
• Files are split up in blocks, spread across data
nodes
• Blocks replicated multiple times for reliability
and performance
MapReduce
• A programming paradigm derived from
functional programming
• Based on two main functions:
• map: handles the parallelization
• reduce: collects and merge the results
A Typical Day of a
MapReduce Job
• Implements word count, that counts word
occurrences in a text
Shuffling Reducing OutputMappingSplittingInput
Dear Bear River
Car Car River
Deer Car Bear
Deer Bear River
Car Car River
Deer Car Bear
Deer, 1
Bear, 1
River, 1
Car, 1
Car, 1
River, 1
Deer, 1
Car, 1
Bear, 1
Bear, (1, 1)
Car, (1, 1, 1)
Deer, (1, 1)
River, (1, 1)
Bear, 2
Car, 3
Deer, 2
River, 2
Bear, 2
Car, 3
Deer, 2
River, 2
How to parallelize
Genetic Algorithms?
Master/slave Model
Master/slave Model
• The master node manages the population of individuals and assigns
them to slave nodes
• The slaves evaluate only the fitness function
• The master applies other genetic operators
• Is similar to a sequential Genetic Algorithm
slave slave slave slave
master
Master/slave MapReduce
1. The Driver (master node) initializes the initial population
2. During each generation, it spreads the individuals across slave nodes
3. The slaves perform the fitness evaluation
4. The Driver applies the other genetic operators
...
FITNESS EVALUATION
Generation 1
...
Generation 2
. . .
ELITISM
PARENTS SELECTION
CROSSOVER
MUTATION
SURVIVAL SELECTION
Mapper 1
Mapper n
Driver Driver
Grid Model
Grid Model
• Applies genetic operators independently for each
neighborhood
• Improves the diversification of individuals during evolution
• Reduces the probability of local optimum convergence
Grid MapReduce
1. The neighborhoods are chosen randomly at each generation
2. The Driver splits the population in equal parts
3. The partitioner sends individuals to their assigned neighborhood
4. The reducers execute other genetic operators
...
...
FITNESS EVALUATION SHUFFLE
Generation 1
...
Generation 2
...
. . .
ELITISM
PARENTS SELECTION
CROSSOVER
MUTATION
SURVIVAL SELECTION
Mapper 1
Mapper n
Reducer 1
Reducer n
Driver
Island Model
Island Model
• The initial population is split in several groups (islands)
• The Genetic Algorithm is executed “independently” on each of them
• Periodically, the individuals are exchanged between islands through the
migration
• Explores different directions in the search space and the migration enhances the
diversity
Island MapReduce
1. The driver spreads individuals across islands
2. Consecutive generations are executed on islands
3. The migration occurs in periods with reducers
...
...
FITNESS EVALUATION
ELITISM
PARENTS SELECTION
CROSSOVER
MUTATION
SURVIVAL SELECTION MIGRATION
Generations Period 1
...
Generations Period 2
...
. . .
Mapper 1
Mapper n Reducer n
Driver
Reducer 1
Example of Use
Install elephant56
<repositories>
<repository>
<id>jcenter</id>
<name>bintray</name>
<url>http://jcenter.bintray.com</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>it.unisa</groupId>
<artifactId>elephant56</artifactId>
<version>1.0.0</version>
</dependency>
</dependencies>
You can install elephant56 by simply adding it as
a dependency to your maven (or gradle) project!
https://github.com/pasqualesalza/elephant56
The OneMax Problem
Individual
public class BitStringIndividual extends Individual {
private List<Boolean> bits;
public BitStringIndividual(int size) {
bits = new ArrayList<>(size);
}
public void set(int index, boolean value) {
bits.set(index, value);
}
public boolean get(int index) {
return bits.get(index);
}
public int size() {
return bits.size();
}
}
Fitness Value
public class IntegerFitnessValue extends FitnessValue {
private int number;
public IntegerFitnessValue(int value) {
number = value;
}
public int get() {
return number;
}
@Override
public int compareTo(FitnessValue other) {
if (other == null)
return 1;
Integer otherInteger = ((IntegerFitnessValue) other).get();
Integer.compare(number, otherInteger);
}
}
Fitness Evaluation
public class OneMaxFitnessEvaluation
extends FitnessEvaluation<BitStringIndividual, IntegerFitnessValue> {
@Override
public IntegerFitnessValue evaluate(
IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) {
BitStringIndividual individual = wrapper.getIndividual();
int count = 0;
for (int i = 0; i < individual.size(); i++)
if (individual.get(i))
count++;
return new IntegerFitnessValue(count);
}
}
Initialization
public class RandomBitStringInitialization
extends Initialization<BitStringIndividual, IntegerFitnessValue> {
public RandomBitStringInitilization(..., Properties userProperties, ...) {
individualSize = userProperties.getInt(INDIVIDUAL_SIZE_PROPERTY);
random = new Random();
}
@Override
public IndividualWrapper<BitStringIndividual, IntegerFitnessValue>
generateNextIndividual(int id) {
BitStringIndividual individual = new BitStringIndividual(individualSize);
for (int i = 0; i < individualSize; i++)
individual.set(i, random.nextInt(2) == 1);
return new IndividualWrapper(individual);
}
}
Crossover
public class BitStringSinglePointCrossover
extends Crossover<BitStringIndividual, IntegerFitnessValue> {
public BitStringSinglePointCrossover(...) {
random = new Random();
}
@Override
public List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>>
cross(IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper1, wrapper2, ...) {
BitStringIndividual parent1 = wrapper1.getIndividual();
BitStringIndividual parent2 = wrapper2.getIndividual();
cutPoint = random.nextInt(parent1.size());
BitStringIndividual child1 = new BitStringIndividual(parent1.size());
BitStringIndividual child2 = new BitStringIndividual(parent1.size());
for (int i = 0; i < cutPoint; i++) {
child1.set(i, parent1.get(i));
child2.set(i, parent2.get(i));
}
for (int i = cutPoint; i < parent1.size(); i++) {
child1.set(i, parent2.get(i));
child2.set(i, parent1.get(i));
}
List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> children = new ArrayList<>(2);
children.add(new IndividualWrapper<>(child1));
children.add(new IndividualWrapper<>(child2));
return children;
}
}
Mutation
public class BitStringMutation
extends Mutation<BitStringIndividual, IntegerFitnessValue> {
public BitStringMutation(...) {
mutationProbability =
userProperties.getDouble(MUTATION_PROBABILITY_PROPERTY);
random = new Random();
}
@Override
public IndividualWrapper<BitStringIndividual, IntegerFitnessValue> mutate(
IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) {
BitStringIndividual individual = wrapper.getIndividual();
for (int i = 0; i < individual.size(); i++)
if (random.nextDouble() <= mutationProbability)
individual.set(i, !individual.get(i));
return wrapper;
}
}
App
public class App {
public static void main(String[] args) {
driver = new GlobalDriver();
driver.setIndividualClass(BitStringIndividual.class);
driver.setFitnessValueClass(IntegerFitnessValue.class);
driver.setInitializationClass(RandomBitStringInitialization.class);
driver.setInitializationPopulationSize(POPULATION_SIZE);
userProperties.setInt(INDIVIDUAL_SIZE_PROPERTY, INDIVIDUAL_SIZE);
driver.setElitismClass(BestIndividualsElitism.class);
driver.activateElitism(true);
userProperties.setInt(NUMBER_OF_ELITISTS_PROPERTY, NUMBER_OF_ELITISTS);
driver.setParentsSelectionClass(RouletteWheelParentsSelection.class);
driver.setCrossoverClass(BitStringSinglePointCrossover.class);
driver.setSurvivalSelectionClass(RouletteWheelSurvivalSelection.class);
driver.activateSurvivalSelection(true);
driver.setUserProperties(userProperties);
driver.run();
}
}
driver.run()
It works!
Thank you!
PLEASE, COLLABORATE!
https://github.com/pasqualesalza/elephant56
pasquale.salza@gmail.com

elephant56: Design and Implementation of a Parallel Genetic Algorithms Framework on Hadoop MapReduce

  • 1.
    EvoSoft 2016 @ GECCO2016 Denver, Colorado, USA July 20-24, 2016 elephant56 Design and Implementation of a Parallel Genetic Algorithms Framework on Hadoop MapReduce PASQUALE SALZA1, FILOMENA FERRUCCI1, FEDERICA SARRO2 1University of Salerno, 2University College London https://github.com/pasqualesalza/elephant56
  • 2.
    • An opensource framework based on Hadoop MapReduce • Allows developers to focus on Genetic Algorithms definition only • Manages the interaction with Hadoop MapReduce 56elephant https://github.com/pasqualesalza/elephant56
  • 3.
    elephant the number of chromosomesof the elephant is 56 the mascot of Hadoop is an elephant 56
  • 4.
    Why parallelize? • GeneticAlgorithms are intensive time and resource consuming • Luckily they are naturally parallelizable!
  • 5.
    Hadoop • Created byDoug Cutting in 2006 when he was in Yahoo! • Nowadays, it is part of YARN (Yet Another Resource Negotiator)
  • 6.
    Why Hadoop? • Largeclusters of machines • Scalability and reliability of computation processes and storage • Wide diffusion in industries (BigData)
  • 7.
    HDFS • Hadoop DistributedFile System • Multiple data nodes • Files are split up in blocks, spread across data nodes • Blocks replicated multiple times for reliability and performance
  • 8.
    MapReduce • A programmingparadigm derived from functional programming • Based on two main functions: • map: handles the parallelization • reduce: collects and merge the results
  • 9.
    A Typical Dayof a MapReduce Job • Implements word count, that counts word occurrences in a text Shuffling Reducing OutputMappingSplittingInput Dear Bear River Car Car River Deer Car Bear Deer Bear River Car Car River Deer Car Bear Deer, 1 Bear, 1 River, 1 Car, 1 Car, 1 River, 1 Deer, 1 Car, 1 Bear, 1 Bear, (1, 1) Car, (1, 1, 1) Deer, (1, 1) River, (1, 1) Bear, 2 Car, 3 Deer, 2 River, 2 Bear, 2 Car, 3 Deer, 2 River, 2
  • 10.
  • 11.
  • 12.
    Master/slave Model • Themaster node manages the population of individuals and assigns them to slave nodes • The slaves evaluate only the fitness function • The master applies other genetic operators • Is similar to a sequential Genetic Algorithm slave slave slave slave master
  • 13.
    Master/slave MapReduce 1. TheDriver (master node) initializes the initial population 2. During each generation, it spreads the individuals across slave nodes 3. The slaves perform the fitness evaluation 4. The Driver applies the other genetic operators ... FITNESS EVALUATION Generation 1 ... Generation 2 . . . ELITISM PARENTS SELECTION CROSSOVER MUTATION SURVIVAL SELECTION Mapper 1 Mapper n Driver Driver
  • 14.
  • 15.
    Grid Model • Appliesgenetic operators independently for each neighborhood • Improves the diversification of individuals during evolution • Reduces the probability of local optimum convergence
  • 16.
    Grid MapReduce 1. Theneighborhoods are chosen randomly at each generation 2. The Driver splits the population in equal parts 3. The partitioner sends individuals to their assigned neighborhood 4. The reducers execute other genetic operators ... ... FITNESS EVALUATION SHUFFLE Generation 1 ... Generation 2 ... . . . ELITISM PARENTS SELECTION CROSSOVER MUTATION SURVIVAL SELECTION Mapper 1 Mapper n Reducer 1 Reducer n Driver
  • 17.
  • 18.
    Island Model • Theinitial population is split in several groups (islands) • The Genetic Algorithm is executed “independently” on each of them • Periodically, the individuals are exchanged between islands through the migration • Explores different directions in the search space and the migration enhances the diversity
  • 19.
    Island MapReduce 1. Thedriver spreads individuals across islands 2. Consecutive generations are executed on islands 3. The migration occurs in periods with reducers ... ... FITNESS EVALUATION ELITISM PARENTS SELECTION CROSSOVER MUTATION SURVIVAL SELECTION MIGRATION Generations Period 1 ... Generations Period 2 ... . . . Mapper 1 Mapper n Reducer n Driver Reducer 1
  • 20.
  • 21.
  • 22.
  • 23.
    Individual public class BitStringIndividualextends Individual { private List<Boolean> bits; public BitStringIndividual(int size) { bits = new ArrayList<>(size); } public void set(int index, boolean value) { bits.set(index, value); } public boolean get(int index) { return bits.get(index); } public int size() { return bits.size(); } }
  • 24.
    Fitness Value public classIntegerFitnessValue extends FitnessValue { private int number; public IntegerFitnessValue(int value) { number = value; } public int get() { return number; } @Override public int compareTo(FitnessValue other) { if (other == null) return 1; Integer otherInteger = ((IntegerFitnessValue) other).get(); Integer.compare(number, otherInteger); } }
  • 25.
    Fitness Evaluation public classOneMaxFitnessEvaluation extends FitnessEvaluation<BitStringIndividual, IntegerFitnessValue> { @Override public IntegerFitnessValue evaluate( IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) { BitStringIndividual individual = wrapper.getIndividual(); int count = 0; for (int i = 0; i < individual.size(); i++) if (individual.get(i)) count++; return new IntegerFitnessValue(count); } }
  • 26.
    Initialization public class RandomBitStringInitialization extendsInitialization<BitStringIndividual, IntegerFitnessValue> { public RandomBitStringInitilization(..., Properties userProperties, ...) { individualSize = userProperties.getInt(INDIVIDUAL_SIZE_PROPERTY); random = new Random(); } @Override public IndividualWrapper<BitStringIndividual, IntegerFitnessValue> generateNextIndividual(int id) { BitStringIndividual individual = new BitStringIndividual(individualSize); for (int i = 0; i < individualSize; i++) individual.set(i, random.nextInt(2) == 1); return new IndividualWrapper(individual); } }
  • 27.
    Crossover public class BitStringSinglePointCrossover extendsCrossover<BitStringIndividual, IntegerFitnessValue> { public BitStringSinglePointCrossover(...) { random = new Random(); } @Override public List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> cross(IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper1, wrapper2, ...) { BitStringIndividual parent1 = wrapper1.getIndividual(); BitStringIndividual parent2 = wrapper2.getIndividual(); cutPoint = random.nextInt(parent1.size()); BitStringIndividual child1 = new BitStringIndividual(parent1.size()); BitStringIndividual child2 = new BitStringIndividual(parent1.size()); for (int i = 0; i < cutPoint; i++) { child1.set(i, parent1.get(i)); child2.set(i, parent2.get(i)); } for (int i = cutPoint; i < parent1.size(); i++) { child1.set(i, parent2.get(i)); child2.set(i, parent1.get(i)); } List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> children = new ArrayList<>(2); children.add(new IndividualWrapper<>(child1)); children.add(new IndividualWrapper<>(child2)); return children; } }
  • 28.
    Mutation public class BitStringMutation extendsMutation<BitStringIndividual, IntegerFitnessValue> { public BitStringMutation(...) { mutationProbability = userProperties.getDouble(MUTATION_PROBABILITY_PROPERTY); random = new Random(); } @Override public IndividualWrapper<BitStringIndividual, IntegerFitnessValue> mutate( IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) { BitStringIndividual individual = wrapper.getIndividual(); for (int i = 0; i < individual.size(); i++) if (random.nextDouble() <= mutationProbability) individual.set(i, !individual.get(i)); return wrapper; } }
  • 29.
    App public class App{ public static void main(String[] args) { driver = new GlobalDriver(); driver.setIndividualClass(BitStringIndividual.class); driver.setFitnessValueClass(IntegerFitnessValue.class); driver.setInitializationClass(RandomBitStringInitialization.class); driver.setInitializationPopulationSize(POPULATION_SIZE); userProperties.setInt(INDIVIDUAL_SIZE_PROPERTY, INDIVIDUAL_SIZE); driver.setElitismClass(BestIndividualsElitism.class); driver.activateElitism(true); userProperties.setInt(NUMBER_OF_ELITISTS_PROPERTY, NUMBER_OF_ELITISTS); driver.setParentsSelectionClass(RouletteWheelParentsSelection.class); driver.setCrossoverClass(BitStringSinglePointCrossover.class); driver.setSurvivalSelectionClass(RouletteWheelSurvivalSelection.class); driver.activateSurvivalSelection(true); driver.setUserProperties(userProperties); driver.run(); } }
  • 30.
  • 31.
  • 32.