elephant56 is an open source framework for the development and execution of single and parallel Genetic Algorithms (GAs). It provides high-level functionalities that developers can reuse without worrying about complex internal structures. In particular, it makes it possible to distribute GA computation over a Hadoop MapReduce cluster of multiple computers. In this paper we describe the design and implementation details of the framework, which supports three different models for parallel GAs, namely the global model, the grid model and the island model. Moreover, we provide a complete example of use.
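The island model mentioned above is easy to picture in miniature. The sketch below is not elephant56 code (its actual API is Java on Hadoop); it is a hypothetical, self-contained Python toy showing the idea: independent subpopulations evolve in parallel and periodically exchange their best individuals in a ring, here on the classic OneMax problem. All function and parameter names are invented for illustration.

```python
import random

def one_max(ind):
    # Fitness: number of 1-bits (the classic OneMax toy problem).
    return sum(ind)

def evolve_island(pop, mutation_rate, rng):
    # One generation: binary tournament selection, one-point crossover,
    # and per-bit flip mutation.
    new_pop = []
    while len(new_pop) < len(pop):
        p1 = max(rng.sample(pop, 2), key=one_max)
        p2 = max(rng.sample(pop, 2), key=one_max)
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
        new_pop.append(child)
    return new_pop

def island_ga(n_islands=4, pop_size=20, genome_len=32,
              generations=50, migration_every=10, seed=1):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(genome_len)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        islands = [evolve_island(pop, 1.0 / genome_len, rng) for pop in islands]
        if gen % migration_every == 0:
            # Ring migration: each island receives the best individual of
            # its predecessor, replacing its own worst individual.
            bests = [max(pop, key=one_max) for pop in islands]
            for i, pop in enumerate(islands):
                worst = min(range(len(pop)), key=lambda j: one_max(pop[j]))
                pop[worst] = bests[(i - 1) % n_islands][:]
    return max((ind for pop in islands for ind in pop), key=one_max)

best = island_ga()
print(one_max(best), "of", len(best), "bits set")
```

In elephant56's island model the same structure applies, except each island runs as a separate MapReduce task and migration happens between Hadoop jobs.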
This work was presented at the Evolutionary Computation Software Systems Workshop at GECCO (EvoSoft), July, 2016, Denver, Colorado, United States of America.
Genetic Algorithms (GAs) are a metaheuristic search technique belonging to the class of Evolutionary Algorithms (EAs). They have been proven effective in addressing several problems in many fields, but they also suffer from scalability issues that may prevent their application to real-world problems. Thus, the aim of providing highly scalable GA-based solutions, together with the reduced costs of parallel architectures, motivates the research on Parallel Genetic Algorithms (PGAs). Cloud computing may be a valid option for parallelisation, since there is no need to own the physical hardware, which can be purchased from cloud providers for the desired time, quantity and quality. There are different cloud technologies and approaches employable for this purpose, but they all introduce communication overhead. Thus, one might wonder if, and possibly when, specific approaches, environments and models show better performance than sequential versions in terms of execution time and resource usage. This thesis investigates if and when GAs can scale in the cloud using specific approaches. Firstly, Hadoop MapReduce is exploited by designing and developing an open source framework, i.e., elephant56, that reduces the development effort and speeds up GAs using three parallel models. The performance of the framework is then evaluated through an empirical study. Secondly, software containers and message queues are employed to develop, deploy and execute PGAs in the cloud, and the devised system is evaluated with an empirical study on a commercial cloud provider. Finally, cloud technologies are also explored for the parallelisation of other EAs, designing and developing cCube, a collaborative microservices architecture for machine learning problems.
This PhD thesis in Management & Information Technology was defended on 20 April 2017 at the University of Salerno, Fisciano, Italy.
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai, at MLconf SEA, 5/20/16 (MLconf)
Multi-algorithm Ensemble Learning at Scale: Software, Hardware and Algorithmic Approaches: Multi-algorithm ensemble machine learning methods are often used when the true prediction function is not easily approximated by a single algorithm. The Super Learner algorithm, also known as stacking, combines multiple, typically diverse, base learning algorithms into a single, powerful prediction function through a secondary learning process called metalearning. Although ensemble methods offer superior performance over their singleton counterparts, there is an implicit computational cost to ensembles, as they require training and cross-validating multiple base learning algorithms.
We will demonstrate a variety of software- and hardware-based approaches that lead to more scalable ensemble learning software, including a highly scalable implementation of stacking called “H2O Ensemble”, built on top of the open source, distributed machine learning platform, H2O. H2O Ensemble scales across multi-node clusters and allows the user to create ensembles of deep neural networks, Gradient Boosting Machines, Random Forest, and others. As for algorithm-based approaches, we will present two algorithmic modifications to the original stacking algorithm that further reduce computation time — Subsemble algorithm and the Online Super Learner algorithm. This talk will also include benchmarks of the implementations of these new stacking variants.
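The metalearning step described above can be sketched compactly. The following is not the H2O Ensemble API; it is a minimal, hypothetical Python illustration of stacking: base learners are trained on cross-validation folds, their out-of-fold predictions form the "level-one" data, and a least-squares metalearner combines them. The two base learners (a mean predictor and a 1-nearest-neighbour predictor) and every name here are invented for illustration.

```python
def fit_mean(X, y):
    # Base learner 1: predicts the training-set mean everywhere.
    m = sum(y) / len(y)
    return lambda x: m

def fit_1nn(X, y):
    # Base learner 2: predicts the target of the nearest training point (1-D).
    pairs = list(zip(X, y))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def stack(X, y, base_fits, n_folds=5):
    # Level-one data: out-of-fold predictions of each base learner.
    n = len(X)
    Z = [[0.0] * len(base_fits) for _ in range(n)]
    for f in range(n_folds):
        test_idx = [i for i in range(n) if i % n_folds == f]
        train_idx = [i for i in range(n) if i % n_folds != f]
        Xtr = [X[i] for i in train_idx]
        ytr = [y[i] for i in train_idx]
        models = [fit(Xtr, ytr) for fit in base_fits]
        for i in test_idx:
            for j, m in enumerate(models):
                Z[i][j] = m(X[i])
    # Metalearner: least-squares weights for the two base learners,
    # solved directly via the 2x2 normal equations.
    a = sum(z[0] * z[0] for z in Z)
    b = sum(z[0] * z[1] for z in Z)
    c = sum(z[1] * z[1] for z in Z)
    d1 = sum(z[0] * yi for z, yi in zip(Z, y))
    d2 = sum(z[1] * yi for z, yi in zip(Z, y))
    det = a * c - b * b
    w = ((c * d1 - b * d2) / det, (a * d2 - b * d1) / det)
    # Refit the base learners on all the data; the ensemble combines them.
    full = [fit(X, y) for fit in base_fits]
    return lambda x: sum(wi * m(x) for wi, m in zip(w, full))
```

On a simple linear dataset the stacked model should weight the 1-NN learner heavily and beat the mean baseline, which is the essence of what the metalearner buys you.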
Applying your Convolutional Neural Networks (Databricks)
Part 3 of the Deep Learning Fundamentals Series, this session starts with a quick primer on activation functions, learning rates, optimizers, and backpropagation. Then it dives deeper into convolutional neural networks, discussing convolutions (including kernels, local connectivity, strides, padding, and activation functions), pooling (or subsampling to reduce the image size), and fully connected layers. The session also provides a high-level overview of some CNN architectures. The demos included in these slides run on Keras with a TensorFlow backend on Databricks.
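The convolution and pooling operations named above can be written out directly. Below is a minimal pure-Python sketch (no framework; function names are invented for illustration), assuming single-channel images represented as lists of lists:

```python
def conv2d(image, kernel, stride=1, padding=0):
    # Valid cross-correlation with optional zero padding; an activation
    # function (e.g. ReLU) would normally be applied to the result.
    if padding:
        w = len(image[0]) + 2 * padding
        zeros = [0.0] * w
        image = ([zeros[:] for _ in range(padding)]
                 + [[0.0] * padding + row + [0.0] * padding for row in image]
                 + [zeros[:] for _ in range(padding)])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(0, len(image) - kh + 1, stride):
        row = []
        for j in range(0, len(image[0]) - kw + 1, stride):
            # Local connectivity: each output depends only on a kh x kw patch.
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

def max_pool(image, size=2, stride=2):
    # Subsampling: keep the maximum of each size x size window.
    out = []
    for i in range(0, len(image) - size + 1, stride):
        out.append([max(image[i + a][j + b]
                        for a in range(size) for b in range(size))
                    for j in range(0, len(image[0]) - size + 1, stride)])
    return out
```

A 4x4 input convolved with a 2x2 kernel at stride 1 yields a 3x3 feature map, and 2x2 max pooling at stride 2 then halves each spatial dimension, exactly the size bookkeeping the slides walk through.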
Sergei Vassilvitskii, Research Scientist, Google, at MLconf NYC, 4/15/16 (MLconf)
Teaching K-Means New Tricks: Over 50 years old, the k-means algorithm remains one of the most popular clustering algorithms. In this talk we’ll cover some recent developments, including better initialization, the notion of coresets, clustering at scale, and clustering with outliers.
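The "better initialization" referred to above is typically k-means++ seeding: each new center is sampled with probability proportional to its squared distance from the nearest center chosen so far. A minimal sketch on 1-D points follows (function names are illustrative, not from any library):

```python
import random

def kmeans_pp_init(points, k, rng):
    # k-means++ seeding: first center uniform at random, then each next
    # center sampled with probability proportional to D(x)^2, the squared
    # distance to the nearest already-chosen center.
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min((p - c) ** 2 for c in centers) for p in points]
        r = rng.random() * sum(d2)
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers

def kmeans(points, k, seed=0, iters=20):
    # Lloyd's algorithm on top of k-means++ seeding.
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: (p - centers[i]) ** 2)].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

print(kmeans([0.0, 1.0, 2.0, 100.0, 101.0, 102.0], k=2))  # two centers, one per cluster
```

On well-separated data the D^2 weighting makes it overwhelmingly likely that the seeds land in different clusters, which is precisely the failure mode of uniform random seeding that k-means++ fixes.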
How to win data science competitions with Deep Learning (Sri Ambati)
Note: Please download the slides first, otherwise some links won't work!
How to win Kaggle-style data science competitions and influence decisions with R, Deep Learning and H2O's fast algorithms.
We take a few public and Kaggle datasets and build models to win competitions on accuracy and scoring speed.
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Dmitry will show the audience how to get started with MXNet and build Deep Learning models to classify images, sound and text.
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon University (MLconf)
Fast, Cheap and Deep – Scaling Machine Learning: Distributed high throughput machine learning is both a challenge and a key enabling technology. Using a Parameter Server template we are able to distribute algorithms efficiently over multiple GPUs and in the cloud. This allows us to design very fast recommender systems, factorization machines, classifiers, and deep networks. This degree of scalability allows us to tackle computationally expensive problems efficiently, yielding excellent results e.g. in visual question answering.
Erin LeDell, Machine Learning Scientist, H2O.ai, at MLconf ATL 2016 (MLconf)
Multi-algorithm Ensemble Learning at Scale: Software, Hardware and Algorithmic Approaches: Multi-algorithm ensemble machine learning methods are often used when the true prediction function is not easily approximated by a single algorithm. The Super Learner algorithm, also known as stacking, combines multiple, typically diverse, base learning algorithms into a single, powerful prediction function through a secondary learning process called metalearning. Although ensemble methods offer superior performance over their singleton counterparts, there is an implicit computational cost to ensembles, as they require training and cross-validating multiple base learning algorithms.
We will demonstrate a variety of software- and hardware-based approaches that lead to more scalable ensemble learning software, including a highly scalable implementation of stacking called “H2O Ensemble”, built on top of the open source, distributed machine learning platform, H2O. H2O Ensemble scales across multi-node clusters and allows the user to create ensembles of deep neural networks, Gradient Boosting Machines, Random Forest, and others. As for algorithm-based approaches, we will present two algorithmic modifications to the original stacking algorithm that further reduce computation time — Subsemble algorithm and the Online Super Learner algorithm. This talk will also include benchmarks of the implementations of these new stacking variants.
Title: "Understanding PyTorch: PyTorch in Image Processing". Github: https://github.com/azarnyx/PyData_Meetup. The Dataset: https://goo.gl/CWmLWD.
The talk was given at the PyData Meetup that took place in Munich on 06.03.2019 at the Data Reply office. The speaker was Dmitrii Azarnykh, a data scientist at Data Reply.
This contains the agenda of the Spark Meetup I organised in Bangalore on Friday, the 23rd of Jan 2014. It also carries the slides for the talk I gave on distributed deep learning over Spark.
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B... (Databricks)
We all know what they say – the bigger the data, the better. But when the data gets really big, how do you mine it and what deep learning framework to use? This talk will survey, with a developer’s perspective, three of the most popular deep learning frameworks—TensorFlow, Keras, and PyTorch—as well as when to use their distributed implementations.
We’ll compare code samples from each framework and discuss their integration with distributed computing engines such as Apache Spark (which can handle massive amounts of data) as well as help you answer questions such as:
As a developer, how do I pick the right deep learning framework?
Do I want to develop my own model or should I employ an existing one?
How do I strike a trade-off between productivity and control through low-level APIs?
What language should I choose?
In this session, we will explore how to build a deep learning application with TensorFlow, Keras, or PyTorch in under 30 minutes. After this session, you will walk away with the confidence to evaluate which framework is best for you.
In KDD2011, Vijay Narayanan (Yahoo!) and Milind Bhandarkar (Greenplum Labs, EMC) conducted a tutorial on "Modeling with Hadoop". This is the second half of the tutorial.
Mining Big Data Streams with Apache SAMOA (Albert Bifet)
In this talk, we present Apache SAMOA, an open-source platform for mining big data streams with Apache Flink, Storm and Samza. Real-time analytics is becoming the fastest and most efficient way to obtain useful knowledge from what is happening now, allowing organizations to react quickly when problems appear or to detect new trends, helping to improve their performance. Apache SAMOA includes algorithms for the most common machine learning tasks, such as classification and clustering. It provides a pluggable architecture that allows it to run on Apache Flink, but also on several other distributed stream processing engines such as Storm and Samza.
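For a flavour of the stream-mining setting SAMOA targets, without any of its actual API, the sketch below runs a simple online perceptron over a stream in test-then-train (prequential) fashion: each arriving instance is first used for evaluation and then for a single incremental update, so the stream is never stored. Everything here is an invented, minimal illustration.

```python
def online_perceptron(stream, lr=0.1):
    # Processes one (features, label) pair at a time; labels are +1 or -1.
    # Prequential evaluation: predict first, then learn from the instance.
    w, b = None, 0.0
    seen = correct = 0
    for x, y in stream:
        if w is None:
            w = [0.0] * len(x)  # lazily sized from the first instance
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        pred = 1 if score > 0 else -1
        seen += 1
        correct += (pred == y)
        if pred != y:  # mistake-driven update
            w = [wi + lr * y * xi for wi, xi in zip(w, x)]
            b += lr * y
    return w, b, correct / seen
```

In SAMOA the same test-then-train loop is distributed across a stream processing engine (Flink, Storm or Samza) instead of running in a single process.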
Note: Make sure to download the slides to get the high-resolution version!
Also, you can find the webinar recording here (please also download for better quality): https://www.dropbox.com/s/72qi6wjzi61gs3q/H2ODeepLearningArnoCandel052114.mov
Come hear how Deep Learning in H2O is unlocking never before seen performance for prediction!
H2O is a Google-scale open source machine learning engine for R & Big Data. Enterprises can now use all of their data, without sampling, to build intelligent applications. This live webinar introduces Distributed Deep Learning concepts, implementation and results from recent developments. Real-world classification and regression use cases from the eBay text dataset, MNIST handwritten digits and cancer datasets will present the power of this game-changing technology.
Smart Scalable Feature Reduction with Random Forests with Erik Erlandson (Databricks)
Modern datacenters and IoT networks generate a wide variety of telemetry that makes excellent fodder for machine learning algorithms. Combined with feature extraction and expansion techniques such as word2vec or polynomial expansion, these data yield an embarrassment of riches for learning models and the data scientists who train them. However, these extremely rich feature sets come at a cost. High-dimensional feature spaces almost always include many redundant or noisy dimensions. These low-information features waste space and computation, and reduce the quality of learning models by diluting useful features.
In this talk, Erlandson will describe how Random Forest Clustering identifies useful features in data having many low-quality features, and will demonstrate a feature reduction application using Apache Spark to analyze compute infrastructure telemetry data.
Learn the principles of how Random Forest Clustering solves feature reduction problems, and how you can apply Random Forest tools in Apache Spark to improve your model training scalability, the quality of your models, and your understanding of application domains.
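Random-forest-style feature scoring can be approximated in miniature. The sketch below is pure Python with invented names, and it is not Erlandson's Random Forest Clustering method: instead it shows the related mean-decrease-in-impurity idea, scoring each feature by the average Gini impurity decrease of its best single-feature split across bootstrap samples, then keeping the top scorers.

```python
import random

def gini(labels):
    # Gini impurity of a binary label list: 2 * p * (1 - p).
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def stump_importance(X, y, feat, rng):
    # Impurity decrease of the best threshold split on one feature,
    # measured on one bootstrap sample (as a single "tree" would see it).
    idx = [rng.randrange(len(y)) for _ in range(len(y))]
    xs = [X[i][feat] for i in idx]
    ys = [y[i] for i in idx]
    base = gini(ys)
    best = 0.0
    for t in sorted(set(xs)):
        left = [l for v, l in zip(xs, ys) if v <= t]
        right = [l for v, l in zip(xs, ys) if v > t]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        best = max(best, base - weighted)
    return best

def forest_importances(X, y, n_trees=30, seed=0):
    # Average each feature's best-split impurity decrease over many
    # bootstrap samples; low scorers are candidates for removal.
    rng = random.Random(seed)
    k = len(X[0])
    scores = [0.0] * k
    for _ in range(n_trees):
        for f in range(k):
            scores[f] += stump_importance(X, y, f, rng)
    return [s / n_trees for s in scores]
```

With an informative feature and two noise features, the informative one should dominate the scores, and dropping the rest is the feature reduction step.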
TensorFrames: Google TensorFlow on Apache Spark (Databricks)
Presentation at Bay Area Spark Meetup by Databricks Software Engineer and Spark committer Tim Hunter.
This presentation covers how you can use TensorFrames with TensorFlow for distributed computing on GPUs.
Real-Time Hardware Simulation with Portable Hardware-in-the-Loop (PHIL-Rebooted) (Riley Waite)
SPRITE (Small Projects Rapid Integration and Test Environment) is a NASA project focused on creating open source software and easily accessible hardware for real-time, hardware-in-the-loop simulations. The SPRITE Lab's first iteration of this idea, called PHIL, was a system created with the intention of providing an affordable, modular environment for space flight testing. It functioned as an AC-powered compact PCI enclosure, one cubic foot in size. Although the PHIL project accomplished its goals, PHIL was found to be too impractical. Its execution platform was a heavily modified and outdated ARTEMIS, its user interface an old version of MAESTRO, and its visualization software bdStudio, all of which were derived from the SLS's proprietary HIL software. PHIL-Rebooted, the next iteration of PHIL, was created with the objective of reducing overall cost and open-sourcing the system. The entire system is being developed to run solely as a software package, as opposed to the original PHIL system. Not only is this beneficial in terms of user-friendliness, but it will also dramatically reduce the cost of the system by not requiring the purchase of proprietary hardware. Rather, users will only need an inexpensive, third-party USB-IO interfacing board for their closed-loop simulations, allowing for a variety of boards to be used. PHIL-Rebooted's new execution platform is libSPRITE, an open source, run-time process manager developed by the SPRITE Lab.
Scaling out logistic regression with Spark (Barak Gitsis)
Large-scale multinomial logistic regression with Spark. Contains animated GIFs, an analysis of L-BFGS, real-world Spark configurations, and the SimilarWeb categorization algorithm.
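For intuition about what the optimizer in such a system is fitting, here is a minimal binary logistic regression trained with plain full-batch gradient descent, a simple stand-in for the L-BFGS solver the talk analyses, with no Spark involved. All names are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=200):
    # Full-batch gradient descent on the log-loss. The gradient of the
    # log-loss w.r.t. the weights is sum((sigmoid(w.x + b) - y) * x) / n.
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        gw = [0.0] * len(w)
        gb = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b
```

L-BFGS reaches the same optimum far faster by approximating curvature from recent gradients, which is why it is the solver of choice at SimilarWeb's scale; the loss function being minimized is identical.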
Spoon is an open-source library that enables you to transform and analyze Java source code. Thanks to a complete and fine-grained Java metamodel, you can read and write the AST built by Spoon. In this talk, you'll see all the core concepts and the API with an example. Then, you'll see how you can integrate this project into yours.
A continuation of the discussion of object-oriented programming and design. We now cover Generic Programming as a way to build flexible and reusable components. We also look at reflection. Then we go over how to design loosely coupled components, using the famous duck simulator as an example.
Code Horror Dude will also drop by for a visit.
Separating Hype from Reality in Deep Learning with Sameer Farooqui (Databricks)
Deep Learning is all the rage these days, but where does the reality of what Deep Learning can do end and the media hype begin? In this talk, I will dispel common myths about Deep Learning that are not necessarily true and help you decide whether you should practically use Deep Learning in your software stack.
I’ll begin with a technical overview of common neural network architectures like CNNs, RNNs, GANs and their common use cases like computer vision, language understanding or unsupervised machine learning. Then I’ll separate the hype from reality around questions like:
• When should you prefer traditional ML systems like scikit-learn or Spark.ML instead of Deep Learning?
• Do you no longer need to do careful feature extraction and standardization if using Deep Learning?
• Do you really need terabytes of data when training neural networks or can you ‘steal’ pre-trained lower layers from public models by using transfer learning?
• How do you decide which activation function (like ReLU, leaky ReLU, ELU, etc) or optimizer (like Momentum, AdaGrad, RMSProp, Adam, etc) to use in your neural network?
• Should you randomly initialize the weights in your network or use more advanced strategies like Xavier or He initialization?
• How easy is it to overfit/overtrain a neural network, and what are the common techniques to avoid overfitting (like l1/l2 regularization, dropout and early stopping)?
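Of the regularization techniques in the last bullet, dropout is the simplest to show concretely. Below is a sketch of inverted dropout (a hypothetical helper, not taken from any framework): units are zeroed with probability p_drop during training and the survivors are scaled by 1/(1 - p_drop), so the expected activation is unchanged and no rescaling is needed at inference time.

```python
import random

def dropout_forward(activations, p_drop, rng, train=True):
    # Inverted dropout on a single layer's activations.
    # During training: drop each unit with probability p_drop and scale
    # survivors by 1/keep so E[output] == input.
    # At inference: the layer is the identity (no scaling needed).
    if not train or p_drop == 0.0:
        return activations
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

Because each forward pass samples a different mask, the network cannot rely on any single co-adapted unit, which is the regularizing effect.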
Azure Digital Twins is an Azure IoT service that allows you to create comprehensive models of the physical environment, covering your devices and the environment that surrounds them.
Let's discover what a Twin is and which features this new service offers.
A beginner's guide to annotation processing.
In this talk that I gave at Droidcon Tel Aviv in 2016, I walk you through the process of building a custom annotation processor which mimics some of the behavior you may be familiar with from the popular Android library: Butter Knife.
Clonal Selection Algorithm Parallelization with MPJExpressAyi Purbasari
This paper exploits the parallelism potential on a Clonal Selection Algorithm (CSA) as a parallel metaheuristic algorithm, due the lack of explanation detail of the stages of designing parallel algorithms. To parallelise population-based algorithms, we need to exploit and define their granularity for each stage; do data or functional partition; and choose the communication model. Using a library for a message-passing model, such as MPJExpress, we define appropriate methods to implement process communication. This research results pseudo-code for the two communication message-passing models, using MPJExpress. We implemented this pseudo-codes using Java Language with a dataset from the Travelling Salesman Problem (TSP). The experiments showed that multicommunication model using alltogether method gained better performance that master-slave model that using send-and receive method.
Sharpen existing tools or get a new toolbox? Contemporary cluster initiatives...Orkestra
UIIN Conference, Madrid, 27-29 May 2024
James Wilson, Orkestra and Deusto Business School
Emily Wise, Lund University
Madeline Smith, The Glasgow School of Art
Have you ever wondered how search works while visiting an e-commerce site, internal website, or searching through other types of online resources? Look no further than this informative session on the ways that taxonomies help end-users navigate the internet! Hear from taxonomists and other information professionals who have first-hand experience creating and working with taxonomies that aid in navigation, search, and discovery across a range of disciplines.
0x01 - Newton's Third Law: Static vs. Dynamic AbusersOWASP Beja
f you offer a service on the web, odds are that someone will abuse it. Be it an API, a SaaS, a PaaS, or even a static website, someone somewhere will try to figure out a way to use it to their own needs. In this talk we'll compare measures that are effective against static attackers and how to battle a dynamic attacker who adapts to your counter-measures.
About the Speaker
===============
Diogo Sousa, Engineering Manager @ Canonical
An opinionated individual with an interest in cryptography and its intersection with secure software development.
This presentation by Morris Kleiner (University of Minnesota), was made during the discussion “Competition and Regulation in Professions and Occupations” held at the Working Party No. 2 on Competition and Regulation on 10 June 2024. More papers and presentations on the topic can be found out at oe.cd/crps.
This presentation was uploaded with the author’s consent.
Acorn Recovery: Restore IT infra within minutesIP ServerOne
Introducing Acorn Recovery as a Service, a simple, fast, and secure managed disaster recovery (DRaaS) by IP ServerOne. A DR solution that helps restore your IT infra within minutes.
elephant56: Design and Implementation of a Parallel Genetic Algorithms Framework on Hadoop MapReduce
1. EvoSoft 2016 @ GECCO 2016
Denver, Colorado, USA
July 20-24, 2016
elephant56
Design and Implementation of a
Parallel Genetic Algorithms Framework
on Hadoop MapReduce
PASQUALE SALZA1, FILOMENA FERRUCCI1, FEDERICA SARRO2
1University of Salerno, 2University College London
https://github.com/pasqualesalza/elephant56
2. • An open source framework based on Hadoop
MapReduce
• Allows developers to focus on Genetic
Algorithms definition only
• Manages the interaction with Hadoop
MapReduce
elephant56
https://github.com/pasqualesalza/elephant56
4. Why parallelize?
• Genetic Algorithms are time and resource intensive
• Luckily, they are naturally parallelizable!
5. Hadoop
• Created by Doug Cutting in 2006 while he was at Yahoo!
• Nowadays, it includes YARN (Yet Another Resource Negotiator)
6. Why Hadoop?
• Large clusters of machines
• Scalability and reliability of computation
processes and storage
• Wide diffusion in industries (BigData)
7. HDFS
• Hadoop Distributed File System
• Multiple data nodes
• Files are split into blocks, spread across data nodes
• Blocks replicated multiple times for reliability
and performance
8. MapReduce
• A programming paradigm derived from
functional programming
• Based on two main functions:
• map: handles the parallelization
• reduce: collects and merges the results
9. A Typical Day of a
MapReduce Job
• Example: word count, which counts the occurrences of each word in a text
Input: Deer Bear River | Car Car River | Deer Car Bear
Splitting: "Deer Bear River" goes to mapper 1, "Car Car River" to mapper 2, "Deer Car Bear" to mapper 3
Mapping: (Deer, 1) (Bear, 1) (River, 1) | (Car, 1) (Car, 1) (River, 1) | (Deer, 1) (Car, 1) (Bear, 1)
Shuffling: Bear, (1, 1) | Car, (1, 1, 1) | Deer, (1, 1) | River, (1, 1)
Reducing: Bear, 2 | Car, 3 | Deer, 2 | River, 2
Output: Bear, 2 | Car, 3 | Deer, 2 | River, 2
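The stages above can be sketched in plain Java. This is a minimal single-process simulation of the MapReduce phases, not the Hadoop API; the class and method names are illustrative only:

```java
import java.util.*;
import java.util.stream.*;

public class WordCountSketch {
    // Mapping: emit a (word, 1) pair for each word in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.split("\\s+"))
                .map(word -> Map.entry(word, 1))
                .collect(Collectors.toList());
    }

    // Shuffling: group the emitted values by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        return pairs.stream().collect(Collectors.groupingBy(
                Map.Entry::getKey,
                Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    // Reducing: sum the values collected for each key.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> counts = new TreeMap<>();
        grouped.forEach((word, ones) ->
                counts.put(word, ones.stream().mapToInt(Integer::intValue).sum()));
        return counts;
    }

    public static void main(String[] args) {
        // Splitting: each input line would go to a different mapper.
        List<String> input = List.of("Deer Bear River", "Car Car River", "Deer Car Bear");
        List<Map.Entry<String, Integer>> mapped = input.stream()
                .flatMap(line -> map(line).stream())
                .collect(Collectors.toList());
        System.out.println(reduce(shuffle(mapped)));
        // {Bear=2, Car=3, Deer=2, River=2}
    }
}
```

In real Hadoop, the shuffle phase is performed by the framework between the map and reduce tasks; here it is just an in-memory grouping.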
12. Master/slave Model
• The master node manages the population of individuals and assigns
them to slave nodes
• The slaves evaluate only the fitness function
• The master applies other genetic operators
• It is similar to a sequential Genetic Algorithm
[Diagram: a master node coordinating four slave nodes]
13. Master/slave MapReduce
1. The Driver (master node) generates the initial population
2. During each generation, it spreads the individuals across slave nodes
3. The slaves perform the fitness evaluation
4. The Driver applies the other genetic operators
[Diagram: in each generation, the Driver spreads individuals across mappers (Mapper 1 to Mapper n) for the fitness evaluation, then applies elitism, parents selection, crossover, mutation and survival selection]
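The four steps above can be sketched in plain Java, with worker threads standing in for the slave mappers. This is a simulation of the master/slave idea under a OneMax fitness function, not the elephant56 API; all names are illustrative:

```java
import java.util.*;
import java.util.concurrent.*;

public class MasterSlaveSketch {
    // OneMax fitness of a bit-string individual: the number of true bits.
    static int fitness(boolean[] individual) {
        int count = 0;
        for (boolean bit : individual)
            if (bit)
                count++;
        return count;
    }

    // The master submits each individual to a pool of "slaves" (worker
    // threads), which evaluate the fitness in parallel; the master then
    // collects the results in order.
    static int[] evaluateInParallel(List<boolean[]> population, int slaves)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(slaves);
        List<Future<Integer>> futures = new ArrayList<>();
        for (boolean[] individual : population)
            futures.add(pool.submit(() -> fitness(individual)));
        int[] values = new int[population.size()];
        for (int i = 0; i < values.length; i++)
            values[i] = futures.get(i).get();
        pool.shutdown();
        return values;
    }

    public static void main(String[] args) throws Exception {
        List<boolean[]> population = List.of(
                new boolean[]{true, true, false},
                new boolean[]{false, false, false},
                new boolean[]{true, true, true});
        System.out.println(Arrays.toString(evaluateInParallel(population, 2)));
        // [2, 0, 3]
    }
}
```

The master keeps all other genetic operators to itself, exactly as in the model: only the (typically expensive) fitness evaluation is farmed out.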
15. Grid Model
• Applies genetic operators independently for each
neighborhood
• Improves the diversification of individuals during evolution
• Reduces the probability of premature convergence to local optima
16. Grid MapReduce
1. The neighborhoods are chosen randomly at each generation
2. The Driver splits the population into equal parts
3. The partitioner sends individuals to their assigned neighborhood
4. The reducers execute other genetic operators
[Diagram: in each generation, mappers (Mapper 1 to Mapper n) perform the fitness evaluation, a shuffle phase sends individuals to their assigned neighborhoods, and reducers (Reducer 1 to Reducer n) apply elitism, parents selection, crossover, mutation and survival selection]
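The random neighborhood assignment of steps 1 and 2 can be sketched as follows. This is an illustrative simulation, not the elephant56 partitioner: the round-robin split after shuffling and all names are assumptions:

```java
import java.util.*;

public class GridPartitionSketch {
    // Randomly partition a population into equally sized neighborhoods,
    // as happens at the beginning of each grid-model generation.
    static List<List<boolean[]>> partition(List<boolean[]> population,
                                           int neighborhoods, long seed) {
        List<boolean[]> shuffled = new ArrayList<>(population);
        Collections.shuffle(shuffled, new Random(seed)); // seeded for reproducibility
        List<List<boolean[]>> result = new ArrayList<>();
        for (int n = 0; n < neighborhoods; n++)
            result.add(new ArrayList<>());
        // Round-robin split: equal parts when the sizes divide evenly.
        for (int i = 0; i < shuffled.size(); i++)
            result.get(i % neighborhoods).add(shuffled.get(i));
        return result;
    }

    public static void main(String[] args) {
        List<boolean[]> population = new ArrayList<>();
        for (int i = 0; i < 8; i++)
            population.add(new boolean[]{i % 2 == 0});
        List<List<boolean[]>> groups = partition(population, 4, 42L);
        System.out.println(groups.size() + " neighborhoods of size " + groups.get(0).size());
        // 4 neighborhoods of size 2
    }
}
```

Reshuffling the neighborhoods every generation is what keeps individuals mixing across the grid, which is the source of the improved diversification mentioned above.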
18. Island Model
• The initial population is split into several groups (islands)
• The Genetic Algorithm is executed “independently” on each of them
• Periodically, individuals are exchanged between islands through migration
• Each island explores a different direction in the search space, while migration enhances diversity
19. Island MapReduce
1. The Driver spreads individuals across islands
2. Consecutive generations are executed on each island
3. Migration occurs periodically, performed by reducers
[Diagram: during each period of generations, mappers (Mapper 1 to Mapper n) run the full cycle of fitness evaluation, elitism, parents selection, crossover, mutation and survival selection on their islands; between periods, reducers (Reducer 1 to Reducer n) perform the migration]
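The migration step can be sketched in plain Java. The ring topology (each island sends its migrants to the next one) and the choice of which individuals migrate are assumptions for illustration, not the elephant56 migration strategy:

```java
import java.util.*;

public class IslandMigrationSketch {
    // Move the first `migrants` individuals of each island to the next
    // island in a ring, returning the new island populations.
    static List<List<String>> migrate(List<List<String>> islands, int migrants) {
        int n = islands.size();
        List<List<String>> next = new ArrayList<>();
        // Each island keeps the individuals that do not migrate.
        for (List<String> island : islands)
            next.add(new ArrayList<>(island.subList(migrants, island.size())));
        // Ring topology: island i sends its migrants to island i + 1.
        for (int i = 0; i < n; i++) {
            List<String> leaving = islands.get(i).subList(0, migrants);
            next.get((i + 1) % n).addAll(leaving);
        }
        return next;
    }

    public static void main(String[] args) {
        List<List<String>> islands = List.of(
                new ArrayList<>(List.of("a1", "a2", "a3")),
                new ArrayList<>(List.of("b1", "b2", "b3")));
        System.out.println(migrate(islands, 1));
        // [[a2, a3, b1], [b2, b3, a1]]
    }
}
```

Because every island keeps its population size constant (it sends and receives the same number of migrants), the overall population size is preserved across periods.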
23. Individual
public class BitStringIndividual extends Individual {
private List<Boolean> bits;
public BitStringIndividual(int size) {
bits = new ArrayList<>(Collections.nCopies(size, false));
}
public void set(int index, boolean value) {
bits.set(index, value);
}
public boolean get(int index) {
return bits.get(index);
}
public int size() {
return bits.size();
}
}
24. Fitness Value
public class IntegerFitnessValue extends FitnessValue {
private int number;
public IntegerFitnessValue(int value) {
number = value;
}
public int get() {
return number;
}
@Override
public int compareTo(FitnessValue other) {
if (other == null)
return 1;
Integer otherInteger = ((IntegerFitnessValue) other).get();
return Integer.compare(number, otherInteger);
}
}
25. Fitness Evaluation
public class OneMaxFitnessEvaluation
extends FitnessEvaluation<BitStringIndividual, IntegerFitnessValue> {
@Override
public IntegerFitnessValue evaluate(
IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) {
BitStringIndividual individual = wrapper.getIndividual();
int count = 0;
for (int i = 0; i < individual.size(); i++)
if (individual.get(i))
count++;
return new IntegerFitnessValue(count);
}
}
26. Initialization
public class RandomBitStringInitialization
extends Initialization<BitStringIndividual, IntegerFitnessValue> {
public RandomBitStringInitialization(..., Properties userProperties, ...) {
individualSize = userProperties.getInt(INDIVIDUAL_SIZE_PROPERTY);
random = new Random();
}
@Override
public IndividualWrapper<BitStringIndividual, IntegerFitnessValue>
generateNextIndividual(int id) {
BitStringIndividual individual = new BitStringIndividual(individualSize);
for (int i = 0; i < individualSize; i++)
individual.set(i, random.nextInt(2) == 1);
return new IndividualWrapper<>(individual);
}
}
27. Crossover
public class BitStringSinglePointCrossover
extends Crossover<BitStringIndividual, IntegerFitnessValue> {
public BitStringSinglePointCrossover(...) {
random = new Random();
}
@Override
public List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>>
cross(IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper1,
IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper2, ...) {
BitStringIndividual parent1 = wrapper1.getIndividual();
BitStringIndividual parent2 = wrapper2.getIndividual();
int cutPoint = random.nextInt(parent1.size());
BitStringIndividual child1 = new BitStringIndividual(parent1.size());
BitStringIndividual child2 = new BitStringIndividual(parent1.size());
for (int i = 0; i < cutPoint; i++) {
child1.set(i, parent1.get(i));
child2.set(i, parent2.get(i));
}
for (int i = cutPoint; i < parent1.size(); i++) {
child1.set(i, parent2.get(i));
child2.set(i, parent1.get(i));
}
List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> children = new ArrayList<>(2);
children.add(new IndividualWrapper<>(child1));
children.add(new IndividualWrapper<>(child2));
return children;
}
}
28. Mutation
public class BitStringMutation
extends Mutation<BitStringIndividual, IntegerFitnessValue> {
public BitStringMutation(...) {
mutationProbability =
userProperties.getDouble(MUTATION_PROBABILITY_PROPERTY);
random = new Random();
}
@Override
public IndividualWrapper<BitStringIndividual, IntegerFitnessValue> mutate(
IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) {
BitStringIndividual individual = wrapper.getIndividual();
for (int i = 0; i < individual.size(); i++)
if (random.nextDouble() <= mutationProbability)
individual.set(i, !individual.get(i));
return wrapper;
}
}
29. App
public class App {
public static void main(String[] args) {
GlobalDriver driver = new GlobalDriver();
Properties userProperties = new Properties();
driver.setIndividualClass(BitStringIndividual.class);
driver.setFitnessValueClass(IntegerFitnessValue.class);
driver.setInitializationClass(RandomBitStringInitialization.class);
driver.setInitializationPopulationSize(POPULATION_SIZE);
userProperties.setInt(INDIVIDUAL_SIZE_PROPERTY, INDIVIDUAL_SIZE);
driver.setElitismClass(BestIndividualsElitism.class);
driver.activateElitism(true);
userProperties.setInt(NUMBER_OF_ELITISTS_PROPERTY, NUMBER_OF_ELITISTS);
driver.setParentsSelectionClass(RouletteWheelParentsSelection.class);
driver.setCrossoverClass(BitStringSinglePointCrossover.class);
driver.setSurvivalSelectionClass(RouletteWheelSurvivalSelection.class);
driver.activateSurvivalSelection(true);
driver.setUserProperties(userProperties);
driver.run();
}
}