EvoSoft 2016 @ GECCO 2016
Denver, Colorado, USA
July 20-24, 2016
elephant56
Design and Implementation of a
Parallel Genetic Algorithms Framework
on Hadoop MapReduce
PASQUALE SALZA (1), FILOMENA FERRUCCI (1), FEDERICA SARRO (2)
(1) University of Salerno, (2) University College London
https://github.com/pasqualesalza/elephant56
• An open source framework based on Hadoop
MapReduce
• Lets developers focus only on the definition of
the Genetic Algorithm
• Manages the interaction with Hadoop
MapReduce
elephant: the mascot of Hadoop is an elephant
56: the number of chromosomes of the elephant is 56
Why parallelize?
• Genetic Algorithms are time- and
resource-intensive
• Luckily, they are naturally parallelizable!
Hadoop
• Created by Doug Cutting in 2006 while he was
at Yahoo!
• Nowadays, it includes YARN (Yet Another
Resource Negotiator) for cluster resource management
Why Hadoop?
• Large clusters of machines
• Scalability and reliability of computation
processes and storage
• Widely adopted in industry (Big Data)
HDFS
• Hadoop Distributed File System
• Multiple data nodes
• Files are split into blocks, which are spread across
data nodes
• Blocks are replicated multiple times for reliability
and performance
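The block and replica arithmetic behind these bullets can be sketched in a few lines of plain Java. This is only an illustration: the file size, the 128 MiB block size, and the replication factor of 3 are assumed example values, and the class and method names are hypothetical, not part of HDFS or elephant56.

```java
class HdfsBlocksSketch {

    // Number of blocks needed to store a file (the last block may be partially filled).
    static long blockCount(long fileBytes, long blockBytes) {
        return (fileBytes + blockBytes - 1) / blockBytes; // ceiling division
    }

    // Total number of block copies kept in the cluster.
    static long replicaCount(long blocks, int replicationFactor) {
        return blocks * replicationFactor;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long fileBytes = 1024L * mib; // assumed example: a 1 GiB file
        long blockBytes = 128L * mib; // assumed block size
        int replication = 3;          // assumed replication factor

        long blocks = blockCount(fileBytes, blockBytes);
        System.out.println(blocks + " blocks, "
                + replicaCount(blocks, replication) + " block replicas");
        // prints "8 blocks, 24 block replicas"
    }
}
```

With more replicas, losing one data node does not lose a block, and a task has a better chance of running on a node that already holds the data it reads.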
MapReduce
• A programming paradigm derived from
functional programming
• Based on two main functions:
• map: handles the parallelization
• reduce: collects and merges the results
A Typical Day of a
MapReduce Job
• Example: word count, which counts the occurrences of each word
in a text
Input: Deer Bear River / Car Car River / Deer Car Bear
Splitting: (Deer Bear River), (Car Car River), (Deer Car Bear)
Mapping: (Deer, 1) (Bear, 1) (River, 1); (Car, 1) (Car, 1) (River, 1); (Deer, 1) (Car, 1) (Bear, 1)
Shuffling: Bear, (1, 1); Car, (1, 1, 1); Deer, (1, 1); River, (1, 1)
Reducing: Bear, 2; Car, 3; Deer, 2; River, 2
Output: Bear, 2; Car, 3; Deer, 2; River, 2
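The stages of the word count job can be imitated in plain Java, without any Hadoop dependency. The sketch below only mirrors the data flow of the example (map, shuffle, reduce); it is not how a real Hadoop job is written, and the class and method names are hypothetical. It assumes Java 9 or later.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word of one input split.
    static List<Map.Entry<String, Integer>> map(String split) {
        return Arrays.stream(split.trim().split("\\s+"))
                .map(word -> Map.entry(word, 1))
                .collect(Collectors.toList());
    }

    // Shuffle phase: group the emitted values by key (keys kept sorted).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        return pairs.stream().collect(Collectors.groupingBy(
                Map.Entry::getKey, TreeMap::new,
                Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    // Reduce phase: sum the grouped values of each key.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> counts = new TreeMap<>();
        grouped.forEach((word, ones) ->
                counts.put(word, ones.stream().mapToInt(Integer::intValue).sum()));
        return counts;
    }

    // Runs the three phases in sequence over all splits.
    static Map<String, Integer> wordCount(List<String> splits) {
        List<Map.Entry<String, Integer>> pairs = splits.stream()
                .flatMap(split -> map(split).stream())
                .collect(Collectors.toList());
        return reduce(shuffle(pairs));
    }

    public static void main(String[] args) {
        List<String> splits = List.of("Deer Bear River", "Car Car River", "Deer Car Bear");
        System.out.println(wordCount(splits)); // {Bear=2, Car=3, Deer=2, River=2}
    }
}
```

In Hadoop the map calls run on different nodes and the shuffle moves data over the network; here everything runs in one process to keep the flow visible.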
How to parallelize
Genetic Algorithms?
Master/slave Model
• The master node manages the population of individuals and assigns
them to slave nodes
• The slaves evaluate only the fitness function
• The master applies other genetic operators
• Behaves like a sequential Genetic Algorithm
slave slave slave slave
master
Master/slave MapReduce
1. The Driver (master node) creates the initial population
2. During each generation, it spreads the individuals across slave nodes
3. The slaves perform the fitness evaluation
4. The Driver applies the other genetic operators
[Figure: Master/slave MapReduce. In each generation (Generation 1, Generation 2, ...), Mappers 1..n perform FITNESS EVALUATION; the Driver then applies ELITISM, PARENTS SELECTION, CROSSOVER, MUTATION, and SURVIVAL SELECTION.]
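The division of work in the master/slave model can be sketched in plain Java, with a parallel stream standing in for the slave nodes. This is a single-process illustration under stated assumptions, not the elephant56 implementation: the OneMax-style fitness, the boolean-array individuals, and all names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

class MasterSlaveSketch {

    // Work done by a slave: evaluate the fitness of one individual
    // (here, the number of true bits).
    static int fitness(boolean[] individual) {
        int count = 0;
        for (boolean bit : individual) {
            if (bit) count++;
        }
        return count;
    }

    // The master hands individuals to worker threads; results keep population order.
    static List<Integer> evaluateInParallel(List<boolean[]> population) {
        return population.parallelStream()
                .map(MasterSlaveSketch::fitness)
                .collect(Collectors.toList());
    }

    // Work kept on the master: here only picking the fittest individual,
    // as a stand-in for elitism and the other genetic operators.
    static int indexOfBest(List<Integer> fitnessValues) {
        int best = 0;
        for (int i = 1; i < fitnessValues.size(); i++) {
            if (fitnessValues.get(i) > fitnessValues.get(best)) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        List<boolean[]> population = List.of(
                new boolean[] {true, false, true},
                new boolean[] {true, true, true},
                new boolean[] {false, false, false});
        List<Integer> fitnessValues = evaluateInParallel(population);
        System.out.println(fitnessValues);              // [2, 3, 0]
        System.out.println(indexOfBest(fitnessValues)); // 1
    }
}
```

Only the fitness evaluation is parallel, so the speed-up depends on how expensive the fitness function is compared with the work left on the master.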
Grid Model
• Applies genetic operators independently for each
neighborhood
• Improves the diversification of individuals during evolution
• Reduces the probability of converging to a local optimum
Grid MapReduce
1. The neighborhoods are chosen randomly at each generation
2. The Driver splits the population into equal parts
3. The partitioner sends individuals to their assigned neighborhood
4. The reducers execute other genetic operators
[Figure: Grid MapReduce. In each generation (Generation 1, Generation 2, ...), Mappers 1..n perform FITNESS EVALUATION; after the SHUFFLE, Reducers 1..n apply ELITISM, PARENTS SELECTION, CROSSOVER, MUTATION, and SURVIVAL SELECTION; the Driver coordinates the generations.]
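The random assignment of individuals to neighborhoods can be sketched as a shuffle followed by a round-robin split. This is a plain-Java illustration with hypothetical names; in the framework this role belongs to the Hadoop partitioner, whose actual code is not shown in these slides.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class GridSketch {

    // Shuffles a copy of the population, then deals individuals out to
    // 'count' neighborhoods so that their sizes differ by at most one.
    static <T> List<List<T>> randomNeighborhoods(List<T> population, int count, Random random) {
        List<T> shuffled = new ArrayList<>(population);
        Collections.shuffle(shuffled, random);

        List<List<T>> neighborhoods = new ArrayList<>();
        for (int n = 0; n < count; n++) {
            neighborhoods.add(new ArrayList<>());
        }
        for (int i = 0; i < shuffled.size(); i++) {
            // i % count plays the role of the partitioner's target index.
            neighborhoods.get(i % count).add(shuffled.get(i));
        }
        return neighborhoods;
    }

    public static void main(String[] args) {
        List<Integer> population = List.of(0, 1, 2, 3, 4, 5, 6, 7);
        Random random = new Random();
        // A new random grouping is drawn at each generation.
        for (int generation = 1; generation <= 2; generation++) {
            System.out.println("Generation " + generation + ": "
                    + randomNeighborhoods(population, 4, random));
        }
    }
}
```

Because the grouping changes every generation, an individual meets different partners over time, which is what spreads good genetic material across the population.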
Island Model
• The initial population is split into several groups (islands)
• The Genetic Algorithm is executed “independently” on each of them
• Periodically, individuals are exchanged between islands through
migration
• Each island explores a different direction in the search space, and migration enhances
diversity
Island MapReduce
1. The Driver spreads individuals across islands
2. Consecutive generations are executed on each island
3. Migration occurs periodically and is performed by the reducers
[Figure: Island MapReduce. In each generations period (Generations Period 1, Generations Period 2, ...), Mappers 1..n run FITNESS EVALUATION, ELITISM, PARENTS SELECTION, CROSSOVER, MUTATION, and SURVIVAL SELECTION; Reducers 1..n then perform MIGRATION; the Driver coordinates the periods.]
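The migration step can be sketched as follows. The slides do not say which individuals migrate or where they go, so this sketch assumes a ring topology in which each island sends a copy of its best individual to the next island, replacing that island's worst one. Individuals are represented only by their fitness values, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class IslandSketch {

    // Ring migration (assumed policy): the best of island i replaces
    // the worst of island (i + 1) % n.
    static void migrate(List<List<Integer>> islands) {
        int n = islands.size();

        // Pick all migrants first, so every island sends its pre-migration best.
        List<Integer> migrants = new ArrayList<>();
        for (List<Integer> island : islands) {
            migrants.add(Collections.max(island));
        }

        for (int i = 0; i < n; i++) {
            List<Integer> target = islands.get((i + 1) % n);
            Integer worst = Collections.min(target);
            target.remove(worst); // remove(Object): drops one worst individual
            target.add(migrants.get(i));
        }
    }

    public static void main(String[] args) {
        List<List<Integer>> islands = new ArrayList<>();
        islands.add(new ArrayList<>(List.of(1, 5)));
        islands.add(new ArrayList<>(List.of(2, 6)));
        islands.add(new ArrayList<>(List.of(3, 9)));

        // In a full run, several generations would evolve on each island here.
        migrate(islands);
        System.out.println(islands); // [[5, 9], [6, 5], [9, 6]]
    }
}
```

Migrating rarely keeps the islands diverse; migrating often makes them behave more like a single population.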
Example of Use
Install elephant56
<repositories>
    <repository>
        <id>jcenter</id>
        <name>bintray</name>
        <url>http://jcenter.bintray.com</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>it.unisa</groupId>
        <artifactId>elephant56</artifactId>
        <version>1.0.0</version>
    </dependency>
</dependencies>
You can install elephant56 by simply adding it as
a dependency to your Maven (or Gradle) project!
The OneMax Problem
Individual
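import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BitStringIndividual extends Individual {
    private List<Boolean> bits;

    public BitStringIndividual(int size) {
        // Pre-fill with false: new ArrayList<>(size) only reserves capacity,
        // so set(index, value) would throw IndexOutOfBoundsException.
        bits = new ArrayList<>(Collections.nCopies(size, false));
    }

    public void set(int index, boolean value) {
        bits.set(index, value);
    }

    public boolean get(int index) {
        return bits.get(index);
    }

    public int size() {
        return bits.size();
    }
}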
Fitness Value
public class IntegerFitnessValue extends FitnessValue {
    private int number;

    public IntegerFitnessValue(int value) {
        number = value;
    }

    public int get() {
        return number;
    }

    @Override
    public int compareTo(FitnessValue other) {
        // Any value is considered greater than a missing one.
        if (other == null)
            return 1;
        int otherValue = ((IntegerFitnessValue) other).get();
        return Integer.compare(number, otherValue);
    }
}
Fitness Evaluation
public class OneMaxFitnessEvaluation
        extends FitnessEvaluation<BitStringIndividual, IntegerFitnessValue> {
    @Override
    public IntegerFitnessValue evaluate(
            IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) {
        BitStringIndividual individual = wrapper.getIndividual();
        // OneMax: the fitness is the number of bits set to true.
        int count = 0;
        for (int i = 0; i < individual.size(); i++)
            if (individual.get(i))
                count++;
        return new IntegerFitnessValue(count);
    }
}
Initialization
public class RandomBitStringInitialization
        extends Initialization<BitStringIndividual, IntegerFitnessValue> {
    private int individualSize;
    private Random random;

    public RandomBitStringInitialization(..., Properties userProperties, ...) {
        individualSize = userProperties.getInt(INDIVIDUAL_SIZE_PROPERTY);
        random = new Random();
    }

    @Override
    public IndividualWrapper<BitStringIndividual, IntegerFitnessValue>
            generateNextIndividual(int id) {
        BitStringIndividual individual = new BitStringIndividual(individualSize);
        // Each bit is drawn uniformly at random.
        for (int i = 0; i < individualSize; i++)
            individual.set(i, random.nextBoolean());
        return new IndividualWrapper<>(individual);
    }
}
Crossover
public class BitStringSinglePointCrossover
        extends Crossover<BitStringIndividual, IntegerFitnessValue> {
    private Random random;

    public BitStringSinglePointCrossover(...) {
        random = new Random();
    }

    @Override
    public List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>>
            cross(IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper1, wrapper2, ...) {
        BitStringIndividual parent1 = wrapper1.getIndividual();
        BitStringIndividual parent2 = wrapper2.getIndividual();
        int cutPoint = random.nextInt(parent1.size());
        BitStringIndividual child1 = new BitStringIndividual(parent1.size());
        BitStringIndividual child2 = new BitStringIndividual(parent1.size());
        // Before the cut point, each child copies its own parent.
        for (int i = 0; i < cutPoint; i++) {
            child1.set(i, parent1.get(i));
            child2.set(i, parent2.get(i));
        }
        // From the cut point on, the children swap parents.
        for (int i = cutPoint; i < parent1.size(); i++) {
            child1.set(i, parent2.get(i));
            child2.set(i, parent1.get(i));
        }
        List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> children = new ArrayList<>(2);
        children.add(new IndividualWrapper<>(child1));
        children.add(new IndividualWrapper<>(child2));
        return children;
    }
}
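To see what single-point crossover does to concrete bit strings, here is a standalone sketch on plain boolean arrays with a fixed cut point. It does not use the elephant56 classes; the class and method names are hypothetical and the cut point is passed in rather than drawn at random so that the result is reproducible.

```java
import java.util.Arrays;

class SinglePointCrossoverSketch {

    // Returns two children: each copies one parent up to cutPoint (exclusive)
    // and the other parent from cutPoint on.
    static boolean[][] cross(boolean[] parent1, boolean[] parent2, int cutPoint) {
        int size = parent1.length;
        boolean[] child1 = new boolean[size];
        boolean[] child2 = new boolean[size];
        for (int i = 0; i < size; i++) {
            boolean head = i < cutPoint;
            child1[i] = head ? parent1[i] : parent2[i];
            child2[i] = head ? parent2[i] : parent1[i];
        }
        return new boolean[][] {child1, child2};
    }

    public static void main(String[] args) {
        boolean[] parent1 = {true, true, true, true};
        boolean[] parent2 = {false, false, false, false};
        boolean[][] children = cross(parent1, parent2, 1);
        System.out.println(Arrays.toString(children[0])); // [true, false, false, false]
        System.out.println(Arrays.toString(children[1])); // [false, true, true, true]
    }
}
```

A cut point of 0 simply swaps the two parents, so the children are copies of them.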
Mutation
public class BitStringMutation
        extends Mutation<BitStringIndividual, IntegerFitnessValue> {
    private double mutationProbability;
    private Random random;

    public BitStringMutation(...) {
        mutationProbability =
                userProperties.getDouble(MUTATION_PROBABILITY_PROPERTY);
        random = new Random();
    }

    @Override
    public IndividualWrapper<BitStringIndividual, IntegerFitnessValue> mutate(
            IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) {
        BitStringIndividual individual = wrapper.getIndividual();
        // Flip each bit independently; "<" ensures a probability of 0 never mutates.
        for (int i = 0; i < individual.size(); i++)
            if (random.nextDouble() < mutationProbability)
                individual.set(i, !individual.get(i));
        return wrapper;
    }
}
App
public class App {
    public static void main(String[] args) {
        GlobalDriver driver = new GlobalDriver();

        // Individual and fitness value types
        driver.setIndividualClass(BitStringIndividual.class);
        driver.setFitnessValueClass(IntegerFitnessValue.class);

        // Initial population
        driver.setInitializationClass(RandomBitStringInitialization.class);
        driver.setInitializationPopulationSize(POPULATION_SIZE);
        userProperties.setInt(INDIVIDUAL_SIZE_PROPERTY, INDIVIDUAL_SIZE);

        // Genetic operators
        driver.setElitismClass(BestIndividualsElitism.class);
        driver.activateElitism(true);
        userProperties.setInt(NUMBER_OF_ELITISTS_PROPERTY, NUMBER_OF_ELITISTS);
        driver.setParentsSelectionClass(RouletteWheelParentsSelection.class);
        driver.setCrossoverClass(BitStringSinglePointCrossover.class);
        driver.setSurvivalSelectionClass(RouletteWheelSurvivalSelection.class);
        driver.activateSurvivalSelection(true);

        driver.setUserProperties(userProperties);
        driver.run();
    }
}
driver.run()
It works!
Thank you!
PLEASE, COLLABORATE!
https://github.com/pasqualesalza/elephant56
pasquale.salza@gmail.com

More Related Content

What's hot

Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
MLconf
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
MLconf
 
Pytorch meetup
Pytorch meetupPytorch meetup
Pytorch meetup
Dmitri Azarnyh
 
Generalized Linear Models with H2O
Generalized Linear Models with H2O Generalized Linear Models with H2O
Generalized Linear Models with H2O
Sri Ambati
 
Distributed Deep Learning + others for Spark Meetup
Distributed Deep Learning + others for Spark MeetupDistributed Deep Learning + others for Spark Meetup
Distributed Deep Learning + others for Spark Meetup
Vijay Srinivas Agneeswaran, Ph.D
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
Databricks
 
Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011
Milind Bhandarkar
 
Intro to Machine Learning for GPUs
Intro to Machine Learning for GPUsIntro to Machine Learning for GPUs
Intro to Machine Learning for GPUs
Sri Ambati
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
Albert Bifet
 
Webinar: Deep Learning with H2O
Webinar: Deep Learning with H2OWebinar: Deep Learning with H2O
Webinar: Deep Learning with H2O
Sri Ambati
 
Smart Scalable Feature Reduction with Random Forests with Erik Erlandson
Smart Scalable Feature Reduction with Random Forests with Erik ErlandsonSmart Scalable Feature Reduction with Random Forests with Erik Erlandson
Smart Scalable Feature Reduction with Random Forests with Erik Erlandson
Databricks
 
Shark SQL and Rich Analytics at Scale
Shark SQL and Rich Analytics at ScaleShark SQL and Rich Analytics at Scale
Shark SQL and Rich Analytics at ScaleDataWorks Summit
 
TensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache SparkTensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache Spark
Databricks
 
Sparkling Random Ferns by P Dendek and M Fedoryszak
Sparkling Random Ferns by  P Dendek and M FedoryszakSparkling Random Ferns by  P Dendek and M Fedoryszak
Sparkling Random Ferns by P Dendek and M Fedoryszak
Spark Summit
 
BDAS Shark study report 03 v1.1
BDAS Shark study report  03 v1.1BDAS Shark study report  03 v1.1
BDAS Shark study report 03 v1.1
Stefanie Zhao
 
Real-Time Hardware Simulation with Portable Hardware-in-the-Loop (PHIL-Rebooted)
Real-Time Hardware Simulation with Portable Hardware-in-the-Loop (PHIL-Rebooted)Real-Time Hardware Simulation with Portable Hardware-in-the-Loop (PHIL-Rebooted)
Real-Time Hardware Simulation with Portable Hardware-in-the-Loop (PHIL-Rebooted)
Riley Waite
 
Scaling out logistic regression with Spark
Scaling out logistic regression with SparkScaling out logistic regression with Spark
Scaling out logistic regression with Spark
Barak Gitsis
 
Knitting boar - Toronto and Boston HUGs - Nov 2012
Knitting boar - Toronto and Boston HUGs - Nov 2012Knitting boar - Toronto and Boston HUGs - Nov 2012
Knitting boar - Toronto and Boston HUGs - Nov 2012
Josh Patterson
 
Next generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labNext generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph lab
Impetus Technologies
 

What's hot (19)

Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
 
Pytorch meetup
Pytorch meetupPytorch meetup
Pytorch meetup
 
Generalized Linear Models with H2O
Generalized Linear Models with H2O Generalized Linear Models with H2O
Generalized Linear Models with H2O
 
Distributed Deep Learning + others for Spark Meetup
Distributed Deep Learning + others for Spark MeetupDistributed Deep Learning + others for Spark Meetup
Distributed Deep Learning + others for Spark Meetup
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
 
Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011Modeling with Hadoop kdd2011
Modeling with Hadoop kdd2011
 
Intro to Machine Learning for GPUs
Intro to Machine Learning for GPUsIntro to Machine Learning for GPUs
Intro to Machine Learning for GPUs
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
 
Webinar: Deep Learning with H2O
Webinar: Deep Learning with H2OWebinar: Deep Learning with H2O
Webinar: Deep Learning with H2O
 
Smart Scalable Feature Reduction with Random Forests with Erik Erlandson
Smart Scalable Feature Reduction with Random Forests with Erik ErlandsonSmart Scalable Feature Reduction with Random Forests with Erik Erlandson
Smart Scalable Feature Reduction with Random Forests with Erik Erlandson
 
Shark SQL and Rich Analytics at Scale
Shark SQL and Rich Analytics at ScaleShark SQL and Rich Analytics at Scale
Shark SQL and Rich Analytics at Scale
 
TensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache SparkTensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache Spark
 
Sparkling Random Ferns by P Dendek and M Fedoryszak
Sparkling Random Ferns by  P Dendek and M FedoryszakSparkling Random Ferns by  P Dendek and M Fedoryszak
Sparkling Random Ferns by P Dendek and M Fedoryszak
 
BDAS Shark study report 03 v1.1
BDAS Shark study report  03 v1.1BDAS Shark study report  03 v1.1
BDAS Shark study report 03 v1.1
 
Real-Time Hardware Simulation with Portable Hardware-in-the-Loop (PHIL-Rebooted)
Real-Time Hardware Simulation with Portable Hardware-in-the-Loop (PHIL-Rebooted)Real-Time Hardware Simulation with Portable Hardware-in-the-Loop (PHIL-Rebooted)
Real-Time Hardware Simulation with Portable Hardware-in-the-Loop (PHIL-Rebooted)
 
Scaling out logistic regression with Spark
Scaling out logistic regression with SparkScaling out logistic regression with Spark
Scaling out logistic regression with Spark
 
Knitting boar - Toronto and Boston HUGs - Nov 2012
Knitting boar - Toronto and Boston HUGs - Nov 2012Knitting boar - Toronto and Boston HUGs - Nov 2012
Knitting boar - Toronto and Boston HUGs - Nov 2012
 
Next generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labNext generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph lab
 

Similar to elephant56: Design and Implementation of a Parallel Genetic Algorithms Framework on Hadoop MapReduce

Code transformation With Spoon
Code transformation With SpoonCode transformation With Spoon
Code transformation With Spoon
Gérard Paligot
 
Evolutionary Optimization Algorithms & Large-Scale Machine Learning
Evolutionary Optimization Algorithms & Large-Scale Machine LearningEvolutionary Optimization Algorithms & Large-Scale Machine Learning
Evolutionary Optimization Algorithms & Large-Scale Machine Learning
University of Maribor
 
L04 Software Design 2
L04 Software Design 2L04 Software Design 2
L04 Software Design 2
Ólafur Andri Ragnarsson
 
seminar HBMO
seminar HBMOseminar HBMO
seminar HBMOavaninith
 
2011.10.14 Apache Giraph - Hortonworks
2011.10.14 Apache Giraph - Hortonworks2011.10.14 Apache Giraph - Hortonworks
2011.10.14 Apache Giraph - Hortonworks
Avery Ching
 
ga-2.ppt
ga-2.pptga-2.ppt
ga-2.ppt
sayedmha
 
Artificial Inteligence for Games an Overview SBGAMES 2012
Artificial Inteligence for Games an Overview SBGAMES 2012Artificial Inteligence for Games an Overview SBGAMES 2012
Artificial Inteligence for Games an Overview SBGAMES 2012
Bruno Duarte Corrêa
 
Knowledge of Javascript
Knowledge of JavascriptKnowledge of Javascript
Knowledge of Javascript
Samuel Abraham
 
Separating Hype from Reality in Deep Learning with Sameer Farooqui
 Separating Hype from Reality in Deep Learning with Sameer Farooqui Separating Hype from Reality in Deep Learning with Sameer Farooqui
Separating Hype from Reality in Deep Learning with Sameer Farooqui
Databricks
 
Object Oriented Programming
Object Oriented ProgrammingObject Oriented Programming
Object Oriented Programming
Manish Pandit
 
Azure Digital Twins
Azure Digital TwinsAzure Digital Twins
Azure Digital Twins
Marco Parenzan
 
Behavioral Design Patterns
Behavioral Design PatternsBehavioral Design Patterns
Behavioral Design Patterns
Lidan Hifi
 
Sustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshopSustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshop
Yannick Wurm
 
Introduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-SeqIntroduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-Seq
Enis Afgan
 
Write code that writes code!
Write code that writes code!Write code that writes code!
Write code that writes code!
Jason Feinstein
 
Write code that writes code! A beginner's guide to Annotation Processing - Ja...
Write code that writes code! A beginner's guide to Annotation Processing - Ja...Write code that writes code! A beginner's guide to Annotation Processing - Ja...
Write code that writes code! A beginner's guide to Annotation Processing - Ja...
DroidConTLV
 
Introduction to Genetic Algorithms 2014
Introduction to Genetic Algorithms 2014Introduction to Genetic Algorithms 2014
Introduction to Genetic Algorithms 2014
Aleksander Stensby
 
Clonal Selection Algorithm Parallelization with MPJExpress
Clonal Selection Algorithm Parallelization with MPJExpressClonal Selection Algorithm Parallelization with MPJExpress
Clonal Selection Algorithm Parallelization with MPJExpress
Ayi Purbasari
 
Ihdels presentation
Ihdels presentationIhdels presentation
Ihdels presentation
Daniel Molina Cabrera
 

Similar to elephant56: Design and Implementation of a Parallel Genetic Algorithms Framework on Hadoop MapReduce (20)

Code transformation With Spoon
Code transformation With SpoonCode transformation With Spoon
Code transformation With Spoon
 
Java
JavaJava
Java
 
Evolutionary Optimization Algorithms & Large-Scale Machine Learning
Evolutionary Optimization Algorithms & Large-Scale Machine LearningEvolutionary Optimization Algorithms & Large-Scale Machine Learning
Evolutionary Optimization Algorithms & Large-Scale Machine Learning
 
L04 Software Design 2
L04 Software Design 2L04 Software Design 2
L04 Software Design 2
 
seminar HBMO
seminar HBMOseminar HBMO
seminar HBMO
 
2011.10.14 Apache Giraph - Hortonworks
2011.10.14 Apache Giraph - Hortonworks2011.10.14 Apache Giraph - Hortonworks
2011.10.14 Apache Giraph - Hortonworks
 
ga-2.ppt
ga-2.pptga-2.ppt
ga-2.ppt
 
Artificial Inteligence for Games an Overview SBGAMES 2012
Artificial Inteligence for Games an Overview SBGAMES 2012Artificial Inteligence for Games an Overview SBGAMES 2012
Artificial Inteligence for Games an Overview SBGAMES 2012
 
Knowledge of Javascript
Knowledge of JavascriptKnowledge of Javascript
Knowledge of Javascript
 
Separating Hype from Reality in Deep Learning with Sameer Farooqui
 Separating Hype from Reality in Deep Learning with Sameer Farooqui Separating Hype from Reality in Deep Learning with Sameer Farooqui
Separating Hype from Reality in Deep Learning with Sameer Farooqui
 
Object Oriented Programming
Object Oriented ProgrammingObject Oriented Programming
Object Oriented Programming
 
Azure Digital Twins
Azure Digital TwinsAzure Digital Twins
Azure Digital Twins
 
Behavioral Design Patterns
Behavioral Design PatternsBehavioral Design Patterns
Behavioral Design Patterns
 
Sustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshopSustainable software institute Collaboration workshop
Sustainable software institute Collaboration workshop
 
Introduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-SeqIntroduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-Seq
 
Write code that writes code!
Write code that writes code!Write code that writes code!
Write code that writes code!
 
Write code that writes code! A beginner's guide to Annotation Processing - Ja...
Write code that writes code! A beginner's guide to Annotation Processing - Ja...Write code that writes code! A beginner's guide to Annotation Processing - Ja...
Write code that writes code! A beginner's guide to Annotation Processing - Ja...
 
Introduction to Genetic Algorithms 2014
Introduction to Genetic Algorithms 2014Introduction to Genetic Algorithms 2014
Introduction to Genetic Algorithms 2014
 
Clonal Selection Algorithm Parallelization with MPJExpress
Clonal Selection Algorithm Parallelization with MPJExpressClonal Selection Algorithm Parallelization with MPJExpress
Clonal Selection Algorithm Parallelization with MPJExpress
 
Ihdels presentation
Ihdels presentationIhdels presentation
Ihdels presentation
 

Recently uploaded

Sharpen existing tools or get a new toolbox? Contemporary cluster initiatives...
Sharpen existing tools or get a new toolbox? Contemporary cluster initiatives...Sharpen existing tools or get a new toolbox? Contemporary cluster initiatives...
Sharpen existing tools or get a new toolbox? Contemporary cluster initiatives...
Orkestra
 
Eureka, I found it! - Special Libraries Association 2021 Presentation
Eureka, I found it! - Special Libraries Association 2021 PresentationEureka, I found it! - Special Libraries Association 2021 Presentation
Eureka, I found it! - Special Libraries Association 2021 Presentation
Access Innovations, Inc.
 
Obesity causes and management and associated medical conditions
Obesity causes and management and associated medical conditionsObesity causes and management and associated medical conditions
Obesity causes and management and associated medical conditions
Faculty of Medicine And Health Sciences
 
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdfBonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
khadija278284
 
International Workshop on Artificial Intelligence in Software Testing
International Workshop on Artificial Intelligence in Software TestingInternational Workshop on Artificial Intelligence in Software Testing
International Workshop on Artificial Intelligence in Software Testing
Sebastiano Panichella
 
0x01 - Newton's Third Law: Static vs. Dynamic Abusers
0x01 - Newton's Third Law:  Static vs. Dynamic Abusers0x01 - Newton's Third Law:  Static vs. Dynamic Abusers
0x01 - Newton's Third Law: Static vs. Dynamic Abusers
OWASP Beja
 
Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Doctoral Symposium at the 17th IEEE International Conference on Software Test...Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Sebastiano Panichella
 
Getting started with Amazon Bedrock Studio and Control Tower
Getting started with Amazon Bedrock Studio and Control TowerGetting started with Amazon Bedrock Studio and Control Tower
Getting started with Amazon Bedrock Studio and Control Tower
Vladimir Samoylov
 
Competition and Regulation in Professional Services – KLEINER – June 2024 OEC...
Competition and Regulation in Professional Services – KLEINER – June 2024 OEC...Competition and Regulation in Professional Services – KLEINER – June 2024 OEC...
Competition and Regulation in Professional Services – KLEINER – June 2024 OEC...
OECD Directorate for Financial and Enterprise Affairs
 
Bitcoin Lightning wallet and tic-tac-toe game XOXO
Bitcoin Lightning wallet and tic-tac-toe game XOXOBitcoin Lightning wallet and tic-tac-toe game XOXO
Bitcoin Lightning wallet and tic-tac-toe game XOXO
Matjaž Lipuš
 
somanykidsbutsofewfathers-140705000023-phpapp02.pptx
somanykidsbutsofewfathers-140705000023-phpapp02.pptxsomanykidsbutsofewfathers-140705000023-phpapp02.pptx
somanykidsbutsofewfathers-140705000023-phpapp02.pptx
Howard Spence
 
Announcement of 18th IEEE International Conference on Software Testing, Verif...
Announcement of 18th IEEE International Conference on Software Testing, Verif...Announcement of 18th IEEE International Conference on Software Testing, Verif...
Announcement of 18th IEEE International Conference on Software Testing, Verif...
Sebastiano Panichella
 
Acorn Recovery: Restore IT infra within minutes
Acorn Recovery: Restore IT infra within minutesAcorn Recovery: Restore IT infra within minutes
Acorn Recovery: Restore IT infra within minutes
IP ServerOne
 

Recently uploaded (13)

Sharpen existing tools or get a new toolbox? Contemporary cluster initiatives...
Sharpen existing tools or get a new toolbox? Contemporary cluster initiatives...Sharpen existing tools or get a new toolbox? Contemporary cluster initiatives...
Sharpen existing tools or get a new toolbox? Contemporary cluster initiatives...
 
Eureka, I found it! - Special Libraries Association 2021 Presentation
Eureka, I found it! - Special Libraries Association 2021 PresentationEureka, I found it! - Special Libraries Association 2021 Presentation
Eureka, I found it! - Special Libraries Association 2021 Presentation
 
Obesity causes and management and associated medical conditions
Obesity causes and management and associated medical conditionsObesity causes and management and associated medical conditions
Obesity causes and management and associated medical conditions
 
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdfBonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
 
International Workshop on Artificial Intelligence in Software Testing
International Workshop on Artificial Intelligence in Software TestingInternational Workshop on Artificial Intelligence in Software Testing
International Workshop on Artificial Intelligence in Software Testing
 
0x01 - Newton's Third Law: Static vs. Dynamic Abusers
0x01 - Newton's Third Law:  Static vs. Dynamic Abusers0x01 - Newton's Third Law:  Static vs. Dynamic Abusers
0x01 - Newton's Third Law: Static vs. Dynamic Abusers
 
Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Doctoral Symposium at the 17th IEEE International Conference on Software Test...Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Doctoral Symposium at the 17th IEEE International Conference on Software Test...
 
Getting started with Amazon Bedrock Studio and Control Tower
Getting started with Amazon Bedrock Studio and Control TowerGetting started with Amazon Bedrock Studio and Control Tower
Getting started with Amazon Bedrock Studio and Control Tower
 
Competition and Regulation in Professional Services – KLEINER – June 2024 OEC...
Competition and Regulation in Professional Services – KLEINER – June 2024 OEC...Competition and Regulation in Professional Services – KLEINER – June 2024 OEC...
Competition and Regulation in Professional Services – KLEINER – June 2024 OEC...
 
Bitcoin Lightning wallet and tic-tac-toe game XOXO
Bitcoin Lightning wallet and tic-tac-toe game XOXOBitcoin Lightning wallet and tic-tac-toe game XOXO
Bitcoin Lightning wallet and tic-tac-toe game XOXO
 
somanykidsbutsofewfathers-140705000023-phpapp02.pptx
somanykidsbutsofewfathers-140705000023-phpapp02.pptxsomanykidsbutsofewfathers-140705000023-phpapp02.pptx
somanykidsbutsofewfathers-140705000023-phpapp02.pptx
 
Announcement of 18th IEEE International Conference on Software Testing, Verif...
Announcement of 18th IEEE International Conference on Software Testing, Verif...Announcement of 18th IEEE International Conference on Software Testing, Verif...
Announcement of 18th IEEE International Conference on Software Testing, Verif...
 
Acorn Recovery: Restore IT infra within minutes
Acorn Recovery: Restore IT infra within minutesAcorn Recovery: Restore IT infra within minutes
Acorn Recovery: Restore IT infra within minutes
 

elephant56: Design and Implementation of a Parallel Genetic Algorithms Framework on Hadoop MapReduce

  • 1. EvoSoft 2016 @ GECCO 2016 Denver, Colorado, USA July 20-24, 2016 elephant56 Design and Implementation of a Parallel Genetic Algorithms Framework on Hadoop MapReduce PASQUALE SALZA1, FILOMENA FERRUCCI1, FEDERICA SARRO2 1University of Salerno, 2University College London https://github.com/pasqualesalza/elephant56
  • 2. • An open source framework based on Hadoop MapReduce • Allows developers to focus on Genetic Algorithms definition only • Manages the interaction with Hadoop MapReduce 56elephant https://github.com/pasqualesalza/elephant56
  • 3. elephant the number of chromosomes of the elephant is 56 the mascot of Hadoop is an elephant 56
  • 4. Why parallelize? • Genetic Algorithms are intensive time and resource consuming • Luckily they are naturally parallelizable!
  • 5. Hadoop • Created by Doug Cutting in 2006 when he was in Yahoo! • Nowadays, it is part of YARN (Yet Another Resource Negotiator)
  • 6. Why Hadoop? • Large clusters of machines • Scalability and reliability of computation processes and storage • Wide diffusion in industries (BigData)
  • 7. HDFS • Hadoop Distributed File System • Multiple data nodes • Files are split up in blocks, spread across data nodes • Blocks replicated multiple times for reliability and performance
  • 8. MapReduce • A programming paradigm derived from functional programming • Based on two main functions: • map: handles the parallelization • reduce: collects and merge the results
  • 9. A Typical Day of a MapReduce Job • Implements word count, that counts word occurrences in a text Shuffling Reducing OutputMappingSplittingInput Dear Bear River Car Car River Deer Car Bear Deer Bear River Car Car River Deer Car Bear Deer, 1 Bear, 1 River, 1 Car, 1 Car, 1 River, 1 Deer, 1 Car, 1 Bear, 1 Bear, (1, 1) Car, (1, 1, 1) Deer, (1, 1) River, (1, 1) Bear, 2 Car, 3 Deer, 2 River, 2 Bear, 2 Car, 3 Deer, 2 River, 2
  • 12. Master/slave Model • The master node manages the population of individuals and assigns them to slave nodes • The slaves evaluate only the fitness function • The master applies the other genetic operators • Behaves like a sequential Genetic Algorithm [Figure: a master node connected to four slave nodes]
  • 13. Master/slave MapReduce 1. The Driver (master node) creates the initial population 2. During each generation, it spreads the individuals across slave nodes 3. The slaves perform the fitness evaluation 4. The Driver applies the other genetic operators [Figure: in each generation, Mappers 1..n perform FITNESS EVALUATION and the Driver applies ELITISM, PARENTS SELECTION, CROSSOVER, MUTATION and SURVIVAL SELECTION]
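The division of work in slide 13 can be sketched on a single machine, with a thread pool standing in for the slave nodes. This is not how elephant56 itself is implemented (it relies on Hadoop mappers); the class `MasterSlaveSketch`, its method names, and the use of OneMax as the fitness function are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Thread-pool imitation of the master/slave split; worker threads play the role of the mappers.
final class MasterSlaveSketch {

    // OneMax fitness: the number of true bits in the individual.
    static int fitness(boolean[] individual) {
        int count = 0;
        for (boolean bit : individual) {
            if (bit) {
                count++;
            }
        }
        return count;
    }

    // Slave side: evaluate every individual on the pool; results keep the population order.
    static int[] evaluateInParallel(List<boolean[]> population, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (boolean[] individual : population) {
                futures.add(pool.submit(() -> fitness(individual)));
            }
            int[] values = new int[population.size()];
            for (int i = 0; i < values.length; i++) {
                values[i] = futures.get(i).get();
            }
            return values;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("fitness evaluation failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Master side: the remaining operators run sequentially; here only the best index is picked.
    static int indexOfBest(int[] fitnessValues) {
        int best = 0;
        for (int i = 1; i < fitnessValues.length; i++) {
            if (fitnessValues[i] > fitnessValues[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<boolean[]> population = Arrays.asList(
                new boolean[] {true, false, true},
                new boolean[] {true, true, true},
                new boolean[] {false, false, false});
        int[] values = evaluateInParallel(population, 2);
        // Prints [2, 3, 0] best=1
        System.out.println(Arrays.toString(values) + " best=" + indexOfBest(values));
    }
}
```

Only the fitness evaluation is parallel, which is why this model pays off mainly when evaluating an individual is expensive compared with distributing it.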
  • 15. Grid Model • Applies genetic operators independently within each neighborhood • Improves the diversification of individuals during evolution • Reduces the probability of converging to a local optimum
  • 16. Grid MapReduce 1. The neighborhoods are chosen randomly at each generation 2. The Driver splits the population into equal parts 3. The partitioner sends individuals to their assigned neighborhood 4. The reducers execute the other genetic operators [Figure: in each generation, Mappers 1..n perform FITNESS EVALUATION, a SHUFFLE step follows, and Reducers 1..n apply ELITISM, PARENTS SELECTION, CROSSOVER, MUTATION and SURVIVAL SELECTION; the Driver coordinates the generations]
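Steps 1 to 3 of slide 16 amount to a random partition of the population that is redrawn at every generation. The sketch below shows one way to compute such a partition; the shuffle-then-round-robin policy and the names `GridPartitionSketch` and `assignNeighborhoods` are assumptions for illustration and are not taken from the elephant56 sources.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Random assignment of individuals to neighborhoods, to be redone at every generation.
final class GridPartitionSketch {

    // Returns, for each individual index, the neighborhood (reducer) it is sent to.
    static int[] assignNeighborhoods(int populationSize, int neighborhoods, Random random) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < populationSize; i++) {
            order.add(i);
        }
        // A fresh random order makes the neighborhoods differ between generations.
        Collections.shuffle(order, random);

        int[] assignment = new int[populationSize];
        for (int position = 0; position < populationSize; position++) {
            // Round-robin over the shuffled order: part sizes differ by at most one.
            assignment[order.get(position)] = position % neighborhoods;
        }
        return assignment;
    }

    public static void main(String[] args) {
        int[] assignment = assignNeighborhoods(12, 3, new Random());
        int[] sizes = new int[3];
        for (int neighborhood : assignment) {
            sizes[neighborhood]++;
        }
        // With 12 individuals and 3 neighborhoods, every neighborhood receives 4 individuals.
        System.out.println(sizes[0] + " " + sizes[1] + " " + sizes[2]);
    }
}
```

In a MapReduce setting the returned value would play the role of the partition key, so that the shuffle delivers each individual to the reducer in charge of its neighborhood.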
  • 18. Island Model • The initial population is split into several groups (islands) • The Genetic Algorithm is executed “independently” on each of them • Periodically, individuals are exchanged between islands through migration • Each island explores a different direction in the search space, and migration enhances diversity
  • 19. Island MapReduce 1. The Driver spreads individuals across islands 2. Consecutive generations are executed on the islands 3. Migration occurs periodically and is carried out by the reducers [Figure: Generations Period 1, Generations Period 2, ...; Mappers 1..n (FITNESS EVALUATION, ELITISM, PARENTS SELECTION, CROSSOVER, MUTATION, SURVIVAL SELECTION), Reducers 1..n (MIGRATION), Driver]
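The migration step of slide 19 can be illustrated with a small in-memory sketch. The slides do not say which individuals migrate or where they go, so the policy below (a ring of islands, the best individual of each island replacing the worst one of the next island) is an assumption, as are the names `IslandMigrationSketch` and `migrate` and the OneMax fitness.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Ring migration between islands: best of island i replaces worst of island i + 1.
final class IslandMigrationSketch {

    // OneMax fitness: the number of true bits in the individual.
    static int fitness(boolean[] individual) {
        int count = 0;
        for (boolean bit : individual) {
            if (bit) {
                count++;
            }
        }
        return count;
    }

    // Index of the fittest (best = true) or least fit (best = false) individual; first one wins ties.
    static int indexOfExtreme(List<boolean[]> island, boolean best) {
        int chosen = 0;
        for (int i = 1; i < island.size(); i++) {
            int difference = fitness(island.get(i)) - fitness(island.get(chosen));
            if (best ? difference > 0 : difference < 0) {
                chosen = i;
            }
        }
        return chosen;
    }

    // Performs one migration over all islands, modifying them in place.
    static void migrate(List<List<boolean[]>> islands) {
        // Select every emigrant first, so a new arrival is not forwarded in the same period.
        List<boolean[]> emigrants = new ArrayList<>();
        for (List<boolean[]> island : islands) {
            emigrants.add(island.get(indexOfExtreme(island, true)).clone());
        }
        for (int i = 0; i < islands.size(); i++) {
            List<boolean[]> destination = islands.get((i + 1) % islands.size());
            destination.set(indexOfExtreme(destination, false), emigrants.get(i));
        }
    }

    public static void main(String[] args) {
        List<boolean[]> first = new ArrayList<>(Arrays.asList(
                new boolean[] {true, true, true}, new boolean[] {false, false, false}));
        List<boolean[]> second = new ArrayList<>(Arrays.asList(
                new boolean[] {true, false, false}, new boolean[] {false, false, false}));
        migrate(Arrays.asList(first, second));
        // Prints 1 3: each island's worst individual was replaced by the other island's best.
        System.out.println(fitness(first.get(1)) + " " + fitness(second.get(1)));
    }
}
```

Copies of the emigrants are inserted, so the source island keeps its best individual; other policies (random emigrants, full exchange, different topologies) fit the same structure.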
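  • 23. Individual
      import java.util.ArrayList;
      import java.util.Collections;
      import java.util.List;

      public class BitStringIndividual extends Individual {
          private List<Boolean> bits;

          public BitStringIndividual(int size) {
              // Pre-fill with false: an ArrayList created with a capacity is still empty,
              // so set(index, value) would throw IndexOutOfBoundsException.
              bits = new ArrayList<>(Collections.nCopies(size, false));
          }

          public void set(int index, boolean value) {
              bits.set(index, value);
          }

          public boolean get(int index) {
              return bits.get(index);
          }

          public int size() {
              return bits.size();
          }
      }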
  • 24. Fitness Value
      public class IntegerFitnessValue extends FitnessValue {
          private int number;

          public IntegerFitnessValue(int value) {
              number = value;
          }

          public int get() {
              return number;
          }

          @Override
          public int compareTo(FitnessValue other) {
              // Any value is considered greater than a missing one.
              if (other == null)
                  return 1;
              int otherValue = ((IntegerFitnessValue) other).get();
              // The result of the comparison must be returned.
              return Integer.compare(number, otherValue);
          }
      }
  • 25. Fitness Evaluation
      public class OneMaxFitnessEvaluation
              extends FitnessEvaluation<BitStringIndividual, IntegerFitnessValue> {

          @Override
          public IntegerFitnessValue evaluate(
                  IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) {
              BitStringIndividual individual = wrapper.getIndividual();
              // OneMax: the fitness is the number of bits set to true.
              int count = 0;
              for (int i = 0; i < individual.size(); i++)
                  if (individual.get(i))
                      count++;
              return new IntegerFitnessValue(count);
          }
      }
  • 26. Initialization
      public class RandomBitStringInitialization
              extends Initialization<BitStringIndividual, IntegerFitnessValue> {
          private int individualSize;
          private Random random;

          public RandomBitStringInitialization(..., Properties userProperties, ...) {
              individualSize = userProperties.getInt(INDIVIDUAL_SIZE_PROPERTY);
              random = new Random();
          }

          @Override
          public IndividualWrapper<BitStringIndividual, IntegerFitnessValue>
                  generateNextIndividual(int id) {
              BitStringIndividual individual = new BitStringIndividual(individualSize);
              // Each bit is set to true or false with equal probability.
              for (int i = 0; i < individualSize; i++)
                  individual.set(i, random.nextInt(2) == 1);
              return new IndividualWrapper<>(individual);
          }
      }
  • 27. Crossover
      public class BitStringSinglePointCrossover
              extends Crossover<BitStringIndividual, IntegerFitnessValue> {
          private Random random;

          public BitStringSinglePointCrossover(...) {
              random = new Random();
          }

          @Override
          public List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> cross(
                  IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper1,
                  IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper2, ...) {
              BitStringIndividual parent1 = wrapper1.getIndividual();
              BitStringIndividual parent2 = wrapper2.getIndividual();

              // Bits before the cut point come from one parent, the rest from the other.
              int cutPoint = random.nextInt(parent1.size());

              BitStringIndividual child1 = new BitStringIndividual(parent1.size());
              BitStringIndividual child2 = new BitStringIndividual(parent1.size());
              for (int i = 0; i < cutPoint; i++) {
                  child1.set(i, parent1.get(i));
                  child2.set(i, parent2.get(i));
              }
              for (int i = cutPoint; i < parent1.size(); i++) {
                  child1.set(i, parent2.get(i));
                  child2.set(i, parent1.get(i));
              }

              List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> children =
                      new ArrayList<>(2);
              children.add(new IndividualWrapper<>(child1));
              children.add(new IndividualWrapper<>(child2));
              return children;
          }
      }
  • 28. Mutation
      public class BitStringMutation
              extends Mutation<BitStringIndividual, IntegerFitnessValue> {
          private double mutationProbability;
          private Random random;

          public BitStringMutation(...) {
              mutationProbability = userProperties.getDouble(MUTATION_PROBABILITY_PROPERTY);
              random = new Random();
          }

          @Override
          public IndividualWrapper<BitStringIndividual, IntegerFitnessValue> mutate(
                  IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) {
              BitStringIndividual individual = wrapper.getIndividual();
              // Flip each bit independently. nextDouble() is in [0, 1), so the strict
              // comparison guarantees that a probability of 0 never mutates.
              for (int i = 0; i < individual.size(); i++)
                  if (random.nextDouble() < mutationProbability)
                      individual.set(i, !individual.get(i));
              return wrapper;
          }
      }
  • 29. App
      public class App {
          public static void main(String[] args) {
              GlobalDriver driver = new GlobalDriver();
              // The framework's Properties type, as in the Initialization slide;
              // the no-argument constructor is assumed here.
              Properties userProperties = new Properties();

              driver.setIndividualClass(BitStringIndividual.class);
              driver.setFitnessValueClass(IntegerFitnessValue.class);

              driver.setInitializationClass(RandomBitStringInitialization.class);
              driver.setInitializationPopulationSize(POPULATION_SIZE);
              userProperties.setInt(INDIVIDUAL_SIZE_PROPERTY, INDIVIDUAL_SIZE);

              driver.setElitismClass(BestIndividualsElitism.class);
              driver.activateElitism(true);
              userProperties.setInt(NUMBER_OF_ELITISTS_PROPERTY, NUMBER_OF_ELITISTS);

              driver.setParentsSelectionClass(RouletteWheelParentsSelection.class);
              driver.setCrossoverClass(BitStringSinglePointCrossover.class);

              driver.setSurvivalSelectionClass(RouletteWheelSurvivalSelection.class);
              driver.activateSurvivalSelection(true);

              driver.setUserProperties(userProperties);
              driver.run();
          }
      }