Apply the Genetic Algorithm for optimization on a dataset obtained from the UCI ML repository.
For example: the Iris dataset.
Keywords: Genetic Algorithm, Optimization, Iris Dataset, Machine Learning, Python.
1. Department of Computer Engineering
Sandip Foundation's
Sandip Institute of Technology and Research Centre, Nashik
Savitribai Phule Pune University
LP-III MINI PROJECT
Year 2019 – 2020
Under the Guidance of Prof. Mangesh Ghonge
3. OPTIMIZATION
Optimization is the procedure of making a system or
design as effective as possible, especially by using
mathematical techniques.
Typical goals are to minimize the cost of production
or to maximize its efficiency.
4. GENETIC ALGORITHM
A genetic algorithm (GA for short) is a
search technique used in computing to
find true or approximate solutions to
optimization and search problems.
Genetic algorithms are categorized as
global search heuristics.
Genetic algorithms are a particular class
of evolutionary algorithms.
5. GA PROCEDURE
A typical genetic algorithm requires two
things to be defined:
a genetic representation of the solution
domain, and
a fitness function to evaluate candidate
solutions.
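The two required components can be sketched in Python. This is a minimal illustration, assuming a binary-string representation and the toy "OneMax" fitness (count the 1-bits); real problems substitute their own encoding and objective:

```python
import random

# Genetic representation: a fixed-length binary string (a list of 0/1 genes).
def random_chromosome(length):
    return [random.randint(0, 1) for _ in range(length)]

# Fitness function: the toy "OneMax" objective, which simply counts
# the 1-bits; the optimum is a chromosome of all ones.
def fitness(chromosome):
    return sum(chromosome)

population = [random_chromosome(8) for _ in range(10)]
best = max(population, key=fitness)
print(best, fitness(best))
```

Any problem that can be encoded this way (a chromosome plus a score) is a candidate for a GA.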
6. What Do We Mean By Genetic
Algorithm?
The algorithm starts with a set of randomly
generated solutions and recombines pairs
of them at random to produce offspring.
Only the best offspring and parents are
kept to produce the next generation.
7. PROBLEM DOMAINS
Problems which appear to be particularly
appropriate for solution by genetic
algorithms include timetabling and
scheduling problems.
Genetic algorithms are often applied as an
approach to solve global optimization
problems.
As a general rule of thumb, genetic
algorithms may be useful in problem
domains that have a complex fitness
landscape, since recombination is designed to
move the population away from local optima
in which a traditional hill-climbing algorithm
might get stuck.
8. The Iris data set is among the best-known databases
in the pattern recognition literature.
Data set: the Iris flower data set (donated
1988-07-01), also known as Fisher's Iris data
set and Anderson's Iris data set, because Edgar
Anderson collected the data.
It is a multivariate data set (more than two
variables) covering three related Iris
species. The data set contains 50 samples
of each species (Iris setosa, Iris virginica,
Iris versicolor).
10. Attributes (all measured in cm): sepal length, sepal width, petal length, petal width.

    Attribute       Min   Max   Mean   SD     Class Correlation
    Sepal length    4.3   7.9   5.84   0.83    0.7826
    Sepal width     2.0   4.4   3.05   0.43   -0.4194
    Petal length    1.0   6.9   3.76   1.76    0.9490 (high!)
    Petal width     0.1   2.5   1.20   0.76    0.9565 (high!)
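Statistics like those in the table above can be recomputed directly with Python's standard library. The values below are a small hypothetical subset for illustration; the real table is computed over all 150 samples from the UCI data file:

```python
import statistics

# A few illustrative sepal-length values (cm) -- not the full dataset.
sepal_length = [5.1, 4.9, 4.7, 7.0, 6.4, 6.9, 6.3, 5.8, 7.1]

print("min :", min(sepal_length))
print("max :", max(sepal_length))
print("mean:", round(statistics.mean(sepal_length), 2))
print("sd  :", round(statistics.stdev(sepal_length), 2))
```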
11. Task: classify a new flower as belonging to
one of the 3 classes, given the 4 features.
12. # Box-and-whisker plots (give an idea of the
distribution of the input attributes)
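A box-and-whisker plot displays the five-number summary (minimum, first quartile, median, third quartile, maximum). Those numbers can be computed without any plotting library; a sketch using `statistics.quantiles` (Python 3.8+) on a few illustrative petal-length values:

```python
import statistics

# Sample petal-length values (cm), for illustration only.
petal_length = [1.4, 1.3, 1.5, 4.7, 4.5, 4.9, 6.0, 5.1, 5.9]

# n=4 splits the data at the quartiles; these are the box edges and median line.
q1, median, q3 = statistics.quantiles(petal_length, n=4)
print(min(petal_length), q1, median, q3, max(petal_length))
```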
17. 1. Using the Petal_Length and Petal_Width features,
we can distinguish Setosa, Versicolor, and
Virginica fairly well.
2. There is slight overlap between Versicolor and
Virginica.
3. The graphs show that the petal features (length
and width) are the best contributors for separating
the Iris species, compared to the sepal features
(length and width).
19. Evaluate using 6 different
algorithms (cross-validation).
Here,
1. Logistic Regression (LR)
2. Linear Discriminant Analysis (LDA)
3. K-Nearest Neighbors (KNN)
4. Classification and Regression Tree (CART)
5. Gaussian Naive Bayes (NB)
6. Support Vector Machine (SVM)
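Cross-validation works by partitioning the samples into k folds, training on k-1 folds and testing on the held-out one. In practice this evaluation is typically done with scikit-learn; the splitting mechanism itself can be sketched in plain Python:

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs partitioning range(n_samples) into k folds."""
    indices = list(range(n_samples))
    # distribute samples as evenly as possible across the k folds
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# Each of the 150 Iris samples appears in exactly one test fold.
folds = list(k_fold_indices(150, 10))
print(len(folds), len(folds[0][1]))
```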
20. Case   Features used                  Best Model   Train Acc   Test Acc   Misclassified
    1      All features                   SVM          .9899       .9555      2
    2      Sepal only                     SVM          .8472       .7111      12
    3      Petal only                     SVM          .9899       .9333      3
    4      PetalWidth, Sepal (Len, Wid)   SVM/LDA      .9809       .9111      4
    5      PetalLen, Sepal (Len, Wid)     SVM          .9700       .9111      4
21. Applications:
Software engineering.
Traveling Salesman Problem.
Mobile communications infrastructure
optimization.
Electronic circuit design, known as
evolvable hardware.
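The Traveling Salesman Problem illustrates how the evolutionary loop applies beyond binary strings: a chromosome is a permutation of cities and fitness is (negated) tour length. Below is a mutation-only evolutionary sketch with made-up city coordinates; a full GA would add a permutation-preserving crossover such as order crossover (OX1):

```python
import math
import random

random.seed(1)

# Hypothetical city coordinates for a tiny TSP instance.
CITIES = [(0, 0), (1, 5), (5, 2), (6, 6), (8, 3), (2, 8)]

def tour_length(tour):
    # total length of the closed tour visiting the cities in order
    return sum(math.dist(CITIES[tour[i]], CITIES[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def mutate(tour):
    # swap two cities, which keeps the chromosome a valid permutation
    t = tour[:]
    i, j = random.sample(range(len(t)), 2)
    t[i], t[j] = t[j], t[i]
    return t

population = [random.sample(range(len(CITIES)), len(CITIES)) for _ in range(30)]
for _ in range(200):
    offspring = [mutate(random.choice(population)) for _ in range(30)]
    # elitist survival: shortest tours among parents and offspring survive
    population = sorted(population + offspring, key=tour_length)[:30]

best = population[0]
print(best, round(tour_length(best), 2))
```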
22. Advantages:
A GA has a number of advantages.
It can quickly scan a vast solution set.
Bad proposals do not affect the end
solution negatively, as they are simply
discarded.
The inductive nature of the GA means that it
does not need to know any rules of the
problem; it works by its own internal rules.
This is very useful for complex or loosely
defined problems.
23. Disadvantages:
A practical disadvantage of the genetic
algorithm is its longer running time
on the computer. Fortunately, this
disadvantage continues to be reduced
by the ever-increasing processing speeds
of today's computers.
24. Conclusion:
Evolutionary algorithms have been around
since the early sixties. They apply the rules
of nature: evolution through selection of
the fittest individuals, with the individuals
representing solutions to a mathematical
problem. Genetic algorithms are so far
generally the best and most robust kind of
evolutionary algorithm.
25. REFERENCES
1. Akbari, Z. (2010). "A multilevel evolutionary algorithm for optimizing numerical
functions." IJIEC 2 (2011): 419–430.
2. Coffin, D.; S., Robert E. (2008). "Linkage Learning in Estimation of Distribution
Algorithms." Linkage in Evolutionary Computation. Springer Berlin Heidelberg: 141–156.
doi:10.1007/978-3-540-85068-7_7.
3. Eiben, A. E. et al. (1994). "Genetic algorithms with multi-parent recombination." PPSN III:
Proceedings of the International Conference on Evolutionary Computation. The Third
Conference on Parallel Problem Solving from Nature: 78–87. ISBN 3-540-58484-6.
4. "Clustering - K-means demo", K-means interactive demo. Available at:
http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html.
Consulted 22 AUG 2013.
5. Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository
[http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information
and Computer Science.
6. Bishop, C. (2006). Pattern Recognition and Machine Learning. New York: Springer,
pp. 424–428.
7. Fisher, R.A. (1936). UCI Machine Learning Repository: Iris Data Set. Available at:
http://archive.ics.uci.edu/ml/datasets/Iris. Consulted 10 AUG 2013.
8. Mitchell, T. (1997). Machine Learning. McGraw Hill.