Bioinformatics Literature Review
A Review of Genetic Algorithms
Lit Review Talk
by
Kato Mivule
COSC891 – Bioinformatics, Spring 2014
Bowie State University
Bowie State University Department of Computer Science
Outline
• Introduction
• Biological Background
• Genetic Algorithm
• Genetic Algorithm Paper Discussion
• Conclusion
Sources
Information presented in these slides is adapted from the following sources:
1. Michael Skinner, Genetic Algorithms Overview, http://geneticalgorithms.ai-depot.com/Tutorial/Overview.html, accessed online March 2, 2014.
2. Genetic Algorithms, Lecture Notes, UC Davis Computer Science Dept., http://www.cs.ucdavis.edu/~vemuri/classes/ecs271/Genetic%20Algorithms%20Short%20Tutorial.htm
3. Wikipedia, Genetic algorithm, http://en.wikipedia.org/wiki/Genetic_algorithm
4. Nobal Niraula, Genetic Algorithms by Example, http://www.slideshare.net/kancho/genetic-algorithm-by-example
5. BBC Genetics, http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
6. Deoxyribonucleic Acid (DNA), https://www.genome.gov/25520880#al-3
7. MATLAB, How the Genetic Algorithm Works, http://www.mathworks.com/help/gads/how-the-genetic-algorithm-works.html
Genetic Algorithms (GA) - Introduction
• Genetic algorithms (GAs) were first developed by John Holland (1975).
• A GA is a search heuristic that mimics the process of natural evolution.
• GAs use Darwin's concepts of “natural selection” and “genetic inheritance”.
• GAs can be applied to problems about which little is known in advance.
• GAs are general-purpose and can be applied to a wide range of search spaces.
• GAs use selection and evolution to generate numerous candidate solutions to a problem.
Genetic Algorithms (GA) – Introduction
• GAs work well with a very large set of candidate solutions.
• GAs are outperformed by more situation-specific algorithms in simpler
search spaces.
• GAs are not always the best choice; their run time can be long.
• GAs are good at producing high-quality solutions to a problem.
Genetic Algorithms (GA) – Introduction
• GA use the process of natural selection and evolution.
• “…Some birds developed large, strong beaks suited to cracking nuts, others long,
narrow beaks more suitable for digging bugs out of wood. The birds that had these
characteristics when blown to the island survived longer than other birds. This
allowed them to reproduce more and therefore have more offspring that also had this
unique characteristic. Those without the characteristic gradually died out from
starvation. Eventually all of the birds had a type of beak that helped it survive on its
island. The individuals themselves do not change, but those that survive better, or
have a higher fitness, will survive longer and produce more offspring. This continues
to happen, with the individuals becoming more suited to their environment every
generation. It was this continuous improvement that inspired computer scientists, one
of the most prominent being John Holland, to create genetic algorithms…” Genetic
Algorithms Overview, Michael Skinner
Biology background
• The body is made up of cells. The cell has a center called a nucleus. The
nucleus contains the chromosomes. Each chromosome is composed of tightly
coiled strands of deoxyribonucleic acid (DNA).
• Genes are sections of DNA that determine particular traits, like eye and skin
color. You have more than 20,000 genes. A gene mutation is a modification
in DNA. Some changes in your genes result in genetic disorders.
Source: http://www.riversideonline.com/health_reference/Tools/DS00549.cfm
Biology background
• The body is made up of cells. The cell has a center called a nucleus. The
nucleus contains the chromosomes. The chromosomes contain the DNA
strands.
Source: BBC Genetics: http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
Biology background
• Each chromosome is composed of tightly coiled strands of deoxyribonucleic acid
(DNA). Genes are sections of DNA that determine particular traits, like eye and
skin color.
Source: BBC Genetics: http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
Biology background
• DNA: A DNA molecule is made up of two strands that form a double helix.
Each strand contains four types of bases: adenine (A), thymine (T),
guanine (G) and cytosine (C). The two strands are held together by weak
hydrogen bonds between paired bases: adenine pairs with thymine, and guanine
pairs with cytosine.
• A section of DNA is called a gene. It is normally hundreds or thousands
of DNA bases long.
Source: BBC Genetics: http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
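The pairing rule on this slide (A with T, G with C) can be illustrated with a short Python sketch. The function names `complement` and `reverse_complement` are chosen here for illustration and do not come from the slides.

```python
# Base-pairing rule from the slide: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complementary strand read in the opposite direction."""
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Applying `complement` twice returns the original strand, which mirrors the fact that each strand of the double helix determines the other.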
Biology background
• Genes and proteins: The genetic information coded into the DNA of the genes
gives the cells instructions to make many specific protein molecules.
• Proteins are built from amino acid molecules. The order of the DNA bases
codes for the order of amino acids in the protein.
Source: BBC Genetics: http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
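As a rough illustration of how the order of DNA bases codes for the order of amino acids, the sketch below reads a DNA string three bases at a time and looks each triplet (codon) up in a table. Only a small excerpt of the standard genetic code is included, and the names `CODON_TABLE` and `translate` are assumptions made for this example.

```python
# A small excerpt of the standard genetic code (DNA codons on the coding strand).
CODON_TABLE = {
    "ATG": "M",  # methionine (start)
    "TTT": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "AAA": "K",  # lysine
    "TAA": "*",  # stop
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon; '?' marks codons missing from the excerpt."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE.get(dna[i:i + 3].upper(), "?")
        if amino == "*":  # a stop codon ends the protein
            break
        protein.append(amino)
    return "".join(protein)


print(translate("ATGTTTGGCAAATAA"))  # MFGK
```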
Biology Background
• Random assortment of chromosomes: The members of each pair of chromosomes
separate completely at random, giving many possible combinations.
Source: BBC Genetics: http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
Biology Background
Natural Selection Process
Source: BBC Biology Genetics: http://www.bbc.co.uk/bitesize/higher/biology/genetics_adaptation/natural_selection/revision/2/
Biology Background
Natural Selection Process
Source: Wikipedia, Evolution: http://en.wikipedia.org/wiki/Evolution
Genetic Algorithm Pseudo-code
Generate an initial population of individuals
Evaluate the fitness of all individuals
while termination condition not met do
    Select fitter individuals for reproduction
    Recombine between individuals
    Mutate individuals
    Evaluate the fitness of the modified individuals
    Generate a new population
end while
Source: Nobal Niraula, Genetic Algorithms by Example http://www.slideshare.net/kancho/genetic-algorithm-by-example
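The pseudo-code above can be sketched in Python on a toy problem. Everything specific here is an assumption made for illustration: the "OneMax" fitness (count the 1-bits), tournament selection, one-point recombination, and the parameter values. None of these choices come from the slides.

```python
import random


def fitness(bits):
    """Toy 'OneMax' fitness: the number of 1-bits in the individual."""
    return sum(bits)


def tournament(population, rng, k=3):
    """Select the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)


def crossover(a, b, rng):
    """One-point recombination of two parents."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(bits, rng, rate=0.01):
    """Flip each bit independently with a small probability."""
    return [1 - b if rng.random() < rate else b for b in bits]


def run_ga(n_bits=30, pop_size=40, generations=100, seed=0):
    rng = random.Random(seed)
    # Generate and evaluate an initial population.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        if fitness(best) == n_bits:  # termination condition
            break
        # Select, recombine and mutate to generate a new population.
        population = [
            mutate(crossover(tournament(population, rng),
                             tournament(population, rng), rng), rng)
            for _ in range(pop_size)
        ]
        best = max(population + [best], key=fitness)
    return best


best = run_ga()
print(fitness(best), "of", len(best))
```

Tracking `best` separately means the best individual found so far is never lost, even though this sketch replaces the whole population each generation.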
Genetic Algorithm
Genetic algorithm process
Source: Nobal Niraula, Genetic Algorithms by Example http://www.slideshare.net/kancho/genetic-algorithm-by-example
Phases in the Genetic algorithm process.
Source: http://www.cs.ucdavis.edu/~vemuri
Genetic Algorithm (GA)
• Initial population: the GA starts by generating a random initial population.
• Creating the next generation: children are created from the current population.
• The GA generates three types of children for the next generation:
  • Elite children: the individuals with the best fitness values, which survive unchanged.
  • Crossover children: created by combining the vectors of a pair of parents.
  • Mutation children: created by introducing random changes to a single parent.
• Stopping conditions: the algorithm stops when a stopping criterion is met, such as reaching a target fitness value.
Source: MATLAB How the Genetic Algorithm Works, http://www.mathworks.com/help/gads/how-the-genetic-algorithm-works.html
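The three kinds of children can be sketched as follows. This is a plain-Python illustration, not MATLAB's implementation: the function `next_generation`, the parent-selection rule (top half of the population), the gene-by-gene crossover, the Gaussian mutation, and all parameter values are assumptions chosen for the example. Lower fitness is treated as better, since the example minimizes a function.

```python
import random


def next_generation(population, fitness, rng, n_elite=2,
                    crossover_fraction=0.8, sigma=0.1):
    """Build the next generation from elite, crossover and mutation children."""
    ranked = sorted(population, key=fitness)  # best (lowest) first
    size = len(population)
    # Elite children: the best individuals survive unchanged.
    elite = [list(ind) for ind in ranked[:n_elite]]
    n_cross = round(crossover_fraction * (size - n_elite))
    n_mut = size - n_elite - n_cross
    parents = ranked[:max(2, size // 2)]  # assumed rule: top half may reproduce

    # Crossover children: each gene is taken from one of two parents.
    crossover_children = []
    for _ in range(n_cross):
        p1, p2 = rng.sample(parents, 2)
        crossover_children.append([rng.choice(pair) for pair in zip(p1, p2)])

    # Mutation children: random changes applied to a single parent.
    mutation_children = []
    for _ in range(n_mut):
        parent = rng.choice(parents)
        mutation_children.append([x + rng.gauss(0.0, sigma) for x in parent])

    return elite + crossover_children + mutation_children


def sphere(v):
    """Toy objective to minimize: sum of squares."""
    return sum(x * x for x in v)


rng = random.Random(1)
pop = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(20)]
for _ in range(50):
    pop = next_generation(pop, sphere, rng)
print(min(sphere(ind) for ind in pop))
```

Because the elite children are copied unchanged, the best fitness in the population can never get worse from one generation to the next.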
Paper Review: C. H. Ooi and P. Tan, “Genetic algorithms applied to multi-class prediction
for the analysis of gene expression data,” Bioinformatics, vol. 19, no. 1, pp. 37–44, 2003.
The Problem:
•The expression dataset being analyzed involves multiple classes.
•The efficient selection of good predictive gene groups from datasets that are
inherently ‘noisy’.
•The development of new methodologies that can enhance the successful
classification of these complex datasets.
Methods:
• GA is applied to the problem of multi-class prediction.
• A GA-based gene selection scheme is employed to automatically:
  • determine the members of a predictive gene group
  • determine the optimal group size
  • determine the classification success using a maximum likelihood (MLHD)
    classification method.
Results:
• The authors state that the GA/MLHD-based approach achieves higher
classification accuracies than other published predictive methods on the same
multi-class test dataset.
• The authors claim that GA/MLHD also permits substantial feature reduction in
classifier gene sets without compromising predictive accuracy.
Dataset and Data Preprocessing
• The authors used the NCI60 gene expression dataset, which contains the gene
expression profiles of 64 cancer cell lines as measured by cDNA microarrays
containing 9703 spotted cDNA sequences.
• The authors downloaded the data from
http://genome-www.stanford.edu/sutech/download/nci60/dross arrays nci60.tgz.
• During data preprocessing, the authors excluded spots with missing data,
control spots, and empty spots, leaving 6167 genes.
Overall Methodology
The GA/MLHD classification strategy consists of two main components:
(1) a GA-based gene selector
(2) a maximum likelihood (MLHD) classifier.
•The actual classification process is performed using the maximum likelihood
(MLHD) classifier.
•Each individual in the population thus represents a specific gene predictor
subset
•A fitness function is used to determine the classification accuracy of a predictor
set.
System and Methods
•Initialization and Evaluation: An initial population is formed by creating N
random strings, where the population size N is pre-specified
•Selection, Crossover and Mutation: Two selection methods were used to
select the strings for the mating pool: (i) stochastic universal sampling (SUS) and
(ii) roulette wheel selection (RWS).
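The two selection methods named above can be sketched generically as follows. This is a textbook-style illustration rather than the authors' code; the function names and the example fitness values are assumptions. Both functions return indices into the list of fitness values, which are assumed to be non-negative.

```python
import random


def roulette_wheel_selection(fitnesses, n, rng):
    """RWS: spin a fitness-proportional wheel n independent times."""
    total = sum(fitnesses)
    picks = []
    for _ in range(n):
        r = rng.random() * total
        cumulative = 0.0
        for i, f in enumerate(fitnesses):
            cumulative += f
            if r < cumulative:
                picks.append(i)
                break
        else:  # guard against floating-point round-off at the wheel's end
            picks.append(len(fitnesses) - 1)
    return picks


def stochastic_universal_sampling(fitnesses, n, rng):
    """SUS: one spin, then n equally spaced pointers around the wheel."""
    total = sum(fitnesses)
    step = total / n
    start = rng.random() * step
    picks = []
    i = 0
    cumulative = fitnesses[0]
    for k in range(n):
        pointer = start + k * step
        # Advance to the individual whose wheel segment contains the pointer.
        while pointer >= cumulative and i < len(fitnesses) - 1:
            i += 1
            cumulative += fitnesses[i]
        picks.append(i)
    return picks


rng = random.Random(0)
fitnesses = [1.0, 2.0, 3.0, 4.0]
print(roulette_wheel_selection(fitnesses, 4, rng))
print(stochastic_universal_sampling(fitnesses, 4, rng))
```

With SUS, the number of times an individual is picked stays within one of its expected count, whereas RWS can deviate further because each spin is independent.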
System and Methods
• Crossover: performed by randomly choosing a pair of strings from the
mating pool and then applying a crossover operation to the selected string
pair.
• Uniform mutation: applied with probability p(m) to each of the
offspring strings produced by crossover.
• Termination: evaluation, selection, crossover and mutation are repeated for
G generations; the string with the best fitness across all generations is
output as the solution.
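A minimal sketch of these two operators is shown below. The slides do not specify how the strings are encoded or which crossover operation is applied, so this example assumes each string is a list of gene indices and uses one-point crossover; the value `n_genes=6167` simply reuses the gene count quoted on the dataset slide as an illustrative range, and mutation is applied position by position as one possible reading of "uniform mutation".

```python
import random


def one_point_crossover(a, b, rng):
    """Swap the tails of two equal-length strings at a random cut point."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def uniform_mutation(string, p_m, n_genes, rng):
    """Replace each position, with probability p_m, by a random gene index."""
    return [rng.randrange(n_genes) if rng.random() < p_m else g
            for g in string]


rng = random.Random(42)
parent_a = [10, 20, 30, 40, 50]  # hypothetical gene indices
parent_b = [11, 21, 31, 41, 51]
child_a, child_b = one_point_crossover(parent_a, parent_b, rng)
mutated = uniform_mutation(child_a, p_m=0.1, n_genes=6167, rng=rng)
print(child_a, child_b, mutated)
```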
A maximum likelihood (MLHD) classifier
•To build an MLHD classifier (James, 1985), a total of M(t) tumor samples are
used as training samples. The remaining M(θ) tumor samples are used as test
samples.
•For the NCI60 dataset, the ratio between M(t) and M(θ) is 2:1.
•Discriminant Function: The basis of the discriminant function is Bayes’ rule of
maximum likelihood: Assign the sample to the class with the highest conditional
probability.
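The decision rule (assign the sample to the class under which it is most likely) can be sketched as below. This is a simplified stand-in and not the paper's exact discriminant function: it assumes Gaussian class-conditional densities with a pooled variance per feature and equal class priors, and the tiny two-class dataset is invented for the example.

```python
import math
from collections import defaultdict


def fit(samples, labels):
    """Estimate per-class means and a pooled per-feature variance."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    means = {c: [sum(col) / len(rows) for col in zip(*rows)]
             for c, rows in by_class.items()}
    n_features = len(samples[0])
    pooled = [0.0] * n_features
    for x, y in zip(samples, labels):
        for j in range(n_features):
            pooled[j] += (x[j] - means[y][j]) ** 2
    dof = max(1, len(samples) - len(by_class))
    variances = [max(v / dof, 1e-9) for v in pooled]
    return means, variances


def log_likelihood(x, mean, variances):
    """Log of a Gaussian density with independent features."""
    return sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
               for xi, m, v in zip(x, mean, variances))


def predict(x, means, variances):
    """Assign x to the class with the highest likelihood."""
    return max(means, key=lambda c: log_likelihood(x, means[c], variances))


# Invented training data: two classes, two expression features.
train_x = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
           [4.0, 5.0], [4.2, 4.8], [3.8, 5.2]]
train_y = ["A", "A", "A", "B", "B", "B"]
means, variances = fit(train_x, train_y)
print(predict([1.1, 2.1], means, variances))  # A
print(predict([3.9, 5.1], means, variances))  # B
```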
“…Comparing GA-based Predictor Sets to Predictor Sets Obtained from Other Methodologies The best
predictor set obtained using the GA-based selection scheme exhibited a cross validation error rate of
14.63% and an independent test error rate of 5% (Table 1, row 1, and see Supplementary Information for
specific misclassifications). This is an improvement in accuracy as compared to other methodologies
assessed by Dudoit et al. (2000), where the lowest independent test error rate was reported as 19%...” Ooi
and Tan (2003)
“…Comparison of expression profiles of predictor sets obtained through different methodologies.
Columns represent different class distinctions, and only training set samples are depicted. (a) Expression
profile of genes selected through the GA/MLHD method (only genes for the best predictor set are shown).
(b) Expression profile of 20 genes selected through the BSS/WSS ratio ranking method. (c) Expression
profile of 18 genes selected through the OVA/S2N ratio ranking method. Arrows depict genes which have
highly correlated expression patterns across the sample classes. Classes are labeled as follows: BR
(breast), CN (central nervous system), CL (colon), LE (leukemia), ME (melanoma), NS (non-small-cell
lung carcinoma), OV (ovarian), RE (renal) and PR (reproductive system)…” Ooi and Tan (2003)
Paper: C. H. Ooi and P. Tan, “Genetic algorithms applied to multi-class prediction for the
analysis of gene expression data,” Bioinformatics, vol. 19, no. 1, pp. 37–44, 2003.
Conclusion
• The authors state that their report shows that highly accurate classification
results can be obtained using a combination of GA-based gene selection and
discriminant-based classification methods.
• The authors note that the accuracy achieved (95% for NCI60) is better than that
of other published methods employing the same dataset.
• The authors note that other advantages of the GA-based approach are that it
automatically determines the optimal predictor set size and delivers
predictive accuracies that are comparable to other methods.
Conclusion
• Genetic algorithms tend to be outperformed by more situation-specific algorithms
in simpler search spaces.
• Genetic algorithms are not always the best choice; their run time can be long.
• Genetic algorithms are good at producing high-quality solutions to a problem.
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
 
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
Towards A Differential Privacy and Utility Preserving Machine Learning Classi...
 
An Investigation of Data Privacy and Utility Preservation Using KNN Classific...
An Investigation of Data Privacy and Utility Preservation Using KNN Classific...An Investigation of Data Privacy and Utility Preservation Using KNN Classific...
An Investigation of Data Privacy and Utility Preservation Using KNN Classific...
 
Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...
 
Applying Data Privacy Techniques on Published Data in Uganda
 Applying Data Privacy Techniques on Published Data in Uganda Applying Data Privacy Techniques on Published Data in Uganda
Applying Data Privacy Techniques on Published Data in Uganda
 
Kato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
Kato Mivule - Utilizing Noise Addition for Data Privacy, an OverviewKato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
Kato Mivule - Utilizing Noise Addition for Data Privacy, an Overview
 
Kato Mivule - Towards Agent-based Data Privacy Engineering
Kato Mivule - Towards Agent-based Data Privacy EngineeringKato Mivule - Towards Agent-based Data Privacy Engineering
Kato Mivule - Towards Agent-based Data Privacy Engineering
 
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data PrivacyA Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
A Codon Frequency Obfuscation Heuristic for Raw Genomic Data Privacy
 
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a GaugeAn Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
 
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a GaugeAn Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
 
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a GaugeAn Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
 
Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...
Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...
Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...
 
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
 
Kato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance ComputingKato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance Computing
 
Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...
Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...
Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...
 
Kato Mivule: An Overview of Adaptive Boosting – AdaBoost
Kato Mivule: An Overview of  Adaptive Boosting – AdaBoostKato Mivule: An Overview of  Adaptive Boosting – AdaBoost
Kato Mivule: An Overview of Adaptive Boosting – AdaBoost
 
Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...
Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...
Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...
 
Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...
Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...
Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...
 

Recently uploaded

Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 

Recently uploaded (20)

Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 

Lit Review Talk by Kato Mivule: A Review of Genetic Algorithms

  • 1. Bioinformatics Literature Review: A Review of Genetic Algorithms
    Lit Review Talk by Kato Mivule
    COSC891 – Bioinformatics, Spring 2014
    Bowie State University, Department of Computer Science
  • 2. Outline
      • Introduction
      • Biological Background
      • Genetic Algorithm
      • Genetic Algorithm Paper Discussion
      • Conclusion
  • 3. Sources
    Information presented in these slides is adapted from the following sources:
      1. Michael Skinner, Genetic Algorithms Overview, http://geneticalgorithms.ai-depot.com/Tutorial/Overview.html, accessed online March 2nd, 2014.
      2. Genetic Algorithms, Lecture Notes, UC Davis Computer Science Dept., http://www.cs.ucdavis.edu/~vemuri/classes/ecs271/Genetic%20Algorithms%20Short%20Tutorial.htm
      3. Wikipedia, Genetic algorithm, http://en.wikipedia.org/wiki/Genetic_algorithm
      4. Nobal Niraula, Genetic Algorithms by Example, http://www.slideshare.net/kancho/genetic-algorithm-by-example
      5. BBC Genetics, http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
      6. Deoxyribonucleic Acid (DNA), https://www.genome.gov/25520880#al-3
      7. MATLAB, How the Genetic Algorithm Works, http://www.mathworks.com/help/gads/how-the-genetic-algorithm-works.html
  • 4. Genetic Algorithms (GA) – Introduction
      • Genetic Algorithms (GAs) were first developed by John Holland (1975).
      • A GA is a search heuristic that mimics the process of natural evolution.
      • GAs use Darwin's concepts of “Natural Selection” and “Genetic Inheritance”.
      • GAs can be applied to problems about which little is known in advance.
      • GAs are general enough to work in any search space.
      • GAs use selection and evolution to generate numerous candidate solutions to a problem.
  • 5. Genetic Algorithms (GA) – Introduction
      • GAs work well with a very large set of candidate solutions.
      • GAs tend to be outperformed by more situation-specific algorithms in simpler search spaces.
      • GAs are not always the best choice, because their run time is long.
      • GAs are good at creating high-quality solutions to a problem.
  • 6. Genetic Algorithms (GA) – Introduction
      • GAs use the process of natural selection and evolution.
      • “…Some birds developed large, strong beaks suited to cracking nuts, others long, narrow beaks more suitable for digging bugs out of wood. The birds that had these characteristics when blown to the island survived longer than other birds. This allowed them to reproduce more and therefore have more offspring that also had this unique characteristic. Those without the characteristic gradually died out from starvation. Eventually all of the birds had a type of beak that helped it survive on its island. The individuals themselves do not change, but those that survive better, or have a higher fitness, will survive longer and produce more offspring. This continues to happen, with the individuals becoming more suited to their environment every generation. It was this continuous improvement that inspired computer scientists, one of the most prominent being John Holland, to create genetic algorithms…”
        Source: Genetic Algorithms Overview, Michael Skinner
  • 7. Biology background
      • The body is made up of cells. Each cell has a center called the nucleus. The nucleus contains the chromosomes. A chromosome is composed of tightly coiled strings of deoxyribonucleic acid (DNA).
      • Genes are sections of DNA that determine particular traits, like eye and skin color. You have more than 20,000 genes. A gene mutation is a modification in DNA. Some changes in your genes result in genetic disorders.
    Source: http://www.riversideonline.com/health_reference/Tools/DS00549.cfm
  • 8. Biology background
      • The body is made up of cells. Each cell has a center called the nucleus. The nucleus contains the chromosomes. The chromosomes contain the DNA strand.
    Source: BBC Genetics: http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
  • 9. Biology background
      • A chromosome is composed of tightly coiled strings of deoxyribonucleic acid (DNA). Genes are sections of DNA that determine particular traits, like eye and skin color.
    Source: BBC Genetics: http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
  • 10. Biology background
      • DNA: A molecule of DNA is made up of two strands that form a double helix. The DNA strand contains four types of bases: Adenine (A), Thymine (T), Guanine (G) and Cytosine (C). Paired bases are held together by weak hydrogen bonds. Adenine pairs with Thymine. Guanine pairs with Cytosine.
      • A section of this DNA is called a gene. It is normally hundreds or thousands of DNA bases long.
    Source: BBC Genetics: http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
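The pairing rule on this slide (A with T, G with C) can be illustrated with a few lines of Python. This is only a minimal sketch: the names `PAIRS` and `complement` are made up for illustration and do not come from the slides or their sources.

```python
# Base pairing as stated on the slide: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the partner strand, base by base, for a string over A, T, G, C."""
    return "".join(PAIRS[base] for base in strand.upper())

if __name__ == "__main__":
    print(complement("ATGC"))  # TACG
```

A string encoding like this is also what makes DNA a natural metaphor for the chromosomes that a GA manipulates.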
  • 11. Biology background
      • Genes and Proteins: The genetic information coded into DNA in the genes gives the cells instructions to make many specific protein molecules.
      • Proteins are built using amino acid molecules. The order of the DNA bases is the code for the order of amino acids in the protein.
    Source: BBC Genetics: http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
  • 12. Biology Background
      • Random assortment of chromosomes: The members of each pair of chromosomes are separated completely at random, so many combinations are possible.
    Source: BBC Genetics: http://www.bbc.co.uk/bitesize/intermediate2/biology/environmental_and_genetics/factors_affecting_variation_species/revision/6/
  • 13. Biology Background: Natural Selection Process. Source: BBC Biology Genetics: http://www.bbc.co.uk/bitesize/higher/biology/genetics_adaptation/natural_selection/revision/2/
  • 14. Biology Background: Natural Selection Process. Source: Wikipedia, Evolution: http://en.wikipedia.org/wiki/Evolution
  • 15. Genetic Algorithm Pseudo-code
        Generate an initial population of individuals
        Evaluate the fitness of all individuals
        while termination condition not met do
            Select fitter individuals for reproduction
            Recombine between individuals
            Mutate individuals
            Evaluate the fitness of the modified individuals
            Generate a new population
        end while
    Source: Nobal Niraula, Genetic Algorithms by Example, http://www.slideshare.net/kancho/genetic-algorithm-by-example
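One way to turn the pseudo-code into something runnable is sketched below. The slides do not fix a problem, an encoding, or operators, so everything concrete here is an assumption made for illustration: bit-string individuals, a toy "count the 1-bits" fitness, binary tournament selection, one-point crossover, bit-flip mutation, and a fixed generation budget as the termination condition.

```python
import random

def fitness(individual):
    # Toy objective (an assumption): the number of 1-bits in the string.
    return sum(individual)

def run_ga(length=20, pop_size=30, generations=50, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    # Generate an initial population of random bit strings.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):          # termination: generation budget
        if fitness(best) == length:       # or the known optimum is reached
            break
        new_population = []
        while len(new_population) < pop_size:
            # Select fitter individuals: binary tournament, twice.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            # Recombine: one-point crossover.
            cut = rng.randint(1, length - 1)
            child = p1[:cut] + p2[cut:]
            # Mutate: flip each bit with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            new_population.append(child)
        # The new population replaces the old one; remember the best seen.
        population = new_population
        best = max(population + [best], key=fitness)
    return best

if __name__ == "__main__":
    solution = run_ga()
    print(solution, fitness(solution))
```

Because the best individual seen so far is carried along, the returned fitness never decreases from one generation to the next, even though the population itself is fully replaced.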
  • 16. Genetic Algorithm. Source: Nobal Niraula, Genetic Algorithms by Example, http://www.slideshare.net/kancho/genetic-algorithm-by-example
  • 17. Genetic algorithm process: phases in the genetic algorithm process. Source: http://www.cs.ucdavis.edu/~vemuri
  • 18. Genetic Algorithm (GA)
      • Initial Population: the GA starts by generating a random initial population.
      • Creating the Next Generation: children are created from the current population.
      • The GA generates three types of children for the next generation:
          • Elite children: individuals with the best fitness values, who survive.
          • Crossover children: created by combining the vectors of a pair of parents.
          • Mutation children: created by introducing random changes to a single parent.
      • Stopping Conditions for the Algorithm: the algorithm stops when the specified fitness criterion is met.
    Source: MATLAB, How the Genetic Algorithm Works, http://www.mathworks.com/help/gads/how-the-genetic-algorithm-works.html
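The three kinds of children can be sketched as a single function that builds one new generation. This is not MATLAB's implementation. The function name, the parameters `n_elite`, `crossover_fraction` and `p_mut`, the bit-string encoding, the convention that higher fitness is better, and the choice to breed from the top half of the population are all assumptions made for this illustration.

```python
import random

def next_generation(population, fitness, n_elite=2, crossover_fraction=0.8,
                    p_mut=0.05, rng=None):
    """Build one new generation of bit strings (higher fitness is better)."""
    rng = rng or random.Random()
    ranked = sorted(population, key=fitness, reverse=True)
    size = len(population)

    # Elite children: the best individuals survive unchanged.
    children = [list(ind) for ind in ranked[:n_elite]]

    # Split the remaining slots between crossover and mutation children.
    n_cross = round(crossover_fraction * (size - n_elite))
    parents = ranked[:max(2, size // 2)]  # assumption: breed from the top half

    # Crossover children: each gene is copied from one of two parents.
    for _ in range(n_cross):
        a, b = rng.sample(parents, 2)
        children.append([rng.choice(pair) for pair in zip(a, b)])

    # Mutation children: random changes applied to a single parent.
    while len(children) < size:
        parent = rng.choice(parents)
        children.append([1 - g if rng.random() < p_mut else g for g in parent])
    return children

if __name__ == "__main__":
    pop = [[0, 0, 0, 0], [1, 1, 1, 1], [1, 0, 1, 0],
           [0, 1, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
    for child in next_generation(pop, sum, rng=random.Random(1)):
        print(child)
```

Keeping the elite children first in the returned list makes it easy to check that the best fitness in the population cannot get worse between generations.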
  • 19. Paper Review: C. H. Ooi and P. Tan, “Genetic algorithms applied to multi-class prediction for the analysis of gene expression data,” Bioinformatics, vol. 19, no. 1, pp. 37–44, 2003.
    The Problem:
      • The expression dataset being analyzed involves multiple classes.
      • The efficient selection of good predictive gene groups from datasets that are inherently ‘noisy’.
      • The development of new methodologies that can enhance the successful classification of these complex datasets.
  • 20. Paper Review, Ooi and Tan (2003): Methods
      • GA is applied to the problem of multi-class prediction.
      • A GA-based gene selection scheme is employed to automatically:
          • Determine the members of a predictive gene group.
          • Determine the optimal group size.
          • Determine the classification success using a maximum likelihood (MLHD) classification method.
  • 21. Paper Review, Ooi and Tan (2003): Results
      • The authors state that the GA/MLHD-based approach achieves higher classification accuracies than other published predictive methods on the same multi-class test dataset.
      • The authors claim that GA/MLHD also permits substantial feature reduction in classifier gene sets without compromising predictive accuracy.
  • 22. Paper Review, Ooi and Tan (2003): Dataset and Data Preprocessing
      • The authors used the NCI60 gene expression dataset, which contains the gene expression profiles of 64 cancer cell lines as measured by cDNA microarrays containing 9703 spotted cDNA sequences.
      • The authors downloaded the data from http://genome-www.stanford.edu/sutech/download/nci60/dross_arrays_nci60.tgz.
      • During data preprocessing, the authors excluded spots with missing data, control spots, and empty spots, leaving 6167 genes.
  • 23. Paper Review, Ooi and Tan (2003): Overall Methodology
    The GA/MLHD classification strategy consists of two main components: (1) a GA-based gene selector and (2) a maximum likelihood (MLHD) classifier.
      • The actual classification process is performed using the maximum likelihood (MLHD) classifier.
      • Each individual in the population thus represents a specific gene predictor subset.
      • A fitness function is used to determine the classification accuracy of a predictor set.
  • 24. Paper Review, Ooi and Tan (2003): System and Methods
      • Initialization and Evaluation: An initial population is formed by creating N random strings, where the population size N is pre-specified.
      • Selection, Crossover and Mutation: Two selection methods were used to select the strings for the mating pool: (i) stochastic universal sampling (SUS) and (ii) roulette wheel selection (RWS).
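The two selection methods named on this slide can be sketched generically as follows. This is the textbook form of roulette wheel selection and stochastic universal sampling, not the authors' code. The function names and arguments are chosen for illustration, and the sketch assumes non-negative fitness values with a positive sum.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_wheel_selection(fitnesses, n, rng):
    """RWS: n independent spins; index i is picked with probability f_i / sum(f)."""
    cumulative = list(accumulate(fitnesses))
    total = cumulative[-1]
    picks = []
    for _ in range(n):
        r = rng.random() * total
        # The clamp guards against floating-point round-off at the upper edge.
        picks.append(min(bisect_right(cumulative, r), len(fitnesses) - 1))
    return picks

def stochastic_universal_sampling(fitnesses, n, rng):
    """SUS: one spin places n equally spaced pointers around the wheel."""
    cumulative = list(accumulate(fitnesses))
    step = cumulative[-1] / n
    start = rng.random() * step
    picks = []
    for k in range(n):
        pointer = start + k * step
        picks.append(min(bisect_right(cumulative, pointer), len(fitnesses) - 1))
    return picks

if __name__ == "__main__":
    rng = random.Random(0)
    fits = [1, 1, 2]
    print(roulette_wheel_selection(fits, 4, rng))
    print(stochastic_universal_sampling(fits, 4, rng))  # [0, 1, 2, 2]
```

With fitness values 1, 1, 2 and four picks, SUS always returns each individual exactly its expected number of times, while RWS only matches those proportions on average. That lower variance is the usual reason for preferring SUS.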
  • 25. Paper Review, Ooi and Tan (2003): System and Methods
      • Crossover: performed by randomly choosing a pair of strings from the mating pool and then applying a crossover operation to the selected string pair.
      • Uniform mutation: applied with probability p(m) to each of the offspring strings produced by crossover.
      • Termination: evaluation, selection, crossover and mutation are repeated for G generations, after which the string with the best fitness across all generations is output as the solution.
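A rough sketch of these two operators on strings of gene indices is given below. The slide does not say which crossover operation is used or how a mutated position is redrawn, so the one-point crossover, the per-position redraw, and the parameter names (`p_m`, `n_genes`) are assumptions made for illustration only.

```python
import random

def one_point_crossover(a, b, rng):
    """Swap the tails of two equal-length strings at a random cut point."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def uniform_mutation(string, p_m, n_genes, rng):
    """With probability p_m, replace a position by a random index in [0, n_genes)."""
    return [rng.randrange(n_genes) if rng.random() < p_m else g for g in string]

if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical predictor sets, each a list of gene indices.
    parent_a, parent_b = [12, 407, 33, 5890], [7, 2210, 451, 96]
    child_a, child_b = one_point_crossover(parent_a, parent_b, rng)
    print(uniform_mutation(child_a, 0.1, 6167, rng))
    print(uniform_mutation(child_b, 0.1, 6167, rng))
```

The value 6167 in the usage lines is the number of genes left after preprocessing, as reported on the dataset slide.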
  • 26. Paper Review, Ooi and Tan (2003): A maximum likelihood (MLHD) classifier
      • To build an MLHD classifier (James, 1985), a total of M(t) tumor samples are used as training samples. The remaining M(θ) tumor samples are used as test samples.
      • For the NCI60 dataset, the ratio between M(t) and M(θ) is 2:1.
      • Discriminant Function: The basis of the discriminant function is Bayes’ rule of maximum likelihood: assign the sample to the class with the highest conditional probability.
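The decision rule stated on this slide can be written compactly. The notation is an assumption introduced here rather than taken from the paper: \mathbf{e} is the expression vector of a sample, q indexes the Q classes, and \hat{q} is the assigned class.

```latex
\hat{q}(\mathbf{e}) = \arg\max_{q \in \{1,\dots,Q\}} P(q \mid \mathbf{e}),
\qquad
P(q \mid \mathbf{e}) =
  \frac{p(\mathbf{e} \mid q)\, P(q)}{\sum_{r=1}^{Q} p(\mathbf{e} \mid r)\, P(r)}
```

Since the denominator is the same for every class, maximizing P(q | e) amounts to maximizing p(e | q) P(q).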
  • 27. Paper Review, Ooi and Tan (2003): Comparing GA-based Predictor Sets to Predictor Sets Obtained from Other Methodologies
    “…The best predictor set obtained using the GA-based selection scheme exhibited a cross validation error rate of 14.63% and an independent test error rate of 5% (Table 1, row 1, and see Supplementary Information for specific misclassifications). This is an improvement in accuracy as compared to other methodologies assessed by Dudoit et al. (2000), where the lowest independent test error rate was reported as 19%...” Ooi and Tan (2003)
  • 28. Paper Review, Ooi and Tan (2003): Figure
    “…Comparison of expression profiles of predictor sets obtained through different methodologies. Columns represent different class distinctions, and only training set samples are depicted. (a) Expression profile of genes selected through the GA/MLHD method (only genes for the best predictor set are shown). (b) Expression profile of 20 genes selected through the BSS/WSS ratio ranking method. (c) Expression profile of 18 genes selected through the OVA/S2N ratio ranking method. Arrows depict genes which have highly correlated expression patterns across the sample classes. Classes are labeled as follows: BR (breast), CN (central nervous system), CL (colon), LE (leukemia), ME (melanoma), NS (non-small-cell lung carcinoma), OV (ovarian), RE (renal) and PR (reproductive system)…” Ooi and Tan (2003)
  • 29. Paper Review, Ooi and Tan (2003): Conclusion
      • The authors state that their report shows that highly accurate classification results can be obtained using a combination of GA-based gene selection and discriminant-based classification methods.
      • The authors note that the accuracy achieved (95% for NCI60) is better than that of other published methods employing the same dataset.
      • The authors note that other advantages of the GA-based approach are that it automatically determines the optimal predictor set size and delivers predictive accuracies that are comparable to other methods.
  • 30. Conclusion
      • Genetic algorithms tend to be outperformed by more situation-specific algorithms in simpler search spaces.
      • Genetic algorithms are not always the best choice, because their run time is long.
      • Genetic algorithms are good at creating high-quality solutions to a problem.