SlideShare a Scribd company logo
Amos Watentena
wateamos@gmail.com
 The term “Bioinformatics” was invented by Paulien
Hogeweg and Ben Hesper in 1970 as "the study of informatic
processes in biotic systems". Bioinformatics is the use of
information technology in biotechnology for the data storage,
data warehousing and analyzing the DNA sequences.
 In Bioinformatics knowledge of many branches are required
like biology, mathematics, computer science, laws of physics
& chemistry, and of course sound knowledge of information
technology to analyze biotech data.
 Given new DNA or protein sequence, biologist will want to
search databases of known sequences to look for anything
similar.
a. Sequence similarity can provide clues about function and
evolutionary relationships.
b. Databases such as GenBank are far too large to search
manually.
 But to search them efficiently, we need an algorithm because
we shouldn't expect exact matches (so it's not really like
Google):
i. Genomes aren't static: mutations, insertions, deletions.
ii. Human (and machine) error in reading sequencing gels.
 Bioinformatics intends to find new approaches to deal with the
volume and complexity of data.
 Providing researchers with better access to analysis and
computing tools in order to advance understanding of our
genetic legacy and its role in health and disease.
 There are many urgent open problems with lots of
opportunities to make fundamental contributions to the
society.
 Develop our creativity and problem-solving skills to the limit.
 Join a cross-disciplinary team and work with interesting
people.
 Participate in unlocking the mysteries of life itself so as to
make the world a better place to live.
 Bioinformatics is being used in following fields:
 Analysis and interpretation of various kinds of data including;
nucleotides and amino acid sequences, protein domains, and
protein structures.
 Development of new algorithms (mathematical formulas) and
statistics with which to assess relationships among members
of large data sets, such as methods to locate a gene within a
sequence, predict protein structure and/or function, and cluster
protein sequences into families of related sequences.
 Gene therapy; mutations are easily detected and quantified
through next-generation sequencing technology in a
heterogeneous sample thus a cost effective precision medicine,
“right drug at right dose to the right patient at the right time”
can be administered.
 Drug development; Bioinformatics explores the causes of
diseases at the molecular level, explains the phenomena of the
diseases on the gene/pathway level, makes use of computer
techniques such as data mining to analyze and interpret data
faster thus reducing the cost and time of drug discovery.
 Preventative medicine; gene identification by sequence
inspection and prediction of splice sites allows mutations to be
corrected easily. This is much used in the analysis of mutations
that cause cancer.
 Climate change Studies; Increasing levels of carbon dioxide
emission are thought to contribute to global climate change,
one way to decrease atmospheric carbon dioxide is to study
the genomes of microbes that use CO2 as their sole C source.
 Analysis of gene expression; DNA micro arrays allow
simultaneous measurement of the level of transcription for
every gene in a genome (gene expression) by:
 Tracking sample over a period of time to see gene expression
over time.
 Tracking two different samples under same conditions to see
difference in gene expressions.
 Computational evolutionary biology; Mouse provides insight
into human genetic disorder:
 Waardenburg’s syndrome is characterized by pigmentary
dysphasia.
 Disease gene linked to Chromosome 2, but not clear, where it
was located.
 “Splotch” mice:
 A breed of mice (with splotch gene) had similar symptoms.
 Scientists succeeded in identifying location of gene in mice.
 This gave clues as to where same gene is located in humans.
 Data driven biology;
 functional genomics
 comparative genomics
 systems biology
 Molecular medicine;
 identification of genetic components of various maladies
 diagnosis/prognosis from sequence/expression
 gene therapy
 Pharmacogenomics:
developing highly targeted drugs.
 Toxic genomics;
elucidating which genes are affected by various chemicals
 Waste cleanup
 Microbial genome applications
 Antibiotic resistance
 Alternative energy sources
 Crop improvement and development of resistant varieties
 Forensic analysis
 Bio-weapon creation
 Insect resistance
 Sequence analysis
 Literature analysis
Genome Where Year
H. Influenza TIGR 1995
E. Coli K -12 Wisconsin 1997
S. cerevisiae (yeast) International Collaboration 1997
C. elegans (worm) Washington U./Sanger 1998
Drosophila M. (fruit fly) multiple groups 2000
E. Coli 0157:H7 (pathogen) Wisconsin 2000
H. Sapiens (human) International Collaboration/ Celera 2001
Mus musculus (mouse) International Collaboration 2002
Rattus norvegicus (rat) International Collaboration 2004
 Over 300 publicly available databases pertaining to molecular
biology
 GenBank
• Has over 61 million sequence entries
• With over 65 billion bases
 UnitProtKB / Swis-Prot
• Has over 277 thousand protein sequence entries
• With over 100 million amino acids
 Protein Data Bank
• 45,632 protein (and related) structures
◦ Presence of repeats. Repeats are identical sequences that
occur in the genome in different locations and are often seen
in varying lengths and in the multiple copies. There are
several types of repeats: tandem repeats or interspersed
repeats. The read's originating from different copies of the
repeat appear identical to the assembler, causing errors in
the assembly.
◦ Contaminants in samples (e.g. from Bacteria or Human).
◦ PCR artifacts (e.g. Chimeras and Mutations)
◦ Sequencing errors, such as “Homopolymer” errors – when
e.g. 2+ run of same base.
◦ MID’s (multiplex indexes), primers/adapters still in the raw
reads.
◦ Polyploid genomes
 Biological redundancy and multiplicity
 Different sequences with similar structures
 Organism with similar genes
 Multiple functions of similar genes
 Grouping of genes in pathways
 Sequence redundancy in genomes
 Significance of relationships and similarities
 Signal vs. Noise
 Lack of data
 USA
NCBI (National Center for Biotechnology Information)
 Europe
EBI (European Bioinformatics Institute, Hinxton, UK)
 Japan
NIG (National Institute of Genetics)
Thank You

More Related Content

What's hot

databases in bioinformatics
databases in bioinformaticsdatabases in bioinformatics
databases in bioinformatics
nadeem akhter
 

What's hot (20)

Proteins databases
Proteins databasesProteins databases
Proteins databases
 
NCBI
NCBINCBI
NCBI
 
Biological database
Biological databaseBiological database
Biological database
 
Primary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyanaPrimary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyana
 
Introduction to NCBI
Introduction to NCBIIntroduction to NCBI
Introduction to NCBI
 
Genome Database Systems
Genome Database Systems Genome Database Systems
Genome Database Systems
 
Application of Bioinformatics in different fields of sciences
Application of Bioinformatics in different fields of sciencesApplication of Bioinformatics in different fields of sciences
Application of Bioinformatics in different fields of sciences
 
BLAST (Basic local alignment search Tool)
BLAST (Basic local alignment search Tool)BLAST (Basic local alignment search Tool)
BLAST (Basic local alignment search Tool)
 
Fasta
FastaFasta
Fasta
 
Introduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASEIntroduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASE
 
Structural databases
Structural databases Structural databases
Structural databases
 
databases in bioinformatics
databases in bioinformaticsdatabases in bioinformatics
databases in bioinformatics
 
Tools and database of NCBI
Tools and database of NCBITools and database of NCBI
Tools and database of NCBI
 
TrEMBL
TrEMBLTrEMBL
TrEMBL
 
Tools of bioinforformatics by kk
Tools of bioinforformatics by kkTools of bioinforformatics by kk
Tools of bioinforformatics by kk
 
Introduction to ncbi, embl, ddbj
Introduction to ncbi, embl, ddbjIntroduction to ncbi, embl, ddbj
Introduction to ncbi, embl, ddbj
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
PIR- Protein Information Resource
PIR- Protein Information ResourcePIR- Protein Information Resource
PIR- Protein Information Resource
 
BLAST
BLASTBLAST
BLAST
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
 

Similar to BIOINFORMATICS Applications And Challenges

Introducción a la bioinformatica
Introducción a la bioinformaticaIntroducción a la bioinformatica
Introducción a la bioinformatica
Martín Arrieta
 
bioinformatics simple
bioinformatics simple bioinformatics simple
bioinformatics simple
nadeem akhter
 
Comparative genomics and proteomics
Comparative genomics and proteomicsComparative genomics and proteomics
Comparative genomics and proteomics
Nikhil Aggarwal
 
Chapter 20 ppt
Chapter 20 pptChapter 20 ppt
Chapter 20 ppt
rehman2009
 

Similar to BIOINFORMATICS Applications And Challenges (20)

Bioinformatics .pptx
Bioinformatics .pptxBioinformatics .pptx
Bioinformatics .pptx
 
Introducción a la bioinformatica
Introducción a la bioinformaticaIntroducción a la bioinformatica
Introducción a la bioinformatica
 
rheumatoid arthritis
rheumatoid arthritisrheumatoid arthritis
rheumatoid arthritis
 
GENOMICS AND BIOINFORMATICS
GENOMICS AND BIOINFORMATICSGENOMICS AND BIOINFORMATICS
GENOMICS AND BIOINFORMATICS
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
bioinformatics simple
bioinformatics simple bioinformatics simple
bioinformatics simple
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Bioinformatics issues and challanges presentation at s p college
Bioinformatics  issues and challanges  presentation at s p collegeBioinformatics  issues and challanges  presentation at s p college
Bioinformatics issues and challanges presentation at s p college
 
TLSC Biotech 101 Noc 2010 (Moore)
TLSC Biotech 101 Noc 2010 (Moore)TLSC Biotech 101 Noc 2010 (Moore)
TLSC Biotech 101 Noc 2010 (Moore)
 
Gellibolian 2010 Audio Visual2
Gellibolian 2010 Audio Visual2Gellibolian 2010 Audio Visual2
Gellibolian 2010 Audio Visual2
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Comparative genomics and proteomics
Comparative genomics and proteomicsComparative genomics and proteomics
Comparative genomics and proteomics
 
Biotechnology
BiotechnologyBiotechnology
Biotechnology
 
Introduction to Bioinformatics
Introduction to BioinformaticsIntroduction to Bioinformatics
Introduction to Bioinformatics
 
An Introduction to Genomics
An Introduction to GenomicsAn Introduction to Genomics
An Introduction to Genomics
 
Bioinformatics in present and its future
Bioinformatics in present and its futureBioinformatics in present and its future
Bioinformatics in present and its future
 
Human genome project - Decoding the codes of life
Human genome project - Decoding the codes of lifeHuman genome project - Decoding the codes of life
Human genome project - Decoding the codes of life
 
Chapter 20 ppt
Chapter 20 pptChapter 20 ppt
Chapter 20 ppt
 
Una revisión de los conocimientos fundamentales de la biología de la célula. ...
Una revisión de los conocimientos fundamentales de la biología de la célula. ...Una revisión de los conocimientos fundamentales de la biología de la célula. ...
Una revisión de los conocimientos fundamentales de la biología de la célula. ...
 
Epigeneticsand methylation
Epigeneticsand methylationEpigeneticsand methylation
Epigeneticsand methylation
 

More from Amos Watentena

Insect collection, sorting, killing, labelling,preservation and cataloging in...
Insect collection, sorting, killing, labelling,preservation and cataloging in...Insect collection, sorting, killing, labelling,preservation and cataloging in...
Insect collection, sorting, killing, labelling,preservation and cataloging in...
Amos Watentena
 

More from Amos Watentena (17)

Insect collection, sorting, killing, labelling,preservation and cataloging in...
Insect collection, sorting, killing, labelling,preservation and cataloging in...Insect collection, sorting, killing, labelling,preservation and cataloging in...
Insect collection, sorting, killing, labelling,preservation and cataloging in...
 
Clinical Case 06-2019 by Slidesgo.pptx
Clinical Case 06-2019 by Slidesgo.pptxClinical Case 06-2019 by Slidesgo.pptx
Clinical Case 06-2019 by Slidesgo.pptx
 
Medical Thesis by Slidesgo.pptx
Medical Thesis by Slidesgo.pptxMedical Thesis by Slidesgo.pptx
Medical Thesis by Slidesgo.pptx
 
Detection Techniques of Insect Populations in Stored Grains
Detection Techniques of Insect Populations in Stored GrainsDetection Techniques of Insect Populations in Stored Grains
Detection Techniques of Insect Populations in Stored Grains
 
FIGHTING PESTS WITHOUT PESTICIDES
FIGHTING PESTS WITHOUT PESTICIDESFIGHTING PESTS WITHOUT PESTICIDES
FIGHTING PESTS WITHOUT PESTICIDES
 
Ecosystem and The Flow of Energy in an Ecosytem
Ecosystem and The Flow of Energy in an EcosytemEcosystem and The Flow of Energy in an Ecosytem
Ecosystem and The Flow of Energy in an Ecosytem
 
Biogeochemical Cycles and Human Activities
Biogeochemical Cycles and Human ActivitiesBiogeochemical Cycles and Human Activities
Biogeochemical Cycles and Human Activities
 
Beneficial Insects
Beneficial InsectsBeneficial Insects
Beneficial Insects
 
Insect Digestive System
Insect Digestive SystemInsect Digestive System
Insect Digestive System
 
Pests in Homes, Risks, Problems and Control
Pests in Homes, Risks, Problems and ControlPests in Homes, Risks, Problems and Control
Pests in Homes, Risks, Problems and Control
 
Control of 25 Household Pests (Pests of Medical Impotance)
Control of 25 Household Pests (Pests of Medical Impotance)Control of 25 Household Pests (Pests of Medical Impotance)
Control of 25 Household Pests (Pests of Medical Impotance)
 
Control of 25 Outdoor Pests (Pests of Agricultural Importance)
Control of 25 Outdoor Pests (Pests of Agricultural Importance)Control of 25 Outdoor Pests (Pests of Agricultural Importance)
Control of 25 Outdoor Pests (Pests of Agricultural Importance)
 
Biomes of an Ecosystem
Biomes of an EcosystemBiomes of an Ecosystem
Biomes of an Ecosystem
 
The world biomes
The world biomesThe world biomes
The world biomes
 
Ecological Succession
Ecological SuccessionEcological Succession
Ecological Succession
 
Ecosystem and Organism Interactions
Ecosystem and Organism InteractionsEcosystem and Organism Interactions
Ecosystem and Organism Interactions
 
Field Application of Insecticides
Field Application of InsecticidesField Application of Insecticides
Field Application of Insecticides
 

Recently uploaded

Seminar on Halal AGriculture and Fisheries.pptx
Seminar on Halal AGriculture and Fisheries.pptxSeminar on Halal AGriculture and Fisheries.pptx
Seminar on Halal AGriculture and Fisheries.pptx
RUDYLUMAPINET2
 
The importance of continents, oceans and plate tectonics for the evolution of...
The importance of continents, oceans and plate tectonics for the evolution of...The importance of continents, oceans and plate tectonics for the evolution of...
The importance of continents, oceans and plate tectonics for the evolution of...
Sérgio Sacani
 
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
PirithiRaju
 
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
PirithiRaju
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
YOGESH DOGRA
 
Detectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a TechnosignatureDetectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a Technosignature
Sérgio Sacani
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
AADYARAJPANDEY1
 

Recently uploaded (20)

Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
 
GBSN - Microbiology (Lab 2) Compound Microscope
GBSN - Microbiology (Lab 2) Compound MicroscopeGBSN - Microbiology (Lab 2) Compound Microscope
GBSN - Microbiology (Lab 2) Compound Microscope
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
 
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
 
Microbial Type Culture Collection (MTCC)
Microbial Type Culture Collection (MTCC)Microbial Type Culture Collection (MTCC)
Microbial Type Culture Collection (MTCC)
 
INSIGHT Partner Profile: Tampere University
INSIGHT Partner Profile: Tampere UniversityINSIGHT Partner Profile: Tampere University
INSIGHT Partner Profile: Tampere University
 
Seminar on Halal AGriculture and Fisheries.pptx
Seminar on Halal AGriculture and Fisheries.pptxSeminar on Halal AGriculture and Fisheries.pptx
Seminar on Halal AGriculture and Fisheries.pptx
 
The importance of continents, oceans and plate tectonics for the evolution of...
The importance of continents, oceans and plate tectonics for the evolution of...The importance of continents, oceans and plate tectonics for the evolution of...
The importance of continents, oceans and plate tectonics for the evolution of...
 
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
 
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
 
GBSN - Microbiology (Lab 1) Microbiology Lab Safety Procedures
GBSN -  Microbiology (Lab  1) Microbiology Lab Safety ProceduresGBSN -  Microbiology (Lab  1) Microbiology Lab Safety Procedures
GBSN - Microbiology (Lab 1) Microbiology Lab Safety Procedures
 
Transport in plants G1.pptx Cambridge IGCSE
Transport in plants G1.pptx Cambridge IGCSETransport in plants G1.pptx Cambridge IGCSE
Transport in plants G1.pptx Cambridge IGCSE
 
Shuaib Y-basedComprehensive mahmudj.pptx
Shuaib Y-basedComprehensive mahmudj.pptxShuaib Y-basedComprehensive mahmudj.pptx
Shuaib Y-basedComprehensive mahmudj.pptx
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
 
biotech-regenration of plants, pharmaceutical applications.pptx
biotech-regenration of plants, pharmaceutical applications.pptxbiotech-regenration of plants, pharmaceutical applications.pptx
biotech-regenration of plants, pharmaceutical applications.pptx
 
Detectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a TechnosignatureDetectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a Technosignature
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
 
Topography and sediments of the floor of the Bay of Bengal
Topography and sediments of the floor of the Bay of BengalTopography and sediments of the floor of the Bay of Bengal
Topography and sediments of the floor of the Bay of Bengal
 
Erythropoiesis- Dr.E. Muralinath-C Kalyan
Erythropoiesis- Dr.E. Muralinath-C KalyanErythropoiesis- Dr.E. Muralinath-C Kalyan
Erythropoiesis- Dr.E. Muralinath-C Kalyan
 

BIOINFORMATICS Applications And Challenges

  • 2.  The term “Bioinformatics” was invented by Paulien Hogeweg and Ben Hesper in 1970 as "the study of informatic processes in biotic systems". Bioinformatics is the use of information technology in biotechnology for the data storage, data warehousing and analyzing the DNA sequences.
  • 3.  In Bioinformatics knowledge of many branches are required like biology, mathematics, computer science, laws of physics & chemistry, and of course sound knowledge of information technology to analyze biotech data.
  • 4.  Given new DNA or protein sequence, biologist will want to search databases of known sequences to look for anything similar. a. Sequence similarity can provide clues about function and evolutionary relationships. b. Databases such as GenBank are far too large to search manually.  But to search them efficiently, we need an algorithm because we shouldn't expect exact matches (so it's not really like Google): i. Genomes aren't static: mutations, insertions, deletions. ii. Human (and machine) error in reading sequencing gels.
  • 5.  Bioinformatics intends to find new approaches to deal with the volume and complexity of data.  Providing researchers with better access to analysis and computing tools in order to advance understanding of our genetic legacy and its role in health and disease.
  • 6.  There are many urgent open problems with lots of opportunities to make fundamental contributions to the society.  Develop our creativity and problem-solving skills to the limit.  Join a cross-disciplinary team and work with interesting people.  Participate in unlocking the mysteries of life itself so as to make the world a better place to live.
  • 7.  Bioinformatics is being used in following fields:
  • 8.  Analysis and interpretation of various kinds of data including; nucleotides and amino acid sequences, protein domains, and protein structures.  Development of new algorithms (mathematical formulas) and statistics with which to assess relationships among members of large data sets, such as methods to locate a gene within a sequence, predict protein structure and/or function, and cluster protein sequences into families of related sequences.  Gene therapy; mutations are easily detected and quantified through next-generation sequencing technology in a heterogeneous sample thus a cost effective precision medicine, “right drug at right dose to the right patient at the right time” can be administered.
  • 9.  Drug development; Bioinformatics explores the causes of diseases at the molecular level, explains the phenomena of the diseases on the gene/pathway level, makes use of computer techniques such as data mining to analyze and interpret data faster thus reducing the cost and time of drug discovery.  Preventative medicine; gene identification by sequence inspection and prediction of splice sites allows mutations to be corrected easily. This is much used in the analysis of mutations that cause cancer.  Climate change Studies; Increasing levels of carbon dioxide emission are thought to contribute to global climate change, one way to decrease atmospheric carbon dioxide is to study the genomes of microbes that use CO2 as their sole C source.
  • 10.  Analysis of gene expression; DNA micro arrays allow simultaneous measurement of the level of transcription for every gene in a genome (gene expression) by:  Tracking sample over a period of time to see gene expression over time.  Tracking two different samples under same conditions to see difference in gene expressions.
  • 11.  Computational evolutionary biology; Mouse provides insight into human genetic disorder:  Waardenburg’s syndrome is characterized by pigmentary dysphasia.  Disease gene linked to Chromosome 2, but not clear, where it was located.  “Splotch” mice:  A breed of mice (with splotch gene) had similar symptoms.  Scientists succeeded in identifying location of gene in mice.  This gave clues as to where same gene is located in humans.
  • 12.  Data driven biology;  functional genomics  comparative genomics  systems biology  Molecular medicine;  identification of genetic components of various maladies  diagnosis/prognosis from sequence/expression  gene therapy
  • 13.  Pharmacogenomics: developing highly targeted drugs.  Toxic genomics; elucidating which genes are affected by various chemicals
  • 14.  Waste cleanup  Microbial genome applications  Antibiotic resistance  Alternative energy sources  Crop improvement and development of resistant varieties  Forensic analysis  Bio-weapon creation  Insect resistance  Sequence analysis  Literature analysis
  • 15. Genome Where Year H. Influenza TIGR 1995 E. Coli K -12 Wisconsin 1997 S. cerevisiae (yeast) International Collaboration 1997 C. elegans (worm) Washington U./Sanger 1998 Drosophila M. (fruit fly) multiple groups 2000 E. Coli 0157:H7 (pathogen) Wisconsin 2000 H. Sapiens (human) International Collaboration/ Celera 2001 Mus musculus (mouse) International Collaboration 2002 Rattus norvegicus (rat) International Collaboration 2004
  • 16.  Over 300 publicly available databases pertaining to molecular biology  GenBank • Has over 61 million sequence entries • With over 65 billion bases  UnitProtKB / Swis-Prot • Has over 277 thousand protein sequence entries • With over 100 million amino acids  Protein Data Bank • 45,632 protein (and related) structures
  • 17. ◦ Presence of repeats. Repeats are identical sequences that occur in the genome in different locations and are often seen in varying lengths and in the multiple copies. There are several types of repeats: tandem repeats or interspersed repeats. The read's originating from different copies of the repeat appear identical to the assembler, causing errors in the assembly. ◦ Contaminants in samples (e.g. from Bacteria or Human). ◦ PCR artifacts (e.g. Chimeras and Mutations) ◦ Sequencing errors, such as “Homopolymer” errors – when e.g. 2+ run of same base. ◦ MID’s (multiplex indexes), primers/adapters still in the raw reads. ◦ Polyploid genomes
  • 18.  Biological redundancy and multiplicity  Different sequences with similar structures  Organism with similar genes  Multiple functions of similar genes  Grouping of genes in pathways  Sequence redundancy in genomes  Significance of relationships and similarities  Signal vs. Noise  Lack of data
  • 19.  USA NCBI (National Center for Biotechnology Information)  Europe EBI (European Bioinformatics Institute, Hinxton, UK)  Japan NIG (National Institute of Genetics)