COMPARATIVE GENOMICS
GENOME
• The genome is the whole genetic complement of an organism
• Genomics is the study of all the genes present in an organism
WHAT IS THE GENOME?
• Cells contain chromosomes.
• Each chromosome carries hundreds to thousands of genes.
• All these genes are made of DNA.
• The four nucleotides (A, G, C, T) are arranged in different combinations to form different genes.
• DNA encodes thousands of proteins.
• DNA is the hereditary material.
DNA SEQUENCING
• The DNA of any organism is sequenced in parts.
• DNA sequencing uses both molecular biology and computational techniques.
• Molecular biology techniques such as recombinant DNA technology, PCR amplification and library construction are used.
• Ultimately, DNA libraries are constructed.
LIBRARIES
• Genomic libraries:
• libraries of the whole genome
• cDNA libraries:
• libraries of only the coding regions (genes that encode proteins)
SEQUENCED GENOMES
• Around the publication of the draft human genome sequence in 2001, the genomes of many other organisms were also sequenced
• The table below shows a few examples of organisms whose genomes have been sequenced

| Organism | Year | Reference |
|---|---|---|
| mouse | 2002 | Waterston et al. |
| rat | 2004 | Gibbs et al. |
| fruit fly | 2000 | Adams et al. |
| baker's yeast | 1996 | Goffeau et al. |
| chicken | 2004 | International Chicken Genome Sequencing Consortium |
| cow | 2009 | Elsik et al. |
| monkey (rhesus macaque) | 2007 | Gibbs et al. |
COMPARATIVE GENOMICS
• Comparative genomics is the field of biology in which the genomes of different organisms are compared using computational techniques
• Sequence alignment is the main principle on which comparative genomics is based
• Different techniques of sequence alignment are in use:
• Dot plot
• Dynamic programming
  • Smith-Waterman algorithm
  • Needleman-Wunsch algorithm
• Heuristics
  • FASTA
  • BLAST, etc.
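As an illustration of the dynamic programming approach named above, here is a minimal sketch of Needleman-Wunsch global alignment. The scoring values (match = 1, mismatch = -1, gap = -2) and the function name are assumptions chosen for the example, not values taken from the slides; real tools use substitution matrices and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b.

    Returns (score, aligned_a, aligned_b). Scoring values are
    illustrative assumptions, not values from the slides.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # substitution score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = needleman_wunsch("ACGT", "AGT")
print(score)
print(top)
print(bottom)
```

Smith-Waterman differs mainly in that cell scores are floored at zero and the traceback starts from the highest-scoring cell, which yields a local rather than a global alignment.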
COMPARATIVE GENOMICS
• By comparing the genomes of different organisms we can find:
• What is conserved between species
• What makes closely related species different
• We can also study evolutionary changes, gene function and inherited diseases
HOW TO COMPARE?
• There are three possible ways to compare genomes:
• Comparison of general features
• Finer-resolution comparison
• Comparison of discrete segments
GENERAL FEATURES
• General features of genomes are compared to find similarities and differences:
• Genome length
• Number of exons
• Number of genes
• Chromosome number
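The general features listed above can be tabulated side by side. The following is a minimal sketch, assuming each genome is available as a dictionary of chromosome sequences plus a dictionary of genes with their exon coordinates; both data structures, the function names, and the toy values are hypothetical and made up for illustration.

```python
def general_features(chromosomes, genes):
    """Summarise general genome features.

    chromosomes: dict mapping chromosome name -> sequence string
    genes: dict mapping gene id -> list of (start, end) exon coordinates
    (both structures are assumptions made for this sketch)
    """
    return {
        "genome_length": sum(len(seq) for seq in chromosomes.values()),
        "chromosome_number": len(chromosomes),
        "gene_number": len(genes),
        "exon_number": sum(len(exons) for exons in genes.values()),
    }


def compare_features(features_a, features_b):
    """Return {feature: (value_a, value_b, difference)} for two genomes."""
    return {
        name: (features_a[name], features_b[name],
               features_a[name] - features_b[name])
        for name in features_a
    }


# Toy genomes (invented data, far smaller than any real genome)
genome_a = general_features(
    {"chr1": "ACGTACGTAC", "chr2": "GGCCTA"},
    {"geneA": [(0, 3), (5, 9)], "geneB": [(1, 4)]},
)
genome_b = general_features(
    {"chr1": "ACGTACGT"},
    {"geneA": [(0, 3)]},
)
for name, row in compare_features(genome_a, genome_b).items():
    print(name, row)
```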
FINER RESOLUTION METHOD
In the finer-resolution method, genomes are compared by directly comparing the DNA sequences of different species. For instance, human chromosomes can be drawn with every segment that contains at least two genes whose order is conserved in the mouse genome shown as a colour block. Each colour corresponds to a particular mouse chromosome. Centromeres, subcentromeric heterochromatin of chromosomes 1, 9 and 16, and the repetitive short arms of 13, 14, 15, 21 and 22 are in black.
Figure: Conserved segments in the human and mouse genome
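The idea of a conserved segment (at least two genes whose order is preserved in the other genome) can be sketched as follows. This is a simplified illustration, assuming gene order along one human chromosome and a lookup of each gene's mouse chromosome and rank; the gene names, the `mouse_pos` structure, and the adjacency rule (ranks must differ by exactly one, in a consistent direction) are assumptions of this sketch, not the method used to draw the figure.

```python
def conserved_segments(human_order, mouse_pos, min_genes=2):
    """Find runs of consecutive human genes whose order is kept in mouse.

    human_order: gene names in order along one human chromosome
    mouse_pos: dict gene -> (mouse_chromosome, rank on that chromosome)
    Returns a list of (mouse_chromosome, [genes in the block]).
    """
    segments = []
    current = []    # genes in the block currently being extended
    direction = 0   # +1 or -1 once the block has an orientation, else 0

    for gene in human_order:
        extends = False
        if current and gene in mouse_pos:
            prev_chrom, prev_rank = mouse_pos[current[-1]]
            chrom, rank = mouse_pos[gene]
            step = rank - prev_rank
            # same mouse chromosome, adjacent rank, same orientation as before
            extends = (chrom == prev_chrom and abs(step) == 1
                       and direction in (0, step))
        if extends:
            current.append(gene)
            direction = step
        else:
            if len(current) >= min_genes:
                segments.append((mouse_pos[current[0]][0], current))
            current = [gene] if gene in mouse_pos else []
            direction = 0

    if len(current) >= min_genes:
        segments.append((mouse_pos[current[0]][0], current))
    return segments


# Hypothetical gene names and positions, for illustration only
human_order = ["g1", "g2", "g3", "g4", "g5", "g6"]
mouse_pos = {
    "g1": ("chr4", 10), "g2": ("chr4", 11), "g3": ("chr4", 12),
    "g4": ("chr11", 5), "g5": ("chr11", 4),
    "g6": ("chr2", 7),
}
print(conserved_segments(human_order, mouse_pos))
```

Each returned block would correspond to one colour block in a figure of this kind, with the colour given by the mouse chromosome.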
COMPARISON OF DISCRETE SEGMENTS
Discrete segments are compared by aligning homologous segments of sequence. For instance, a human gene (pyruvate kinase: PKLR) and the corresponding PKLR homologs from macaque, dog, mouse, chicken, and zebrafish are aligned. Macaque shows similarity in all regions: exons (blue), introns (red) and untranslated regions (light blue).
APPLICATIONS
• Comparative genomics is used to find:
• Evolutionary relationships
• Counterparts of genes
• New model organisms
EVOLUTIONARY RELATIONSHIP
• With the passage of time, changes take place in the hereditary material; some of these changes are useful and some have drastic effects
• To study these changes, evolutionary trees are generated on the basis of evolutionary distances
• Two types of trees are constructed:
• Rooted trees
• Unrooted trees
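One simple evolutionary distance is the p-distance, the proportion of aligned sites at which two sequences differ. The sketch below computes a pairwise p-distance table of the kind a distance-based tree method takes as input. The slides do not say which distance measure is used, so the choice of p-distance, the function names, and the toy sequences are assumptions.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(aligned):
    """Return {(name1, name2): p-distance} for every pair of sequences."""
    return {
        (s, t): p_distance(aligned[s], aligned[t])
        for s, t in combinations(sorted(aligned), 2)
    }


# Toy aligned sequences (invented for illustration)
aligned = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "TCGAACGA",
}
distances = distance_matrix(aligned)
closest = min(distances, key=distances.get)
print(distances)
print("closest pair:", closest)
```

A distance-based method would join the closest pair first and then repeat on the reduced table.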
ROOTED TREE
• A rooted tree is a tree that shows the common ancestor of all the target organisms

EXAMPLE
• Suppose that humans, gorillas and other apes all come under the mammals. The ancestral mammal is then the parent of all of them, so it is their common ancestor and sits at the root of the tree
UNROOTED TREE
• Unrooted trees show the relationships of the target organisms with each other but do not show the common ancestor of all of them

• EXAMPLE
• Suppose that Eagle, Sparrow, Crow and Dove are all related to each other in some way, without specifying their common ancestor
Figure: unrooted tree connecting Eagle, Crow, Sparrow and Dove
TERMINOLOGY
• Some of the terms used in studying phylogenetic trees are:

• Root:
• The point that represents the common ancestor of all target organisms

• Nodes:
• Branch points from which descendant branches and leaves originate

• Leaves:
• The terminal nodes, or the end children of the tree

• Clade:
• A subtree of a larger tree (an ancestor together with all of its descendants)
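These terms map directly onto a small data structure. The sketch below represents a rooted tree as nested tuples, where a string is a leaf and a tuple is an internal node; the outermost tuple is the root. The representation, the function names, and the grouping of the four birds are assumptions made for illustration and are not a claim about real bird phylogeny.

```python
def leaves(node):
    """Return the leaf names under a node, left to right."""
    if isinstance(node, str):       # a leaf is just its name
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def clades(node):
    """Yield the set of leaves under every internal node.

    Each such set is a clade: an ancestor plus all of its descendants.
    """
    if isinstance(node, str):
        return
    yield frozenset(leaves(node))
    for child in node:
        yield from clades(child)


# Root -> two internal nodes -> four leaves (hypothetical grouping)
tree = (("Eagle", "Crow"), ("Sparrow", "Dove"))

print(leaves(tree))
for clade in clades(tree):
    print(sorted(clade))
```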
STRUCTURE OF TREE
SOFTWARE
• The most commonly used software for constructing a phylogenetic tree is ClustalW
• Insert all the sequences into ClustalW in FASTA format
• Then submit the job

• Select the guide tree option to generate the tree
ClustalW2
Retrieve the sequences from NCBI, insert them into ClustalW2 and submit
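FASTA is a plain-text format in which each sequence starts with a header line beginning with `>` followed by the sequence on one or more lines. The helpers below are a minimal sketch for writing and reading that format before pasting sequences into an alignment tool; the function names, the 60-character line width, and the example records are assumptions.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        # wrap the sequence at a fixed line width
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"


def parse_fasta(text):
    """Parse FASTA text into a dict mapping header -> sequence."""
    parts = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {header: "".join(chunks) for header, chunks in parts.items()}


# Placeholder records (invented names and sequences)
text = to_fasta([("seq1", "ACGTACGT"), ("seq2", "GGCC")], width=4)
print(text)
print(parse_fasta(text))
```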
RESULTS.
COUNTERPARTS OF GENES
• Dramatic results have emerged from the
rapidly developing field of comparative
genomics
• Comparison of the fruit fly genome with the
human genome reveals that about sixty
percent of genes are conserved (Adams et al.
2000). That is, the two organisms appear to
share a core set of genes
• Researchers have also found that two-thirds
of human genes known to be involved in
cancer have counterparts in the fruit fly
CONT…

• Michigan Tech researchers Thomas Werner and Komal Kumar Bollepogu Raja have traced the black spots on the fruit fly abdomen to three specific genes in the fruit fly genome. These genes all have counterparts in human DNA, and all three counterparts happen to be genes that can cause cancer.
• "We are looking here at proto-oncogenes, which are cancer genes that cause disease when they are active in an uncontrolled manner. Both humans and flies have them, and in flies they learned to paint black spots on the abdomen."
MODEL ORGANISMS
• Comparative genomics is an exciting new field of biological research in which the genome sequences of different species (human, mouse and a wide variety of other organisms, from yeast to chimpanzees) are compared.
• By comparing the finished reference sequence of the human genome with the genomes of other organisms, researchers can identify regions of similarity and difference. This information can help scientists better understand the structure and function of human genes and thereby develop new strategies to combat human diseases.
• When scientists inserted a human gene associated with early-onset Parkinson's disease into fruit flies, the flies displayed symptoms similar to those seen in humans with the disorder, raising the possibility that the tiny insects could serve as a new model for testing therapies aimed at Parkinson's.
INTRODUCTION TO GENOMES WITH
ENSEMBL
DATABASE OF COMPARATIVE GENOMICS
• Ensembl is a database in which the genomes of different vertebrates and other eukaryotes are stored
• In Ensembl one can use BLAST and BLAT to search for similar sequences in other species
• Users can download sequences in FASTA format and can also view karyotypes
• One can find homologues, gene trees, and whole genome alignments across multiple species.
• Users can get information about DNA methylation, transcription factor binding sites, histone modifications, regulatory features such as enhancers and repressors, and microarray annotations.
ENSEMBL IS USED WORLDWIDE
Top users: UK, US, Canada, China, France, Germany, Italy, Japan, Spain
THE ENSEMBL GENOME BROWSER: MAKING IT INTERESTING
• Splice variants, proteins, non-coding RNA
• Small and large scale sequence variation, phenotype associations
• Whole genome alignments, protein trees
• Potential promoters and enhancers, DNA methylation
• User upload, custom data
Figure: the different species whose genomes are present in Ensembl (almost 69 species in total)
ENSEMBL FEATURES
• The gene set
• Comparative analysis
• Variation and regulation
• BioMart (data export)
• Display of external data (DAS)
• Programmatic access via the Perl API
• Open source
SEQUENCE DISPLAYS
• Gene: Sequence
• Transcript: cDNA
• Transcript: Exons
KARYOTYPE OF
HUMAN GENOME.
Chromosome Summary
Comparative Genomics
COMPARATIVE GENOMICS
• The comparative genomics section of Ensembl provides four options:
• Alignment image
• Alignment text
• Region comparison
• Synteny
ALIGNMENT IMAGE
• It shows the alignment as an image, indicating at which point each gene is similar to its counterpart
ALIGNMENT TEXT
• It gives the whole sequence in FASTA format, with exons highlighted
• It also provides the option of running an alignment
• The user can select the species against which to align the target sequence
• The results show all possible hits of the target sequence or region in that species
• The chromosome number and the position of the similar region on it are provided for every hit
• The details of a hit can be checked by selecting it
Figure: a human sequence aligned against Mus musculus
REGION COMPARISON
• It compares different kinds of regions, such as protein-coding regions, pseudogenes, RNA genes and processed transcripts.
SYNTENY
In classical genetics, synteny describes the physical co-localization of genetic loci on the same chromosome within an individual or species.
The Synteny option shows the similar regions as bands of different colours.
HUMAN VS MOUSE
• The coding regions of the human and mouse genomes show approximately 85% identity
• Comparisons of mRNA sequences of 1196 orthologous human and mouse gene pairs were recently reported (Makalowski et al. 1996), showing that coding regions tend to show approximately 85% identity at the nucleotide and protein levels
• In the study summarised on the following slides, a total of 117 orthologous gene pairs were identified and studied
EXON IDENTITY
• To compare the genomic structure of the gene pairs, the sequences were aligned with dynamic programming algorithms that employ both nucleotide similarity and codon similarity, using the PAM20 matrix (Dayhoff et al. 1978). The alignments were carefully inspected to ensure that they correctly aligned the exons.
• The number of exons was identical for 95% of the genes studied (111 of 117). There were six instances in which the number of exons differed.
CONT…
• In two cases, a single internal coding exon in mouse is reported to correspond to two internal coding exons in human. In the spermidine synthase gene, mouse exon 5 corresponds to human exons 5 and 6, with the total exonic lengths agreeing perfectly.
• In the lymphotoxin beta gene, mouse exon 2 corresponds to human exons 2 and 3. Interestingly, mouse exon 2 is 316 bp, while the sum of the lengths of human exon 2, intron 2 and exon 3 is only 301 bp.
EXON LENGTH
• The length of corresponding exons was strongly conserved. The lengths were identical in 73% of cases. The differences that did occur were quite small: the mean ratio of the larger to the smaller length was 1.05.
• Moreover, the differences were nearly always a multiple of three. The length difference was a multiple of three for 95% of all exons and 99% of all internal coding exons. This is readily understood in terms of evolutionary selection: a length change that is not a multiple of three shifts the reading frame of the downstream coding sequence, so such changes tend to be selected against.
• Only three instances were found in which corresponding internal exons had lengths differing by other than a multiple of three.
CONT…
• In the skeletal-muscle-specific myogenic gene, the respective lengths of exons 2 and 3 are 81 bp and 123 bp in the human and 82 bp and 122 bp in the mouse.
• In the gene encoding the Flt3 ligand, the respective lengths of exons 2 and 3 are 111 bp and 54 bp in the human and 122 bp and 46 bp in the mouse, while the respective lengths of exons 5 and 6 are 139 bp and 179 bp in the human and 144 bp and 189 bp in the mouse.
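The arithmetic behind these three exceptions can be checked directly from the lengths quoted above. The short sketch below computes, for each pair of adjacent exons, the per-exon length differences and the difference in their combined length, each reduced modulo three. The dictionary layout and function name are assumptions; the numbers are the ones given on the slide.

```python
# Exon lengths in bp as quoted on the slide: (first exon, second exon)
exon_lengths = {
    "myogenic gene, exons 2-3": {"human": (81, 123), "mouse": (82, 122)},
    "Flt3 ligand, exons 2-3": {"human": (111, 54), "mouse": (122, 46)},
    "Flt3 ligand, exons 5-6": {"human": (139, 179), "mouse": (144, 189)},
}


def summarize(pair):
    """Per-exon and combined length differences, with remainders mod 3."""
    human, mouse = pair["human"], pair["mouse"]
    diffs = [abs(h - m) for h, m in zip(human, mouse)]
    total_diff = abs(sum(human) - sum(mouse))
    return {
        "exon_diffs": diffs,
        "exon_diffs_mod3": [d % 3 for d in diffs],
        "total_diff": total_diff,
        "total_diff_mod3": total_diff % 3,
    }


for name, pair in exon_lengths.items():
    print(name, summarize(pair))
```

For these numbers, no individual exon difference is a multiple of three, while the combined lengths of each adjacent pair differ by 0, 3 and 15 bp respectively, all multiples of three. This is only an observation from the quoted figures, not a claim made on the slide.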
INTRON LENGTH
• While exon lengths tended to be well preserved, intron lengths varied considerably
• Human introns tended to be larger than mouse introns (68% of cases), but this could represent a selection bias: the less extensive sequencing of the mouse genome may lead to an underrepresentation of instances in which the mouse genomic locus is larger
SEQUENCE IDENTITY
• Coding regions showed strong sequence similarity, with approximately 85% sequence identity
• In contrast, introns showed only weak sequence similarity, with approximately 35% sequence identity, which is not much higher than the background rate of sequence identity in gapped alignments of random sequences.
SEQUENCE IDENTITY (CONT…)
• The degree of conservation varied considerably among genes. For example, the gene encoding the ribosomal protein S24 showed 88% identity at the DNA level and 100% identity at the amino acid level in coding exons, but only 27% identity at the DNA level in introns.
• In the tumor necrosis factor-beta gene, the first intron has 75% nucleotide identity and nearly perfect agreement in length (86 bp in human, 83 bp in mouse). Interestingly, the flanking exons are less well conserved, showing only 70% nucleotide identity and 60% amino acid identity.
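Percent identity figures like these are computed from an alignment. The sketch below shows one common convention, counting matches over the aligned columns in which neither sequence has a gap. The slides do not state which denominator the study used, so this convention, the function name, and the toy sequences are assumptions.

```python
def percent_identity(a, b):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are excluded from
    both the match count and the denominator (an assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (invented): one mismatch in ten ungapped columns
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Applying the same function separately to the exon and intron portions of an alignment gives per-region figures of the kind quoted above.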
