Genomics is the study and application of genetic mapping, sequencing, and bioinformatics to analyze genomes. It includes structural genomics, which maps genomes; functional genomics, which analyzes gene function; and comparative genomics, which compares genomes across species. Comparative genomics allows insights from model organisms to be applied to other species by identifying commonalities and differences between genomes. High-throughput bioinformatics tools can be used to analyze bacterial and fungal genomes from public databases to identify potential drug targets.
COMPARATIVE GENOMICS.ppt
1. Genomics-what is it?
Development and application of genetic mapping,
sequencing, and computation (bioinformatics) to
analyze the genomes of organisms.
Sub-fields of genomics:
1. Structural genomics-genetic and physical
mapping of genomes.
2. Functional genomics-analysis of gene function
(and non-genes).
3. Comparative genomics-comparison of genomes
across species.
Includes structural and functional genomics.
Evolutionary genomics.
3. Definition
A comparison of gene numbers,
gene locations & biological functions
of genes, in the genomes of different
organisms, one objective being to
identify groups of genes that play a
unique biological role in a particular
organism.
4. Few Terminologies
Homology: Relationship of any two
characters (such as two proteins that have
similar sequences) that have descended,
usually through divergence, from a common
ancestral character.
Homologues are thus components or
characters (such as genes/proteins with
similar sequences) that can be attributed to a
common ancestor of the two organisms
during evolution.
5. Homologues can be…
Orthologues are homologues that have evolved
from a common ancestral gene by speciation.
They usually have similar functions.
Paralogues are homologues that are related or
produced by duplication within a genome
followed by subsequent divergence. They often
have different functions.
Xenologues are homologues that are related by
an interspecies (horizontal) transfer of the
genetic material for one of the homologues. The
functions of the xenologues are quite often
similar.
7. Comparative Genomics
Two very large problems are immediately apparent in
undertaking the sequencing of entire genomes.
First, the vast number of species and the much larger size of
some genomes make the complete sequencing of all genomes a
non-optimal approach for understanding genome structure.
Second, within a given species most individuals are genetically
distinct in a number of ways. What does it actually mean, for
example, to "sequence a human genome"? The genomes of two
individuals who are genetically distinct differ with respect to DNA
sequence by definition.
These two problems, and the potential for other novel
applications, have given rise to new approaches which, taken
together, constitute the field of comparative genomics.
8. All modern genomes have arisen from common ancestral
genomes, so the relationships between genomes can be studied
with this fact in mind. This commonality means that information
gained in one organism can have application in other, even
distantly related, organisms.
Comparative genomics enables the application of information
gained from facile model systems to agricultural and medical
problems. The nature and significance of differences between
genomes also provides a powerful tool for determining the
relationship between genotype and phenotype through
comparative genomics and morphological and physiological
studies.
9. The Role of Bioinformatics in Identification of Drug Targets from
Bacterial and Fungal Genomes
Dr. Andrew E. DePristo, Director of Bioinformatics, Genome Therapeutics
Corporation
Bacterial genomes are appearing at an ever-increasing rate,
with a September 1999 listing by NCBI indicating 16
completed, 10 being annotated, and 55 being sequenced.
Fungal genomes and proteomes are less prevalent with one
complete, a few nearly complete, and large collections of
cDNA sequences available for about five organisms. This
presentation will discuss use of this bacterial and fungal
genomic diversity, along with high-throughput bioinformatics
tools, to attach confidence to certain functional predictions and
to allow identification and targeting of essential genes that are
unique to specific organisms.
10. Methods (WET)
Introduction
A DNA walk of a genome represents how the frequency of
each nucleotide of a base pair (G/C or T/A) changes
locally. This analysis measures the local
distribution of Gs within the GC content and of Ts within the
TA content. Lobry was the first to propose this analysis
(1996, 1999). Two complementary representations can be
derived from the DNA walk: the cumulative TA- and the
GC-skew analysis.
Aim: By reading this description of the algorithm, a
reader not trained in genomics should be able to redraw our
graphs, using the basic genometric data file that is posted
on our web resource for each organism as a zip file (.zip).
11. 1) DNA walk
1.1) Drawing a DNA walk by reading a sequence file
nucleotide by nucleotide.
A simple algorithm is used to draw a DNA walk by simply
assigning a direction to each nucleotide. We propose the
following assignment, slightly different from Lobry's: to T, C,
A, and G correspond the E(ast), S(outh), W(est), and
N(orth) directions, respectively (Lobry, 1999). Reading the
nucleotide sequence nucleotide by nucleotide, and following
the rule, a path clearly emerges on the graph: Figure 1.
Figure 1: DNA walk of the sequence
GTCTGGTGTCTGGAGTTCCTGGGTCTTGAGACCACAGGACC
CACCAGGGACCCAGGACCC
Starting from the bottom left (bold blue line), the curve ends at the bottom left (pink line).
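The rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the function name `dna_walk`, the choice of the origin as starting point, and the decision to leave the pointer in place for any symbol other than T, C, A, or G are assumptions made here.

```python
# Direction assigned to each nucleotide, as stated on the slide:
# T -> East, C -> South, A -> West, G -> North.
STEP = {"T": (1, 0), "C": (0, -1), "A": (-1, 0), "G": (0, 1)}

def dna_walk(sequence):
    """Return the list of (x, y) points visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for base in sequence.upper():
        # Assumption: unknown symbols (e.g. N) do not move the pointer.
        dx, dy = STEP.get(base, (0, 0))
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

# The sequence shown in Figure 1.
seq = "GTCTGGTGTCTGGAGTTCCTGGGTCTTGAGACCACAGGACCCACCAGGGACCCAGGACCC"
path = dna_walk(seq)
print(path[:5])   # first few points of the walk
print(path[-1])   # end point: (count of T - count of A, count of G - count of C)
```

The list of points can be passed to any plotting tool to redraw the curve; the end point depends only on the nucleotide counts, which gives a quick sanity check.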
12. 1.2) Drawing a DNA walk by slicing a sequence file
into small windows
Lobry (1996) suggests a simple way to draw this kind of graph
quickly: cut the genome into windows of equal length.
Figure 2: DNA walk of the same sequence as the one presented in Figure 1:
GTCTGGTGTCTGGAGTTCCT
GGGTCTTGAGACCACAGGA
CCCACCAGGGACCCAGGAC
CC
The sequence was sliced into 5-nucleotide windows. Only the fifth nucleotide per
window is plotted. We can also work with the mean values of the window…
Comment: this method is not as precise as the first one, but it can be
used with spreadsheet software without affecting the final resolution of
the curve at the genome level.
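A possible sketch of the windowed variant follows. It keeps one point per window (the pointer position after the last nucleotide of the window) and uses the same T/C/A/G to E/S/W/N assignment as section 1.1. The function name `windowed_dna_walk` and the choice to ignore a trailing partial window are assumptions for illustration.

```python
def windowed_dna_walk(sequence, window=5):
    """Return one (x, y) point per complete window of `window` nucleotides."""
    sequence = sequence.upper()
    x, y = 0, 0
    points = [(x, y)]
    # Assumption: only complete windows are used; a trailing partial window is ignored.
    for start in range(0, len(sequence) - window + 1, window):
        chunk = sequence[start:start + window]
        # Net eastward movement is T minus A, net northward movement is G minus C.
        x += chunk.count("T") - chunk.count("A")
        y += chunk.count("G") - chunk.count("C")
        points.append((x, y))
    return points

seq = "GTCTGGTGTCTGGAGTTCCTGGGTCTTGAGACCACAGGACCCACCAGGGACCCAGGACCC"
print(windowed_dna_walk(seq, window=5))
```

Because each window contributes only its net counts, the same calculation fits in a spreadsheet with one row per window.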
13. 2) The cumulative TA- and the GC-skew analyses.
2.1) Drawing a cumulative TA- or a GC-skew analysis by reading a
sequence file nucleotide by nucleotide.
Cumulative TA-skew analysis: Assign to each nucleotide
the following direction: to A, T, C, and G correspond the S, N,
nd (no direction), and nd directions, respectively. On the
graph, after the reading of one nucleotide, the pointer has to
go one step eastward. If an A or a T is read, a further step is
added, southward or northward, respectively.
14. Cumulative GC-skew analysis: Assign to each
nucleotide the following direction: to A, T, C, and G
correspond the nd, nd, S, and N directions, respectively.
On the graph, after reading one nucleotide, the pointer
has to move one step eastward. If a C or a G is read, a
further step is added, southward or northward,
respectively.
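Both cumulative skew rules (TA and GC) share the same shape: one step east per nucleotide, plus one step north or south for the two nucleotides of interest. A minimal Python sketch under that reading is given below; the function names and the choice to record one point after each nucleotide are assumptions made for illustration.

```python
def cumulative_skew(sequence, up, down):
    """Return (x, y) points: x advances by one per nucleotide,
    y rises by one on `up`, falls by one on `down`, and is unchanged otherwise."""
    y = 0
    points = [(0, 0)]
    for x, base in enumerate(sequence.upper(), start=1):
        if base == up:
            y += 1
        elif base == down:
            y -= 1
        points.append((x, y))
    return points

def cumulative_ta_skew(sequence):
    # T -> North, A -> South; C and G have no direction.
    return cumulative_skew(sequence, up="T", down="A")

def cumulative_gc_skew(sequence):
    # G -> North, C -> South; A and T have no direction.
    return cumulative_skew(sequence, up="G", down="C")

print(cumulative_ta_skew("ATTG"))
print(cumulative_gc_skew("GGCA"))
```

Plotting y against x gives the cumulative skew curve for the chosen nucleotide pair.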
16. Computational analysis in drug
target discovery
Shannon entropy is a measure of variation
or change over a time series. Genes that
exhibit significant changes are regarded
as good target candidates.
Clustering is a method for grouping
patterns by similarities in their shapes.
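As a rough illustration of the Shannon entropy idea, the sketch below scores a single time series by binning its values and computing H = -sum(p * log2(p)) over the bin frequencies. The slide does not specify how values are discretized, so the equal-width binning, the default of three bins, and the function name `shannon_entropy` are all assumptions; the example values are made up.

```python
import math
from collections import Counter

def shannon_entropy(series, n_bins=3):
    """Shannon entropy (in bits) of a non-empty numeric series after equal-width binning."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return 0.0  # a constant series shows no variation
    width = (hi - lo) / n_bins
    # Map each value to a bin index; the maximum value goes into the last bin.
    bins = [min(int((v - lo) / width), n_bins - 1) for v in series]
    n = len(series)
    return -sum((c / n) * math.log2(c / n) for c in Counter(bins).values())

# Hypothetical time series for two genes (illustrative numbers only).
flat_gene = [5.0, 5.0, 5.0, 5.0, 5.0, 5.0]
changing_gene = [1.0, 4.0, 9.0, 2.0, 8.0, 5.0]
print(shannon_entropy(flat_gene))
print(shannon_entropy(changing_gene))
```

Under these assumptions, a series that never changes scores zero, while a series spread across several levels scores higher, which matches the slide's notion that strongly changing genes are candidate targets.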
19. GCG History (tools)
Founded in 1982 as a service of the Department of
Genetics at the University of Wisconsin, GCG became a
private company in 1990 and was acquired by Oxford
Molecular Group in 1997. The company was one of the
pioneers of bioinformatics and its Wisconsin Package
sequence analysis tools are widely used and well
regarded throughout the pharmaceutical and
biotechnology industries and in academia. To support
enterprise bioinformatics efforts, GCG developed
SeqStore, its Oracle-based data management system.
Desktop solutions are delivered to bench scientists
through products such as MacVector and OMIGA.
20. GCG Wisconsin Package
Molecular biologists worldwide use the GCG® Wisconsin
Package® as their software of choice for comprehensive
sequence analysis. The Wisconsin Package meets research
needs across disciplines, project teams, and labs to
provide an enterprise-wide solution. Based on published
algorithms from the fields of mathematical and
computational biology, the Package includes tools for:
Comparison
Database Searching and Retrieval
DNA/RNA Secondary Structure
Editing and Publication
Evolution
Fragment Assembly
Gene Finding and Pattern Recognition
Importing and Exporting
Mapping
Primer Selection
Protein Analysis
Translation
21. PAUP* version 4.0 is a major upgrade and new release of
the software package for inference of evolutionary trees, for
use in Macintosh, Windows, UNIX/VMS, or DOS-based
formats. The influence of high-speed computer analysis of
molecular, morphological and/or behavioral data to infer
phylogenetic relationships has expanded well beyond its
central role in evolutionary biology, now encompassing
applications in areas as diverse as conservation biology,
ecology, and forensic studies. The success of previous
versions of PAUP: Phylogenetic Analysis Using Parsimony
has made it the most widely used software package for the
inference of evolutionary trees.
22. Outcomes/ Benefits
Provides “first pass” information on the function
of the putative protein based on the existence of
conserved protein sequence motifs.
Advancements in computer software
technologies (bioinformatics) have made
comparative analysis of genomes an extremely
powerful approach for functional genomics too.
These studies can also reveal insights into the
recruitment of enzymes in a pathway.
23. Outcomes/ Benefits
It will help us to understand the genetic basis of
diversity in organisms, both speciation &
variation, events that are important aspects of
evolutionary biology.
Comparative genomics provides a powerful way
in which to analyze sequence data.
Indeed, there is already a long list of 'model'
organisms, which allow comparative analyses in
a variety of ways.
24. Outcomes/ Benefits
The very small vertebrate genome of the
pufferfish provides a simple and economical way
of comparing sequence data from mammals and
fish, representing a large evolutionary
divergence and so permitting the identification of
essential elements that are still present in both
species.
These elements include genes and the
associated machinery that controls their
expression; elements that, in many cases, have
survived the test of time.