Bioinformatics and Functional Genomics
Outline
• Bioinformatics
• Functional Genomics
• Sequence-based tools
• Microarray-based tools
• Gene Ontology
• Systems biology approach to bioinformatics and functional genomics
What is Bioinformatics?
Bioinformatics is the conceptualization of biology in terms of molecules (in the sense of physical chemistry), followed by the application of "informatics" techniques (derived from disciplines such as applied mathematics, computer science, and statistics) to understand and organize the information associated with these molecules on a large scale.
Bioinformatics
• The development of new algorithms and statistics with which to assess relationships among members of large data sets.
• The analysis and interpretation of various types of data, including nucleotide and amino acid sequences, protein domains, and protein structures.
• The development and implementation of tools that enable efficient access to and management of different types of information.
Functional Genomics
• A field of molecular biology that uses the vast wealth of data produced by genomic projects (such as genome sequencing projects) to describe gene (and protein) functions and interactions.
Bioinformatics and Functional Genomics
• Focus on gene function
◦ At the genome level
◦ Using high-throughput methods
• Conducted using
◦ Sequence-based tools
◦ Microarray-based tools
Bioinformatics and Functional Genomics
• Because these techniques produce large quantities of data, and the goal is to find biologically meaningful patterns in them, bioinformatics is crucial for the analysis of functional genomics data.
Sequence-based tools
• DAVID: the Database for Annotation, Visualization and Integrated Discovery
• DN: DigiNorthern
DigiNorthern
• DigiNorthern (DN) is a web-based tool that displays virtual expression profiles of query genes based on EST sequences.
• In addition, digital expression data are available for each UniGene cluster through a precomputed data set based on SAGE and/or ESTs.
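The idea of a digital expression profile built from EST counts can be illustrated with a minimal Python sketch. This is not DigiNorthern's actual implementation; the function name `digital_expression`, the tissue names, and the library sizes are hypothetical, and the normalization (ESTs per million sequenced in each library) is one common convention assumed here for illustration.

```python
from collections import Counter


def digital_expression(est_hits, library_sizes):
    """Return ESTs-per-million for one gene in each tissue library.

    est_hits: list of tissue names, one entry per EST assigned to the gene.
    library_sizes: dict mapping tissue name -> total ESTs sequenced from it.
    """
    counts = Counter(est_hits)
    return {
        tissue: counts.get(tissue, 0) * 1_000_000 / total
        for tissue, total in library_sizes.items()
    }


# Hypothetical toy data: four ESTs matching the query gene.
est_hits = ["liver", "liver", "brain", "liver"]
library_sizes = {"liver": 150_000, "brain": 200_000, "muscle": 100_000}

profile = digital_expression(est_hits, library_sizes)
for tissue, value in profile.items():
    print(f"{tissue}: {value:.1f} ESTs per million")
```

Normalizing by library size matters because EST libraries differ greatly in sequencing depth, so raw counts alone are not comparable across tissues.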
Microarray-based tools
• Gene Set Enrichment Analysis (GSEA)
• GSEA considers experiments with genome-wide expression profiles from samples belonging to two classes, labeled 1 or 2. Genes are ranked by the correlation between their expression and the class distinction.
• Gene Expression Omnibus (GEO)
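The ranking step described for GSEA above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a plain Pearson correlation against a 0/1 encoding of the class labels, whereas the GSEA software offers several ranking metrics; the gene names and expression values are invented toy data, and `rank_genes` is a hypothetical helper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def rank_genes(expression, labels):
    """Rank genes from most class-1-associated to most class-2-associated."""
    # Encode class 1 as 1.0 and class 2 as 0.0, so a positive score
    # means higher expression in class 1.
    encoded = [1.0 if c == 1 else 0.0 for c in labels]
    scores = {gene: pearson(values, encoded) for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: six samples, three per class.
labels = [1, 1, 1, 2, 2, 2]
expression = {
    "geneA": [9.0, 8.5, 9.2, 2.0, 2.5, 1.8],  # high in class 1
    "geneB": [3.0, 3.1, 2.9, 3.0, 3.2, 2.8],  # no class difference
    "geneC": [1.0, 1.5, 0.8, 7.0, 6.5, 7.2],  # high in class 2
}

ranking = rank_genes(expression, labels)
for gene, score in ranking:
    print(f"{gene}: {score:+.3f}")
```

The resulting ordered list is the input GSEA works from: a gene set is then tested for whether its members cluster toward the top or bottom of the ranking.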
Gene Ontology
• The Gene Ontology, or GO, is a major bioinformatics initiative to unify the representation of gene and gene product attributes across all species. More specifically, the project aims to:
◦ Maintain and develop its controlled vocabulary of gene and gene product attributes;
◦ Annotate genes and gene products, and assimilate and disseminate annotation data.
Gene Ontology
• Gene Ontology-based enrichment analysis is provided by DAVID and Gene Set Enrichment Analysis (GSEA).
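To make the notion of GO term enrichment concrete, here is a minimal Python sketch of a one-sided hypergeometric test, a standard way to ask whether a GO term is over-represented in a gene list. It is a generic illustration, not the statistic used by DAVID or GSEA, which implement their own scoring methods; the function name `hypergeom_enrichment_p` and the toy numbers are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, term_genes, list_size, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    total_genes: number of genes in the background.
    term_genes: background genes annotated with the GO term.
    list_size: number of genes in the query list.
    overlap: query-list genes annotated with the GO term.
    """
    upper = min(term_genes, list_size)
    tail = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total_genes, list_size)


# Toy example: 20 background genes, 5 carry the term,
# and 3 of the 4 genes in the query list carry it.
p_value = hypergeom_enrichment_p(total_genes=20, term_genes=5, list_size=4, overlap=3)
print(f"enrichment p-value: {p_value:.4f}")
```

In a real analysis this test is repeated for many GO terms, so the resulting p-values need a multiple-testing correction before interpretation.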
A "systems biology" approach to bioinformatics and functional genomics in complex human diseases: arthritis
• Annotated genome sequences from humans and other organisms have facilitated the generation of vast amounts of correlative data from human and animal genetics and from normal and disease-affected tissues in complex diseases such as arthritis, using gene/protein chips and SNP analysis.
• These data sets include genes and proteins whose functions are partially known at the cellular level or completely unknown (e.g. ESTs).


Editor's Notes

  • #9 An expressed sequence tag (EST) is a short sub-sequence of a cDNA sequence. ESTs may be used to identify gene transcripts and are instrumental in gene discovery and gene sequence determination.