By:
Vaibhav Kumar Maurya
Genomics is the molecular characterization of whole genomes.
Structural genomics characterizes the physical nature of whole genomes.
It includes the genetic mapping, physical mapping and sequencing of whole genomes.
It also describes the 3D structure of every protein encoded by a given genome.
• Structural genomics attempts to determine the structure of every protein encoded by the genome.
• Traditional structure prediction, by contrast, focuses on one particular protein.
• Economy of scale.
• The scientific community gets immediate access to new structures, as well as to reagents such as clones and proteins.
• Many of the protein structures are of unknown function and do not have corresponding publications.
• This requires new ways of communicating the structural information to the broader research community.
• To identify novel protein folds: done by ab initio modeling.
• Protein 3D structure determination:
- Knowledge of the 3D structure is important for understanding the function of a protein.
- It is also used in drug discovery and protein engineering.
• Full structural coverage of a simple model organism.
• Structural genomics takes advantage of completed genome sequences in order to determine protein structures.
• The gene sequence of the target protein can be compared to a known sequence, and structural information can then be inferred from the known protein's structure.
• Structural genomics can be used to predict novel protein folds based on other structural data.
• It also uses modeling-based approaches that rely on homology between the unknown protein and a solved protein structure.
A completed genome sequence allows every ORF (open reading frame) to be cloned and expressed as protein.
The proteins are then purified and crystallised.
They are then subjected to structure determination by X-ray crystallography or NMR.
The whole genome sequence allows the design of every primer required to amplify all the ORFs, clone them into bacteria and express them.
This whole-genome approach allows for structure determination of every protein that is encoded by the genome.
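The ORF and primer step above can be sketched in a few lines of Python. This is only a toy illustration: the names `find_orfs` and `design_primers`, the example sequence, and the very short primer length are all assumptions made for the example, and real primer design also considers melting temperature, restriction sites and the reverse strand.

```python
# Toy sketch: locate forward-strand ORFs and derive naive primers for each.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_orfs(dna, min_codons=3):
    """Return (start, end) slices of ATG...stop ORFs in the three forward frames."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # include the stop codon
                start = None
    return sorted(orfs)


def design_primers(dna, orf, length=20):
    """Forward primer = ORF start; reverse primer = reverse complement of ORF end."""
    start, end = orf
    forward = dna[start:start + length]
    reverse = reverse_complement(dna[end - length:end])
    return forward, reverse


# Hypothetical mini "genome" used only for illustration.
genome = "CCATGAAATTTGGGTAACCATGCCCTGA"
orfs = find_orfs(genome)
primers = [design_primers(genome, orf, length=6) for orf in orfs]
```

Running every ORF of a genome through such a loop is what makes the whole-genome approach systematic: the primer list follows directly from the sequence.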
1. AB-INITIO METHOD:
This approach uses protein sequence data and the chemical and physical interactions of the encoded amino acids to predict the 3D structure of proteins with no homology to solved protein structures.
Rosetta program: a highly successful method.
It divides the protein into short segments and arranges each short polypeptide chain into a low-energy local conformation.
Rosetta is available for commercial use, and for non-commercial use through its public server, Robetta.
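The idea of dividing a protein into short segments and keeping a low-energy conformation for each can be illustrated with a toy sketch. This is not Rosetta's code or API: the function names, the window size of 9 residues, the example sequence and the energy values are all assumptions chosen for illustration.

```python
def split_into_fragments(sequence, size=9):
    """Return every overlapping window of `size` residues with its start index."""
    return [(i, sequence[i:i + size]) for i in range(len(sequence) - size + 1)]


def lowest_energy_conformation(candidates):
    """Pick the conformation label with the lowest (made-up) energy score."""
    return min(candidates, key=candidates.get)


# Hypothetical protein sequence and hypothetical per-fragment energies.
protein = "MKTAYIAKQRQISFVK"
fragments = split_into_fragments(protein)
candidate_energies = {"helix": -3.2, "strand": -1.1, "coil": 0.4}
best = lowest_energy_conformation(candidate_energies)
```

A real ab initio method would score each candidate with a physical or knowledge-based energy function and then assemble the fragments into a full chain; the sketch only shows the split-and-select step.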
2. SEQUENCE-BASED MODELING
• This method compares the gene sequence of an unknown protein with the sequences of proteins with known structures.
• Depending on the degree of similarity between the sequences, the structure of the known protein can be used as a model for solving the structure of the unknown protein.
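A minimal sketch of the similarity check described above, assuming the sequences have already been aligned (gaps written as "-"). The function names, the example sequences and the 30% identity cutoff are assumptions for illustration; the cutoff is a commonly quoted rule of thumb, not a value given in this presentation.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over the aligned, non-gap positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def pick_template(target, templates, threshold=30.0):
    """Return (name, identity) of the best template, or (None, identity) if too dissimilar."""
    best_name, best_score = None, 0.0
    for name, aligned in templates.items():
        score = percent_identity(target, aligned)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Hypothetical target and pre-aligned template sequences.
target = "MKTAYIAKQR"
templates = {"template_A": "MKTAYLAKQR", "template_B": "MRS-FLGHQA"}
choice = pick_template(target, templates)
```

When no template clears the cutoff, the function returns `None`, which corresponds to the case where threading or ab initio modeling would be tried instead.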
3. THREADING
• It is based on fold similarities rather than sequence identity.
• This method is used to identify distantly related proteins and can be used to infer molecular functions.
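Threading can be pictured as scoring how well a sequence fits the structural environments of a known fold, independent of sequence identity. The sketch below is a deliberately simplified assumption: each fold is reduced to a string of buried ("B") and exposed ("E") positions, and the scoring rule, residue set, fold names and example sequence are all hypothetical.

```python
# Residues treated as hydrophobic in this toy model (an assumption).
HYDROPHOBIC = set("AVILMFWC")


def threading_score(sequence, environments):
    """+1 when a residue suits its position (hydrophobic in buried, polar in exposed), else -1."""
    if len(sequence) != len(environments):
        raise ValueError("sequence and fold must have equal length")
    score = 0
    for residue, env in zip(sequence, environments):
        is_hydrophobic = residue in HYDROPHOBIC
        fits = (env == "B") == is_hydrophobic
        score += 1 if fits else -1
    return score


def best_fold(sequence, folds):
    """Return the name of the fold whose environments best match the sequence."""
    return max(folds, key=lambda name: threading_score(sequence, folds[name]))


# Hypothetical query sequence and fold library.
query = "LKVEIDAK"
folds = {"fold_alternating": "BEBEBEBE", "fold_core": "EEBBBBEE"}
winner = best_fold(query, folds)
```

Real threading programs use much richer environment descriptions and pairwise potentials, and they allow gaps; the sketch only shows why two proteins with little sequence identity can still be matched to the same fold.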
1. Mycobacterium tuberculosis proteome
- The goal of the TB Structural Genomics Consortium is to determine the structures of potential drug targets in Mycobacterium tuberculosis, the bacterium that causes tuberculosis.
- The development of novel drug therapies against tuberculosis is particularly important given the growing problem of multidrug-resistant tuberculosis.
2. The Thermotoga maritima proteome
- One current goal of the Joint Center for Structural Genomics (JCSG), a part of the Protein Structure Initiative (PSI), is to solve the structures of all the proteins in Thermotoga maritima, a thermophilic bacterium.
- T. maritima was selected as a structural genomics target because of its relatively small genome, consisting of 1,877 genes, and the hypothesis that the proteins expressed by a thermophilic bacterium would be easier to crystallize.
• Protein Data Bank (PDB): repository for protein sequence and structural information
• UniProt: provides sequence and functional information
• Structural Classification of Proteins (SCOP): hierarchical classification approach
• Class, Architecture, Topology and Homologous superfamily (CATH): hierarchical classification approach
The structural genomics experimental pipeline.
Structural genomics
