Presentation on: CATH
Ayesha Javaid
PhD Biotechnology
Course: Bioinformatics
CATH database
o The CATH database is a free, publicly available online resource
that provides information on the evolutionary relationships of
protein domains.
o It was created in the mid-1990s by Professor Christine Orengo
and colleagues, and continues to be developed by the Orengo
group at University College London.
Protein Domains
 Protein domains are the basic units of proteins that can
fold, function, and evolve independently.
 Knowledge of protein domains is critical for protein
classification, understanding biological function,
annotating evolutionary mechanisms, and protein
design.
 Domains are obtained from protein structures deposited
in the Protein Data Bank.
 Both domain identification and subsequent
classification use manual as well as automated
procedures.
 The data in CATH are obtained from PDB files
deposited in the Protein Data Bank.
 Only structures determined at a
resolution of 4 Å or better are included.
 Furthermore, CATH requires domains to be at
least 40 residues long, with side chains resolved
for 70% or more of the residues.
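The inclusion criteria above can be sketched as a simple filter. This is a minimal illustration, not CATH's actual pipeline: the `CandidateDomain` record and its field names are hypothetical, and only the three thresholds (4 Å, 40 residues, 70%) come from the slide.

```python
from dataclasses import dataclass

# Thresholds taken from the slide text.
MAX_RESOLUTION_ANGSTROM = 4.0
MIN_DOMAIN_LENGTH = 40
MIN_SIDE_CHAIN_FRACTION = 0.70


@dataclass
class CandidateDomain:
    """Hypothetical record; the field names are illustrative, not CATH's schema."""
    resolution: float                 # resolution of the parent structure, in angstroms
    n_residues: int                   # domain length in residues
    n_residues_with_side_chains: int  # residues whose side chains are resolved


def meets_inclusion_criteria(domain: CandidateDomain) -> bool:
    """Return True if the domain passes all three criteria listed on the slide."""
    if domain.n_residues <= 0:
        return False
    side_chain_fraction = domain.n_residues_with_side_chains / domain.n_residues
    return (
        domain.resolution <= MAX_RESOLUTION_ANGSTROM
        and domain.n_residues >= MIN_DOMAIN_LENGTH
        and side_chain_fraction >= MIN_SIDE_CHAIN_FRACTION
    )
```

For example, a 120-residue domain from a 2.1 Å structure with 110 resolved side chains would pass, while a 35-residue domain would be rejected on length alone.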
 Submitted protein chains are chopped into
their constituent domains.
 Classifications are then assigned to the resulting
domains.
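The bookkeeping behind "chopping" can be illustrated with a short sketch. It assumes the domain boundaries are already known and that each domain is one continuous stretch of the chain; real domains can be discontinuous, and CATH determines boundaries from 3D structure using the manual and automated procedures mentioned earlier. The function name `chop_chain` and the 1-based inclusive boundary format are assumptions made for this example.

```python
from typing import List, Tuple


def chop_chain(sequence: str, boundaries: List[Tuple[int, int]]) -> List[str]:
    """Split a chain sequence into domain sequences.

    `boundaries` holds 1-based, inclusive (start, end) residue positions,
    one pair per domain (an assumed format for this illustration).
    """
    domains = []
    for start, end in boundaries:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"invalid domain boundary: ({start}, {end})")
        # Convert the 1-based inclusive range to a Python slice.
        domains.append(sequence[start - 1:end])
    return domains


# A toy 10-residue chain chopped into two domains.
toy_domains = chop_chain("ABCDEFGHIJ", [(1, 4), (5, 10)])
```

Each resulting domain would then be classified separately, which is why one PDB chain can contribute entries to more than one part of the hierarchy.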
Class: the highest level; places the selected protein
into 1 of 4 categories of secondary structure content.
Architecture: a description of the gross
arrangement of secondary structures, independent
of topology.
Topology: an indication of the overall shape and
connectivity of the protein's secondary structures.
Homologous superfamily: proteins of known
structure that are homologous (share a common
ancestor) to the selected protein.
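The four levels give CATH its name, and a position in the hierarchy is conventionally written as four dot-separated numbers, one per level (for example 3.40.50.300). The sketch below parses such an identifier. The `CathNode` type and the function names are hypothetical helpers for this example, and the class names in `CLASS_NAMES` follow the usual CATH labels for the four classes mentioned on the slide.

```python
from typing import List, NamedTuple

# The four secondary-structure classes, keyed by class number.
CLASS_NAMES = {
    1: "Mainly Alpha",
    2: "Mainly Beta",
    3: "Alpha Beta",
    4: "Few Secondary Structures",
}


class CathNode(NamedTuple):
    """One position in the C.A.T.H hierarchy (illustrative type)."""
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_id(cath_id: str) -> CathNode:
    """Parse an identifier such as '3.40.50.300' into its four levels."""
    parts = cath_id.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {cath_id!r}")
    return CathNode(*(int(part) for part in parts))


def lineage(cath_id: str) -> List[str]:
    """Return the identifier prefix at each level, from class to superfamily."""
    parts = cath_id.strip().split(".")
    return [".".join(parts[:depth]) for depth in range(1, len(parts) + 1)]


node = parse_cath_id("3.40.50.300")
```

Here `node.class_` is 3, so the domain belongs to the Alpha Beta class, and `lineage("3.40.50.300")` walks down the hierarchy one level at a time.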
 308,999 structural protein domain entries
 53,479,436 non-structural protein domain entries
 2,737 homologous superfamily entries
 92,882 functional family entries