PROTEIN SEQUENCE DATABASES
Hemant Santosh Bothe
School of Life Sciences, SRTMU Nanded
INTRODUCTION
• With over 165 completed genome sequences available from both eukaryotic and prokaryotic organisms, efforts are now focused on identifying and functionally analyzing the proteins encoded by these genomes.
• Large-scale analysis of these proteins has started to generate huge amounts of data, driven by the new information from the genome projects and by a range of new technologies in protein science.
• For example, mass spectrometry approaches are being used in protein identification and in determining the nature of post-translational modifications. These and other methods make it possible to quickly identify large numbers of proteins, to map their interactions, to determine their location within the cell and to analyze their biological activities.
• Protein sequence databases play a vital role as a central resource for storing the data generated by these and more conventional efforts, and for making them available to the scientific community.
TYPES
• Universal protein databases cover proteins from all species, whereas specialized data collections contain information about a particular protein family or group of proteins, or about a specific organism.
• Universal protein sequence databases can be further subdivided into two categories:
  - Sequence repositories (depositories), in which data are stored with little or no manual intervention in the creation of the records.
  - Expertly curated databases, in which the original data are enhanced by the addition of further information.
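The two distinctions drawn on this slide (universal versus specialized, repository versus curated) can be written down as a small data model. The following is a minimal Python sketch: the enum and class names are illustrative assumptions, not part of any database's API, and the slide only applies the repository/curated split to universal databases. The two example entries use databases named elsewhere in these slides.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    UNIVERSAL = "covers proteins from all species"
    SPECIALIZED = "covers one protein family, group or organism"


class Curation(Enum):
    REPOSITORY = "little or no manual intervention"
    CURATED = "records enhanced with expert-added information"


@dataclass(frozen=True)
class ProteinDatabase:
    """Hypothetical record describing where a database sits in the taxonomy."""
    name: str
    scope: Scope
    curation: Curation


# Illustrative entries only; the names are taken from the slides.
DATABASES = [
    ProteinDatabase("GenPept", Scope.UNIVERSAL, Curation.REPOSITORY),
    ProteinDatabase("Swiss-Prot", Scope.UNIVERSAL, Curation.CURATED),
]


def by_curation(databases, curation):
    """Return the names of the databases with the given curation level."""
    return [db.name for db in databases if db.curation is curation]


print(by_curation(DATABASES, Curation.CURATED))
```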
SEQUENCE REPOSITORIES
• Several protein sequence databases act as repositories of protein sequences. These databases add little or no additional information to the sequence records they contain.
• e.g. GenPept, NCBI’s Entrez Protein, Reference Sequence (RefSeq)
UNIVERSAL CURATED DATABASES
• Repositories are an essential means of providing users with sequences as quickly as possible, but adding further information to a sequence greatly increases the value of the resource.
• Curated databases enrich the sequence data with additional information that expert biologists validate before it is added, so that the data in these collections can be considered highly reliable.
SWISS-PROT
• Swiss-Prot is a universal protein sequence database established in 1986 and maintained collaboratively, since 1987, by the Department of Medical Biochemistry of the University of Geneva and the EMBL Data Library.
• It is the leading universal curated protein sequence database and, as of November 2003, contained 140 000 curated sequence entries from over 8300 different species.
• The database is non-redundant, which means that all reports for a given protein are merged into a single entry, and it is highly integrated with other databases. Each entry in Swiss-Prot is thoroughly analyzed and annotated by biologists to ensure that the database is of high quality.
• Swiss-Prot distinguishes itself from other protein sequence databases by three criteria: a high level of annotation, a minimal level of redundancy and a high level of integration with other databases.
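The idea of non-redundancy, where all reports for a given protein end up in a single entry, can be illustrated with a short Python sketch. This is not how Swiss-Prot curators work in practice: the dictionary keys (`protein`, `organism`, `sequence`, `source`) and the rule that two reports describe the same protein when name and organism match are assumptions made purely for illustration, and the sample records are placeholders.

```python
from collections import defaultdict


def merge_reports(reports):
    """Merge all reports for the same protein into one entry.

    Reports are treated as the same protein when they share a protein
    name and an organism (an assumption for this sketch; real curation
    relies on expert judgement).
    """
    grouped = defaultdict(list)
    for report in reports:
        grouped[(report["protein"], report["organism"])].append(report)

    entries = []
    for (protein, organism), group in grouped.items():
        entries.append({
            "protein": protein,
            "organism": organism,
            # Keep the first reported sequence as the representative one.
            "sequence": group[0]["sequence"],
            # Keep every source so that no report is lost in the merge.
            "sources": sorted({r["source"] for r in group}),
        })
    return entries


# Placeholder records, not real database content.
reports = [
    {"protein": "example protein A", "organism": "species X",
     "sequence": "MKT", "source": "report-1"},
    {"protein": "example protein A", "organism": "species X",
     "sequence": "MKT", "source": "report-2"},
    {"protein": "example protein B", "organism": "species X",
     "sequence": "MSA", "source": "report-3"},
]
entries = merge_reports(reports)
print(len(entries))
```

Three input reports collapse into two entries, and the merged entry keeps both of its sources.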
THE PROTEIN INFORMATION RESOURCE (PIR)
• PIR was established in 1984 by the National Biomedical Research Foundation (NBRF) as a resource to assist in the identification and understanding of protein sequence information.
• The PIR database evolved from the original NBRF Protein Sequence Database, developed over a 20-year period by the late Margaret O. Dayhoff and published as the ‘Atlas of Protein Sequence and Structure’.
THE PROTEIN INFORMATION RESOURCE (PIR)
• The database is partitioned into four sections: PIR1, PIR2, PIR3 and PIR4.
• These differ in the quality of their data. Currently PIR1 and PIR2 account for ∼99% of all entries. Entries in PIR1 are fully classified, fully merged and extensively annotated.
PROTEIN STRUCTURE DATABASES
• SCOP: Structural Classification of Proteins database.
• CATH: Class, Architecture, Topology, Homologous superfamily.
SCOP: A STRUCTURAL CLASSIFICATION OF PROTEINS DATABASE
• This database provides a detailed and comprehensive description of the structural and evolutionary relationships of proteins of known structure.
• The fundamental unit of classification in SCOP is the protein domain. The first release of SCOP in 1995 comprised 3179 domains, 498 families, 366 superfamilies and 279 folds.
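The counts quoted for the first SCOP release give a rough sense of how quickly the hierarchy narrows from one level to the next. The short Python calculation below uses only the four figures from the slide; the variable names are illustrative, and the ratios are plain averages that say nothing about how unevenly domains are actually distributed.

```python
# Counts quoted for the first SCOP release (1995).
domains = 3179
families = 498
superfamilies = 366
folds = 279

# Average number of members per group at each step up the hierarchy.
domains_per_family = domains / families
families_per_superfamily = families / superfamilies
superfamilies_per_fold = superfamilies / folds

print(f"domains per family:       {domains_per_family:.2f}")        # 6.38
print(f"families per superfamily: {families_per_superfamily:.2f}")  # 1.36
print(f"superfamilies per fold:   {superfamilies_per_fold:.2f}")    # 1.31
```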
SCOP
• Proteins are classified on hierarchical levels:
  - Family
  - Superfamily
  - Common fold
  - Class
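A hierarchy like this maps naturally onto a nested structure. The following Python sketch groups domains into a tree and reports the most specific level that two domains share. It lists the levels from broadest (class) to most specific (family), the reverse of the order on the slide. The function names, dictionary keys and labels such as `c1` or `fam1` are placeholders invented for illustration, not real SCOP identifiers.

```python
# Levels from broadest to most specific.
LEVELS = ("class", "fold", "superfamily", "family")


def build_tree(domains):
    """Group domains into a nested dict: class -> fold -> superfamily -> family."""
    tree = {}
    for domain in domains:
        node = tree
        for level in LEVELS[:-1]:
            node = node.setdefault(domain[level], {})
        node.setdefault(domain["family"], []).append(domain["name"])
    return tree


def shared_level(a, b):
    """Return the most specific level at which two domains are grouped together."""
    deepest = None
    for level in LEVELS:
        if a[level] != b[level]:
            break
        deepest = level
    return deepest


# Placeholder domains: same superfamily, different families.
d1 = {"name": "d1", "class": "c1", "fold": "f1", "superfamily": "s1", "family": "fam1"}
d2 = {"name": "d2", "class": "c1", "fold": "f1", "superfamily": "s1", "family": "fam2"}

tree = build_tree([d1, d2])
print(tree)
print(shared_level(d1, d2))
```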
CATH
• The CATH database is a classification of protein domains based not only on sequence information, but also on structural and functional properties.
• The first CATH release, from 1997, contained only 8,078 domains.
• In addition to the four main levels, CATH comprises five more layers, called S, O, L, I and D. The first four layers group domains according to increasing sequence overlap and similarity, whereas the D-level assigns a unique identifier to every domain.
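The nine levels (C, A, T, H followed by S, O, L, I, D) can be read as successive fields of one identifier. The Python sketch below assumes a dotted notation with one number per level and finds how deep into the hierarchy two domains stay together. The helper names and the example identifiers are made up for illustration and do not refer to real CATH entries.

```python
LEVELS = ("C", "A", "T", "H", "S", "O", "L", "I", "D")


def parse_cath(identifier):
    """Split a dotted identifier into a {level: number} mapping."""
    parts = identifier.split(".")
    if len(parts) != len(LEVELS):
        raise ValueError(f"expected {len(LEVELS)} fields, got {len(parts)}")
    return dict(zip(LEVELS, (int(p) for p in parts)))


def deepest_shared_level(id_a, id_b):
    """Return the last level, reading left to right, up to which both agree."""
    a, b = parse_cath(id_a), parse_cath(id_b)
    shared = None
    for level in LEVELS:
        if a[level] != b[level]:
            break
        shared = level
    return shared


# Made-up identifiers for illustration only.
first = "1.2.3.4.1.1.1.1.1"
second = "1.2.3.4.1.1.2.1.1"

print(parse_cath(first)["H"])
print(deepest_shared_level(first, second))
```

Here the two identifiers agree through the O level and diverge at L, so the function returns `"O"`.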