An introduction to
Protein Families
and databases
Jamia Millia Islamia
Pitching in
• Protein Families and the need for classification
• Domains & Motifs, with GPCRs as an example
Vrinda Sharma
Groundwork
• Sequence Features
• Protein Signatures
• Patterns & Profiles
• HMMs
Wanchha Maurya
Showstopper
• DUFs: a story worth reciting
• Databases of Protein Families
• Demystifying the Hypotheticals
Rohit Satyam
Need for classification
Proteins can be classified into
groups based on sequence or
structural similarity.
These groups often contain
well characterised proteins
whose function is known.
Thus, when a novel protein is
identified, its functional
properties can be proposed
based on the group to which
it is predicted to belong.
Source: EMBL-EBI Training Course:
https://www.ebi.ac.uk/training-beta/online/courses/protein-classification-intro-ebi-resources/protein-classification/what-are-protein-families/
Protein Families in Brief
A protein family is a group of proteins that:
• share a common evolutionary origin
• perform related functions
• are similar in sequence or structure.
Superfamily
    Family A
        Subfamily A1
        Subfamily A2
    Family B
    Family C
        Subfamily C1
        Subfamily C2
        Subfamily C3
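The superfamily / family / subfamily hierarchy can be sketched as a nested mapping. This is a minimal illustration only: the names are the placeholders from the diagram, and `lineage` is a hypothetical helper introduced here, not part of any database API.

```python
# Nested mapping mirroring the diagram: superfamily -> family -> subfamily.
# All names are the slide's placeholders, not real database entries.
HIERARCHY = {
    "Superfamily": {
        "Family A": {"Subfamily A1": {}, "Subfamily A2": {}},
        "Family B": {},
        "Family C": {"Subfamily C1": {}, "Subfamily C2": {}, "Subfamily C3": {}},
    }
}


def lineage(tree, target, path=()):
    """Return the path from the root to `target`, or None if it is absent."""
    for name, children in tree.items():
        current = path + (name,)
        if name == target:
            return list(current)
        found = lineage(children, target, current)
        if found is not None:
            return found
    return None


print(lineage(HIERARCHY, "Subfamily C2"))
```

Looking up a subfamily returns every level it belongs to, which is how a classified protein inherits annotations from the broader groups above it.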
Domains and Motifs
aren’t synonyms
Domains are distinct functional and/or structural
units in a protein.
They are responsible for a particular function or
interaction, contributing to the overall role of a
protein.
Motifs are arrangements of secondary structure
formed by interactions between alpha-helices and
beta-sheets.
Structure of the SH3 domain
Domain composition of Nck. Nck contains three
SH3 domains plus another domain known as SH2
G-Protein-Coupled Receptors
An example to understand Protein
Families
G-Protein Signaling
• Regulator of G-protein signaling (RGS) domains are protein structural units that activate GTPases.
• Sequences containing them belong to the RGS protein family (multifunctional GTPase-accelerating proteins).
• All RGS protein family members contain an RGS domain; some, such as RGS1, consist of little more than this domain.
• RGS3 and RGS6 contain additional domains for other functions.
GPCRs have seven transmembrane domains and interact with
specialized proteins (called G proteins) to influence
intracellular pathways after binding extracellular signals.
Source: Dorsam and Gutkind, 2007, "G-protein-coupled receptors and cancer"
GPCRs (superfamily)
    Rhodopsin-like GPCRs (level 1)
        Opsins (level 2)
            Red-sensitive opsins (sub-family)
            Green-sensitive opsins (sub-family)
            Blue-sensitive opsins (sub-family)
        APJ receptors (level 2)
        Relaxin receptors (level 2)
    cAMP receptors (level 1)
    Secretin-like GPCRs (level 1)
    Etc.
The GPCR superfamily hierarchy. Families and subfamilies to which the short-wave-sensitive opsin 1
protein belongs are highlighted in violet.
GPCRs regulate biological processes, including photoreception, regulation of the immune system, and nervous system
transmission.
Similarity increases down the hierarchy, from superfamily to sub-family.
What Are Sequence Features?
Sequence features are groups of amino acids that confer certain characteristics upon a protein and may be important for
its overall function. Examples:
1. Active sites
2. Binding sites
3. Post-translational modifications (PTMs)
4. Repeats
Protein Signatures
The Purpose and the Process
• To classify a protein into a family and to predict its domains or sequence features, we use computational tools. The predictive models behind these tools are known as protein signatures.
• The model is refined as more distantly related sequences are identified in the database.
• Once the model is mature, the signature is ready for protein sequence analysis.
How do protein signatures compare with other
ways of classifying proteins?
Identifying the conserved residues
• A multiple sequence alignment gives us information for classification: we use it to identify amino acid residues that are conserved across distantly related proteins.
• Protein signatures built from multiple sequence alignments are usually better at detecting divergent homologues than pairwise comparison methods.
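As a rough illustration of how conserved residues can be read off a multiple sequence alignment, the sketch below reports the columns in which a single residue reaches a chosen frequency. The toy alignment, the function name `conserved_columns` and the `threshold` parameter are assumptions made for this example, not part of any signature database's method.

```python
from collections import Counter

# Toy alignment invented for illustration; "-" marks a gap.
ALIGNMENT = [
    "MKCAGH",
    "MRCSGH",
    "MKCTGH",
    "M-CAGY",
]


def conserved_columns(alignment, threshold=1.0):
    """Return (column index, residue) pairs where one residue reaches `threshold`.

    The frequency is computed over all sequences, so gaps count against a column.
    """
    n_seqs = len(alignment)
    result = []
    for i, column in enumerate(zip(*alignment)):
        counts = Counter(res for res in column if res != "-")
        if not counts:
            continue  # column made up of gaps only
        residue, count = counts.most_common(1)[0]
        if count / n_seqs >= threshold:
            result.append((i, residue))
    return result


print(conserved_columns(ALIGNMENT))
print(conserved_columns(ALIGNMENT, threshold=0.75))
```

Lowering the threshold admits columns that are mostly, but not fully, conserved. A pairwise comparison has no equivalent of this per-column view.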
Signature types
Patterns
Profiles
Fingerprints
Hidden Markov Models (HMMs)
Approaches to generate signatures
Patterns & Profiles
Signature Types
Patterns recognize sequence features, such as binding sites or
the active sites of enzymes, that consist of only a few amino acids.
Ex: PROSITE database.
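A pattern is close to a regular expression. The sketch below translates PROSITE-style pattern syntax into a Python regular expression and scans a sequence with it. It is a simplified illustration: the helper `prosite_to_regex` and the test sequence are made up for this example, only the common syntax elements are handled, and the pattern shown is the widely cited N-glycosylation site pattern.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles x (any residue), [..] (allowed), {..} (forbidden),
    (n) / (n,m) repetition and the < and > terminal anchors.
    """
    regex = pattern.rstrip(".").replace("-", "")
    regex = regex.replace("x", ".")                      # x: any amino acid
    regex = regex.replace("{", "[^").replace("}", "]")   # {P}: anything but P
    regex = regex.replace("(", "{").replace(")", "}")    # x(2,4): 2 to 4 residues
    regex = regex.replace("<", "^").replace(">", "$")    # N- and C-terminal anchors
    return regex


N_GLYCOSYLATION = "N-{P}-[ST]-{P}."
regex = prosite_to_regex(N_GLYCOSYLATION)

# Hypothetical sequence: the first N is followed by G-S-A, the second by P.
sequence = "MKNGSAANPTL"
hits = [(m.start(), m.group()) for m in re.finditer(regex, sequence)]
print(regex, hits)
```

Note that `re.finditer` reports non-overlapping matches only, so overlapping sites would need a lookahead-based search.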
Profiles are built by converting multiple sequence alignments
into position-specific scoring matrices (PSSMs).
Ex: CDD, HAMAP, PROSITE and PRODOM.
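The conversion of an alignment into a position-specific scoring system can be sketched as follows. This is a minimal version under stated assumptions: a toy ungapped alignment, a uniform background frequency and a simple pseudocount. Real profile methods use sequence weighting and better background models, and the names `build_pssm` and `score` are introduced here for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Toy ungapped alignment, invented for illustration.
ALIGNMENT = ["ACDG", "ACEG", "ACDG", "TCDG"]


def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log2-odds score} dict per alignment column."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for column in zip(*alignment):
        counts = Counter(column)
        pssm.append({
            aa: math.log2((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, sequence):
    """Sum the per-position scores of a sequence as long as the PSSM."""
    return sum(column[aa] for column, aa in zip(pssm, sequence))


pssm = build_pssm(ALIGNMENT)
for candidate in ("ACDG", "TCDG", "WWWW"):
    print(candidate, round(score(pssm, candidate), 2))
```

Unlike a pattern, the profile gives every residue a score at every position, so a sequence with a rare but observed residue still scores higher than an unrelated one.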
Fingerprints and HMMs
Signature Types
Fingerprints are composed of multiple
short conserved motifs drawn
from sequence alignments. They can
distinguish individual subfamilies within
protein families.
Ex: PRINTS database.
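The idea of a fingerprint, several short motifs that must occur in a fixed order, can be sketched as below. This is a deliberate simplification: the motifs are hypothetical and written as regular expressions, whereas real fingerprints score each motif against residue frequencies. The function name `match_fingerprint` is an assumption made for this example.

```python
import re

# Hypothetical ordered motifs, written as regular expressions for simplicity.
FINGERPRINT = [r"G.G..G", r"C..C", r"H.{3}H"]


def match_fingerprint(sequence, motifs):
    """Return the start of each motif found in order; None marks a missing motif."""
    positions = []
    start = 0
    for motif in motifs:
        hit = re.compile(motif).search(sequence, start)
        if hit is None:
            positions.append(None)
        else:
            positions.append(hit.start())
            start = hit.end()  # the next motif must lie downstream
    return positions


print(match_fingerprint("AAGAGKKGAACPTCAAHEEEH", FINGERPRINT))
print(match_fingerprint("AAGAGKKGAAHEEEH", FINGERPRINT))
```

A sequence matching only some of the motifs yields a partial result, which is the kind of evidence that can separate one subfamily from another.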
Hidden Markov models (HMMs) are also
used to convert multiple sequence
alignments into position-specific
scoring systems.
Ex: Pfam, SMART, TIGRFAM,
PANTHER, SFLD, SUPERFAMILY
and Gene3D.
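To give a feel for how an HMM scores a sequence, the sketch below runs Viterbi decoding on a generic two-state HMM. It is not a profile HMM of the kind used by the databases listed above, which have match, insert and delete states for each alignment column. The states, probabilities and reduced alphabet (H = hydrophobic, P = polar) are all invented for illustration.

```python
import math

STATES = ("core", "loop")  # hypothetical hidden states
START = {"core": 0.5, "loop": 0.5}
TRANS = {
    "core": {"core": 0.8, "loop": 0.2},
    "loop": {"core": 0.2, "loop": 0.8},
}
# Emission probabilities over a reduced alphabet: H = hydrophobic, P = polar.
EMIT = {
    "core": {"H": 0.9, "P": 0.1},
    "loop": {"H": 0.1, "P": 0.9},
}


def viterbi(observations):
    """Return the most probable hidden-state path for a non-empty observation string."""
    # scores[s] is the best log-probability of any path ending in state s.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]]) for s in STATES
    }
    backpointers = []
    for symbol in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][symbol])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


print(viterbi("HHHPPP"))
print(viterbi("HHPHH"))
```

The second call shows the behaviour that makes HMMs tolerant of divergence: a single atypical residue does not force a change of state, because the transition probabilities are weighed together with the emissions.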
Families in search of function
Domains of unknown function (DUFs)
Popovic et al., 2017, Scientific Reports.
A DUF is a domain whose function is yet to be discovered.
The DUF naming scheme was introduced by Chris
Ponting through the addition of DUF1 and DUF2 to
the SMART database (Goodacre et al., 2014).
Databases at a Glance
Databases of Protein Families
1. Pfam: protein families, domains, motifs and repeats (generated from MSAs and HMMs)
2. PIRSF: MSA and clustering with high similarity thresholds
3. PROSITE: MSA of homologous proteins; based on ProRules
4. PRIDE: mass-spec-based identification; provides PTM information and literature evidence
5. PRINTS: combines multidomain/motif information for family categorization; MSA and fuzzy logic (regex)
6. MobiDB: database of intrinsically disordered regions (homology-based, predicted and curated)
7. TIGRFAM: MSA and HMMs, mainly for prokaryotic proteins
8. SUPERFAMILY2: uses HMMs and protein sequences; domain organisation, sequence alignments and protein sequence details can be obtained for a query sequence
InterPro: A Protein Family Compendium
GOFeat Tutorial
Protein Under Investigation: LOC645967
InterPro Tutorial
Protein Under Investigation: LOC645967
References
• Dorsam, R.T. and Gutkind, J.S., 2007. G-protein-coupled receptors and cancer. Nature Reviews Cancer, 7(2), pp.79-94.
• Bateman, A., Coggill, P. and Finn, R.D., 2010. DUFs: families in search of function. Acta Crystallographica Section F: Structural Biology and Crystallization Communications, 66(10), pp.1148-1152.
• Goodacre, N.F., Gerloff, D.L. and Uetz, P., 2014. Protein domains of unknown function are essential in bacteria. mBio, 5(1).
• EMBL-EBI Training Course: https://www.ebi.ac.uk/training-beta/online/courses/protein-classification-intro-ebi-resources/protein-classification/what-are-protein-families/
Thanks
Drop in
@RohitSatyam1
+91 9870953351
Jamia Millia Islamia University
