Tribulations of curating published key bioactive
peptides for the Guide to PHARMACOLOGY
Christopher Southan, Joanna L. Sharman, Adam J. Pawson, Simon D.
Harding, Elena Faccenda and Jamie A. Davies, IUPHAR/BPS Guide to Pharmacology,
Discovery Brain Sciences, University of Edinburgh, UK.
BPS 2018 Molecular and Cellular Pharmacology Oral Communications 1
Tuesday, December 18, 15:00
1
https://www.slideshare.net/cdsouthan
Abstract (will not be shown)
Introduction: The crucial roles of bioactive peptides in pharmacology, drug discovery and chemical biology are well
established. Consequently, the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb) and its precursor IUPHAR-DB have
been curating peptide entries for over a decade. While small-molecule chemical structures have curatorial challenges
with which the GtoPdb team has to grapple, these are exacerbated for peptides. Because of their increasing
importance both in endogenous pharmacology (e.g. GPCR ligands) and the development of new exogenous modified
peptide therapeutics, we undertook a review of our peptide statistics, curation strategies, indexing in PubChem and
enhancement options.
Methods: We assessed our internal peptide statistics for release 2018.3, including our submitted PubChem substance
entries (SIDs), and undertook a retrospective assessment of their tribulations. We also looked at equivocality problems
with searching peptides in PubChem and major sequence sources. To enhance our own curation, we piloted the Sugar
and Splice (S&S) program from NextMove Software to convert more of our medium-sized peptides from sequence
strings, including formally specified post-translational modifications (PTMs), to SMILES molecular representations
that could then merge with PubChem compound entries (CIDs).
Results: The current database includes 786 endogenous and 1310 exogenous peptide entries (n.b. the presentation will
update these stats from the upcoming 2018.4 release). These are nested within our 9345 PubChem SIDs but many have
not formed CIDs. Legacy problems were mostly due to equivocal structural specifications of PTMs and exact positions
of radiolabel incorporations. However, our capturing of at least a primary sequence string is a compromise that users
can match by BLAST search. As an example, exploring “Endothelin-1” in PubChem and by NCBI sequence search
exposed major name-to-structure mapping problems and multiple structures, including Swiss-Prot features for the
precursor protein. We assessed the major problem of similarity ascertainment for peptides because they are too large
for chemical clustering but too small for clean sequence searching. We successfully incorporated S&S into our peptide
curation triage and converted many legacy sequences to SMILES strings.
Conclusion: Despite their increasing importance, pharmacology database entries for bioactive peptides are
associated with tribulations that GtoPdb, PubChem and other databases have so far confronted with only partial
success. Many are associated with equivocal representations in papers that thus render many reported experiments
irreproducible. We urge authors and journal editors to increase the specificity of peptide specifications. In collaboration
with PubChem and NextMove we are improving our peptide curation, including for some of our legacy entries.
2
Outline
• Introducing GtoPdb
• GtoPdb peptide content and stats
• Peptide tribulations
• PubChem peptidic pros and cons
• Getting more peptides > SMILES
• Stats and examples
• Exploiting PubChem SID tagging
• Where we go from here
• Further information
3
Introducing the IUPHAR/BPS Guide to
PHARMACOLOGY (GtoPdb)
• IUPHAR = International Union of Basic and Clinical Pharmacology, BPS = British
Pharmacological Society
• Formerly known as IUPHAR-DB for receptors and channels since 2003
• Since 2012 funded by Wellcome Trust to cover all targets in the human genome
• Since 2015 Wellcome Trust “fork” as Guide to IMMUNOPHARMACOLOGY
• Molecular mechanism of action (mmoa) mapping primary & secondary targets
• Release cycle time (with PubChem refreshes) ~ 2 months
• Six NAR Annual Database issues, latest as PMID 29149325 (2018)
• Distilled into the 2-yearly British Journal of Pharmacology “Concise Guide to
PHARMACOLOGY” as a nine-paper series (see PMID 29055037) with outlinks
• Presents users with selected quality compounds for pharmacology research in
silico, in vitro, in cellulo, in vivo, in clinico
• An ELIXIR UK Node resource since 2016 http://www.guidetopharmacology.org
4
5
The GtoPdb hallmark: quantitative binding data
Document > assay > result > compound > location > protein target
D-A-R-C-L-P
Where “C” is not a small molecule, we have ~ 2000 peptides included in
the ~ 9000 substances we submit to PubChem
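The D-A-R-C-L-P chain above can be pictured as one record per curated binding result. The following is only an illustrative sketch: the class name, field names and placeholder values are hypothetical and do not reflect the actual GtoPdb schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BindingRecord:
    """One curated relationship, following the D-A-R-C-L-P order (hypothetical model)."""
    document: str   # D: the source paper, e.g. a PubMed identifier
    assay: str      # A: the assay description
    result: str     # R: the quantitative result, e.g. an affinity value
    compound: str   # C: the ligand (small molecule or peptide)
    location: str   # L: where in the document the data were reported
    protein: str    # P: the protein target

    def as_chain(self) -> str:
        """Render the record in the same order as the slide."""
        return " > ".join(
            [self.document, self.assay, self.result,
             self.compound, self.location, self.protein]
        )


# Placeholder values only; these are not real curated data.
example = BindingRecord(
    document="PMID:<placeholder>",
    assay="<binding assay>",
    result="<pKi value>",
    compound="<peptide ligand>",
    location="<table or figure>",
    protein="<target protein>",
)
print(example.as_chain())
```

Keeping the compound as its own field is what lets a peptide stand in for "C" without changing the rest of the chain.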
Endogenous peptides (786)
6
http://www.guidetopharmacology.org/GRAC/LigandListForward?type=Endogenous-peptide&database=all
Non-endogenous peptides (1310)
http://www.guidetopharmacology.org/GRAC/LigandListForward?type=Peptide&database=all
GtoPdb peptide stats
• Peptide ligs/all ligs = 22%.
• Ligands with quantitative binding data/all ligs = 75%
• Peptides with quantitative binding data/all peps = 63%
• CID quantitative binding data peptides/all CID peps = 89%
• These are from release 2018.3 so slight changes in current 2018.4
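As a rough cross-check of the first ratio, the peptide counts quoted earlier (786 endogenous, 1310 non-endogenous) can be divided by the 9345 PubChem SIDs mentioned in the abstract. Using the SID count as the denominator for "all ligs" is an assumption made here for illustration; the slides do not state the exact ligand total behind the 22% figure.

```python
# Counts quoted in the deck for release 2018.3.
endogenous_peptides = 786
non_endogenous_peptides = 1310

# Assumption: the submitted SID count stands in for "all ligands".
submitted_sids = 9345

total_peptides = endogenous_peptides + non_endogenous_peptides
peptide_share = total_peptides / submitted_sids

print(f"{total_peptides} peptides = {peptide_share:.1%} of {submitted_sids} SIDs")
```

Under that assumption the share rounds to the 22% shown on the slide.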
Tribulations with peptides
• Author specifications often insufficient for complete molecular definition
• Consequent structural equivocalities slip through the editor/referee net
• Correct IUPAC peptide nomenclature, especially for modified residues, is rare
(ad hoc naming is more common)
• Poor resolution of peptide name-to-structure (n2s)
• Exact location of radiolabels often not specified
• Absence of purity verification and/or in vivo stability
• Different graphic rendering styles
• SMILES only for < ~70 residues in PubChem (grey zone of small peptides)
• Literature and patent extraction for database feeds is proportionally lower
than for small molecules
• Searching patents for peptide prior art or analogues is much more difficult
than for small molecules
• Species “zoo” for venom peptides and names
• Conjugates (peptides + linkers + proteins etc.) provide even more tribulations
9
The classic peptidic triple-whammy
10
Endothelin-1, CID 91928636, 1470 “Similar Compounds” and top-100 BLAST hits
• Too big to search or cluster by SMILES
• Too small to BLAST cleanly (and sans PTMs)
• Too many species splits for precursors
Endothelin-1 in GtoPdb
11
Swiss-Prot precursor annotation:
useful but text-only PTMs
12
PubChem bad news:
will the real Endothelin-1 please stand up?
13
• "endothelin 1"[CompleteSynonym] > 6 CIDs > 36 SIDs (10 SID-only)
• “MW 2491.9140 NOT endothelin 1” > 16 CIDs > 23 SIDs (some unnamed)
• Problematic BioAssay splitting (including for SID-only)
• No fix on the immediate horizon :(
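The molecular-weight query above works because the mass pins down a specific structure, including its disulfide bridges. A minimal sketch of that check follows. The 21-residue human endothelin-1 sequence and its two disulfide bonds are supplied here from general knowledge rather than from the slides, the residue values are standard average residue masses, and `average_mass` is a hypothetical helper.

```python
# Standard average residue masses (Da) for the 20 common amino acids.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528     # added once for the free N- and C-termini
HYDROGEN = 1.00794   # two hydrogens are lost per disulfide bond


def average_mass(sequence: str, disulfides: int = 0) -> float:
    """Average mass of a linear peptide, optionally with disulfide bridges."""
    residues = sum(AVG_RESIDUE_MASS[aa] for aa in sequence)
    return residues + WATER - 2 * HYDROGEN * disulfides


# Assumption: human endothelin-1 sequence, not quoted in the slides.
ET1_SEQUENCE = "CSCSSLMDKECVYFCHLDIIW"

reduced = average_mass(ET1_SEQUENCE)                 # all four Cys as free thiols
bridged = average_mass(ET1_SEQUENCE, disulfides=2)   # two disulfide bridges

print(f"reduced: {reduced:.2f}  bridged: {bridged:.2f}")
```

Only the bridged form should land within a few hundredths of the 2491.9140 used in the query; the reduced form is about 4 Da heavier. That gap is the kind of detail a bare sequence string cannot carry.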
PubChem
biologicals
annotation
14
Hierarchical Editing Language for Macromolecules (HELM)
15
Our current GtoPdb push:
Peptide > S&S > SMILES > SIDs > CIDs
16
http://www.guidetopharmacology.org/GRAC/LigandDisplayForward?ligandId=3854
GtoPdb plans
• Continue back-fill of peptides > CIDs using Sugar & Splice
• Resolve our sequences against Swiss-Prot x-refs, ChEMBL and GPCRdb
• Consider adding “peptide” as a new SID tag
• For IUPHAR Guide to Immunopharmacology
– Sub-committee feedback on peptides, antibodies, targets and
indications
– Continue curation of peptides relevant to immunity and inflammation
• Anticipate curation of new “binder” therapeutics including minibodies,
polyvalents and hybrids
• Belt-and-braces linking of SMILES with compromise (i.e. sans
modifications) FASTA approximations to facilitate BLAST indexing and
clustering of peptide ligands
• Introduce local HELM rendering
• Revise legacy data model (e.g. introduce a protein ligand classification)
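The "compromise FASTA" idea in the plans above amounts to emitting the bare primary sequence, without modifications, so that it can be indexed and searched by BLAST. The sketch below shows one way this might look. The function name, the identifier format and the strict 20-letter alphabet check are assumptions made for illustration, and the endothelin-1 sequence is supplied from general knowledge rather than from the slides.

```python
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def to_fasta(identifier: str, sequence: str, width: int = 60) -> str:
    """Format an unmodified peptide sequence as a single FASTA record."""
    # Normalise: drop whitespace and use upper-case one-letter codes.
    cleaned = "".join(sequence.split()).upper()
    unknown = sorted(set(cleaned) - STANDARD_RESIDUES)
    if unknown:
        # Modified or non-standard residues need curator attention first.
        raise ValueError(f"non-standard residue codes: {', '.join(unknown)}")
    # Wrap the sequence at a fixed line width.
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return f">{identifier}\n" + "\n".join(lines) + "\n"


# Hypothetical identifier; the sequence is an assumption, not from the slides.
print(to_fasta("example_ligand endothelin-1 (sans PTMs)", "CSCSSLMDKECVYFCHLDIIW"), end="")
```

Rejecting non-standard codes rather than silently dropping them keeps the approximation explicit: a record is only emitted once a curator has decided which plain residue stands in for a modified one.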
17
Acknowledgments, info and COI
https://sites.google.com/view/tw2informatics/home
Conflict of interest (minor) has consulted in the peptide area
Thanks to the NextMove Software team for S&S support
Lin Yikai, for her M.Sc. project: “Developing bio/cheminformatics methods for
converting bioactive peptide structures into machine-readable formats”
Paul Thiessen from PubChem for support and FASTA sequences
