Trials and tribulations of curating peptide and
antibody ligands for the IUPHAR/BPS Guide to
Pharmacology
Christopher Southan, Joanna L. Sharman, Adam J. Pawson, Simon D.
Harding, Elena Faccenda and Jamie A. Davies, IUPHAR/BPS Guide to Pharmacology,
Discovery Brain Sciences, University of Edinburgh, UK.
ACS Boston 2018, Biologics & Registration Session, Mon Aug 20,
15:50 - 16:15, Harbor Ballroom II
https://www.slideshare.net/cdsouthan
Abstract (will not be shown)
As an expert-curated database of approved, clinical or research pharmacological targets mapped to
defined ligands, the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb) and its precursor IUPHAR-DB,
have been extracting and annotating bioactive peptides from papers for well over a decade. The current
total has reached 2089 peptides, split between exogenous and endogenous, within the 9144 ligand
entries submitted to PubChem in our 2018.2 database release. More recently, as approved drugs or
clinical candidates we have curated 235 antibodies and a small number of therapeutic nucleotides.
Indexing these entity types in GtoPdb presents challenges similar to those being encountered for the
registration of biologicals as explicitly defined structures. In addition, we target-map the citation-
supported quantitative binding parameters where possible. This presentation will outline these
curatorial challenges and our efforts to at least partially ameliorate the problems. For peptides below
the PubChem CID SMILES limit of approximately 70 residues we have been using Sugar and Splice from
NextMove Software to convert more of our peptide SIDs to join the 6969 CIDs we already have.
However, we are often confounded by authors' equivocal structural specifications with respect to
post-translational modifications and the exact positions of radiolabel incorporation. Even so, we capture at
least a primary sequence string as an interim compromise that users can hit by BLAST. For reported
receptor-binding endogenous peptides we find some that do not match the Swiss-Prot features for the
precursor protein. PubChem has been encouraging and supporting us in converting more activity-
mapped peptides to CIDs and InChIKeys which should enhance inter-source connectivity. Otherwise,
biological SID data can only be joined by equivocal name matching. Antibodies and other large-
biological SIDs may also currently remain structurally orphaned and present their own challenges.
Notwithstanding, GtoPdb has successfully curated at least primary sequences for the molecular
specification of clinical Mabs. For this we use the IMGT/mAb-DB for approved monoclonals as a first-stop
shop, since it extracts sequences from INN documents. For these and clinical candidates with
code names we also use the patent sequence databases to source a UniParc accession number and can
sometimes get binding data that has not appeared in papers.
Outline
• Introducing GtoPdb
• GtoPdb peptide content and stats
• Peptide tribulations
• PubChem peptidic pros and cons
• Getting more peptides > SMILES
• GtoPdb antibody content
• Antibody tribulations
• Stats and examples
• Exploiting PubChem SID tagging
• Where we go from here
• Further information
Introducing the IUPHAR/BPS Guide to
PHARMACOLOGY (GtoPdb)
• IUPHAR = International Union of Basic and Clinical Pharmacology, BPS = British
Pharmacological Society
• Formerly known as IUPHAR-DB for receptors and channels since 2003
• Since 2012 funded by Wellcome Trust to cover all targets in the human genome
• Since 2015 Wellcome Trust “fork” as Guide to IMMUNOPHARMACOLOGY
• Molecular mechanism of action (mmoa) mapping primary & secondary targets
• Release cycle time (with PubChem refreshes) ~ 2 months
• Six well-cited NAR Annual Database issues, latest as PMID 29149325 (2018)
• Distilled into the 2-yearly British Journal of Pharmacology “Concise Guide to
PHARMACOLOGY” as a nine-paper series (see PMID 29055037) with outlinks
• Presents users with selected quality compounds for pharmacology research in
silico, in vitro, in cellulo, in vivo, in clinico
• An ELIXIR UK Node resource since 2016 http://www.guidetopharmacology.org
Expert-curated, citation provenanced,
quantitative binding data
Document > assay > result > compound > location > protein target
D-A-R-C-L-P
Where “C” is not a small molecule, we have ~ 2000 peptides and ~ 250
antibodies included in the ~ 9000 substances we submit to PubChem
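A minimal sketch of how one D-A-R-C-L-P chain might be held in code. The class name, field names and the `compound_type` labels are hypothetical and are not the GtoPdb schema; "location" is assumed here to mean where in the document the compound is reported.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DarclpRecord:
    """One Document > Assay > Result > Compound > Location > Protein chain.

    All field names are illustrative assumptions, not the GtoPdb data model.
    """
    document: str       # D: citation identifier, e.g. a PubMed ID
    assay: str          # A: short assay description
    result: float       # R: quantitative binding value
    compound: str       # C: ligand name or identifier
    compound_type: str  # e.g. "small molecule", "peptide", "antibody"
    location: str       # L: assumed to be where the compound appears in the document
    protein: str        # P: protein target identifier


def count_non_small_molecules(records):
    """Tally records whose "C" is not a small molecule, grouped by compound type."""
    return Counter(
        r.compound_type for r in records if r.compound_type != "small molecule"
    )
```

Applied to a full export, a tally like this would be one way to arrive at peptide and antibody totals of the kind quoted above.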
Peptides
Endogenous peptides (786)
http://www.guidetopharmacology.org/GRAC/LigandListForward?type=Endogenous-peptide&database=all
Non-endogenous peptides (1310)
http://www.guidetopharmacology.org/GRAC/LigandListForward?type=Peptide&database=all
Peptide stats
• Peptide ligands/all ligands = 22%
• Ligands with quantitative binding data/all ligands = 75%
• Peptides with quantitative binding data/all peptides = 63%
• CID peptides with quantitative binding data/all CID peptides = 89%
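The ratios above are simple part-over-whole percentages. As a rough check, the sketch below recomputes the first one from the counts quoted in the abstract (2089 peptides within 9144 ligand entries); the function name is hypothetical, and the counts behind the other three ratios are not given in the slides, so they are not reproduced.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rejecting a non-positive denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts taken from the abstract: peptides and total ligand entries (release 2018.2).
peptide_share = percent(2089, 9144)
print(f"Peptide ligands / all ligands = {peptide_share:.1f}%")
```

The result falls just under 23%, consistent with the rounded-down 22% quoted on the slide.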
Tribulations with peptides
• Author specifications may be insufficient for complete molecular definition
• Consequent structural equivocalities slip through the editor/referee net
• Correct IUPAC peptide nomenclature is rare (ad hoc is more common)
• Exact location of radiolabels often not specified
• Absence of purity verification and/or in vivo stability
• Need to surface user-intuitive renderings (but HELM rules OK)
• Poor resolution of peptide name-to-structure (n2s)
• SMILES only copes with ~ 70 residues
• Searching patents for corroborative peptide prior art is much more difficult than
for small molecules
• Literature extraction or author database submissions for bioactive peptides
proportionally lower than for small molecules
• Species “zoo” for venom peptides and their names
• Conjugates (peptides + linkers + proteins etc.) even more difficult
• The PIR RESID Database of Protein Modifications is no longer maintained
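The ~ 70-residue SMILES ceiling suggests a simple triage step before any structure conversion. The sketch below is an assumption about how that could be done, not the GtoPdb pipeline; the function names and the "SMILES" / "sequence-only" labels are made up for illustration, and the limit itself is only approximate.

```python
SMILES_RESIDUE_LIMIT = 70  # approximate figure quoted in the slides


def residue_count(sequence: str) -> int:
    """Count residues in a one-letter sequence, ignoring spaces, digits and dashes."""
    return sum(1 for ch in sequence if ch.isalpha())


def representation_for(sequence: str, limit: int = SMILES_RESIDUE_LIMIT) -> str:
    """Suggest a target representation based on residue count alone."""
    return "SMILES" if residue_count(sequence) <= limit else "sequence-only"


# Toy sequences, not real ligands.
for seq in ("ACDEFGHIK", "G" * 80):
    print(residue_count(seq), representation_for(seq))
```

Length is only the first filter: a short peptide with an ambiguous modification or unspecified radiolabel position would still need manual curation.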
The classic peptidic triple-whammy
Endothelin-1, CID 91928636, 1470 “Similar Compounds” and top-100 BLAST hits
• Too big to search or cluster by SMILES
• Too small to BLAST cleanly (and sans PTMs)
• Too many species splits for precursors
Endothelin-1 in GtoPdb
• But this now needs a SMILES backfill
Swiss-Prot precursor annotation:
useful but text-only PTMs
PubChem bad news:
will the real Endothelin-1 please stand up?
• "endothelin 1"[CompleteSynonym] > 6 CIDs > 36 SIDs (10 SID-only)
• “MW 2491.9140 NOT endothelin 1” > 16 CIDs > 23 SIDs (some unnamed)
• BioAssay splitting (including for SID-only) is problematic
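The first query above can also be issued programmatically. The sketch below only builds an NCBI E-utilities ESearch URL for the `[CompleteSynonym]` field against the PubChem Compound database and makes no network call. The helper name and the `retmax` default are assumptions, and the counts returned today may differ from those on the slide.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def complete_synonym_search_url(name: str, retmax: int = 100) -> str:
    """Build an ESearch URL for an exact synonym match in PubChem Compound."""
    # Quote the name so it is matched as a phrase, then restrict to the synonym field.
    term = f'"{name}"[CompleteSynonym]'
    params = {"db": "pccompound", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


print(complete_synonym_search_url("endothelin 1"))
```

Fetching the URL should return the matching CIDs in the `idlist` of the JSON response; mapping those CIDs to their SIDs would need a further lookup.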
PubChem good news:
GtoPdb > SID SMILES > CID > biologicals annotation
PubChem: more good news
Our current push:
Peptides > S&S > SMILES > SIDs > CIDs
http://www.guidetopharmacology.org/GRAC/LigandDisplayForward?ligandId=3854
Antibodies
Tribulations with antibody curation
• Getting at least a primary Mab sequence as a molecular definition
• Not all clinical Mab sequences > patents > INN > IMGT-DB
• May get a persistent UniParc ID for the sequence (on a good day)
• Papers often omit in vitro binding data
• Challenging to track press releases back to primary data
• Papers usually don't cite the patents
• But we sometimes get binding data from patents
• The biosimilars are piling in
• No open specification of glycan chains linked to primary sequences
• Some journals publish Mab characterisation with blinded code names
• Considering research reagents with vendor IDs if well provenanced
GtoPdb antibodies (245)
http://www.guidetopharmacology.org/GRAC/LigandListForward?type=Antibody&database=all
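The list and ligand-page links quoted in these slides follow a regular pattern, so they can be generated rather than typed. This is a small sketch based only on the URLs shown here; the function names are hypothetical and the URL scheme may have changed since 2018.

```python
from urllib.parse import urlencode

GRAC_BASE = "http://www.guidetopharmacology.org/GRAC"


def ligand_list_url(ligand_type: str, database: str = "all") -> str:
    """Link to a ligand list page, e.g. type 'Peptide', 'Endogenous-peptide' or 'Antibody'."""
    query = urlencode({"type": ligand_type, "database": database})
    return f"{GRAC_BASE}/LigandListForward?{query}"


def ligand_page_url(ligand_id: int) -> str:
    """Link to a single ligand page by its GtoPdb ligand ID."""
    return f"{GRAC_BASE}/LigandDisplayForward?{urlencode({'ligandId': ligand_id})}"


for ligand_type in ("Endogenous-peptide", "Peptide", "Antibody"):
    print(ligand_list_url(ligand_type))
print(ligand_page_url(3854))
```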
Example: adalimumab
Exploiting PubChem SID-tagging for user selections
GtoP plans
• Continue back-fill of peptides > CIDs using S&S
• Resolve our sequences against Swiss-Prot x-refs, ChEMBL and GPCRdb
• Continue adding antibody biosimilar cross-pointers
• Consider adding “peptide” as a new SID tag
• For IUPHAR Guide to Immunopharmacology
– Sub-committee feedback on peptides, antibodies, targets and indications
– Continue curation of peptides relevant to immunity and inflammation
• Anticipate curation of new “binder” therapeutics including minibodies,
polyvalents and hybrids
• Keep watching brief on large-molecule InChIKeys
• Belt-and-braces of linking SMILES with compromise (i.e. sans modifications)
FASTA approximations for BLAST indexing and clustering of peptide ligands
• Introduce local HELM rendering
• Revise legacy data model (e.g. introduce a protein ligand classification)
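The "sans modifications" FASTA approximation in the plans above could look roughly like the sketch below. The input convention is an assumption chosen for illustration (one-letter codes, modifications in round or square brackets, optional terminal caps such as "Ac-" or "-NH2"); real author notations vary far more widely, and this is not the GtoPdb implementation.

```python
import re

STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def fasta_approximation(annotated: str) -> str:
    """Reduce an annotated one-letter peptide string to a plain sequence.

    Bracketed annotations are discarded whole, so a non-standard residue
    written only inside brackets is lost from the output.
    """
    # Drop bracketed annotations such as "(125I)" or "[Hyp]".
    stripped = re.sub(r"\([^)]*\)|\[[^\]]*\]", "", annotated)
    # Drop terminal caps written as "Ac-"/"H-" at the start or "-NH2"/"-OH" at the end.
    stripped = re.sub(r"^(Ac|H)-|-(NH2|OH)$", "", stripped.strip())
    residues = []
    for ch in stripped.upper():  # upper-casing also flattens lowercase D-residue codes
        if ch in STANDARD_RESIDUES:
            residues.append(ch)
        elif ch.isalpha():
            residues.append("X")  # keep length for unrecognised one-letter codes
    return "".join(residues)


def to_fasta(header: str, sequence: str) -> str:
    """Format a single FASTA record."""
    return f">{header}\n{sequence}\n"


print(to_fasta("example_peptide", fasta_approximation("(125I)YGGFL-NH2")))
```

The stripped sequence is only a search handle for BLAST indexing and clustering; the full molecular definition still has to come from the SMILES or HELM representation.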
Acknowledgments, info, COI
https://sites.google.com/view/tw2informatics/home
Conflict of interest (minor): has consulted in the peptide area
Thanks to the NextMove team for S&S support
Lin Yikai, for her M.Sc. project, “Developing bio/cheminformatics methods for converting bioactive peptide structures into machine-readable formats”
Anna Gaulton for ChEMBL FASTA sequences
Paul Thiessen for PubChem FASTA sequences

Comparative structure of adrenal gland in vertebrates
sachin783648
 
Lab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerinLab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerin
ossaicprecious19
 
filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
IvanMallco1
 

Recently uploaded (20)

What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
 
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptxBody fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
Body fluids_tonicity_dehydration_hypovolemia_hypervolemia.pptx
 
role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
 
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
 
GBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram StainingGBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram Staining
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
 
Lab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerinLab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerin
 
filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
 

Peptide Tribulations in GtoPdb

  • 1. Trials and tribulations of curating peptide and antibody ligands for the IUPHAR/BPS Guide to Pharmacology. Christopher Southan, Joanna L. Sharman, Adam J. Pawson, Simon D. Harding, Elena Faccenda and Jamie A. Davies, IUPHAR/BPS Guide to Pharmacology, Discovery Brain Sciences, University of Edinburgh, UK. ACS Boston 2018, Biologics & Registration Session, Mon Aug 20, 15:50 - 16:15, Harbor Ballroom II. https://www.slideshare.net/cdsouthan
  • 2. Abstract (will not be shown) As an expert-curated database of approved, clinical or research pharmacological targets mapped to defined ligands, the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb) and its precursor, IUPHAR-DB, have been extracting and annotating bioactive peptides from papers for well over a decade. The current total has reached 2089 peptides, split between exogenous and endogenous, within the 9144 ligand entries submitted to PubChem in our 2018.2 database release. More recently, we have curated 235 antibodies and a small number of therapeutic nucleotides as approved drugs or clinical candidates. Indexing these entity types in GtoPdb presents challenges similar to those encountered in the registration of biologicals as explicitly defined structures. In addition, we target-map the citation-supported quantitative binding parameters where possible. This presentation will outline these curatorial challenges and our efforts to at least partially ameliorate the problems. For peptides below the PubChem CID SMILES limit of approximately 70 residues we have been using Sugar and Splice from NextMove Software to convert more of our peptide SIDs to join the 6969 CIDs we already have. However, we are often confounded by authors' equivocal structural specifications w.r.t. post-translational modifications and exact positions of radiolabel incorporation. Nevertheless, we do capture at least a primary sequence string as an interim compromise that users can hit by BLAST. For reported receptor-binding endogenous peptides we find some that do not match the Swiss-Prot features for the precursor protein. PubChem has been encouraging and supporting us in converting more activity-mapped peptides to CIDs and InChIKeys, which should enhance inter-source connectivity. Otherwise, biological SID data can only be joined by equivocal name matching. Antibodies and other large biological SIDs may also currently remain structurally orphaned and present their own challenges.
Notwithstanding, GtoPdb has successfully curated at least primary sequences for the molecular specification of clinical Mabs. For this we use the IMGT/mAb-DB for approved monoclonals as a first-stop shop since they extract sequences from INN documents. For these and for clinical candidates with code names we also use the patent sequence databases to source a UniParc accession number and can sometimes get binding data that has not appeared in papers.
  • 3. Outline • Introducing GtoPdb • GtoPdb peptide content and stats • Peptide tribulations • PubChem peptidic pros and cons • Getting more peptides > SMILES • GtoPdb antibody content • Antibody tribulations • Stats and examples • Exploiting PubChem SID tagging • Where we go from here • Further information
  • 4. Introducing the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb) • IUPHAR = International Union of Basic and Clinical Pharmacology, BPS = British Pharmacological Society • Formerly known as IUPHAR-DB for receptors and channels since 2003 • Since 2012 funded by the Wellcome Trust to cover all targets in the human genome • Since 2015 a Wellcome Trust “fork” as the Guide to IMMUNOPHARMACOLOGY • Molecular mechanism of action (mmoa) mapping of primary & secondary targets • Release cycle time (with PubChem refreshes) ~2 months • Six well-cited NAR Annual Database issues, latest as PMID 29149325 (2018) • Distilled into the 2-yearly British Journal of Pharmacology “Concise Guide to PHARMACOLOGY” as a nine-paper series (see PMID 29055037) with outlinks • Presents users with selected quality compounds for pharmacology research in silico, in vitro, in cellulo, in vivo, in clinico • An ELIXIR UK Node resource since 2016 • http://www.guidetopharmacology.org
  • 5. Expert-curated, citation-provenanced, quantitative binding data • Document > assay > result > compound > location > protein target (D-A-R-C-L-P) • Where “C” is not a small molecule, we have ~2000 peptides and ~250 antibodies included in the ~9000 substances we submit to PubChem
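The D-A-R-C-L-P chain on this slide can be pictured as one record per curated binding value. The following is only a minimal sketch and not the GtoPdb schema: the class name, the field names, the `compound_class` values and the placeholder example are all assumptions made for illustration, and the slide does not define “location” any further, so it is kept as free text.

```python
from dataclasses import dataclass

# Hypothetical labels for compound types; not GtoPdb's own vocabulary.
BIOLOGICAL_CLASSES = {"peptide", "antibody"}


@dataclass(frozen=True)
class BindingRecord:
    """One curated value following the slide's D > A > R > C > L > P chain."""
    document: str        # D: citation supporting the value (e.g. a PMID)
    assay: str           # A: short assay description
    result_type: str     # R: kind of parameter (e.g. "pKi", "pIC50")
    result_value: float  # R: the quantitative value itself
    compound: str        # C: ligand name
    compound_class: str  # C: "small molecule", "peptide" or "antibody"
    location: str        # L: "location" as named on the slide (free text here)
    protein_target: str  # P: target protein name or accession

    def chain(self) -> str:
        """Render the record in the slide's left-to-right order."""
        return " > ".join([
            self.document, self.assay,
            f"{self.result_type}={self.result_value}",
            self.compound, self.location, self.protein_target,
        ])


def is_biological(record: BindingRecord) -> bool:
    """True when 'C' is not a small molecule (peptide or antibody)."""
    return record.compound_class in BIOLOGICAL_CLASSES


# Placeholder values only; this is not real curated data.
example = BindingRecord(
    document="PMID:placeholder",
    assay="radioligand displacement (placeholder)",
    result_type="pKi",
    result_value=9.0,
    compound="example peptide",
    compound_class="peptide",
    location="placeholder location",
    protein_target="example receptor",
)
print(example.chain())
print(is_biological(example))
```

Keeping the compound class on the record makes it straightforward to count how many quantitative values hang off peptides or antibodies rather than small molecules.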
  • 9. Peptide stats • Peptide ligands/all ligands = 22% • Ligands with quantitative binding data/all ligands = 75% • Peptides with quantitative binding data/all peptides = 63% • CID peptides with quantitative binding data/all CID peptides = 89%
  • 10. Tribulations with peptides • Author specifications may be insufficient for complete molecular definition • Consequent structural equivocalities slip through the editor/referee net • Correct IUPAC peptide nomenclature is rare (ad hoc is more common) • Exact location of radiolabels often not specified • Absence of purity verification and/or in vivo stability data • Need to surface user-intuitive renderings (but HELM rules OK) • Poor resolution of peptide name-to-structure (n2s) • SMILES only copes with up to ~70 residues • Searching patents for corroborative peptide prior art is much more difficult than for small molecules • Literature extraction or author database submissions for bioactive peptides are proportionally lower than for small molecules • Species “zoo” for venom peptides and their names • Conjugates (peptides + linkers + proteins etc.) are even more difficult • The PIR RESID Database of Protein Modifications is no longer maintained
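The ~70 residue SMILES ceiling mentioned above suggests a simple triage step before any structure conversion is attempted. The following is a sketch under stated assumptions: the function name and the route labels are hypothetical, and the limit is the approximate figure quoted on the slide rather than an exact PubChem constant.

```python
# Approximate figure quoted on the slide; treat it as tunable, not exact.
SMILES_RESIDUE_LIMIT = 70


def representation_route(sequence: str, limit: int = SMILES_RESIDUE_LIMIT) -> str:
    """Decide how a peptide could be represented, by residue count alone.

    Returns "smiles" when the peptide is short enough to attempt a full
    SMILES conversion, otherwise "sequence-only" (primary sequence string).
    """
    # Count letters only, so spaces or dashes in the input are ignored.
    residues = [ch for ch in sequence if ch.isalpha()]
    if not residues:
        raise ValueError("sequence contains no residues")
    return "smiles" if len(residues) <= limit else "sequence-only"


# Toy sequences, not real ligands.
print(representation_route("ACDEFGHIKLMNPQRSTVWY"))   # 20 residues
print(representation_route("A" * 120))                # 120 residues
```

Length is only the first gate; a short peptide with an ambiguous modification or radiolabel position would still need curator judgement before conversion.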
  • 11. The classic peptidic triple-whammy • Endothelin-1, CID 91928636, 1470 “Similar Compounds” and top-100 BLAST hits • Too big to search or cluster by SMILES • Too small to BLAST cleanly (and sans PTMs) • Too many species splits for precursors
  • 12. Endothelin-1 in GtoPdb • But this now needs a SMILES backfill
  • 14. PubChem bad news: will the real Endothelin-1 please stand up? • "endothelin 1"[CompleteSynonym] > 6 CIDs > 36 SIDs (10 SID-only) • "MW 2491.9140 NOT endothelin 1" > 16 CIDs > 23 SIDs (some unnamed) • BioAssay splitting (including for SID-only) is problematic
  • 15. PubChem good news: GtoPdb > SID SMILES > CID > biologicals annotation
  • 17. Our current push: Peptides > S&S > SMILES > SIDs > CIDs • http://www.guidetopharmacology.org/GRAC/LigandDisplayForward?ligandId=3854
  • 19. Tribulations with antibody curation • Getting at least a primary Mab sequence as a molecular definition • Not all clinical Mab sequences > patents > INN > IMGT-DB • May get a persistent UniParc ID sequence (on a good day) • Papers often omit in vitro binding data • Challenging to track press releases back to primary data • Papers usually don't cite the patents • But we sometimes get binding data from patents • The biosimilars are piling in • No open specification of glycan chains linked to primary sequences • Some journals publish Mab characterisation with blinded code names • Considering research reagents with vendor IDs if well provenanced
  • 22. Exploiting PubChem SID-tagging for user selections
  • 23. GtoPdb plans • Continue back-fill of peptides > CIDs using S&S • Resolve our sequences against Swiss-Prot x-refs, ChEMBL and GPCRdb • Continue adding antibody biosimilar cross-pointers • Consider adding “peptide” as a new SID tag • For the IUPHAR Guide to Immunopharmacology: sub-committee feedback on peptides, antibodies, targets and indications; continue curation of peptides relevant to immunity and inflammation • Anticipate curation of new “binder” therapeutics including minibodies, polyvalents and hybrids • Keep a watching brief on large-molecule InChIKeys • Belt-and-braces linking of SMILES with compromise (i.e. sans modifications) FASTA approximations for BLAST indexing and clustering of peptide ligands • Introduce local HELM rendering • Revise legacy data model (e.g. introduce a protein ligand classification)
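The “compromise (i.e. sans modifications) FASTA approximation” in the plans above can be illustrated by stripping modification annotations from a peptide string and emitting a plain FASTA record. The following is a hedged sketch, not the GtoPdb pipeline: the input convention (modifications written in square brackets, dashes as separators) and the function name are assumptions made for this example, and real author notations are far less regular.

```python
import re

# The 20 standard one-letter amino acid codes.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def to_fasta_approximation(name: str, annotated: str, width: int = 60) -> str:
    """Return a FASTA record holding only the bare primary sequence.

    Assumed input convention (hypothetical): one-letter residues, with any
    modification written in square brackets, e.g. "[Ac]ACD-EF[NH2]".
    """
    # Drop bracketed modification labels entirely.
    bare = re.sub(r"\[[^\]]*\]", "", annotated)
    # Drop separators and other non-letters. Upper-casing also discards any
    # lower-case marking of D-residues, which is part of the compromise.
    bare = re.sub(r"[^A-Za-z]", "", bare).upper()
    if not bare:
        raise ValueError("no residues left after stripping modifications")
    unknown = set(bare) - STANDARD_RESIDUES
    if unknown:
        raise ValueError(f"non-standard residue codes: {sorted(unknown)}")
    # Wrap the sequence at a fixed width, as is conventional for FASTA.
    lines = [bare[i:i + width] for i in range(0, len(bare), width)]
    return f">{name}\n" + "\n".join(lines)


# Toy input, not a real ligand.
print(to_fasta_approximation("toy_peptide", "[Ac]ACD-EF[NH2]"))
```

Rejecting non-standard codes rather than silently dropping them keeps the approximation honest: a peptide with unnatural residues is flagged for manual handling instead of being indexed under a misleading sequence.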
  • 24. Acknowledgments, info, COI • https://sites.google.com/view/tw2informatics/home • Conflict of interest (minor): has consulted in the peptide area • Thanks to the NextMove team for S&S support • Lin Yikai, for her M.Sc. project “Developing bio/cheminformatics methods for converting bioactive peptide structures into machine-readable formats” • Anna Gaulton for ChEMBL FASTA sequences • Paul Thiessen for PubChem FASTA sequences