Peptide Informatics - Bridging the gap between small-molecule and large-molecule systems


Presented by Lisa Sach-Peltason (Roche, Basel) at 2014 Bio-IT World

Published in: Science, Technology, Business

  1. Peptide Informatics - Bridging the gap between small-molecule and large-molecule systems
     Lisa Sach-Peltason, Data Science, pRED Informatics, Roche Basel
  2. Peptide Therapeutics – An Emerging Modality
     US FDA approved drugs (2009-2011):
       • Small molecule: 34
       • Protein: 9
       • Monoclonal antibody: 8
       • Peptide: 8
       • Natural product: 6
       • Amino acid: 5
       • Steroid: 2
       • Nucleoside: 1
       • Enzyme: 1
       • Macrocycle: 1
       • Other: 1
     Adapted from Albericio & Kruger; Future Med. Chem. (2012), 4(12), 1527-1531.
  3. Peptide Therapeutics – An Emerging Modality
     Figure: Therapeutic categories of peptide candidates entering clinical trials (1980-2007)
     Saladin et al.; IDrugs (2009), 12(12), 779-784.
  4. Peptide Therapeutics – Opportunities
     Comparison by modality (selectivity; generation; intracellular access; delivery; action; oral delivery):
       • Small molecules: low to high; synthetic; high; all routes; antagonist / agonist; yes
       • Peptides: high; synthetic or recombinant; possible; i.v. / s.c., non-parenteral delivery feasible; agonist / antagonist; potential
       • Biologics: high; recombinant; low; i.v. / s.c.; antagonist / agonist; no
     Proven Advantages of Peptides
       • Efficacy at extracellular targets, especially for polar or shallow binding pockets
       • Rapid optimization
       • Low off-target pharmacology
       • High target selectivity
     * Reflects current status; future potential for peptide antagonists, e.g., PPIs
  5. Peptides at Roche
     Growing asset of internal and external peptide compounds
       • Global Roche compound DB: >25,000 compounds registered with PEPTIDE flag (of 3.9M total)
       • Increasing demand for informatics infrastructure and support for peptide projects
     Figure (combination chart): Peptides in IRCI 2003-2013; new registrations and total no. of peptides, by quarter
  6. Peptide Therapeutics – Informatics Challenges
     Cheminformatics (molecule graphs)
       • Similarity searching
       • SAR analysis, visualization
       • Property prediction
       • Small-molecule registration
     Bioinformatics (sequences)
       • Sequence searching
       • Alignment
       • Sequence analysis
     Peptide informatics
       • Size, complexity
       • Non-standard residues
       • Chemical modifications
       • No format standards
     Figure adapted from J.H. Jensen, ChemAxon European UGM, 2012
  7. Data Capture Challenges
     Peptide sequence format: IUPAC-IUB Nomenclature and Symbolism for Amino Acids and Peptides (“3AA”, 1983)
       • 3-letter code for standard and common non-standard amino acids
       • Symbolism for representing amino acid sequences
     Example: H-Asp-Arg-Val-DTyr-Ile-His-Pro-Phe-OH
       • N-terminal specification: H-, Ac-, Boc-, …
       • Residue, with stereoconfiguration (e.g. DTyr)
       • Separator / peptide bond: -
       • C-terminal specification: -OH, -NH2, -H, …
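The sequence elements named on this slide can be illustrated with a small parser. This is a minimal sketch, not the tooling used at Roche: the function name `parse_linear_peptide` and the two terminal-group sets are hypothetical and contain only the groups shown on the slide. It assumes every residue is a three-letter code with an optional leading `D`, so names that contain a hyphen, such as `Phe(4-Bz)`, are out of scope.

```python
import re

# Hypothetical lookup sets: only the terminal groups shown on the slide.
N_TERMINI = {"H", "Ac", "Boc"}
C_TERMINI = {"OH", "NH2", "H"}

def parse_linear_peptide(text):
    """Split a string such as 'H-Asp-...-Phe-OH' into its sequence elements."""
    tokens = text.strip().split("-")  # '-' is the separator / peptide bond
    if len(tokens) < 3:
        raise ValueError("expected N-terminus, at least one residue, C-terminus")
    n_term, *middle, c_term = tokens
    if n_term not in N_TERMINI:
        raise ValueError(f"unknown N-terminal group: {n_term!r}")
    if c_term not in C_TERMINI:
        raise ValueError(f"unknown C-terminal group: {c_term!r}")

    residues = []
    for token in middle:
        # Optional 'D' prefix for the stereoconfiguration, then a 3-letter code.
        match = re.fullmatch(r"(D?)([A-Z][a-z]{2})", token)
        if match is None:
            raise ValueError(f"unrecognized residue symbol: {token!r}")
        residues.append((match.group(2), "D" if match.group(1) else "L"))
    return {"n_term": n_term, "residues": residues, "c_term": c_term}

parsed = parse_linear_peptide("H-Asp-Arg-Val-DTyr-Ile-His-Pro-Phe-OH")
```

Rejecting unknown symbols instead of passing them through mirrors the validation concern raised later in the talk: a free-text field accepts anything, whereas a parser makes format errors visible at entry time.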
  8. Data Capture Challenges
     How to capture non-standard sequence elements?
     Residue symbols
       • L-Norvaline: Nva (discouraged by IUPAC but commonly used)
       • L-2-Aminovaleric acid? Avl (IUPAC)
       • L-2-Aminopentanoic acid? Ape (IUPAC)
     Modified amino acids
       • L-4-Benzoylphenylalanine: 4Bpa, or Phe(4-Bz) (systematic; avoid combinatorial explosion)
  9. Data Capture Challenges
     How to capture non-standard sequence elements?
     Cyclic peptides
       • cyclo[Leu-DPhe-Pro-Val-Orn-Leu-DPhe-Pro-Val-Orn] (IUPAC)
     Cross-links (disulfide bridges within or across chains, isopeptide bonds, …)
       • IUPAC recommendation: depiction rather than text
       • H-Cys(1)-Tyr-Ile-Gln-Asn-Cys(1)-Pro-Leu-Gly-NH2 (SMILES-like notation; see also Biochemfusion’s PLN)
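The two notations on this slide can be read mechanically. The sketch below assumes that a parenthesized number such as `Cys(1)` works like a SMILES ring-closure digit, so that two residues carrying the same label are linked; that reading and both function names are assumptions made for illustration.

```python
import re
from collections import defaultdict

def find_cross_links(sequence):
    """Map each label, e.g. '1' in 'Cys(1)', to the pair of 1-based residue positions that carry it."""
    tokens = sequence.split("-")
    residues = tokens[1:-1]  # drop the N- and C-terminal groups
    positions = defaultdict(list)
    for pos, token in enumerate(residues, start=1):
        match = re.fullmatch(r"[A-Za-z0-9]+\((\d+)\)", token)
        if match:
            positions[match.group(1)].append(pos)
    # Keep only labels that occur exactly twice, i.e. a closed link.
    return {label: tuple(p) for label, p in positions.items() if len(p) == 2}

def cyclo_residues(sequence):
    """Return the residue symbols inside 'cyclo[...]'; the last one bonds back to the first."""
    match = re.fullmatch(r"cyclo\[(.+)\]", sequence)
    if match is None:
        raise ValueError("not a cyclo[...] sequence")
    return match.group(1).split("-")

links = find_cross_links("H-Cys(1)-Tyr-Ile-Gln-Asn-Cys(1)-Pro-Leu-Gly-NH2")
ring = cyclo_residues("cyclo[Leu-DPhe-Pro-Val-Orn-Leu-DPhe-Pro-Val-Orn]")
```

For the cross-linked example, `links` pairs residue positions 1 and 6, the two labeled cysteines.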
  10. Peptide Data Inventory
      Digest Roche peptides with NextMove’s Sugar & Splice
      Figure: Top 50 monomer frequencies of 23k Roche peptides
        • Standard AA (without Gly and Pro): 93%
        • Top 50 monomers: 98%
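An inventory of this kind boils down to counting monomer occurrences and asking what share a given set of monomers covers. The sketch below is illustrative only: it assumes the percentages refer to shares of monomer occurrences, all names are hypothetical, and the two toy sequences are made up and are not Roche data.

```python
from collections import Counter

# The 18 standard three-letter codes left after excluding Gly and Pro.
STANDARD_NO_GLY_PRO = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Ser", "Thr", "Trp", "Tyr", "Val",
}

def monomer_frequencies(peptides):
    """Count monomer occurrences over an iterable of residue lists."""
    counts = Counter()
    for residues in peptides:
        counts.update(residues)
    return counts

def coverage(counts, monomers):
    """Fraction of all monomer occurrences that fall into the given set."""
    total = sum(counts.values())
    return sum(counts[m] for m in monomers) / total

def top_n_coverage(counts, n):
    """Fraction of all monomer occurrences covered by the n most frequent monomers."""
    return coverage(counts, [m for m, _ in counts.most_common(n)])

# Made-up toy input, one residue list per peptide.
toy = [["Leu", "Phe", "Pro", "Val", "Orn"], ["Ala", "Gly", "Orn", "Nva"]]
counts = monomer_frequencies(toy)
standard_share = coverage(counts, STANDARD_NO_GLY_PRO)
```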
  11. Peptide Data Inventory
      Monomer library: Roche Peptide Building Blocks
        • ~200 manually curated templates
        • Up to 600 monomers extracted from Roche peptides
        • Direct cartridge with normalization & uniqueness check
      Library fields: Structure; ID; Short Name; Chemical Name; Category; CAS; Roche Number
        • Ala; A; L-Alanine; L-AA; 56-41-7; ROxyz
        • Fmoc; Fmoc; 9-Fluorenylmethoxycarbonyl; SAG
      Sequence registration / Peptide drawing
  12. Peptide Sequence Information
      Harmonizing peptide registration
        • Draw structure from local monomer templates
        • Enter sequence manually: H-His-Asp-Glu-Phe-Glu-Arg-His-Ala-Glu-Gly- ... -OH
        • No format standards or validation
      Compound registration system: LINEAR STRUCTURE DESCRIPTION field, PEPTIDE comment
  13. Peptide Sequence Information
      Harmonizing peptide registration
        • Synchronize drawing templates with monomer library
        • Automatic sequence generation & validation
        • Consistent structure and sequence information
      Atoms and bonds
        • Chemical identification
        • Novelty check
        • (Sub-)Structure searches
      Sequence
        • Depiction
        • Visual comparison
        • Sequence searches
      Tools for data analysis
      Building block library
      Compound registration system: LINEAR STRUCTURE DESCRIPTION field (H-His-Asp-Glu-Phe-Glu-Arg-His-Ala-Glu-Gly- ... -OH), PEPTIDE comment
  14. Peptide Drawing
      Central template management in Accelrys Draw
      Roche Peptide Building Blocks
        • ~200 manually curated templates
        • Categories: L-AA, D-AA, nS-AA, Linkers, Attachments, Resins
      Accelrys Draw Add-In
        • Download templates to Draw
        • Regular check for updates
        • Register new templates via Sequence Template Manager
        • Validate new templates
  15. Peptide Sequence Information
      Sequence generation with NextMove’s Sugar & Splice
      Computational perception of peptide sequence from chemical structure
        • Output of sequence in standard format
        • Lookup of non-standard names in building block library
      Pipeline Pilot wrapper with easy-to-use web interface for registration
      Maintenance procedure for batch registration and validation
        • Check for peptides with empty/outdated sequences and update
        • Process legacy peptides and complete sequence information
      Example: chemical structure, processed by Sugar & Splice with the building block library, yields cyclo[Leu-DPhe-Pro-Val-Orn-Leu-DPhe-Pro-Val-Orn]
  16. Peptide Sequence Information
      Interface to biologics landscape
      Sequence-based analysis tools
        • Sequence alignment, BLAST database search, …
        • Conversion to standard FASTA via Sugar & Splice:
            – Remove cycles and cross-links
            – Replace non-standard residues by X or the closest natural analog
            – Convert D-amino acids to L form
      Data exchange with biologics research
        • HELM format for macromolecule representation
        • Shared dictionary for peptide building blocks
        • Conversion to HELM via Sugar & Splice
      Example
        • Sequence: cyclo[Leu-DPhe-Pro-Val-Orn-Leu-DPhe-Pro-Val-Orn]
        • HELM: PEPTIDE1{L.[dF].P.V.[Orn].L.[dF].P.V.[Orn]}$PEPTIDE1,PEPTIDE1,10:R2-1:R1$$$
        • FASTA: LFPVXLFPVX
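The three FASTA conversion rules and the HELM example on this slide can be reproduced for the simple case shown. This is a plain-Python sketch and not the Sugar & Splice API; all function names are hypothetical. It assumes the input is either `cyclo[...]` or a bare hyphen-separated residue list without terminal groups, that a leading `D` in front of a standard three-letter code marks the D form, and that every non-standard residue becomes `X` (the "closest natural analog" option is not attempted).

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def split_residues(sequence):
    """Return (residue tokens, is_cyclic) for 'cyclo[...]' or a bare residue list."""
    match = re.fullmatch(r"cyclo\[(.+)\]", sequence)
    body = match.group(1) if match else sequence
    return body.split("-"), match is not None

def split_stereo(token):
    """Return (code, is_d); only a 'D' in front of a standard code counts as D form."""
    if token.startswith("D") and token[1:] in THREE_TO_ONE:
        return token[1:], True
    return token, False

def to_fasta(sequence):
    """Drop the cycle, map D forms to L, and replace non-standard residues by X."""
    tokens, _ = split_residues(sequence)
    return "".join(THREE_TO_ONE.get(split_stereo(t)[0], "X") for t in tokens)

def to_helm(sequence):
    """Build a single-chain HELM string; a cycle becomes a last-to-first R2-R1 connection."""
    tokens, cyclic = split_residues(sequence)
    monomers = []
    for token in tokens:
        code, is_d = split_stereo(token)
        one = THREE_TO_ONE.get(code)
        if one is None:
            monomers.append(f"[{code}]")   # non-standard: bracketed name
        elif is_d:
            monomers.append(f"[d{one}]")   # D form: bracketed, 'd' prefix
        else:
            monomers.append(one)
    helm = "PEPTIDE1{" + ".".join(monomers) + "}$"
    if cyclic:
        helm += f"PEPTIDE1,PEPTIDE1,{len(monomers)}:R2-1:R1"
    return helm + "$$$"

example = "cyclo[Leu-DPhe-Pro-Val-Orn-Leu-DPhe-Pro-Val-Orn]"
fasta = to_fasta(example)
helm = to_helm(example)
```

For the slide's example, `to_fasta` yields `LFPVXLFPVX` and `to_helm` yields the HELM string shown above. Note what the FASTA form loses: ring closure, stereochemistry, and the identity of Orn, which is why HELM is the format proposed for data exchange.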
  17. Summary & Benefits
        • Re-use and adapt small-molecule tools and systems
        • Ensure consistent structure and sequence information
        • Interface to large-molecule world
      Benefits
        • Maximized data value & quality through harmonized sequence information
        • Enable automated sequence searches & analysis for synthetic peptides
        • Time savings for peptide drawing, registration and analysis
        • Future prospect: store sequence information within the molecular structure
      Compound registration system: H-His-Asp-Glu-Phe-Glu-Arg-His-Ala-Glu-Gly- ... -OH
  18. Acknowledgments
        • Discovery Chemistry: Konrad Bleicher, Eric Kitas, Kersten Klar, Betty Hennequin, Katja Ostmann, Adrian Schäublin, Patrick Studer-Schriber
        • pRED Informatics: Fausto Agnetti, Gerd Blanke, Gunther Dörnen, Sébastien Fournier, Werner Gotzeina, Peter Hilty, Ralf Horstmöller, Dieter Imark, Frederic Klein, Stefan Klostermann, Francesca Milletti, Denis Ribaud, Jörg Schmiedle, Daniel Stoffler, Klaus Weymann
        • Steering Committee: Alexander Alanine, Margret Assfalg, Ralph Haffner, Harald Mauser, Martin Stahl
        • Accelrys: François Culot, Jonas Danielsson, James Jack, Georgios Rafeletos
        • NextMove Software: Roger Sayle
  19. Doing now what patients need next