HELM Notation Overview

HELM (Hierarchical Editing Language for Macromolecules), originally developed at Pfizer, provides a way to represent molecules that are too large to represent practically atom by atom, or that contain non-natural chemical modifications that make a plain sequence representation impractical.

HELM's structure hierarchy consists of complex polymers, simple polymers, monomers, and atoms. Monomers are described using atoms and bonds, simple (single-type) polymers are described as sequences of monomers, and complex (multi-type) polymers are described as connected simple polymers.
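The hierarchy described above can be pictured as a set of nested data types. The following is only a minimal sketch, not part of HELM or of any HELM toolkit: the class names, the field names, and the choice of a SMILES string to hold a monomer's atoms and bonds are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Monomer:
    """Lowest named level: a building block defined by atoms and bonds."""
    code: str    # short code used in the notation, e.g. "G"
    smiles: str  # atoms and bonds; storing them as SMILES is an assumption


@dataclass
class SimplePolymer:
    """A single-type polymer: an ordered sequence of monomers."""
    polymer_type: str  # hypothetical label, e.g. "PEPTIDE"
    monomers: List[Monomer]


@dataclass
class Connection:
    """A link between two simple polymers (hypothetical representation)."""
    source: Tuple[int, int]  # (polymer index, monomer position)
    target: Tuple[int, int]


@dataclass
class ComplexPolymer:
    """A multi-type polymer: simple polymers plus their connections."""
    polymers: List[SimplePolymer]
    connections: List[Connection] = field(default_factory=list)


# Illustrative use: two short glycine chains joined by one connection.
glycine = Monomer(code="G", smiles="NCC(=O)O")
chain_a = SimplePolymer("PEPTIDE", [glycine, glycine])
chain_b = SimplePolymer("PEPTIDE", [glycine])
complex_polymer = ComplexPolymer(
    polymers=[chain_a, chain_b],
    connections=[Connection(source=(0, 1), target=(1, 0))],
)
```

Each level only refers to the level directly beneath it, which mirrors the Complex Polymer, Simple Polymer, Monomer, Atom ordering in the description.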

A detailed description of HELM is available in a paper that was published in the Journal of Chemical Information and Modeling.

Published in: Technology
  • Paper will soon be posted on the upcoming HELM web site.
  • HELM Notation Overview

    1. http://pistoiaalliance.org @PistoiaAlliance Pistoia Alliance HELM Project - What About the Big Guys? The emerging HELM standard for macromolecular representation. Domain Lead – Sergio Rotstein, Business Technology, Pfizer
    2. What is a “Biomolecule”? Peptides, Therapeutic Proteins, ADCs, Antibodies, Vaccines, ASOs, siRNAs. For our purposes, anything that is not a small molecule is a biomolecule. Goal: • Eliminate biomolecule penalty • Make these entities first-class citizens of the Informatics tool portfolio
    3. So what’s the problem? [Figure labels: GAP; Small Molecules; Sequences; Biomolecules; Small Molecule Tools; Sequence-Based Tools]
    4. “Fit-for-Purpose” Structure Representation: We need to enable the representation, manipulation and visualization of each molecule type in a way that is appropriate for its size and complexity
    5. Fit for Purpose: “Monomer” Level • While you could draw out an oligonucleotide like this: • The representation is likely more intuitive / practical:
    6. Fit for Purpose: Sequence Level • But even the monomer level representation would not scale well to proteins with hundreds of amino acids. Larger molecules require a more sequence-oriented representation:
    7. Fit for Purpose: Component Level • For multi-component structures such as antibody drug conjugates, component level representations are required to enable each component to be dealt with separately. [Figure labels: “Collapsed” Antibody; Expanded Drug; Ab]
    8. Hierarchical Editing Language for Macromolecules – Hierarchical: amenable to the various “levels” • Complex Polymer ⇒ Simple Polymer ⇒ Monomer ⇒ Atom – Extensible • Allowing addition of new biopolymer types – (Reasonably) comprehensive • e.g. Allowing representation of oligonucleotide hybridization – Canonicalizable • Facilitating uniqueness checking – (Somewhat) human-readable
    9. HELM Example: Simple polymer • HELM notation: A.R.G.[dF].C.K.[ahA].E.D.A – Non-natural amino acid codes are enclosed in square brackets • Natural equivalent: ARGFCKXEDA
    10. HELM Example: Complex Polymer
    11. Monomer Database • Each monomer used in the notation needs to be predefined in a monomer database • The database includes the chemical structure of the monomer and a description of all acceptable attachment points
    12. J. Chem. Inf. Model. 2012, 52, 2796-2806
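The simple polymer example on slide 9 can be worked through in a few lines of Python. This is a rough sketch under stated assumptions, not the HELM reference parser: the function names are hypothetical, the `NATURAL_EQUIVALENT` table contains only the two mappings that can be read off the slide (dF to F, ahA to X), and splitting on "." assumes that no monomer code contains a dot itself.

```python
from typing import Dict, List

# Assumed lookup, limited to the two non-natural codes shown on slide 9.
# A real system would take this from the monomer database.
NATURAL_EQUIVALENT: Dict[str, str] = {"dF": "F", "ahA": "X"}


def split_simple_polymer(notation: str) -> List[str]:
    """Split a dot-separated simple polymer string into monomer codes.

    Square brackets around non-natural codes are stripped.
    Assumes monomer codes never contain a '.' character.
    """
    monomers = []
    for token in notation.split("."):
        if token.startswith("[") and token.endswith("]"):
            monomers.append(token[1:-1])
        else:
            monomers.append(token)
    return monomers


def to_natural_sequence(monomers: List[str], fallback: str = "X") -> str:
    """Map each monomer code to a one-letter natural equivalent.

    Single-letter codes are kept as they are; longer codes are looked up
    in NATURAL_EQUIVALENT, and unknown ones become the fallback letter.
    """
    letters = []
    for code in monomers:
        if code in NATURAL_EQUIVALENT:
            letters.append(NATURAL_EQUIVALENT[code])
        elif len(code) == 1:
            letters.append(code)
        else:
            letters.append(fallback)
    return "".join(letters)


codes = split_simple_polymer("A.R.G.[dF].C.K.[ahA].E.D.A")
natural = to_natural_sequence(codes)
print(codes)    # ['A', 'R', 'G', 'dF', 'C', 'K', 'ahA', 'E', 'D', 'A']
print(natural)  # ARGFCKXEDA
```

The output reproduces the "Natural equivalent" string from the slide. It also shows why the monomer database on slide 11 is needed: without a predefined entry for each bracketed code, there is nothing to map it to.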