SlideShare a Scribd company logo
1 of 1
Download to read offline
Mining for Novel TNF Ligands Using Unison,
  http://unison-db.sourceforge.net/                                                                                                                                                           an Open Source Database for Target Discovery
                                                          Reece Hart <rkh@gene.com>  Departments of Bioinformatics and Protein Engineering  Genentech, Inc.  San Francisco, CA 94080
Abstract                                                                                                                                                                                                                      Mining Curated Sequence Databases                                                                                           Mining Six-Frame Translations of the Human Genome                                                                                                               Mining Pathogenic Sequences for TNF-like Structures
Tumor Necrosis Factor (TNF) ligands, acting through their cognate TNF receptors, are                                                                                                                                          We've mined public and proprietary sequence sources using many methods, including                                           Most TNF ligands are encoded in the Human Genome with the majority of the TNF                                                                                   Because extensive expression cloning and computational prediction failed to identify a
critical to numerous immunological responses, including B and T cell differentiation,                                                                                                                                         hidden Markov models and PSI-BLAST profiles from Pfam, CDD, Superfamily, and                                                domain in a single exon. This suggests that it might be possible to detect novel TNFs                                                                           novel human TNF ligand which bound any of the orphan TNF receptors, we began to
apoptosis, and inflammation. Several “orphan” TNF receptors exist for which the                                                                                                                                               custom sequence- and structure-based alignments, and threading using Prospect (Xu                                           by scanning naïve six-frame translations of ORFs. For calibration of scoring functions,                                                                         consider the possibility that these receptors might bind pathogenic proteins either as a
corresponding ligands are unknown. Over the past several years, we have undertaken                                                                                                                                            and Xu) and ProHit (Sippl). The figures below outline one way to integrate and analyze                                      we instead chose to scan fixed-length subsequences of 6-frame translations, as shown                                                                            surveillance mechanism or as an exploited “security hole” (as with herpes virus binding
attempts to identify these unknown ligands from curated protein sequences, six-frame                                                                                                                                          these data in Unison.                                                                                                       below.                                                                                                                                                          to HVEM, a TNF receptor). Recently, a new sequence appeared in Swiss-Prot which
translations of the human genome, and from pathogenic sequences. This poster                                                                                                                                                                                                                                          = mouse click                                                                                                                                                                                       threads extremely well to TNF backbones and occurs in a virus known for its host
                                                                                                                                                                                                                                                                                                                                                            Six-Frame Translation and Threading Method
summarizes these efforts and introduces Unison, an Open Source database for                                                                                                                                                                                                                                                                                                                                                                                                                                               evasion mechanisms.
organizing and mining complex proteomic data.                                                                                                                                                                                                                                                                                                                          1           2       3       4     5       6       7       X       8   9    10     11    12   13   14   15   16   17 18 19 20   Y   21 22


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  3B bp

Tumor Necrosis Factor Ligand Family                                                                                                                                                                                                                                                                                                                                                        450 NT fragment                                                 ●    UCSC genome assembly (NHGD34)
                                                                                                                                                                                                                                                                                                                                                                                                                                                                450bp w/150bp overlap generates:
TNF ligands are type II membrane proteins which belong to the C1q-TNF superfamily
                                                                                                                                                                                                                                                                                                                                                                                                                                                           ●

                                                                                                                                                                                                                                                                                                                                                                                                                                                                  –    10 M fragments
and signal through corresponding TNF receptors. Three putative TNF receptors have no                                                                                                                                                                                                                                                                                                                                                                              –    60M 6-frame translations
                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ~500M ORF fragments
known ligand, and this suggests that other ligands remain to be discovered. Most TNF                                                                                                                                                                                                                                                                                               ≤150 AA six-frame translations
                                                                                                                                                                                                                                                                                                                                                                                                                                                                  –
                                                                                                                                                                                                                                                                                                                                                                                                                                                                  –    27M fragments w/length ≥50AA (       )
domains are encoded by a single exon and bind one distinct TNF receptor, although                                                                                                                                                                                                                                                                                                                                            X                                    –    fragments <50AA ( X ) were discarded
                                                                                                                                                                                                                                                                                                                                                                                                                                                                27M fragments were threaded against 22 TNF
there are exceptions to both rules. The currently known TNF ligand-receptor                                                                                                                                                                                                                                                                                                                                                                                ●


                                                                                                                                                                                                                                                                                                                                                                                                                                                                superfamily members (TNF+C1q)
interactions and exon structures are shown below.                                                                                                                                                                                                                                                                                                                                                                                X                         ●    900K (of 27M) had score <=250; each was
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            1. Viral sequences sorted by the best TNF-C1q threading
                                                                                                             1c28, 1gr3
                    1d2q,1dg6




                                                                                         1jh5, 1kxg




                                                                                                                                                                                                                                                                                                                                                                                                                                                                threaded against 3286 representative chains
                                                                                                                                                                                                                                                                                                                                                                               X               X                                 X
                                                                  1aly, 1i9r
                                1iqa, 1jtz




                                                                                                                                                                                                                                                                                          2. Review candidates
                                             2tnf (mus)




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            “raw” score.
                                                                                                             Others:




                                                                                                                                                                                                                                                                                                                                                                                                                                                           ●    total time: 176 CPU-weeks (4 weeks on 22 2-cpu
                                                                                                      9sgh




                                                                                                                                      TNF Family Exon Structure                                                                                                                                                                                                                                                                                                 machines)
                                                                                                                                                                                                                                                                                          Clicking any of the classified results at left returns a list
                                             1tnf,




                                                                                                                                      Most TNF domains are encoded within a single exon                                                                                                                                                                                    X           X       X                                                                                                                            VA28_MCV is one of a family of orthologous A28
                                                                                                                                                                                                                                                                                          of distinct sequences with their “best” annotations.                                                                                                                                                                              proteins in poxvirii.
                                                                                                                                                  0       50       100     150          200         250                          1. Integrating multiple search methods
                                                                                                                                          Lta   1
                                                                                                                                        TNFa    2
                                                                                                                                          Ltb   3
                                                                                                                                                                                                                                 A single Unison page allows users to select and                                                                                                                                                                                                                                                                                                       2. Threading results for VA28_MCV aligned to 3286 FSSP
                                                                                                                                       OX40L    4                                                                                integrate results from HMMs, PSSMs, and Prospect2
   1d0g 1d4v 1du3




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       representative backbones. TNF and C1q family members
                                                                                                                                       CD40L    5                                                                                threadings to any family of models (TNFs in this                                                                                      Distribution of Prospect2 raw scores
                                                                                                                                         FasL   6
                                                                                                                                                                                                                                                                                                                                                                       histogram shows the distribution of the best (lowest)                                                                                                                                                           are among the best fold recognition templates.
                                                                                                                                       CD27L    7                                                                                case). “Hits” are then classified into true positives,
                                                                                                                                                                                                                                                                                                                                                                       “raw” score for the alignment of each 150AA six-frame
                                                  1tnr




                                                                                                                                       CD30L    8                                                                                false negatives, and “unknown” positives
                                                                                                                                      4­1BBL    9
                                                                                                                                                                                                                                 (candidates) by reference to a curated list of known
                                                                                                                                                                                                                                                                                                                                                                       translation fragment to TNF-C1q superfamily backbones.                          Fragment threading identifies NP_848635.1




                                                                                                                                                                                                                                                                                                                                                           Frequency
                                                                                                                                                                                                                                                                                                                                                                       Fragment 8602 is highlighted and shown as an example
                                                              1bzi*




                                                                                                                                       TRAIL    10
                                                                                                                                                                                                                                                                                                                                                                                                                                                       Screenshots showing ambiguous alignment to different regions on
                                                                                                                                      RANKL     11                                                                               family members.                                                                                                                       below.                                                                          chr 13.
                                                                                                                                      TWEAK     12
                                                                                                                                       APRIL    13
                                                                                                                                                                                                                                                                                                                                                                       Unfortunately, only distinctly C1q-like proteins have been
                                                                                                                    NP




                                                                                                                                        BLyS    13B
                                                                                                                                       LIGHT    14                                                                                                                                                                                                                     identified so far.
                                                                                                                                        VEGI
                                                                                                                                       AITRL
                                                                                                                                                15
                                                                                                                                                18                                                                                                                                                                                                                                                         TBD: 166 w/score ≤ -120                                                                                        ▶
                                                                                                                                         EDA                                                                                                                                                                                                                                                           (max TNF fragment score = -154)
                                                                                                                                                                Exon                  TNF Domain
                                                                                                                                                                                                                                                                                                                                                                                                       analyzed: 76 w/score ≤ -200
                                                               Adapted from Bodmer, Schneider, Tschopp
                                                               TiBS 27(1): 19-26 (2002).


Twenty-two structures of TNF and C1q structures are known, all of which have                                                                                                                                                                                                                                                                                                                                         8602
                                                                                                                                                                                                                                                                                          4. Genomic map.                                                  Best raw score to any TNF SF member (lower is better)
profound structural similarity among the ligands despite very poor sequence similarity
(average pairwise identity is between ~ 9 and ~30%). Identifying TNFs by sequence-                                                                                                                                                                                                        Unison contains rudimentary protein-to-genome
                                                                                                                                                                                                                                                                                          alignments using BLAT. This sequence has a high-
based methods is difficult because of the poor sequence conservation and their                                                                                                                                                                                                            quality orthologous C-terminal fragment from mouse.                                                                                                                                                                                 3. For comparison, the alignment of Apo2L/TRAIL to
similarity to C1q proteins, which are not relevant to our interest in ligands for the                                                                                                                                                                                                     Clicking the map opens an in-house viewer with more                                                                                                                                                                                 the same FSSP representatives. The raw score for the
                                                                                                                                                                                                                                                                                          extensive genomic mapping data.                                  Threading of Unison:8602 to 1c28a                                                                                                                                  alignment of VA28_MCV to 1gr3a, a TNF-C1q family
orphan receptors.                                                                                                                                                                                                                                                                                                                                          Unison provides on-the-fly threading visualization via JMol, PyMOL,                                                                                                member, is denoted by the red triangle (▶) and is
                                                                                                                                                                                                                                                                                                                                                           and RasMOL. (PyMOL is used below.)
                                                                                                                                                                                                                                                                                                                                                           Legend: blue=identity; cyan=similarity; red=dissimilarity;
                                                                                                                                                                                                                                                                                                                                                                   blue           cyan              red                                                                                                                       comparable to those for alignments of known TNFs to
                                                                               CD40L (1aly)                                                                    structure-based alignment of two TNFs by CE                                                                                                                                                 yellow=cysteine; yellow spacefill= conserved cysteine; grey=query
                                                                                                                                                                                                                                                                                                                                                           yellow                  spacefill                        grey                                                                                                      other TNF-C1q structures.
                                                                                                                                                                                                                                                                                                                                                           gap/template insert; >nAA< = query insert/template gap

                                                                                                                                                                                                                                                                                                                                                                                                                                             Threading Results for Fragment 8602                                                                                                       4. A28 aligned to CD40L.
                                                                               A'                                                                                                                                                                                                                                                                                                                                                            looks more C1q-like than TNF-like, but close                                                                                              Legend: blue=identity; cyan=similarity; red=dissimilarity;
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               blue           cyan             red
                                                              A                                                                                                                        120º
                                                                                     B                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 yellow=cysteine; yellow spacefill= conserved cysteine; grey=query
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       yellow                  spacefill                      grey
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       gap/template insert; >nAA< = query insert/template gap
                                                          H                     B'
                                                      C
                                               F
                                                                                                                                                A'                                                                                                                                                                                                                                                                                                                                                                          Reasons for hope:                                         Reasons for doubt:
                                                                                                                                                                                                                                  3. Summary of features for Unison:8602.                                                                                                                                                                                                                                                   ●  VA28_MCV has a signal peptide and is known to be on    ●  threading alignment has a significant deletion (but is
                                                                  G                                                               H A                                                                                                                                                                                                                                                                                                                                                                                          viral coat; conditional mutants abolish entry             nearly as good as other intra-TNF family alignments)
                                                    E     D                                                                   C                                                                                                                                                                                                                                                                                                                                                                                                MCV has numerous genes for host evasion, including        A28 doesn't thread as well to other TNF backbones
                                                                                                                                                      B                                                                                                                                                                                                                                                                                                                                                                     ●                                                         ●


                                                                                                                          F                     B'                               1aly (CD40L)                                                                                                                                                                                                                                                                                                                                  homologs for a Death Effector Domain which inhibits    ●  other A28s don't thread well to TNFs
                                                                                                                                                                         90º     1tnf (TNFα)                                                                                                                                                                                                                                                                                                                                   caspase-8 (also found in HSV), IL18 BP, and MHC        ●  some viral capsid proteins also have a similar fold
                                                                                                                                             G                                                                                                                                                                                                                                                                                                                                                                                 class I complex which may act as a decoy.                 (but in RNA viruses)
                                                                                                                                  E      D                                                                                                                                                                                                                                                                                                                                                                                  ●  There is a precedent for viral entry via TNFR: HSV     ●  VA28_MCV does not appear to stimulate any of the
                                                                                                                                                                                      CE-generated alignment                                                                                                                                                                                                                                                                                                                   enters via TNFRSF14/HveA/HVEM.                            orphan receptors. Non-orphans have not been tested.
                                                                                                                                                                                      141 aligned residues
                                                                                                                                                                                      2.2 Å RMSD (backbone)
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            ●  MCV infects keratinocytes, which are known to
                                                                                                                                                                                      26% Identity (c.f. 19% by S-W)                                                                             5. On-the-fly re-threading of sequence 8602 to                                                                                                                                                                                express TNFR during their development
                                                                                                                                                                                      c.f.   0.71 Å RMSD / 65 AA
                                                                                                                                                                                             0.78 Å RMSD / 48 AA 1aly-1c28a
                                                                                                                                                                                                                                                                                                 the TRAIL ligand viewed with RasMOL (PyMOL
                                                                                                                                                                                                                                                                                                 and JMol are also supported).




About Unison                                                                                                                                                                                                                  Unison Contents                                                                                                             Conclusions and Directions                                                                                                                                      Acknowledgments
Unison is a database of non-redundant protein sequences, diverse computational                                                                                                                                                ● >5M distinct sequences from >40 reliable and speculative sources covering >9900                                           ● We have identified several candidate TNF ligands among curated and speculative                                                                                Kiran Mukhyala and David Cavanaugh have contributed immensely to Unison.
predictions based on these sequences, and extensive auxiliary data which facilitate                                                                                                                                             species                                                                                                                     human sequence databases, six frame translations of the R34 release of the human
interpretations of the predictions. The intent is to provide an integrated resource for                                                                                                                                       ● features and alignments from BLAST, PSI-BLAST, HMMER, Prospect threading, GPI                                               genome, and pathogenic sequence, but none appear to bind the orphan TNF                                                                                       The TNF mining effort was a multi-year collaboration within Genentech and included:
complex feature-based mining for target discovery and target elimination. Unison                                                                                                                                                anchoring, TM detection, signal prediction, cellular localization, genomic localization,                                    receptors.                                                                                                                                                    Vishva Dixit, Wayne Fairbrother, Sarah Hymowitz, Nobuhiko Kayagaki, Nick Skelton,
includes command line tools and a web interface. The schema, tools, web interface,                                                                                                                                              regular expressions, CE alignments, and secondary structure prediction                                                    ● A large number of C1q-like sequences exist in the human genome.                                                                                               Minhong Yan, and Zemin Zhang.
and dumps of non-proprietary data have recently been released under the Academic                                                                                                                                              ● external databases: NCBI taxonomy, HomoloGene, GO, PDB (w/enumerated seqres-                                              ● Unison has facilitated the management, update, and analysis of an enormous amount


Free License and are available at http://unison-db.sourceforge.net/ .                                                                                                                                                           resid mapping), SCOP, MINT, Derwent Patent Database                                                                         of diverse precomputed data.                                                                                                                                  Thanks to Genentech and William Wood for providing a great place to work.

More Related Content

Viewers also liked

The Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment DiscrepanciesThe Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment DiscrepanciesReece Hart
 
Invitae PSB 2014 poster
Invitae PSB 2014 posterInvitae PSB 2014 poster
Invitae PSB 2014 posterReece Hart
 
AWS Life Sciences
AWS Life SciencesAWS Life Sciences
AWS Life SciencesReece Hart
 
Building a clinical genome interpretation services company
Building a clinical genome interpretation services companyBuilding a clinical genome interpretation services company
Building a clinical genome interpretation services companyReece Hart
 
Unison: Enabling easy, rapid, and comprehensive proteomic mining
Unison: Enabling easy, rapid, and comprehensive proteomic miningUnison: Enabling easy, rapid, and comprehensive proteomic mining
Unison: Enabling easy, rapid, and comprehensive proteomic miningReece Hart
 
Clinical significance of transcript alignment discrepancies gne - 20141016
Clinical significance of transcript alignment discrepancies   gne - 20141016Clinical significance of transcript alignment discrepancies   gne - 20141016
Clinical significance of transcript alignment discrepancies gne - 20141016Reece Hart
 
HGVS 2015 poster: hgvs, uta, variantanalyzer
HGVS 2015 poster: hgvs, uta, variantanalyzerHGVS 2015 poster: hgvs, uta, variantanalyzer
HGVS 2015 poster: hgvs, uta, variantanalyzerReece Hart
 

Viewers also liked (7)

The Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment DiscrepanciesThe Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment Discrepancies
 
Invitae PSB 2014 poster
Invitae PSB 2014 posterInvitae PSB 2014 poster
Invitae PSB 2014 poster
 
AWS Life Sciences
AWS Life SciencesAWS Life Sciences
AWS Life Sciences
 
Building a clinical genome interpretation services company
Building a clinical genome interpretation services companyBuilding a clinical genome interpretation services company
Building a clinical genome interpretation services company
 
Unison: Enabling easy, rapid, and comprehensive proteomic mining
Unison: Enabling easy, rapid, and comprehensive proteomic miningUnison: Enabling easy, rapid, and comprehensive proteomic mining
Unison: Enabling easy, rapid, and comprehensive proteomic mining
 
Clinical significance of transcript alignment discrepancies gne - 20141016
Clinical significance of transcript alignment discrepancies   gne - 20141016Clinical significance of transcript alignment discrepancies   gne - 20141016
Clinical significance of transcript alignment discrepancies gne - 20141016
 
HGVS 2015 poster: hgvs, uta, variantanalyzer
HGVS 2015 poster: hgvs, uta, variantanalyzerHGVS 2015 poster: hgvs, uta, variantanalyzer
HGVS 2015 poster: hgvs, uta, variantanalyzer
 

Similar to Mining for Novel TNF Ligands

Experimentos de nubes científicas: Medical Genome Project
Experimentos de nubes científicas: Medical Genome ProjectExperimentos de nubes científicas: Medical Genome Project
Experimentos de nubes científicas: Medical Genome ProjectFundación Ramón Areces
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
BTC 810 Analysis of Transcriptomes.pptx
BTC 810 Analysis of Transcriptomes.pptxBTC 810 Analysis of Transcriptomes.pptx
BTC 810 Analysis of Transcriptomes.pptxChijiokeNsofor
 
Detection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomesDetection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomesKlaas Vandepoele
 
Biological versus computer viruses
Biological versus computer virusesBiological versus computer viruses
Biological versus computer virusesUltraUploader
 
Forsharing cshl2011 sequencing
Forsharing cshl2011 sequencingForsharing cshl2011 sequencing
Forsharing cshl2011 sequencingSean Davis
 
DNA Sequencing in Phylogeny
DNA Sequencing in PhylogenyDNA Sequencing in Phylogeny
DNA Sequencing in PhylogenyBikash1489
 
Bioinformatics.Practical Notebook
Bioinformatics.Practical NotebookBioinformatics.Practical Notebook
Bioinformatics.Practical NotebookNaima Tahsin
 
Human genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsHuman genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsgroovescience
 
Impact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEGImpact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEGLong Pei
 
Bioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxBioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxxRowlet
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 

Similar to Mining for Novel TNF Ligands (20)

Experimentos de nubes científicas: Medical Genome Project
Experimentos de nubes científicas: Medical Genome ProjectExperimentos de nubes científicas: Medical Genome Project
Experimentos de nubes científicas: Medical Genome Project
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Predicting Functional Regions in Genomic DNA Sequences Using Artificial Neur...
Predicting Functional Regions in Genomic DNA Sequences Using  Artificial Neur...Predicting Functional Regions in Genomic DNA Sequences Using  Artificial Neur...
Predicting Functional Regions in Genomic DNA Sequences Using Artificial Neur...
 
Msb201158
Msb201158Msb201158
Msb201158
 
LOKITHESWARI VIPPALA
LOKITHESWARI VIPPALALOKITHESWARI VIPPALA
LOKITHESWARI VIPPALA
 
CROP GENOME SEQUENCING
CROP GENOME SEQUENCINGCROP GENOME SEQUENCING
CROP GENOME SEQUENCING
 
Metagenomics
MetagenomicsMetagenomics
Metagenomics
 
BTC 810 Analysis of Transcriptomes.pptx
BTC 810 Analysis of Transcriptomes.pptxBTC 810 Analysis of Transcriptomes.pptx
BTC 810 Analysis of Transcriptomes.pptx
 
Transcriptome analysis
Transcriptome analysisTranscriptome analysis
Transcriptome analysis
 
2011-NAR
2011-NAR2011-NAR
2011-NAR
 
Detection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomesDetection of genomic homology in eukaryotic genomes
Detection of genomic homology in eukaryotic genomes
 
Biological versus computer viruses
Biological versus computer virusesBiological versus computer viruses
Biological versus computer viruses
 
Forsharing cshl2011 sequencing
Forsharing cshl2011 sequencingForsharing cshl2011 sequencing
Forsharing cshl2011 sequencing
 
DNA Sequencing in Phylogeny
DNA Sequencing in PhylogenyDNA Sequencing in Phylogeny
DNA Sequencing in Phylogeny
 
Bioinformatics.Practical Notebook
Bioinformatics.Practical NotebookBioinformatics.Practical Notebook
Bioinformatics.Practical Notebook
 
Human genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsHuman genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traits
 
HGP, the human genome project
HGP, the human genome projectHGP, the human genome project
HGP, the human genome project
 
Impact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEGImpact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEG
 
Bioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxBioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptx
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 

Recently uploaded

New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 

Recently uploaded (20)

Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 

Mining for Novel TNF Ligands

  • 1. Mining for Novel TNF Ligands Using Unison, http://unison-db.sourceforge.net/ an Open Source Database for Target Discovery Reece Hart <rkh@gene.com>  Departments of Bioinformatics and Protein Engineering  Genentech, Inc.  San Francisco, CA 94080 Abstract Mining Curated Sequence Databases Mining Six-Frame Translations of the Human Genome Mining Pathogenic Sequences for TNF-like Structures Tumor Necrosis Factor (TNF) ligands, acting through their cognate TNF receptors, are We've mined public and proprietary sequence sources using many methods, including Most TNF ligands are encoded in the Human Genome with the majority of the TNF Because extensive expression cloning and computational prediction failed to identify a critical to numerous immunological responses, including B and T cell differentiation, hidden Markov models and PSI-BLAST profiles from Pfam, CDD, Superfamily, and domain in a single exon. This suggests that it might be possible to detect novel TNFs novel human TNF ligand which bound any of the orphan TNF receptors, we began to apoptosis, and inflammation. Several “orphan” TNF receptors exist for which the custom sequence- and structure-based alignments, and threading using Prospect (Xu by scanning naïve six-frame translations of ORFs. For calibration of scoring functions, consider the possibility that these receptors might bind pathogenic proteins either as a corresponding ligands are unknown. Over the past several years, we have undertaken and Xu) and ProHit (Sippl). The figures below outline one way to integrate and analyze we instead chose to scan fixed-length subsequences of 6-frame translations, as shown surveillance mechanism or as an exploited “security hole” (as with herpes virus binding attempts to identify these unknown ligands from curated protein sequences, six-frame these data in Unison. below. to HVEM, a TNF receptor). Recently, a new sequence appeared in Swiss-Prot which translations of the human genome, and from pathogenic sequences. This poster = mouse click threads extremely well to TNF backbones and occurs in a virus known for its host Six-Frame Translation and Threading Method summarizes these efforts and introduces Unison, an Open Source database for evasion mechanisms. organizing and mining complex proteomic data. 1 2 3 4 5 6 7 X 8 9 10 11 12 13 14 15 16 17 18 19 20 Y 21 22 3B bp Tumor Necrosis Factor Ligand Family 450 NT fragment ● UCSC genome assembly (NHGD34) 450bp w/150bp overlap generates: TNF ligands are type II membrane proteins which belong to the C1q-TNF superfamily ● – 10 M fragments and signal through corresponding TNF receptors. Three putative TNF receptors have no – 60M 6-frame translations ~500M ORF fragments known ligand, and this suggests that other ligands remain to be discovered. Most TNF ≤150 AA six-frame translations – – 27M fragments w/length ≥50AA ( ) domains are encoded by a single exon and bind one distinct TNF receptor, although X – fragments <50AA ( X ) were discarded 27M fragments were threaded against 22 TNF there are exceptions to both rules. The currently known TNF ligand-receptor ● superfamily members (TNF+C1q) interactions and exon structures are shown below. X ● 900K (of 27M) had score <=250; each was 1. Viral sequences sorted by the best TNF-C1q threading 1c28, 1gr3 1d2q,1dg6 1jh5, 1kxg threaded against 3286 representative chains X X X 1aly, 1i9r 1iqa, 1jtz 2. Review candidates 2tnf (mus) “raw” score. Others: ● total time: 176 CPU-weeks (4 weeks on 22 2-cpu 9sgh TNF Family Exon Structure machines) Clicking any of the classified results at left returns a list 1tnf, Most TNF domains are encoded within a single exon X X X VA28_MCV is one of a family of orthologous A28 of distinct sequences with their “best” annotations. proteins in poxvirii. 0 50 100 150 200 250 1. Integrating multiple search methods Lta 1 TNFa 2 Ltb 3 A single Unison page allows users to select and 2. Threading results for VA28_MCV aligned to 3286 FSSP OX40L 4 integrate results from HMMs, PSSMs, and Prospect2 1d0g 1d4v 1du3 representative backbones. TNF and C1q family members CD40L 5 threadings to any family of models (TNFs in this Distribution of Prospect2 raw scores FasL 6 histogram shows the distribution of the best (lowest) are among the best fold recognition templates. CD27L 7 case). “Hits” are then classified into true positives, “raw” score for the alignment of each 150AA six-frame 1tnr CD30L 8 false negatives, and “unknown” positives 4­1BBL 9 (candidates) by reference to a curated list of known translation fragment to TNF-C1q superfamily backbones. Fragment threading identifies NP_848635.1 Frequency Fragment 8602 is highlighted and shown as an example 1bzi* TRAIL 10 Screenshots showing ambiguous alignment to different regions on RANKL 11 family members. below. chr 13. TWEAK 12 APRIL 13 Unfortunately, only distinctly C1q-like proteins have been NP BLyS 13B LIGHT 14 identified so far. VEGI AITRL 15 18 TBD: 166 w/score ≤ -120 ▶ EDA (max TNF fragment score = -154) Exon TNF Domain analyzed: 76 w/score ≤ -200 Adapted from Bodmer, Schneider, Tschopp TiBS 27(1): 19-26 (2002). Twenty-two structures of TNF and C1q structures are known, all of which have 8602 4. Genomic map. Best raw score to any TNF SF member (lower is better) profound structural similarity among the ligands despite very poor sequence similarity (average pairwise identity is between ~ 9 and ~30%). Identifying TNFs by sequence- Unison contains rudimentary protein-to-genome alignments using BLAT. This sequence has a high- based methods is difficult because of the poor sequence conservation and their quality orthologous C-terminal fragment from mouse. 3. For comparison, the alignment of Apo2L/TRAIL to similarity to C1q proteins, which are not relevant to our interest in ligands for the Clicking the map opens an in-house viewer with more the same FSSP representatives. The raw score for the extensive genomic mapping data. Threading of Unison:8602 to 1c28a alignment of VA28_MCV to 1gr3a, a TNF-C1q family orphan receptors. Unison provides on-the-fly threading visualization via JMol, PyMOL, member, is denoted by the red triangle (▶) and is and RasMOL. (PyMOL is used below.) Legend: blue=identity; cyan=similarity; red=dissimilarity; blue cyan red comparable to those for alignments of known TNFs to CD40L (1aly) structure-based alignment of two TNFs by CE yellow=cysteine; yellow spacefill= conserved cysteine; grey=query yellow spacefill grey other TNF-C1q structures. gap/template insert; >nAA< = query insert/template gap Threading Results for Fragment 8602 4. A28 aligned to CD40L. A' looks more C1q-like than TNF-like, but close Legend: blue=identity; cyan=similarity; red=dissimilarity; blue cyan red A 120º B yellow=cysteine; yellow spacefill= conserved cysteine; grey=query yellow spacefill grey gap/template insert; >nAA< = query insert/template gap H B' C F A' Reasons for hope: Reasons for doubt: 3. Summary of features for Unison:8602. ● VA28_MCV has a signal peptide and is known to be on ● threading alignment has a significant deletion (but is G H A viral coat; conditional mutants abolish entry nearly as good as other intra-TNF family alignments) E D C MCV has numerous genes for host evasion, including A28 doesn't thread as well to other TNF backbones B ● ● F B' 1aly (CD40L) homologs for a Death Effector Domain which inhibits ● other A28s don't thread well to TNFs 90º 1tnf (TNFα) caspase-8 (also found in HSV), IL18 BP, and MHC ● some viral capsid proteins also have a similar fold G class I complex which may act as a decoy. (but in RNA viruses) E D ● There is a precedent for viral entry via TNFR: HSV ● VA28_MCV does not appear to stimulate any of the CE-generated alignment enters via TNFRSF14/HveA/HVEM. orphan receptors. Non-orphans have not been tested. 141 aligned residues 2.2 Å RMSD (backbone) ● MCV infects keratinocytes, which are known to 26% Identity (c.f. 19% by S-W) 5. On-the-fly re-threading of sequence 8602 to express TNFR during their development c.f. 0.71 Å RMSD / 65 AA 0.78 Å RMSD / 48 AA 1aly-1c28a the TRAIL ligand viewed with RasMOL (PyMOL and JMol are also supported). About Unison Unison Contents Conclusions and Directions Acknowledgments Unison is a database of non-redundant protein sequences, diverse computational ● >5M distinct sequences from >40 reliable and speculative sources covering >9900 ● We have identified several candidate TNF ligands among curated and speculative Kiran Mukhyala and David Cavanaugh have contributed immensely to Unison. predictions based on these sequences, and extensive auxiliary data which facilitate species human sequence databases, six frame translations of the R34 release of the human interpretations of the predictions. The intent is to provide an integrated resource for ● features and alignments from BLAST, PSI-BLAST, HMMER, Prospect threading, GPI genome, and pathogenic sequence, but none appear to bind the orphan TNF The TNF mining effort was a multi-year collaboration within Genentech and included: complex feature-based mining for target discovery and target elimination. Unison anchoring, TM detection, signal prediction, cellular localization, genomic localization, receptors. Vishva Dixit, Wayne Fairbrother, Sarah Hymowitz, Nobuhiko Kayagaki, Nick Skelton, includes command line tools and a web interface. The schema, tools, web interface, regular expressions, CE alignments, and secondary structure prediction ● A large number of C1q-like sequences exist in the human genome. Minhong Yan, and Zemin Zhang. and dumps of non-proprietary data have recently been released under the Academic ● external databases: NCBI taxonomy, HomoloGene, GO, PDB (w/enumerated seqres- ● Unison has facilitated the management, update, and analysis of an enormous amount Free License and are available at http://unison-db.sourceforge.net/ . resid mapping), SCOP, MINT, Derwent Patent Database of diverse precomputed data. Thanks to Genentech and William Wood for providing a great place to work.