
PRO Use Cases for Scientific Communities


Semantic Resources Project PRO (Protein Ontology) Use Cases for Scientific Communities - 3rd PRO meeting - 26 April 2010

Published in: Technology


  1. 1. Semantic Resources Project<br />PRO (Protein Ontology) Use Cases for Scientific Communities<br />Paolo Ciccarese, PhD<br />Mass General Hospital<br />Harvard Medical School<br />
  2. 2. Semantic Resources Project<br />Collaboration between<br />Mass General Hospital<br />Science Commons<br />Alzforum<br />Goal: creating a set of semantic resources<br />Of interest for the SWAN and SCF projects<br />Following best practices<br />Two examples of PRO-related resources:<br />Proteins of interest in AlzSWAN (APP, Tau, Synuclein)<br />Antibodies (immunogens and antigens)<br />
  3. 3. SWAN (Semantic Web Application in Neuromedicine)<br />It is a Web-based collaborative program that aims to organize and annotate scientific knowledge about neurodegenerative disorders.<br />Collaboration of Mass General Hospital with Alzforum<br />Its goal is to facilitate the formation, development and testing of hypotheses about neurodegenerative diseases.<br />It is about scientific discourse<br /><br />
  4. 4. AlzSWAN (SWAN for Alzheimer)<br /><br />
  5. 5. AlzSWAN Curation Process<br />[Diagram: Curator; SWAN: browser; SWAN: workbench]<br />1. reading<br />2. extracting<br />3. organizing and encoding<br />4. connecting<br />5. publishing<br />Elements: Hypothesis; Discourse List, Citations; Genes, Proteins<br />
  6. 6. Hypothesis list<br /><br />
  7. 7. Hypothesis<br /><br />
  8. 8. Hypothesis<br /><br />
  9. 9. The hypothesis graph<br />
  10. 10. AlzSWAN proteins management<br />When the curators need a new protein they import it automatically from UniProt<br />Data are stored according to the SWAN LSEs Ontology<br />UniProt does not give access to all the info we need (isoforms, cleavage products…)<br />Willing to use the PRO ontology and others when available (and demonstrated to cover our needs)<br /><br />
  11. 11. AlzSWAN<br /><ul><li>In AlzSWAN we have statements related to gene or protein families that do not specify a single species (human vs mouse or rat), but we also have species-specific statements
  12. 12. Post-translationally modified proteins have distinct roles
  13. 13. Protein isoforms from a single gene have different physiological vs. pathological functions
  14. 14. Cleavage products have functions distinct from the parent protein
  15. 15. Example: APP protein: Abeta peptides, soluble APP, AICD
  16. 16. Proteins are assembled into protein complexes, and these are affected by isoform and post-translational modification
  17. 17. Example: Tau protein: the number of microtubule-binding domains (which exons of the gene are used) and the phosphorylation state of the protein affect microtubule (MT) binding or fibril formation
  18. 18. Example: Abeta peptides form dimers, oligomers, amyloid plaque</li></ul>SWAN Proteins: Gwen Wong & Elizabeth Wu (Alzforum)<br />
  19. 19. APP is a gene that encodes a complex set of proteins<br /><ul><li>A single locus: single copy per haploid genome
  20. 20. Multiple splice variant mRNAs
  21. 21. Multiple protein isoforms expressed
  22. 22. Some isoforms are ubiquitous, others neuron specific or T cell-specific
  23. 23. Variety of conserved functional domains throughout length of protein
  24. 24. Isoform differences in major domains may have functional consequences
  25. 25. APP interacts with multiple proteins
  26. 26. Mutations throughout protein associated with familial Alzheimer Disease
  27. 27. Cleavage products generated by α-, β- and γ-secretase are involved in both normal and pathogenic protein functions.</li></ul>SWAN Proteins: Gwen Wong & Elizabeth Wu (Alzforum)<br />
  28. 28. SWAN Protein Ontology is NOT <br />just for APP<br />SWAN Proteins: Gwen Wong & Elizabeth Wu (Alzforum)<br />
  29. 29. Summary of SWAN needs<br /><ul><li> Protein Ontology needs vocabulary and relationships that include:
  30. 30. Isoforms
  31. 31. Splice variants
  32. 32. Cleavage products
  33. 33. Mutations
  34. 34. Protein complexes: homomultimeric and heteromultimeric
  35. 35. Tissue-specific expression, protein complexes are important
  36. 36. Intracellular protein trafficking and post-translational modifications are important</li></ul>Goals for SWAN:<br />Research statements attributed to APP can be specifically attributed to:<br /><ul><li> an isoform of APP
  37. 37. a cleavage product of APP
  38. 38. a mutated form of APP
  39. 39. a protein complex formed with APP or its cleavage products</li></ul>SWAN Proteins: Gwen Wong & Elizabeth Wu (Alzforum)<br />
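The goal stated above, attributing a research statement to a specific form of APP rather than to APP in general, can be sketched as a small data model. This is only an illustration: the class names, the `statements_about` helper, and the labels are assumptions made for this example and are not taken from SWAN or PRO. The placeholder statement is not a real research statement.

```python
from dataclasses import dataclass
from enum import Enum


class FormKind(Enum):
    """The four kinds of APP-related entity listed in the SWAN goals."""
    ISOFORM = "isoform"
    CLEAVAGE_PRODUCT = "cleavage product"
    MUTATED_FORM = "mutated form"
    COMPLEX = "protein complex"


@dataclass(frozen=True)
class ProteinForm:
    parent: str     # the parent protein, e.g. "APP"
    kind: FormKind
    label: str      # free-text label; NOT a PRO identifier


@dataclass(frozen=True)
class Statement:
    text: str
    about: ProteinForm


def statements_about(statements, parent, kind=None):
    """Return statements on any form of `parent`, optionally limited to one kind."""
    return [
        s for s in statements
        if s.about.parent == parent and (kind is None or s.about.kind == kind)
    ]


abeta = ProteinForm("APP", FormKind.CLEAVAGE_PRODUCT, "Abeta peptides")
isoform = ProteinForm("APP", FormKind.ISOFORM, "example APP isoform")  # placeholder

statements = [
    # Text taken from the slides' own example.
    Statement("Abeta peptides form dimers, oligomers, amyloid plaque", abeta),
    # Placeholder text, not a real research statement.
    Statement("placeholder statement about one isoform", isoform),
]

for s in statements_about(statements, "APP", FormKind.CLEAVAGE_PRODUCT):
    print(s.about.label, "->", s.text)
```

Keeping the parent protein on every form means a query for "everything about APP" still works, while a curator can narrow it to cleavage products or isoforms only.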
  40. 40. AlzSWAN curators' interaction with PRO<br />APP isoforms and cleavage products have been modeled<br />These required changes in the PRO model<br />Currently working on Protein Complexes<br />
  41. 41. AlzSWAN<br />PRO not only for SWAN!<br />
  42. 42. Science Collaboration Framework<br />Reusable software that can be used to develop web-based, collaborative, scientific communities<br />Support interdisciplinary scientists in publishing, annotating, sharing and discussing content such as articles, perspectives, interviews and news items, as well as assert personal biographies and research interests<br />Applied for the development of the StemBook and PDOnline online communities <br /><br />
  43. 43. StemBook<br />
  44. 44. PDOnline<br />
  45. 45. PRO is attractive for our communities<br />AlzSWAN<br />StemBook<br />PDOnline<br />Others coming up…<br />
  46. 46. Some thoughts about the <br />PRO terms submission process<br />
  47. 47. PRO terms are submitted by hand<br />RACE-PRO web form<br />UniProtKB identifier<br />Annotations<br />Database xrefs<br />“comments”<br />Some requests can’t be submitted through RACE-PRO<br />Protein families are submitted through SourceForge<br />
  48. 48. Batch request format is better for integrating with applications<br />*Manually curated<br /> HSPs related antibodies<br /><ul><li>Tabular format
  49. 49. Essential fields from RACE-PRO
  50. 50. “Tracer ID” tracks individual rows
  51. 51. Supports automated terms management processes
  52. 52. Tracer ID can be used as a “temporary” URI</li></ul>A “Term Broker” should manage interactions between PRO and applications<br />A term broker is server software that assembles batch requests for PRO<br />Receives requests from the modeler<br />Produces placeholder URIs<br />Submissions to PRO are made in batch<br />Additional protocol supports substitution of placeholder with final URIs<br />
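The term broker described above can be sketched in a few lines. This is a minimal in-memory illustration, not the project's implementation: the class and method names, the tab-separated column names, and the `http://example.org/...` namespaces are all assumptions made for this example. The real RACE-PRO fields and the actual batch format are not specified in the slides.

```python
import csv
import io
import uuid
from dataclasses import dataclass


@dataclass
class TermRequest:
    tracer_id: str    # tracks this individual row of the batch
    uniprot_id: str
    name: str
    comment: str = ""


class TermBroker:
    # Hypothetical namespace for placeholder ("temporary") URIs.
    PLACEHOLDER_BASE = "http://example.org/placeholder/"

    def __init__(self):
        self._pending = []   # requests not yet sent in a batch
        self._final = {}     # placeholder URI -> final URI

    def request_term(self, uniprot_id, name, comment=""):
        """Queue a request from the modeler and return a placeholder URI."""
        tracer_id = uuid.uuid4().hex
        self._pending.append(TermRequest(tracer_id, uniprot_id, name, comment))
        return self.PLACEHOLDER_BASE + tracer_id

    def build_batch(self):
        """Serialize pending requests as tab-separated rows and clear the queue."""
        out = io.StringIO()
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        writer.writerow(["tracer_id", "uniprot_id", "name", "comment"])
        for r in self._pending:
            writer.writerow([r.tracer_id, r.uniprot_id, r.name, r.comment])
        self._pending.clear()
        return out.getvalue()

    def register_final(self, tracer_id, final_uri):
        """Record the final URI assigned to a previously requested term."""
        self._final[self.PLACEHOLDER_BASE + tracer_id] = final_uri

    def resolve(self, uri):
        """Substitute a placeholder with its final URI once one is known."""
        return self._final.get(uri, uri)


broker = TermBroker()
placeholder = broker.request_term("P05067", "example APP-derived term")
print(broker.build_batch())
```

The application can store the placeholder URI immediately and keep working. Once a final identifier arrives, `register_final` plus `resolve` performs the substitution step that the slide calls the "additional protocol".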
  53. 53. Antibodies modeling<br />The ‘term broker’ can be used for improving our antibodies modeling process<br />We are starting with 3k antibodies taken from the Alzforum database (20k+ antibodies)<br />The modeling starts with semi-automatic annotation of the record through our tool<br />
  54. 54. Annotation Extracts Structured Information from Antibody Records<br /><ul><li>Automatic dictionaries identify regions of structured text
  55. 55. Java application allows curators to review, reject, or extend annotations
  56. 56. Annotations exported to an OBI RDF format
  57. 57. Term requests submitted as a batch request</li></ul>Annotations Translated Into OBI-Compliant RDF<br />
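The dictionary-based annotation step listed above can be illustrated with a small sketch. The dictionaries, category names, and sample record below are invented for this example and do not come from the Alzforum antibody database. The actual tool is a Java application with curator review. The export to OBI-compliant RDF is not covered here.

```python
import re

# Hypothetical mini-dictionaries: category -> known terms.
DICTIONARIES = {
    "host": ["mouse", "rabbit", "goat"],
    "clonality": ["monoclonal", "polyclonal"],
    "target": ["APP", "Tau", "Synuclein"],
}


def annotate(record):
    """Return (start, end, category, matched_text) for every dictionary hit."""
    annotations = []
    for category, terms in DICTIONARIES.items():
        for term in terms:
            # Whole-word, case-insensitive match of the dictionary term.
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, record, flags=re.IGNORECASE):
                annotations.append((m.start(), m.end(), category, m.group(0)))
    return sorted(annotations)


record = "Mouse monoclonal antibody against Tau"  # invented sample record
for start, end, category, text in annotate(record):
    print(start, end, category, text)
```

Each hit keeps its character offsets, so a review interface could highlight the region and let a curator accept, reject, or extend it before anything is exported.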
  58. 58. Term requests submitted as a batch request<br />We need to generate temporary IDs to be included in the request<br />Temporary IDs are used in the application for regular functioning and for keeping track of the requests<br />We need coded responses for the term requests in order to process them automatically in the application<br />
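To show why coded responses matter, here is a sketch of how an application might process them automatically. The response codes (`ACCEPTED`, `REJECTED`, `NEEDS_INFO`), the row layout, and the function name are assumptions for this example. The slides ask for coded responses but do not define them.

```python
from enum import Enum


class ResponseCode(Enum):
    # Hypothetical codes; the real set would be defined by PRO.
    ACCEPTED = "ACCEPTED"
    REJECTED = "REJECTED"
    NEEDS_INFO = "NEEDS_INFO"


def apply_responses(terms, response_rows):
    """Update locally tracked terms from coded responses.

    terms: dict mapping temporary ID -> {"uri": ..., "status": ...}
    response_rows: iterable of (temporary_id, code, final_uri_or_None)
    Returns the temporary IDs in the response that the application did not know.
    """
    unknown = []
    for temp_id, code, final_uri in response_rows:
        if temp_id not in terms:
            unknown.append(temp_id)
            continue
        status = ResponseCode(code)  # raises ValueError on an unrecognized code
        terms[temp_id]["status"] = status
        if status is ResponseCode.ACCEPTED:
            # Replace the temporary URI with the final one.
            terms[temp_id]["uri"] = final_uri
    return unknown


terms = {
    "tmp-1": {"uri": "http://example.org/placeholder/tmp-1", "status": None},
    "tmp-2": {"uri": "http://example.org/placeholder/tmp-2", "status": None},
}
rows = [
    ("tmp-1", "ACCEPTED", "http://example.org/final/term-1"),
    ("tmp-2", "NEEDS_INFO", None),
    ("tmp-9", "REJECTED", None),
]
print(apply_responses(terms, rows))
```

Because each response row carries the temporary ID and a machine-readable code, the application can update its records without a curator reading free-text replies.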
  59. 59. People<br />Mass General Hospital<br /><ul><li>Paolo Ciccarese
  60. 60. Timothy Danford
  61. 61. Tim Clark
  62. 62. Sudeshna Das</li></ul>Science Commons<br /><ul><li>Alan Ruttenberg
  63. 63. Jonathan Rees
  64. 64. Kaitlin Thaney</li></ul>Alzforum<br /><ul><li>Elizabeth Wu
  65. 65. Gwen Wong</li>