ISA Commons / BioSharing - Susanna-Assunta Sansone - ISMB 2012


  1. ISMB hashtag: #PP44. Highlights Track: Databases and Ontologies. Toward interoperable bioscience data. Susanna-Assunta Sansone, PhD, Principal Investigator, Team Leader, University of Oxford e-Research Centre, Oxford, UK. @isatools @biosharing. ISMB 2012, Long Beach, California, USA, July 15-17.
  2. What is this presentation about? ISA Commons, a grass-roots collaborative that works to facilitate the collection, curation and sharing of experiments in an increasingly diverse set of life science domains, using a common, structured representation of the experiments that: transcends individual biological and technological domains; follows the appropriate community norms and standards, many listed in the BioSharing catalogue; and is implemented by several curation, storage and data sharing tools. Reference: Sansone SA, Rocca-Serra P, Field D, Maguire E, Taylor C, Hofmann O, Fang H, Neumann S, Tong W, Amaral-Zettler L, Begley K, Booth T, Bougueleret L, Burns G, Chapman B, Clark T, Coleman LA, Copeland J, Das S, de Daruvar A, de Matos P, Dix I, Edmunds S, Evelo C, Forster M, Gaudet P, Gilbert J, Goble C, Griffin J, Jacob D, Kleinjans J, Harland L, Haug K, Hermjakob H, Sui S, Laederach A, Liang S, Marshall S, Merrill E, McGrath A, Reilly D, Roux M, Shamu C, Shang C, Steinbeck C, Trefethen A, Williams-Jones B, Wolstencroft K, Xenarios J, Hide W. Towards interoperable bioscience data. Feb 2012. doi:10.1038/ng.1054. www.biosharing.org www.isacommons.org
  3. From reusable data to reproducible research. To make the datasets comprehensible, interoperable and reusable, underpinning future investigations, we need common ways to report and share the experimental details and the associated results. Consistent reporting will have a positive and long-lasting impact on the value of collective scientific outputs. The International Conference on Systems Biology (ICSB), 22-28 August, 2008 Susanna-Assunta Sansone www.ebi.ac.uk/net-project
  4. Structured description of datasets. Capture all salient features of the experimental workflow; make annotation explicit and discoverable; structure the descriptions for consistency, tracking independent variables and dependent variables, using cross-references and resolvable identifiers.
  5. Not too much, not too little, just ‘right’. We must strike a balance between depth and breadth of information, and sufficient information required to reuse the data.
  6. Example of experiments by InnoMed PredTox, an FP6 public-private consortium: experimental design; sample characteristic(s); experimental variable(s); technology(s); measurement(s); protocol(s); data file(s); ...
  7. A ‘general mobilization’ to develop standards, e.g.: use the same word and refer to the same ‘thing’; allow data to flow from one system to another; report the same core, essential information. Challenges: different communities, different norms and standards, lack of coordination, fragmentation and uneven coverage…
  8. Growing number of reporting standards. Each one focuses on a particular biological or technological domain. Counts shown on the slide: + 303, + 150, + 130 (sources cited: MIBBI, EQUATOR, BioPortal, and an estimate). Examples: MAGE-Tab, AAO, MIAME, GCDML, MIAPA, CHEBI, SRAxml, OBI, MIRIAM, VO, SOFT, MIQAS, FASTA, PATO, MIX, CML, ENVO, REMARK, DICOM, MIGEN, GELML, MOD, SBRML, MIAPE, MIQE, TEDDY, MITAB, MzML, XAO, CIMR, CONSORT, BTO, ISA-Tab, SEDML…, DO, PRO, IDO…, MIASE, MISFISHIE…
  9. A catalogue to map the landscape of standards: over 400 bio-standards (public and in curation). Field*, Sansone* et al., Omics data sharing. Science 326, 234-36 (2009) doi:10.1126/science.1180598
  10. Example of a multi-assay study: how many ‘standards’ are applicable to this?
  14. User community
  15. Metadata tracking framework, designed to support the use of several standards (checklists, terminologies) and conversions to (a growing number of) other metadata formats used by public repositories, e.g. MAGE-Tab, Pride-xml, SRA-xml, SOFT. Currently finalizing conversion to RDF to explore the growing Linked Data universe, in collaboration with the W3C HCLSIG.
  16. ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level (Rocca-Serra et al., 2010). A collaborative effort of international research/service groups: University of Oxford, EBI, Harvard School of Public Health, NERC Environmental Bioinformatics Centre, Genomic Standards Consortium, US FDA Center for Bioinformatics, Leibniz Institute of Plant Biochemistry and more…
  17. To mint DOIs; empowering researchers to use standards
  18. Maguire E, Rocca-Serra P, Sansone SA, Davies J and Chen M. Taxonomy-based Glyph Design -- with a Case Study on Visualizing Workflows of Biological Experiments. IEEE Transactions on Visualization and Computer Graphics, volume 18, 2012 (in press)
  19. Ontology Search and Tagging in Google Spreadsheets
  21. A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework to facilitate standards-compliant collection, curation, management and reuse of investigations in an increasingly diverse set of life science domains, including: environmental health; environmental genomics; metabolomics; metagenomics; nanotechnology; proteomics; stem cell discovery; systems biology; transcriptomics; toxicogenomics; also by communities working to build a library of cellular signatures. We aim to achieve a common representation of experimental content that transcends individual bioscience domains.
  22. A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework (domains as listed on the previous slide). Some of the public groups/resources and some of the internal projects: Stem Cell Commons; Nanotechnology Informatics Working Group.
  23. Implementations at Harvard: ISA
  24. Implementations at Harvard: importance of a local community
  25. Implementations at Harvard: data sharing in ISA-Tab; importance of a local community
  26. Implementation at the EBI
  27. Data papers
  28. Extensions: Nanotechnology Informatics Working Group
  29. @isatools @biosharing isacommons.org biosharing.org
  30. Towards interoperable bioscience data, Sansone SA et al., Feb 2012, doi:10.1038/ng.1054. www.biosharing.org www.isacommons.org. Development timeline, 2007-2012. Community involvement and uptake: 1st ISA-Tab workshop; 2nd ISA-Tab workshop; 3rd ISA-Tab workshop; user workshops/visits start; 1st public instance: Harvard Stem Cell Discovery Engine; other tools implement ISA-Tab; growing number of systems starts to adopt ISA framework. Core developments: strawman ISA-Tab spec; final ISA-Tab spec; ISA software v1; conversions to Pride-XML/SRA-XML/MAGE-Tab and more; links to analysis tools starts; database instance at EBI; RDF format starts. Publications: workshop reports; Omics data sharing (Science); ISA-Tab and ISA software suite (Bioinformatics); Stem Cell Discovery Engine (NAR); ISA Commons (Nature Genetics).
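
The structured description outlined on slides 4 and 6 (sample characteristics, experimental variables, technology, measurement, data files) and the conversion to other formats mentioned on slide 15 can be illustrated with a small Python sketch. This is not the ISA tools' data model or API: the class names, the column names, the predicate names and the http://example.org/ base URI are all hypothetical, chosen only to show one record being rendered both as a tab-delimited table and as RDF-style triples.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical base URI, used only to build example identifiers.
BASE = "http://example.org/study/"


@dataclass
class Assay:
    """One assay: what was measured, with which technology, and its files."""
    measurement: str
    technology: str
    data_files: List[str] = field(default_factory=list)


@dataclass
class Sample:
    """A sample with its characteristics and experimental variables."""
    identifier: str
    characteristics: Dict[str, str]  # e.g. organism
    variables: Dict[str, str]        # independent variables, e.g. dose
    assays: List[Assay] = field(default_factory=list)


def to_table(samples: List[Sample]) -> str:
    """Flatten samples into tab-delimited text, one row per data file."""
    rows = [["sample_id", "measurement", "technology", "data_file"]]
    for sample in samples:
        for assay in sample.assays:
            for path in assay.data_files:
                rows.append([sample.identifier, assay.measurement,
                             assay.technology, path])
    return "\n".join("\t".join(row) for row in rows)


def to_triples(sample: Sample) -> List[str]:
    """Render one sample as N-Triples-style lines.

    No escaping is performed: keys must be simple tokens without spaces,
    and values must not contain quotes or backslashes.
    """
    subject = f"<{BASE}sample/{sample.identifier}>"
    lines = []
    for name, value in sample.characteristics.items():
        lines.append(f'{subject} <{BASE}characteristic/{name}> "{value}" .')
    for name, value in sample.variables.items():
        lines.append(f'{subject} <{BASE}variable/{name}> "{value}" .')
    for assay in sample.assays:
        for path in assay.data_files:
            lines.append(f'{subject} <{BASE}dataFile> "{path}" .')
    return lines


# Invented example record, for illustration only.
example = Sample(
    identifier="S1",
    characteristics={"organism": "rat"},
    variables={"dose": "high"},
    assays=[Assay("metabolite_profiling", "mass_spectrometry", ["s1_ms.raw"])],
)
print(to_table([example]))
print("\n".join(to_triples(example)))
```

Keeping a single in-memory record and deriving each output format from it is one way to read the "conversions to other metadata formats" idea: the description is captured once and each target format is a separate rendering function.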
