Susanna Sansone at DataCite: The ISA-Commons - experiences from the field

Susanna-Assunta Sansone's talk at the DataCite Summer meeting in Copenhagen on "The ISA-Commons - experiences from the field", 14th June 2012

Transcript of "Susanna Sansone at DataCite: The ISA-Commons - experiences from the field"

  1. The ISA Commons: experiences from the bioscience field. Susanna-Assunta Sansone, PhD, Principal Investigator, Team Leader, University of Oxford e-Research Centre, Oxford, UK. http://uk.linkedin.com/in/sasansone #biosharing. DataCite Summer Meeting, DIGITAL RESEARCH DATA IN PRACTICE: solutions for improving discovery, access and use. June 14, 2012, Copenhagen
  2. Reproducible research: • annotated research data and methods offer new discovery opportunities and prevent unnecessary repetition of work; • improved data sharing underpins science of the future; • but ... shared data have little or no value if they are not interpretable and, consequently, reusable. (Image from datacite.org)
  3. Reproducibility. Ioannidis et al., Repeatability of published microarray gene expression analyses. Nature Genetics 41(2), 149-55 (2009) doi:10.1038/ng.295
  4. Reproducibility. Ioannidis et al., Repeatability of published microarray gene expression analyses. Nature Genetics 41(2), 149-55 (2009) doi:10.1038/ng.295
  5. Reproducibility. Ioannidis et al., Repeatability of published microarray gene expression analyses. Nature Genetics 41(2), 149-55 (2009) doi:10.1038/ng.295
  6. Reproducibility
  7. Across studies and groups
  8. Reproducibility
  9. NO to ‘data blobs’; YES to verifiable, complete and structured information. (Image from datacite.org)
  10. Structured description of datasets: • Capture all salient features of the experimental workflow • Make annotation explicit and discoverable • Structure the descriptions for consistency, tracking independent variables and dependent variables, using cross references and resolvable identifiers
  11. Not too much, not too little, just ‘right’: we must strike a balance between • depth and breadth of information; and • sufficient information required to reuse the data
  12. Example of experiments by InnoMed PredTox, an FP6 public-private consortium. (Slide footer: The International Conference on Systems Biology (ICSB), 22-28 August, 2008; Susanna-Assunta Sansone; www.ebi.ac.uk/net-project)
  13. Different community, different norms and standards, e.g.: • use the same word and refer to the same ‘thing’ • allow data to flow from one system to another • report the same core, essential information. Challenges: lack of coordination, fragmentation and uneven coverage
  14. Growing number of reporting standards: + 303, + 150, + 130 (sources as labelled on the slide: MIBBI, EQUATOR; BioPortal; Estimated). Examples shown: MAGE-Tab, AAO, MIAME, GCDML, MIAPA, CHEBI, SRAxml, OBI, MIRIAM, VO, SOFT, MIQAS, FASTA, PATO, MIX, CML, ENVO, REMARK, DICOM, MIGEN, GELML, MOD, SBRML, MIAPE, MIQE, TEDDY, MITAB, MzML, XAO, CIMR, CONSORT, BTO, ISA-Tab, SEDML…, DO, PRO, IDO…, MIASE, MISFISHIE…
  15. A catalogue to map the landscape of standards and the systems implementing them: over 400 bio-standards (public and in curation). Field*, Sansone* et al., Omics data sharing. Science 326, 234-36 (2009) doi:10.1126/science.1180598
  16. A catalogue to map the landscape of standards and the systems implementing them: over 400 bio-standards (public and in curation). Field*, Sansone* et al., Omics data sharing. Science 326, 234-36 (2009) doi:10.1126/science.1180598
  17. Bioscience is not one domain! Bioscience is interdisciplinary and integrative in character: • need to deal with new and existing datasets • deal with a variety of data types. (Source of the figure: EBI website)
  18. Is it possible to achieve a common, structured representation of diverse bioscience experiments that: • transcends individual bioscience domains, but also • follows the appropriate community norms and standards?
  19. A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework to facilitate standards-compliant collection, curation, management and reuse of investigations in an increasingly diverse set of life science domains, including: • environmental health • environmental genomics • metabolomics • metagenomics • nanotechnology • proteomics • stem cell discovery • systems biology • transcriptomics • toxicogenomics • also by communities working to build a library of cellular signatures. We aim to achieve a common representation of experimental content that transcends individual bioscience domains. Sansone et al., Towards interoperable bioscience data. Nature Genetics 44, 121-126 (2012) doi:10.1038/ng.1054
  20. A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework to facilitate standards-compliant collection, curation, management and reuse of investigations in an increasingly diverse set of life science domains, including: • environmental health • environmental genomics • metabolomics • metagenomics • nanotechnology • proteomics • stem cell discovery • systems biology • transcriptomics • toxicogenomics • also by communities working to build a library of cellular signatures. Some of the public groups/resources: Stem Cell Commons, Nanotechnology Informatics Working Group. Some of the internal projects: (shown as logos on the slide)
  21. A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework to facilitate standards-compliant collection, curation, management and reuse of investigations in an increasingly diverse set of life science domains, including: • environmental health • environmental genomics • metabolomics • metagenomics • nanotechnology • proteomics • stem cell discovery • systems biology • transcriptomics • toxicogenomics • also by communities working to build a library of cellular signatures. Some of the public groups/resources: Stem Cell Commons, Nanotechnology Informatics Working Group. Some of the internal projects: (shown as logos on the slide)
  22. Metadata tracking framework, designed to support the use of several standards (checklists, terminologies) and conversions to a growing number of other metadata formats used by public repositories, e.g. MAGE-Tab, Pride-xml, SRA-xml, SOFT. Currently finalizing conversion to RDF to explore the growing Linked Data universe (in collaboration with the W3C HCLSIG)
  23. Empowering researchers to use standards. To mint DOIs
  24. TOWARDS INTEROPERABLE BIOSCIENCE DATA, doi:10.1038/ng.1054, Feb 2012. Sansone SA, Rocca-Serra P, Field D, Maguire E, Taylor C, Hofmann O, Fang H, Neumann S, Tong W, Amaral-Zettler L, Begley K, Booth T, Bougueleret L, Burns G, Chapman B, Clark T, Coleman LA, Copeland J, Das S, de Daruvar A, de Matos P, Dix I, Edmunds S, Evelo C, Forster M, Gaudet P, Gilbert J, Goble C, Griffin J, Jacob D, Kleinjans J, Harland L, Haug K, Hermjakob H, Sui S, Laederach A, Liang S, Marshall S, Merrill E, McGrath A, Reilly D, Roux M, Shamu C, Shang C, Steinbeck C, Trefethen A, Williams-Jones B, Wolstencroft K, Xenarios J, Hide W. www.biosharing.org www.isacommons.org
      Development timeline, 2007 to 2012 (figure; item positions along the time axis are not recoverable from the transcript):
      Community involvement and uptake: 1st ISA-Tab workshop; 2nd ISA-Tab workshop; 3rd ISA-Tab workshop; user workshops/visits start; 1st public instance: Harvard Stem Cell Discovery Engine; other tools implement ISA-Tab; growing number of systems starts to adopt ISA-Tab
      Core developments: strawman ISA-Tab spec; final ISA-Tab spec; ISA software v1; database instance at EBI; conversions to Pride-XML/SRA-XML/MAGE-Tab and more; links to analysis tools starts; RDF format starts
      Publications: workshop reports; Omics data sharing (Science); ISA software suite (Bioinformatics); Stem Cell Discovery Engine (NAR); ISA-Tab and ISA Commons (Nature Genetics)
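
The "structured description of datasets" idea from slide 10 (track independent and dependent variables, and annotate them with resolvable identifiers) can be illustrated with a minimal sketch. This is not the ISA-Tab format or any ISA tool's API; the class names, field names and the example.org identifiers below are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Variable:
    """One annotated variable; term_id is meant to be a resolvable identifier."""
    name: str
    term_id: str  # hypothetical: a URL that resolves to a term definition


@dataclass
class DatasetDescription:
    """A structured description instead of an opaque 'data blob'."""
    title: str
    independent: List[Variable] = field(default_factory=list)
    dependent: List[Variable] = field(default_factory=list)

    def missing_identifiers(self) -> List[str]:
        """Names of variables whose annotation is not an http(s) identifier."""
        all_vars = self.independent + self.dependent
        return [
            v.name
            for v in all_vars
            if not v.term_id.startswith(("http://", "https://"))
        ]


# Illustrative values only; the URLs are placeholders, not real ontology terms.
description = DatasetDescription(
    title="Example toxicity study",
    independent=[
        Variable("compound", "https://example.org/terms/compound"),
        Variable("dose", "https://example.org/terms/dose"),
    ],
    dependent=[
        Variable("body weight", "free text: weight of animal"),
    ],
)

print(description.missing_identifiers())
```

A check like `missing_identifiers` is one simple way to make annotation "explicit and discoverable": variables described only in free text are flagged so they can be mapped to a shared term.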