Sansone mibbi-intro


Published in: Technology, Business

  1. 1. Susanna-Assunta Sansone, PhD, Team Leader, University of Oxford, UK. Biocuration, 11th-14th October 2010, Tokyo, Japan. Omics Data Sharing. BioSharing: on Data Policies’ Plans and Reporting Standards. Innovative technology accelerating research
  2. 2. Information-intensive research investigations. The International Conference on Systems Biology (ICSB), 22-28 August, 2008. Susanna-Assunta Sansone
      • Modern studies often run source material through several kinds of assay in parallel
          • E.g. genomic sequencing, protein-protein interaction assays, or the measurement of metabolite concentrations and fluxes
  3. 3. Comprehensible, reusable, reproducible research
      • Data (descriptions of biological entities of interest, e.g. genes and targets, and their measurements, e.g. intensity and location) must be shared together with sufficient, well-annotated experimental information (i.e. ‘metadata’: provenance of study materials, technology and measurement types, etc.)
  4. 4. Reporting standards as enablers
      • Three types of standards enable unambiguous representation, description and communication of the experimental information
          • Minimum core information specifications: checklists
          • Semantics: nomenclatures and terminologies
          • Syntax: exchange formats
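As a rough illustration of how the three types of standards named on this slide could fit together, the sketch below checks a report against a checklist (required fields), a terminology (allowed terms) and then writes it in one exchange format. Everything in it is an assumption made for illustration: the field names, the term list, the helper names `check_report` and `serialize`, and the choice of JSON are hypothetical and are not taken from MIAME, MIGS or any other real standard.

```python
import json

# Hypothetical checklist: the minimum fields a report must carry.
# These field names are illustrative, not taken from any real checklist.
REQUIRED_FIELDS = {"organism", "assay_type", "measurement"}

# Hypothetical terminology (semantics): allowed terms per field.
ALLOWED_TERMS = {
    "assay_type": {
        "genomic sequencing",
        "protein-protein interaction assay",
        "metabolite profiling",
    },
}


def check_report(report):
    """Return a list of problems; an empty list means the report passes."""
    problems = []
    # Checklist: every required field must be present and non-empty.
    for field in sorted(REQUIRED_FIELDS):
        if not report.get(field):
            problems.append(f"missing required field: {field}")
    # Semantics: fields with a controlled vocabulary must use a listed term.
    for field, terms in ALLOWED_TERMS.items():
        value = report.get(field)
        if value and value not in terms:
            problems.append(f"unknown term for {field}: {value}")
    return problems


def serialize(report):
    """Syntax: write the report in one agreed exchange format (JSON here)."""
    return json.dumps(report, sort_keys=True)
```

A report that omits `measurement` or uses an unlisted `assay_type` comes back with one problem string per issue, which is roughly what a submission tool would show before accepting a dataset.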
  5. 5. Strong advocates
      • Journals, biocurators and the research community continue to participate in the development of standards, tools and databases
          • to support sharing of sufficiently well-annotated datasets
          • to enable comprehensible, reusable, reproducible research
  6. 6. ...funders are developing data policies...
      • Several data preservation, management and sharing policies have emerged in response to increased funding for genomics and functional genomics (omics) bioscience domains
  7. 7. ...similar trend in the regulatory arena
      • “… lack of standardized data affects CDER’s review processes by curtailing a reviewer’s ability to perform integral tasks such as rapid acquisition, storage, analysis ... efficient management of a portfolio of standards projects will require coordinated efforts and clear roles for multiple participants within/outside FDA”
  8. 8. ...and commercial sector
      • R&D has invested heavily in procedures and tools that integrate external information with their own data to enhance the decision-making process
      • Now joining forces to streamline non-competitive elements of the life science workflow by specifying common standards, business terms, relationships and processes
  9. 9. Reporting standards as a means to an end
      • Increased efficiency
          • Metadata remains associated with the results generated
          • Avoids the risk of loss of information through staff turnover
          • Enables time-efficient handover of projects
      • Enhanced confidence in data
          • Enables fully informed assessment of results (methods used etc.)
          • Supports the discovery of sources of systematic or random errors
          • Facilitates better-informed comparisons or cross-analysis of data sets
      • Defined requirements for submission, exchange and/or publication
          • Within multi-site organizations or between collaborators
          • To journals or repositories
          • To regulatory bodies
  10. 10. Escalating number of efforts (omics domain), e.g.
  11. 11. Heterogeneity of the efforts
      • Wide variety of ‘authorities’
          • Standards organizations
          • International projects
          • Working groups
          • Grass-roots movements
      • Multi-stakeholder and multi-disciplinary
          • Researchers, biocurators, software/database developers, modellers
          • Academics, industries, governmental and regulatory bodies
          • Manufacturers, software vendors, journal editors and funders
      • Heterogeneous focus, beyond reporting requirements, e.g.
          • Broader understanding of the use of omics data
          • Agreed world-wide recommendations
          • Measurements and methods validation
  12. 12. Navigating a sea of ‘standards’
  13. 13. Navigating a sea of ‘standards’
      • I work on plants, are these just for biomedical applications?
      • Which ones are mature enough for me to use or recommend?
      • How can I get involved to propose extensions or modifications?
      • Which tools and databases implement which one?
      • Which ones are widely accepted and recognized?
      • What are the criteria to evaluate status and value?
      • I use HT sequencing technologies, which ones are applicable to me?
      • ...?
  14. 14. HT sequencing: public databases and ‘standards’ EBI NCBI ENA SRA
  15. 15. HT sequencing: public databases and ‘standards’. EBI, NCBI, ENA, SRA
      • checklists: MIAME, MINSEQ
      • formats: MAGE-ML, MAGE-TAB
      • ontologies: MGED Ontology...
      • checklists: MIGS, MIENS
      • format: GCDML
      • ontologies: EnVO light...
      INSDC feature table
  16. 16. (2008) Vol 26 No 8
  17. 17.
      • Serves researchers, biocurators, journal editors and reviewers, and funders to
          • discover checklists for a particular domain
          • monitor progress of extant efforts
          • facilitate collaborations
      • Link to a sister effort in health research, EQUATOR ( )
      • Funds for coordination activities and meetings
          • Our next meeting is in Germany, Dec 2010
  18. 18. Linking checklists to terminologies
  19. 19. Science (2009), Vol 326, 234-236
  20. 20.
      • Aim to protect cumulative data outputs, recognizing
          • data sharing as a way to accelerate subsequent exploitation
          • the right of first use for data providers and the right to appropriate accreditation
          • the importance of ‘standards’ in the annotation and reporting process, but...
  21. 21.
      • Often inconsistent and/or unclear on the standards and methods to be used for preserving, managing and sharing data
          • “.. recommend use of appropriate standards ..”, “.. where these exists ..”, “.. mature, stable efforts ..”, “.. MIAME format ..”, “.. accredited standards organizations ..”, “.. deposition to public repositories ..”, “.. release on websites ..” ... etc.
  22. 22.
      • Urgent need to foster communication between policy makers and the ‘standards’ groups
          • including researchers, biocurators and developers
  23. 23. ‘Social engineering’
  24. 24.
      • It is NOT
          • a top-down initiative to harmonize the standards
          • another society to develop another standard
          • a data or tool resource
      • Started as a blog (supplementary materials for the article in Science)...
  25. 28.
      • Call for participation
          • We need more...
          • Suggestions on how to best create and populate the website, list the ‘standards’, allow updates, link to other portals etc.
      • Close engagement with the Biocuration Society and the new BioDBcore effort
  26. 29. Part of our software development activities. Rocca-Serra et al., Bioinformatics, 2010. ISA software suite, open source:
  27. 30. Science, 2009; Bioinformatics, 2010; Nature Biotech (invited)