BioSharing: on Data Policies’ Plans and Reporting Standards. Susanna-Assunta Sansone, PhD, Team Leader, University of Oxford, UK. (Updated version of presentation given at) Biocuration, 11th-14th October 2010, Tokyo, Japan
Information intensive research investigations. The International Conference on Systems Biology (ICSB), 22-28 August, 2008. Susanna-Assunta Sansone, www.ebi.ac.uk/net-project. Modern studies often run source material through several kinds of assay in parallel, e.g. genomic sequencing, protein-protein interaction assays, or the measurement of metabolite concentrations and fluxes
Comprehensible, reusable, reproducible research. Data (descriptions of biological entities of interest, e.g., genes, targets, and their measurements, e.g., intensity, location) must be shared accompanied by enough well-annotated experimental information (i.e. ‘metadata’: provenance of study materials, technology and measurement types, etc.)
Reporting standards as enablers. Three types of standards enable unambiguous representation, description and communication of the experimental information. Minimum core information specifications: checklists. Semantics: nomenclatures and terminologies. Syntax: exchange formats
Strong advocates. Journals, biocurators and the research community continue to participate in the development of standards, tools and databases that support sharing of datasets annotated well enough to enable comprehensible, reusable, reproducible research
...funders are developing data policies... Several data preservation, management and sharing policies have emerged in response to increased funding for genomics and functional genomics (omics) bioscience domains
...similar trend in the regulatory arena. “... lack of standardized data affects CDER’s review processes by curtailing a reviewer’s ability to perform integral tasks such as rapid acquisition, storage, analysis ... efficient management of a portfolio of standards projects will require coordinated efforts and clear roles for multiple participants within/outside FDA”
...and commercial sector. R&D has invested heavily in procedures and tools that integrate external information with their own data to enhance the decision-making process. Now joining forces to streamline non-competitive elements of the life science workflow by the specification of common standards, business terms, relationships and processes
Escalating number of efforts (omics domain),  e.g.
Navigating a sea of ‘standards’
Navigating a sea of ‘standards’. I work on plants; are these just for biomedical applications? Which ones are mature enough for me to use or recommend? How can I get involved to propose extensions or modifications? Which tools and databases implement which ones? Which ones are widely accepted and recognized? What are the criteria to evaluate status and value? I use HT sequencing technologies; which ones are applicable to me?
HT sequencing: public databases and ‘standards’ EBI NCBI ENA SRA
HT sequencing: public databases and ‘standards’. EBI, NCBI; ENA, SRA. Checklists: MIAME, MINSEQ; formats: MAGE-ML, MAGE-Tab; ontologies: MGED Ontology... Checklists: MIGS, MIENS; format: GCDML; ontologies: EnVO light... INSDC feature table
(2008) Vol 26 No 8 http://mibbi.org
Serves researchers, biocurators, journal editors and reviewers, and funders to: discover checklists for a particular domain; monitor progress of extant efforts; facilitate collaborations. Link to a sister effort in health research: EQUATOR (www.equator-network.org). Funds for coordination activities and meetings. Our next meeting is in Germany, Dec 2010
Linking checklists to terminologies http://www.ebi.ac.uk/ontology-lookup/ http://bioportal.bioontology.org/ http://www.obofoundry.org/
Science (2009), Vol 326, 234-236 http://biosharing.org
Aim to protect cumulative data outputs, recognizing: data sharing as a way to accelerate subsequent exploitation; the right of first use for data providers and the right to appropriate accreditation; the importance of ‘standards’ in the annotation and reporting process, but...
Often inconsistent and/or unclear on the standards and methods to be used for preserving, managing and sharing data: “.. recommend use of appropriate standards ..”, “.. where these exists ..”, “.. mature, stable efforts ..”, “.. MIAME format ..”, “.. accredited standards organizations ..”, “.. deposition to public repositories ..”, “.. release on websites ..”, etc.
Urgent need to foster communication between policy makers and the ‘standards’ groups, including researchers, biocurators and developers
‘Social engineering’ http://biosharing.org
It is NOT: a top-down initiative to harmonize the standards; another society to develop another standard; a data or tool resource. Started as a blog (supplementary materials for the article in Science)... http://biosharing.org
 
 
 
Launch of prototype in December. Iterative development, also based on feedback (enrichment, enhancement and links to other existing/new resources)
