Susanna Sansone presented on enabling reproducible bioscience data through biocuration standards. There is a growing movement for standards to allow data sharing and reuse. However, the many overlapping standards cause interoperability issues. Sansone's team developed the Investigation-Study-Assay (ISA) framework to provide a common format implementing various standards. Their exemplar project involves collaboratively curating life science experiments using ISA to make data comprehensible, interoperable, and reusable.
This document summarizes products and services from Essen BioScience including instrumentation, cell assays, reagents, cells, and discovery services. The instrumentation includes the IncuCyte ZOOM live-content imaging system for automated acquisition of phase-contrast and fluorescence images within a tissue-culture incubator. The CellPlayer assays, reagents and cells allow kinetic analysis of cellular processes such as proliferation, apoptosis, cytotoxicity, migration, invasion and angiogenesis over long periods of time without disturbing the cells. Discovery services include ion channel services, live-cell assay development and partnerships.
With advances in technology, enormous amounts of data have become available for bioscience researchers. While this high volume of information holds tremendous promise for expanding the science knowledge base, it must be organized for meaningful study. Bioinformatics is a discipline that devises methods for storing, distributing, and analyzing biological data used by diverse areas of research. Bioinformatics professionals develop software and tools that assist researchers in the analysis of data related to molecular biology and genome studies.
This document discusses data management and curation in bioinformatics. It describes Susanna-Assunta Sansone as the principal investigator and team leader at the University of Oxford e-Research Centre, where her team works on data management, biocuration, software development, databases, and community standards and ontologies for various domains including toxicology, health, and agriculture. The document promotes the importance of data standards to enable data sharing and reproducibility in bioscience research.
The document discusses reproducible bioscience data. It describes Susanna-Assunta Sansone as a principal investigator and team leader at the University of Oxford e-Research Centre who gives a presentation on policies, communities, and standards around reproducible bioscience data. The presentation covers topics like preserving institutional memory, utilizing public data, and addressing reproducibility and reuse of public data through community standards and structured data annotation.
This document discusses the ISA Commons project, which aims to facilitate sharing of life science experiments using a common structured representation. It does this by [1] using a format that can describe experiments across domains, [2] following community standards and norms, and [3] being implemented in curation and data sharing tools. The presentation outlines challenges around inconsistent reporting and many related standards, and describes how ISA Commons addresses these through its metadata tracking framework and software suite. This enables standardized experimental annotation and data sharing across a growing number of public resources and research groups.
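The Investigation-Study-Assay hierarchy that ISA Commons builds on can be pictured as a simple nested data model. The sketch below is illustrative only: the field names are invented for this example, and the real ISA-Tab/ISA-JSON specifications define a much richer set of attributes.

```python
from dataclasses import dataclass, field

# Illustrative sketch of the Investigation > Study > Assay hierarchy.
# Field names are invented; they are NOT the actual ISA-Tab fields.

@dataclass
class Assay:
    measurement_type: str   # e.g. "transcription profiling"
    technology: str         # e.g. "DNA microarray"

@dataclass
class Study:
    title: str
    assays: list = field(default_factory=list)

@dataclass
class Investigation:
    identifier: str
    studies: list = field(default_factory=list)

inv = Investigation(identifier="INV-1")
study = Study(title="Growth under heat stress")
study.assays.append(Assay("transcription profiling", "DNA microarray"))
inv.studies.append(study)

# One investigation groups several studies; each study groups its assays,
# which is what lets one format describe experiments across domains.
print(len(inv.studies), len(inv.studies[0].assays))
```

The design point is simply that the nesting, not any domain-specific vocabulary, is what stays constant across experiment types.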
This document summarizes three key NISO initiatives aimed at improving discovery of electronic content: KBART (Knowledge Bases and Related Tools), which works to ensure timely and accurate data transfer to knowledge bases; ODI (Open Discovery Initiative), which develops recommendations and standards for open discovery; and PIE-J (Presentation & Identification of E-Journals), a recommended practice for the consistent presentation and identification of e-journal titles. It provides an overview of KBART, including its Phase II update and participating organizations.
1) Big data standards are needed to make data understandable, reusable, and shareable across different databases and domains.
2) Effective standards require reporting sufficient experimental details and context in both human-readable and machine-readable formats.
3) Developing standards is a collaborative process involving different stakeholder groups to define requirements, vocabularies, and data models through both formal standards bodies and grassroots organizations.
1) Traditional ontology research involved developing individual ontologies and annotating data with them, but now thousands of ontologies and huge amounts of data are available online.
2) This allows new opportunities for fundamental and applied research in automatically aligning data and ontologies to make sense of both, and mapping the landscape of semantics on the web.
3) Empirical studies are needed to understand ontology engineering practices by analyzing interconnected online ontologies, and how data and ontologies interact on the web.
Poster: Semantic data integration proof of concept (Nicolas Bertrand)
This document summarizes a proof of concept study that tested the ability to semantically integrate ecological data from different databases using the Socio-Ecological Research and Observation oNTOlogy (SERONTO). The study showed that SERONTO could successfully import database schemas and reference lists, map relations between database tables and SERONTO concepts, and allow complex queries across multiple connected databases from within SERONTO. However, maintaining mappings between reference lists and coupling value sets, units and calculations requires further work. Overall, the study demonstrated the feasibility of using SERONTO and semantic approaches to provide integrated access to distributed ecological data.
Data integration is intrinsic to how modern research is undertaken in areas such as genomics, drug development and personalised medicine. To better enable this integration, a large number of biomedical ontologies have been developed to provide standard semantics for describing metadata. There are now several hundred biomedical ontologies in widespread use that describe concepts such as genes, molecules, drugs and diseases. This amounts to millions of terms that are interconnected via relationships and naturally form a graph of biomedical terminology.
The Ontology Lookup Service (OLS) (http://www.ebi.ac.uk/ols) integrates over 160 ontologies and provides a central point for the biomedical community to query and visualise them. OLS also provides a RESTful API over the ontologies that is used in high-throughput data annotation pipelines. OLS is built on top of a Neo4j database that provides efficient indexes for extracting ontological relationships. We have developed generic tools for loading RDF/OWL ontologies into Neo4j, where the indexes are optimised for serving common ontology queries. We are now moving to adopt graph databases more widely in applications relating to ontology mapping prediction and recommendation systems for data annotation.
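The kind of traversal that a graph index over ontology relationships serves efficiently can be illustrated with a minimal in-memory sketch. This is plain Python, not the OLS/Neo4j implementation, and the tiny example ontology below is invented:

```python
# A toy "is_a" graph: each term maps to its direct parents.
# Real ontologies hold millions of such edges; a graph database
# indexes them so ancestor/descendant queries stay fast.
is_a = {
    "drug": ["chemical entity"],
    "antibiotic": ["drug"],
    "penicillin": ["antibiotic"],
}

def ancestors(term):
    """Return all transitive is_a ancestors of a term."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen

print(sorted(ancestors("penicillin")))
# → ['antibiotic', 'chemical entity', 'drug']
```

An annotation pipeline uses exactly this query in reverse: data tagged with a specific term can be retrieved by any query on one of its ancestors.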
The document discusses JSTOR's local discovery integration pilot program, which aims to make JSTOR collections more discoverable outside of library systems. It provides statistics on where researchers begin their discovery process and emphasizes the need to meet researchers where they start. The pilot program partners JSTOR with various discovery platforms, including Summon, Primo, and EBSCO, to integrate JSTOR's collections into the search results. The goal is a "virtuous circle of access" by improving discoverability of local resources for both local and non-local user populations. Initial feedback from the pilot sites is also discussed.
The Evolution of e-Research: Machines, Methods and Music (David De Roure)
The document summarizes the evolution of e-research over three generations from 1981 to the present. The first generation saw early adopters using tools within their disciplines with some reuse. The second generation was characterized by increased reuse of tools, data and methods across areas. The third generation is defined by radical sharing of resources globally across any discipline through social networks and reusable research objects. The document also discusses several specific projects and tools that exemplify each generation of e-research including myExperiment, Galaxy, and SALAMI.
The Symbiotic Nature of Provenance and Workflow (Eric Stephan)
This document discusses the symbiotic relationship between provenance and workflows in scientific research. It notes that workflows provide automation and integration capabilities, while provenance provides documentation of what transpired. The document provides examples of workflow and provenance technologies and outlines challenges around interoperability. It concludes that recognizing the interdependent relationship between provenance and workflows can help advance systems science research.
Towards Incidental Collaboratories; Research Data Services (Anita de Waard)
This document discusses enabling "incidental collaboratories" by collecting and connecting biological research data through a centralized framework. It argues that biology research is currently quite isolated due to its small scale and competitive nature. The framework would involve storing experimental data with metadata, allowing analyses across similar experiment types and biological subjects, and preserving data long-term with access controls. This could help move labs from being isolated to being "sensors in a network" and address objections around data ownership and quality.
Specimen-level mining: bringing knowledge back 'home' to the Natural History ... (Ross Mounce)
A talk given at the Geological Society of London, UK on 2016/03/09 as part of the Lyell meeting on Palaeoinformatics. http://www.geolsoc.org.uk/lyell16 #lyell16
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog... (dolleyj)
The Evidence & Conclusion Ontology (ECO) has been developed to provide standardized descriptions for types of evidence within the biological domain. Best practices in biocuration require that when a biological assertion is made (e.g. linking a Gene Ontology (GO) term for a molecular function to a protein), the type of evidence supporting it is captured. In recent development efforts, we have been working with other ontology groups to ensure that ECO classes exist for the types of curation they support. These include the Ontology for Microbial Phenotypes and GO. In addition, we continue to support user-level class requests through our GitHub issue tracker. To facilitate the addition and maintenance of new classes, we utilize ROBOT (a command-line tool for working with Open Biomedical Ontologies) as part of our standard workflow. ROBOT templates allow us to define classes in a spreadsheet and convert them to Web Ontology Language (OWL) axioms, which can then be merged into ECO. ROBOT is also part of our automated release process. Additionally, we are engaged in ongoing work to map ECO classes to Ontology for Biomedical Investigations classes using logical definitions. ECO is currently in use by dozens of groups engaged in biological curation, and the number of ECO users continues to grow. The ontology, in OWL and Open Biomedical Ontology (OBO) formats, and associated resources can be accessed through our GitHub site (https://github.com/evidenceontology/evidenceontology) as well as the ECO web page (http://evidenceontology.org/).
Keynote presentation delivered at ELAG 2013 in Gent, Belgium, on May 29 2013. Discusses Research Objects and the relationship to work my team has been involved in during the past couple of years: OAI-ORE, Open Annotation, Memento.
ONTOLOGY SERVICE CENTER: A DATAHUB FOR ONTOLOGY APPLICATION (IJwest)
With the growth of data-oriented research in the humanities, a large number of research datasets have been created and published through web services. However, discovering, integrating and reusing these distributed, heterogeneous research datasets is a challenging task. Ontology is the soul that connects digital humanities resources, providing a good way for people to discover and understand these datasets. With the release of more and more linked open data and knowledge bases, a large number of ontologies have been produced at the same time. These ontologies have different publishing formats, consumption patterns and interaction styles, which are not conducive to users' understanding of the datasets or to the reuse of the ontologies. The Ontology Service Center (OSC) platform consists of an Ontology Query Center and an Ontology Validation Center, mainly using linked data and ontology-based technologies. The Ontology Query Center realizes ontology publishing, querying, data interaction and online browsing, while the Ontology Validation Center can verify the status of certain ontologies as used in linked datasets. The empirical part of the paper uses the Confucius portrait as an example of how OSC can be used in the semantic annotation of images. In short, the purpose of this paper is to build an applied ecology of ontology that promotes the development of knowledge graphs and the spread of ontologies.
The document discusses making experimental data and methods more reproducible and accessible by providing structured metadata alongside narrative descriptions. It recommends using community standards and ontologies to semantically tag key information, and machine-readable formats to structure descriptions in a consistent way. Tools are proposed to help authors report structured information and curate it according to these standards to make data fully FAIR (findable, accessible, interoperable, reusable). The goal is to move from experiments that are difficult to reproduce to those that are "born reproducible".
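The structured-metadata idea can be sketched as a small machine-readable record in which each free-text value is paired with an ontology term identifier. The record schema below is hypothetical, invented for illustration; the term IDs follow the common PREFIX:ID convention of OBO-style ontologies:

```python
import json

# Hypothetical structured-metadata record: each human-readable value
# carries a machine-readable ontology term ID. The schema is invented
# for illustration, not a real community standard.
record = {
    "organism": {"text": "Homo sapiens", "term": "NCBITaxon:9606"},
    "assay":    {"text": "RNA-seq",      "term": "OBI:0001271"},
    "tissue":   {"text": "liver",        "term": "UBERON:0002107"},
}

def is_curated(rec):
    """A toy curation check: every field must carry a PREFIX:ID term."""
    return all(":" in v.get("term", "") for v in rec.values())

print(json.dumps(record, indent=2))
print(is_curated(record))
```

Pairing both representations is what makes the description simultaneously human-readable (the `text` values) and interoperable for machines (the `term` values), which is the core of the "born reproducible" argument.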
Where are we going and how are we going to get there? (David De Roure)
The document discusses the myExperiment virtual research environment for sharing workflows. Some key points:
1. myExperiment is a social network and repository for research workflows and methods. It currently has over 1800 users and hundreds of shared workflows.
2. The site allows fine-grained privacy controls, grouping of related content into "packs", and integration with other systems through federation.
3. Analysis found that most workflows and other content are shared publicly, and some users actively build upon other users' shared workflows. The most viewed workflow has over 1500 views.
4. The principles behind myExperiment's design focus on empowering scientists by enabling new forms of collaboration and sharing without forcing changes to workflows.
Taxonomies and Ontologies – The Yin and Yang of Knowledge Modelling (Semantic Web Company)
See how ontologies and taxonomies can play together to reach the ultimate goal, which is the cost-efficient creation and maintenance of an enterprise knowledge graph. The knowledge modelling methodology is supported by approaches taken from NLP, data science, and machine learning.
Invited talk at the European Research Council, Brussels (Scientific Seminar, 12 April 2013): "Love for Science or 'academic prostitution'". In this talk I present a personal review (at times my own vision) of some issues that I consider key for doing science. It was tailored to the expected audience, mainly Scientific Officers with backgrounds in different fields of science and scholarship, but also Agency staff.
Abstract: In a recent Special Issue of Nature on science metrics it was claimed that "Research reverts to a kind of 'academic prostitution' in which work is done to please editors and referees rather than to further knowledge." If this is true, funding agencies should try to avoid falling into the trap of their own system. By perpetuating this 'prostitution' they risk funding not the best research but the best-sold research.
Given the current epoch of economic crisis, in which the quest for funds forces researchers into a competitive game of pandering to panelists, it seems a good time for deep reflection on the entire scientific system.
With this talk I aim to provoke extra critical thinking among the committees who select evaluators, and among the evaluators, who in turn demand critical thinking from the candidates when selecting excellent science.
I will present some initiatives (e.g. new tracers of impact for the Web era, 'altmetrics') and ongoing projects (e.g. how to move from publishing advertising to publishing knowledge) that might enable us to favor science over marketing.
Here are the key points about Husserl's phenomenology that are relevant to understanding the epistemological assumptions of the alternative theory presented in this dissertation:
- Phenomenology aims to study phenomena (things as they appear in our experience) rather than things as they exist independently of us. It focuses on our subjective experience of the world rather than making claims about an objective reality.
- The natural attitude refers to our normal, taken-for-granted way of perceiving and interacting with the world. We see things as stable objects that exist independently of us. Phenomenology asks us to suspend or "bracket" this natural attitude in order to study phenomena as they appear to consciousness.
- Intentionality refers
This document summarizes a presentation given by Susanna Sansone at the GSC 23rd meeting education day in Bangkok, Thailand on August 7, 2023. The presentation discussed standards across the life sciences, including definitions of different types of standards and over 1,600 identified standards. It covered standards organizations and grassroots groups, as well as the FAIRsharing database, which catalogs over 2,885 standards and databases and aims to promote their use and value across research.
1) Traditional ontology research involved developing individual ontologies and annotating data with them, but now thousands of ontologies and huge amounts of data are available online.
2) This allows new opportunities for fundamental and applied research in automatically aligning data and ontologies to make sense of both, and mapping the landscape of semantics on the web.
3) Empirical studies are needed to understand ontology engineering practices by analyzing interconnected online ontologies, and how data and ontologies interact on the web.
Poster Semantic data integration proof of conceptNicolas Bertrand
This document summarizes a proof of concept study that tested the ability to semantically integrate ecological data from different databases using the Socio-Ecological Research and Observation oNTOlogy (SERONTO). The study showed that SERONTO could successfully import database schemas and reference lists, map relations between database tables and SERONTO concepts, and allow complex queries across multiple connected databases from within SERONTO. However, maintaining mappings between reference lists and coupling value sets, units and calculations requires further work. Overall, the study demonstrated the feasibility of using SERONTO and semantic approaches to provide integrated access to distributed ecological data.
Data integration is intrinsic to how modern research is undertaken in areas such as genomics, drug development and personalised medicine. To better enable this integration a large number of biomedical ontologies have been developed to provide standard semantics for describing metadata. There are now several hundred biomedical ontologies in widespread use that describe concepts such as genes, molecules, drugs and diseases. This amounts to millions of terms that are interconnected via relationships that naturally form a graph of biomedical terminology.
The Ontology Lookup Service (OLS) (http://www.ebi.ac.uk/ols) integrates over 160 ontologies and provide a central point for the biomedical community to query and visualise ontologies. OLS also provide a RESTful API over the ontologies that is used in high-throughput data annotation pipelines. OLS is built on top of a Neo4j database that provides efficient indexes for extracting ontological relationships. We have developed generic tools for loading RDF/OWL ontologies into Neo4j where the indexes are optimised for serving common ontology queries. We are now moving to adopt graph database more widely in applications relating to ontology mapping prediction and recommendation systems for data annotation.
The document discusses JSTOR's local discovery integration pilot program, which aims to make JSTOR collections more discoverable outside of library systems. It provides statistics on where researchers begin their discovery process and emphasizes the need to meet researchers where they start. The pilot program partners JSTOR with various discovery platforms, including Summon, Primo, and EBSCO, to integrate JSTOR's collections into the search results. The goal is a "virtuous circle of access" by improving discoverability of local resources for both local and non-local user populations. Initial feedback from the pilot sites is also discussed.
The Evolution of e-Research: Machines, Methods and MusicDavid De Roure
The document summarizes the evolution of e-research over three generations from 1981 to the present. The first generation saw early adopters using tools within their disciplines with some reuse. The second generation was characterized by increased reuse of tools, data and methods across areas. The third generation is defined by radical sharing of resources globally across any discipline through social networks and reusable research objects. The document also discusses several specific projects and tools that exemplify each generation of e-research including myExperiment, Galaxy, and SALAMI.
The Symbiotic Nature of Provenance and WorkflowEric Stephan
This document discusses the symbiotic relationship between provenance and workflows in scientific research. It notes that workflows provide automation and integration capabilities, while provenance provides documentation of what transpired. The document provides examples of workflow and provenance technologies and outlines challenges around interoperability. It concludes that recognizing the interdependent relationship between provenance and workflows can help advance systems science research.
Towards Incidental Collaboratories; Research Data ServicesAnita de Waard
This document discusses enabling "incidental collaboratories" by collecting and connecting biological research data through a centralized framework. It argues that biology research is currently quite isolated due to its small scale and competitive nature. The framework would involve storing experimental data with metadata, allowing analyses across similar experiment types and biological subjects, and preserving data long-term with access controls. This could help move labs from being isolated to being "sensors in a network" and address objections around data ownership and quality.
Specimen-level mining: bringing knowledge back 'home' to the Natural History ...Ross Mounce
A talk given at the Geological Society of London, UK on 2016/03/09 as part of the Lyell meeting on Palaeoinformatics. http://www.geolsoc.org.uk/lyell16 #lyell16
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog...dolleyj
The Evidence & Conclusion Ontology (ECO) has been developed to provide standardized descriptions for types of evidence within the biological domain. Best
practices in biocuration require that when a biological assertion is made (e.g. linking a Gene Ontology (GO) term for a molecular function to a protein), the type of evidence
supporting it is captured. In recent development efforts, we have been working with other ontology groups to ensure that ECO classes exist for the types of curation they
support. These include the Ontology for Microbial Phenotypes and GO. In addition, we continue to support user-level class requests through our GitHub issue tracker. To
facilitate the addition and maintenance of new classes, we utilize ROBOT (a command line tool for working with Open Biomedical Ontologies) as part of our standard workflow.
ROBOT templates allow us to define classes in a spreadsheet and convert them to Web Ontology Language (OWL) axioms, which can then be merged into ECO. ROBOT is
also part of our automated release process. Additionally, we are engaged in ongoing work to map ECO classes to Ontology for Biomedical Investigation classes using logical
definitions. ECO is currently in use by dozens of groups engaged in biological curation and the number of ECO users continues to grow. The ontology, in OWL and Open
Biomedical Ontology (OBO) formats, and associated resources can be accessed through our GitHub site (https://github.com/evidenceontology/evidenceontology) as well as
the ECO web page (http://evidenceontology.org/).
Keynote presentation delivered at ELAG 2013 in Gent, Belgium, on May 29 2013. Discusses Research Objects and the relationship to work my team has been involved in during the past couple of years: OAI-ORE, Open Annotation, Memento.
ONTOLOGY SERVICE CENTER: A DATAHUB FOR ONTOLOGY APPLICATIONIJwest
With the growth of data-oriented research in humanities, a large number of research datasets have been
created and published through web services. However, how to discover, integrate and reuse these distributed
heterogeneous research datasets is a challenging task. Ontology is the soul between series digital humanities
resources, which provides a good way for people to discover and understand these datasets. With the release
of more and more linked open data and knowledge bases, a large number of ontologies have been produced
at the same time. These ontologies have different publishing formats, consumption patterns, and interactions
ways, which are not conductive to the user’s understanding of the datasets and the reuse of the ontologies.
The Ontology Service Center platform consists of Ontology Query Center and Ontology Validation Center,
mainly using linked data and ontology-based technologies. The Ontology Query Center realizes the functions
of ontology publishing, querying, data interaction and online browsing, while the Ontology Validation
Center can verify the status of using certain ontologies in the linked datasets. The empirical part of the paper
uses the Confucius portrait as an example of how OSC can be used in the semantic annotation of images. In
a word, the purpose of this paper is to construct the applied ecology of ontology to promote the development
of knowledge graphs and the spread of ontology.
ONTOLOGY SERVICE CENTER: A DATAHUB FOR ONTOLOGY APPLICATION dannyijwest
With the growth of data-oriented research in humanities, a large number of research datasets have been
created and published through web services. However, how to discover, integrate and reuse these distributed
heterogeneous research datasets is a challenging task. Ontology is the soul between series digital humanities
resources, which provides a good way for people to discover and understand these datasets. With the release
of more and more linked open data and knowledge bases, a large number of ontologies have been produced
at the same time
The document discusses making experimental data and methods more reproducible and accessible by providing structured metadata alongside narrative descriptions. It recommends using community standards and ontologies to semantically tag key information, and machine-readable formats to structure descriptions in a consistent way. Tools are proposed to help authors report structured information and curate it according to these standards to make data fully FAIR (findable, accessible, interoperable, reusable). The goal is to move from experiments that are difficult to reproduce to those that are "born reproducible".
Where are we going and how are we going to get there? (David De Roure)
The document discusses the myExperiment virtual research environment for sharing workflows. Some key points:
1. myExperiment is a social network and repository for research workflows and methods. It currently has over 1800 users and hundreds of shared workflows.
2. The site allows fine-grained privacy controls, grouping of related content into "packs", and integration with other systems through federation.
3. Analysis found that most workflows and other content are shared publicly, and some users actively build upon other users' shared workflows. The most viewed workflow has over 1500 views.
4. The principles behind myExperiment's design focus on empowering scientists by enabling new forms of collaboration and sharing without forcing changes to workflows.
Taxonomies and Ontologies – The Yin and Yang of Knowledge Modelling (Semantic Web Company)
See how ontologies and taxonomies can play together to reach the ultimate goal, which is the cost-efficient creation and maintenance of an enterprise knowledge graph. The knowledge modelling methodology is supported by approaches taken from NLP, data science, and machine learning.
Invited talk at the European Research Council, Brussels (Scientific Seminar, 12 April 2013): "Love for Science or 'academic prostitution'". In this talk I present a personal revision (sometimes my own vision) of some issues that I consider key to doing science. It was tailored to the expected audience: mainly Scientific Officers with backgrounds in different fields of science and scholarship, but also Agency staff.
Abstract: In a recent special issue of Nature on science metrics it was claimed that "Research reverts to a kind of 'academic prostitution' in which work is done to please editors and referees rather than to further knowledge." If this is true, funding agencies should try to avoid falling into the trap of their own system. By perpetuating this 'prostitution' they risk funding not the best research but the best-sold research.
In the current epoch of economic crisis, where in the quest for funds researchers are forced into a competitive game of pandering to panelists, it seems a good time for deep reflection on the entire scientific system.
With this talk I aim to provoke extra critical thinking among the committees who select evaluators, and among the evaluators, who in turn should demand critical thinking from candidates when selecting excellent science.
I will present some initiatives (e.g., new tracers of impact for the Web era, 'altmetrics') and ongoing projects (e.g., how to move from publishing advertising to publishing knowledge) that might enable us to favour science over marketing.
Here are the key points about Husserl's phenomenology that are relevant to understanding the epistemological assumptions of the alternative theory presented in this dissertation:
- Phenomenology aims to study phenomena (things as they appear in our experience) rather than things as they exist independently of us. It focuses on our subjective experience of the world rather than making claims about an objective reality.
- The natural attitude refers to our normal, taken-for-granted way of perceiving and interacting with the world. We see things as stable objects that exist independently of us. Phenomenology asks us to suspend or "bracket" this natural attitude in order to study phenomena as they appear to consciousness.
- Intentionality refers to the directedness of consciousness: every conscious act is an experience of or about some object.
Similar to "Sa sansone dccroadshow-nov2012: Delivering reproducible bioscience data by enabling biocuration at the source" (20)
This document summarizes a presentation given by Susanna Sansone at the GSC 23rd meeting education day in Bangkok, Thailand on August 7, 2023. The presentation discussed standards across life sciences, including definitions of different types of standards and over 1,600 identified standards. It covered standard organizations and grassroots groups, as well as the FAIRsharing database which catalogs over 2,885 standards and databases and aims to promote their use and value across research.
The FAIRsharing journey in RDA document discusses:
1) FAIRsharing's growth and involvement with RDA since 2011, including its Working Group established in 2015 to curate standards, databases, and policies to promote FAIR data.
2) FAIRsharing's current activities and impact, such as its registry of over 4,000 records from many disciplines and usage in various tools and services.
3) Opportunities for further engagement with RDA, such as leveraging their expertise for contributions to the FAIR Cookbook, an open resource providing technical recipes for applying FAIR principles to life science data.
Overview of metadata standards, and how FAIRsharing and the FAIR Cookbook help with selecting and using them. Presentation "What is metadata? Common standards and properties" at the EHP Workshop, November 9, 2022: https://ephconference.eu/pre-conference-programme-441
Pharmas and academia are joining forces to make data FAIR (Findable, Accessible, Interoperable, and Reusable) through the development of the FAIR Cookbook. The FAIR Cookbook provides over 70 recipes and growing that give step-by-step guidance on improving the FAIRness of different data types through the use of tools, technologies, and best practices. It aims to provide practical examples and guidelines to support researchers, data managers, and others in managing data according to FAIR principles. The FAIR Cookbook is an open, community-developed resource overseen by an editorial board, with contributions from nearly 100 life sciences professionals.
FAIR, community standards and data FAIRification: components and recipes (Susanna-Assunta Sansone)
Overview of FAIR, FAIRsharing and the FAIR Cookbook at the ATI event on Knowledge Graphs: https://github.com/turing-knowledge-graphs/meet-ups/blob/main/symposium-2022.md
Presentation to the EOSC workshop on policies (https://www.google.com/url?q=https://eoscfuture.eu/eventsfuture/monitoring-eosc-readiness-fair-data-policies) on what FAIRsharing does for policies, including providing registration, discovery, flexible and clearer descriptions, relationships, machine readability and comparability.
The document summarizes how FAIRsharing assists others with promoting FAIR data principles without directly assessing FAIRness compliance. It does this by (1) providing a lookup service for standards and repositories via its API, (2) serving as a registry for FAIRness tests and indicators to make them discoverable, and (3) enabling communities to create profiles declaring which standards and repositories they use. The document also outlines FAIRsharing's operations, advisory boards, and future plans to further support assessment and tracking of FAIRness improvements over time.
ELIXIR is a European infrastructure that brings together life science resources from across Europe. It offers databases, tools, computing capabilities, and training opportunities. ELIXIR nodes provide these services and connect national data infrastructures. ELIXIR communities connect infrastructure experts to drive service developments. ELIXIR is funded through a mixed model including public sources. It works to sustain important biological data resources and make data FAIR through recommended standards and interoperability resources. ELIXIR also aims to develop a sustainable tools ecosystem and provides training through its portal.
Presentation to the EC Workshop on Maximizing investments in health research: FAIR data for a coordinated COVID-19 response. Workshop III, November 8, 2021.
Presentation to the EC Workshop on Maximizing investments in health research: FAIR data for a coordinated COVID-19 response. Workshop I, October 11, 2021.
The FAIR Cookbook poster, as presented at the ELIXIR-UK Node and the UK Conference of Bioinformatics and Computational Biology 2021: https://www.earlham.ac.uk/uk-conference-bioinformatics-and-computational-biology-21
The FAIR Cookbook poster, as presented at the UK Conference of Bioinformatics and Computational Biology 2021: https://www.earlham.ac.uk/uk-conference-bioinformatics-and-computational-biology-21
Sa sansone dccroadshow-nov2012: Delivering reproducible bioscience data by enabling biocuration at the source
1. www.slideshare.net/SusannaSansone
Delivering reproducible bioscience data by enabling
biocuration at the source
Susanna-Assunta Sansone, PhD
Principal Investigator and Team Leader,
University of Oxford e-Research Centre, Oxford, UK
Academic Consultant, Open Access Data Products,
Nature Publishing Group
Digital Curation Centre (DCC)
13th Regional Data Management Roadshow, London, 20 November 2012
3. University of Oxford e-Research Centre
• Providing research computing, high-performance computing
• Integrating with national and international infrastructure
• Supporting leading-edge facilities through education and training
4. University of Oxford e-Research Centre
Collaborating with European and wider international groups in, e.g.:
• energy,
• radio astronomy,
• biological data federation,
• life sciences simulation,
• biodiversity,
• computational chemistry,
• neuroscience,
• digital humanities tools,
• digital music analysis,
• visualization,
• …
5. My team’s activities and the stakeholders we work with
Data management and biocuration; collaborative development of software and databases; standards and ontologies, in domains including:
• environmental genomics
• metabolomics
• metagenomics
• nanotechnology
• proteomics
• stem cell discovery
• systems biology
• transcriptomics
• toxicogenomics
• environmental health
6. Outline
“The buzz around reproducible bioscience data:
the communities and the standards”
“The reality from the buzz:
challenges and exemplar project”
8. COMPREHENSIBLE
[Word-art slide; photo: http://www.flickr.com/photos/notbrucelee/8016189356/ CC BY]
9. COMPREHENSIBLE, INTEROPERABLE
10. COMPREHENSIBLE, INTEROPERABLE, REUSABLE
11. experimental design
sample characteristic(s)
experimental variable(s)
technology(ies)
measurement(s)
protocol(s)
data file(s)
…
11 The International Conference on Systems Biology (ICSB), 22-28 August, 2008 Susanna-Assunta Sansone
www.ebi.ac.uk/net-project
12. § We must strike a balance between
• depth and breadth of information; and
• sufficient information required to reuse the data
§ Capture all salient features of the experimental workflow
§ Make annotation explicit and discoverable
§ Structure the descriptions for consistency and tracking
13. Growing, worldwide movement for reproducible research
• Esoteric formats: comprehensible?
• Lack of sufficient contextual information: interoperable?
• Ad hoc or proprietary terminologies: reusable?
Source: http://ebbailey.wordpress.com
§ Researchers and bioinformaticians in both academic and commercial science, along with funding agencies and publishers, embrace the concept that community-developed standards are pivotal to structure and enrich the annotation of
• entities of interest (e.g., genes, metabolites, phenotypes) and
• experimental steps (e.g., provenance of study materials, technology and measurement types)
14. Community mobilization to develop standards, e.g.:
• use the same word and refer to the same ‘thing’
• allow data to flow from one system to another
• report the same core, essential information
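As a toy illustration of the first of these goals ("use the same word and refer to the same 'thing'"), a shared terminology collapses different labs' synonyms onto a single identifier. The mapping table below is invented for the example, although NCIT:C3262 is the real NCI Thesaurus identifier for "Neoplasm".

```python
# Minimal sketch of what a shared terminology buys you: synonyms resolve
# to one identifier, so datasets using different spellings can integrate.
# The SYNONYMS table is illustrative, not from any ontology release.

SYNONYMS = {
    "tumour": "NCIT:C3262",   # Neoplasm (NCI Thesaurus)
    "tumor": "NCIT:C3262",
    "neoplasm": "NCIT:C3262",
}

def canonical_id(label):
    """Map a free-text label to its shared ontology identifier, if known."""
    return SYNONYMS.get(label.lower())

# Two datasets using different spellings now agree on one ID.
assert canonical_id("Tumor") == canonical_id("tumour")
```

Exchange formats and reporting checklists play the analogous role for the other two goals: a common container for the data, and an agreed list of what must be reported.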
15. Is this general mobilization good or bad?
• use the same word and refer to the same ‘thing’
• allow data to flow from one system to another
• report the same core, essential information
§ Fragmentation of the standards is a major issue
• Being focused on particular communities’ interests, be they individual technologies or biological/biomedical disciplines, leads to duplication of effort and, more seriously, the development of (largely arbitrarily) different standards
• This severely hinders the interoperability of databases and tools and, ultimately, the integration of datasets
18. But how much do we know about these standards?
• Which tools and databases implement which standards?
• I use high-throughput sequencing technologies; which ones are applicable to me?
• How can I get involved to propose extensions or modifications?
• What are the criteria to evaluate their status and value?
• Which ones are mature enough for me to use or recommend?
• I work on plants; are these just for biomedical applications?
19. A catalogue to map the landscape of standards and the systems implementing them:
Over 400 bio-standards (public and in curation)
Field*, Sansone* et al., Omics data sharing. Science 326, 234-36 (2009) doi:10.1126/science.1180598
20. • A coherent, curated and searchable catalogue of data sharing resources
• Bioscience standards and associated data-sharing policies, publications, tools and databases
• Assessment criteria for usability and popularity of standards
• Relationships among standards
• Encouragement for communication & interaction among groups
• Promoting interoperability & informed decisions about standards
22. Social engineering
23. Ownership of open standards can be problematic in broad, grass-roots collaborations; it requires improved models to encourage maintenance of and contributions to these efforts, supporting their evolution
24. The extensive community liaison needs to be managed and funded; rewards and incentives need to be identified for all contributors
25. The cost of implementing a standards-supported data sharing vision is as large as the number of stakeholders that must operate synchronously
32. [Diagram] Community Standards + Software Tools → Well-annotated & Structured Data → Reproducible & Reusable Bioscience Research
(enabling reasoning, visualization, analysis, browsing, integration, exchange and retrieval)
33. An exemplar approach to the status quo
§ A grass-roots collaboration that works to facilitate the collection, curation and sharing of experiments using a common, structured representation of the experiments that
• transcends individual biological and technological domains and
• can be ‘configured’ to implement (several of) the community standards
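A rough sketch of what such a common, configurable representation looks like in practice: ISA-Tab expresses the experimental graph (source → protocol → sample) as tab-delimited tables with standard column headings. The file content below is a toy example written with Python's csv module, not output from the real ISA tools, and the sample names are invented.

```python
# Toy sketch of an ISA-Tab-style study table. The column headings follow
# the ISA-Tab convention (Source Name, Characteristics[...], Protocol REF,
# Sample Name); the rows and protocol name are invented for illustration.
import csv
import io

header = ["Source Name", "Characteristics[organism]", "Protocol REF", "Sample Name"]
rows = [
    ["animal1", "Mus musculus", "liver extraction", "animal1.liver"],
    ["animal2", "Mus musculus", "liver extraction", "animal2.liver"],
]

# Write the tab-delimited study table to an in-memory buffer.
buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
study_table = buf.getvalue()

# Because the layout is standard, an ISA-aware tool can recover the
# sample-to-source relationships without custom parsing.
```

The same tabular skeleton can then be "configured" with the terminologies and reporting checklists of a given community, which is what lets one framework serve many domains.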
34. [Diagram: the ISA metadata tracking framework and its user community]
36. A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework to facilitate standards-compliant collection, curation, management and reuse of investigations in an increasingly diverse set of life science domains, including:
• environmental health
• environmental genomics
• metabolomics
• metagenomics
• nanotechnology
• proteomics
• stem cell discovery
• systems biology
• transcriptomics
• toxicogenomics
• also by communities working to build a library of cellular signatures
TOWARDS INTEROPERABLE BIOSCIENCE DATA, Feb 2012
Sansone SA, Rocca-Serra P, Field D, Maguire E, Taylor C, Hofmann O, Fang H, Neumann S, Tong W, Amaral-Zettler L, Begley K, Booth T, Bougueleret L, Burns G, Chapman B, Clark T, Coleman LA, Copeland J, Das S, de Daruvar A, de Matos P, Dix I, Edmunds S, Evelo C, Forster M, Gaudet P, Gilbert J, Goble C, Griffin J, Jacob D, Kleinjans J, Harland L, Haug K, Hermjakob H, Sui S, Laederach A, Liang S, Marshall S, Merrill E, McGrath A, Reilly D, Roux M, Shamu C, Shang C, Steinbeck C, Trefethen A, Williams-Jones B, Wolstencroft K, Xenarios J, Hide W.
40. Implementation at the EBI
41. Extensions of the Nanotechnology Informatics Working Group
42. We must increase the level of annotation
• Notes in lab books (information for humans)
• Spreadsheets and tables (the compromise)
• Facts as RDF statements (information for machines)
§ Invest in curating and managing data at the source, using:
• a common metadata tracking framework, such as ISA
• publicly available and community-developed terminologies
• recording sufficient contextual information about the experimental steps
§ Progressively, datasets will become more comprehensible, interoperable, reproducible and (re)usable, underpinning future investigations
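As a sketch of the spreadsheet-to-RDF end of that spectrum, the same tabular record can be broken into subject-predicate-object statements that a machine can query directly. The URIs, field names and helper below are invented placeholders, not any real vocabulary.

```python
# Sketch of "facts as RDF statements": one spreadsheet-style row expressed
# as subject-predicate-object triples. All URIs here are invented
# placeholders under example.org; real data would use shared vocabularies.

row = {
    "sample": "animal1.liver",
    "organism": "Mus musculus",
    "assay": "metabolite profiling",
}

def row_to_triples(row, base="http://example.org/"):
    """Turn a flat record into (subject, predicate, object) statements."""
    subject = base + row["sample"].replace(".", "/")
    return [
        (subject, base + "organism", row["organism"]),
        (subject, base + "assayType", row["assay"]),
    ]

triples = row_to_triples(row)
for s, p, o in triples:
    print(s, p, o)
```

The spreadsheet remains the practical compromise for humans; the point of the triples is that each fact becomes individually addressable and machine-queryable.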
43. Collaborative approaches are highly valuable but take time
Development timeline, 2007–2012:
Community involvement and uptake:
• 1st, 2nd and 3rd ISA-Tab workshops
• other tools implement ISA-Tab
• user workshops and visits start
• 1st public instance: Harvard Stem Cell Discovery Engine
• a growing number of systems starts to adopt the ISA framework
Core developments:
• strawman ISA-Tab spec, then final ISA-Tab spec
• ISA software v1
• conversions to PRIDE-XML, SRA-XML, MAGE-Tab and more
• database instance at the EBI
• links to analysis tools start
• RDF format starts
Publications:
• ISA-Tab and workshop reports
• ISA software suite (Bioinformatics)
• Omics data sharing (Science)
• Stem Cell Discovery Engine (NAR)
• ISA Commons (Nature Genetics)