Being Reproducible: SSBSS Summer School 2017Carole Goble
Lecture 2:
Being Reproducible: Models, Research Objects and R* Brouhaha
Reproducibility is a R* minefield, depending on whether you are testing for robustness (rerun), defence (repeat), certification (replicate), comparison (reproduce) or transferring between researchers (reuse). Different forms of "R" make different demands on the completeness, depth and portability of research. Sharing is another minefield raising concerns of credit and protection from sharp practices.
In practice the exchange, reuse and reproduction of scientific experiments is dependent on bundling and exchanging the experimental methods, computational codes, data, algorithms, workflows and so on along with the narrative. These "Research Objects" are not fixed, just as research is not “finished”: the codes fork, data is updated, algorithms are revised, workflows break, service updates are released. ResearchObject.org is an effort to systematically support more portable and reproducible research exchange.
In this talk I will explore these issues in more depth using the FAIRDOM Platform and its support for reproducible modelling. The talk will cover initiatives and technical issues, and raise social and cultural challenges.
Being FAIR: FAIR data and model management SSBSS 2017 Summer SchoolCarole Goble
Lecture 1:
Being FAIR: FAIR data and model management
In recent years we have seen a change in expectations for the management of all the outcomes of research – that is the “assets” of data, models, codes, SOPs, workflows. The “FAIR” (Findable, Accessible, Interoperable, Reusable) Guiding Principles for scientific data management and stewardship [1] have proved to be an effective rallying-cry. Funding agencies expect data (and increasingly software) management retention and access plans. Journals are raising their expectations of the availability of data and codes for pre- and post- publication. The multi-component, multi-disciplinary nature of Systems and Synthetic Biology demands the interlinking and exchange of assets and the systematic recording of metadata for their interpretation.
Our FAIRDOM project (http://www.fair-dom.org) supports Systems Biology research projects with their research data, methods and model management, with an emphasis on standards smuggled in by stealth and sensitivity to asset sharing and credit anxiety. The FAIRDOM Platform has been installed by over 30 labs or projects. Our public, centrally hosted Asset Commons, the FAIRDOMHub.org, supports the outcomes of 50+ projects.
Now established as a grassroots association, FAIRDOM has over 8 years of experience of practical asset sharing and data infrastructure at the researcher coal-face ranging across European programmes (SysMO and ERASysAPP ERANets), national initiatives (Germany's de.NBI and Systems Medicine of the Liver; Norway's Digital Life) and European Research Infrastructures (ISBE) as well as in PI's labs and Centres such as the SynBioChem Centre at Manchester.
In this talk I will show explore how FAIRDOM has been designed to support Systems Biology projects and show examples of its configuration and use. I will also explore the technical and social challenges we face.
I will also refer to European efforts to support public archives for the life sciences. ELIXIR (http:// http://www.elixir-europe.org/) the European Research Infrastructure of 21 national nodes and a hub funded by national agreements to coordinate and sustain key data repositories and archives for the Life Science community, improve access to them and related tools, support training and create a platform for dataset interoperability. As the Head of the ELIXIR-UK Node and co-lead of the ELIXIR Interoperability Platform I will show how this work relates to your projects.
[1] Wilkinson et al, The FAIR Guiding Principles for scientific data management and stewardship Scientific Data 3, doi:10.1038/sdata.2016.18
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...Alejandra Gonzalez-Beltran
Metagenomic Data Provenance and Management using the ISA infrastructure - overview, implementation patterns & software tools
Slides presented at EBI Metagenomics Bioinformatics course: http://www.ebi.ac.uk/training/course/metagenomics2014
Being Reproducible: SSBSS Summer School 2017Carole Goble
Lecture 2:
Being Reproducible: Models, Research Objects and R* Brouhaha
Reproducibility is a R* minefield, depending on whether you are testing for robustness (rerun), defence (repeat), certification (replicate), comparison (reproduce) or transferring between researchers (reuse). Different forms of "R" make different demands on the completeness, depth and portability of research. Sharing is another minefield raising concerns of credit and protection from sharp practices.
In practice the exchange, reuse and reproduction of scientific experiments is dependent on bundling and exchanging the experimental methods, computational codes, data, algorithms, workflows and so on along with the narrative. These "Research Objects" are not fixed, just as research is not “finished”: the codes fork, data is updated, algorithms are revised, workflows break, service updates are released. ResearchObject.org is an effort to systematically support more portable and reproducible research exchange.
In this talk I will explore these issues in more depth using the FAIRDOM Platform and its support for reproducible modelling. The talk will cover initiatives and technical issues, and raise social and cultural challenges.
Being FAIR: FAIR data and model management SSBSS 2017 Summer SchoolCarole Goble
Lecture 1:
Being FAIR: FAIR data and model management
In recent years we have seen a change in expectations for the management of all the outcomes of research – that is the “assets” of data, models, codes, SOPs, workflows. The “FAIR” (Findable, Accessible, Interoperable, Reusable) Guiding Principles for scientific data management and stewardship [1] have proved to be an effective rallying-cry. Funding agencies expect data (and increasingly software) management retention and access plans. Journals are raising their expectations of the availability of data and codes for pre- and post- publication. The multi-component, multi-disciplinary nature of Systems and Synthetic Biology demands the interlinking and exchange of assets and the systematic recording of metadata for their interpretation.
Our FAIRDOM project (http://www.fair-dom.org) supports Systems Biology research projects with their research data, methods and model management, with an emphasis on standards smuggled in by stealth and sensitivity to asset sharing and credit anxiety. The FAIRDOM Platform has been installed by over 30 labs or projects. Our public, centrally hosted Asset Commons, the FAIRDOMHub.org, supports the outcomes of 50+ projects.
Now established as a grassroots association, FAIRDOM has over 8 years of experience of practical asset sharing and data infrastructure at the researcher coal-face ranging across European programmes (SysMO and ERASysAPP ERANets), national initiatives (Germany's de.NBI and Systems Medicine of the Liver; Norway's Digital Life) and European Research Infrastructures (ISBE) as well as in PI's labs and Centres such as the SynBioChem Centre at Manchester.
In this talk I will show explore how FAIRDOM has been designed to support Systems Biology projects and show examples of its configuration and use. I will also explore the technical and social challenges we face.
I will also refer to European efforts to support public archives for the life sciences. ELIXIR (http:// http://www.elixir-europe.org/) the European Research Infrastructure of 21 national nodes and a hub funded by national agreements to coordinate and sustain key data repositories and archives for the Life Science community, improve access to them and related tools, support training and create a platform for dataset interoperability. As the Head of the ELIXIR-UK Node and co-lead of the ELIXIR Interoperability Platform I will show how this work relates to your projects.
[1] Wilkinson et al, The FAIR Guiding Principles for scientific data management and stewardship Scientific Data 3, doi:10.1038/sdata.2016.18
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...Alejandra Gonzalez-Beltran
Metagenomic Data Provenance and Management using the ISA infrastructure - overview, implementation patterns & software tools
Slides presented at EBI Metagenomics Bioinformatics course: http://www.ebi.ac.uk/training/course/metagenomics2014
Tripal within the Arabidopsis Information Portal - PAG XXIIIVivek Krishnakumar
Araport plans to implement a Chado-backed data warehouse, fronted by Tripal, serving as as our core database, used to track multiple versions of genome annotation (TAIR10, Araport11, etc.), evidentiary data (used by our annotation update pipeline), metadata such as publications collated from multiple sources like TAIR, NCBI PubMed and UniProtKB (curated and unreviewed) and stock/germplasm data linked to AGI loci via their associated polymorphisms.
Written and presented by Tom Ingraham (F1000), at the Reproducible and Citable Data and Model Workshop, in Warnemünde, Germany. September 14th -16th 2015.
What is Reproducibility? The R* brouhaha (and how Research Objects can help)Carole Goble
presented at 1st First International Workshop on Reproducible Open Science @ TPDL, 9 Sept 2016, Hannover, Germany
http://repscience2016.research-infrastructures.eu/
Improving the Management of Computational Models -- Invited talk at the EBIMartin Scharm
Improving the Management of Computational Models:
storage – retrieval & ranking – version control
More information and slides to download at http://sems.uni-rostock.de/2013/12/martin-visits-the-ebi/
PMR metabolomics and transcriptomics database and its RESTful web APIs: A dat...Araport
PMR database is a community resource for deposition and analysis of metabolomics data and related transcriptomics data. PMR currently houses metabolomics data from over 25 species of eukaryotes. In this talk, we introduce PMRs RESTful web APIs for data sharing, and demonstrate its applications in research using Araport to provide Arabidopsis metabolomics data.
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...dgarijo
Traditional approaches to ontology development have a large lapse between the time when a user using the ontology has found a need to extend it and the time when it does get extended. For scientists, this delay can be weeks or months and can be a significant barrier for adoption. We present a new approach to ontology development and data annotation enabling users to add new metadata properties on the fly as they describe their datasets, creating terms that can be immediately adopted by others and eventually become standardized. This approach combines a traditional, consensus-based approach to ontology development, and a crowdsourced approach where ex-pert users (the crowd) can dynamically add terms as needed to support their work. We have implemented this approach as a socio-technical system that includes: 1) a crowdsourcing platform to support metadata annotation and addition of new terms, 2) a range of social editorial processes to make standardization decisions for those new terms, and 3) a framework for ontology revision and updates to the metadata created with the previous version of the ontology. We present a prototype implementation for the paleoclimate community, the Linked Earth Framework, currently containing 700 datasets and engaging over 50 active contributors. Users exploit the platform to do science while extending the metadata vocabulary, thereby producing useful and practical metadata
FAIR Workflows and Research Objects get a Workout Carole Goble
So, you want to build a pan-national digital space for bioscience data and methods? That works with a bunch of pre-existing data repositories and processing platforms? So you can share FAIR workflows and move them between services? Package them up with data and other stuff (or just package up data for that matter)? How? WorkflowHub (https://workflowhub.eu) and RO-Crate Research Objects (https://www.researchobject.org/ro-crate) that’s how! A step towards FAIR Digital Objects gets a workout.
Presented at DataVerse Community Meeting 2021
This talk explores how principles derived from experimental design practice, data and computational models can greatly enhance data quality, data generation, data reporting, data publication and data review.
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...Carole Goble
Over the past 5 years we have seen a change in expectations for the management of all the outcomes of research – that is the “assets” of data, models, codes, SOPs and so forth. Don’t stop reading. Data management isn’t likely to win anyone a Nobel prize. But publications should be supported and accompanied by data, methods, procedures, etc. to assure reproducibility of results. Funding agencies expect data (and increasingly software) management retention and access plans as part of the proposal process for projects to be funded. Journals are raising their expectations of the availability of data and codes for pre- and post- publication. The multi-component, multi-disciplinary nature of Systems Biology demands the interlinking and exchange of assets and the systematic recording
of metadata for their interpretation.
The FAIR Guiding Principles for scientific data management and stewardship (http://www.nature.com/articles/sdata201618) has been an effective rallying-cry for EU and USA Research Infrastructures. FAIRDOM (Findable, Accessible, Interoperable, Reusable Data, Operations and Models) Initiative has 8 years of experience of asset sharing and data infrastructure ranging across European programmes (SysMO and EraSysAPP ERANets), national initiatives (de.NBI, German Virtual Liver Network, UK SynBio centres) and PI's labs. It aims to support Systems and Synthetic Biology researchers with data and model management, with an emphasis on standards smuggled in by stealth and sensitivity to asset sharing and credit anxiety.
This talk will use the FAIRDOM Initiative to discuss the FAIR management of data, SOPs, and models for Sys Bio, highlighting the challenges of and approaches to sharing, credit, citation and asset infrastructures in practice. I'll also highlight recent experiments in affecting sharing using behavioural interventions.
http://www.fair-dom.org
http://www.fairdomhub.org
http://www.seek4science.org
Presented at COMBINE 2016, Newcastle, 19 September.
http://co.mbine.org/events/COMBINE_2016
Reproducibility of model-based results: standards, infrastructure, and recogn...FAIRDOM
Written and presented by Dagmar Waltemath (University of Rostock) as part of the Reproducible and Citable Data and Models Workshop in Warnemünde, Germany. September 14th - 16th 2015.
Tripal within the Arabidopsis Information Portal - PAG XXIIIVivek Krishnakumar
Araport plans to implement a Chado-backed data warehouse, fronted by Tripal, serving as as our core database, used to track multiple versions of genome annotation (TAIR10, Araport11, etc.), evidentiary data (used by our annotation update pipeline), metadata such as publications collated from multiple sources like TAIR, NCBI PubMed and UniProtKB (curated and unreviewed) and stock/germplasm data linked to AGI loci via their associated polymorphisms.
Written and presented by Tom Ingraham (F1000), at the Reproducible and Citable Data and Model Workshop, in Warnemünde, Germany. September 14th -16th 2015.
What is Reproducibility? The R* brouhaha (and how Research Objects can help)Carole Goble
presented at 1st First International Workshop on Reproducible Open Science @ TPDL, 9 Sept 2016, Hannover, Germany
http://repscience2016.research-infrastructures.eu/
Improving the Management of Computational Models -- Invited talk at the EBIMartin Scharm
Improving the Management of Computational Models:
storage – retrieval & ranking – version control
More information and slides to download at http://sems.uni-rostock.de/2013/12/martin-visits-the-ebi/
PMR metabolomics and transcriptomics database and its RESTful web APIs: A dat...Araport
PMR database is a community resource for deposition and analysis of metabolomics data and related transcriptomics data. PMR currently houses metabolomics data from over 25 species of eukaryotes. In this talk, we introduce PMRs RESTful web APIs for data sharing, and demonstrate its applications in research using Araport to provide Arabidopsis metabolomics data.
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...dgarijo
Traditional approaches to ontology development have a large lapse between the time when a user using the ontology has found a need to extend it and the time when it does get extended. For scientists, this delay can be weeks or months and can be a significant barrier for adoption. We present a new approach to ontology development and data annotation enabling users to add new metadata properties on the fly as they describe their datasets, creating terms that can be immediately adopted by others and eventually become standardized. This approach combines a traditional, consensus-based approach to ontology development, and a crowdsourced approach where ex-pert users (the crowd) can dynamically add terms as needed to support their work. We have implemented this approach as a socio-technical system that includes: 1) a crowdsourcing platform to support metadata annotation and addition of new terms, 2) a range of social editorial processes to make standardization decisions for those new terms, and 3) a framework for ontology revision and updates to the metadata created with the previous version of the ontology. We present a prototype implementation for the paleoclimate community, the Linked Earth Framework, currently containing 700 datasets and engaging over 50 active contributors. Users exploit the platform to do science while extending the metadata vocabulary, thereby producing useful and practical metadata
FAIR Workflows and Research Objects get a Workout Carole Goble
So, you want to build a pan-national digital space for bioscience data and methods? That works with a bunch of pre-existing data repositories and processing platforms? So you can share FAIR workflows and move them between services? Package them up with data and other stuff (or just package up data for that matter)? How? WorkflowHub (https://workflowhub.eu) and RO-Crate Research Objects (https://www.researchobject.org/ro-crate) that’s how! A step towards FAIR Digital Objects gets a workout.
Presented at DataVerse Community Meeting 2021
This talk explores how principles derived from experimental design practice, data and computational models can greatly enhance data quality, data generation, data reporting, data publication and data review.
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...Carole Goble
Over the past 5 years we have seen a change in expectations for the management of all the outcomes of research – that is the “assets” of data, models, codes, SOPs and so forth. Don’t stop reading. Data management isn’t likely to win anyone a Nobel prize. But publications should be supported and accompanied by data, methods, procedures, etc. to assure reproducibility of results. Funding agencies expect data (and increasingly software) management retention and access plans as part of the proposal process for projects to be funded. Journals are raising their expectations of the availability of data and codes for pre- and post- publication. The multi-component, multi-disciplinary nature of Systems Biology demands the interlinking and exchange of assets and the systematic recording
of metadata for their interpretation.
The FAIR Guiding Principles for scientific data management and stewardship (http://www.nature.com/articles/sdata201618) has been an effective rallying-cry for EU and USA Research Infrastructures. FAIRDOM (Findable, Accessible, Interoperable, Reusable Data, Operations and Models) Initiative has 8 years of experience of asset sharing and data infrastructure ranging across European programmes (SysMO and EraSysAPP ERANets), national initiatives (de.NBI, German Virtual Liver Network, UK SynBio centres) and PI's labs. It aims to support Systems and Synthetic Biology researchers with data and model management, with an emphasis on standards smuggled in by stealth and sensitivity to asset sharing and credit anxiety.
This talk will use the FAIRDOM Initiative to discuss the FAIR management of data, SOPs, and models for Sys Bio, highlighting the challenges of and approaches to sharing, credit, citation and asset infrastructures in practice. I'll also highlight recent experiments in affecting sharing using behavioural interventions.
http://www.fair-dom.org
http://www.fairdomhub.org
http://www.seek4science.org
Presented at COMBINE 2016, Newcastle, 19 September.
http://co.mbine.org/events/COMBINE_2016
Reproducibility of model-based results: standards, infrastructure, and recogn...FAIRDOM
Written and presented by Dagmar Waltemath (University of Rostock) as part of the Reproducible and Citable Data and Models Workshop in Warnemünde, Germany. September 14th - 16th 2015.
Workflows, provenance and reporting: a lifecycle perspective at BIH 2013, RomeCarole Goble
Workflow systems support the design, configuration and execution of repetitive, multi-step pipelines and analytics, well established in many disciplines, notably biology and chemistry, but less so in biodiversity and ecology. From an experimental perspective workflows are a means to handle the work of accessing an ecosystem of software and platforms, manage data and security, and handle errors. From a reporting perspective they are a means to accurately document methodology for reproducibility, comparison, exchange and reuse, and to trace the provenance of results for review, credit, workflow interoperability and impact analysis. Workflows operate in an evolving ecosystem and are assemblages of components in that ecosystem; their provenance trails are snapshots of intermediate and final results. Taking a lifecycle perspective, what are the challenges in workflow design and use with different stakeholders? What needs to be tackled in evolution, resilience, and preservation? And what are the “mitigate or adapt” strategies adopted by workflow systems in the face of changes in the ecosystem/environment, for example when tools are depreciated or datasets become inaccessible in the face of funding shortfalls?
Keynote: SemSci 2017: Enabling Open Semantic Science
1st International Workshop co-located with ISWC 2017, October 2017, Vienna, Austria,
https://semsci.github.io/semSci2017/
Abstract
We have all grown up with the research article and article collections (let’s call them libraries) as the prime means of scientific discourse. But research output is more than just the rhetorical narrative. The experimental methods, computational codes, data, algorithms, workflows, Standard Operating Procedures, samples and so on are the objects of research that enable reuse and reproduction of scientific experiments, and they too need to be examined and exchanged as research knowledge.
We can think of “Research Objects” as different types and as packages all the components of an investigation. If we stop thinking of publishing papers and start thinking of releasing Research Objects (software), then scholar exchange is a new game: ROs and their content evolve; they are multi-authored and their authorship evolves; they are a mix of virtual and embedded, and so on.
But first, some baby steps before we get carried away with a new vision of scholarly communication. Many journals (e.g. eLife, F1000, Elsevier) are just figuring out how to package together the supplementary materials of a paper. Data catalogues are figuring out how to virtually package multiple datasets scattered across many repositories to keep the integrated experimental context.
Research Objects [1] (http://researchobject.org/) is a framework by which the many, nested and contributed components of research can be packaged together in a systematic way, and their context, provenance and relationships richly described. The brave new world of containerisation provides the containers and Linked Data provides the metadata framework for the container manifest construction and profiles. It’s not just theory, but also in practice with examples in Systems Biology modelling, Bioinformatics computational workflows, and Health Informatics data exchange. I’ll talk about why and how we got here, the framework and examples, and what we need to do.
[1] Sean Bechhofer, Iain Buchan, David De Roure, Paolo Missier, John Ainsworth, Jiten Bhagat, Philip Couch, Don Cruickshank, Mark Delderfield, Ian Dunlop, Matthew Gamble, Danius Michaelides, Stuart Owen, David Newman, Shoaib Sufi, Carole Goble, Why linked data is not enough for scientists, In Future Generation Computer Systems, Volume 29, Issue 2, 2013, Pages 599-611, ISSN 0167-739X, https://doi.org/10.1016/j.future.2011.08.004
Curation-Friendly Tools for the Scientific Researcherbwestra
Presentation for Online Northwest Conference, in Corvallis Oregon, February 10, 2012.
Highlights electronic lab notebooks (ELN) and OMERO (Open Microscopy Environment) as two tools that enable researchers to better manage their research data.
Resource Description Framework Approach to Data Publication and FederationPistoia Alliance
Bob Stanley, CEO, IO Informatics, explains the utility to RDF as a standard way of defining and redefining data that could have utility in managing life science information.
Keynote presentation delivered at ELAG 2013 in Gent, Belgium, on May 29 2013. Discusses Research Objects and the relationship to work my team has been involved in during the past couple of years: OAI-ORE, Open Annotation, Memento.
Starting from scratch – building the perfect digital repositoryVioleta Ilik
By establishing a digital repository on the Feinberg School of Medicine (FSM), Northwestern University, Chicago campus, we anticipate to gain ability to create, share, and preserve attractive, functional, and citable digital collections and exhibits. Galter Health Sciences Library did not have a repository as of November 2014. In just a few moths we formed a small team that was charged at looking to select the most suitable open source platform for our digital repository software. We followed the National Library of Medicine master evaluation criteria by looking at various factors that included: functionality, scalability, extensibility, interoperability, ease of deployment, system security, system, physical environment, platform support, demonstrated successful deployments, system support, strength of development community, stability of development organization, and strength of technology roadmap for the future. These factors are important for our case considering the desire to connect the digital repository with another platform that was an essential piece in the big FSM picture – VIVO. VIVO is a linked data platform that serves as a researchers’ hub and which provides the names of researchers from academic institutions along with their research output, affiliation, research overview, service, background, researcher’s identities, teaching, and much more.
Talk given at the Data Visualisation and the Future of Academic Publishing event. https://www.eventbrite.com/e/data-visualisation-and-the-future-of-academic-publishing-tickets-25372801733?password=dataviz
From peer-reviewed to peer-reproduced: a role for research objects in scholar...Alejandra Gonzalez-Beltran
The reproducibility of science in the digital age is attracting a lot of attention and concerns from the scientific community, where studies have shown the inability to reproduce results due to a variety of reasons, ranging from unavailability of the data to lack of proper descriptions of the experimental steps.
Multiple research object models have been proposed to describe different aspects of the research process. Investigation/Study/Assay (ISA) is a widely used general-purpose metadata tracking framework with an associated suite of open-source software, which offers a rich description of the experiment’s hypotheses and design, investigators involved, experimental factors, protocols applied. The information is organised in a three-level hierarchy where ’Investigation’ provides the project context for a ’Study’ (a research question), which itself contains one or more ’Assays’ (taking analytical measurements and key data processing and analysis steps). Nanopublication (NP) is a research object model which enables specific scientific assertions, such as the conclusions of an experiment, to be annotated with supporting evidence, published and cited. Lastly, the Research Object (RO) is a model that enables the aggregation of the digital resources contributing to findings of computational research, including results, data and software, as citable compound digital objects.
For computational reproducibility, platforms such as Taverna and Galaxy are popular and efficient ways to represent the data analysis steps in the form of reusable workflows, where the data transformations can be specified and executed in an automatic way.
In this presentation, we will address the question of whether such research object models and workflow representation frameworks can be used to assist in the peer review process, by facilitating evaluation of the accuracy of the information provided by scientific articles with respect to their repeatability.
Our case study is based on an article on a genome assembler algorithm published in GigaScience, but due to the proven use of the respective research object models in their respective communities, we argue that the combination of models and workflow system will improve the scholarly publishing process, making science peer-reproduced.
Increased access to the data generated is fuelling increased consumption and accelerating the cycle of discovery. But the successful integration and re-use of heterogeneous data from multiple providers and scientific domains is a major challenge within academia and industry, often due to incomplete description of the study details or metadata about the study. Using the BioSharing, ISA Commons and the STATistics Ontology (STATO) projects as exemplar community efforts, in this breakout session we will discuss the evolving portfolio of community-based standards and methods for structuring and curating datasets, from experimental descriptions to the results of analysis.
http://www.methodsinecologyandevolution.org/view/0/events.html#Data_workshop
Seminario en CIFASIS, Rosario, Argentina - Seminar in CIFASIS, Rosario, Argen...Alejandra Gonzalez-Beltran
La biología experimental se ha convertido en una ciencia intensiva en datos, gracias a los avances en las tecnologías de adquisición de señales digitales y biosensores. La disponibilidad de los datos es fundamental para la transparencia del proceso científico: para poder reproducir los resultados y también para la reutilización de los datos en estudios futuros. Esta charla explorará distintas herramientas de software que facilitan el proceso de generación de metadatos para mejorar la calidad, el reporte, la publicación y la revisión de datos, con énfasis en aplicaciones biomédicas.
Slides introducing the session on 'Big data in healthcare' at the Brazil-UK Frontiers of Engineering symposium held at Jarinu, Sao Paulo, Brazil - 6-8 November 2014.
HEAP SORT ILLUSTRATED WITH HEAPIFY, BUILD HEAP FOR DYNAMIC ARRAYS.
Heap sort is a comparison-based sorting technique based on Binary Heap data structure. It is similar to the selection sort where we first find the minimum element and place the minimum element at the beginning. Repeat the same process for the remaining elements.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
6th International Conference on Machine Learning & Applications (CMLA 2024)ClaraZara1
6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of on Machine Learning & Applications.
Technical Drawings introduction to drawing of prisms
COPO kick-off meeting
1. ISA tools overview
Alejandra González-Beltrán, PhD
Oxford e-Research Centre, University of Oxford
alejandra.gonzalezbeltran@oerc.ox.ac.uk @alegonbel
COPO kick-off meeting September 16, 2014 Norwich, UK
2. • Nature Publishing Group‘s Scientific Data
• BioMedCentral and BGI‘s GigaScience
• F1000 Research
• Oxford University Press
Ontology and sematic web knowledge
representations
Ontology for Biomedical Investigation (OBI) for the
description of the experimental steps
Statistical ontology (STATO) for the description of
statistical results
Applying the linked data and the nanopublication
approaches to ISA-Tab
eTRIKS – EUTranslational Information and
Knowledge management Services
Consortium of academic (Imperial College, CNRS, Un
of Luxemburg) and pharmas (Janssen, Merck, AZ,
Lilly, Lundbeck, Pfizer, Roche, Sanofi, Bayer, GSK)
building a sustainable, open translational research
informatics platform
WG in involving PloS, NPG, BMC, BMJ, OUP, etc.
Centre for Extended Data Annotation
and Retrieval (pending notification of
award)
Centre will explore a variety of technique to ease the
work of creating and using standards-driven templates
for describing and reporting biomedical datasets
A community-driven platform for plant
science
Coordinated by TGAC and in collaboration with
EMBL-European Bioinformatics Institute and Warwick
University, COPO will provide platform- and software-as-
a-service
3. Related Funded Project:
Continued Development and Curation
of the MetaboLights database
• Large metabolomics lab submission interfaces/
integration (API)
• Analysis tools (online)
• Tools for visualisation of metabolomics data
• Enrichment and extension to the reference layer
• A curation tool for MetaboLights submissions
• Metabolite identification assistant
http://www.bbsrc.ac.uk/pa/grants/AwardDetails.aspx?FundingReference=BB/L024152/1
6. Experimental Metadata
Notes in lab notebooks
(information for humans) Spreadsheets & tables
ISA-Tab
6
RDF statements
(information for machines)
7. Experimental Metadata
Notes in lab notebooks
(information for humans) Spreadsheets & tables
ISA-Tab
RDF statements
(information for machines)
It is all about structuring experimental information to make it available to
computers and software agents to enable:
7
!
provenance tracking
assessment and evaluation
accountability, reliability, trust, evidence
conservation, preservation, storage, archiving and mining
11. Why ISA format and Tools?
investigation
assay(s) assay(s)
pointers to data file
names/location
external files in
native or other for-mats
data data
investigation
high level concept to link
related studies
study
the central unit, containing
information on the subject
under study, its characteristics
and any treatments applied.
a study has associated assays
assay
test performed either on
material taken from the sub-ject
or on the whole initial
subject, which produce quali-tative
or quantitative meas-urements
(data)
H. Sapiens
H. Sapiens
H. Sapiens
H. Sapiens
33 Years
H1
H1
H2
35
35
33
Years
Years
Years
ISA metadata specifications:
!
• workflow and process
orientated
• compatible with checklist
enforcement
• compatible with external
vocabulary resources
• compatible by design with
existing schemas
!
H1.sample1
H1.sample2
H2.sample1
Labeling
Labeling
H1.sample1.labeled
H2.sample1.labeled
h1-s1.cel
h1-s2.cel
h2-s1.cel
H1
H2
H1.sample1
H1.sample2
H2.sample1
Labeling
Labeling
H1.sample1.labeled
H2.sample1.labeled
h1-s1.cel
h1-s2.cel
h2-s1.cel
H. Sapiens
35 Years
MAGE-Tab
Pride-xml SRA-xml
26. 26
A growing ecosystem of over 30 public and internal resources using
the ISA metadata tracking framework (ISA-Tab and/or tools) to
facilitate standards-compliant collection, curation, management and
reuse of investigations in an increasingly diverse set of life science
domains, including:
!
• stem cell discovery
• system biology
• transcriptomics
• toxicogenomics
• also by communities working to build a library of cellular
signatures
!
• environmental health
• environmental genomics
• metabolomics
• metagenomics
• nanotechnology
• proteomics
28. Thanks for your attention!
Questions?
You can email us...
isatools@googlegroups.com
View our websites
View our Git repo & contribute
http://github.com/ISA-tools
View our blog
http://isatools.wordpress.com
Follow us on Twitter
@isatools