This document discusses databases in bioinformatics. It begins by explaining that bioinformatics concerns the creation and maintenance of biological databases that allow researchers to access existing information and submit new entries. The aims of bioinformatics are to organize data, develop analysis tools, and use those tools to analyze data and interpret the results in a biologically meaningful way. Several important biological databases are described, including the nucleotide sequence databases hosted by NCBI and protein sequence databases. GenBank is discussed as the annotated collection of all publicly available DNA sequences. Biological databases make large datasets available to researchers and form an important part of the infrastructure of biological research.
Fabricio Silva: Cloud Computing Technologies for Genomic Big Data Analysis - Flávio Codeço Coelho
This document discusses the use of cloud computing technologies for genomic big data analysis. It begins by defining big data and describing the exponential growth of genomic data. It then discusses how cloud computing provides flexibility, scalability, and accessibility for genomic data processing through virtualization and large computing clusters. Specific cloud-enabled technologies that support genomic analysis are described, such as Hadoop, MapReduce, and genomic analysis tools adapted to these frameworks. The document concludes by discussing remaining challenges around data transfer speeds and the need for cloud application expertise, but also describes how platforms like Galaxy Cloudman and Cloudgene allow genomic analysis in the cloud without programming expertise.
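To make the MapReduce pattern named above concrete, here is a minimal, framework-free sketch of the two phases applied to a common genomic task, k-mer counting. The task and all names are illustrative assumptions rather than content of the summarized slides; on Hadoop, equivalent mapper and reducer scripts would run in parallel over distributed storage.

```python
# Map/reduce in miniature: count k-mers across sequencing reads.
from collections import defaultdict

def mapper(read: str, k: int = 4):
    """Map phase: emit (k-mer, 1) for every k-length window in one read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k], 1

def reducer(pairs):
    """Reduce phase: sum the counts per key (after shuffling by k-mer)."""
    counts = defaultdict(int)
    for kmer, n in pairs:
        counts[kmer] += n
    return dict(counts)

reads = ["ACGTACGT", "CGTACGTA"]
pairs = (pair for read in reads for pair in mapper(read))
print(reducer(pairs))
```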
Bioinformatics databases aim to manage the complexity of life by integrating diverse biological data types. Relational databases use standardized identifiers and data formats to store sequence, expression, proteomic, and metabolomic data. Cross-referencing multiple databases through data warehousing and centralized schemas allows for functional querying of biological networks and neighborhoods. Future directions include greater use of machine learning, data mining, and global data standards.
This document discusses challenges to reproducibility in systems biology and potential solutions. It notes that a lack of data standards, quality, availability, and transparency makes it difficult for researchers to reproduce results. Tools and initiatives discussed that aim to improve reproducibility include the COMBINE archive to bundle necessary files, graph databases to integrate model-related data, and version control systems to track model evolution over time. The overall goal is to better support scientists in sharing reproducible model-based studies.
The document discusses three data retrieval tools - Entrez, DBGET, and SRS - that allow molecular biologists to search and access information across multiple linked databases. Entrez, developed by NCBI, integrates information from databases including GenBank, RefSeq, PDB, and PubMed. SRS, developed at EBI, is open-source software with a scripting language called Icarus; it integrates over 80 molecular biology databases, and the public SRS network indexes over 250 databases across more than 35 servers worldwide. It allows searching of sequence, structure, gene-related, and bibliographic databases through a uniform web interface.
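As an illustration of programmatic access to Entrez, here is a minimal sketch using Biopython's Bio.Entrez module, assuming Biopython is installed; the e-mail address and query term are placeholders.

```python
# Query NCBI Entrez for protein records via Biopython (pip install biopython).
from Bio import Entrez

Entrez.email = "you@example.org"  # NCBI asks for a contact address on every request

handle = Entrez.esearch(
    db="protein",
    term="insulin[Protein Name] AND human[Organism]",  # Entrez web-style query
    retmax=5,
)
record = Entrez.read(handle)
handle.close()

print(record["IdList"])  # UIDs; pass them to Entrez.efetch to retrieve full records
```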
This document discusses biological databases. It defines biological databases as structured, searchable collections of biological data that are periodically updated and cross-referenced. It notes that biological databases store data electronically and serve to systematize data, make it available, and allow analysis of computed biological data. The document then describes some key features of biological databases, including data heterogeneity, high data volumes, uncertainty, data curation, integration, sharing, and dynamic nature. It also provides examples of different types of biological databases classified by data type, maintainer, access, source, design, and organism covered.
Slides from the presentation at IDAMO 2016, Rostock. May 2016.
Most scientific discoveries rely on previous or other findings. A lack of transparency and openness led to what many consider the "reproducibility crisis" in systems biology and systems medicine. The crisis arose from missing standards and inappropriate support of standards in software tools. As a consequence, numerous results in low- and high-profile publications cannot be reproduced.
In my presentation, I summarise key challenges of reproducibility in systems biology and systems medicine, and I demonstrate available solutions to the related problems.
This document describes several text-based biological databases and how to search them. It discusses Entrez, which searches multiple databases and links related entries. It also describes the Sequence Retrieval System (SRS) which allows searching over 80 biological databases. Additionally, it outlines DBGET/LinkDB, an integrated system that searches about 20 databases and links results to associated information. The document provides an example of using each system to retrieve information on a specific protein entry.
This document discusses databases in bioinformatics. It begins by noting the rapid increase in biological data from sources like gene sequences, protein sequences, structural data, and gene expression data. It then defines biological databases as structured, searchable collections of data that are periodically updated and cross-referenced. The major purposes of databases are to make biological data available, systematize the data, and allow analysis of computed biological data. The document provides a brief history of biological databases and sequencing efforts. It also classifies biological databases based on data type, maintenance status, data access, data sources, database design, and organism. Specific databases discussed include DDBJ, EMBL, GenBank, Swiss-Prot, and NCBI.
This document discusses leveraging graph data structures to analyze variant data and related annotations from large genomic datasets. In phase I, simple queries on a graph database performed as well as or better than on a relational database. Complex queries exploring patterns and clusters were also possible. In phase II, spectral clustering of 1000 Genomes data identified three main clusters supporting known population genetics patterns, demonstrating the potential of graph databases for mining complex genomic correlations. The results indicate a graph database provides an effective approach for precision cancer research by enabling both known and novel queries on large genomic datasets.
This document discusses leveraging graph data structures to analyze variant data and related annotations from large genomic datasets in a scalable way. An in-memory graph database was used to model variants, annotations, and their relationships. Simple queries on the graph performed as well as or better than on a relational database. More complex queries and analyses, like spectral clustering of populations, were also possible with the graph model and helped identify patterns not feasible with relational approaches. The results indicate graph databases are a powerful tool for precision medicine research by enabling both known and novel analyses of large genomic datasets.
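To illustrate the kind of query both summaries describe, here is a hedged sketch using the Neo4j Python driver and Cypher; Neo4j stands in for whichever graph engine the studies used, and the Variant/Annotation/Gene schema is entirely hypothetical.

```python
# Find variants that share an annotation with a given gene (hypothetical schema).
# Assumes a local Neo4j instance and the official driver (pip install neo4j).
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

QUERY = """
MATCH (v:Variant)-[:ANNOTATED_WITH]->(a:Annotation)
      <-[:ANNOTATED_WITH]-(g:Gene {symbol: $symbol})
RETURN v.id AS variant, a.name AS annotation
LIMIT 25
"""

with driver.session() as session:
    for row in session.run(QUERY, symbol="TP53"):
        print(row["variant"], row["annotation"])

driver.close()
```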
This document discusses data and model management in systems biology. It covers topics such as data ownership, metadata, ontologies, standards for encoding models and analyses, and tools for working with systems biology models and data. Standards like SBML, SBGN, SED-ML and COMBINE Archive allow for structured representation, visualization, simulation, and sharing of models and data. Resources like SEEK enable curation, simulation and publication of models in a findable, accessible, interoperable and reusable (FAIR) manner.
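As a sketch of the COMBINE Archive idea (a zip that bundles a model, a simulation description, and a manifest listing each file's format), the following uses only the Python standard library. The manifest is abridged and the bundled files are empty stand-ins, so treat this as an illustration rather than a spec-complete OMEX writer.

```python
# Bundle stand-in files into a COMBINE-archive-style zip with a manifest.
import zipfile

MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<omexManifest xmlns="http://identifiers.org/combine.specifications/omex-manifest">
  <content location="." format="http://identifiers.org/combine.specifications/omex"/>
  <content location="model.xml" format="http://identifiers.org/combine.specifications/sbml"/>
  <content location="simulation.sedml" format="http://identifiers.org/combine.specifications/sed-ml"/>
</omexManifest>
"""

with zipfile.ZipFile("study.omex", "w") as omex:
    omex.writestr("manifest.xml", MANIFEST)          # required index of the archive
    omex.writestr("model.xml", "<sbml/>")            # stand-in SBML model
    omex.writestr("simulation.sedml", "<sedML/>")    # stand-in SED-ML description
```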
This document provides information on biological databases, including their history, features, and classifications. It notes that the first protein sequenced was insulin in 1965, and the first genome sequenced was of a virus in 1995. Key features of biological databases discussed include their heterogeneity, high volume of data, uncertainty, data curation, integration, sharing, and dynamic nature as new data is added. Biological databases can be classified by data type, maintainer status, data access, source, design, and organism covered. The purpose of biological databases is to systematically organize and make available vast amounts of complex biological data.
Science is rapidly being brought into the electronic realm, and electronic laboratory notebooks (ELNs) are a big part of this activity. The representation of the scientific process in the context of an ELN is an important component of making the data recorded in ELNs semantically integrated.
This presentation outlined initial developments of an Electronic Notebook Ontology (ENO) that will help tie together the ExptML ontology, HCLS Community Profile data descriptions, and the VIVO-ISF ontology.
The document summarizes Anita de Waard's presentation on Elsevier's experiments with big and small data. It discusses Elsevier's work with text mining and knowledge graphs to extract information from over 14 million articles. It also describes Elsevier's Medical Graph which predicts the probability of over 2,000 medical conditions occurring based on analysis of clinical data from 6 million patients. Finally, it reviews Elsevier's various tools and services to help researchers preserve, process, share, comprehend, access, and discover research data and publications.
Research data management (RDM) and the FAIR principles (Findable, Accessible, Interoperable, Reusable) are widely promoted as the basis for a shared research data infrastructure. Nevertheless, researchers involved in next generation sequencing (NGS) still lack adequate RDM solutions. NGS metadata is generally not stored together with the raw NGS data, but kept by individual researchers in separate files, which complicates RDM practice. Moreover, the (meta)data often does not meet the FAIR principles [6]. Consequently, a central FAIR-compliant repository is highly desirable to support NGS-related research. We have selected iRODS (integrated Rule-Oriented Data System) [3] as the basis for implementing a sequencing data repository because it allows storing data and metadata together. iRODS serves as scalable middleware that gives centralized, virtualized access to different storage facilities and supports different types of clients. This repository will be part of an ecosystem of RDM solutions that cover complementary phases of the research data life cycle in our organization (Academic Medical Center of the University of Amsterdam). We selected Virtuoso [5] to enrich the metadata from iRODS and to manage a triplestore for linked data. The metadata in the iCAT (the iRODS metadata catalogue) and the ontology in Virtuoso are kept synchronized by enforcing strict data manipulation policies.
We have implemented a prototype to preserve raw sequencing data for one research group. Three iRODS client interfaces are used for different purposes: Davrods [4] for data and metadata ingestion and data retrieval; Metalnx-web [7] for administration, data curation, and repository browsing; and iCommands [2] for all tasks by advanced users. Different user profiles (principal investigator, data curator, repository administrator) are defined, each with different access rights. New data is ingested by copying raw sequence files and the corresponding metadata file (a sample sheet) to the landing collection on iRODS. An iRODS rule triggered by the sample sheet file extracts the metadata and registers it in the iCAT as AVUs (Attribute, Value, Unit triples). Ontology files are registered in Virtuoso. The sequence files are copied to the persistent collection and made uniquely identifiable based on their metadata. All steps are recorded in a report file that enables monitoring and tracking of progress and faults. Here we describe the design and implementation of the prototype and discuss the first assessment results, which indicate that the proposed solution is acceptable and fits the researchers' workflow well.
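A minimal sketch of the metadata-extraction step just described: turning one row of a hypothetical sample sheet into the (Attribute, Value, Unit) triples that the iRODS rule registers in the iCAT. The column names and unit lookup are illustrative assumptions, not the prototype's actual schema.

```python
# Parse one row of a made-up sample sheet into AVU triples.
# In the prototype this extraction is performed by an iRODS rule, and the
# resulting AVUs are attached to the data object in the iCAT.
import csv
import io

sample_sheet = io.StringIO(
    "sample_id,organism,read_length,platform\n"
    "S001,Homo sapiens,150,Illumina NovaSeq\n"
)

UNITS = {"read_length": "bp"}  # columns that carry a unit (assumption)

for row in csv.DictReader(sample_sheet):
    avus = [(attr, value, UNITS.get(attr, "")) for attr, value in row.items()]
    for avu in avus:
        print(avu)  # e.g. ('read_length', '150', 'bp')
```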
This document discusses data management requirements for predictive modeling using large datasets from multiple clinical, specimen, and lab repositories. It notes the need to assemble complete and up-to-date datasets while maintaining quality assurance and transparency. Over time, data storage systems experience problems with exponential data growth, manual data curation difficulties, and challenges integrating heterogeneous databases across different research groups. The document examines a spectrum of potential data management approaches and highlights collaborative networks and use of open source platforms as ways to address these issues.
Claudia Medina: Linking Health Records for Population Health Research in Brazil - Flávio Codeço Coelho
The document discusses record linkage, which is the process of identifying and merging records from different databases that refer to the same individual. It describes common record linkage approaches used in Brazil's health sector, including probabilistic and deterministic methods. It also evaluates the accuracy of applying a probabilistic record linkage strategy to identify deaths among AIDS cases reported to Brazil's surveillance database, finding a sensitivity of 87.6% and specificity of 99.6%. Finally, it discusses the potential impact of linkage errors on risk ratio estimates in longitudinal mortality studies.
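For reference, the two accuracy measures quoted above are simple functions of the linkage confusion matrix: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below computes them from made-up counts chosen only to reproduce the quoted rates; these are not the study's data.

```python
# Sensitivity and specificity of a record linkage strategy, computed from a
# confusion matrix of linked (positive) and non-linked (negative) record pairs.
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true matches the linkage actually found."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Fraction of true non-matches the linkage correctly rejected."""
    return tn / (tn + fp)

# Hypothetical counts chosen to reproduce the reported 87.6% / 99.6%.
tp, fn, tn, fp = 876, 124, 9960, 40
print(f"sensitivity = {sensitivity(tp, fn):.1%}")  # 87.6%
print(f"specificity = {specificity(tn, fp):.1%}")  # 99.6%
```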
Introduction to the hands-on session on "Standards and tools for model management" at the ICSB 2015.
Focus on COMBINE standards and on tools for search, version control, and archiving. The management platform used is SEEK.
This document discusses SED-ML (Simulation Experiment Description Markup Language), a standard for describing computational simulations. SED-ML files contain information like the models, data, simulation settings and algorithms used in an experiment. Using SED-ML allows experiments to be reproduced and shared. The document encourages adopting SED-ML to make research more reproducible and help curation of models in repositories. It also provides an overview of tools that support SED-ML and ways to get involved in its development.
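A schematic example of the ingredients a SED-ML file captures is sketched below: a model reference, a time-course simulation with its algorithm (identified by a KiSAO term), and a task binding the two. This is an abridged illustration based on the general shape of SED-ML Level 1 documents; consult the SED-ML specification for the normative schema.

```python
# Write an abridged, illustrative SED-ML-style document. Attribute details
# are simplified; real files are usually produced by SED-ML-aware tools.
sedml = """<?xml version="1.0" encoding="UTF-8"?>
<sedML xmlns="http://sed-ml.org/sed-ml/level1/version3" level="1" version="3">
  <listOfModels>
    <model id="model1" language="urn:sedml:language:sbml" source="model.xml"/>
  </listOfModels>
  <listOfSimulations>
    <uniformTimeCourse id="sim1" initialTime="0" outputStartTime="0"
                       outputEndTime="100" numberOfPoints="1000">
      <algorithm kisaoID="KISAO:0000019"/><!-- KISAO:0000019 = CVODE -->
    </uniformTimeCourse>
  </listOfSimulations>
  <listOfTasks>
    <task id="task1" modelReference="model1" simulationReference="sim1"/>
  </listOfTasks>
</sedML>
"""

with open("simulation.sedml", "w", encoding="utf-8") as fh:
    fh.write(sedml)
```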
Reproducibility and Scientific Research: why, what, where, when, who, how - Carole Goble
This document discusses the importance of reproducibility in scientific research. It makes three key points:
1. For results to be considered valid, scientific publications should provide clear descriptions of methods and protocols so that other researchers can successfully repeat and extend the work.
2. Many factors can undermine reproducibility, such as publication pressures, poor training, disorganization, and outright fraud. Ensuring reproducible research requires transparency across experimental designs, data, software, and computational workflows.
3. Achieving reproducible science is challenging and poorly incentivized due to the resources and time required to prepare materials for independent verification. Overcoming these issues will require collective effort across the research community.
The document discusses different text-based database retrieval systems for accessing biological data, including Entrez, SRS, and DBGET/LinkDB. It describes their key features and how each system allows users to search text databases using queries, with Entrez providing linked related data across multiple databases. An example shows how each system can be used to retrieve and view related information for a SwissProt protein entry.
The document discusses biological databases and retrieval systems. It provides an overview of Entrez, a retrieval system developed by NCBI that allows integrated searches across multiple biological databases. It also describes how Entrez links related data between databases, and some key features of Entrez like limits, preview/index, and history. Additionally, it summarizes specific NCBI databases accessible through Entrez like PubMed and OMIM, as well as another retrieval system called SRS maintained by EBI.
GenBank, EMBL, and DDBJ are primary nucleotide sequence databases that collaborate to store publicly available DNA sequences. NCBI's GenBank is one of the largest primary sequence databases, containing sequences from over 240,000 organisms submitted by laboratories. PubMed and Entrez are NCBI's biomedical literature database and integrated retrieval system, which allow users to search biomedical research articles and to pull together related data from multiple sources. SRS is a sequence retrieval system developed by EBI that integrates over 250 molecular biology databases and allows complex queries across data sources.
Being FAIR: Enabling Reproducible Data Science - Carole Goble
Talk presented at the Early Detection of Cancer Conference, OHSU, Portland, Oregon, USA, 2-4 Oct 2018 (http://earlydetectionresearch.com/), in the Data Science session.
Bioinformatics databases: Current Trends and Future Perspectives - University of Malaya
Data is the most powerful resource in any field or subject of study. In biology, data comes from scientists and their actions, and any institution that makes sense of the data it collects will be at the forefront of its research field. At the beginning of any data collection endeavour, it is critical to find proper management techniques to store data and to maximise its utilisation. This presentation reflects on current trends and techniques in data modeling and architecture, with a highlight on the uses of databases, focusing on bioinformatics examples and case studies. Finally, the future of bioinformatics databases is discussed to give an overview of the modeling techniques that will accommodate the escalation of biological data in the coming years.
1. The document discusses how a biologist, Marco Roos, became interested in e-science through his work in molecular and cellular biology, bioinformatics, and data integration projects.
2. Roos describes how e-science allows for collaboration between different experts and disciplines through technologies like workflows, semantic web, and virtual laboratories.
3. Roos emphasizes that e-science should empower scientists by making tools and resources easy to use, share, and build upon so that scientists can focus on scientific problems rather than technical challenges.
Dynamic Semantic Metadata in Biomedical Communications - Tim Clark
1) The document discusses challenges in curing complex medical disorders and proposes that semantic annotation, hypothesis management, and nanopublications can help address these challenges by enabling improved information sharing and integration across research communities.
2) It describes various technologies and frameworks like the Annotation Ontology, SWAN Annotation Framework, and nanopublications that can help researchers semantically annotate documents, manage hypotheses, and publish and share interpretations.
3) International collaborations between researchers and informaticians are seen as important to building the information ecosystem needed to make progress on curing complex diseases.
This document discusses using semantic web technologies for translational research in life sciences. It provides an overview of semantic web standards and outlines several projects demonstrating applications in healthcare and biomedical research. These include developing an active semantic electronic medical record, semantically annotating experimental glycomics data, and integrating diverse biomedical data sources using ontologies to enable complex querying and knowledge discovery.
Semantic Web for Health Care and Biomedical Informatics - Amit Sheth
Amit Sheth, "Semantic Web for Health Care and Biomedical Informatics," Keynote at NSF Biomed Web Workshop, Corbett, Oregon, December 4-5, 2007.
http://www.biomedweb.info/2007/
The document discusses the increasing scale and complexity of knowledge generation in science domains like astronomy and medicine over recent centuries. It argues that knowledge generation can be viewed as a systems problem involving many actors and processes. The document proposes a service-oriented approach using web services as an integrating framework to address challenges of scale, complexity, and distributed collaboration in e-Science. Key challenges discussed include semantics, documentation, scaling issues, and sociological factors like incentives.
Investigating plant systems using data integration and network analysis - Catherine Canevet
The document discusses challenges in integrating plant data from multiple sources and proposes solutions. It notes that plant data is sparse, distributed across many databases in various formats, and focused primarily on the model plant Arabidopsis. Data integration is necessary to address key biological questions by consolidating information from pathway databases, gene annotations, protein interactions, and more. The document outlines approaches to data integration including controlled vocabularies, ontologies, data standards, and integration applications specifically designed to combine data sources like Ondex. Effective integration is important to fully leverage available plant data.
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks - Carole Goble
Keynote presentation at the iConference 2015, Newport Beach, California, 26 March 2015.
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
http://ischools.org/the-iconference/
BEWARE: presentation includes hidden slides AND in situ build animations - best viewed by downloading.
The document summarizes the experience of a biologist in adopting an e-science approach to their work. It describes how before e-science, the biologist took an uncoordinated "spaghetti" approach using various tools without a unified strategy. The biologist then explains how adopting e-science principles like collaboration, reusable workflows, and web services helped enhance their work by allowing experts from different domains to combine their expertise. The biologist also reflects on outreach efforts to promote e-science to other researchers.
Data analysis & integration challenges in genomics - mikaelhuss
Presentation given at the Genomics Today and Tomorrow event in Uppsala, Sweden, 19 March 2015. (http://connectuppsala.se/events/genomics-today-and-tomorrow/) Topics include APIs, "querying by data set", machine learning.
The document discusses the growth of data-intensive science and the need for new computing infrastructures to manage the large amounts of data being produced. It covers three perspectives on infrastructure: grid computing which enables sharing of distributed resources over the internet, data centers which provide integrated storage and computing services, and e-science which combines grids, collaboration tools, and data analysis services. Examples are given of different scientific domains using these infrastructures.
Bioinformatics is an interdisciplinary field that combines biology, computer science, and information technology. It involves the electronic storage, retrieval, analysis, and correlation of biological data. The document outlines key concepts in bioinformatics including the central dogma of molecular biology, biological data representation, how computers can be useful for biology, challenges in the field, and examples of intelligent bioinformatics applications. It emphasizes that bioinformatics is an important and growing field at the intersection of biology and computer science.
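Because the summary names the central dogma as a key concept, a toy sketch of it as pure data transformation may help: DNA is transcribed to mRNA, which is translated codon by codon into protein. The codon table is truncated to the few codons this example needs.

```python
# Central dogma as data transformation: DNA -> mRNA -> protein.
CODON_TABLE = {"AUG": "M", "UUU": "F", "GGC": "G", "UAA": "*"}  # '*' = stop

def transcribe(dna: str) -> str:
    """Transcription: the mRNA copy replaces thymine (T) with uracil (U)."""
    return dna.replace("T", "U")

def translate(mrna: str) -> str:
    """Translation: read three-letter codons until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate(transcribe("ATGTTTGGCTAA")))  # -> MFG
```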
The document discusses the ISA infrastructure, which provides a standardized format (ISA-TAB) for experimental metadata and data exchange. It can be used across various domains like toxicology, systems biology, and nanotechnology. The Risa R package integrates experimental metadata with analysis and allows updating metadata. Nature Scientific Data is a new publication for describing valuable datasets. The ISA framework has been adopted by over 30 public and private resources and is growing in use for facilitating reuse of investigations in various life science domains. Toxicity examples include EU projects on predictive toxicology and a rat study of drug candidates. Questions can be directed to the ISA tools group.
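To give a feel for the tabular side of the format, here is a sketch that reads a hypothetical, heavily abridged ISA-TAB study sample table. Real investigations span investigation, study, and assay files, and dedicated parsers such as the isatools package handle the full format; the columns below are illustrative assumptions.

```python
# Read one row of a made-up, abridged ISA-TAB-style study sample table.
# ISA-TAB files are tab-delimited, so csv.DictReader with delimiter='\t' works
# for this toy case; use a dedicated ISA parser for real investigations.
import csv
import io

study_file = io.StringIO(
    "Source Name\tCharacteristics[organism]\tSample Name\n"
    "source1\tRattus norvegicus\tsample1\n"
)

for row in csv.DictReader(study_file, delimiter="\t"):
    print(row["Sample Name"], "derived from", row["Characteristics[organism]"])
```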
Scott Edmunds' slides for class 8 of the HKU Data Curation course (module MLIM7350, Faculty of Education), covering science data, medical data and ethics, and the FAIR data principles.
This document summarizes Professor Carole Goble's presentation on making research more reproducible and FAIR (Findable, Accessible, Interoperable, Reusable) through the use of research objects and related standards and infrastructure. It discusses challenges to reproducibility in computational research and proposes bundling datasets, workflows, software and other research products into standardized research objects that can be cited and shared to help address these challenges.
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o... - Carole Goble
Keynote given by Carole Goble on 23rd July 2013 at ISMB/ECCB 2013
http://www.iscb.org/ismbeccb2013
How could we evaluate research and researchers? Reproducibility underpins the scientific method: at least in principle if not practice. The willing exchange of results and the transparent conduct of research can only be expected up to a point in a competitive environment. Contributions to science are acknowledged, but not if the credit is for data curation or software. From a bioinformatics view point, how far could our results be reproducible before the pain is just too high? Is open science a dangerous, utopian vision or a legitimate, feasible expectation? How do we move bioinformatics from one where results are post-hoc "made reproducible", to pre-hoc "born reproducible"? And why, in our computational information age, do we communicate results through fragmented, fixed documents rather than cohesive, versioned releases? I will explore these questions drawing on 20 years of experience in both the development of technical infrastructure for Life Science and the social infrastructure in which Life Science operates.
Keynote presented at the Phenotype Foundation first annual meeting.
Describes data sharing, data annotation, and the need for further development of tools, ontologies, and ontology mappings.
Amsterdam, January 18, 2016
Opening up pharmacological space, the Open PHACTS API - Chris Evelo
The document provides an overview of the Open PHACTS project, which aims to create an open pharmacological space (OPS) through semantic integration of public drug discovery resources. It discusses the challenges of accessing and integrating scientific data across organizational boundaries. Open PHACTS builds a service layer and applications to allow standardized access and analysis of data from various public sources. It is a collaborative project involving academic and industry partners seeking to make pre-competitive drug discovery data more accessible and useful through semantic integration and common standards.
WikiPathways: how open source and open data can make omics technology more us... - Chris Evelo
This document discusses WikiPathways, an open source pathway database. It began in 2007 with the goals of having an online platform by March 2007 and gaining a first unknown user by January 2008, both of which were successes. WikiPathways has grown significantly since, now containing over 400 human pathways and 6,200 unique human genes. It receives over 1 million pageviews annually. The document advocates for opening up data and code to make omics technology more useful. It describes WikiPathways' various features including its BioPAX format, REST services, and integration with Cytoscape. It also discusses professionalizing open source and collaborating with existing communities and tools rather than trying to change the world alone.
A real life example to show the added value of the Phenotype Database (dbNP)... - Chris Evelo
NuGO has initiated the development of the Phenotype Database (dbNP). This database is developed together with several other consortia (e.g. Netherlands Metabolomics Centre) and is currently used within several European projects, such as Food4me, NU-AGE, Bioclaims and Nutritech.
The Phenotype Database (www.dbnp.org) is a web-based application/database that can store any biological study. We used this application to perform an analysis on a combination of several studies, with the objective of testing whether it is possible to answer new research questions using a ‘virtual cohort’.
Study comparison:
The assessment of the health status of an individual is an important but challenging issue. Nowadays, challenge tests are proposed as a method to assess and quantify health status. We would like to find mechanistic explanations for differences in clinical subgroups and to develop a metabolomics-platform-based fingerprint at baseline that represents important parameters of the challenge test. Currently, there is not one single study available that includes enough subjects from specific clinical subgroups to develop such a fingerprint or to study the biological processes specific to those subgroups. Therefore, we developed a toolbox that facilitates the combined analysis of multiple studies.
Presentation on pathway extensions using knowledge integration and network approaches, given at the Systems Biology Institute in Luxembourg on November 28, 2012.
Using ontologies to do integrative systems biology - Chris Evelo
The document discusses using ontologies to integrate systems biology data. It describes typical steps in systems biology studies such as finding studies, processing data, integrating data, and combining data from multiple sources. Ontologies can help link information from different analysis techniques and combine data from many studies by capturing study metadata. The document advocates using standards like ISA-TAB and MAGE-TAB to capture study data and proposes using a generic study capture framework with modular components to integrate different types of 'omics data. Ontologies are needed for collaboration and to provide controlled vocabularies for annotation.
Using biological network approaches for dynamic extension of micronutrient re... - Chris Evelo
This document discusses using biological network approaches to dynamically extend pathways with regulatory information such as microRNAs (miRNAs). It describes tools like PathVisio that can integrate gene expression, proteomics and metabolomics data onto pathways to identify significantly changed processes. WikiPathways is introduced as a public pathway resource that can be contributed to and curated by researchers. The document outlines approaches for visualizing regulatory interactions on pathways using plugins, exploring pathway interactions through network analysis, and integrating other data types such as SNPs, fluxes and gene annotations to build a more comprehensive understanding of biological systems.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a... - Ana Luísa Pinho
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich in features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and their quality of enabling complex behavior compounded from discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization. To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
Phenomics assisted breeding in crop improvement - IshaGoswami9
The global population is increasing and will reach about 9 billion by 2050; due to climate change, it will be difficult to meet the food requirements of such a large population. Facing the challenges presented by resource shortages, climate change, and an increasing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding the complex characteristics shaped by multiple genes, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data linkable to genomics information at all growth stages have become as important as genotyping. Thus, high-throughput phenotyping has become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology, and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz), I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long-standing, and ongoing, scientific development as an exemplar. And so, I chose the ever-evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of more than 200 years, Thermodynamics R&D and application benefited from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at both micro and macro levels.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science, engineering, and technology, spanning micro-tech to aerospace and cosmology. I can think of no better story to illustrate the breadth of scientific methodologies and applications at their best.
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) from the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
I will finally argue that deep variability is both the problem and the solution of frictionless reproducibility, calling on the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
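One lever mentioned above, sampling strategies over a configuration space, can be sketched in a few lines. The option names and the notion of benchmarking each sampled variant are assumptions for illustration, not drawn from the talk.

```python
# Uniform random sampling of a (tiny) configuration space before measurement.
import itertools
import random

options = {
    "compiler_opt": ["-O0", "-O2", "-O3"],
    "lto": [True, False],
    "threads": [1, 4, 8],
}

# Enumerate all 3 * 2 * 3 = 18 configurations of the variability space.
space = [dict(zip(options, combo)) for combo in itertools.product(*options.values())]

random.seed(42)                     # fixed seed so the sampling is itself reproducible
sample = random.sample(space, k=5)  # measure only 5 of the 18 variants

for config in sample:
    print(config)  # each sampled variant would then be built and benchmarked
```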
Invited talk at the Journées Nationales du GDR GPL 2024.
The binding of cosmological structures by massless topological defects - Sérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or a modified gravity theory is mitigated, at least in part.
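For context, the flat-rotation condition the abstract invokes follows from the standard circular-orbit relation in a spherically symmetric potential; the short derivation below is textbook material, not taken from the paper itself.

```latex
% Circular orbits in a spherically symmetric potential \Phi(r) satisfy
%   v^2(r) = r \, d\Phi/dr.
% Imposing a flat rotation curve, v(r) = v_0 for all r, gives
\[
  v_0^2 = r\,\frac{d\Phi}{dr}
  \quad\Longrightarrow\quad
  \Phi(r) = v_0^2 \,\ln\!\left(\frac{r}{r_0}\right),
\]
% the logarithmic potential of an equipotential (isothermal) sphere, which is
% why such a mass-free construction can also deflect light like one.
```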
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige... - University of Maribor
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
The debris of the ‘last major merger’ is dynamically young - Sérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the ‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space, because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia DR3 have positive caustic velocities, making them fundamentally different from the phase-mixed chevrons found in simulations at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based on a simple phase-mixing model, the observed number of caustics is consistent with a merger that occurred 1–2 Gyr ago. We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data 1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’ did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within the last few Gyr, consistent with the body of work surrounding the VRM.
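For readers unfamiliar with why counting caustics dates a merger, a schematic version of the phase-mixing argument (our addition, not the paper's exact model) is:

```latex
% Debris on a radial orbit wraps up in (r, v_r) phase space roughly
% once per radial period T_r, so the number of folds (caustics) grows
% approximately linearly with the time t since infall:
\[
  N_{\mathrm{folds}}(t) \;\approx\; \frac{t}{T_r}.
\]
% With T_r of order a few hundred Myr in the inner halo, a handful of
% observed caustics points to t of order 1-2 Gyr rather than 8-11 Gyr.
```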
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx - MAGOTI ERNEST
Although Artemia has been known to man for centuries, its use as a food for the culture of larval organisms apparently began only in the 1930s, when several investigators found that it made an excellent food for newly hatched fish larvae (Litvinenko et al., 2023). As aquaculture developed in the 1960s and ‘70s, the use of Artemia also became more widespread, due both to its convenience and to its nutritional value for larval organisms (Arenas-Pardo et al., 2024). The fact that Artemia dormant cysts can be stored for long periods in cans, and then used as an off-the-shelf food requiring only 24 h of incubation, makes them the most convenient, least labor-intensive live food available for aquaculture (Sorgeloos & Roubach, 2021). The nutritional value of Artemia, especially for marine organisms, is not constant, but varies both geographically and temporally. During the last decade, however, both the causes of Artemia nutritional variability and methods to improve poor-quality Artemia have been identified (Loufi et al., 2024).
Brine shrimp (Artemia spp.) are used in marine aquaculture worldwide. Annually, more than 2,000 metric tons of dry cysts are used for cultivation of fish, crustacean, and shellfish larvae. Brine shrimp are important to aquaculture because newly hatched brine shrimp nauplii (larvae) provide a food source for many fish fry (Mozanzadeh et al., 2021). Culture and harvesting of brine shrimp eggs represent another aspect of the aquaculture industry. Nauplii and metanauplii of Artemia, commonly known as brine shrimp, play a crucial role in aquaculture due to their nutritional value and suitability as live feed for many aquatic species, particularly in larval stages (Sorgeloos & Roubach, 2021).
Nucleophilic Addition of carbonyl compounds.pptx - SSR02
Nucleophilic addition is the most important reaction of carbonyls, not just of aldehydes and ketones but also of carboxylic acid derivatives in general.
Carbonyls undergo addition reactions with a large range of nucleophiles.
Comparing the relative basicity of the nucleophile and the product is extremely helpful in determining how reversible the addition reaction is. Reactions with Grignards and hydrides are irreversible. Reactions with weak bases like halides and carboxylates generally don’t happen.
Electronic effects (inductive effects, electron donation) have a large impact on reactivity.
Large groups adjacent to the carbonyl will slow the rate of reaction.
Neutral nucleophiles can also add to carbonyls, although their additions are generally slower and more reversible. Acid catalysis is sometimes employed to increase the rate of addition.
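The basicity argument above can be summarized in one equilibrium (a textbook-style sketch added here for clarity, not taken from the slides):

```latex
% Nucleophilic addition to a carbonyl and its reverse:
\[
  \mathrm{R_2C{=}O} \;+\; \mathrm{Nu^-}
  \;\rightleftharpoons\;
  \mathrm{R_2C(Nu)O^-}
\]
% The addition is effectively irreversible when Nu^- is a far stronger
% base than the alkoxide product (e.g. hydride, Grignard reagents),
% and does not proceed when Nu^- is a much weaker base (e.g. halides,
% carboxylates), matching the reactivity trends listed above.
```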
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomatic index - AbdullaAlAsif1
The pygmy halfbeak, Dermogenys colletei, is known for its viviparous nature and presents an intriguing case of relatively low fecundity, raising questions about potential compensatory reproductive strategies employed by this species. Our study examines fecundity and the Gonadosomatic Index (GSI) in the pygmy halfbeak, D. colletei (Meisner, 2001), an intriguing viviparous fish indigenous to Sarawak, Borneo. We hypothesize that the pygmy halfbeak, D. colletei, may exhibit unique reproductive adaptations to offset its low fecundity, thus enhancing its survival and fitness. To address this, we conducted a comprehensive study utilizing 28 mature female specimens of D. colletei, carefully measuring fecundity and GSI to shed light on the reproductive adaptations of this species. Our findings reveal that D. colletei indeed exhibits low fecundity, with a mean of 16.76 ± 2.01, and a mean GSI of 12.83 ± 1.27, providing crucial insights into the reproductive mechanisms at play in this species. These results underscore the existence of unique reproductive strategies in D. colletei, enabling its adaptation and persistence in Borneo's diverse aquatic ecosystems, and call for further ecological research to elucidate these mechanisms. This study contributes to a better understanding of viviparous fish in Borneo and to the broader field of aquatic ecology, enhancing our knowledge of species adaptations to unique ecological challenges.
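For reference, the Gonadosomatic Index reported above is conventionally computed as follows (the standard definition, not restated in the abstract):

```latex
\[
  \mathrm{GSI} \;=\; \frac{W_{\mathrm{gonad}}}{W_{\mathrm{body}}} \times 100
\]
% so a mean GSI of 12.83 means the gonads account for roughly 13% of
% body weight on average across the 28 females examined.
```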
4. Integrative Systems Biology
(Slide diagram: internal & external data repositories, e.g. dbNP, Sage, Atlas; knowledge resources & (semantic web) integration, e.g. Open PHACTS, WikiPathways; study capturing with ISA models; study data processing, statistics, and storage, e.g. arrayanalysis.org; ontologies; modeling & data integration, network biology (extension), supervised statistics; curation and simulation; annotation & provenance; research applications; mapping via BridgeDb; extraction, SPARQLing, conversion.)
5. We can do things like this (diabetic liver)
Pihlajamäki et al. dataset is from Gene Expression Omnibus, GEO:GSE15653. Pihlajamäki et al., J Clin Endocrinol Metab. 2009, 94(9):3521-3529. DOI: 10.1210/jc.2009-0212. Martina Kutmon et al., BMC Genomics 2014, 15:971. DOI: 10.1186/1471-2164-15-971.
8. How do pharma companies use public data? (Examples: Pfizer, AZ, Roche.)
10. (Open PHACTS platform architecture diagram: Nanopub, Db, and VoID data sources feed a Data Cache (Virtuoso Triple Store), a Semantic Workflow Engine, and a Linked Data API (RDF/XML, TTL, JSON). Surrounding components: Domain Specific Services; Identity Resolution Service; Chemistry Registration, Normalisation & Q/C; Identifier Management Service; Indexing; Core Platform with example identifiers P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”. Content tiers: Public Content, Commercial, Public Ontologies, User Annotations, Apps.)
11. (Repeat of the slide 10 architecture diagram.)
From: https://xkcd.com/927/
This is basically why we in COST CHARME and in the ELIXIR interoperability platform work on the glue between standards
Animated slide
Showing data and knowledge resources on the left (you can use FAIRsharing to find these). Results are mined from these, combined (where the gluing occurs), and used for AI. This talk focuses on the combining aspects. If you do that correctly, the AI part later on will not have to make the connections and the power can be used to obtain other results
This slide shows the overall approach and the position of the different components/projects
This study shows we can actually find useful data (in this case liver transcriptomics from human diabetic patients compared to non-diabetic patient data), process the data, perform pathway enrichment analysis (map to the entities in the pathways), combine pathways into a network (combining overlapping pathways, which in the case of WikiPathways need to be mapped again), extend that network with transcription factors (from e.g. ENCODE; again, targets need to be mapped), look for active nodes in the networks, and find the transcription factors that affect these active nodes (essentially turning the network inside out). The result shows the main known transcription factors in diabetics, using information from just one study rather than the wealth of information that made them known.
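Pathway enrichment of the kind described here is often done as an over-representation test; the following minimal Python sketch (our illustration, not the exact pipeline used in the study) applies a hypergeometric test with made-up gene counts:

```python
from scipy.stats import hypergeom

# Illustrative numbers only: a universe of measured genes, one pathway,
# and a set of differentially expressed (DE) genes.
N = 20000   # genes in the universe (e.g. all measured transcripts)
K = 150     # genes annotated to the pathway
n = 800     # DE genes in the study
k = 25      # DE genes that fall in the pathway

# P(X >= k): probability of seeing at least k pathway genes among the
# DE set by chance; small values indicate enrichment.
p_value = hypergeom.sf(k - 1, N, K, n)
print(f"enrichment p-value: {p_value:.2e}")
```

In practice this is repeated per pathway with multiple-testing correction; here the expected overlap by chance is only about 6 genes, so 25 is strongly enriched.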
A little bit more illustration of how that was done, showing the pathways affected, the overlapping entities, and a small representation of the resulting network
How pharma reused public data
And every company does the same
So for the Open PHACTS project we had the idea to link the relevant data together, using data from ChEMBL (compound-target), NextProt (on the targets), WikiPathways and Reactome (processes and pathways these targets are involved in), and DisGeNET (linking the genes coding for these targets to diseases), all using a semantic web approach that would make it one big linked dataset
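As a flavor of what "one big linked dataset" means in practice, here is a minimal RDF sketch in Python with rdflib; the URIs, identifiers, and predicate names are invented placeholders, not the actual Open PHACTS vocabulary:

```python
from rdflib import Graph, Literal, Namespace, URIRef

# Illustrative namespace only; the real platform uses curated vocabularies.
EX = Namespace("http://example.org/")

g = Graph()
compound = URIRef(EX["chembl/CHEMBL25"])   # a compound (placeholder URI)
target   = URIRef(EX["protein/P12374"])    # a protein target
pathway  = URIRef(EX["pathway/WP1234"])    # a pathway (placeholder ID)
disease  = URIRef(EX["disease/diabetes"])  # a disease concept

# compound -> target -> pathway -> disease, mirroring
# ChEMBL / NextProt / WikiPathways / DisGeNET style links.
g.add((compound, EX.interactsWith, target))
g.add((target, EX.participatesIn, pathway))
g.add((pathway, EX.associatedWith, disease))
g.add((target, EX.label, Literal("Adenosine receptor 2a")))

# One SPARQL query now traverses what were three separate sources.
q = """
SELECT ?compound ?disease WHERE {
  ?compound <http://example.org/interactsWith> ?t .
  ?t <http://example.org/participatesIn> ?p .
  ?p <http://example.org/associatedWith> ?disease .
}"""
for row in g.query(q):
    print(row.compound, "->", row.disease)
```

The point of the semantic web approach is exactly this: once the triples share identifiers, a single query spans all the underlying databases.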
Apart from the fact that such data is not automatically linked even if you describe it well
So we added resources that map between textual concepts, ontology terms, database IDs and chemical structures
Some of such mapping tools are now part of ELIXIR’s recommended interoperability services
E.g. BridgeDb, able to map gene and gene product database IDs, metabolite IDs, and more. A BridgeDb-based identifier mapping service was part of the original Open PHACTS
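A quick way to try such identifier mapping is BridgeDb's public REST service; the sketch below assumes the webservice.bridgedb.org endpoint and its xrefs path as commonly documented, so treat the exact URL pattern and response format as assumptions to verify:

```python
import requests

# Assumed BridgeDb REST pattern: /{organism}/xrefs/{systemCode}/{id}
# "L" is the BridgeDb system code for Entrez Gene; verify both against
# the current BridgeDb documentation before relying on this.
BASE = "https://webservice.bridgedb.org"
organism, system_code, identifier = "Human", "L", "1234"

resp = requests.get(f"{BASE}/{organism}/xrefs/{system_code}/{identifier}",
                    timeout=30)
resp.raise_for_status()

# Response is assumed tab-separated: mapped identifier <TAB> datasource.
for line in resp.text.strip().splitlines():
    mapped_id, datasource = line.split("\t")
    print(datasource, mapped_id)
```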
The ontology mapping and cross-reference service OxO at EBI, which has not yet been deployed to work on such tasks, but offers the potential to do so
While the CDK could be used to develop a service that can resolve substructures, replacing the original service that was "not the best open source" project
And we need classic bioinformatics approaches to map between similar services, domains affected, and predicted functions of variants, and to map SNPs to indels and such
And if we get all that done, we might be able to reuse data the way rocket scientists reuse their actual rockets (showing the landing of the two side boosters of the very first Falcon Heavy, the first time two such boosters made it back to Earth at the same time, potentially able to be reused)