Chris Evelo
Hans Constandt
May 15, 2018
Easier Integration and
Enrichment of Data by
Making Data More FAIR
chris_evelo
hconstandt
Agenda
• ELIXIR: Interoperability services aiming at actual reuse
• Things we can do (and how we do it)
• Lessons from Open PHACTS
• Services for Findability
• Services for Interoperability
• Integration in end user tools for actual reuse
• FAIR Data @Work
• What do we want from a Data Set?
• How do we get there?
• Your internal linked data?
• Lift data to higher level of FAIRness?
• Linked Data federation?
• Thank You
• Q&A
www.elixir-europe.org
ELIXIR
Safeguarding the results of life science
research in Europe
From reproducibility to reusability
We can do things like this (diabetic liver)
Pihlajamäki et al. dataset is from Gene Expression Omnibus (accession number GSE15653).
Pihlajamäki et al. J Clin Endocrinol Metab. 2009, 94 (9): 3521-3529. DOI: 10.1210/jc.2009-0212.
Martina Kutmon et al. BMC Genomics 2014, 15:971. DOI: 10.1186/1471-2164-15-971
Data predators
Vitamin D-microRNA network
31 targets up-regulated (3 in pathways)
23 targets down-regulated (4 in pathways)
Targeted by multiple microRNAs:
CLSPN - cell cycle
FZD5 - receptor for Wnt proteins
CACNG4 - calcium channel
Data: Wang et al. 2011, in Gene Expression Omnibus (GEO, accession number: GSE17461).
Published paper: Effects of 1alpha,25 dihydroxyvitamin D3 and testosterone on miRNA and mRNA expression in LNCaP cells. WL Wang et al. Mol Cancer 2011, 10. doi:10.1186/1476-4598-10-58
This work: Integrative network-based analysis of mRNA and microRNA expression in 1,25-dihydroxyvitamin D3-treated cancer cells. M Kutmon et al. Genes & Nutrition 10 (5), 35. doi:10.1007/s12263-015-0484-0
Workflow: Integrative Systems Biology (diagram labels)
- Internal & external data repositories, e.g. dbNP, Sage, Atlas
- Knowledge resources & (semantic web) integration, e.g. Open PHACTS, WikiPathways
- Study capturing: ISA
- Models
- Study data processing, statistics, storage, e.g. arrayanalysis.org
- Ontologies
- Modeling & data integration, network biology (extension), supervised statistics
- Curation, simulation, annotation & provenance
- Research applications
- Mapping: BridgeDb
- Extraction, SPARQLing, conversion
Lessons from Open PHACTS: platform architecture (diagram labels)
- Apps
- Linked Data API (RDF/XML, TTL, JSON)
- Domain Specific Services
- Core Platform: Semantic Workflow Engine; Data Cache (Virtuoso Triple Store); Identity Resolution Service; Identifier Management Service; Chemistry Registration, Normalisation & Q/C; Indexing
- Example identifiers and labels: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”
- Sources (Db or Nanopub, each with a VoID description): Public Content, Commercial, Public Ontologies, User Annotations
www.elixir-europe.org/excelerate
@ELIXIREurope /company/elixir-europe
ELIXIR-EXCELERATE is funded by the European Commission within the
Research Infrastructures programme of Horizon 2020, grant agreement number
676559.
Interoperability services aiming at actual reuse
ELIXIR
Three levels of activity to reuse FAIR data
1. FAIR data itself
2. Interoperability services
3. Integration in end user tools
Services for findability
- Bioschemas: annotate the web
- FAIRsharing: describing standards and data collections
- BioStudies/BioSamples: describing studies and samples and (links to) data
- (Omics) data repositories: ELIXIR core data resources
- FAIR Data Points: the future?
- Nanopublication collections: RDF with explicit provenance and evidence
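As a rough illustration of what "annotate the web" can mean in practice, the sketch below builds a schema.org `Dataset` description as JSON-LD, the kind of markup that Bioschemas profiles build on. The helper name `dataset_jsonld`, the dataset title, the description and the keywords are illustrative assumptions; only the accession number GSE15653 comes from the slides.

```python
import json


def dataset_jsonld(name, identifier, url, description, keywords):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "identifier": identifier,
        "url": url,
        "description": description,
        "keywords": list(keywords),
    }


# Illustrative values: the title, description and keywords are assumptions.
record = dataset_jsonld(
    name="Diabetic liver gene expression (Pihlajamäki et al.)",
    identifier="GSE15653",
    url="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15653",
    description="Gene Expression Omnibus dataset used in the diabetic liver example.",
    keywords=["gene expression", "liver", "diabetes"],
)

# Wrap the JSON-LD in a script tag so it can be embedded in a web page.
html_snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(record, indent=2, ensure_ascii=False)
    + "\n</script>"
)
print(html_snippet)
```

A crawler or registry that understands schema.org can then pick up the dataset description from the page without a dedicated API.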
Services for interoperability
• Database identifier mapping
• Mappings for
• Gene products (ENSEMBL)
• Metabolites (HMDB, ChEBI, WikiData)
• Reactions (RHEA)
• Gene Variants (ENSEMBL, dbSNP)
• Framework and web service
• Stackable
• Integrated in PathVisio and Cytoscape, available for R, SemWeb
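The idea of "stackable" identifier mapping listed above can be sketched in a few lines. This is a minimal in-memory illustration, not the BridgeDb API: the class names `Mapper` and `MapperStack` are hypothetical, and the three TP53 identifiers are used only as example data. Each mapper holds links for one kind of entity; the stack follows links across all mappers, so an identifier can reach a target database through an intermediate one.

```python
from collections import defaultdict, deque


class Mapper:
    """One mapping source: undirected links between (database, id) pairs."""

    def __init__(self, links):
        self._neighbours = defaultdict(set)
        for a, b in links:
            self._neighbours[a].add(b)
            self._neighbours[b].add(a)

    def map(self, xref):
        return set(self._neighbours.get(xref, ()))


class MapperStack:
    """Combine several mappers and follow links across them (breadth-first)."""

    def __init__(self, *mappers):
        self._mappers = mappers

    def map(self, xref, target_source=None):
        seen = {xref}
        queue = deque([xref])
        while queue:
            current = queue.popleft()
            for mapper in self._mappers:
                for hit in mapper.map(current):
                    if hit not in seen:
                        seen.add(hit)
                        queue.append(hit)
        seen.discard(xref)  # do not report the query identifier itself
        if target_source is None:
            return seen
        return {x for x in seen if x[0] == target_source}


# Example data (assumption: a tiny hand-written mapping, not a real mapping file).
genes = Mapper([(("Ensembl", "ENSG00000141510"), ("NCBI Gene", "7157"))])
proteins = Mapper([(("Ensembl", "ENSG00000141510"), ("UniProt", "P04637"))])

stack = MapperStack(genes, proteins)
print(stack.map(("NCBI Gene", "7157"), target_source="UniProt"))
```

Neither mapper alone links the NCBI Gene identifier to UniProt; the stack finds the link by passing through Ensembl, which is the practical benefit of stacking mapping sources.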
2c. Needed services for interoperability
- Dedicated mapping approaches (e.g. variants to protein domains, Mutalyzer)
- Chemistry resolution services (not really there)
- Textual term resolution (also for queries) (sounds so simple...)
- …
Don’t be afraid to reinvent wheels!
Integration in tools
- (Federated) SPARQL queries with integrated mapping
(this is what is behind the Open PHACTS API)
- Combine in R, Cytoscape, Galaxy, KNIME, CWL, etc.
Needs packages, plugins, tools
- Combine in knowledge discovery tools working with
your own data and additional purpose-built mapping,
e.g. Ontoforce’s DISQOVER
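To make "federated SPARQL queries with integrated mapping" more concrete, here is a small sketch that assembles a SPARQL 1.1 query using the `SERVICE` keyword. It only builds the query text and does not contact any endpoint. The endpoint URL, the `ex:` vocabulary and the helper name `federated_query` are hypothetical; `skos:exactMatch` stands in for whatever mapping predicate a real setup would use, and this is not the actual Open PHACTS query.

```python
from textwrap import indent

PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "ex": "https://example.org/vocab#",  # hypothetical vocabulary
}


def federated_query(variables, local_pattern, remote_endpoint, remote_pattern):
    """Assemble a SPARQL 1.1 federated query with one SERVICE clause."""
    prefix_lines = "\n".join(f"PREFIX {p}: <{iri}>" for p, iri in PREFIXES.items())
    select = " ".join(f"?{v}" for v in variables)
    # The SERVICE block is evaluated at the remote endpoint.
    service = f"SERVICE <{remote_endpoint}> {{\n{indent(remote_pattern, '  ')}\n}}"
    body = indent(f"{local_pattern}\n{service}", "  ")
    return f"{prefix_lines}\nSELECT {select} WHERE {{\n{body}\n}}"


query = federated_query(
    variables=["measurement", "pathway"],
    # Local data: measurements about a gene, plus a mapping to a remote identifier.
    local_pattern=(
        "?measurement ex:aboutGene ?localGene .\n"
        "?localGene skos:exactMatch ?remoteGene ."
    ),
    remote_endpoint="https://example.org/pathways/sparql",  # hypothetical endpoint
    remote_pattern="?pathway ex:hasParticipant ?remoteGene .",
)
print(query)
```

The mapping triple in the local pattern is what lets the local gene identifier join against the remote pathway data; without it the two graphs would share no common term.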
The Biggest Wins for Knowledge Creation
- Integrating various data sources
- Using a unified (semantic) logic
- Linking data points
Real-World Knowledge Creation Challenges
- Humans combined with domain knowledge are unbeatable
- Maximize synergy between human creativity & machine power
In complex real-world knowledge creation challenges,
often the right questions themselves are not known.
What do we want from a Data Set?
• Potential users know it exists and where to find it
• Potential users know what it’s about
• Usable for different purposes
• Connected
• Future proof
• Usable for automation
Ultimately?
• Driving knowledge generation
• Saving time and money in research and business
doi:10.1038/nrd3681
PMID:22378269.
How do we get there?
• Integrate data
• Harmonize data
• Enrich data
• Interlink data
• And … never forget the end user…
Use golden sources & specialized niche sources to
• fill in the gaps
• make the links
• expand
• enrich
Create your internal Linked Data ecosystem
Lift data to higher level of FAIRness
• Keep in sync with public data
• Harmonize
• Interlink
• Uniform User Interface and API
(Diagram: DISQOVER with Source, Sync, Harmonize and Link steps)
Linked Data federation
• Thank You
• Q&A
Special thanks to people who worked with us and inspired us
Michel Dumontier Helena Deus Bryn Robert MaryJo Zabarowski Phillips Kuhl
Tom Plasterer Wolfgang Colsman Anthony Rowe John Reynders Tim Berners-Lee
Bernard Munos Lee Harland Dan Gshwend Sebastien Lefebvre Eric Prud'hommeaux
Dean Allemang Robert Greenwood Wolfgang Hoeck Bhanu Bahl Ian Dix
Carole Goble Matthias Nolte Derek Marren Chris Evelo Mark Benioff
Ian Harrow Maryann Martone Rajan Dasai Theo Platt Renzo Constandt
Bart Van Leeuwen Arun Nayar Martin Leach Jay Bergeron … & many more
Thank you!
chris_evelo
hconstandt

BioIT 2018 'Easier integration and enrichment of your data by making public data more FAIR'

  • 1. Chris Evelo Hans Constandt May 15, 2018 Easier Integration and Enrichment of Data by Making Data More FAIR chris_evelo hconstandt
  • 2. Agenda • Elixir: Interoperability services aiming at actual reuse • Things we can do (and how we do it) • Lessons from Open PHACTS • Services for Findability • Services for Interoperability • Integration in end user tools for actual reuse • FAIR Data @Work • What do we want from a Data Set? • How do we get there? • Your internal linked data? • Lift data to higher level of FAIRness? • Linked Data federation? • Thank You • Q&A
  • 3. www.elixir-europe.org ELIXIR Safeguarding the results of life science research in Europe
  • 5. We can do things like this (diabetic liver) Pihlajamäki et al. dataset is from Gene Expression Omnibus (accession number GSE15653) Pihlajamäki et al. J Clin Endocrinol Metab. 2009, 94 (9): 3521-3529. DOI: 10.1210/jc.2009-0212. Martina Kutmon et al. BMC Genomics 2014, 15:971. DOI: 10.1186/1471-2164-15-971
  • 8. Vitamin D-microRNA network 31 targets up-regulated (3 in pathways) 23 targets down-regulated (4 in pathways) Targeted by multiple microRNAs: CLSPN - cell cycle FZD5 - receptor for Wnt proteins CACNG4 - calcium channel Data: Wang et al. 2011 in Gene Expression Omnibus (GEO, accession number: GSE17461). Published paper: Effects of 1alpha,25 dihydroxyvitamin D3 and testosterone on miRNA and mRNA expression in LNCaP cells. WL Wang et al. Mol Cancer 2011. 10. doi:10.1186/1476-4598-10-58 This work: Integrative network-based analysis of mRNA and microRNA expression in 1,25-dihydroxyvitamin D3-treated cancer cells. M Kutmon et al. Genes & nutrition 10 (5), 35 doi:10.1007/s12263-015-0484-0
  • 10. Workflow (diagram labels): internal & external data repositories (e.g. dbNP, Sage, Atlas); knowledge resources & (semantic web) integration (e.g. Open PHACTS, WikiPathways); study capturing (ISA); models; study data; processing, statistics, storage (e.g. arrayanalysis.org); ontologies; modeling & data integration, network biology (extension), supervised statistics; curation, simulation; annotation & provenance; Integrative Systems Biology research applications; mapping (BridgeDb); extraction, SPARQLing; conversion
  • 12. Platform architecture (diagram labels): Apps; Linked Data API (RDF/XML, TTL, JSON); Domain Specific Services; Semantic Workflow Engine; Identity Resolution Service; Identifier Management Service; Chemistry Registration; Normalisation & Q/C; Indexing; Core Platform; Data Cache (Virtuoso Triple Store); Nanopub; VoID; Db; data sources: Public Content, Commercial, Public Ontologies, User Annotations; example identifiers: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”
  • 14. www.elixir-europe.org/excelerate @ELIXIREurope /company/elixir-europe ELIXIR-EXCELERATE is funded by the European Commission within the Research Infrastructures programme of Horizon 2020, grant agreement number 676559. Interoperability services aiming at actual reuse ELIXIR
  • 15. Three levels of activity to reuse FAIR data 1. FAIR data itself 2. Interoperability services 3. Integration in end user tools
  • 16. Services for findability - Bioschemas: annotate the web - FAIRsharing: describing standards and data collection - BioStudies/BioSamples: describing studies and samples and (links to) data - (Omics)data repositories: ELIXIR core data resources - FAIR datapoints: the future? - Nanopublication collections: RDF with explicit provenance and evidence
  • 19. Services for interoperability • Database identifier mapping • Mappings for • Gene products (ENSEMBL) • Metabolites (HMDB, ChEBI, WikiData) • Reactions (RHEA) • Gene Variants (ENSEMBL, dbSNP) • Framework and web service • Stackable • Integrated in PathVisio and Cytoscape, available for R, SemWeb
  • 20. 2c. Needed services for interoperability - Dedicated mapping approaches (e.g. variants to protein domains, Mutalyzer) - Chemistry resolution services (not really there) - Textual term resolution (also for queries) (sounds so simple..) - …
  • 21. Don’t be afraid to reinvent wheels!
  • 22. Integration in tools - (Federated) SPARQL queries with integrated mapping (this is what is behind the Open PHACTS API) - Combine in R, Cytoscape, Galaxy, KNIME, CWL, etc. Needs packages, plugins, tools - Combine in knowledge discovery tools working with your own data and additional on-purpose mapping, e.g. Ontoforce’s DISQOVER
  • 23. The Biggest Wins for Knowledge Creation - Integrating various data sources - Using a unified (semantic) logic - Linking data points
  • 24. Real-World Knowledge Creation Challenges - The combination of humans & domain knowledge is unbeatable - Maximize synergy between human creativity & machine power In complex real-world knowledge creation challenges, often the right questions themselves are not known.
  • 25. What do we want from a Data Set? • Potential users know it exists and where to find it • Potential users know what it’s about • Usable for different purposes • Connected • Future proof • Usable for automation
  • 26. Ultimately? • Driving knowledge generation • Saving time and money in research and business doi:10.1038/nrd3681 PMID:22378269.
  • 27. How do we get there? • Integrate data • Harmonize data • Enrich data • Interlink data • And … never forget the end user…
  • 28. Use golden sources & specialized niche sources to • fill in the gaps • make the links • expand • enrich Create your internal Linked Data ecosystem
  • 29. Lift data to higher level of FAIRness • Keep in sync with public data • Harmonize • Interlink • Uniform User Interface and API (diagram labels: Source, Sync, Harmonize, Link, DISQOVER)
  • 31. Linked Data federation • Thank You • Q&A
  • 38. Special thanks to people who worked with us and inspired us Michel Dumontier Helena Deus Bryn Robert MaryJo Zabarowski Phillips Kuhl Tom Plasterer Wolfgang Colsman Anthony Rowe John Reynders Tim Berners-Lee Bernard Munos Lee Harland Dan Gshwend Sebastien Lefebvre Eric Prud'hommeaux Dean Allemang Robert Greenwood Wolfgang Hoeck Bhanu Bahl Ian Dix Carole Goble Matthias Nolte Derek Marren Chris Evelo Mark Benioff Ian Harrow Maryann Martone Rajan Dasai Theo Platt Renzo Constandt Bart Van Leeuwen Arun Nayar Martin Leach Jay Bergeron … & many more Thank you!
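
Slides 19 and 22 describe stackable database identifier mapping and (federated) SPARQL queries with integrated mapping. The following is a minimal sketch of that idea in Python, not the BridgeDb or Open PHACTS implementation. The classes `TableMapper` and `StackedMapper`, the function `build_federated_query`, the mapping tables, the placeholder identifiers, and the endpoint `https://example.org/sparql` are all assumptions made up for illustration. Only the gene symbol FZD5 comes from the slides.

```python
from typing import Dict, Iterable, List, Set, Tuple

# An identifier is a (data source, id) pair, e.g. ("HGNC", "FZD5").
Xref = Tuple[str, str]


class TableMapper:
    """Maps identifiers using one in-memory table.

    Hypothetical stand-in for a real mapping service.
    """

    def __init__(self, table: Dict[Xref, Set[Xref]]) -> None:
        self.table = table

    def map(self, xref: Xref, target_source: str) -> Set[Xref]:
        # Keep only the cross-references that belong to the requested source.
        return {x for x in self.table.get(xref, set()) if x[0] == target_source}


class StackedMapper:
    """Stacks several mappers and returns the union of their results."""

    def __init__(self, mappers: Iterable[TableMapper]) -> None:
        self.mappers: List[TableMapper] = list(mappers)

    def map(self, xref: Xref, target_source: str) -> Set[Xref]:
        found: Set[Xref] = set()
        for mapper in self.mappers:
            found |= mapper.map(xref, target_source)
        return found


def build_federated_query(endpoint: str, ids: Iterable[str]) -> str:
    """Build a SPARQL 1.1 query that sends the mapped ids to a remote endpoint."""
    values = " ".join(f'"{i}"' for i in sorted(ids))
    return (
        "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
        "SELECT ?resource ?id WHERE {\n"
        f"  VALUES ?id {{ {values} }}\n"
        f"  SERVICE <{endpoint}> {{\n"
        "    ?resource dcterms:identifier ?id .\n"
        "  }\n"
        "}"
    )


# Placeholder tables: the target identifiers are invented, not real accessions.
GENE_TABLE: Dict[Xref, Set[Xref]] = {
    ("HGNC", "FZD5"): {("Ensembl", "ENSG_PLACEHOLDER_1")},
}
EXTRA_TABLE: Dict[Xref, Set[Xref]] = {
    ("HGNC", "FZD5"): {("Ensembl", "ENSG_PLACEHOLDER_2"), ("Entrez", "PLACEHOLDER_3")},
}

stacked = StackedMapper([TableMapper(GENE_TABLE), TableMapper(EXTRA_TABLE)])
mapped = stacked.map(("HGNC", "FZD5"), "Ensembl")
query = build_federated_query("https://example.org/sparql", (i for _, i in mapped))
print(query)
```

In this sketch the mapping step runs before the query is built, so the remote endpoint only ever sees identifiers in its own namespace. A production setup would more likely call a mapping web service and submit the query through a SPARQL client library.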