GA4GH – Metadata task team
Mélanie Courtot
On behalf of the Metadata task team
mcourtot@ebi.ac.uk
@mcourtot
The Metadata task Team (MTT)
•  Main challenges:
•  MTT is cross-cutting, which was hard to accommodate in the first iteration of GA4GH
•  Difficulty finding good datasets: “free-floating” development of
metadata standards
•  Lack of use cases across task teams
•  Mid-2016: move towards application-driven changes to
bring focus to model development
Initial metadata projects
•  ArrayMap: cancer genome array data, for
visualization and somatic copy number
aberrations
•  Beacon+: on top of ArrayMap, incorporates
structural genomic variants
•  BioSamples: data on 5 million samples, linking to
EMBL-EBI archives (ArrayExpress, ENA, EGA…)
diverse
focused
http://arraymap.org
http://beacon.arraymap.org/beacon/beaconplus-ui/
https://www.ebi.ac.uk/biosamples/
Use cases
“For a given phenotype, retrieve the genotype”
Results
DIPG
•  Diffuse Intrinsic Pontine Glioma
•  Rare, incurable brain tumor in children 6-8 years old
•  Median survival < 1 year
•  Lack of good model hampers progress in treatment: no
significant advances in 30 years
Misuraca et al., Front Oncol. 2015; 5: 172.
•  910 cases taken from 20 published series + 157
unpublished cases
•  Added into ArrayMap and curated, accessible through
Beacon+
Michael Baudis, Bo Gao
(MacKay et al., Cancer Cell 2017)
The DIPG dataset
Beacon+
Concept
• Implementation of cancer beacon
prototype, backed by arrayMap
and DIPG data set 
• structural variations (DUP, DEL) in
addition to SNV
• diagnosis queries using ontology
codes (NCIT, ICD-O)
• quantitative responses
• GA4GH schema compatible
variant & metadata API
Querying over integrated datasets through the
GA4GH API
•  1 variant is found in 21 biosamples, of which 12 are from
the brain stem (i.e. DIPG)
http://dipg.progenetix.org/beacon/beaconplus-server/beaconresponse.cgi?dataset_id=dipg&variants.reference_name=chr17&assembly_id=GRCh36&variants.variant_type=SNV&variants.start=7577121&variants.reference_bases=G&variants.alternate_bases=A&biosamples.bio_characteristics.ontology_terms.term_id=pgx:icdot:c71.7
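The query above can be assembled from its individual parameters. The following is a minimal sketch in Python that uses only the endpoint and parameter names visible in the example URL; the helper name `build_beacon_query` is an assumption made for illustration, and the sketch builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Endpoint and parameter names are copied from the example query on the slide.
BEACON_ENDPOINT = (
    "http://dipg.progenetix.org/beacon/beaconplus-server/beaconresponse.cgi"
)


def build_beacon_query(endpoint, params):
    """Return the endpoint with params appended as a URL query string.

    Hypothetical helper for illustration; ':' is left unescaped so the
    ontology term id reads the same as on the slide.
    """
    return f"{endpoint}?{urlencode(params, safe=':')}"


params = {
    "dataset_id": "dipg",
    "variants.reference_name": "chr17",
    "assembly_id": "GRCh36",
    "variants.variant_type": "SNV",
    "variants.start": 7577121,
    "variants.reference_bases": "G",
    "variants.alternate_bases": "A",
    # Diagnosis filter expressed as an ontology term id (ICD-O topography).
    "biosamples.bio_characteristics.ontology_terms.term_id": "pgx:icdot:c71.7",
}

url = build_beacon_query(BEACON_ENDPOINT, params)
print(url)
```

Keeping the variant coordinates and the ontology term as separate parameters makes it easy to change the diagnosis filter while leaving the variant part of the query untouched.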
A few issues along the way…
but we know how to make problems
far more tractable.
Real world
metadata is
complex
•  27,000 unique attributes
•  38,000,000 key:value pairs
attribute, count
organism, 4614610
synonym, 2639043
model, 1810386
package, 1809830
organismPart, 1338399
sampleSourceName, 1323241
strain, 1249016
sex, 913802
colectionDate, 876805
sampleTitle, 862792
geographicLocation, 781174
age, 641594
cellType, 534536
isolationSource, 481916
sourceName, 453390
secondaryDescription, 418998
host, 404790
latitudeAndLongitude, 350241
genotype, 343348
diseaseState, 329477
environmentBiome, 323954
environmentMaterial, 302849
environmentFeature, 295626
sampleType, 290803
isolate, 277671
species, 267416
collectedBy, 242394
latitude, 193751
longitude, 191945
biomaterialProvider, 178351
developmentStage, 172165
sampleCharacteristics, 161154
projectName, 158703
hostSubjectId, 158365
depth, 154759
developmentalStage, 150380
geographicLocationCountryAndOrSea, 142960
elevation, 133134
investigationType, 132980
treatment, 132957
individual, 132123
cultivar, 127959
anonymizedName, 104052
sequencingMethod, 102771
title, 102415
envBiome, 98786
envFeature,
diseaseState
sampleCharacteristics
hostDisease
diabetes
Diagnosis
…
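Figures such as the number of unique attributes and the number of key:value pairs can be derived by counting attribute names across sample records. The sketch below assumes each sample is a plain dictionary of attribute names to values; the three records are invented for illustration and are not taken from BioSamples.

```python
from collections import Counter


def attribute_stats(samples):
    """Count how often each attribute name occurs across sample records."""
    counts = Counter()
    for sample in samples:
        counts.update(sample.keys())
    return counts


# Toy records, invented for illustration only.
samples = [
    {"organism": "Homo sapiens", "sex": "female", "diseaseState": "diabetes"},
    {"organism": "Homo sapiens", "hostDisease": "diabetes"},
    {"organism": "Mus musculus", "strain": "C57BL/6", "sex": "male"},
]

counts = attribute_stats(samples)
unique_attributes = len(counts)      # number of distinct attribute names
total_pairs = sum(counts.values())   # total number of key:value pairs

# Print "attribute,count" lines, most frequent first.
for name, n in counts.most_common():
    print(f"{name},{n}")
```

Even in this toy set, `diseaseState` and `hostDisease` carry the same kind of information under different names, which is the problem curation has to address.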
We generate GA4GH datasets for integration
over BioSD
Trish Whetzel, Matt Green
We use data and annotations to provide semi-automated curation
diseaseState
sampleCharacteristics
hostDisease
diabetes
Diagnosis
…
disease
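One way to picture semi-automated curation is a lookup that proposes a canonical attribute and flags doubtful cases for a human curator. The sketch below uses the attribute names listed on the slide; the table `CANDIDATES`, the function `suggest_mapping` and the choice of which entries need review are assumptions made for illustration, not the team's actual pipeline.

```python
# Attribute names from the slide, all proposed to map to "disease".
# The review flags are an assumption made for this sketch.
CANDIDATES = {
    "diseasestate": ("disease", False),
    "hostdisease": ("disease", False),
    "diagnosis": ("disease", False),
    "samplecharacteristics": ("disease", True),  # generic key: ask a curator
    "diabetes": ("disease", True),               # a value used as a key
}


def suggest_mapping(attribute):
    """Propose a canonical attribute name and say whether a curator must check it."""
    # Normalise case and separators so "host_disease" matches "hostDisease".
    key = attribute.replace("_", "").replace(" ", "").lower()
    if key in CANDIDATES:
        target, needs_review = CANDIDATES[key]
        return {"source": attribute, "target": target, "needs_review": needs_review}
    # Unknown attributes get no suggestion and always go to a curator.
    return {"source": attribute, "target": None, "needs_review": True}


for name in ["diseaseState", "host_disease", "Diagnosis", "diabetes", "cultivar"]:
    print(suggest_mapping(name))
```

The automated step only proposes mappings; anything ambiguous or unknown stays with the curator, which is what makes the process semi-automated.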
Semantics as a service
Identity Resolution
Id Version &
Provenance
Tools Registry  Ontology Services
Standards and APIs
Linked Data
Pipelines  Applications  Publishers
Template Services
Identity
Mapping
Guidelines and
Standards Registry
Citation
Implementation
Search (BioSolr)  Prefix commons
Dataset Description
Metadata
Validation Services
The Ontology Toolkit
https://ebispot.github.io
Open Source Software
http://www.ebi.ac.uk/spot/ontology
Different datasets use different standards: we
provide mappings
•  International Classification of Diseases for Oncology codes
the site (topography) and the histology (morphology) of
neoplasms
•  Combination of ICDO morphology and topography can be
mapped to NCI Thesaurus
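A mapping of this kind can be pictured as a table keyed by the (morphology, topography) pair. In the sketch below, apart from topography C71.7 (the brain stem code used in the Beacon+ query), every code is a made-up placeholder and not a real ICD-O or NCIt mapping; `icdo_to_ncit` is a hypothetical helper.

```python
from typing import Optional

# Placeholder table: except for topography C71.7, all codes below are
# made-up stand-ins, not real ICD-O or NCIt mappings.
ICDO_TO_NCIT = {
    ("MORPH-EXAMPLE-1", "C71.7"): "NCIT:EXAMPLE-A",
    ("MORPH-EXAMPLE-1", "TOPO-EXAMPLE-2"): "NCIT:EXAMPLE-B",
}


def icdo_to_ncit(morphology: str, topography: str) -> Optional[str]:
    """Look up the NCIt term for an ICD-O morphology/topography combination.

    Returns None when the combination has no mapping in the table.
    """
    key = (morphology.strip().upper(), topography.strip().upper())
    return ICDO_TO_NCIT.get(key)


print(icdo_to_ncit("morph-example-1", "c71.7"))
print(icdo_to_ncit("MORPH-EXAMPLE-1", "TOPO-EXAMPLE-2"))
```

Keying on the pair rather than on either code alone reflects the point of the slide: the same morphology at a different site can correspond to a different NCIt term.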
Paula Carrio Cordo
We usually work with open data
•  Our data is open and
publicly released
•  Not the case for all our
users, e.g. EGA requires
controlled access
Development of DUO
for standard consent
codes and data
restrictions
•  Collaboration EGA/Broad
•  Integration with ADA-M
and Beacon
https://github.com/EBISPOT/DUO
Moran Cabili, Dylan Spalding, Giselle Kerry
A modular interoperable schema is more
useful than a big one
MTT – short term: a new home
•  Move to a distinct metadata repository
•  Updated documentation
•  Split into modules
•  Link to examples
⇒ increase visibility/uptake
•  Community adoption/alignment
MTT – medium term: coordination with work
streams
•  Streaming: sample identification and representation
supporting streaming use cases
•  Implementation: BioSamples and EGA
•  Discovery: representation for discovery use cases
•  Implementation: Beacon+ and ArrayMap
•  Genomic Knowledge Standards: dataset level description,
study representation. Analysis result?
•  Implementation: BioSamples and ENA
Long term vision: leveraging
clinical data
DIPG data in BioSamples
•  GA4GH API has been
implemented over
BioSamples
•  Allows querying via
GA4GH metadata model,
linking to other EBI
archives and integrating
available data
Sample is a bridge for clinical data
Present
Get in touch!
mcourtot@ebi.ac.uk
Acknowledgements
•  Wellcome Trust-EBI grant 201535/Z/16Z
•  Elixir
•  CORBEL
•  ELIXIR-EXCELERATE
•  Metadata task team
•  EMBL-EBI
•  DUO collaborators
•  Samples, phenotype and ontology team

Best Ayurvedic medicine for Gas and Indigestion
Swastik Ayurveda
 
8 Surprising Reasons To Meditate 40 Minutes A Day That Can Change Your Life.pptx
8 Surprising Reasons To Meditate 40 Minutes A Day That Can Change Your Life.pptx8 Surprising Reasons To Meditate 40 Minutes A Day That Can Change Your Life.pptx
8 Surprising Reasons To Meditate 40 Minutes A Day That Can Change Your Life.pptx
Holistified Wellness
 
Cell Therapy Expansion and Challenges in Autoimmune Disease
Cell Therapy Expansion and Challenges in Autoimmune DiseaseCell Therapy Expansion and Challenges in Autoimmune Disease
Cell Therapy Expansion and Challenges in Autoimmune Disease
Health Advances
 
The Best Ayurvedic Antacid Tablets in India
The Best Ayurvedic Antacid Tablets in IndiaThe Best Ayurvedic Antacid Tablets in India
The Best Ayurvedic Antacid Tablets in India
Swastik Ayurveda
 

Recently uploaded (20)

Adhd Medication Shortage Uk - trinexpharmacy.com
Adhd Medication Shortage Uk - trinexpharmacy.comAdhd Medication Shortage Uk - trinexpharmacy.com
Adhd Medication Shortage Uk - trinexpharmacy.com
 
Efficacy of Avartana Sneha in Ayurveda
Efficacy of Avartana Sneha in AyurvedaEfficacy of Avartana Sneha in Ayurveda
Efficacy of Avartana Sneha in Ayurveda
 
CHEMOTHERAPY_RDP_CHAPTER 1_ANTI TB DRUGS.pdf
CHEMOTHERAPY_RDP_CHAPTER 1_ANTI TB DRUGS.pdfCHEMOTHERAPY_RDP_CHAPTER 1_ANTI TB DRUGS.pdf
CHEMOTHERAPY_RDP_CHAPTER 1_ANTI TB DRUGS.pdf
 
The Electrocardiogram - Physiologic Principles
The Electrocardiogram - Physiologic PrinciplesThe Electrocardiogram - Physiologic Principles
The Electrocardiogram - Physiologic Principles
 
Diabetic nephropathy diagnosis treatment
Diabetic nephropathy diagnosis treatmentDiabetic nephropathy diagnosis treatment
Diabetic nephropathy diagnosis treatment
 
Ketone bodies and metabolism-biochemistry
Ketone bodies and metabolism-biochemistryKetone bodies and metabolism-biochemistry
Ketone bodies and metabolism-biochemistry
 
Promoting Wellbeing - Applied Social Psychology - Psychology SuperNotes
Promoting Wellbeing - Applied Social Psychology - Psychology SuperNotesPromoting Wellbeing - Applied Social Psychology - Psychology SuperNotes
Promoting Wellbeing - Applied Social Psychology - Psychology SuperNotes
 
K CỔ TỬ CUNG.pdf tự ghi chép, chữ hơi xấu
K CỔ TỬ CUNG.pdf tự ghi chép, chữ hơi xấuK CỔ TỬ CUNG.pdf tự ghi chép, chữ hơi xấu
K CỔ TỬ CUNG.pdf tự ghi chép, chữ hơi xấu
 
Integrating Ayurveda into Parkinson’s Management: A Holistic Approach
Integrating Ayurveda into Parkinson’s Management: A Holistic ApproachIntegrating Ayurveda into Parkinson’s Management: A Holistic Approach
Integrating Ayurveda into Parkinson’s Management: A Holistic Approach
 
Complementary feeding in infant IAP PROTOCOLS
Complementary feeding in infant IAP PROTOCOLSComplementary feeding in infant IAP PROTOCOLS
Complementary feeding in infant IAP PROTOCOLS
 
Osteoporosis - Definition , Evaluation and Management .pdf
Osteoporosis - Definition , Evaluation and Management .pdfOsteoporosis - Definition , Evaluation and Management .pdf
Osteoporosis - Definition , Evaluation and Management .pdf
 
Netter's Atlas of Human Anatomy 7.ed.pdf
Netter's Atlas of Human Anatomy 7.ed.pdfNetter's Atlas of Human Anatomy 7.ed.pdf
Netter's Atlas of Human Anatomy 7.ed.pdf
 
TEST BANK For Community Health Nursing A Canadian Perspective, 5th Edition by...
TEST BANK For Community Health Nursing A Canadian Perspective, 5th Edition by...TEST BANK For Community Health Nursing A Canadian Perspective, 5th Edition by...
TEST BANK For Community Health Nursing A Canadian Perspective, 5th Edition by...
 
TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...
TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...
TEST BANK For An Introduction to Brain and Behavior, 7th Edition by Bryan Kol...
 
Local Advanced Lung Cancer: Artificial Intelligence, Synergetics, Complex Sys...
Local Advanced Lung Cancer: Artificial Intelligence, Synergetics, Complex Sys...Local Advanced Lung Cancer: Artificial Intelligence, Synergetics, Complex Sys...
Local Advanced Lung Cancer: Artificial Intelligence, Synergetics, Complex Sys...
 
Cardiac Assessment for B.sc Nursing Student.pdf
Cardiac Assessment for B.sc Nursing Student.pdfCardiac Assessment for B.sc Nursing Student.pdf
Cardiac Assessment for B.sc Nursing Student.pdf
 
Best Ayurvedic medicine for Gas and Indigestion
Best Ayurvedic medicine for Gas and IndigestionBest Ayurvedic medicine for Gas and Indigestion
Best Ayurvedic medicine for Gas and Indigestion
 
8 Surprising Reasons To Meditate 40 Minutes A Day That Can Change Your Life.pptx
8 Surprising Reasons To Meditate 40 Minutes A Day That Can Change Your Life.pptx8 Surprising Reasons To Meditate 40 Minutes A Day That Can Change Your Life.pptx
8 Surprising Reasons To Meditate 40 Minutes A Day That Can Change Your Life.pptx
 
Cell Therapy Expansion and Challenges in Autoimmune Disease
Cell Therapy Expansion and Challenges in Autoimmune DiseaseCell Therapy Expansion and Challenges in Autoimmune Disease
Cell Therapy Expansion and Challenges in Autoimmune Disease
 
The Best Ayurvedic Antacid Tablets in India
The Best Ayurvedic Antacid Tablets in IndiaThe Best Ayurvedic Antacid Tablets in India
The Best Ayurvedic Antacid Tablets in India
 

GA4GH Metadata task team presentation

  • 1. GA4GH – Metadata task team Mélanie Courtot On behalf of the Metadata task team mcourtot@ebi.ac.uk @mcourtot
  • 2. The Metadata task Team (MTT) •  Main challenges: •  MTT cross cutting; hard in GA4GH first iteration •  Issue finding good datasets: “free-floating” development of metadata standards •  Lack of use cases across task teams •  Mid-2016: move towards application-driven changes to bring focus to model development
  • 3. Initial metadata projects •  ArrayMap: cancer genome array data, for visualization and somatic copy number aberrations •  Beacon+: on top of ArrayMap, incorporates structural genomic variants •  BioSamples: data on 5 million samples, linking to EMBL-EBI archives (ArrayExpress, ENA, EGA…) diverse focused http://arraymap.org http://beacon.arraymap.org/beacon/beaconplus-ui/ https://www.ebi.ac.uk/biosamples/
  • 4. Use cases “For a given phenotype, retrieve the genotype”
  • 6. DIPG •  Diffuse Intrinsic Pontine Glioma •  Rare, incurable brain tumor in children 6-8 years old •  Median survival < 1 year •  Lack of good model hampers progress in treatment: no significant advances in 30 years Misuraca et al., Front Oncol. 2015; 5: 172.
  • 7. •  910 cases taken from 20 published series + 157 unpublished cases •  Added into ArrayMap and curated, accessible through Beacon+ Michael Baudis Bo Gao (MacKay et al., Cancer Cell 2017) The DIPG dataset
  • 8. Beacon+ Concept • Implementation of cancer beacon prototype, backed by arrayMap and DIPG data set • structural variations (DUP, DEL) in addition to SNV • diagnosis queries using ontology codes (NCIT, ICD-O) • quantitative responses • GA4GH schema compatible variant & metadata API
  • 9. Querying over integrated datasets through the GA4GH API •  1 variant is found in 21 biosamples, of which 12 are from the brain stem (i.e. DIPG) http://dipg.progenetix.org/beacon/beaconplus-server/beaconresponse.cgi?dataset_id=dipg&variants.reference_name=chr17&assembly_id=GRCh36&variants.variant_type=SNV&variants.start=7577121&variants.reference_bases=G&variants.alternate_bases=A&biosamples.bio_characteristics.ontology_terms.term_id=pgx:icdot:c71.7
  • 10. A few issues along the way… but we know how to make problems far more tractable.
  • 12. •  27,000 unique attributes •  38,000,000 key:value pairs •  Attribute, count: organism, 4614610; synonym, 2639043; model, 1810386; package, 1809830; organismPart, 1338399; sampleSourceName, 1323241; strain, 1249016; sex, 913802; colectionDate, 876805; sampleTitle, 862792; geographicLocation, 781174; age, 641594; cellType, 534536; isolationSource, 481916; sourceName, 453390; secondaryDescription, 418998; host, 404790; latitudeAndLongitude, 350241; genotype, 343348; diseaseState, 329477; environmentBiome, 323954; environmentMaterial, 302849; environmentFeature, 295626; sampleType, 290803; isolate, 277671; species, 267416; collectedBy, 242394; latitude, 193751; longitude, 191945; biomaterialProvider, 178351; developmentStage, 172165; sampleCharacteristics, 161154; projectName, 158703; hostSubjectId, 158365; depth, 154759; developmentalStage, 150380; geographicLocationCountryAndOrSea, 142960; elevation, 133134; investigationType, 132980; treatment, 132957; individual, 132123; cultivar, 127959; anonymizedName, 104052; sequencingMethod, 102771; title, 102415; envBiome, 98786; envFeature, … •  diseaseState, sampleCharacteristics, hostDisease, diabetes, Diagnosis, …
  • 13. We generate GA4GH datasets for integration over BioSD Trish Whetzel Matt Green We use data and annotations to provide semi-automated curation diseaseState sampleCharacteristics hostDisease diabetes Diagnosis … disease
  • 14. Semantics as a service Identity Resolution Id Version & Provenance Tools Registry Ontology Services Standards and APIs Linked Data Pipelines Applications Publishers Template Services Identity Mapping Guidelines and Standards Registry Citation Implementation Search (BioSolr) Prefix commons Dataset Description Metadata Validation Services
  • 15. The Ontology Toolkit https://ebispot.github.io Open Source Software http://www.ebi.ac.uk/spot/ontology
  • 16. Different datasets use different standards: we provide mappings •  International Classification of Diseases for Oncology codes the site (topography) and the histology (morphology) of neoplasms •  Combination of ICDO morphology and topography can be mapped to NCI Thesaurus Paula Carrio Cordo
  • 17. We usually work with open data •  Our data is open and publicly released •  Not the case for all our users, e.g. EGA requires controlled access
  • 18. Development of DUO for standard consent codes and data restrictions •  Collaboration EGA/Broad •  Integration with ADA-M and Beacon https://github.com/EBISPOT/DUO Moran Cabili Dylan Spalding Giselle Kerry
  • 19. A modular interoperable schema is more useful than a big one
  • 20. MTT – short term: a new home •  Move to a distinct metadata repository •  Updated documentation •  Split into modules •  Link to examples ⇒ increase visibility/uptake •  Community adoption/alignment
  • 21. MTT – medium term: coordination with work streams •  Streaming: sample identification and representation supporting streaming use cases •  Implementation Biosamples and EGA •  Discovery: representation for discovery use cases •  Implementation Beacon+ and ArrayMap •  Genomic Knowledge Standards: dataset level description, study representation. Analysis result? •  Implementation Biosamples and ENA
  • 22. Long term vision: leveraging clinical data
  • 23. DIPG data in Biosamples •  GA4GH API has been implemented over BioSamples •  Allows querying via GA4GH metadata model, linking to other EBI archives and integrating available data
  • 24. Sample is a bridge for clinical data
  • 27. Acknowledgements •  Wellcome Trust-EBI grant 201535/Z/16/Z •  ELIXIR •  CORBEL •  ELIXIR-EXCELERATE •  Metadata task team •  EMBL-EBI •  DUO collaborators •  Samples, phenotype and ontology team
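
The Beacon+ query shown on slide 9 can be read as a set of key/value parameters appended to the beaconresponse.cgi endpoint. The sketch below reassembles that URL in Python. The endpoint and the parameter names and values are copied from the slide; the function name build_beacon_query is hypothetical and used only for illustration. The code only builds the query string and does not send it, since the server from the 2017 slide may no longer be reachable.

```python
from urllib.parse import urlencode

# Endpoint copied from the query URL on slide 9; it may no longer be online.
BEACON_ENDPOINT = (
    "http://dipg.progenetix.org/beacon/beaconplus-server/beaconresponse.cgi"
)


def build_beacon_query(dataset_id, reference_name, assembly_id, variant_type,
                       start, reference_bases, alternate_bases,
                       ontology_term_id):
    """Assemble a Beacon+ style query URL (hypothetical helper).

    The parameter names mirror those visible in the slide's URL; this is
    an illustration, not an official client.
    """
    params = [
        ("dataset_id", dataset_id),
        ("variants.reference_name", reference_name),
        ("assembly_id", assembly_id),
        ("variants.variant_type", variant_type),
        ("variants.start", start),
        ("variants.reference_bases", reference_bases),
        ("variants.alternate_bases", alternate_bases),
        ("biosamples.bio_characteristics.ontology_terms.term_id",
         ontology_term_id),
    ]
    # safe=":" keeps the colons in prefixed ontology codes unescaped,
    # matching the form used on the slide.
    return BEACON_ENDPOINT + "?" + urlencode(params, safe=":")


# Values taken from slide 9: an SNV on chr17 (G>A at 7577121), restricted
# to biosamples annotated with the ICD-O topography code for the brain stem.
url = build_beacon_query(
    dataset_id="dipg",
    reference_name="chr17",
    assembly_id="GRCh36",
    variant_type="SNV",
    start=7577121,
    reference_bases="G",
    alternate_bases="A",
    ontology_term_id="pgx:icdot:c71.7",
)
print(url)
```

The last parameter is what ties the variant query to sample metadata: it restricts matches to biosamples carrying the given ontology term, which is how the slide distinguishes the 12 brain stem samples from the 21 biosamples that carry the variant.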