Standardization of the HIPC Data
Templates: The Story So Far
Ahmad C. Bukhari, Ph.D., Kei-Hoi Cheung, Ph.D. and Steven H. Kleinstein, Ph.D.
Yale University, School of Medicine
User Group
(HIPC)
ImmPort
● An important resource for raw data and protocols from clinical trials,
mechanistic studies and novel methods for cellular and molecular
measurements
● Provides templates and standard operating procedures to facilitate data
representation and transfer.
● Provides a variety of tools for data access and manipulation
SQL dump for local hosting
Human Immunology Project Consortium (HIPC)
● Well-characterized human cohorts are studied using a variety of modern
analytic tools including multiplex transcriptional, cytokine, and proteomic
assays.
● HIPC submitted data is an important subset of the ImmPort database
● Submitted HIPC data is not standardized.
● Inconsistent naming and data reporting
Our aim is to make HIPC data FAIR
● Findability
○ Finding a large variety of related datasets is an important step to knowledge discovery
● Accessibility
○ A growing number of datasets are being submitted to public repositories such as ImmPort.
These datasets can be accessed through different methods, including web-based search, bulk
download and API access
● Interoperability
○ Data mining/analysis often requires multiple datasets to be integrated within a single repository
or across multiple repositories
● Reusability
○ Entering enough metadata as part of the data submission process facilitates data reuse
❖ FAIR is a set of Digital Object Compliance principles that describe the properties of digital objects
defined under the NIH Commons initiative
Current practices towards data FAIRness
● Minimum information standards (checklists) specify the minimum amount of
information (metadata) needed for reporting results in a reproducible and
reusable fashion. For example,
○ MIAME: Minimum information about a microarray experiment
○ MIAPE: Minimum Information About a Proteomics Experiment
● Scientific communities have developed templates incorporating detailed
checklists of the metadata needed to describe particular types of
experimental data sources.
● Standard identifiers/terminologies/ontologies have been created for different
domains
We propose an ontological mapping for the
ImmPort data submission templates.
● Ontology term mapping enables semantic normalization across
different repositories.
● Ontologically annotated datasets allow context-aware queries and data
integration
● Mapping to controlled vocabularies, relationships and rules facilitates
run-time data validation.
● These help achieve data FAIRness.
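A minimal sketch of the run-time validation idea above, assuming a hypothetical value set for a "biosample type" column. The labels and the `EX:` identifiers are illustrative placeholders, not terms taken from the ImmPort templates or from any ontology.

```python
# Hypothetical value set: allowed labels mapped to placeholder term IDs.
# A real value set would come from the template's ontology mapping.
BIOSAMPLE_TYPE_VALUES = {
    "whole blood": "EX:0000001",
    "serum": "EX:0000002",
    "pbmc": "EX:0000003",
}


def validate_value(raw, value_set):
    """Return (is_valid, term_id) for one submitted cell value.

    Matching ignores case and surrounding whitespace, which absorbs
    some of the inconsistent naming found in free-text submissions.
    """
    key = raw.strip().lower()
    term_id = value_set.get(key)
    return (term_id is not None, term_id)


def validate_column(values, value_set):
    """Collect (row_index, value) pairs that fail validation."""
    return [
        (i, v)
        for i, v in enumerate(values)
        if not validate_value(v, value_set)[0]
    ]
```

Values that fail the check can be reported back to the submitter at entry time rather than being cleaned up after submission.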
Ontology mapping of templates
[Figure: six-step mapping workflow. Labeled components: Ontology Recommender; a collection of ontologies (OBI, OBO, Cell, PR); Concept mapper; Terms Suggestion; Expert Verification; Suggested Alteration; Finalizing Mapping; Retrieve annotation (concept URI, definitions, etc.); Incorporate into CEDAR and ImmPort.]
Concept mapper uses NCBO web services to suggest suitable mapping
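The slides do not say which NCBO endpoint the concept mapper calls. As an illustration only, the sketch below builds a request URL for the BioPortal REST term search, restricted to chosen ontologies. The function name and the `YOUR_API_KEY` value are placeholders; no request is sent.

```python
from urllib.parse import urlencode

BIOPORTAL_API = "https://data.bioontology.org"


def build_search_url(term, ontologies, api_key):
    """Build a BioPortal term-search URL restricted to the given ontologies."""
    params = {
        "q": term,                           # text to look up
        "ontologies": ",".join(ontologies),  # ontology acronyms, e.g. OBI, CL
        "apikey": api_key,                   # personal BioPortal API key
    }
    return f"{BIOPORTAL_API}/search?{urlencode(params)}"


url = build_search_url("flow cytometry", ["OBI"], "YOUR_API_KEY")
print(url)
```

The JSON response to such a request lists candidate terms with their identifiers and labels, which a reviewer could then accept or alter.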
Our mapping strategy
• For certain value sets such as cell populations and cytokines, CM maps
the values to domain specific ontologies such as Cell Ontology (CL) and
Protein Ontology (PR)
• For other elements, CM maps them to the terms in Ontology for
Biomedical Investigations (OBI)
• For elements that do not have matches in OBI, we map these elements to
terms in top-ranked ontologies by OBO Foundry
• For elements that do not have any ontology term matches, we perform a
manual search in BioPortal and other available repositories for these missing
terms.
• We work closely with individual ontology groups (e.g., CL, OBI) to fill the
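The tiered strategy above can be sketched as an ordered fallback. This is a simplified illustration, not the concept mapper's code: the lookup functions and the term identifiers are hypothetical stand-ins, and sending unmatched value-set entries straight to manual search is an assumption.

```python
def map_element(element, is_value_set, lookups):
    """Try each ontology tier in order; return (ontology, term_id) or None.

    `lookups` maps an ontology name to a function label -> term_id or None.
    These functions are hypothetical stand-ins for real term searches.
    """
    if is_value_set:
        tiers = ["CL", "PR"]            # domain-specific ontologies
    else:
        tiers = ["OBI", "OBO_FOUNDRY"]  # OBI first, then top-ranked OBO Foundry
    for name in tiers:
        lookup = lookups.get(name)
        term_id = lookup(element) if lookup else None
        if term_id is not None:
            return (name, term_id)
    return None                         # no match: queue for manual search


# Toy lookups backed by dictionaries; the IDs are placeholders.
toy_lookups = {
    "CL": {"b cell": "CL:PLACEHOLDER"}.get,
    "OBI": {}.get,
    "OBO_FOUNDRY": {"age": "EX:PLACEHOLDER"}.get,
}

print(map_element("b cell", True, toy_lookups))
print(map_element("age", False, toy_lookups))
print(map_element("unknown", False, toy_lookups))
```

Elements that return `None` are the ones that would go to manual search and, if still unmatched, become candidates for submission to the relevant ontology.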
Template elements mapped to ontologies
• Assay types (e.g., gene expression, flow cytometry, ELISA,
HAI, Luminex)
• Template types (e.g., human subject, biosample)
• Column names (e.g., biosample type, measurement
technique)
• Value sets (e.g., set of cell populations, set of measurement
techniques)
Mapping Statistics
Assay Type                  # Templates  # Sub-Templates  # Concept  # Value Set
Microarray gene expression  6            10               113        209
Flow cytometry              6            -                67         262
ELISA                       2            -                39         602
HAI                         2            -                37         117
Luminex                     7            -                102        1032
General                     6            -                115        190
[Figure: example mappings labeled "OBI" and "Newly added"; visible term definitions:]
A device that moves charged particles through a .... OBI_0001121
A cytometry assay in which the presence of molecules OBI_0002115
CEDAR helps to generate ontology-linked metadata
Use case: CEDAR immunology data submission
templates
CEDAR has employed our suggested mapping
• Map to cell term in Cell Ontology
• Manual mapping to “assay” in OBI
• Automatic mapping with NCIT
• Automatic mapping with OBI
https://cedar.metadatacenter.net
Future plan
• Refine mapping of new assay types with updated
algorithm.
• Mapping of clinical metadata with ontology terms.
• Incorporate our ontology-term mapping approach into
CEDAR and ImmPort
• Submit missing terms to relevant ontologies (e.g., OBI)
Acknowledgment
• ImmPort
• Jeff Wiser, Patrick Dunn
• Yale
• Hailong Meng, Subhasis Mohanty
• Cell Ontology
• Alex Diehl
• NCBO BioPortal and CEDAR
• Mark Musen, John Graybeal, Martin O’Connor
• OBI
• Bjoern Peters

More Related Content

What's hot

Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Valery Tkachenko
 
Standards and tools for model management in biomedical research
Standards and tools for model management in biomedical researchStandards and tools for model management in biomedical research
Standards and tools for model management in biomedical researchUniversity Medicine Greifswald
 
Tools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesTools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesValery Tkachenko
 
Opportunities in chemical structure standardization
Opportunities in chemical structure standardizationOpportunities in chemical structure standardization
Opportunities in chemical structure standardizationValery Tkachenko
 
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...Syed Ahmad Chan Bukhari, PhD
 
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015Stuart Chalk
 
ACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP ProjectACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP ProjectStuart Chalk
 
Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Valery Tkachenko
 
From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...Catherine Canevet
 
Model management tools for improved reproducibility in systems biology
Model management tools for improved reproducibility in systems biologyModel management tools for improved reproducibility in systems biology
Model management tools for improved reproducibility in systems biologyUniversity Medicine Greifswald
 
Enabling faster analysis of vaccine adverse event reports with ontology support
Enabling faster analysis of vaccine adverse event reports with ontology supportEnabling faster analysis of vaccine adverse event reports with ontology support
Enabling faster analysis of vaccine adverse event reports with ontology supportMelanie Courtot
 
AnIML: A New Analytical Data Standard
AnIML: A New Analytical Data StandardAnIML: A New Analytical Data Standard
AnIML: A New Analytical Data StandardStuart Chalk
 
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...Oscar Peña del Rio
 
Schema Extraction for Privacy Preserving Processing of Sensitive Data
Schema Extraction for Privacy Preserving Processing of Sensitive DataSchema Extraction for Privacy Preserving Processing of Sensitive Data
Schema Extraction for Privacy Preserving Processing of Sensitive DataLars Gleim
 
2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)Michael Atkins
 

What's hot (20)

Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...
 
Standards and tools for model management in biomedical research
Standards and tools for model management in biomedical researchStandards and tools for model management in biomedical research
Standards and tools for model management in biomedical research
 
Tools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesTools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databases
 
Opportunities in chemical structure standardization
Opportunities in chemical structure standardizationOpportunities in chemical structure standardization
Opportunities in chemical structure standardization
 
Short introduction to SED-ML
Short introduction to SED-MLShort introduction to SED-ML
Short introduction to SED-ML
 
Data and Model Management for Systems Biology
Data and Model Management  for Systems BiologyData and Model Management  for Systems Biology
Data and Model Management for Systems Biology
 
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
 
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015
 
ACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP ProjectACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP Project
 
Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0
 
From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...
 
Model management tools for improved reproducibility in systems biology
Model management tools for improved reproducibility in systems biologyModel management tools for improved reproducibility in systems biology
Model management tools for improved reproducibility in systems biology
 
Enabling faster analysis of vaccine adverse event reports with ontology support
Enabling faster analysis of vaccine adverse event reports with ontology supportEnabling faster analysis of vaccine adverse event reports with ontology support
Enabling faster analysis of vaccine adverse event reports with ontology support
 
AnIML: A New Analytical Data Standard
AnIML: A New Analytical Data StandardAnIML: A New Analytical Data Standard
AnIML: A New Analytical Data Standard
 
Data and model management in Systems Biology
Data and model management in Systems BiologyData and model management in Systems Biology
Data and model management in Systems Biology
 
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
 
BioNLPSADI
BioNLPSADIBioNLPSADI
BioNLPSADI
 
Schema Extraction for Privacy Preserving Processing of Sensitive Data
Schema Extraction for Privacy Preserving Processing of Sensitive DataSchema Extraction for Privacy Preserving Processing of Sensitive Data
Schema Extraction for Privacy Preserving Processing of Sensitive Data
 
NETTAB 2012
NETTAB 2012NETTAB 2012
NETTAB 2012
 
2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)
 

Similar to Standardization of the HIPC Data Templates: The Story So Far

The Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to TerminologyThe Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to TerminologySnow Owl
 
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ... Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...Syed Ahmad Chan Bukhari, PhD
 
A Semantic Web based Framework for Linking Healthcare Information with Comput...
A Semantic Web based Framework for Linking Healthcare Information with Comput...A Semantic Web based Framework for Linking Healthcare Information with Comput...
A Semantic Web based Framework for Linking Healthcare Information with Comput...Koray Atalag
 
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsEnabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsLuis Marco Ruiz
 
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsEnabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsLuis Marco Ruiz
 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceCarole Goble
 
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...Amit Sheth
 
Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003robertstevens65
 
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...Syed Ahmad Chan Bukhari, PhD
 
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...Ahmad C. Bukhari
 
Ontologies for life sciences: examples from the gene ontology
Ontologies for life sciences: examples from the gene ontologyOntologies for life sciences: examples from the gene ontology
Ontologies for life sciences: examples from the gene ontologyMelanie Courtot
 
Provenance abstraction for implementing security: Learning Health System and ...
Provenance abstraction for implementing security: Learning Health System and ...Provenance abstraction for implementing security: Learning Health System and ...
Provenance abstraction for implementing security: Learning Health System and ...Vasa Curcin
 
FedCentric_Presentation
FedCentric_PresentationFedCentric_Presentation
FedCentric_PresentationYatpang Cheung
 
Towards Automated AI-guided Drug Discovery Labs
Towards Automated AI-guided Drug Discovery LabsTowards Automated AI-guided Drug Discovery Labs
Towards Automated AI-guided Drug Discovery LabsOla Spjuth
 
Reference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptxReference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptxChimezie Ogbuji
 
150219 agbt giab_poster_marc
150219 agbt giab_poster_marc150219 agbt giab_poster_marc
150219 agbt giab_poster_marcGenomeInABottle
 
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...DataScienceConferenc1
 

Similar to Standardization of the HIPC Data Templates: The Story So Far (20)

The Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to TerminologyThe Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to Terminology
 
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ... Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 
A Semantic Web based Framework for Linking Healthcare Information with Comput...
A Semantic Web based Framework for Linking Healthcare Information with Comput...A Semantic Web based Framework for Linking Healthcare Information with Comput...
A Semantic Web based Framework for Linking Healthcare Information with Comput...
 
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsEnabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
 
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsEnabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data Science
 
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
 
Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003
 
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...
 
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...
 
Ontologies for life sciences: examples from the gene ontology
Ontologies for life sciences: examples from the gene ontologyOntologies for life sciences: examples from the gene ontology
Ontologies for life sciences: examples from the gene ontology
 
Provenance abstraction for implementing security: Learning Health System and ...
Provenance abstraction for implementing security: Learning Health System and ...Provenance abstraction for implementing security: Learning Health System and ...
Provenance abstraction for implementing security: Learning Health System and ...
 
FedCentric_Presentation
FedCentric_PresentationFedCentric_Presentation
FedCentric_Presentation
 
Towards Automated AI-guided Drug Discovery Labs
Towards Automated AI-guided Drug Discovery LabsTowards Automated AI-guided Drug Discovery Labs
Towards Automated AI-guided Drug Discovery Labs
 
Reference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptxReference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptx
 
Semantic Technologies for Big Sciences including Astrophysics
Semantic Technologies for Big Sciences including AstrophysicsSemantic Technologies for Big Sciences including Astrophysics
Semantic Technologies for Big Sciences including Astrophysics
 
150219 agbt giab_poster_marc
150219 agbt giab_poster_marc150219 agbt giab_poster_marc
150219 agbt giab_poster_marc
 
Dia09
Dia09Dia09
Dia09
 
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
 
Deep Learning for EHR Data
Deep Learning for EHR DataDeep Learning for EHR Data
Deep Learning for EHR Data
 

Recently uploaded

Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSSLeenakshiTyagi
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINsankalpkumarsahoo174
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 

Recently uploaded (20)

Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
VIRUSES structure and classification ppt by Dr.Prince C P

Standardization of the HIPC Data Templates: The Story So Far

  • 1. Standardization of the HIPC Data Templates: The Story So Far. Ahmad C. Bukhari, Ph.D., Kei-Hoi Cheung, Ph.D., and Steven H. Kleinstein, Ph.D., Yale University, School of Medicine. User Group (HIPC)
  • 2. ImmPort ● An important resource for raw data and protocols from clinical trials, mechanistic studies, and novel methods for cellular and molecular measurements ● Provides templates and standard operating procedures to facilitate data representation and transfer ● Provides a variety of tools for data access and manipulation [Figure label: SQL dump for local hosting]
  • 3. Human Immunology Project Consortium (HIPC) ● Well-characterized human cohorts are studied using a variety of modern analytic tools, including multiplex transcriptional, cytokine, and proteomic assays. ● HIPC-submitted data is an important subset of the ImmPort database. ● Submitted HIPC data is not standardized. ● Naming and data reporting are inconsistent.
  • 4. Our aim is to make HIPC data FAIR ● Findability ○ Finding a large variety of related datasets is an important step toward knowledge discovery ● Accessibility ○ A growing number of datasets are being submitted to public repositories such as ImmPort. These datasets can be accessed through different methods, including web-based search, bulk download, and API access ● Interoperability ○ Data mining/analysis often requires multiple datasets to be integrated within a single repository or across multiple repositories ● Reusability ○ Entering enough metadata as part of the data submission process facilitates data reuse ❖ FAIR is a set of Digital Object Compliance principles that describe the properties of digital objects, defined under the NIH Commons initiative
  • 5. Current practices towards data FAIRness ● Minimum information standards (checklists) specify the minimum amount of information (metadata) needed for reporting results in a reproducible and reusable fashion. For example, ○ MIAME: Minimum Information About a Microarray Experiment ○ MIAPE: Minimum Information About a Proteomics Experiment ● Scientific communities have developed templates incorporating detailed checklists of the metadata needed to describe particular types of experimental data sources. ● Standard identifiers/terminologies/ontologies have been created for different domains.
  • 6.
  • 7. We propose an ontological mapping for the ImmPort data submission templates. ● Ontology term mapping enables semantic normalization across different repositories. ● Ontologically annotated datasets allow context-aware queries and data integration. ● Mapping to controlled vocabularies, relationships, and rules facilitates run-time data validation. ● Together, these help achieve data FAIRness.
  • 8. Ontology mapping of templates [Workflow diagram with six numbered steps. Labels: Concept mapper; Ontology Recommender; A collection of ontologies (OBI, OBO, Cell, PR); Terms Suggestion; Retrieve annotation (concept URI, definitions, etc.); Expert Verification; Suggested Alteration; Finalizing Mapping; Incorporate into CEDAR and ImmPort]
  • 9. Concept mapper uses NCBO web services to suggest suitable mappings
  • 10. Our mapping strategy • For certain value sets such as cell populations and cytokines, the Concept mapper (CM) maps the values to domain-specific ontologies such as Cell Ontology (CL) and Protein Ontology (PR) • For other elements, CM maps them to terms in the Ontology for Biomedical Investigations (OBI) • For elements that do not have matches in OBI, we map these elements to terms in top-ranked ontologies by OBO Foundry • For elements that do not have any ontology term matches, we manually search BioPortal and other available repositories for the missing terms • We work closely with individual ontology groups (e.g., CL, OBI) to fill the
  • 11. Template elements mapped to ontologies • Assay types (e.g., gene expression, flow cytometry, ELISA, HAI, Luminex) • Template types (e.g., human subject, biosample) • Column names (e.g., biosample type, measurement technique) • Value sets (e.g., set of cell populations, set of measurement techniques)
  • 12. Mapping Statistics

      Assay Type                   # Templates   # Sub-Templates   # Concept   # Value Set
      Microarray gene expression   6             10                113         209
      Flow cytometry               6             -                 67          262
      ELISA                        2             -                 39          602
      HAI                          2             -                 37          117
      Luminex                      7             -                 102         1032
      General                      6             -                 115         190
  • 13. [Screenshot of terms mapped to OBI, including newly added terms. Visible text: "A device that moves charged particles through a ...." (OBI_0001121); "A cytometry assay in which the presence of molecules" (OBI_0002115)]
  • 14. CEDAR helps to generate ontology-linked metadata. Use case: CEDAR immunology data submission templates
  • 15. CEDAR has employed our suggested mapping (https://cedar.metadatacenter.net) [Screenshot annotations: Map to cell term in Cell Ontology; Manual mapping to “assay” in OBI; Automatic mapping with NCIT; Automatic mapping with OBI]
  • 16. Future plan • Refine the mapping of new assay types with an updated algorithm. • Map clinical metadata to ontology terms. • Incorporate our ontology-term mapping approach into CEDAR and ImmPort. • Submit missing terms to relevant ontologies (e.g., OBI).
  • 17. Acknowledgment • ImmPort: Jeff Wiser, Patrick Dunn • Yale: Hailong Meng, Subhasis Mohanty • Cell Ontology: Alex Diehl • NCBO BioPortal and CEDAR: Mark Musen, John Graybeal, Martin O’Connor • OBI: Bjoern Peters
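
The tiered mapping strategy on slide 10 can be illustrated with a minimal Python sketch. This is not the authors' Concept mapper: the term tables, the placeholder IDs, the `Mapping` class, and the `map_element` function are all hypothetical, and in-memory dictionaries stand in for real ontology lookups. It also assumes that a value with no match in its domain-specific ontology falls through to the later tiers, which the slides do not state.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical term tables standing in for real ontology lookups.
# Every ID below is a placeholder, NOT a real ontology identifier.
DOMAIN_ONTOLOGIES: Dict[str, Dict[str, str]] = {
    "cell population": {"t cell": "CL:PLACEHOLDER_1"},
    "cytokine": {"interleukin-6": "PR:PLACEHOLDER_1"},
}
OBI_TERMS: Dict[str, str] = {"assay": "OBI:PLACEHOLDER_1"}
# Other ontologies in rank order (hypothetical name and contents).
RANKED_ONTOLOGIES: List[Tuple[str, Dict[str, str]]] = [
    ("EXAMPLE_ONTOLOGY", {"milliliter": "EXAMPLE:PLACEHOLDER_1"}),
]


@dataclass
class Mapping:
    element: str            # the template element or value as submitted
    term_id: Optional[str]  # matched term ID, or None if unmatched
    source: str             # which tier produced the result


def map_element(element: str, value_set: Optional[str] = None) -> Mapping:
    """Map one template element using the tiers described on slide 10."""
    key = element.strip().lower()

    # Tier 1: values from certain value sets go to domain-specific ontologies.
    if value_set in DOMAIN_ONTOLOGIES:
        term = DOMAIN_ONTOLOGIES[value_set].get(key)
        if term is not None:
            return Mapping(element, term, "domain ontology")

    # Tier 2: other elements are looked up in OBI.
    if key in OBI_TERMS:
        return Mapping(element, OBI_TERMS[key], "OBI")

    # Tier 3: fall back to other ontologies, in rank order.
    for name, terms in RANKED_ONTOLOGIES:
        if key in terms:
            return Mapping(element, terms[key], name)

    # Tier 4: nothing matched, so flag the element for manual search.
    return Mapping(element, None, "manual search")


if __name__ == "__main__":
    for element, value_set in [
        ("T cell", "cell population"),
        ("Assay", None),
        ("milliliter", None),
        ("some unmapped column", None),
    ]:
        print(map_element(element, value_set))
```

Elements that come back with `source` set to "manual search" correspond to the cases the slides describe as needing a manual lookup or a new-term request to the relevant ontology group.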