SlideShare a Scribd company logo
1 of 17
Download to read offline
Standardization of the HIPC Data
Templates: The Story So Far
Ahmad C. Bukhari, Ph.D., Kei-Hoi Cheung, Ph.D. and Steven H. Kleinstein, Ph.D.
Yale University, School of Medicine
User Group
(HIPC)
● An important resource for raw data and protocols from clinical trials,
mechanistic studies and novel methods for cellular and molecular
measurements
● Provides templates and standard operating procedures to facilitate data
representation and transfer.
● Provides a variety of tools for data access and manipulation
ImmPort
SQL Dump for local
hosting
Human Immunology Project Consortium (HIPC)
● Well-characterized human cohorts are studied using a variety of modern
analytic tools including multiplex transcriptional, cytokine, and proteomic
assays.
● HIPC submitted data is an important subset of the ImmPort database
● Submitted HIPC data is not standardized.
● Inconsistent naming and data reporting
Our aim is to make HIPC data FAIR
● Findability
○ Finding a large variety of related datasets is an important step to knowledge discovery
● Accessibility
○ A growing number of datasets are being submitted to public repositories such as ImmPort.
These datasets can accessed through different methods including web-based search, bulk
download and API access
● Interoperability
○ Data mining/analysis often requires multiple datasets to be integrated within a single repository
or across multiple repositories
● Reusability
○ Entering enough metadata as part of the data submission process facilitates data reuse
❖ FAIR a set of Digital Object Compliance principles that describes the properties of digital objects
defined under NIH Commons initiative
Current practices towards data FAIRness
● Minimum information standards (checklists) specify the minimum amount of
information (metadata) needed for reporting results in a reproducible and
reusable fashion. For example,
○ MIAME: Minimum information about a microarray experiment
○ MIAPE: Minimum Information About a Proteomics Experiment
● Scientific communities have developed templates incorporating detailed
checklists of the metadata needed to describe about the particular types of
experimental data sources.
● Standard identifiers/terminologies/ontologies have been created for different
domains
We propose an ontological mapping for the
ImmPort data submission templates.
● Ontology term mapping allows to achieve semantic normalization across
different repositories.
● Ontologically annotated datasets allow context-aware queries and data
integration
● Mapping to controlled vocabularies, relationships and rules facilitates
run-time data validation.
● These help achieve data FAIRness.
Ontology mapping of templates
Ontology
Recommender
OBI, OBO, Cell, PR
1
3
2
4
6 5
Incorporate into CEDAR and ImmPort Retrieve annotation (concept Uri, defns, etc)
A collection of ontologies
Expert Verification
Finalizing Mapping
Suggested Alteration
Terms Suggestion
Concept mapper
Concept mapper uses NCBO web services to suggest suitable mapping
Our mapping strategy
• For certain value sets such as cell populations and cytokines, CM maps
the values to domain specific ontologies such as Cell Ontology (CL) and
Protein Ontology (PR)
• For other elements, CM maps them to the terms in Ontology for
Biomedical Investigations (OBI)
• For elements that do not have matches in OBI, we map these elements to
terms in top-ranked ontologies by OBO Foundry
• For elements that do not have any ontology term matches, we perform
manual search in Bioportal and other available repos for these missing
terms.
• We work closely with individual ontology groups (e.g., CL, OBI) to fill the
Template elements mapped to ontologies
• Assay types (e.g., gene expression, flow cytometry, ELISA,
HAI, Luminex )
• Template types (e.g., human subject, biosample)
• Column names (e.g., biosample type, measurement
technique)
• Value sets (e.g., set of cell populations, set of measurement
techniques)
Assay Type # Templates # Sub-Templates # Concept # Value Set
Microarray gene
expression
6 10 113 209
Flowcytometry 6 - 67 262
ELISA 2 - 39 602
HAI 2 - 37 117
Luminex 7 - 102 1032
General 6 - 115 190
Mapping Statistics
OBI
OBI
OBI
Newly added
A device that moves charged particles through a .... OBI_0001121
A cytometry assay in which the presence of molecules OBI_0002115
CEDAR helps to generate ontology-linked metadata
Use case: CEDAR immunology data submission
templates
CEDAR has employed our suggested mapping
Map to cell term
in cell ontology
Manual Mapping to “assay”
In OBI Automatic mapping with NCIT
https://cedar.metadatacenter.net
Automatic mapping with OBI
Future plan
• Refine mapping of new assay types with updated
algorithm.
• Mapping of clinical metadata with ontology terms.
• Incorporate our ontology-term mapping approach into
CEDAR and ImmPort
• Submit missing terms to relevant ontologies (e.g., OBI)
Acknowledgment
• ImmPort
• Jeff Wiser, Patrick Dunn
• Yale
• Hailong Meng, Subhasis Mohanty
•Cell Ontology
• Alex Diehl
• NCBO BioPortal and CEDAR
• Mark Musen, John Graybeal, Martin O’connor
• OBI
• Bjoern Peters

More Related Content

What's hot

ELIXIR-UK and the ELIXIR Interoperability Platform
ELIXIR-UK and the ELIXIR Interoperability PlatformELIXIR-UK and the ELIXIR Interoperability Platform
ELIXIR-UK and the ELIXIR Interoperability PlatformELIXIR UK
 
Standards and tools for model management in biomedical research
Standards and tools for model management in biomedical researchStandards and tools for model management in biomedical research
Standards and tools for model management in biomedical researchUniversity Medicine Greifswald
 
Tools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesTools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesValery Tkachenko
 
Opportunities in chemical structure standardization
Opportunities in chemical structure standardizationOpportunities in chemical structure standardization
Opportunities in chemical structure standardizationValery Tkachenko
 
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...Syed Ahmad Chan Bukhari, PhD
 
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015Stuart Chalk
 
ACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP ProjectACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP ProjectStuart Chalk
 
Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Valery Tkachenko
 
From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...Catherine Canevet
 
Model management tools for improved reproducibility in systems biology
Model management tools for improved reproducibility in systems biologyModel management tools for improved reproducibility in systems biology
Model management tools for improved reproducibility in systems biologyUniversity Medicine Greifswald
 
Enabling faster analysis of vaccine adverse event reports with ontology support
Enabling faster analysis of vaccine adverse event reports with ontology supportEnabling faster analysis of vaccine adverse event reports with ontology support
Enabling faster analysis of vaccine adverse event reports with ontology supportMelanie Courtot
 
AnIML: A New Analytical Data Standard
AnIML: A New Analytical Data StandardAnIML: A New Analytical Data Standard
AnIML: A New Analytical Data StandardStuart Chalk
 
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...Oscar Peña del Rio
 
Schema Extraction for Privacy Preserving Processing of Sensitive Data
Schema Extraction for Privacy Preserving Processing of Sensitive DataSchema Extraction for Privacy Preserving Processing of Sensitive Data
Schema Extraction for Privacy Preserving Processing of Sensitive DataLars Gleim
 
2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)Michael Atkins
 

What's hot (20)

ELIXIR-UK and the ELIXIR Interoperability Platform
ELIXIR-UK and the ELIXIR Interoperability PlatformELIXIR-UK and the ELIXIR Interoperability Platform
ELIXIR-UK and the ELIXIR Interoperability Platform
 
Standards and tools for model management in biomedical research
Standards and tools for model management in biomedical researchStandards and tools for model management in biomedical research
Standards and tools for model management in biomedical research
 
Tools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesTools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databases
 
Opportunities in chemical structure standardization
Opportunities in chemical structure standardizationOpportunities in chemical structure standardization
Opportunities in chemical structure standardization
 
Short introduction to SED-ML
Short introduction to SED-MLShort introduction to SED-ML
Short introduction to SED-ML
 
Data and Model Management for Systems Biology
Data and Model Management  for Systems BiologyData and Model Management  for Systems Biology
Data and Model Management for Systems Biology
 
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
MiAIRR:Minimum information about an Adaptive Immune Receptor Repertoire Seque...
 
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015
Integrating AnIML Files in Electronic Laboratory Notebooks - PittCon 2015
 
ACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP ProjectACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP Project
 
Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0
 
From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...
 
Model management tools for improved reproducibility in systems biology
Model management tools for improved reproducibility in systems biologyModel management tools for improved reproducibility in systems biology
Model management tools for improved reproducibility in systems biology
 
Enabling faster analysis of vaccine adverse event reports with ontology support
Enabling faster analysis of vaccine adverse event reports with ontology supportEnabling faster analysis of vaccine adverse event reports with ontology support
Enabling faster analysis of vaccine adverse event reports with ontology support
 
AnIML: A New Analytical Data Standard
AnIML: A New Analytical Data StandardAnIML: A New Analytical Data Standard
AnIML: A New Analytical Data Standard
 
Data and model management in Systems Biology
Data and model management in Systems BiologyData and model management in Systems Biology
Data and model management in Systems Biology
 
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
 
BioNLPSADI
BioNLPSADIBioNLPSADI
BioNLPSADI
 
Schema Extraction for Privacy Preserving Processing of Sensitive Data
Schema Extraction for Privacy Preserving Processing of Sensitive DataSchema Extraction for Privacy Preserving Processing of Sensitive Data
Schema Extraction for Privacy Preserving Processing of Sensitive Data
 
NETTAB 2012
NETTAB 2012NETTAB 2012
NETTAB 2012
 
2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)
 

Similar to Standardization of the HIPC Data Templates

The Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to TerminologyThe Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to TerminologySnow Owl
 
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ... Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...Syed Ahmad Chan Bukhari, PhD
 
A Semantic Web based Framework for Linking Healthcare Information with Comput...
A Semantic Web based Framework for Linking Healthcare Information with Comput...A Semantic Web based Framework for Linking Healthcare Information with Comput...
A Semantic Web based Framework for Linking Healthcare Information with Comput...Koray Atalag
 
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsEnabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsLuis Marco Ruiz
 
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsEnabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsLuis Marco Ruiz
 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceCarole Goble
 
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...Amit Sheth
 
Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003robertstevens65
 
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...Syed Ahmad Chan Bukhari, PhD
 
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...Ahmad C. Bukhari
 
Ontologies for life sciences: examples from the gene ontology
Ontologies for life sciences: examples from the gene ontologyOntologies for life sciences: examples from the gene ontology
Ontologies for life sciences: examples from the gene ontologyMelanie Courtot
 
Provenance abstraction for implementing security: Learning Health System and ...
Provenance abstraction for implementing security: Learning Health System and ...Provenance abstraction for implementing security: Learning Health System and ...
Provenance abstraction for implementing security: Learning Health System and ...Vasa Curcin
 
FedCentric_Presentation
FedCentric_PresentationFedCentric_Presentation
FedCentric_PresentationYatpang Cheung
 
Towards Automated AI-guided Drug Discovery Labs
Towards Automated AI-guided Drug Discovery LabsTowards Automated AI-guided Drug Discovery Labs
Towards Automated AI-guided Drug Discovery LabsOla Spjuth
 
Reference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptxReference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptxChimezie Ogbuji
 
150219 agbt giab_poster_marc
150219 agbt giab_poster_marc150219 agbt giab_poster_marc
150219 agbt giab_poster_marcGenomeInABottle
 
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...DataScienceConferenc1
 

Similar to Standardization of the HIPC Data Templates (20)

The Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to TerminologyThe Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to Terminology
 
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ... Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 
A Semantic Web based Framework for Linking Healthcare Information with Comput...
A Semantic Web based Framework for Linking Healthcare Information with Comput...A Semantic Web based Framework for Linking Healthcare Information with Comput...
A Semantic Web based Framework for Linking Healthcare Information with Comput...
 
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsEnabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
 
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse EnvironmentsEnabling Clinical Data Reuse with openEHR Data Warehouse Environments
Enabling Clinical Data Reuse with openEHR Data Warehouse Environments
 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data Science
 
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
Semantic Web & Web 3.0 empowering real world outcomes in biomedical research ...
 
Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003Beyond Transparency: Success & Lessons From tambisBoston2003
Beyond Transparency: Success & Lessons From tambisBoston2003
 
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...
Leveraging CEDAR workbench for ontology-linked submission of adaptive immune ...
 
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...
Leveraging the CEDAR Workbench for Ontology-linked Submission of Adaptive Imm...
 
Ontologies for life sciences: examples from the gene ontology
Ontologies for life sciences: examples from the gene ontologyOntologies for life sciences: examples from the gene ontology
Ontologies for life sciences: examples from the gene ontology
 
Provenance abstraction for implementing security: Learning Health System and ...
Provenance abstraction for implementing security: Learning Health System and ...Provenance abstraction for implementing security: Learning Health System and ...
Provenance abstraction for implementing security: Learning Health System and ...
 
FedCentric_Presentation
FedCentric_PresentationFedCentric_Presentation
FedCentric_Presentation
 
Towards Automated AI-guided Drug Discovery Labs
Towards Automated AI-guided Drug Discovery LabsTowards Automated AI-guided Drug Discovery Labs
Towards Automated AI-guided Drug Discovery Labs
 
Reference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptxReference Domain Ontologies and Large Medical Language Models.pptx
Reference Domain Ontologies and Large Medical Language Models.pptx
 
Semantic Technologies for Big Sciences including Astrophysics
Semantic Technologies for Big Sciences including AstrophysicsSemantic Technologies for Big Sciences including Astrophysics
Semantic Technologies for Big Sciences including Astrophysics
 
150219 agbt giab_poster_marc
150219 agbt giab_poster_marc150219 agbt giab_poster_marc
150219 agbt giab_poster_marc
 
Dia09
Dia09Dia09
Dia09
 
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
[DSC Europe 23][DigiHealth] Vesna Pajic - Machine Learning Techniques for omi...
 
Deep Learning for EHR Data
Deep Learning for EHR DataDeep Learning for EHR Data
Deep Learning for EHR Data
 

More from Syed Ahmad Chan Bukhari, PhD

CEDAR: Easing Authoring of Metadata to Make Biomedical Data Sets More Findabl...
CEDAR: Easing Authoring of Metadata to Make Biomedical Data Sets More Findabl...CEDAR: Easing Authoring of Metadata to Make Biomedical Data Sets More Findabl...
CEDAR: Easing Authoring of Metadata to Make Biomedical Data Sets More Findabl...Syed Ahmad Chan Bukhari, PhD
 
Finding and Reusing Biomedical Datasets using CEDAR Metadata Repository and T...
Finding and Reusing Biomedical Datasets using CEDAR Metadata Repository and T...Finding and Reusing Biomedical Datasets using CEDAR Metadata Repository and T...
Finding and Reusing Biomedical Datasets using CEDAR Metadata Repository and T...Syed Ahmad Chan Bukhari, PhD
 
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized MetadataCEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized MetadataSyed Ahmad Chan Bukhari, PhD
 
A semantic framework for biomedical image discovery
A semantic framework for biomedical image discoveryA semantic framework for biomedical image discovery
A semantic framework for biomedical image discoverySyed Ahmad Chan Bukhari, PhD
 
Semantic enrichment and similarity approximation for biomedical sequence images
Semantic enrichment and similarity approximation for biomedical sequence imagesSemantic enrichment and similarity approximation for biomedical sequence images
Semantic enrichment and similarity approximation for biomedical sequence imagesSyed Ahmad Chan Bukhari, PhD
 
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...Cedar OnDemand: An intelligent browser extension to generate ontology-based m...
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...Syed Ahmad Chan Bukhari, PhD
 
CAIRR: A pipeline to submit AIRR data to the NCBI through the CEDAR Workbench
CAIRR: A pipeline to submit AIRR data to the NCBI through the CEDAR WorkbenchCAIRR: A pipeline to submit AIRR data to the NCBI through the CEDAR Workbench
CAIRR: A pipeline to submit AIRR data to the NCBI through the CEDAR WorkbenchSyed Ahmad Chan Bukhari, PhD
 
BioNLP-SADI: A Suite of interoperable BioNLP Semantic Web Services based on S...
BioNLP-SADI: A Suite of interoperable BioNLP Semantic Web Services based on S...BioNLP-SADI: A Suite of interoperable BioNLP Semantic Web Services based on S...
BioNLP-SADI: A Suite of interoperable BioNLP Semantic Web Services based on S...Syed Ahmad Chan Bukhari, PhD
 
AN Intelligent Realtime multiple vessel collision risk assessment system
AN Intelligent Realtime multiple vessel collision risk assessment system AN Intelligent Realtime multiple vessel collision risk assessment system
AN Intelligent Realtime multiple vessel collision risk assessment system Syed Ahmad Chan Bukhari, PhD
 

More from Syed Ahmad Chan Bukhari, PhD (13)

CEDAR: Easing Authoring of Metadata to Make Biomedical Data Sets More Findabl...
CEDAR: Easing Authoring of Metadata to Make Biomedical Data Sets More Findabl...CEDAR: Easing Authoring of Metadata to Make Biomedical Data Sets More Findabl...
CEDAR: Easing Authoring of Metadata to Make Biomedical Data Sets More Findabl...
 
Finding and Reusing Biomedical Datasets using CEDAR Metadata Repository and T...
Finding and Reusing Biomedical Datasets using CEDAR Metadata Repository and T...Finding and Reusing Biomedical Datasets using CEDAR Metadata Repository and T...
Finding and Reusing Biomedical Datasets using CEDAR Metadata Repository and T...
 
CEDAR Technologies for AIRR Submissions
CEDAR Technologies for AIRR SubmissionsCEDAR Technologies for AIRR Submissions
CEDAR Technologies for AIRR Submissions
 
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized MetadataCEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
CEDAR: Web-Based Tools for Accelerating the Creation of Standardized Metadata
 
A semantic framework for biomedical image discovery
A semantic framework for biomedical image discoveryA semantic framework for biomedical image discovery
A semantic framework for biomedical image discovery
 
Semantic enrichment and similarity approximation for biomedical sequence images
Semantic enrichment and similarity approximation for biomedical sequence imagesSemantic enrichment and similarity approximation for biomedical sequence images
Semantic enrichment and similarity approximation for biomedical sequence images
 
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...Cedar OnDemand: An intelligent browser extension to generate ontology-based m...
Cedar OnDemand: An intelligent browser extension to generate ontology-based m...
 
CAIRR: A pipeline to submit AIRR data to the NCBI through the CEDAR Workbench
CAIRR: A pipeline to submit AIRR data to the NCBI through the CEDAR WorkbenchCAIRR: A pipeline to submit AIRR data to the NCBI through the CEDAR Workbench
CAIRR: A pipeline to submit AIRR data to the NCBI through the CEDAR Workbench
 
BioNLP-SADI: A Suite of interoperable BioNLP Semantic Web Services based on S...
BioNLP-SADI: A Suite of interoperable BioNLP Semantic Web Services based on S...BioNLP-SADI: A Suite of interoperable BioNLP Semantic Web Services based on S...
BioNLP-SADI: A Suite of interoperable BioNLP Semantic Web Services based on S...
 
Type 2 fuzzy ontology ahmadchan
Type 2 fuzzy ontology ahmadchanType 2 fuzzy ontology ahmadchan
Type 2 fuzzy ontology ahmadchan
 
AN Intelligent Realtime multiple vessel collision risk assessment system
AN Intelligent Realtime multiple vessel collision risk assessment system AN Intelligent Realtime multiple vessel collision risk assessment system
AN Intelligent Realtime multiple vessel collision risk assessment system
 
Canadian health census to lod
Canadian health census to lodCanadian health census to lod
Canadian health census to lod
 
Type-2 Fuzzy Ontology
Type-2 Fuzzy OntologyType-2 Fuzzy Ontology
Type-2 Fuzzy Ontology
 

Recently uploaded

Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts ServiceCall Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Servicejennyeacort
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...Suhani Kapoor
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...shivangimorya083
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 

Recently uploaded (20)

꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts ServiceCall Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 

Standardization of the HIPC Data Templates

  • 1. Standardization of the HIPC Data Templates: The Story So Far Ahmad C. Bukhari, Ph.D., Kei-Hoi Cheung, Ph.D. and Steven H. Kleinstein, Ph.D. Yale University, School of Medicine User Group (HIPC)
  • 2. ● An important resource for raw data and protocols from clinical trials, mechanistic studies and novel methods for cellular and molecular measurements ● Provides templates and standard operating procedures to facilitate data representation and transfer. ● Provides a variety of tools for data access and manipulation ImmPort SQL Dump for local hosting
  • 3. Human Immunology Project Consortium (HIPC) ● Well-characterized human cohorts are studied using a variety of modern analytic tools including multiplex transcriptional, cytokine, and proteomic assays. ● HIPC submitted data is an important subset of the ImmPort database ● Submitted HIPC data is not standardized. ● Inconsistent naming and data reporting
  • 4. Our aim is to make HIPC data FAIR ● Findability ○ Finding a large variety of related datasets is an important step to knowledge discovery ● Accessibility ○ A growing number of datasets are being submitted to public repositories such as ImmPort. These datasets can accessed through different methods including web-based search, bulk download and API access ● Interoperability ○ Data mining/analysis often requires multiple datasets to be integrated within a single repository or across multiple repositories ● Reusability ○ Entering enough metadata as part of the data submission process facilitates data reuse ❖ FAIR a set of Digital Object Compliance principles that describes the properties of digital objects defined under NIH Commons initiative
  • 5. Current practices towards data FAIRness ● Minimum information standards (checklists) specify the minimum amount of information (metadata) needed for reporting results in a reproducible and reusable fashion. For example, ○ MIAME: Minimum information about a microarray experiment ○ MIAPE: Minimum Information About a Proteomics Experiment ● Scientific communities have developed templates incorporating detailed checklists of the metadata needed to describe about the particular types of experimental data sources. ● Standard identifiers/terminologies/ontologies have been created for different domains
  • 6.
  • 7. We propose an ontological mapping for the ImmPort data submission templates. ● Ontology term mapping allows to achieve semantic normalization across different repositories. ● Ontologically annotated datasets allow context-aware queries and data integration ● Mapping to controlled vocabularies, relationships and rules facilitates run-time data validation. ● These help achieve data FAIRness.
  • 8. Ontology mapping of templates Ontology Recommender OBI, OBO, Cell, PR 1 3 2 4 6 5 Incorporate into CEDAR and ImmPort Retrieve annotation (concept Uri, defns, etc) A collection of ontologies Expert Verification Finalizing Mapping Suggested Alteration Terms Suggestion Concept mapper
  • 9. Concept mapper uses NCBO web services to suggest suitable mapping
  • 10. Our mapping strategy • For certain value sets such as cell populations and cytokines, CM maps the values to domain specific ontologies such as Cell Ontology (CL) and Protein Ontology (PR) • For other elements, CM maps them to the terms in Ontology for Biomedical Investigations (OBI) • For elements that do not have matches in OBI, we map these elements to terms in top-ranked ontologies by OBO Foundry • For elements that do not have any ontology term matches, we perform manual search in Bioportal and other available repos for these missing terms. • We work closely with individual ontology groups (e.g., CL, OBI) to fill the
  • 11. Template elements mapped to ontologies • Assay types (e.g., gene expression, flow cytometry, ELISA, HAI, Luminex ) • Template types (e.g., human subject, biosample) • Column names (e.g., biosample type, measurement technique) • Value sets (e.g., set of cell populations, set of measurement techniques)
  • 12. Assay Type # Templates # Sub-Templates # Concept # Value Set Microarray gene expression 6 10 113 209 Flowcytometry 6 - 67 262 ELISA 2 - 39 602 HAI 2 - 37 117 Luminex 7 - 102 1032 General 6 - 115 190 Mapping Statistics
  • 13. OBI OBI OBI Newly added A device that moves charged particles through a .... OBI_0001121 A cytometry assay in which the presence of molecules OBI_0002115
  • 14. CEDAR helps to generate ontology-linked metadata Use case: CEDAR immunology data submission templates
  • 15. CEDAR has employed our suggested mapping Map to cell term in cell ontology Manual Mapping to “assay” In OBI Automatic mapping with NCIT https://cedar.metadatacenter.net Automatic mapping with OBI
  • 16. Future plan • Refine mapping of new assay types with updated algorithm. • Mapping of clinical metadata with ontology terms. • Incorporate our ontology-term mapping approach into CEDAR and ImmPort • Submit missing terms to relevant ontologies (e.g., OBI)
  • 17. Acknowledgment • ImmPort • Jeff Wiser, Patrick Dunn • Yale • Hailong Meng, Subhasis Mohanty •Cell Ontology • Alex Diehl • NCBO BioPortal and CEDAR • Mark Musen, John Graybeal, Martin O’connor • OBI • Bjoern Peters