CAIRR is a pipeline that allows users to submit AIRR data to NCBI through the CEDAR Workbench. It addresses problems with the current direct NCBI submission process, such as a lack of metadata standardization and error-prone data entry. CAIRR reduces submission to three steps: finding a template, adding standardized metadata by mapping attributes to ontologies, and uploading and submitting the data. The result is metadata that is more complete and consistent than metadata submitted directly to NCBI.
CAIRR: A pipeline to submit AIRR data to the NCBI through the CEDAR Workbench
1. CAIRR: A pipeline to submit AIRR data to the NCBI through the CEDAR Workbench
Syed Ahmad Chan Bukhari, Martin J. O'Connor, Marcos Martínez-Romero, Attila L. Egyedi, Debra Willrett, John Graybeal, Mark A. Musen, Florian Rubelt, Steven H. Kleinstein, Kei-Hoi Cheung
3. NCBI is an important resource to archive biomedical data
● NCBI hosts a collection of biomedical databases and provides long-term support.
○ BioProject, BioSample, SRA, GenBank, GEO etc.
● Minimal use of standard terminologies to define the necessary metadata
○ Ontologies are recommended for some data elements (not implemented)
● NCBI metadata are often described using inconsistent terminologies
○ This limits our ability to find, access, interoperate with, and reuse the data sets
4. What are the issues with the current NCBI submission process?
● Rapid growth
● Lack of metadata standardization
● Error-prone data entry
● Lack of community-specific metadata (e.g., AIRR)
5. CEDAR for AIRR
Submit your AIRR metadata to NCBI, faster and better.
Example of Ontological Mapping
Field               Ontology
Organism            NCBITAXON
Disease/Diagnosis   DOID
Tissue              BTO
Cell Subset         CL
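
The mapping above can be sketched as a small lookup table with a check that each metadata value comes from the expected ontology. This is only an illustration, not how CEDAR or CAIRR is implemented: the function name `check_record`, the `PREFIX:ID` form of the values, and the sample record are all assumptions made for the example.

```python
# Field -> ontology acronym, taken from the mapping listed above.
FIELD_ONTOLOGY = {
    "Organism": "NCBITAXON",
    "Disease/Diagnosis": "DOID",
    "Tissue": "BTO",
    "Cell Subset": "CL",
}


def check_record(record):
    """Return a list of problems found in a {field: term_id} record.

    Hypothetical helper: values are assumed to look like PREFIX:ID,
    where PREFIX names the ontology the term comes from.
    """
    problems = []
    for field, ontology in FIELD_ONTOLOGY.items():
        value = record.get(field)
        if not value:
            problems.append(f"{field}: missing value")
            continue
        prefix, sep, local_id = value.partition(":")
        if not sep or not local_id:
            problems.append(f"{field}: '{value}' is not of the form PREFIX:ID")
        elif prefix.upper() != ontology:
            problems.append(
                f"{field}: expected a {ontology} term, got prefix '{prefix}'"
            )
    return problems


# Illustrative record; the term IDs are sample values, not data from the slides.
sample = {
    "Organism": "NCBITaxon:9606",   # Homo sapiens
    "Disease/Diagnosis": "DOID:9352",
    "Tissue": "BTO:0000089",
    "Cell Subset": "CL:0000236",    # B cell
}

if __name__ == "__main__":
    print(check_record(sample))
    print(check_record({"Organism": "human"}))
```

A free-text value such as "human" is flagged because it carries no ontology prefix, which is the kind of inconsistency that mapping attributes to ontologies is meant to prevent.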
10. Why use CAIRR instead of direct submission?
1. Just one simple form (with tooltips!) to fill out, instead of multiple NCBI templates.
2. Get your metadata right the first time with auto-completion, suggestions, and validation.
3. Exact answers: your metadata attributes and values come from unique ontology concepts, so they are unambiguous and fully described.
4. Better feedback during the submission process.
5. CEDAR is faster and easier!