A practical guide to using The Cancer Imaging Archive for QIN Challenges and Collaborative Projects
1. http://cancerimagingarchive.net
Justin Kirby – justin.kirby@nih.gov
Frederick National Laboratory for Cancer Research
Leidos Biomedical Research, Inc.
Support to: Cancer Imaging Program/DCTD/NCI
A Practical Guide to Using TCIA for QIN Challenges & Collaborative Projects
QIN F2F 2018
3. The Cancer Imaging Archive: Brief intro
• 84 data sets consisting of over 41,000 subjects available for download
• Covers most modalities (CT/MR/PET/RT)
• Wide variety of cancers + phantoms
• Patient populations vary from a handful to >26,000 (NLST)
• Many have associated meta-data
- Demographics/outcomes/therapy
- Pathology imaging
- Radiologist expert and automated computational analyses (segmentations, features)
• ‘Omics ties to GDC/TCGA, CPTAC, and GEO
http://www.cancerimagingarchive.net
4. Organization of TCIA ecosystem
The Cancer Imaging Archive
Data Collection Center
• Tools and staffing to support data collection, curation, and de-identification
Data Access
• Browse (home page)
• Filter/Search (Data Portal)
• REST API
• Analysis Data
Data Analysis Centers
• 3rd party web sites or tools which connect to TCIA’s API or mirror its data
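As a rough illustration of how a third-party tool might connect to the REST API listed under Data Access, here is a minimal Python sketch. The base URL, the endpoint names (`getCollectionValues`, `getSeries`) and the response shape are assumptions that do not come from the slides and should be checked against the current TCIA API documentation. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# ASSUMPTION: this base URL is not given in the slides; confirm it in the
# current TCIA API documentation before relying on it.
BASE_URL = "https://services.cancerimagingarchive.net/nbia-api/services/v1"


def build_query_url(endpoint, **params):
    """Return the full URL for an endpoint plus optional query parameters."""
    url = f"{BASE_URL}/{endpoint}"
    if params:
        url += "?" + urlencode(params)
    return url


def collection_names(payload):
    """Extract sorted collection names from a JSON response body.

    ASSUMPTION: the body is a JSON list of objects with a "Collection" key.
    """
    return sorted(item["Collection"] for item in json.loads(payload))


def fetch(endpoint, **params):
    """Perform the HTTP GET (requires network access) and return the body text."""
    with urlopen(build_query_url(endpoint, **params), timeout=30) as response:
        return response.read().decode("utf-8")
```

With network access, `collection_names(fetch("getCollectionValues"))` would list the collections if the assumed endpoint and response shape hold. URL building and parsing are kept separate from the network call so they can be checked offline.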
5. TCIA services (not just software)
Relieves PI of majority of data sharing burden/risks
• Data hosting with >99% uptime
• De-identification using RSNA’s pre-configured Clinical Trials Processor (CTP) and DICOM PS 3.15 Annex E standards
• Multi-tiered QC process inspects both DICOM headers and pixels for PHI and integrity of data set
Phone/email support available for end users and submitters
Extensive documentation throughout the site
Exposure to a large community of researchers
• Increase visibility of your work, get more citations!
6. Data Collection Center: Publish Your Data
Primary Data (radiology, pathology, clinical, etc.)
Analysis Results (derived from primary data)
Image credit: Hugo Aerts
7. Data Collection Center: Publishing data in addition to manuscripts
Data citations for both primary and analysis data to enable reproducible research
Analysis Dataset Citation (derived image features)
Gutman DA, Cooper LA, Hwang SN, Holder CA, Gao J, Aurora TD, Dunn WD Jr, Scarpace L, Mikkelsen T, Jain R, Wintermark M, Jilwan M, Raghavan P, Huang E, Clifford RJ, Mongkolwat P, Kleper V, Freymann J, Kirby J, Zinn PO, Moreno CS, Jaffe C, Colen R, Rubin DL, Saltz J, Flanders A, Brat DJ. (2014). MR Imaging Predictors of Molecular Profile and Survival: Multi-institutional Study of the TCGA Glioblastoma Data Set. The Cancer Imaging Archive. http://doi.org/10.7937/K9/TCIA.2014.4HTXYRCN
Publication Citation (cites specific data used)
MR imaging predictors of molecular profile and survival: multi-institutional study of the TCGA glioblastoma data set. Radiology. 2013 May;267(2):560-9. doi: 10.1148/radiol.13120118. Epub 2013 Feb 7. PubMed PMID: 23392431; PubMed Central PMCID: PMC3632807.
Primary Data Citation (TCIA images used for study)
Scarpace, L., Mikkelsen, T., Cha, Soonmee, Rao, S., Tekchandani, S., Gutman, D., … Pierce, L. J. (2016). Radiology Data from The Cancer Genome Atlas Glioblastoma Multiforme [TCGA-GBM] collection. The Cancer Imaging Archive. http://doi.org/10.7937/K9/TCIA.2016.RNYFUYE9
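Because each citation above carries a DOI, a publication's data references can be pulled out and turned into resolver links mechanically. The following is a minimal sketch, not an official TCIA tool. The regular expression is a simplified DOI pattern chosen for illustration, and the function names are hypothetical.

```python
import re

# Simplified DOI pattern (an assumption for illustration): "10.", a numeric
# registrant code, "/", then a suffix that runs until whitespace.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(text):
    """Return the DOIs found in text, minus trailing sentence punctuation."""
    return [match.rstrip(".,;)") for match in DOI_PATTERN.findall(text)]


def resolver_url(doi):
    """Build the doi.org resolver link for a DOI."""
    return f"https://doi.org/{doi}"


citation = (
    "MR Imaging Predictors of Molecular Profile and Survival. "
    "The Cancer Imaging Archive. http://doi.org/10.7937/K9/TCIA.2014.4HTXYRCN"
)
for doi in extract_dois(citation):
    print(resolver_url(doi))
```

The same helper works on the journal-style citation, where the DOI is followed by a period that has to be stripped.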
8. Data Descriptor Journals
Journal | Recommended Repositories
Nature Scientific Data | https://www.nature.com/sdata/policies/repositories#imaging
Medical Physics | http://aapm.onlinelibrary.wiley.com/hub/journal/10.1002/(ISSN)2473-4209/about/author-guidelines.html (see section 13-Medical Physics Dataset Articles)
Elsevier Data in Brief | http://www.elsevier.com/authors/author-services/research-data/data-base-linking/supported-data-repositories#Health
PLOS ONE | http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories
Research Data Support | https://www.springernature.com/gp/authors/research-data-policy/repositories-bio/12327160
Publish detailed descriptions about how to use your TCIA data to gain academic credit (publication/citations) in addition to the novel scientific findings you might publish in traditional journals.
9. A significantly growing community!
38 incoming data sets in varying stages of curation
Over 10,000 active users per month
• Up from ~3,000/month in 2015
Downloads of 40-50TB per month
• Up from ~2TB/month in 2015
613 publications based on TCIA data
• 134 new publications in 2017
10. Researchers want to share data – 38 data set queue
Community Proposed Data Sets
GBM-DSC-MRI-DRO
ASCC TNM Consensus
Colorectal Liver Metastases
QIN-BREAST-02
MyelomaTT3a
Low Dose CT Liver Metastases
Lung Fused-CT-Pathology
HNSCC Oropharyngeal Radiomics
OPC-Radiomics
Oropharynx Phantoms
HNSCC 3D CT RT
MSK Pancreatic Cancer Repository
Program Data Sets | Notes
TCGA | 2 collections/sites
CPTAC | 9 cancer types, 14 sites
Exceptional Responders | 24 of 58 subjects remaining
Immunotherapy | 2 cancer types
PDX mouse | Not started
NCTN integration | RTOG 0617 pilot in process
QIN ECOG-ACRIN | 10 trials
12. QIN Use Cases
Collaborative research projects
Challenge competitions
Data sharing requirements (NIH guidelines, publication guidelines)
13. Summary of QIN Use: 15 QIN Data Sets from 11 out of 19 active sites
Collection | Cancer Type | Modalities | Subjects | Access | Updated | QIN Use Case
Brain-Tumor-Progression | Brain Cancer | MR | 20 | Limited | 2018/01/31 | Collaborative project
QIN LUNG CT | Non-small Cell Lung Cancer | CT | 47 | Public | 2017/07/31 | Lung Seg Challenge
ACRIN FLT Breast | Breast Cancer | PET, CT, OT | 83 | Limited | 2017/12/11 | TBD
Breast-MRI-NACT-Pilot | Breast Cancer | MR, SEG | 64 | Public | 2016/01/26 | BMMR Challenge
ISPY1 | Breast Cancer | MR, SEG | 222 | Public | 2016/08/31 | BMMR Challenge
Lung Phantom | Lung Phantom | CT | 1 | Public | 2014/06/19 | Lung Seg Challenge
QIN-BRAIN-DSC-MRI | Low & High Grade Glioma | MR | 49 | Limited | 2015/08/28 | DSC MRI Challenge
QIN-Breast | Breast Cancer | MR, PT, CT | 67 | Limited | 2015/09/04 | General data sharing
QIN Breast DCE-MRI | Breast Cancer | MR, KO | 10 | Public | 2014/07/31 | DCE Challenge
QIN GBM DCE-MRI | Glioblastoma Multiforme | MR | 10 | Limited | 2015/08/14 | TBD
QIN GBM Treatment Response | Glioblastoma Multiforme | MR | 54 | Limited | 2015/08/12 | TBD
QIN-HeadNeck | Head and Neck Carcinomas | PT, CT, SR, SEG, RWV | 156 | Public | 2014/08/26 | Publications, ITCR use, FDG PET Seg Challenge
QIN PET Phantom | PET Phantom | PT | 2 | Public | 2014/09/04 | FDG PET Seg Challenge
QIN Prostate | Prostate Cancer | MR | 22 | Limited | 2014/07/02 | Collaborative project, AIF challenge
QIN-SARCOMA | Sarcomas | MR | 15 | Limited | 2014/09/05 | AIF Challenge
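To see how the QIN collections split between public and limited access, the table can be tallied in a few lines. This is an illustrative sketch only. The collection names, subject counts and access levels are copied from the table above, while the function name and data layout are hypothetical.

```python
from collections import defaultdict

# (collection, subjects, access) copied from the QIN summary table.
QIN_COLLECTIONS = [
    ("Brain-Tumor-Progression", 20, "Limited"),
    ("QIN LUNG CT", 47, "Public"),
    ("ACRIN FLT Breast", 83, "Limited"),
    ("Breast-MRI-NACT-Pilot", 64, "Public"),
    ("ISPY1", 222, "Public"),
    ("Lung Phantom", 1, "Public"),
    ("QIN-BRAIN-DSC-MRI", 49, "Limited"),
    ("QIN-Breast", 67, "Limited"),
    ("QIN Breast DCE-MRI", 10, "Public"),
    ("QIN GBM DCE-MRI", 10, "Limited"),
    ("QIN GBM Treatment Response", 54, "Limited"),
    ("QIN-HeadNeck", 156, "Public"),
    ("QIN PET Phantom", 2, "Public"),
    ("QIN Prostate", 22, "Limited"),
    ("QIN-SARCOMA", 15, "Limited"),
]


def summarize_by_access(rows):
    """Count collections and sum subjects for each access level."""
    summary = defaultdict(lambda: {"collections": 0, "subjects": 0})
    for _name, subjects, access in rows:
        summary[access]["collections"] += 1
        summary[access]["subjects"] += subjects
    return dict(summary)


for access, stats in sorted(summarize_by_access(QIN_COLLECTIONS).items()):
    print(access, stats["collections"], "collections,", stats["subjects"], "subjects")
```

By this count, 7 of the 15 collections (502 subjects) are public and 8 (320 subjects) are limited access.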
14. Suggestions for the coming year
Re-use of existing data has been limited
Capabilities to archive results data have been underutilized thus far (limited as they may be)
More lead time required due to increased demand
Continued efforts to share diverse/rich data sets are critical
While painful compared to “quick and dirty” solutions, standardization is worthwhile when planning CCPs and data storage
16. Advancing TCGA-GBM and TCGA-LGG with expert labels & radiomic features
Rich datasets without segmentation labels
• Essential for quantitative studies.
• Enabling radiogenomic analyses.
www.braintumorsegmentation.org
Highly competitive challenge utilizing TCIA data
Bakas et al., “GLISTRboost”, Springer LNCS 2016 (ranked 1st at BraTS’15)
Automated method combining biophysical tumor growth modeling with machine learning for glioma segmentation
Input: TCIA data
Segmentations: using GLISTRboost
Publicly available contribution towards repeatable and reproducible studies, by:
- enabling direct utilization of the TCGA/TCIA glioma collections
- allowing their potential to be fully exploited in clinical and computational studies
1. Manual segmentations approved by expert neuro-radiologist
2. Panel of >500 imaging features, extracted from the manual segmentations
19. Clinical trials data: ECOG-ACRIN / QIN
2 completed trials
• 6657/ISPY1 (public)
• 6688/FLT-Breast (restricted to QIN use until 12/18/18).
2 trials in progress
• 6684/brain FMISO expected to be completed next month
• 6668/NSCLC FDG-PET has been transferred from ACRIN, now starting curation at TCIA
How can QIN best leverage this data in clinically meaningful CCPs?
• Schedule presentations about these data to Executive Committee or Working Groups as they come online?
• Identify logical follow up questions to ask based on trial publications?
• Sequester portions of trial data for validation sets?
• Alter the priority list as new needs arise?
20. Clinical trials data: NCTN Data Archive Integration
Pilot project underway to connect RTOG 0617 data in TCIA to NCTN Data Archive clinical data
21. Clinical Proteomics Tumor Analysis Consortium (CPTAC)
Collection | Cancer Type | Modalities | Subjects | Location | Updated
CPTAC-CCRCC | Clear Cell Carcinoma | CT | 18 | Kidney | 2018/04/26
CPTAC-GBM | Glioblastoma Multiforme | CT, MR | 24 | Brain | 2018/04/26
CPTAC-LSCC | Squamous Cell Carcinoma | CT, CR, DX | 3 | Lung | 2018/04/26
CPTAC-LUAD | Adenocarcinoma | CT, MR, PT | 10 | Lung | 2018/04/26
CPTAC-PDA | Ductal Adenocarcinoma | CT, MR, DX, CR | 43 | Pancreas | 2018/04/26
CPTAC-UCEC | Corpus Endometrial Carcinoma | CT, MR, PT | 31 | Uterus | 2018/04/26
CPTAC-HNSCC | Head and Neck Cancer | CT | 5 | Head-Neck | 2018/04/26
CPTAC-CM | Cutaneous Melanoma | MR | 1 | Brain | 2018/02/13
Precision medicine data
• Proteomics
• Radiology
• Pathology
• Clinical
• Genomics
Similar to TCGA, but prospective
• Newer scanners
• Same variability in acquisitions
23. Pilot accomplishments with VA and DOD
Boston VA MAVERIC program
• Now have VA-approved SOPs for image transfer
• Proof of principle: 36 imaging studies on 7 patients hosted
• Genomic data submitted to GDC
Walter Reed GYN-COE Imaging for APOLLO 2
• SOPs developed for images from three sites
• 250 studies on 90 patients transferred, not yet public
• Facilitating image feature extraction analysis by experts
24. APOLLO-5 Year 1 Projected Cases
Multiple Cancer Types and Imaging Modalities
Cancer Type Low Estimate High Estimate
GYN 300 400
Breast 100 200
Prostate 50 100
Colon/GI 50 100
ENT/Thyroid 50 100
Kidney 25 50
Lung 25 50
Brain 10 20
Sarcomas 10 20
Lymphoid 10 20
TOTAL 630 1060
• Imaging will be from multiple imaging modalities
• Most cases will have multiple image sets & time points
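The TOTAL row can be reproduced from the per-cancer-type estimates, which is a quick way to re-check the projection when individual estimates are revised. This is a minimal sketch. The figures are copied from the table above, and the function name is hypothetical.

```python
# (low estimate, high estimate) per cancer type, copied from the table above.
APOLLO_ESTIMATES = {
    "GYN": (300, 400),
    "Breast": (100, 200),
    "Prostate": (50, 100),
    "Colon/GI": (50, 100),
    "ENT/Thyroid": (50, 100),
    "Kidney": (25, 50),
    "Lung": (25, 50),
    "Brain": (10, 20),
    "Sarcomas": (10, 20),
    "Lymphoid": (10, 20),
}


def total_range(estimates):
    """Sum the low and high estimates across all cancer types."""
    low = sum(low_value for low_value, _ in estimates.values())
    high = sum(high_value for _, high_value in estimates.values())
    return low, high


print(total_range(APOLLO_ESTIMATES))  # (630, 1060), matching the TOTAL row
```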
25. Crowds Cure Cancer – BIDS WG pipeline project
Preliminary pipeline stages
• Ingest tumor location seed points from TCGA subjects
• Generate 3D segmentations
• Compute radiomic features
• Predict outcomes
Access the data
• Visit our Analysis Results page
• Direct link: https://doi.org/10.7937/K9/TCIA.2018.OW73VLO2
26. PET-CT Image Feature Standardization
Extending/improving IBSI guidelines
Utilizing TCIA data such as CC-Radiomics-Phantom
28. Acknowledgements
Funding: NCI Cancer Imaging Program
Frederick National Laboratory for Cancer Research
• John Freymann, Justin Kirby, Brenda Fevrier-Sullivan, Luis Cordeiro, Craig Hill
Consultant - Carl Jaffe
University of Arkansas for Medical Sciences
• Fred Prior, Kirk Smith, Lawrence Tarbox, Bill Bennett, Tracy Nolan, Dwayne Dobbins
Emory University
• Ashish Sharma
Editor's Notes
Provide DOIs to collections and meta-collections (article’s analysis)
Publication can refer to the specific data sets used via the DOIs in the data citations
Currently working with NLM, collaborating with Nature Scientific Data and other publications
The VA team, through its MAVERIC (Massachusetts Veterans Epidemiology Research and Information Center) and RePOP projects, has worked extensively to develop an internal SOP to collect and centralize the imaging data of patients participating in APOLLO, pulling imaging data on request from its 21 regional service network facilities, or VISNs.
The Gynecologic Cancer Center of Excellence (GYN-COE) developed its imaging SOP in the context of supporting the APOLLO 2 project. As such, the emphasis was on collecting the data from multiple sites, configuring and testing the data de-identification and submission systems, collaborating with TCIA on the quality control steps, staging the images for feature extraction, and returning quantitative imaging measures back to the APOLLO data systems.
Feature extraction is ongoing; the imaging data are all in TCIA.
Murtha Cancer Center Biobank (MCCB) sites (WRNMMC, Ft. Bragg, Portsmouth, Keesler, San Diego, Madigan, Fort Belvoir, San Antonio, William Beaumont El Paso, Anne Arundel Medical Center): estimated to contribute 200-300 cancers of all types / year
VA Palo Alto initially estimated to contribute 50 lung and GI cancer cases / year
Clinical Breast Care Project (CBCP) estimated to provide 100-200 breast cancers / year
Center for Prostate Disease Research (CPDR) estimated to contribute 50-100 prostate cancers / year
University of Virginia (UVA) estimated to provide 25-50 lung cancer cases / year
Gynecologic Cancer Center of Excellence (GYN-COE) Tissue and Data Acquisition Network (TDAN) sites (Inova, Duke, Roswell Park, with negotiations underway for the Universities of Hawaii and Oklahoma): estimated to contribute 300-400 GYN cancers / year
Priorities: 1. Active Duty
2. Minorities
3. High priority aggressive cancers
4. High priority events (e.g. metastasis, recurrence, resistance)