Dr. Lee Cooper: Integrated Morphologic Analysis for Identification and Characterization of Disease Subtypes

On March 14th, from 11:00am to 12:00pm, Dr. Lee Cooper delivered a virtual presentation via Adobe Connect highlighting his recent publication, “Integrated morphologic analysis for the identification and characterization of disease subtypes.” Dr. Cooper is a postdoctoral researcher in the Center for Comprehensive Informatics at Emory University. He received a Ph.D. in Electrical Engineering from Ohio State University in 2009, where he worked to develop computational methods for image-based phenotyping in mouse models of breast cancer. Dr. Cooper joined Emory in 2009 where he works under the guidance of Joel Saltz to develop methods for analyzing and integrating genomic and imaging datasets to discover associations among pathology, genetics, and patient outcomes. While at Emory, Dr. Cooper has co-authored several methodological and scientific papers describing work performed at the Emory In Silico Brain Tumor Research Center.

  1. Integrated Morphologic Analysis for Identification and Characterization of Disease Subtypes
     Lee Cooper, Center for Comprehensive Informatics, Emory University
  2. Agenda
     • Background
     • Pipeline for integrated morphologic analysis
     • Results and validation
     • Software Infrastructure
     • Future Work and Conclusions
     • Acknowledgements
  3. Background
  4. NCI caBIG® In Silico Brain Tumor Research Center
     • Emory University, Atlanta, GA: Joel Saltz, MD PhD (Director); Daniel Brat, MD PhD (Science PI)
     • Jefferson Hospital, Philadelphia, PA
     • Henry Ford Hospital, Detroit, MI
     • Stanford University, Stanford, CA
  5. Application domain: glioblastoma
     • Most common primary brain tumor in adults
     • Median survival 50 weeks
     • ISBTRC Goals:
       • To leverage rich datasets to understand the mechanisms of glioma progression through In Silico analysis
       • To manage, explore and share semantically complex data among researchers
  6. Glioblastoma Histology
     [Figure panels: Necrosis; Angiogenesis]
  7. The Cancer Genome Atlas (TCGA)
     • Characterize 500 tumors for each of a variety of cancers
     • Clinical records
     • Genomics: gene, miRNA expression, copy number, sequence, DNA methylation
     • Imaging: pathology and radiology
     [Diagram: histology, radiology, genomic, clinical, pathology; all feeding Integrated Analysis]
  8. Slide scanning and image analysis
     • High throughput slide scanning systems
     • Digitize entire slides at 200X / 400X magnification
     • 250 slides / day
     • Algorithms to segment and describe cells and structures
  9. Glioblastoma morphology
     • Themes: morphology, subtypes, rich datasets
     • Are there natural clusters of GBM morphology?
     • Are there links to patient outcome and molecular characteristics?
  10. Methodology
     Cooper LA, Kong J, Gutman DA, Wang F, Gao J, Appin C, Cholleti S, Pan T, Sharma A, Scarpace L, Mikkelsen T, Kurc T, Moreno CS, Brat DJ, Saltz JH, “Integrated morphologic analysis for the identification and characterization of disease subtypes,” Journal of the American Medical Informatics Association, 2012;19:317-323.
  11. Computational Pathology and Correlative Analysis
  12. Morphology engine
  13. Clustering engine: Patient Morphology Profiles
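The step from per-nucleus measurements to the patient morphology profiles named on slide 13 can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes every nucleus is described by a fixed-length feature vector and that a profile is the per-feature mean followed by the per-feature standard deviation. The slides do not state which summary statistics were used, and `patient_profiles` and the toy data are hypothetical names and values.

```python
from collections import defaultdict
from statistics import mean, pstdev

def patient_profiles(nuclei):
    """Summarize per-nucleus feature vectors into one profile per patient.

    `nuclei` is an iterable of (patient_id, feature_vector) pairs.
    Assumption (not from the slides): a profile is the per-feature mean
    followed by the per-feature population standard deviation.
    """
    by_patient = defaultdict(list)
    for patient_id, features in nuclei:
        by_patient[patient_id].append(features)

    profiles = {}
    for patient_id, rows in by_patient.items():
        columns = list(zip(*rows))  # transpose: one tuple per feature
        profiles[patient_id] = (
            [mean(col) for col in columns] + [pstdev(col) for col in columns]
        )
    return profiles

# Hypothetical toy data: two features per nucleus (say, area and intensity).
toy_nuclei = [
    ("P1", [10.0, 0.25]),
    ("P1", [14.0, 0.75]),
    ("P2", [30.0, 1.0]),
]
profiles = patient_profiles(toy_nuclei)
```

Each patient ends up with one vector of fixed length regardless of how many nuclei were segmented, which is what makes patients comparable in a later clustering step.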
  14. Correlative engine: Patient Cluster Labels
  15. Genome wide analysis: GISTIC
  16. Results
  17. Clustering identifies three morphological groups
     • Analyzed 200 million nuclei from 162 TCGA GBMs (462 slides)
     • Named for functions of associated genes: Cell Cycle (CC), Chromatin Modification (CM), Protein Biosynthesis (PB)
     • Prognostically significant (logrank p=4.5e-4)
     [Figure: feature indices (10 to 50) for clusters CC, CM, and PB]
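Grouping patient profiles into three clusters, as on slide 17, can be illustrated with a small example. The slides do not name the clustering algorithm, so plain k-means (Lloyd's algorithm) with a deterministic farthest-point seeding is swapped in here purely as a stand-in. The function names and the two-feature toy profiles are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids

def kmeans(points, k, max_iterations=100):
    """Stand-in clustering (assumption): Lloyd's k-means."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each profile goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical toy profiles forming three well-separated groups.
toy_profiles = [
    [1.0, 1.0], [1.2, 0.8],
    [5.0, 5.0], [5.2, 4.8],
    [9.0, 1.0], [8.8, 1.2],
]
labels, centroids = kmeans(toy_profiles, 3)
```

The resulting label per patient plays the role of the patient cluster labels that the correlative engine on slide 14 consumes.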
  18. Gene Expression Class Associations
     • Cox proportional hazards
       • Verhaak expression class [1] not significant (p=0.58)
       • Morphology clustering p=5.0e-3
     [Figure: subtype percentage (%) of the Classical, Mesenchymal, Neural, and Proneural classes within each cluster (CC, CM, PB)]
     [1] Verhaak RG, Hoadley KA, Purdom E, et al; Cancer Genome Atlas Research Network. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 2010;17:98-110.
  19. Clustering Validation
     • Separate set of 84 GBMs from Henry Ford Hospital
     • ClusterRepro: CC p=7.2e-3, CM p=1.3e-2
     [Figure: feature indices (10 to 50) for clusters CC, Mixed, and CM; survival (0 to 1) versus months (0 to 100) for CC, Mixed, and CM]
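Survival-versus-months curves like the one on slide 19 are conventionally drawn with the Kaplan-Meier estimator. The slides do not say which estimator was used, so the sketch below is an assumption, and the follow-up times and event flags are made-up toy values rather than study data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times:  follow-up in months
    events: 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve

# Hypothetical follow-up for one cluster (months, event flag).
toy_times = [5, 8, 8, 12, 20]
toy_events = [1, 1, 0, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
```

Running the estimator once per cluster label gives one step curve per group, which is the input a logrank comparison would then test.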
  20. Representative nuclei
     [Figure panels: large, hyperchromatic nuclei; small light nuclei, eosinophilic cytoplasm; intermediate]
  21. Associations
  22. From Gene Lists to Biology
     • Nuclear lumen localization most highly enriched in cluster-associated genes (CC p=2.8e-36, CM p=2.17e-19, PB p=1.08e-15)
     • Other enriched GO terms: DNA repair, M phase, cell cycle, protein biosynthesis, chromatin modification
     • Differences in activation of cancer-related pathways, including ATM and TP53 DNA damage checkpoints, NFκB pathway, Wnt signaling and PTEN/AKT pathways
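Enrichment p-values of the kind quoted on slide 22 are commonly computed with a one-sided hypergeometric test. The slides do not say which test produced the reported values, so the following is a sketch under that assumption. The function name and the toy counts are hypothetical.

```python
from math import comb

def enrichment_p_value(population, annotated, selected, overlap):
    """One-sided hypergeometric tail P(X >= overlap).

    population: number of genes considered overall
    annotated:  genes in the population carrying the GO term
    selected:   genes in the cluster-associated list
    overlap:    selected genes that carry the GO term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Toy numbers: 20 genes, 5 annotated, 4 selected, 3 of those annotated.
p = enrichment_p_value(population=20, annotated=5, selected=4, overlap=3)
```

A small p-value means the gene list contains more annotated genes than random sampling would plausibly give; in practice the values are also corrected for testing many GO terms at once.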
  23. Software Infrastructure
     • Wang F, Kong J, Cooper L, Pan T, Kurc T, Chen W, Sharma A, Niedermayr C, Oh T-W, Brat D, Farris A, Foran D, Saltz J, “A Data Model and Database for High-resolution Pathology Analytical Image Informatics,” Journal of Pathology Informatics, Vol. 2, Issue 1, pp. 32-40, 2011.
     • Teodoro G, Kurc T, Pan T, Cooper L, Kong J, Widener P, Saltz J, “Accelerating Large Scale Image Analyses on Parallel CPU-GPU Equipped Systems,” accepted for presentation at the International Parallel and Distributed Processing Symposium, China, 2012. Also available as Emory University, Center for Comprehensive Informatics, Technical Report CCI-TR-2011-4, 2011.
  24. How to scale to 14,000 images?
     • TCGA contains 20 cancer types
       • 14K images – 4 Terabytes
     • How to analyze larger datasets? HPC Pipeline
     • How to organize results? PAIS Database
     • How to interact with the data? CDSA Portal
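To make the scale on slide 24 concrete, the back-of-the-envelope calculation below uses only figures from the slides: 14K images, 4 terabytes, and the scanning rate of 250 slides per day from slide 8. It assumes decimal terabytes, one image per slide, and a single scanner, none of which the slides state.

```python
# Figures taken from the slides.
images = 14_000
total_bytes = 4 * 10**12   # assumption: decimal terabytes
slides_per_day = 250       # scanning throughput from slide 8

# Average stored size per image, in megabytes.
avg_mb_per_image = total_bytes / images / 10**6

# Days one scanner would need, assuming one image per slide.
scan_days = images / slides_per_day

print(f"average image size: {avg_mb_per_image:.0f} MB")
print(f"single-scanner digitization time: {scan_days:.0f} days")
```

Under these assumptions an average image is a few hundred megabytes, which is why the analysis, storage, and interaction questions on the slide each get their own component.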
  25. HPC Segmentation and Feature Extraction Pipeline
     Tony Pan and George Teodoro
  26. PAIS (Pathology Analytical Imaging Standards)
     • PAIS Logical Model:
       • 62 UML classes
       • markups, annotations, imageReferences, provenance
       • Semantic enabled
     • PAIS Data Representation:
       • XML (compressed) or HDF5
     • PAIS Databases:
       • Loading, managing, querying, and sharing data
       • RDBMS + SDBMS + parallel DBMS
     Fusheng Wang
  27. Microscopy Image Database
     • Image analysis: segmentation, feature extraction
     • PAIS model
     • PAIS data management: modeling and management of markup and annotation for querying and sharing through parallel RDBMS + spatial DBMS
     • HDFS data staging
     • MapReduce based queries: on-the-fly data processing for algorithm validation, algorithm sensitivity studies, or discovery of preliminary results
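The MapReduce based queries mentioned on slide 27 follow a map, shuffle, reduce pattern, illustrated here in plain Python on a single machine. This is a sketch of the pattern only, not the PAIS or Hadoop API. The record fields `image_id` and `area`, the function names, and the toy records are all hypothetical.

```python
from collections import defaultdict

def map_phase(records):
    """Emit (image_id, (1, area)) for every segmented nucleus record."""
    for rec in records:
        yield rec["image_id"], (1, rec["area"])

def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Per image: count the nuclei and average their area."""
    result = {}
    for key, values in groups.items():
        count = sum(c for c, _ in values)
        total_area = sum(a for _, a in values)
        result[key] = {"nuclei": count, "mean_area": total_area / count}
    return result

# Hypothetical markup records, one per segmented nucleus.
toy_records = [
    {"image_id": "slide-A", "area": 40.0},
    {"image_id": "slide-A", "area": 60.0},
    {"image_id": "slide-B", "area": 50.0},
]
summary = reduce_phase(shuffle(map_phase(toy_records)))
```

Because the map step handles each record independently and the reduce step handles each image independently, the same query can be spread across many nodes, which suits the on-the-fly validation and sensitivity studies the slide describes.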
  28. Cancer Digital Slide Archive
  29. cancer.digitalslidearchive.net
  30. cancer.digitalslidearchive.net
  31. cancer.digitalslidearchive.net
  32. Future Work and Conclusions
  33. Radiology Imaging Correlative Study
  34. Studying Protein Expression Patterns Using Quantum Dot Immunohistochemistry
     [Figure labels: Cytoplasm; Nucleus]
  35. Conclusions
     • Pathology imagery contains important cues
     • Pipeline for analyzing whole slide imagery
     • Tooling to handle large datasets
     • Other TCGA diseases (14000 Images!)
     • Developing richer descriptions of image content
     • Resources:
       • Emory Websites: bmi.emory.edu, cci.emory.edu
       • Cancer Digital Slide Archive: cancer.digitalslidearchive.net
       • TCGA Symposium Talk: http://cancergenome.nih.gov/newsevents/multimedialibrary/videos/morphologicalcooper
       • JAMIA Paper: http://jamia.bmj.com/content/19/2/317.abstract
  36. In Silico Brain Tumor Research Center Team
     • Emory University: Joel Saltz (Director), Daniel Brat (Science PI), Carlos Moreno (Bioinformatics Lead), Lee Cooper, David Gutman, Jun Kong, Fusheng Wang, Chad Holder, Christina Appin, Candace Chisolm, Erwin van Meir, Tahsin Kurc, Sharath Cholleti, Tony Pan, Ashish Sharma
     • Henry Ford Hospital: Tom Mikkelsen, Lisa Scarpace
     • Thomas Jefferson University: Adam Flanders (Radiology Lead)
     • Stanford University: Daniel Rubin
  37. Related Papers and Acknowledgements
     • Cooper LA, Kong J, Gutman DA, Wang F, Gao J, Appin C, Cholleti S, Pan T, Sharma A, Scarpace L, Mikkelsen T, Kurc T, Moreno CS, Brat DJ, Saltz JH, “Integrated morphologic analysis for the identification and characterization of disease subtypes,” Journal of the American Medical Informatics Association, in press, 2012. Pre-print available: http://jamia.bmj.com/content/19/2/317.long
     • Wang F, Kong J, Cooper L, Pan T, Kurc T, Chen W, Sharma A, Niedermayr C, Oh T-W, Brat D, Farris A, Foran D, Saltz J, “A Data Model and Database for High-resolution Pathology Analytical Image Informatics,” Journal of Pathology Informatics, Vol. 2, Issue 1, pp. 32-40, 2011.
     • Teodoro G, Kurc T, Pan T, Cooper L, Kong J, Widener P, Saltz J, “Accelerating Large Scale Image Analyses on Parallel CPU-GPU Equipped Systems,” accepted for presentation at the International Parallel and Distributed Processing Symposium, China, 2012. Also available as Emory University, Center for Comprehensive Informatics, Technical Report CCI-TR-2011-4, 2011.
     This work is supported in part by NCI HHSN261200800001E, NHLBI R24HL085343, NLM R01LM011119-01 and R01LM009239, NIH RC4MD005964, NIH NIBIB BISTI P20EB000591, and CTSA PHS Grant UL1RR025008.
