Keynote talk at HP-MICCAI / MICCAI-DCI 2011, The Joint Workshop on High Performance and Distributed Computing for Medical Imaging, featured at MICCAI 2011, held on September 22nd 2011 in Toronto, Canada
On May 9, Dr. David Gutman delivered a presentation titled "Development and Validation of Radiology Descriptors in Gliomas." Researchers at Emory University, in collaboration with investigators at the University of Virginia, Henry Ford Hospital, and Thomas Jefferson Hospital, have been working to develop the Visually Accessible Rembrandt Images (VASARI) feature set, a standardized set of qualitative imaging features used to describe high-grade gliomas.
Presented during the 34th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC'12). Part of the workshop 'New Models and Modes for Data Sharing: Experiences from Neuroscience'. Presented by Jeffrey S. Grethe, Ph.D. from the Center for Research in Biological Systems at the University of California, San Diego.
This workshop featured several large-scale efforts to establish data sharing platforms, standards and tools to promote data-intensive analysis in the neurosciences. As we head into the second decade of the 21st century, many scientists realize that current methods for publishing and accessing data are outmoded and inefficient. Neuroscience, with its large, diverse and highly competitive community, has been slow to adopt more open sharing of data and has lacked effective tools to do so. There has been a significant investment in databases and tools for biological science, and frequent calls for more of them, but few calls to the biological community to adopt practices and frameworks for making their resources more easily discoverable and data more accessible. Data are contained within diverse sources, from web pages, databases and literature to personal lab systems, making for a haphazard mechanism for data and tool discovery. Although these mechanisms are effective for small communities, they are parochial for the totality of resources available, leading to fragmentation in the resource ecosystem. Neuroscience, with its diverse subdisciplines, complex data types and broad domain, presents the perfect exemplar of the current practices, bottlenecks and issues surrounding open access to data. This situation is changing, however, as groups have started to work together to define new models and tools for sharing and analyzing neuroscience data on an international scale. In this workshop, we bring together experts from national and international projects to discuss issues of data access and progress towards establishing platforms and best practices for effective sharing of neuroscience data in support of basic and clinical neuroscience.
AN EFFICIENT WAVELET BASED FEATURE REDUCTION AND CLASSIFICATION TECHNIQUE FOR...ijcseit
This research paper proposes an improved feature reduction and classification technique to identify mild and severe dementia from brain MRI data. Manual interpretation of changes in brain volume, based on visual examination by a radiologist or physician, may lead to missed diagnoses when a large number of MRIs are analyzed. To avoid this human error, an automated intelligent classification system is proposed that classifies brain MRI after identifying abnormal MRI volume, for the diagnosis of dementia. In this work, advanced classification techniques using Support Vector Machines (SVM) based on Particle Swarm Optimisation (PSO) and a Genetic Algorithm (GA) are compared, and feature reduction by wavelets and by PCA is analysed. The analysis shows that the proposed PSO-based SVM is more efficient than the SVM trained with GA, and that wavelet-based feature reduction yields better results than PCA.
Brain Image Fusion using DWT and Laplacian Pyramid Approach and Tumor Detecti...INFOGAIN PUBLICATION
Image fusion is the process of combining important information from two or more images into a single image. The resulting image will be more enhanced than any of the input images. The idea of combining multiple image modalities to furnish a single, more enhanced image is well established, and special fusion methods have been proposed in the literature. This paper is based on image fusion using Laplacian pyramid and Discrete Wavelet Transform (DWT) methods. The system uses an easy and effective algorithm for multi-focus image fusion, which uses fusion rules to create the fused image. The fused image is then obtained by applying the inverse discrete wavelet transform. After the fused image is obtained, a watershed segmentation algorithm is applied to detect the tumor region in the fused image.
EMG Diagnosis using Neural Network Classifier with Time Domain and AR FeaturesIDES Editor
The shapes of motor unit action potentials (MUAPs) in an electromyographic (EMG) signal provide an important source of information for the diagnosis of neuromuscular disorders. To extract this information from EMG signals, the first step is identification of the MUAPs that compose the EMG signal, the second step is clustering of MUAPs with similar shapes, the third step is extraction of the features of the MUAP clusters and the last step is classification of the MUAPs. In this work, the MUAPs are identified using a data-driven segmentation algorithm, and a statistical pattern recognition technique is used to cluster them. This is followed by extraction of time domain and autoregressive (AR) features of the MUAP clusters. Finally, a neural network (NN) classifier is used to classify the MUAPs. A total of 12 EMG signals obtained from 3 normal (NOR), 5 myopathic (MYO) and 4 motor neuron diseased (MND) subjects were analyzed. The success rate is 95.90% for the segmentation technique and 93.13% for the statistical technique. The classification accuracy of the NN is 66.72% with time domain parameters and 75.06% with AR parameters.
Tumor Detection Based On Symmetry InformationIJERA Editor
Paired subjects are usually not identical; some asymmetry is perfectly normal, but sometimes the asymmetry is too pronounced. Structural and functional asymmetry in the human brain and nervous system is reviewed in a historical perspective. Brain asymmetry is one such example: a difference in size or shape, or both. Asymmetry analysis of the brain is of great importance because it is not only an indicator of brain cancer but can also predict future risk of it. In our work, we have concentrated on segmenting the anatomical regions of the brain, isolating the two halves of the brain and investigating each half for asymmetry of anatomical regions in MRI.
Artificial Intelligence in High Content Screening and Cervical Cancer DiagnosisUniversity of Zurich
Artificial Intelligence (AI) studies and designs intelligent agents, e.g. systems that perceive their environment and take actions that maximize their chances of success. In microscopy, success is often understood as automated image analysis efficiently detecting phenotypes, as in biological screens, or retrieving diagnostically relevant statistics from images, as in medical diagnosis. During the talk I will present two applications aimed at supporting high-content screening and diagnosis of cervical cancer, in which subdomains of AI, e.g. evolutionary algorithms, neural networks and machine learning techniques, were applied.
The 2009 annual report, covering the activity of the whole Institute, is now available in two formats: in the print version and in the on-line version that can be consulted online. It is an opportunity to look back over an eventful year and to share this document which is both important and at the same time enjoyable to read.
SUITABLE MOTHER WAVELET SELECTION FOR EEG SIGNALS ANALYSIS: FREQUENCY BANDS D...sipij
Wavelet transform (WT) is a powerful modern tool for time-frequency analysis of non-stationary signals such as the electroencephalogram (EEG). The aim of this study is to choose the most suitable mother wavelet function (MWT) for analyzing normal, seizure-free and seizure EEG signals. Several MWTs can be used, but the best MWT is the one that conserves the quasi-totality of the information of the original signal in the wavelet coefficients and gathers more EEG rhythms in terms of frequency. In this study, the Daubechies, Symlets and Coiflets orthogonal families were used as basis mother wavelet functions. The percentage root-mean-square difference (PRD), the signal-to-noise ratio (SNR) and the simulated frequencies were used as the selection metrics. Simulation results indicate the Daubechies wavelet at level 4 (Db4) as the most suitable MWT for EEG frequency band decomposition. Furthermore, due to the redundancy of the extracted features, linear discriminant analysis (LDA) is applied for feature selection. A scatter plot showed that the selected feature vector represents the amount of change in the frequency distribution and carries most of the discriminative and representative information about the classes. This study can thus provide a reference for the selection of a suitable MWT and discriminative features.
Image Processing Technique for Brain Abnormality DetectionCSCJournals
Medical imaging is expensive and highly sophisticated because of proprietary software and the expert personnel required. This paper introduces an inexpensive, user-friendly, general-purpose image processing and visualization tool, designed in MATLAB, to detect many brain disorders as early as possible. The application provides clinical and quantitative analysis of medical images. Minute structural differences in the brain gradually result in major disorders such as schizophrenia, epilepsy, inherited speech and language disorders, Alzheimer's dementia, etc. Here the main focus is on diagnosing disease related to the brain and its psychic nature (Alzheimer’s disease).
Pathomics Based Biomarkers and Precision MedicineJoel Saltz
Role of Digital Pathology Data Science (Pathomics) in precision medicine. Features from billions or trillions of objects segmented from digital Pathology data can be employed to predict patient outcome and steer treatment.
Presentation at Imaging 2020, Jackson Hole, WY September 2016
Machine Learning and Deep Contemplation of DataJoel Saltz
Spatio temporal data analytics - Generation of Features
1) Sanity Checking and Data Cleaning, 2) Qualitative Exploration, 3) Descriptive Statistics, 4) Classification, 5) Identification of Interesting Phenomena, 6) Prediction, 7) Control and 8) Save Data for Later (Compression).
Detailed example from Precision Medicine; Pathomics, Radiomics.
Talk given to the Emory Cancer Control and Population Science Program 2/17/2011 Describing Biomedical Informatics, Integrative Cancer Research, caBIG and CTSA
Twenty Years of Whole Slide Imaging - the Coming Phase ChangeJoel Saltz
I surveyed the development of Digital Pathology methodology beginning with the 1997 virtual microscope prototype at Hopkins (PMC2233368) to current tools, methods and algorithms designed to display, analyze and classify whole slide imaging data. I will describe the capabilities of current methods, describe how these methods are likely to evolve and how they will be likely to impact Pathology research and practice.
Complete Sequencing – Clifford Reid, PhD; CEO, Complete Genomics as presented at the Personalized Health Care Conference at Ohio State. Dr. Reid discussed what complete human sequencing looks like and costs now and in the near future.
During the past few years, brain tumor segmentation in CT has become an emergent research area in the field of medical imaging. Brain tumor detection helps in finding the exact size and location of a tumor. An efficient algorithm is proposed in this project for tumor detection based on segmentation and morphological operators. First, the quality of the scanned image is enhanced, and then morphological operators are applied to detect the tumor in the scanned image. The problem with biopsy is that the patient has to be hospitalized and that around 15% of results are false negatives. Scan images are read by a radiologist, but this is a subjective analysis that requires considerable experience. In the proposed work we segment the renal region and then classify the tumors as benign or malignant using ANFIS, a non-invasive automated process. This approach reduces the waiting time of the patient.
BIOBASE, the leader in data annotation and curation for genomics, took part in the Genome Informatics Alliance 2012: Logistics meeting in Oregon, and had an opportunity to present on trends in annotation of genomic data.
A new Standard Uptake Values (SUV) Calculation based on Pixel Intensity ValuesPawitra Masa-ah
S. Soongsathitanon, et al., "A new Standard Uptake Values (SUV)
Calculation based on Pixel Intensity Values," INTERNATIONAL JOURNAL of MATHEMATICS AND COMPUTERS IN SIMULATION , NAUN, Issue 1, Volume 6, pp 26-33, 2012 .
http://www.naun.org/journals/mcs/
Presentation given by Appistry's Vice President of Product Strategy, Sultan Meghi at the World Genome Data Analysis Summit. Meghi presented about the big data challenges facing labs as they strive to manage the flow of genetic data from sequencer to the clinic.
dkNET Webinar: The Human BioMolecular Atlas Program (HuBMAP) 10/14/2022dkNET
Abstract
HuBMAP aims to catalyze the development of an open, global framework for comprehensively mapping the human body at cellular resolution. HuBMAP goals include: (1) Accelerate the development of the next generation of tools and techniques for constructing high resolution spatial tissue maps. (2) Generate foundational 3D tissue atlases. (3) Establish an open data platform. (4) Coordinate and collaborate with other funding agencies, programs, and the biomedical research community. (5) Support projects that demonstrate the value of the resources developed by the program. The HuBMAP Portal can be found at https://portal.hubmapconsortium.org and the Visible Human MOOC describes the compilation and coverage of HuBMAP data, demonstrates new single-cell analysis and mapping techniques, and introduces major features of the HuBMAP portal.
The top 3 key questions that HuBMAP can answer:
1. What assay types are best to map the human body in 3D and across scales?
2. What Common Coordinate Framework (CCF) is best to construct the Human Reference Atlas?
3. How can others help construct and/or use the Human Reference Atlas?
Presenters:
Katy Börner, PhD, Victor H. Yngve Distinguished Professor of Engineering and Information Science, Department of Intelligent Systems Engineering and Information Science, Indiana University
Jeffrey Spraggins, PhD, Assistant Professor, Department of Cell and Developmental Biology, Vanderbilt University
Upcoming webinars schedule: https://dknet.org/about/webinar
Similar to MICCAI - Workshop on High Performance and Distributed Computing for Medical Imaging - Sept 22 2011 (20)
American Association for Cancer Research Annual Meeting 2022
Analysis of images of routinely acquired tissue specimens promises to provide biomarkers that can be used to predict disease outcome and steer treatment, improve diagnostic reproducibility, and reveal new insights to further advance current human understanding of disease. The advent of AI and ubiquitous high-end computing is making it possible to carry out accurate whole slide image morphological and molecular tissue analyses at cellular and subcellular resolutions. AI methods can enable exploration and discovery of novel diagnostic biomarkers grounded in prognostically predictive spatial and molecular patterns, as well as quantitative assessments of the predictive value and reproducibility of traditional morphological patterns employed in anatomic pathology. AI methods may be adapted to help steer treatment through integrative analysis of clinical information along with Pathology, Radiology and molecular data.
Learning, Training, Classification, Common Sense and Exascale ComputingJoel Saltz
In this talk, I will describe work my group has carried out in the development of deep learning methods that target semantic segmentation and object identification tasks in terapixel Pathology datasets and in satellite data. I will describe what we have been able to achieve and how this work can generalize to additional types of problems, and will outline how exascale computing could be used to transform and integrate our methods and pipelines. I will then go on to outline a broad research program in exascale computing and deep learning that promises to identify common deep learning methods for previously disparate large and extreme scale data tasks.
Integrative Everything, Deep Learning and Streaming DataJoel Saltz
Workshop on Clusters, Clouds, and Data for Scientific Computing, September 6, 2018
The need to label information and segment regions in individual sensor data sources, and to create syntheses from multiple disparate data sources, spans many areas of science, biomedicine and technology. The rapid evolution in sensor technologies, from digital microscopes to UAVs, drives requirements in this area. I will describe a variety of use cases and technical challenges, as well as tools, algorithms and techniques developed by our group and collaborators.
Digital Pathology: Precision Medicine, Deep Learning and Computer Aided Inter...Joel Saltz
In this presentation, I will survey the development of Digital Pathology methodology beginning with the 1997 virtual microscope prototype at Hopkins to current tools, methods and algorithms designed to display, analyze and classify whole slide imaging data. I will describe methods, tools and algorithms to extract information from Pathology images. These tools include ability to traverse whole slide images, segment nuclei, carry out deep learning region classification and characterize relationship between extracted features and morphological structures. I will also describe some of the research efforts that motivate development of these tools, the role Pathomics is playing in precision medicine research as well as the impact of Pathology Informatics on clinical practice and health care quality.
Presentation at the Department of Biomedical Informatics, University Pittsburgh Medical Center, April 27, 2018
Twenty Years of Whole Slide Imaging - the Coming Phase ChangeJoel Saltz
Presentation at Pathology Visions 2017 - https://digitalpathologyassociation.org/2017-pathology-visions-agenda
I will survey the development of Digital Pathology methodology beginning with the 1997 virtual microscope prototype at Hopkins (PMC2233368) to current tools, methods and algorithms designed to display, analyze and classify whole slide imaging data. I will describe the capabilities of current methods, describe how these methods are likely to evolve and how they will be likely to impact Pathology research and practice.
Tools to Analyze Morphology and Spatially Mapped Molecular Data - Informatio...Joel Saltz
Description of NCI Information Technology for Cancer Research Project dedicated to 1) development of Digital Pathology pipelines, databases, data modeling and visualization methods, 2) support for digital pathology/Radiology/"omics" based precision medicine
Presented at Spring 2015 Information Technology for Cancer Research PI Meeting
Generation and Use of Quantitative Pathology PhenotypeJoel Saltz
Motivation, tools and methods for analysis of digital pathology imagery, integration with "omics" and Radiology, use in Precision Medicine. Presentation at the Early Detection Research Network meeting, April 2015, Atlanta GA
Spatio-temporal Sensor Integration, Analysis, Classification or Can Exascal...Joel Saltz
Presentation at Clusters, Clouds and Data for Scientific Computing 2014
Integrative analyses of large scale spatio-temporal datasets play increasingly important roles in many areas of science and engineering. Our recent work in this area is motivated by application scenarios involving complementary digital microscopy, radiology and “omic” analyses in cancer research. In these scenarios, the objective is to use a coordinated set of image analysis, feature extraction and machine learning methods to predict disease progression and to aid in targeting new therapies. I will describe tools and methods our group has developed for extraction, management, and analysis of features along with the systems software methods for optimizing execution on high end CPU/GPU platforms. Once having provided our current work as an introduction, I will then describe 1) related but much more ambitious exascale biomedical and non-biomedical use cases that also involve the complex interplay between multi-scale structure and molecular mechanism and 2) concepts and requirements for methods and tools that address these challenges.
Integrative analyses of large scale spatio-temporal datasets play increasingly important roles in many areas of science and engineering. Our recent work in this area is motivated by application scenarios involving complementary digital microscopy, Radiology and "omic" analyses in cancer research. In these scenarios, our objective is to use a coordinated set of image analysis, feature extraction and machine learning methods to predict disease progression and to aid in targeting new therapies.
We describe methods we have developed for extraction, management and analysis of features along with the systems software methods for optimizing execution on high end CPU/GPU platforms. We will also describe biomedical results obtained from these studies and extensions of the computational methods to broader application areas.
2. Squeezing Information from Spatial Datasets
Center for Comprehensive Informatics
• Leverage exascale data and computer resources to squeeze the most out of image, sensor or simulation data
• Run lots of different algorithms to derive same features
• Run lots of algorithms to derive complementary features
• Data models and data management infrastructure to manage data products, feature sets and results from classification and machine learning algorithms
3. INTEGRATIVE BIOMEDICAL INFORMATICS ANALYSIS
Reproducible anatomic/functional characterization at gross level (Radiology) and fine level (Pathology)
Integration of anatomic/functional characterization with multiple types of “omic” information
Create categories of jointly classified data to describe pathophysiology, predict prognosis, response to treatment
4. In Silico Center for
Brain Tumor Research
Specific Aims:
1. Influence of necrosis/
hypoxia on gene expression and
genetic classification.
2. Molecular correlates of high
resolution nuclear morphometry.
3. Gene expression profiles
that predict glioma progression.
4. Molecular correlates of MRI
enhancement patterns.
5. Integration of heterogeneous multiscale information
• Coordinated initiatives: Pathology, Radiology, “omics”
• Exploit synergies between all initiatives to improve ability to forecast survival & response.
[Figure: Radiology Imaging, “Omic” Data and Pathologic Features linked to Patient Outcome]
6. Example: Pathology and Gene Expression
Joint Predictors of Recurrence/Survival
Lee Cooper Carlos Moreno
8. In Silico Center for
Brain Tumor Research
Key Data Sets
REMBRANDT: Gene expression and genomics data
set of all glioma subtypes
The Cancer Genome Atlas (TCGA): Rich “omics” set
of GBM, digitized Pathology and Radiology
Pathology and Radiology Images from Henry Ford
Hospital, Emory, Thomas Jefferson U, MD Anderson
and others
10. TCGA and REMBRANDT Datasets
* Some overlap with TCGA main repository but rescanned at 40X
** Data obtained by generosity of Lisa Scarpace/Tom Mikkelsen @ HF
*** Thanks to Adam Flanders/Mark Curtis @ TJU
11. Neuroimaging/MRI Data
* Pending upload to NBIA portal
** Data obtained by efforts of Lisa Scarpace/Tom Mikkelsen @ HF
14. Requirements for High Resolution Whole Slide
Feature Characterization
• 10^10 pixels for each whole slide image
• 10 whole slide images per patient
• 10^8 image features per whole slide image
• 10,000 brain tumor patients
• 10^15 pixels
• 10^13 features
• Hundreds of algorithms
• Annotations and markups from dozens of humans
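The totals on this slide follow from multiplying the per-slide figures by the number of slides in the cohort, reading the slide's numbers as powers of ten (10^10 pixels and 10^8 features per slide). A minimal sketch of that arithmetic; the function and constant names are illustrative and not taken from the talk:

```python
# Back-of-envelope totals for the whole slide requirements listed above.
# All names here are hypothetical; only the four input figures come from the slide.
PIXELS_PER_SLIDE = 10**10
FEATURES_PER_SLIDE = 10**8
SLIDES_PER_PATIENT = 10
PATIENTS = 10_000


def cohort_totals(pixels_per_slide, features_per_slide, slides_per_patient, patients):
    """Return (total pixels, total features) across every slide in the cohort."""
    slides = slides_per_patient * patients
    return pixels_per_slide * slides, features_per_slide * slides


total_pixels, total_features = cohort_totals(
    PIXELS_PER_SLIDE, FEATURES_PER_SLIDE, SLIDES_PER_PATIENT, PATIENTS
)
print(f"{total_pixels:.0e} pixels, {total_features:.0e} features")
```

With 10 slides for each of 10,000 patients there are 10^5 slides, which gives the 10^15 pixels and 10^13 features quoted on the slide.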
15. Distinguishing Characteristic in Gliomas
Nuclear Qualities
Oligodendroglioma: round shaped with smooth regular texture
Astrocytoma: elongated with rough, irregular texture
Use image analysis algorithms to segment and classify microanatomic
features (Nuclei, Astrocytoma, Necrosis ...) in whole slide images
Represent the segmentation and classification in a well defined
structured format that can be used to correlate the pathology with
other data modalities
16. Astrocytoma vs Oligodendroglioma
Overlap in genetics, gene expression, histology
• Assess nuclear size (area and
perimeter), shape (eccentricity,
circularity, major axis, minor axis, Fourier
shape descriptor and extent ratio),
intensity (average, maximum, minimum,
standard error) and texture (entropy,
energy, skewness and kurtosis).
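A few of the size, shape and intensity features named above can be sketched from a nuclear boundary and the pixel intensities inside it. This is only an illustration under assumptions: the boundary is taken to be a simple polygon of (x, y) vertices, circularity is computed with the common definition 4πA/P², and none of the function names come from the talk or its software.

```python
import math
from statistics import mean, stdev


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths, closing the polygon back to its first vertex."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def circularity(points):
    """4*pi*A / P^2: equals 1 for a circle and drops for elongated outlines."""
    perimeter = polygon_perimeter(points)
    return 4.0 * math.pi * polygon_area(points) / (perimeter * perimeter)


def intensity_summary(values):
    """Average, maximum, minimum and standard error of pixel intensities."""
    return {
        "average": mean(values),
        "maximum": max(values),
        "minimum": min(values),
        "standard_error": stdev(values) / math.sqrt(len(values)),
    }


# Two toy outlines with the same area: a compact one and an elongated one.
compact = [(0, 0), (2, 0), (2, 2), (0, 2)]
elongated = [(0, 0), (4, 0), (4, 1), (0, 1)]
print(circularity(compact), circularity(elongated))
```

The compact outline scores higher on circularity than the elongated one, which is the kind of contrast the round versus elongated nuclear description relies on.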
17. Machine-based Classification of TCGA GBMs (J Kong)
Whole slide scans from 14 TCGA GBMs (69 slides)
7 purely astrocytic in morphology; 7 with 2+ oligo component
399,233 nuclei analyzed for astro/oligo features
Cases were categorized based on ratio of oligo/astro cells
TCGA Gene
Expression Query:
c-Met overexpression
19. Nuclear Features Used to Classify GBMs
[Figure: clustergram of selected features used in consensus clustering; axis: Feature Indices]
20. Nuclear Features Used to Classify GBMs
[Figure: consensus clustering of morphological signatures (clusters 1 to 4), with a silhouette panel]
Study includes 200 million nuclei taken from 480
slides corresponding to 167 distinct patients.
21. Survival of morphological clusters
[Figure: survival curves for Cluster 1, Cluster 2, Cluster 3 and Cluster 4; x-axis: Days (0 to 3000); y-axis: Survival (0 to 1)]
22. Survival of patients by molecular tumor subtype
[Figure: survival curves for Proneural, Neural, Classical and Mesenchymal subtypes; x-axis: Days (0 to 3000); y-axis: Survival (0 to 1)]
23. What does this mean?
Cluster 2: Gemistocytic Signature
• No association with
– Transcriptional class
– Age
• Molecular correlates:
Wnt signaling, TP53 signaling,
Oxidative phosphorylation,
mRNA processing
25. Multicore/GPU Implementation
• Nuclear/cell segmentation
• Feature extraction
• Generate clusters of nuclear/cell
Features
• Classify nuclei/cells by feature cluster
• Classify patients by populations of
classified nuclei/cells
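The last two steps above (label each nucleus by feature cluster, then describe each patient by the population of labeled nuclei) can be sketched as a per-patient histogram of cluster labels. This is a minimal illustration, assuming integer cluster labels 0..k-1 are already available for every nucleus; the function names and toy data are hypothetical.

```python
from collections import Counter


def cluster_fractions(nucleus_labels, n_clusters):
    """Fraction of a patient's nuclei that fall in each feature cluster."""
    total = len(nucleus_labels)
    if total == 0:
        return [0.0] * n_clusters
    counts = Counter(nucleus_labels)
    return [counts.get(k, 0) / total for k in range(n_clusters)]


def dominant_cluster(nucleus_labels, n_clusters):
    """Index of the cluster holding the largest share of the nuclei."""
    fractions = cluster_fractions(nucleus_labels, n_clusters)
    return max(range(n_clusters), key=fractions.__getitem__)


# Toy input: cluster label of every segmented nucleus, grouped by patient.
labels_by_patient = {
    "patient_a": [0, 0, 1, 0, 2],
    "patient_b": [1, 1, 1, 2],
}
signatures = {
    patient: cluster_fractions(labels, 3)
    for patient, labels in labels_by_patient.items()
}
print(signatures)
```

The resulting fraction vectors are one possible per-patient signature that a later clustering or classification step could consume.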
32. Scheduling for CPU/GPU System
• PRIORITY: assigns tasks to CPUs or GPUs based
on estimate of relative performance gain of each
task on GPU vs on CPU and load of the CPUs and
GPUs
• Sorted queue of (task, data element) tuples based
on the relative speedup expected for each tuple
• As more (task, data element) tuples are created,
tuples are inserted into the queue such that the
queue remains sorted
• First Come First Served: simple low overhead
policy
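A queue that stays sorted by expected speedup can be sketched with the standard library's bisect module. The slide does not say how a device picks its next tuple, so the rule used here (a GPU takes the tuple with the highest estimated speedup, a CPU takes the one with the lowest) is an assumption for illustration, as are the class and method names; load balancing between devices is left out.

```python
import bisect
import itertools


class SpeedupQueue:
    """Queue of (task, data element) tuples kept sorted by estimated GPU speedup."""

    def __init__(self):
        self._entries = []
        # Sequence numbers break ties so tasks themselves are never compared.
        self._sequence = itertools.count()

    def push(self, task, data_element, estimated_speedup):
        """Insert a tuple so that the queue remains sorted."""
        entry = (estimated_speedup, next(self._sequence), task, data_element)
        bisect.insort(self._entries, entry)

    def pop_for(self, device):
        """Assumed policy: GPUs take the best speedup, CPUs take the worst."""
        if not self._entries:
            return None
        entry = self._entries.pop() if device == "gpu" else self._entries.pop(0)
        return entry[2], entry[3]

    def __len__(self):
        return len(self._entries)


queue = SpeedupQueue()
queue.push("segment", "tile0", 1.5)
queue.push("texture_features", "tile0", 12.0)
queue.push("classify", "tile1", 4.0)
print(queue.pop_for("gpu"), queue.pop_for("cpu"), len(queue))
```

A First Come First Served policy would replace the sorted list with a plain FIFO (for example collections.deque), which avoids the insertion cost at the price of ignoring the speedup estimates.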
33. Aggregation to reduce kernel invocation and data
movement overheads
• Objects are grouped based on which image tile they
are segmented from. Each tuple consists of (objects
in image tile, feature computation operation)
• Perform group of tasks that are applied to the same
data element (or the same chunk of data elements)
• Performance Aware Task Fusion (PATF) -- groups
tasks based on GPU speedup values. Two groups of
tasks: one group containing tasks with good GPU
speedup values and the other group containing the
remaining tasks
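The two aggregation ideas above can be sketched together: objects are grouped by the image tile they were segmented from, and feature tasks are split into a high-speedup group and a remainder. The speedup threshold, the task names and the function names are assumptions made for this example; the slide only states that tasks are grouped by their GPU speedup values.

```python
from collections import defaultdict


def group_objects_by_tile(objects):
    """Map tile id -> list of object ids segmented from that tile."""
    tiles = defaultdict(list)
    for object_id, tile_id in objects:
        tiles[tile_id].append(object_id)
    return dict(tiles)


def fuse_tasks_by_speedup(task_speedups, threshold):
    """Split tasks into (good GPU speedup, remaining) groups."""
    good = [task for task, speedup in task_speedups.items() if speedup >= threshold]
    rest = [task for task, speedup in task_speedups.items() if speedup < threshold]
    return good, rest


def build_tuples(objects, task_speedups, threshold):
    """One (tile, objects in tile, fused task group) tuple per tile and group."""
    good, rest = fuse_tasks_by_speedup(task_speedups, threshold)
    tuples = []
    for tile_id, object_ids in group_objects_by_tile(objects).items():
        for group in (good, rest):
            if group:
                tuples.append((tile_id, tuple(object_ids), tuple(group)))
    return tuples


objects = [("n1", "t0"), ("n2", "t0"), ("n3", "t1")]
speedups = {"texture": 15.0, "area": 1.2, "gradient": 9.0}
work = build_tuples(objects, speedups, threshold=5.0)
print(work)
```

Each resulting tuple can be dispatched as a single unit, so one kernel launch and one data transfer cover every object in the tile for the whole task group.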
37. Multiscale Systems Biology
Employ multi-resolution methods to
characterize necrosis, angiogenesis and
correlate these with “omics”
No enhancement Rim-enhancement
Normal Vessels ? Vascular Changes
Stable lesion Rapid progression
38. Correlation of Necrosis, Angiogenesis and “omics”
• GBMs display variable and regionally heterogeneous degrees
of necrosis (asterisk) and angiogenesis
• These factors may impact gene expression profiles
39. Genes Correlated with Necrosis include Transcription Factors
Identified as Regulators of the Mesenchymal Transition
• Frozen sections from 88 GBM samples marked
to identify regions of necrosis and angiogenesis
• Extent of both necrosis and angiogenesis calculated as a
percentage of total tissue area
Gene Symbol    SAM q-value (Corrected p-value)
C/EBPB         < 0.000001
C/EBPD         < 0.000001
FOSL2          < 0.000001
STAT3          0.0047
RUNX1          0.0082
Carro MS, et al. Nature 463: 318-25, 2010
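The extent measure on this slide (marked region as a percentage of total tissue area) can be sketched with two binary masks of the same shape. This assumes the markups have already been rasterized into per-pixel masks; the function name and the toy masks are illustrative only.

```python
def percent_of_tissue(region_mask, tissue_mask):
    """Percentage of tissue pixels that also lie inside the marked region."""
    tissue_pixels = 0
    region_pixels = 0
    for region_row, tissue_row in zip(region_mask, tissue_mask):
        for in_region, in_tissue in zip(region_row, tissue_row):
            if in_tissue:
                tissue_pixels += 1
                if in_region:
                    region_pixels += 1
    if tissue_pixels == 0:
        raise ValueError("tissue mask is empty")
    return 100.0 * region_pixels / tissue_pixels


# Toy 4x4 section: the bottom row is background, not tissue.
tissue = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
necrosis = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],  # marked pixel outside tissue, so it is not counted
]
print(percent_of_tissue(necrosis, tissue))
```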
41. Defining Rich Set of Qualitative and
Quantitative Image Biomarkers
• Community-driven ontology development
project; collaboration with ASNR
• Imaging features (5 categories)
– Location of lesion
– Morphology of lesion margin (definition, thickness,
enhancement, diffusion)
– Morphology of lesion substance (enhancement, PS
characteristics, focality/multicentricity, necrosis, cysts, midline
invasion, cortical involvement, T1/FLAIR ratio)
– Alterations in vicinity of lesion (edema, edema
crossing midline, hemorrhage, pial invasion, ependymal invasion,
satellites, deep WM invasion, calvarial remodeling)
– Resection features (extent of nCE tissue, CE tissue, resected
components)
42. Imaging Predictors of survival and molecular profiles
in the TCGA Glioblastoma Data set
The TCGA glioma working group
Emory: David A Gutman (1), Lee Cooper (1), Scott N Hwang (1), Chad A Holder (1), Doris Gao (1), Carlos Moreno (1), Arun Krishnan (1), Jun Kong (1), Seena Dehkharghani (1), Joel Saltz (1), Dan Brat (1)
TJU/CBIT/NCI: Adam Flanders (3), Erich Huang (2), Robert J Clifford (2), Dima Hammoud (3), John Freymann (7), Justin Kirby (7), Carl Jaffe (6)
UVA/Northwestern: Max Wintermark (8), Manal Jilwan (8), Prashant Raghavan (8), Pat Mongkolwat (9)
Henry Ford: Lisa Scarpace (4), Tom Mikkelsen (4)
(1) Emory University Hospital, Atlanta, GA. (2) National Cancer Institute, Bethesda, MD. (3) Thomas Jefferson University Hospital, Philadelphia, PA. (4) Henry Ford University Hospital, Detroit, Michigan. (5) National Institutes of Health, Bethesda, MD. (6) Boston University School of Medicine, Boston, MA. (7) SAIC-Frederick, Inc., Frederick, MD. (8) University of Virginia, Charlottesville, VA. (9) Northwestern University, Chicago, IL.
43. Correlative Imaging Results
• Minimal enhancing tumor (≤5%)
strongly associated with Proneural
classification (p=0.0006).
• >5% proportion of necrosis associated with the
presence of microvascular hyperplasia in
pathology slides (p=0.008).
• Greater maximum tumor dimension (T2
signal) associated with
present/abundant microvascular
hyperplasia (p=0.001).
< 5% Enhancement
44. Correlative Imaging Results
• TP53 mutant tumors had a smaller mean
tumor size (p=0.002) on T2-weighted or
FLAIR images.
• EGFR mutant tumors were significantly
larger than TP53 mutant tumors
(p=0.0005).
• High level EGFR amplification was
associated with >5% enhancement and
>5% proportion of necrosis (p < 0.01).
> 5% Necrosis
46. Large Scale Data Management
Implemented with IBM DB2 for large scale
pathology image metadata (~million markups per
slide)
Represented by a complex data model capturing
multi-faceted information including markups,
annotations, algorithm provenance, specimen, etc.
Support for complex relationships and spatial
query: multi-level granularities, relationships
between markups and annotations, spatial and
nested relationships
47. Data Models to Represent Feature Sets
and Experimental Metadata
PAIS |pās| : Pathology Analytical Imaging Standards
• Provide semantically enabled data model to
support pathology analytical imaging
• Data objects, comprehensive data types, and
flexible relationships
• Object-oriented design, easily extensible
• Reuse existing standards
– Reuse relevant classes already defined in AIM
– Follow DICOM WG 26 metadata specifications on WSI
reference
– Specimen information in DICOM Supplement 122 and
caTissue
– Use caDSR for CDE and NCI Thesaurus for ontology
48. Pathology Imaging GIS
[Figure: PAIS domain model, a UML class diagram. Classes shown: Patient, Subject, Specimen, AnatomicEntity, Region, Equipment, User, Group, Project, Collection, PAIS, ImageReference, WholeSlideImageReference, TMAImageReference, MicroscopyImageReference, DICOMImageReference, AnnotationReference, Annotation, Markup, GeometricShape, Surface, Field, Observation, Calculation, Inference, Provenance]
PAIS model
Image analysis: Segmentation, Feature extraction
PAIS data management: Modeling and management of markup and annotation for
querying and sharing through parallel RDBMS + spatial DBMS
HDFS data staging, MapReduce based queries: On the fly data processing for
algorithm validation/algorithm sensitivity studies, or discovery of preliminary results
49. Workflow, Adaptivity and Quality of Service
• In-transit data processing using
filter/stream systems
• Semantic Workflows
• Hierarchical pipeline design with coarse
and fine grained components
• Adaptivity and Quality of Service
50. Computerized Classification System
for Grading Neuroblastoma
• Background Identification
• Image Decomposition (Multi-resolution levels)
• Image Segmentation (EMLDA)
• Feature Construction (2nd order statistics, Tonal Features)
• Feature Extraction (LDA) + Classification (Bayesian)
• Multi-resolution Layer Controller (Confidence Region)
[Figure: training and testing flowchart. Testing: initialize I = L for an image tile; background tiles are labeled directly; otherwise create image I(L), then segmentation, feature construction, feature extraction and classification; if the result is not within the confidence region and I > 1, set I = I - 1 and repeat. Training: training tiles, down-sampling, segmentation, feature construction, feature extraction, classifier training.]
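The multi-resolution layer controller can be sketched as a loop that starts at the coarsest level (I = L), accepts the classification if its confidence is high enough, and otherwise moves to the next finer level (I = I - 1) until level 1 is reached. This is a simplified reading of the flowchart: the scalar confidence threshold stands in for the confidence region test, and the function names and toy classifier are hypothetical.

```python
def classify_multiresolution(pyramid, classify, min_confidence):
    """Classify from coarse to fine, stopping once the result is confident.

    pyramid[0] is the finest level (level 1); pyramid[-1] is the coarsest (level L).
    classify(image, level) must return a (label, confidence) pair.
    """
    if not pyramid:
        raise ValueError("pyramid must contain at least one level")
    level = len(pyramid)  # I = L
    while True:
        label, confidence = classify(pyramid[level - 1], level)
        if confidence >= min_confidence or level == 1:
            return label, level, confidence
        level -= 1  # I = I - 1: fall back to a finer resolution


def toy_classifier(image, level):
    """Stand-in classifier whose confidence improves at finer levels."""
    confidence_by_level = {3: 0.55, 2: 0.92, 1: 0.99}
    return "class_a", confidence_by_level[level]


result = classify_multiresolution(["full", "half", "quarter"], toy_classifier, 0.90)
print(result)
```

Stopping at a coarse level whenever the decision is already confident is what saves work: the expensive full-resolution pass runs only for tiles that the cheaper levels cannot settle.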
51. Slides’ Preparation
• 64990 x 59412 pixels in full resolution
• Original Size: 10.8 GB; Compressed Size: ≈ 833 MB
[Figure: slide shown at 8x and 40x]
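The original size quoted above is consistent with an uncompressed image at 3 bytes per pixel. That pixel depth (24-bit RGB) is an assumption, since the slide does not state it, and the sizes are read here in binary units; the function name is illustrative.

```python
def uncompressed_bytes(width, height, bytes_per_pixel=3):
    """Raw image size, assuming 24-bit RGB unless told otherwise."""
    return width * height * bytes_per_pixel


raw = uncompressed_bytes(64990, 59412)
raw_gib = raw / 2**30                     # size in binary gigabytes
compression_ratio = raw / (833 * 2**20)   # against the ~833 MB compressed file
print(f"{raw_gib:.1f} GB raw, about {compression_ratio:.0f}:1 compression")
```

Under these assumptions the raw size rounds to 10.8 GB, and the compressed file is roughly 13 times smaller.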
52. Semantic Workflows (Wings)
Collaborative Work with Yolanda Gil, Mary Hall
• A systematic strategy for composing application
components into workflows
• Search for the most appropriate implementation of
both components and workflows
• Component optimization
– Select among implementation variants of the same
computation
– Derive integer values of optimization parameters
– Only search promising code variants and a restricted
parameter space
• Workflow optimization
– Knowledge-rich representation of workflow properties
58. Conclusions
• Increasingly common task: Integrative analysis of
complementary multi-resolution and “omic” data sources
• Elucidation of mechanisms, identification of medically important
disease sub-classes
• Cell level analysis of whole slide images
requires innovations in parallelization, data management
• Integration of Radiology and Pathology has promise but
poses additional challenges
• Role of quality of service/on demand computing
59. Thanks to:
• In silico center team: Dan Brat (Science PI), Tahsin Kurc, Ashish Sharma, Tony Pan, David
Gutman, Jun Kong, Sharath Cholleti, Carlos Moreno, Chad Holder, Erwin Van Meir, Daniel
Rubin, Tom Mikkelsen, Adam Flanders, Joel Saltz (Director)
• caGrid Knowledge Center: Joel Saltz, Mike Caliguiri, Steve Langella co-Directors; Tahsin
Kurc, Himanshu Rathod Emory leads
• caBIG In vivo imaging team: Eliot Siegel, Paul Mulhern, Adam Flanders, David Channon,
Daniel Rubin, Fred Prior, Larry Tarbox and many others
• In vivo imaging Emory team: Tony Pan, Ashish Sharma, Joel Saltz
• Emory ATC Supplement team: Tim Fox, Ashish Sharma, Tony Pan, Edi Schreibmann, Paul
Pantalone
• Digital Pathology R01: Foran and Saltz; Jun Kong, Sharath Cholleti, Fusheng Wang, Tony
Pan, Tahsin Kurc, Ashish Sharma, David Gutman (Emory), Wenjin Chen, Vicky Chu, Jun
Hu, Lin Yang, David J. Foran (Rutgers)
• NIH/in silico TCGA Imaging Group: Scott Hwang, Bob Clifford, Erich Huang, Dima
Hammoud, Manal Jilwan, Prashant Raghavan, Max Wintermark, David Gutman, Carlos
Moreno, Lee Cooper, John Freymann, Justin Kirby, Arun Krishnan, Seena Dehkharghani,
Carl Jaffe
• ACTSI Biomedical Informatics Program: Marc Overcash, Tim Morris, Tahsin Kurc,
Alexander Quarshie, Circe Tsui, Adam Davis, Sharon Mason, Andrew Post, Alfredo Tirado-
Ramos
• NSF Scientific Workflow Collaboration: Vijay Kumar, Yolanda Gil, Mary Hall, Ewa Deelman,
Tahsin Kurc, P. Sadayappan, Gaurang Mehta, Karan Vahi