Applications of Machine Learning to Medical Imaging

  • Classification, comparison, or analysis of images is performed almost always in terms of a set of features extracted from the images. Usually this is necessary for one of the following reasons:
    Reduction of dimensionality: an 8-bit-per-pixel image of size 256x256 pixels has 256^65,536 ≈ 10^157,826 possible realisations (a worked version of this count follows these notes). Clearly, it is worthwhile to express structure within, and similarities between, images in ways that depend on fewer, higher-level representations of their pixels and relationships. It will be important to show that the reduction nevertheless preserves information important to the task.
    Incorporation of cues from human perception. Much is known about the effects of basic stimuli on the visual system. In many situations, we have considerable insight into how humans analyse images (essential in the training of radiologists and photo interpreters). Use of the right kinds of features would allow that experience to be incorporated into automated analysis.
    Transcending the limits of human perception. Though we can very easily understand many kinds of images, there are properties of images (e.g. some textures) that we cannot perceive visually but which could be useful in characterising them. Features can be constructed from various manipulations of the images that make those properties evident.
    Need for invariance. The meaning and the utility of an image are often unchanged when the image is perturbed in various ways. Changes in one or more of scale, location, brightness, and orientation, for example, as well as the presence of noise, artefacts, and intrinsic variation, are image alterations to which well-designed features are wholly or partially invariant.
  • Show how outlines can also be different. Explain that, for the same nodule, the slice containing the largest cross-section can differ between radiologists. Start talking about calculating image features of a nodule, then go to the next slide.
  • Talk more about interpretation (and interpretation variability) of the separate semantic characteristics, then move to the next two slides to show a specific example.
  • Describe the four types of features used in the study. Explain how the features are mapped to the semantic characteristics. Describe the vector representation of a nodule after the mapping is done, {c1…c7, f1…f64}, as input for the automatic interpretation algorithm.
  • Present the results. Show that both approaches improved the accuracy for all semantic characteristics in comparison with the decision trees. Mention that the differences in accuracy between the two approaches are not significant except for lobulation. Depending on how much time is left, talk about further work (what we are doing right now): either show and explain the next slide, or list what we have tried to do.
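A worked version of the pixel-count arithmetic referenced in the dimensionality note above (my check of the figure quoted on the slide, not part of the original notes):

```latex
\begin{align*}
256^{256 \times 256} &= 256^{65{,}536} = \left(10^{\log_{10} 256}\right)^{65{,}536} \\
  &= 10^{65{,}536 \,\log_{10} 256} \approx 10^{65{,}536 \times 2.40824} \approx 10^{157{,}826}.
\end{align*}
```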

    1. Applications of Machine Learning to Medical Imaging. Daniela S. Raicu, PhD, Associate Professor, CDM, DePaul University. Email: draicu@cs.depaul.edu. Lab URL: http://facweb.cs.depaul.edu/research/vc/
    2. About me… • BS in Mathematics from University of Bucharest, Romania • MS in CS from Wayne State University, Michigan • PhD in CS from Oakland University, Michigan
    3. My dissertation work • Research areas: Data Mining & Computer Vision • Dissertation topic: Content-based image retrieval • Research hypothesis: "A picture is worth thousands of words…" • "There is enough information in the image content to perform image retrieval whose similarity results correspond to the human-perceived similarity."
    4. My dissertation work (cont.) • Research hypothesis: "There is enough information in the image content to perform image retrieval whose similarity results correspond to the human-perceived similarity." • Methodology: 1) extract color image features, 2) define color-based similarity, 3) cluster images based on color, 4) retrieve similar images • Output: color-based CBIR for general-purpose image datasets • Proof of hypothesis: Google similar images: http://similar-images.googlelabs. /
    5. Towards an academic career • Assistant Professor at DePaul, 2002-2008 • Associate Professor, 2008-present • Teaching areas & research interests: data analysis, data mining, image processing, computer vision & medical informatics • Co-director of the Intelligent Multimedia Processing, Medical Informatics lab & the NSF REU Program in Medical Informatics
    6. Outline • Part I: Introduction to Medical Informatics: Medical Informatics; Clinical Decision Making; Imaging Modalities and Medical Imaging; Basic Concepts in Image Processing • Part II: Advances in Medical Imaging Research: Computer-Aided Diagnosis; Computer-Aided Diagnostic Characterization; Texture-based Classification; Content-based Image Retrieval
    7. Medical informatics research • What is medical informatics? Medical informatics is the application of computers, communications and information technology and systems to all fields of medicine: medical care, medical education, and medical research. (MF Collen, MEDINFO '80, Tokyo)
    8. What is medical informatics? Medical informatics is the branch of science concerned with the use of computers and communication technology to acquire, store, analyze, communicate, and display medical information and knowledge to facilitate understanding and improve the accuracy, timeliness, and reliability of decision-making. (Warner, Sorenson and Bouhaddou, Knowledge Engineering in Health Informatics, 1997)
    9. Clinical decision making • Making sound clinical decisions requires the right information, at the right time, in the right format • Clinicians face a surplus of information: ambiguous, incomplete, or poorly organized • Rising tide of information: expanding knowledge sources; 40K new biomedical articles per month; publicly accessible online health information; hundreds of pictures per scan for one patient
    10. Clinical decision making: What is the problem? • Man is an imperfect data processor: we are sensitive to the quantity and organization of information • Army officers and pilots commit 'fatal errors' when given too many, too few, or poorly organized data • The same is true for clinicians who 'watch' for events • Clinicians are particularly susceptible to errors of omission
    11. Clinical decision making: What is the problem? • Humans are "non-perfectable" data processors: better performance requires more time to process • Irony: clinicians increasingly face productivity expectations and increasing administrative tasks
    12. Subdomains of medical informatics (by Wikipedia) • imaging informatics • clinical informatics • nursing informatics • consumer health informatics • public health informatics • dental informatics • clinical research informatics • bioinformatics • pharmacy informatics
    13. What is medical imaging (MI)? The study of medical imaging is concerned with the interaction of all forms of radiation with tissue and the development of appropriate technology to extract clinically useful information (usually displayed in an image format) from observation of this interaction. Sources of images: • Structural/anatomical information (CT, MRI, US): within each elemental volume, tissue-differentiating properties are measured. • Information about function (PET, SPECT, fMRI).
    14. Examples of medical images
    15. The imaging "chain" (pipeline diagram): signal acquisition → "raw data" → filtering → reconstruction → processing → analysis → quantitative output
    16. Image analysis: Turning an image into data • User-extracted qualitative features • User-extracted quantitative features • Semi-automated • Automated. (Diagram: features recorded at the exam level and at the finding level.)
    17. Major advances in medical imaging • Image Segmentation • Image Classification • Computer-Aided Diagnosis Systems • Computer-Aided Diagnostic Characterization • Content-based Image Retrieval • Image Annotation. These major advances can play a major role in early detection, diagnosis, and computerized treatment planning in cancer radiation therapy.
    18. Computer-Aided Diagnosis • Computer-aided diagnosis (CAD) is a diagnosis made by a radiologist when the output of computerized image analysis methods has been incorporated into his or her medical decision-making process. • CAD may be interpreted broadly to incorporate both the detection task (finding the abnormality) and the classification task (the likelihood that the abnormality represents a malignancy).
    19. Motivation for CAD systems • The amount of image data acquired during a CT scan is becoming overwhelming for human vision, and the overload of image data for interpretation may result in oversight errors. Computer-aided diagnosis for: • Breast cancer • Lung cancer: a thoracic CT scan generates about 240 section images for radiologists to interpret. • Colon cancer: CT colonography (virtual colonoscopy) is being examined as a potential screening device (400-700 images).
    20. CAD for Breast Cancer • A mammogram is an X-ray of breast tissue used as a screening tool to search for cancer when there are no symptoms of anything being wrong. A mammogram detects lumps, changes in breast tissue, or calcifications when they are too small to be found in a physical exam. • Abnormal tissue shows up as dense white on mammograms. • The left scan shows a normal breast, while the right one shows malignant calcifications.
    21. CAD for Lung Cancer • Identification of lung nodules in a thoracic CT scan; the identification is complicated by the blood vessels. • Once a nodule has been detected, it may be quantitatively analyzed as follows: • classification of the nodule as benign or malignant • evaluation of the temporal change in the nodule size.
    22. CAD for Colon Cancer • Virtual colonoscopy (CT colonography) is a minimally invasive imaging technique that combines volumetrically acquired helical CT data with advanced graphical software to create two- and three-dimensional views of the colon. Three-dimensional endoluminal view of the colon showing the appearance of normal haustral folds and a small rounded polyp.
    23. Role of Image Analysis & Machine Learning for CAD • An overall scheme for computer-aided diagnosis systems
    24. SoC Medical imaging research projects • 1. Computer-aided characterization for lung nodules. Goal: establish the link between computer-based image features of lung nodules in CT scans and visual descriptors defined by human experts (semantic concepts), for automatic interpretation of lung nodules. Example: this lung nodule has a "solid" texture and a "sharp" margin.
    25. Why computer-aided characterization? Ratings and boundaries across radiologists are different!
       Reader 1: Lobulation = 4, Malignancy = 5 "highly suspicious", Sphericity = 2
       Reader 2: Lobulation = 1 "marked", Malignancy = 5 "highly suspicious", Sphericity = 4
       Reader 3: Lobulation = 2, Malignancy = 5 "highly suspicious", Sphericity = 5 "round"
       Reader 4: Lobulation = 5 "none", Malignancy = 5 "highly suspicious", Sphericity = 3 "ovoid"
    26. Computer-aided characterization • Research hypothesis: "The working hypothesis is that certain radiologists' assessments can be mapped to the most important low-level image features." • Methodology: new semi-supervised probabilistic learning approaches that will deal with both the inter-observer variability and the small set of labeled data (annotated lung nodules). Our proposed learning approach will be based on an ensemble of classifiers (instead of a single classifier, as with most CAD systems) built to emulate the LIDC ensemble (panel) of radiologists.
    27. Computer-aided characterization (cont.) • Expected outcome: an optimal set of quantitative diagnostic features linked to the visual descriptors (semantic concepts). • Significance: the derived mappings can serve to show the computer interpretation of the corresponding radiologist rating in terms of a set of standard and objective image features, automatically annotate new images, and augment the lung nodule retrieval results with their probabilistic diagnostic interpretations.
    28. Computer-aided characterization • Preliminary results • NIH Lung Image Database Consortium (LIDC): 149 distinct nodules from about 85 cases/patients; four radiologists marked the nodules using 9 semantic characteristics on a scale from 1 to 5, except for calcification (1 to 6) and internal structure (1 to 4).
    29. Computer-aided characterization • LIDC high-level concepts & ratings:
       Calcification: 1. Popcorn, 2. Laminated, 3. Solid, 4. Non-central, 5. Central, 6. Absent
       Internal structure: 1. Soft Tissue, 2. Fluid, 3. Fat, 4. Air
       Lobulation: 1. Marked … 5. None
       Malignancy: 1. Highly Unlikely, 2. Moderately Unlikely, 3. Indeterminate, 4. Moderately Suspicious, 5. Highly Suspicious
       Margin: 1. Poorly Defined … 5. Sharp
       Sphericity: 1. Linear, 3. Ovoid, 5. Round
       Spiculation: 1. Marked … 5. None
       Subtlety: 1. Extremely Subtle, 2. Moderately Subtle, 3. Fairly Subtle, 4. Moderately Obvious, 5. Obvious
       Texture: 1. Non-Solid, 3. Part Solid (Mixed), … (1-5 scale)
    30. Computer-aided characterization • Low-level image features:
       Shape features: Circularity, Roughness, Elongation, Compactness, Eccentricity, Solidity, Extent, RadialDistanceSD
       Size features: Area, ConvexArea, Perimeter, ConvexPerimeter, EquivDiameter, MajorAxisLength, MinorAxisLength
       Intensity features: MinIntensity, MaxIntensity, SDIntensity, MinIntensityBG, MaxIntensityBG, MeanIntensityBG, SDIntensityBG, IntensityDifference
       Texture features: 11 Haralick features calculated from co-occurrence matrices, 24 Gabor features, 5 Markov Random Field features
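To make the feature table above concrete, here is a minimal, self-contained sketch (my illustration, not the study's code) of how a few of the shape and size features could be computed from a binary nodule mask; the exact definitions used in the LIDC work may differ.

```python
import numpy as np

def nodule_features(mask: np.ndarray) -> dict:
    """Illustrative shape/size features from a 2-D binary nodule mask."""
    mask = mask.astype(bool)
    area = int(mask.sum())

    # Perimeter: foreground pixels with at least one 4-connected background neighbour.
    padded = np.pad(mask, 1)
    interior = (padded[1:-1, 1:-1] & padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = area - int(interior.sum())

    # Bounding-box extent and elongation from the pixel-coordinate covariance matrix.
    ys, xs = np.nonzero(mask)
    bbox_area = (ys.max() - ys.min() + 1) * (xs.max() - xs.min() + 1)
    eigvals = np.sort(np.linalg.eigvalsh(np.cov(np.vstack([ys, xs]))))
    elongation = float(np.sqrt(eigvals[1] / max(eigvals[0], 1e-9)))

    return {
        "Area": area,
        "Perimeter": perimeter,
        "EquivDiameter": float(np.sqrt(4.0 * area / np.pi)),
        "Extent": area / float(bbox_area),
        "Circularity": 4.0 * np.pi * area / float(perimeter ** 2),
        "Elongation": elongation,
    }

# Example: a synthetic circular "nodule" of radius 10 in a 64x64 mask
yy, xx = np.mgrid[:64, :64]
print(nodule_features((yy - 32) ** 2 + (xx - 32) ** 2 <= 10 ** 2))
```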
    31. Computer-aided characterization • Accuracy results (per semantic characteristic):
       Characteristic | Decision trees | + instances predicted with high confidence (60%) | + high-confidence (60%) and low-margin (5%) instances
       Lobulation     | 27.44% | 81.00% | 69.66%
       Malignancy     | 42.22% | 96.31% | 96.31%
       Margin         | 35.36% | 98.68% | 96.83%
       Sphericity     | 36.15% | 91.03% | 90.24%
       Spiculation    | 36.15% | 63.06% | 58.84%
       Subtlety       | 38.79% | 93.14% | 92.88%
       Texture        | 53.56% | 97.10% | 97.36%
       Average        | 38.52% | 88.62% | 86.02%
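The two right-hand columns correspond to a self-training style scheme in which unlabeled nodules whose predicted class probability clears a threshold (60% above) are added, with their pseudo-labels, to the training set before retraining. A minimal sketch of that idea with a scikit-learn decision tree follows; it is illustrative only and does not reproduce the study's ensemble or its low-margin variant.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def self_train(X_labeled, y_labeled, X_unlabeled, threshold=0.60, rounds=5):
    """Iteratively add confidently predicted unlabeled instances (pseudo-labels)."""
    X_train, y_train = X_labeled.copy(), y_labeled.copy()
    pool = X_unlabeled.copy()
    clf = DecisionTreeClassifier(random_state=0)
    for _ in range(rounds):
        clf.fit(X_train, y_train)
        if len(pool) == 0:
            break
        proba = clf.predict_proba(pool)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        # Move confident instances, with their predicted labels, into the training set.
        X_train = np.vstack([X_train, pool[confident]])
        y_train = np.concatenate([y_train, clf.classes_[proba[confident].argmax(axis=1)]])
        pool = pool[~confident]
    return clf

# Tiny synthetic example: 40 labeled and 200 unlabeled 64-dimensional feature vectors
rng = np.random.default_rng(0)
X_lab = rng.normal(size=(40, 64))
y_lab = (X_lab[:, 0] > 0).astype(int)
model = self_train(X_lab, y_lab, rng.normal(size=(200, 64)))
```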
    32. Computer-aided characterization • Challenges • Small number of training samples and large number of features: the "curse of dimensionality" problem • Nodule size • Variation in the nodules' boundaries • Different types of imaging acquisition parameters • Clinical evaluation: observer performance studies require collaboration with medical schools or hospitals
    33. SoC Medical imaging research projects • 2. Texture-based pixel classification: tissue segmentation; context-sensitive tools for radiology reporting. (Pipeline diagram: pixel-level texture extraction → pixel-level classification → organ segmentation; each pixel's texture feature vector {d_1, d_2, …, d_k} is mapped to a tissue label.)
    34. Texture-based Pixel Classification • Texture feature extraction: consider the texture around the pixel of interest (its neighborhood). • Capture the texture characteristics based on an estimate of the joint conditional probability of pixel-pair occurrences, P_ij(d, θ), where P_ij denotes the normalized co-occurrence matrix specified by the displacement vector d and angle θ.
    35. Haralick Texture Features
    36. Haralick Texture Features
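A minimal sketch of the co-occurrence computation described on slide 34, using scikit-image's graycomatrix/graycoprops; only a few Haralick-style properties are shown, and the window size, distance, and angles are arbitrary choices for illustration.

```python
import numpy as np
# Requires scikit-image >= 0.19; earlier releases spell these greycomatrix/greycoprops.
from skimage.feature import graycomatrix, graycoprops

def pixel_texture_features(image: np.ndarray, i: int, j: int, window: int = 5) -> dict:
    """Co-occurrence based texture features for the neighborhood of pixel (i, j).

    `image` is assumed to be a 2-D uint8 array (e.g. a rescaled CT slice)."""
    half = window // 2
    patch = image[i - half:i + half + 1, j - half:j + half + 1]
    # Normalized co-occurrence matrices P(d, theta) for d = 1 and four angles.
    glcm = graycomatrix(patch, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    # Average each property over the four angles.
    return {prop: float(graycoprops(glcm, prop).mean())
            for prop in ("energy", "contrast", "homogeneity")}

# Example on a synthetic noise image
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
print(pixel_texture_features(img, 32, 32))
```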
    37. Examples of Texture Images • Texture images: original image, energy, and cluster tendency, respectively. (M. Kalinin, D. S. Raicu, J. D. Furst, D. S. Channin, "A Classification Approach for Anatomical Regions Segmentation", The IEEE International Conference on Image Processing (ICIP), Genoa, Italy, September 11-14, 2005.)
    38. Texture Classification of Tissues in CT Chest/Abdomen • Example of liver segmentation (J. D. Furst, R. Susomboon, and D. S. Raicu, "Single Organ Segmentation Filters for Multiple Organ Segmentation", IEEE 2006 International Conference of the Engineering in Medicine and Biology Society (EMBS'06)). (Figure panels: original image; initial seed at 90%; split & merge at 85%; split & merge at 80%; region growing at 70%; region growing at 60%; segmentation result.)
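The growing step in that pipeline can be illustrated with a generic intensity-based region grower; this is a simplified sketch of the general technique, not the classifier-driven method from the cited paper, and the tolerance parameter is a placeholder.

```python
import numpy as np
from collections import deque

def region_grow(image: np.ndarray, seed: tuple, tol: float) -> np.ndarray:
    """Grow a region from `seed`, accepting 4-connected pixels whose intensity
    stays within `tol` of the running region mean."""
    h, w = image.shape
    grown = np.zeros((h, w), dtype=bool)
    grown[seed] = True
    region_sum, region_n = float(image[seed]), 1
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not grown[ny, nx]:
                if abs(float(image[ny, nx]) - region_sum / region_n) <= tol:
                    grown[ny, nx] = True
                    region_sum += float(image[ny, nx])
                    region_n += 1
                    queue.append((ny, nx))
    return grown

# Example: grow inside a bright square on a dark background
img = np.zeros((64, 64))
img[20:40, 20:40] = 100.0
mask = region_grow(img, seed=(30, 30), tol=10.0)
print(mask.sum())  # ~400 pixels, i.e. the bright square
```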
    39. Classification models: challenges • (a) Optimal selection of an adequate set of textural features is a challenge, especially with the limited data we often have to deal with in clinical problems. Consequently, the effectiveness of any classification system will always be conditional on two things: (i) how well the selected features describe the tissues, and (ii) how well the study group reflects the overall target patient population for the corresponding diagnosis.
    40. Classification models: challenges • (b) how other types of information can be incorporated into the classification models: metadata; image features from other imaging modalities (need for image fusion) • (c) how stable and general the classification models are
    41. Content-based medical image retrieval (CBMS) systems • Definition of content-based image retrieval: content-based image retrieval is a technique for retrieving images on the basis of automatically derived image features such as texture and shape. • Applications of content-based image retrieval: teaching, research, diagnosis, PACS and electronic patient records
    42. Diagram of a CBIR system: query image → feature extraction → image features [D1, D2, …, Dn] → similarity retrieval against the image database → query results → user evaluation → feedback algorithm. Demo: http://viper.unige.ch/~muellerh/demoCLEFmed/index.php
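The similarity-retrieval box in the diagram reduces to a nearest-neighbour search over stored feature vectors; a minimal sketch follows (Euclidean distance and random features are placeholder choices, not the metric behind the demo system above).

```python
import numpy as np

def retrieve_similar(query_features: np.ndarray, database: np.ndarray, k: int = 5):
    """Return indices and distances of the k database images whose feature
    vectors [D1..Dn] are closest to the query (Euclidean distance)."""
    dists = np.linalg.norm(database - query_features, axis=1)
    order = np.argsort(dists)[:k]
    return order, dists[order]

# Example with random 64-dimensional feature vectors for 1,000 database images
rng = np.random.default_rng(0)
db = rng.normal(size=(1000, 64))
idx, d = retrieve_similar(db[42], db, k=5)
print(idx, d)  # the query itself (index 42, distance 0) comes back first
```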
    43. CBIR as a Diagnosis Aid • An image retrieval system can help when the diagnosis depends strongly on direct visual properties of images, in the context of evidence-based medicine or case-based reasoning.
    44. CBIR as a Teaching Tool • An image retrieval system will allow students and teachers to browse available data themselves in an easy and straightforward fashion by clicking on "show me similar images". Advantages: stimulate self-learning and comparison of similar cases; find optimal cases for teaching. Teaching files: • Casimage: http://www.casimage.com • myPACS: http://www.mypacs.net
    45. 45. CBIR as a Research Tool Image retrieval systems can be used: • to complement text-based retrieval methods • for visual knowledge management whereby the images and associated textual data can be analyzed together • multimedia data mining can be applied to learn the unknown links between visual features and diagnosis or other patient information • for quality control to find images that might have been misclassified
    46. 46. CBIR as a tool for lookup and reference in CT chest/abdomen • Case Study: lung nodules retrieval – Lung Imaging Database Resource for Imaging Research http://imaging.cancer.gov/programsandresources/InformationSystems – 29 cases, 5,756 DICOM images/slices, 1,143 nodule images – 4 radiologists annotated the images using 9 nodule characteristics: calcification, internal structure, lobulation, malignancy, margin, sphericity, spiculation, subtlety, and texture • Goals: – Retrieve nodules based on image features: • Texture, Shape, and Size – Find the correlations between the image features and the radiologists’ annotations
    47. 47. Choose a nodule
    48. Choose an image feature & a similarity measure (M. Lam, T. Disney, M. Pham, D. Raicu, J. Furst, "Content-Based Image Retrieval for Pulmonary Computed Tomography Nodule Images", SPIE Medical Imaging Conference, San Diego, CA, February 2007)
    49. 49. Retrieved Images
    50. CBIR systems: challenges • Type of features: image features (texture features: statistical, structural, model- and filter-based; shape features); textual features (such as physician annotations) • Similarity measures: point-based and distribution-based metrics • Retrieval performance: precision and recall; clinical evaluation
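Precision and recall, the retrieval-performance measures named on the slide above, can be computed for a ranked result list as sketched below (a generic illustration, not the evaluation protocol used in these projects).

```python
def precision_recall_at_k(retrieved_ids, relevant_ids, k):
    """Precision and recall of the top-k retrieved images against a relevant set."""
    top_k = list(retrieved_ids)[:k]
    relevant = set(relevant_ids)
    hits = sum(1 for image_id in top_k if image_id in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Example: 3 of the top-5 results are relevant, out of 4 relevant images overall
print(precision_recall_at_k([7, 3, 9, 1, 5], {3, 9, 1, 8}, k=5))  # (0.6, 0.75)
```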
    51. Questions?
