
Ground truth generation in medical imaging: a crowdsourcing-based iterative approach


As in many other scientific domains where computer-based tools need to be evaluated, medical imaging often requires the expensive generation of manual ground truth. Some tasks require medical doctors to guarantee high-quality, valid results, whereas other tasks, such as the image modality classification described in this text, can be performed with sufficiently high quality by domain experts without medical training.



  1. Ground truth generation in medical imaging: a crowdsourcing-based iterative approach
     Antonio Foncubierta-Rodríguez, Henning Müller
  2. Introduction
     • Medical image production grows rapidly in scientific and clinical environments
     • If images are easily accessible, they can be reused for:
       • Clinical decision support
       • Training of young physicians
       • Relevant document retrieval for researchers
     • Modality classification improves the retrieval and accessibility of images
  3. Motivation and dataset
     • ImageCLEF dataset:
       • Over 300,000 images from the open-access biomedical literature
       • Over 30 modalities, hierarchically defined
     • Manual classification is expensive and time-consuming
     • How can this be done more efficiently?
  4. Classification hierarchy (diagram): over 30 modalities organized hierarchically, including radiology (CT, MRI, conventional/diagnostic ultrasound, 2D X-ray, angiography, PET, SPECT, combined), microscopy (light, transmission electron, fluorescence, phase contrast, interference, dark field), visible light photography (endoscopy, skin, gross organs, non-clinical photos), signals and waves (EEG, ECG/EKG, EMG), generic illustrations (tables and forms, program listings, statistical figures/graphs/charts, flowcharts, system overviews, gene sequences, chromatography/gel, chemical structures, math formulae, symbols), compound figures, 2D/3D reconstructions, and hand-drawn sketches
  5. Image examples (figure): sample images for compound, generic (table, figures/charts), and diagnostic (radiology ultrasound, radiology CT, microscopy fluorescence) categories
  6. Iterative workflow
     • Avoid manual classification as much as possible
     • Iterative approach:
       1. Create a small training set (manual classification into 34 categories)
       2. Use an automatic tool that learns from the training set
       3. Evaluate the results (manual classification into right/wrong)
       4. Improve the training set
       5. Repeat from step 2
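The iterative workflow above can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: `train`, `predict`, and `verify` stand in for the automatic classifier and the human verification step, and all names are assumptions.

```python
# Hypothetical sketch of the iterative ground-truth workflow.
# A small manually labelled seed set trains a classifier; in each round humans
# only verify (approve/refuse) predictions, and approved labels grow the set.

def iterative_ground_truth(images, label_seed, train, predict, verify, rounds=5):
    """images: all items; label_seed: manual labels for a small subset.
    train/predict: the automatic classifier; verify: human approve/refuse step."""
    training = dict(label_seed)            # step 1: small manual training set
    for _ in range(rounds):
        model = train(training)            # step 2: learn from the current set
        unlabeled = [i for i in images if i not in training]
        for img in unlabeled:
            label = predict(model, img)
            if verify(img, label):         # step 3: cheap binary verification
                training[img] = label      # step 4: grow the training set
    return training                        # approved labels form the ground truth
```

The key economy is in step 3: verifying a proposed label is a binary decision, which (as the later slides report) is much faster than choosing among 34 categories.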
  7. Crowdsourcing in medical imaging
     • Crowdsourcing reduces the time and cost of annotation
     • Medical image annotation is typically done by:
       • Medical doctors
       • Domain experts
     • Can unknown users provide valid annotations?
       • Quality?
       • Speed?
  8. User groups
     • Experiments were performed with three different user groups:
       • 1 medical doctor (MD)
       • 18 known experts
       • 2,470 contributors from open crowdsourcing
  9. Crowdsourcing platform
     • The CrowdFlower platform was chosen for the experiments:
       • Integrated interface for job design
       • Complete set of management tools: gold creation, internal interface, statistics, raw data
       • Hub feature: jobs can be announced in several crowdsourcing pools:
         • Amazon Mechanical Turk
         • Get Paid
         • Zoombucks
  10. Experiment: initial training set generation
      • 1,000 images
      • Limited to the 18 known experts
      • Aim: test the crowdsourcing interface
  11. Experiment: automated classification verification
      • 300,000 images
      • Binary task: approve or refuse the automatic classification
      • Aim: evaluate the speed and difficulty of the verification task
  12. Experiments: trustability
      • Aim: compare the expected accuracy of the user groups
      • 3,415 images were classified by the medical doctor
      • The two other user groups were asked to reclassify these images
      • A random subset of 1,661 images was used as the gold standard
      • Feedback on wrong classifications was given to the known experts to detect ambiguities
      • Feedback on 847 of the gold images was muted for the crowd
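Gold-standard images are the usual quality-control mechanism on platforms such as CrowdFlower: contributors are scored on images with known answers, and their judgements count only if their gold accuracy passes a threshold. The sketch below illustrates that idea; the data shapes and the 0.7 threshold are assumptions, not values from the paper.

```python
# Minimal sketch (assumed data shapes) of gold-standard quality control:
# score each contributor on the hidden gold images, and keep their
# judgements only if their gold accuracy passes a chosen threshold.

def gold_accuracy(judgements, gold):
    """judgements: {image_id: label} from one contributor; gold: {image_id: true_label}."""
    scored = [img for img in judgements if img in gold]
    if not scored:
        return None                      # contributor never saw a gold image
    correct = sum(judgements[img] == gold[img] for img in scored)
    return correct / len(scored)

def trusted(judgements, gold, threshold=0.7):
    """True if the contributor's gold accuracy reaches the threshold."""
    acc = gold_accuracy(judgements, gold)
    return acc is not None and acc >= threshold
```

Muting the feedback on part of the gold set, as done for the crowd, keeps contributors from learning which images are gold and gaming the check.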
  13. Results: user self-assessment
      • Users were asked to state how sure they were of their choice
      • This allows discarding untrusted data from trusted sources
      • Confidence rate:
        • Medical doctor: 100 %
        • Known experts group: 95.04 %
        • Crowd group: 85.56 %
  14. Results: tasks completed per user (charts comparing open crowdsourcing with the internal interface)
  15. Results: MD and known experts
      • Agreement:
        • Broad categories: 88.76 %
        • Diagnostic subcategories: 97.40 %
        • Microscopy: 89.06 %
        • Radiology: 90.91 %
        • Reconstructions: 100 %
        • Visible light photography: 79.41 %
        • Conventional subcategories: 76 %
      • Speed:
        • MD: 85 judgements per hour
        • Experts: 66 judgements per hour per user
  16. Results: MD and crowd
      • Agreement:
        • Broad categories: 85.53 %
        • Diagnostic subcategories: 85.15 %
        • Microscopy: 70.89 %
        • Radiology: 64.01 %
        • Reconstructions: 0 %
        • Visible light photography: 58.89 %
        • Conventional subcategories: 75.91 %
      • Speed:
        • MD: 85 judgements per hour
        • Crowd: 25 judgements per hour per user
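The per-category agreement figures above amount to the fraction of images on which a group's label matches the medical doctor's reference label, broken down by category. A minimal sketch of that computation, with assumed data shapes (the paper does not specify its tooling):

```python
# Sketch of per-category agreement against a reference annotator
# (data shapes are assumptions): for each category, the fraction of
# images where the group's label equals the reference (MD) label.
from collections import defaultdict

def agreement_by_category(reference, group, category_of):
    """reference/group: {image_id: label}; category_of: {image_id: category}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for img, ref_label in reference.items():
        if img in group:                       # only images judged by both
            cat = category_of[img]
            totals[cat] += 1
            hits[cat] += (group[img] == ref_label)
    return {cat: hits[cat] / totals[cat] for cat in totals}
```

Raw percent agreement like this does not correct for chance; a chance-corrected statistic such as Cohen's kappa would be a natural refinement, though the slides report plain agreement.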
  17. Results: automatic classification verification
      • Verification by experts
      • 1,000 images were verified
      • Agreement among annotators: 100 %
      • Speed: users answered about twice as fast as in the classification task
  18. Conclusions
      • The iterative approach reduces the amount of manual work:
        • Only a small subset is fully manually annotated
        • Automatic classification verification is faster
      • Significant differences among user groups:
        • Faster crowd annotation overall, due to the number of contributors
        • Poorer crowd annotations in the most specific classes
      • Comparable performance among user groups for the broad categories
  19. Future work
      • Experiments can be redesigned to fit crowd behaviour:
        • A smaller number of (good) contributors has previously led to performance comparable to computer-aided diagnosis
        • Selection of contributors:
          • Based on historical performance on the platform?
          • With a selection/training phase within the job?
  20. Thanks for your attention!
      Antonio Foncubierta-Rodríguez and Henning Müller. "Ground truth generation in medical imaging: A crowdsourcing-based iterative approach", in Workshop on Crowdsourcing for Multimedia, ACM Multimedia, Nara, Japan, 2012.
      Contact: