Medical Computer Vision: Current Limitations of Vision Datasets | CVPR 2021

Panel on Current limitations of vision datasets at the Future of CV Datasets Workshop, CVPR 2021

  1. Medical Computer Vision: Current Limitations of Vision Datasets. CVPR 2021 Future of Computer Vision Datasets Workshop. Dr. Asma Ben Abacha. Sunday, June 20, 2021.
  2. Disclaimer: The views and opinions expressed do not necessarily state or reflect those of the U.S. Government, and they may not be used for advertising or product endorsement purposes.
  3. Medical Computer Vision: Automatic understanding of medical images can support clinical decision making and second-opinion feedback, decrease diagnosis and treatment time, and help with clinical education and patient engagement.
  4. Radiology: Some of our solutions
     • Automatic data creation
     • Deep learning and data augmentation methods
     • “Visual Question Generation from Radiology Images”. Sarrouti, Ben Abacha, and Demner-Fushman. ACL-ALVR 2020.
     • “VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF”. Ben Abacha et al. ImageCLEF 2019, 2020 & 2021.
  5. Current Limitations (1/2)
     • 1) Data size and coverage
     • All modalities (e.g. CT, MRI, X-ray, ultrasound)
     • Several images per case/patient
     • Images for each specific abnormality
       - Examples of abnormalities in brain images: glioblastoma multiforme, meningioma, multiple sclerosis, craniopharyngioma, tuberous sclerosis, pilocytic astrocytoma, cerebellar hemangioblastoma, vestibular schwannoma, arachnoid cyst, developmental venous anomaly, central neurocytoma, choroid plexus papilloma, juvenile pilocytic astrocytoma, intracranial hypotension, intraventricular meningioma, pituitary apoplexy, etc.
     Examples of open-domain datasets:
     • Places: 2.5 M images
     • ImageNet: 1.4 M images
     • Open-Domain VQA: 204,721 COCO images, 1M questions & 11M answers
     Examples of medical datasets:
     • MIMIC-CXR: 377,110 chest X-rays with free-text radiology reports
     • DeepLesion: 32,000 annotated lesions identified on CT images
     • VQA-Med (2019-2021): 10,200 radiology images with 21,292 QA pairs
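To make the size gap on this slide concrete, the short Python sketch below computes a few ratios from the counts quoted above. The figures are taken from the slide as stated and are not independently verified; the dictionary and function names (`IMAGE_COUNTS`, `QUESTION_COUNTS`, `size_ratio`, `questions_per_image`) are hypothetical names chosen for illustration. DeepLesion is left out because the slide gives its size in lesions rather than images.

```python
# Counts as quoted on the slide; they are not independently verified here.
IMAGE_COUNTS = {
    "Places": 2_500_000,
    "ImageNet": 1_400_000,
    "Open-Domain VQA (COCO)": 204_721,
    "MIMIC-CXR": 377_110,
    "VQA-Med (2019-2021)": 10_200,
}

# Question counts are only given for the two VQA datasets.
QUESTION_COUNTS = {
    "Open-Domain VQA (COCO)": 1_000_000,
    "VQA-Med (2019-2021)": 21_292,
}


def size_ratio(larger: str, smaller: str) -> float:
    """Return how many times more images `larger` has than `smaller`."""
    return IMAGE_COUNTS[larger] / IMAGE_COUNTS[smaller]


def questions_per_image(name: str) -> float:
    """Return the average number of questions per image for a VQA dataset."""
    return QUESTION_COUNTS[name] / IMAGE_COUNTS[name]


if __name__ == "__main__":
    # Compare each open-domain dataset against the medical VQA dataset.
    for name in ("Places", "ImageNet", "Open-Domain VQA (COCO)"):
        ratio = size_ratio(name, "VQA-Med (2019-2021)")
        print(f"{name} has {ratio:.1f}x the images of VQA-Med")

    # Compare annotation density of the two VQA datasets.
    for name in QUESTION_COUNTS:
        print(f"{name}: {questions_per_image(name):.2f} questions per image")
```

Raw image counts are only one axis of the comparison; the slide's other points (modalities, images per patient, coverage of each abnormality) are not captured by these ratios.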
  6. Current Limitations (2/2)
     • Diverse sources: data from different hospitals and clinical centers
       - Publicly available to the research community
       - Preserving patient privacy
     • Inclusive datasets
       - Images from patients of all minorities, ages, genders, and races, to reduce model bias
     • Clinical deployment
       - Goal-oriented, clinically relevant datasets
       - Inclusive and high-quality gold standard datasets
     Hippocratic Oath
  7. Thank you for your attention! Contact: asma.benabacha@nih.gov, asma.benabacha@gmail.com, @AsmaBenAbacha. Datasets: github.com/abachaa

