Medical image analysis, retrieval and
evaluation infrastructures
Henning Müller
HES-SO VS &
Martinos Center
Overview
• Medical image retrieval projects
• Image analysis and 3D texture modeling
• Data science evaluation infrastructures
– ImageCLEF
– VISCERAL
– EaaS – Evaluation as a Service
• What comes next?
Henning Müller
• Studies in medical informatics in
Heidelberg, Germany
– Work in Portland, OR, USA
• PhD in image processing in Geneva,
focus on image analysis and retrieval
– Exchange at Monash Uni., Melbourne, Australia
• Prof titulaire at UNIGE/HUG in medicine (2014)
– Medical image analysis and retrieval for decision
support
• Professor at the HES-SO Valais (2007)
– Head of the eHealth unit
• Sabbatical at the Martinos Center, Boston, MA
Medical image retrieval (history)
• MedGIFT project started in 2002
– Global image similarity
• Texture, grey levels
– Teaching files
– Linking text files and
image similarity
• Often data not available
– Medical data hard to get
– Images and text are
connected in cases
• Unrealistic expectations, high quality vs. browsing
– Semantic gap
Medical imaging is big data!!
• Much imaging data is produced
• Imaging data is very complex
– And getting more complex
• Imaging is essential for
diagnosis and treatment
• Images out of their context
lose most of their meaning
– Clinical data are necessary
– Diagnoses often not precise
• Evidence-based medicine &
case-based reasoning
Decision support in medicine
• Mixing multilingual data from many resources
and semantic information for medical retrieval
– LinkedLifeData
The informed patient
Integrated interfaces
Texture analysis (2D->3D->4D)
• Describe various medical tissue types
– Brain, lung, …
– Concentration on 3D and 4D data
– Mainly texture descriptors
• Extract visual features/signatures
– Learned, so relation to deep learning
Adrien Depeursinge, Antonio Foncubierta–Rodriguez, Dimitri Van de Ville, and Henning
Müller, Three–dimensional solid texture analysis and retrieval: review and opportunities,
Medical Image Analysis, volume 18, number 1, pages 176-196, 2014.
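As an illustration of the kind of features meant here, a minimal sketch follows: simple 3D filter-response energies computed with NumPy/SciPy on a toy volume. This is a generic stand-in for the Riesz/wavelet descriptors used in the work, not the published pipeline; function and parameter names are illustrative.

```python
# Minimal sketch (not the authors' pipeline): simple 3D texture descriptors
# as energies of Gaussian-derivative filter responses, computed with
# scipy.ndimage on a toy volume.
import numpy as np
from scipy import ndimage

def texture_signature(volume, sigmas=(1.0, 2.0, 4.0)):
    """Return a feature vector of filter-response energies for a 3D volume."""
    features = []
    for sigma in sigmas:
        smoothed = ndimage.gaussian_filter(volume, sigma)
        features.append(smoothed.std())              # overall contrast at this scale
        for axis in range(3):                        # directional first derivatives
            d = ndimage.gaussian_filter1d(volume, sigma, axis=axis, order=1)
            features.append(np.mean(d ** 2))         # energy of the response
    return np.array(features)

# Toy example: a random 3D volume
rng = np.random.default_rng(0)
vol = rng.normal(size=(32, 32, 32))
print(texture_signature(vol).shape)   # (12,) = 3 scales x (1 + 3) features
```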
Database with CT images of
interstitial lung diseases
• 128 cases with CT image series and biopsy
confirmed diagnosis
• Manually annotated regions for tissue classes (1946)
– 6 of the 13 tissue types with a larger number of examples
• 159 clinical parameters extracted (sparse)
– Smoking history, age, gender,
hematocrit, …
• Available after signature of a
license agreement
Learned 3D signatures
• Learn combinations of Riesz wavelets as digital
signatures using SVMs (steerable filters)
– Create signatures to detect small local lesions
and visualize them
Adrien Depeursinge, Antonio Foncubierta–Rodriguez, Dimitri Van de Ville, and Henning
Müller, Rotation–covariant feature learning using steerable Riesz wavelets, IEEE
Transactions on Image Processing, volume 23, number 2, pages 898-908, 2014.
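A minimal sketch of the idea, not the published method: a linear SVM learns weights over filter-bank energy features, and the learned weight vector serves as the texture signature. The feature values and the lesion class here are synthetic placeholders.

```python
# Minimal sketch: learn a linear combination of filter-response energies with a
# linear SVM; the learned weight vector acts as a texture "signature" for one
# tissue class. Data below are synthetic placeholders, not the ILD dataset.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# X: one row per annotated region, columns = filter-bank energies
# (Riesz wavelet energies in the paper; any filter bank works for the sketch)
X_healthy = rng.normal(0.0, 1.0, size=(50, 12))
X_lesion = rng.normal(0.5, 1.0, size=(50, 12))   # hypothetical lesion class
X = np.vstack([X_healthy, X_lesion])
y = np.array([0] * 50 + [1] * 50)

clf = LinearSVC(C=1.0).fit(X, y)
signature = clf.coef_.ravel()          # weights of the filter combination
print(signature.round(2))              # one weight per filter/energy feature
```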
Learning Riesz in 3D
• Most medical tissues are naturally 3D
• But modeling gets much more complex
– Vertical planes
– 3-D checkerboard
– 3-D wiggled
checkerboard
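The synthetic test volumes named above can be generated in a few lines; a rough sketch for the plain and sinusoidally "wiggled" 3D checkerboards, with period and amplitude chosen arbitrarily:

```python
# Minimal sketch: generate synthetic 3D test textures (plain and "wiggled"
# checkerboards) with NumPy.
import numpy as np

def checkerboard_3d(size=32, period=8):
    z, y, x = np.indices((size, size, size))
    return ((x // period + y // period + z // period) % 2).astype(float)

def wiggled_checkerboard_3d(size=32, period=8, amplitude=2.0):
    z, y, x = np.indices((size, size, size)).astype(float)
    x = x + amplitude * np.sin(2 * np.pi * z / size)   # sinusoidal distortion
    return ((x // period + y // period + z // period) % 2).astype(float)

print(checkerboard_3d().shape)   # (32, 32, 32)
```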
Aiding clinical decisions
• Benchmark on multimodal image retrieval
– Run since 2003, medical task since 2004
– Part of the Cross-Language Evaluation Forum (CLEF)
• Many tasks related to image retrieval
– Image classification
– Image-based retrieval
– Case-based retrieval
– Compound figure separation
– Caption prediction
– …
• Many old databases remain available, imageclef.org
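Retrieval tasks of this kind are typically scored with trec_eval-style ranked-list measures; a minimal sketch of (mean) average precision, using toy identifiers:

```python
# Minimal sketch: mean average precision (MAP), the kind of trec_eval-style
# measure typically used to score ranked retrieval results.
def average_precision(ranked_ids, relevant_ids):
    """AP for one query: ranked_ids is the system ranking, relevant_ids the ground truth."""
    relevant = set(relevant_ids)
    hits, precisions = 0, []
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(runs, qrels):
    """runs/qrels: dicts mapping query id -> ranked list / set of relevant ids."""
    return sum(average_precision(runs[q], qrels[q]) for q in qrels) / len(qrels)

print(average_precision(["a", "b", "c", "d"], {"a", "c"}))   # 0.833...
```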
Resources available
[Diagram: cloud-based annotation and evaluation infrastructure: registration system, participant virtual machines, annotation management and analysis systems, annotators (radiologists) with locally installed annotation clients, training and test data hosted in the Microsoft Azure cloud]
Evaluation as a Service (EaaS)
• Moving the algorithms to the data, not vice versa
– Required when data are: very large, changing
quickly, confidential (medical, commercial, …)
• Different approaches
– Source code submission, APIs, VMs local or in the
cloud, Docker containers, specific frameworks
• Allows for continuous evaluation, component-
based evaluation, total reproducibility, updates, …
– Workshop March 2015 in Sierre on EaaS
– Workshop November 2015 in Boston on cloud-
based evaluation
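A minimal sketch of the "algorithms to the data" pattern under the container approach: the organiser runs a participant's Docker image with the protected data mounted read-only and without network access. Image name and paths are hypothetical placeholders.

```python
# Minimal sketch: run a participant's containerized algorithm against data
# that never leaves the evaluation infrastructure.
import subprocess

def run_submission(image, data_dir, output_dir):
    """Run a participant container with the protected data mounted read-only."""
    cmd = [
        "docker", "run", "--rm",
        "--network", "none",              # no outbound access from the container
        "-v", f"{data_dir}:/data:ro",     # protected data, read-only
        "-v", f"{output_dir}:/output",    # results written here, then evaluated
        image,
    ]
    subprocess.run(cmd, check=True)

# Hypothetical usage:
# run_submission("participant/segmentation:v1", "/secure/ct_volumes", "/secure/results")
```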
Sharing images, research data
• Very important aspect of research is to have solid
methods, data, large if possible
– If data are not available, results cannot be reproduced
– If data are small, results may be meaningless
• Many multi-center projects spend most money on
data acquisition; often delayed, leaving no time for analysis
– IRB takes long, sometimes restrictions are strange
• Research is international!
• NIH & NCI are great to push data availability
– But data can be made available in an unusable way
Political support for research
infrastructures!
Sustaining biomedical big data
Microsoft Azure
Intel's CCC
Institutional support (NCI)
• Using crowdsourcing to link researchers and challenges
Business models for these links
• Manually annotate large data sets for challenges
– Data needs to be available in a secure space
• Have researchers work on data (on infrastructure)
– Deliver code
• Commercialize results and share benefits
Future of research infrastructures
• Much more centered around data!!
– Nature Scientific Data underlines the importance!
• Data need to be available but in a meaningful way
– Infrastructure needs to be available, along with a way
to evaluate on the data with specific tasks
• More work for data preparation but in line with IRB
– Analysis inside medical institutions
• Code will become even more portable
– Docker helps enormously and develops quickly
• Public-private partnerships to be sustainable
• Total reproducibility, long term, sharing tools
• Much higher efficiency
• Part of QIN – Quantitative Imaging Network (NCI)
• Create challenges for QIN to validate tools
• Use CodaLab to run project challenges
– Run code in containers (Docker), well integrated
• Automate as much as possible
– Share code blocks across teams, evaluate
combinations
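A rough sketch of a CodaLab-style scoring program, assuming the usual bundle layout (an input directory with ref/ and res/ subfolders, scores written to scores.txt); the metric and file names are placeholders to adapt to the actual challenge definition.

```python
# Minimal sketch of a CodaLab-style scoring program: compares submitted results
# against the reference and writes scores.txt (directory layout assumed, hedged).
import os
import sys

def score(truth_file, pred_file):
    """Hypothetical metric: fraction of lines that match exactly."""
    with open(truth_file) as t, open(pred_file) as p:
        truth, pred = t.read().splitlines(), p.read().splitlines()
    correct = sum(a == b for a, b in zip(truth, pred))
    return correct / len(truth) if truth else 0.0

if __name__ == "__main__":
    input_dir, output_dir = sys.argv[1], sys.argv[2]
    acc = score(os.path.join(input_dir, "ref", "truth.txt"),
                os.path.join(input_dir, "res", "predictions.txt"))
    with open(os.path.join(output_dir, "scores.txt"), "w") as f:
        f.write(f"accuracy: {acc}\n")
```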
Conclusions
• Medicine is (becoming) digital medicine
– More data and more complex links (genes, visual,
signals, …)
• Medical data science requires new infrastructures
– Use routine data, not manually extracted and curated
data; curate at large scale and accommodate errors
– Use large data sets from data warehouses
– Keep data where they are produced
• More “local” computation, i.e. where the data are
– Secure aggregation of results
• Sharing infrastructures, data and more
Contact
• More information can be found at
– http://khresmoi.eu/
– http://visceral.eu/
– http://medgift.hevs.ch/
– http://publications.hevs.ch/
• Contact:
– Henning.mueller@hevs.ch