Medical Multimedia
Systems and Applications
Steven Hicks1 / Michael Riegler1, Pål Halvorsen1, Klaus Schoeffmann2
2 Institute of Information Technology
Klagenfurt University, Austria
1 Simula Research Laboratory
Norway
• Introduction & Overview
• Multimedia Data in Medicine
• Characteristics of Endoscopic Video
• Different Fields and Communities
• Application 1: Post-Procedural Usage of Surgery Videos
• Domain-Specific Storage for long-term Archiving
• Medical Video Content Analysis and Datasets
• Medical Video Interaction
• Application 2: Diagnostic Decision Support and Case Studies
• Knowledge Transfer
• Analysis
• Feedback
• Explainability and Trust
• Conclusions & Outlook
Agenda
ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 2
Notice
This presentation contains images and videos
from medical surgeries,
which you may find disturbing!
Introduction
Medical inspections/interventions produce many kinds of data
• Medical text
• OR reports, Patient records…
• Sensor signals
• ECG, EEG, vital signs
• Medical images (radiology)
• Ultrasound, x-ray
• CT, MRI, PET, …
• Medical video
• Screenings
• Surgery
Multimedia Data in Medicine
• Signal Processing
• Medical Imaging
• Robotics
• Multimedia
• Data Mining
• Traditional open surgery?
• Minimally-invasive surgery
• Interventions with endoscopes
• Reduced trauma for patient
• Less invasive and faster
• Less rehabilitation time
• Microscopic surgery
Video Data Sources in Medicine
Therapeutic Endoscopy
• Rigid endoscope
• Small incisions
• Therapy / Surgery
• Laparoscopy
• Cholecystectomy
• Gynecological Surgery
• Urological Surgery
• …
• Arthroscopy
• …
Diagnostic Endoscopy
• Flexible endoscope
• Natural orifices
• Diagnosis / Inspections
• Gastroenterology (colonoscopy, gastroscopy)
• Bronchoscopy
• Hysteroscopy
• …
• WCE (Wireless capsule endoscopy)
Endoscopic Video Examples
Domain-specific Characteristics & Challenges
• Full HD or 4K (even stereo 3D)
• One shot recordings
• Up to multiple hours
• Homogeneous color distribution
• Visually very similar content
• Circular content area
• Fast motion
• Geometric distortion
• Specular reflections
• Occlusions
• Smoke, motion blur, blood, flying particles
• Size!
Literature Overview
Münzer, Bernd, Klaus Schoeffmann, and Laszlo Böszörmenyi. "Content-based processing and analysis of endoscopic images and videos: A survey." Multimedia Tools and Applications (2017): 1-40.
Pre-Processing
• Image Enhancement
• Contrast enhancement, color misalignment correction…
• Camera calibration and distortion correction
• Specular reflection removal
• Comb structure removal & super resolution
• …
• Information Filtering
• Frame filtering
• Image segmentation
Real-time Support at Intervention Time
Applications
• Diagnosis support
• Robot-assisted surgery
• Context awareness
• Augmented reality
Post-Procedural Applications
Management and Retrieval
• Compression and storage
• Content-based retrieval
• Temporal video segmentation
• Video summarization
• Visualization & Interaction
Quality Assessment
• Skills assessment
• Education & Training
• Error Rating
• Assessment of intervention quality
Post-Procedural Use of Surgery Videos
• Video documentation of endoscopic procedures is on the rise
• “A picture paints a thousand words”; a moving picture paints millions!
• In some countries even mandatory already
• Current documentation practice poses many problems
• Hard task to retrieve relevant information
• Huge amounts of storage space
• High ratio of irrelevant data (“rubbish”)
• Very inefficient encoding (especially for HD content)
Motivation for Video Documentation
• Later inspection of specific moments
• Discussion of critical moments (e.g., with the OR team)
• Information to patients
• Preparation of future interventions
• Forensics & investigations (e.g., comparisons)
• Training & teaching
• Surgical quality assessment (technical errors)
Post-Procedural Use of Surgical Videos
Full Storage of Endoscopic Videos
• Exemplary hospital
• 5 departments (Lap, Gyn, Arthro, GI, ENT)
• 2 operating rooms per department, 4 operations per room and day, each operation approx. 1-2 h
• → i.e., 40 interventions per day, each ~90 min
• 60 hours of video per day!
• Assumption: HD 1920x1080, H.264/AVC
• 270 GB / day (1 h = 4.5 GB)
• 1.9 TB / week
• ~100 TB / year (200 TB with MPEG-2)
• 4K: even more
Great challenge for a hospital’s IT department!
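The storage estimate on this slide can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope calculation: the function name and parameter names are illustrative, the reading "2 operating rooms per department" is inferred from the 40 interventions per day, and the 4.5 GB per hour figure is the slide's own assumption for HD H.264/AVC.

```python
def daily_video_load(departments=5, rooms_per_department=2,
                     ops_per_room_per_day=4, minutes_per_op=90,
                     gb_per_hour=4.5):
    """Return (interventions, hours of video, gigabytes) per day."""
    interventions = departments * rooms_per_department * ops_per_room_per_day
    hours = interventions * minutes_per_op / 60
    return interventions, hours, hours * gb_per_hour


interventions, hours, gb_per_day = daily_video_load()
tb_per_week = gb_per_day * 7 / 1000    # decimal terabytes
tb_per_year = gb_per_day * 365 / 1000

print(f"{interventions} interventions, {hours:.0f} h, {gb_per_day:.0f} GB per day")
print(f"{tb_per_week:.2f} TB per week, {tb_per_year:.1f} TB per year")
```

The yearly value comes out slightly below the rounded 100 TB on the slide, which is consistent with the slide rounding up.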
How to Reduce Storage Requirements?
Exploit domain-specific characteristics:
1. Spatial compression optimization
2. Temporal compression optimization
3. Perceptual quality based optimization
4. Long-term archiving strategy
Transcoding
Savings (figure labels): up to 30%, up to 40%, up to 93%
Study on Video Quality
• Subjective quality assessment
• Catharina Hospital Eindhoven, NL
• 37 participants
• 19 experienced surgeons and 18 trainees
• 7 women, 30 men, average age: 40 years
• Subjective tests regarding maximum compression
1) Perceivable quality loss
• Double-Stimulus (ITU-R BT.500-11)
• Switch between reference and test video
2) Perceivable semantic information loss
• Single Stimulus (ITU-R P.910)
• Assessing random videos (incl. reference)
Münzer, B., Schoeffmann, K., Böszörmenyi, L., Smulders, J. F., & Jakimowicz, J. J. (2014, May). Investigation of the impact of compression on the perceptional quality of laparoscopic videos. In 2014 IEEE 27th
International Symposium on Computer-Based Medical Systems (pp. 153-158). IEEE.
Session 1 Session 2
Assessment of Video Quality (Session 1)
[Figure: average bitrate (Kb/s) and rating difference (Difference Mean Opinion Score, DMOS) per test condition, i.e., crf (constant rate factor) values between 18 and 28 at resolutions 1920x1080, 1280x720, 960x540, and 640x360. Reference video: MPEG-2, HD, 20 (35) Mbit/s, "lossless". Annotation: some conditions were rated "subjectively better than reference".]
Assessment of Video Quality (Session 2)
1. Visually lossless with 8 Mbit/s (in comparison to 20 Mbit/s); reduction: 60% data vs. 0% MOS (Q1)
2. Good quality with 2.5 Mbit/s and reduced resolution (1280x720); reduction: 88% data vs. 7% MOS (Q2)
3. Acceptable quality with 1.4 Mbit/s and lower resolution (640x360); reduction: 93% data vs. 31% MOS (Q3)
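The data reductions quoted for the three quality levels follow directly from the bitrates. A minimal sketch, assuming the 20 Mbit/s reference named on the slide (the function name is illustrative):

```python
REFERENCE_MBIT_S = 20.0  # reference bitrate named on the slide


def data_reduction_percent(bitrate_mbit_s, reference_mbit_s=REFERENCE_MBIT_S):
    """Percentage of data saved relative to the reference bitrate."""
    return 100.0 * (1.0 - bitrate_mbit_s / reference_mbit_s)


# Bitrates of the three quality levels listed above.
levels = {"Q1": 8.0, "Q2": 2.5, "Q3": 1.4}
reductions = {name: data_reduction_percent(rate) for name, rate in levels.items()}

for name, saved in reductions.items():
    print(f"{name}: {saved:.1f}% less data")
```

This gives roughly 60%, 87.5%, and 93%, matching the rounded values on the slide.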
Example Videos
• 1280x720, weak compression (crf 18): 16 MB
• 640x360, strong compression (crf 26): 0.8 MB
• Size difference: 20x
Medical Video Analysis
• With several hours of videos each day
• Manual search in archive becomes impractical!
• Automatic content analysis
• Filter for relevant scenes in the videos
• Anatomical structures
• Surgical actions
• Instruments
• Operation phases
• Irregular/Adverse events
• …
• Content classification (e.g., with neural networks)
• Video Retrieval/Interaction systems
Medical Videos
[Figure: 1000 frames (sampled at 1 fps from a 17 min video). Which frames show suturing, cutting, injection, or coagulation?]
Content Relevance Filtering / Instrument Recognition
Münzer, B., Schoeffmann, K., & Böszörmenyi, L. (2013, December). Relevance segmentation of laparoscopic videos. In Multimedia (ISM), 2013 IEEE International Symposium on (pp. 84-91). IEEE.
Primus, M. J., Schoeffmann, K., & Böszörmenyi, L. (2015, June). Instrument classification in laparoscopic videos. In Content-Based Multimedia Indexing (CBMI), 2015 13th International Workshop on (pp. 1-6). IEEE.
Instrument detection/segmentation
for better content understanding
(e.g., op phase segmentation, following
instruments in robot-assisted surgery)
Out-of-patient Scenes Blurry Scenes Border Area
Smoke Detection
Smoke Detection
• Cauterization in 90% of surgeries
• Instruments: laser or HF (100 °C - 1200 °C)
• Filtration system (manual)
• → Automatic smoke detection & removal? (real-time)
Automatic Smoke Detection
Achievable Performance with Saturation Peak Analysis (SPA)
Automatic Smoke Detection - Performance
[Figure: detection performance on 20K images (DS A), 10K images (DS A), and 4.5K images (DS B)]
• SPA: Saturation Peak Analysis
• GLN RGB: GoogLeNet using RGB images (deep learning)
• GLN SAT: GoogLeNet using saturation channel only (deep learning)
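Both SPA and GLN SAT operate on color saturation. The slides do not spell out the SPA algorithm, so the sketch below is not the authors' method; it only illustrates the intuition that smoke tends to wash out colors, which moves the peak of a frame's saturation histogram toward low values. The function names, the bin count, and the two toy "frames" are assumptions made for this example.

```python
import colorsys


def saturation_histogram(pixels, bins=10):
    """Normalized histogram of HSV saturation for (r, g, b) pixels in 0..255."""
    counts = [0] * bins
    total = 0
    for r, g, b in pixels:
        _, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        counts[min(int(s * bins), bins - 1)] += 1
        total += 1
    return [c / total for c in counts] if total else counts


def peak_bin(histogram):
    """Index of the most populated histogram bin."""
    return max(range(len(histogram)), key=histogram.__getitem__)


# Toy data: mostly saturated red tissue vs. mostly grayish, washed-out pixels.
clear_frame = [(180, 40, 40)] * 90 + [(200, 200, 200)] * 10
smoky_frame = [(170, 150, 150)] * 90 + [(180, 40, 40)] * 10

clear_peak = peak_bin(saturation_histogram(clear_frame))
smoky_peak = peak_bin(saturation_histogram(smoky_frame))
print(clear_peak, smoky_peak)
```

A real detector would need a decision rule on top of such a histogram; any threshold would have to be tuned on annotated data.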
Real-Time Smoke Detection Prototype
Andreas Leibetseder, Manfred J. Primus, Stefan Petscharnig, and Klaus Schoeffmann. “Image-based Smoke Detection in Laparoscopic Videos“. Proceedings of Computer Assisted and Robotic Endoscopy and Clinical Image-Based
Procedures: 4th International Workshop, CARE 2017, and 6th International Workshop, CLIP 2017, held in Conjunction with MICCAI 2017, Quebec City, QC, Canada, September 14, 2017, pp. 70-87
Surgical Action Classification
Gynecologic Laparoscopy: Relevant Surgical Actions
• Dissection: 58 segments / 35,517 images
• Coagulation: 212 segments / 84,786 images
• Cutting cold: 271 segments / 26,388 images
• Cutting: 106 segments / 92,653 images
• Hysterectomy: 25 segments / 68,466 images
• Injection: 52 segments / 52,355 images
• Suturing: 92 segments / 321,851 images
• Suction & Irrigation: 173 segments / 73,977 images
• 1,105 segments (823,000 frames)
• 9 h of annotated video from 111 interventions
• 10-fold cross-validation
Stefan Petscharnig and Klaus Schoeffmann. 2018. Learning
Laparoscopic Video Shot Classification for Gynecological Surgery.
Multimedia Tools and Applications (MTAP), 77, 7, Springer US, 8061-
8079.
Gynecologic Laparoscopy: Surgical Actions Classification
R: recall, P: precision
• Early fusion
• Integrate motion information from consecutive frames
• Feed into CNN as additional input channel(s)
• Compare two approaches
• Block-Based Motion Estimation (BBME): using block matching
• Residual Motion (ResM): local motion
• Late fusion
• Assume we already know scene boundaries and classify all frames of segments
• Temporal aggregation of single-frame classifications
• Majority vote (maximum occurrence of class in frames of scene)
• Average confidence
Fusing Temporal Information with CNNs
S. Petscharnig, K. Schöffmann, J. Benois-Pineau, S. Chaabouni and J. Keckstein, "Early and Late Fusion of Temporal Information for Classification of Surgical Actions in Laparoscopic Gynecology," 2018 IEEE 31st
International Symposium on Computer-Based Medical Systems (CBMS), Karlstad, 2018, pp. 369-374.
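The two late-fusion strategies named on this slide (majority vote and average confidence) can be sketched as follows. This is an illustration under simple assumptions, not the authors' code: per-frame classifier outputs are given as plain dictionaries, the function names are made up for the example, and the three frames are toy data.

```python
from collections import Counter


def majority_vote(frame_labels):
    """Return the class predicted for most frames (ties: first one seen)."""
    return Counter(frame_labels).most_common(1)[0][0]


def average_confidence(frame_confidences):
    """Average per-class confidences over all frames and pick the best class."""
    totals = {}
    for confidences in frame_confidences:
        for label, score in confidences.items():
            totals[label] = totals.get(label, 0.0) + score
    means = {label: total / len(frame_confidences) for label, total in totals.items()}
    return max(means, key=means.get), means


# Toy per-frame outputs for one segment with known boundaries.
frames = [
    {"cutting": 0.60, "suturing": 0.40},
    {"cutting": 0.55, "suturing": 0.45},
    {"cutting": 0.10, "suturing": 0.90},
]
frame_labels = [max(f, key=f.get) for f in frames]

vote_label = majority_vote(frame_labels)
avg_label, avg_scores = average_confidence(frames)
print(vote_label, avg_label)
```

In this toy segment the two strategies disagree: two weakly confident "cutting" frames win the vote, while one very confident "suturing" frame dominates the average.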
Gynecologic Laparoscopy: Surgical Actions Classification
Petscharnig, S., & Schöffmann, K. (2017). Learning laparoscopic video shot classification for gynecological surgery. Multimedia Tools and Applications, 1-19.
Instrument
Segmentation/Recognition
Instrument Segmentation/Recognition
INPUT: Video recordings of laparoscopic procedures in gynecology
OUTPUT: Position and category of each instrument in the video
• Use a region-based CNN for
1. Binary instrument segmentation
• distinguish between instrument instances and background
(without recognizing the actual instrument)
2. Multi-class instrument recognition
• Labeling different instrument segments
• We approach this task by using
• Mask R-CNN
• Very small dataset (only about 50 examples/instrument; 12 classes)
• Several data augmentation techniques
Surgical Instrument Segmentation/Recognition
Sabrina Kletz, Klaus Schoeffmann, Jenny Benois-Pineau, and Heinrich Husslein. 2019. Identifying Surgical Instruments in Laparoscopy Using Deep Learning Instance Segmentation. Proceedings of the International
Conference on Content-Based Multimedia Indexing (CBMI 2019). IEEE, Los Alamitos, CA, USA, 6 pages
Instrument Segmentation/Recognition: Dataset
11 different instrument types and
one class covering unspecified instruments.
• Settings
• Training from scratch and transfer learning from COCO dataset
• 60/20/20 split for training, validation, and test
• SGD as optimizer, different LR={0.01, 0.001, 0.0001}
• Evaluation
• Average precision with IoU (Jaccard index) for every instance
• with ground truth G and the detected region D
• COCO metrics
• Average precision with different thresholds
• AP50 and AP50:95
Instrument Segmentation/Recognition: Experimental Setup
$$\mathrm{IoU} = \frac{|G \cap D|}{|G \cup D|}$$
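The IoU (Jaccard index) between a ground-truth region G and a detected region D can be computed as in the following sketch. Regions are represented as sets of pixel coordinates purely for illustration; the helper names and the two rectangles are assumptions. The threshold list reflects the COCO convention behind AP50:95, which averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05.

```python
def iou(ground_truth, detected):
    """Intersection over union of two pixel sets."""
    union = ground_truth | detected
    if not union:
        return 0.0
    return len(ground_truth & detected) / len(union)


def box_pixels(top, left, height, width):
    """All (row, col) coordinates inside an axis-aligned rectangle."""
    return {(r, c) for r in range(top, top + height)
            for c in range(left, left + width)}


G = box_pixels(0, 0, 4, 4)  # hypothetical ground-truth region
D = box_pixels(0, 2, 4, 4)  # hypothetical detection, shifted by two columns
score = iou(G, D)

# IoU thresholds used by COCO-style AP50:95 (0.50, 0.55, ..., 0.95).
COCO_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]
matches = [score >= t for t in COCO_THRESHOLDS]
print(round(score, 3), matches)
```

Here 8 of 24 pixels overlap, so the detection would not count as a match even at the most lenient threshold used for AP50.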
Instrument Segmentation/Recognition: Quantitative Results
Quantitative Results of Multi-Class Segmentation
Classification performance after 50th epoch
Instrument Segmentation/Recognition: Qualitative Results
Sabrina Kletz, Klaus Schoeffmann, Jenny Benois-Pineau, and Heinrich Husslein. 2019. Identifying Surgical Instruments in Laparoscopy Using Deep Learning Instance Segmentation. Proceedings of the International
Conference on Content-Based Multimedia Indexing (CBMI 2019). IEEE, Los Alamitos, CA, USA, 6 pages
Medical Video Datasets
LapGyn4: Laparoscopic Gynecology Dataset
Surgical Actions (~31K images) Anatomical Structures (~3K images)
Andreas Leibetseder, Stefan Petscharnig, Manfred Jürgen Primus, Sabrina Kletz, Bernd Münzer, Klaus Schoeffmann, and Jörg Keckstein. 2018. Lapgyn4: a dataset for 4 automatic content analysis problems in the
domain of laparoscopic gynecology. In Proceedings of the 9th ACM Multimedia Systems Conference (MMSys '18). ACM, New York, NY, USA, 357-362.
Instrument Count (~22K images) Suturing on Anatomy (~1K images)
• Over 57,000 images
• 500+ surgeries
• Baseline evaluations: GoogLeNet
• 5-fold cross validation over 100 epochs
• Dataset with annotations of endometriosis
• Benign but potentially painful anomaly affecting women of child-bearing age
• Dislocation of uterine-like tissue; cicatrization and enclosed bleedings
• Serious and painful disease
• Often hard to diagnose
GLENDA: Gynecologic Laparoscopy Endometriosis Dataset
• Many of which show endometriosis cases of varying severity
• Pathology: peritoneum, ovary, uterus, deep infiltrated endometriosis (DIE)
• No pathology
• Region-based and temporal
expert annotations
• hand-drawn sketches
GLENDA Dataset – 25682 Frames from 400+ Surgeries
Andreas Leibetseder, Sabrina Kletz, Klaus Schoeffmann, Simon Keckstein, and Jörg Keckstein. 2020. GLENDA: Gynecologic Laparoscopy Endometriosis Dataset. Proceedings of the 26th International Conference
on Multimedia Modeling 2020 (MMM2020). Lecture Notes in Computer Science, Springer International Publishing, Cham, 12 pages. to appear
http://www.itec.aau.at/ftp/datasets/GLENDA/
GLENDA Dataset – Endometriosis Examples
• Cataract-101
• Videos recorded from 101 cataract surgeries in 2017 and 2018
• Only surgeries without any serious complications
• Comes with phase segmentation ground truth (11 phases)
Cataract-101 Video Dataset
Klaus Schoeffmann, Mario Taschwer, Stephanie Sarny, Bernd Münzer, Manfred Jürgen Primus, and Doris Putzgruber. 2018. Cataract-101: video dataset of 101 cataract surgeries. In Proceedings of the 9th ACM
Multimedia Systems Conference (MMSys '18). ACM, New York, NY, USA, 421-425.
http://www-itec.aau.at/ftp/datasets/ovid/cat-101/
Classification of Cataract OP Phases
Manfred J. Primus, Doris Putzgruber-Adamitsch, Mario Taschwer, Bernd Münzer, Yosuf El-Shabrawi, Laszlo Böszörmenyi, and Klaus Schoeffmann. 2018. Frame-Based Classification of Operation Phases in
Cataract Surgery Videos. In Proceedings of the 24th International Conference on Multimedia Modeling 2018 (MMM2018). Lecture Notes in Computer Science, vol 10704, Springer, Cham, 241-253.
Typical instruments used in Cataract surgery:
• Primary incision knife (pik)
• Secondary incision knife (sik)
• Katena forceps (kf)
• Capsulorhexis forceps (cf)
• Cannula (c)
• 27 gauge cannula (27gc)
• Phacoemulsifier handpiece (ph)
• Spatula (s)
• Irrigation/aspiration handpiece (iah)
• Implant injector (ii)
Cataract Instrument Recognition (Cat-101 Dataset)
[Figure labels: pik, kf + sik, cf, c, 27gc, ph, s, iah, ii]
• Classification Study
• 26 randomly selected videos
• Manually annotated 8000 frames for instrument usage (see next slide)
• 800 frames for each of the 10 instruments (balanced)
• Instrument classification (full frame) and generalization performance
• ResNet-50, Inception v3, NASNet Mobile
• Multi-label classification, loss=binary cross-entropy, bs=32, 50 epochs training from scratch
• Tested with different settings (Adam optimizer, SGD, initial learning rate = 0.1/0.01/0.001)
Cataract Instrument Recognition (Cat-101 Dataset)
Natalia Sokolova, Klaus Schoeffmann, Mario Taschwer, Doris Putzgruber-Adamitsch, and Yosuf El-Shabrawi. 2020. Evaluating the Generalization Performance of Instrument Classification in Cataract Surgery
Videos. Proceedings of the 26th International Conference on Multimedia Modeling 2020 (MMM2020). Lecture Notes in Computer Science, Springer International Publishing, Cham, 11 pages. to appear
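Multi-label classification with a binary cross-entropy loss, as listed in the study setup above, treats each instrument as an independent yes/no decision, so a frame can show several instruments at once (e.g., kf + sik). The sketch below shows the loss and the thresholded decision in plain Python. It is not the study's training code: the probabilities are invented, and the 0.5 threshold and the `eps` clipping constant are assumptions.

```python
import math

# Instrument abbreviations as listed on the slides.
INSTRUMENTS = ["pik", "sik", "kf", "cf", "c", "27gc", "ph", "s", "iah", "ii"]


def binary_cross_entropy(targets, probabilities, eps=1e-7):
    """Mean binary cross-entropy over all labels of one frame."""
    total = 0.0
    for t, p in zip(targets, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(targets)


def predicted_labels(probabilities, threshold=0.5):
    """Every instrument whose probability reaches the threshold."""
    return [name for name, p in zip(INSTRUMENTS, probabilities) if p >= threshold]


# Hypothetical frame showing sik and kf together.
targets = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
good_probs = [0.10, 0.80, 0.70, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05]
bad_probs = [1.0 - p for p in good_probs]

loss_good = binary_cross_entropy(targets, good_probs)
loss_bad = binary_cross_entropy(targets, bad_probs)
print(predicted_labels(good_probs), round(loss_good, 3), round(loss_bad, 3))
```

Unlike a softmax over ten classes, this formulation does not force the per-instrument probabilities to sum to one.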
Medical Video
Interaction Tools
Past/Current Status
[Figure labels: patient names; file explorers & segments to download; 2014; 2009]
Desired Status
Bernd Münzer, Klaus Schoeffmann and Laszlo Boeszoermenyi. “EndoXplore: A Web-based Video Explorer for Endoscopic Videos“. Proceedings of the IEEE International Symposium on Multimedia 2017 (ISM
2017), Taipei, Taiwan, 2017, pp. 1-2
Special Content Visualization
• Clinicians check full video recordings for occurrence of technical errors:
• Errors are rated according to standardized schemes (e.g., OSATS, GERT)
and surgeons are made aware of them
• Studies have shown that this significantly improves surgical quality
Surgical Quality Assessment (SQA)
Surgical Quality Assessment (SQA)
Surgical Quality Assessment (SQA) Software
• Integrating rating features
• More efficient video navigation/browsing
Marco A. Hudelist, Heinrich Husslein, Bernd Muenzer, Sabrina Kletz and Klaus Schoeffmann. “A Tool to Support Surgical Quality Assessment“, in Proceedings of the Third IEEE International Conference on
Multimedia Big Data (BigMM), Laguna Hills, CA, USA, 2017, pp. 238-239.
(Diagnostic) Decision Support
Challenges and Requirements
There is a Need for Complete Systems!
• Anomalies are missed
• Detection depends on experience
• There is a lack of medical personnel for large-scale screening programs
There is a Need for Complete Systems!
• Medical knowledge transfer
• Automated analysis / detection / classification
• Feedback / visualization & administrative
• Medical knowledge transfer: needs data with ground truth
• High detection accuracy
• Fast and efficient: real-time feedback and large scale
• Fit the normal examination procedures
• Assist administrative and report writing work
• Adhere to ethical, legal, privacy challenges & regulations
Key Challenges & Requirements
Gastrointestinal (GI) Case Study
(challenges, system support, datasets, diagnostic decision support, ...)
• Many types of diseases can potentially affect the human gastrointestinal (GI) tract
• about 2.8 million new luminal GI cancers (esophagus, stomach, colorectal) are detected yearly
• the mortality is about 65%
• Screening of the GI tract using different types of endoscopy…
• is costly (colonoscopy according to NY Times: $1100/patient, $10 billion)
• consumes valuable medical personnel time (1-2 hours)
• does not scale to large populations
• is intrusive to the patient
• …
• Current technology may potentially enable automatic algorithmic screening and assisted examinations
→ a truly interdisciplinary activity with high chances of societal impact
GI Tract Challenges and Potential
Colorectal Cancer
Women
Men
Colorectal cancer is the third most common cause of cancer mortality for both women and men, and early detection is important for survival: the 5-year survival probability rises from a low 10-30% if detected in later stages to a high 90% if detected in early stages.
Colonoscopy is not the ideal screening test. On average, 20% of polyps (possible predecessors of cancer) are missed or incompletely removed. The risk of getting cancer largely depends on the endoscopist's ability to detect and remove polyps. There are large inter- and intra-clinician variations.
A 1% increase in detection can decrease the risk of cancer by 3%.
Automatic Detection of
Anomalies
Colonoscopy & Gastroscopy
• A polyp is an abnormal growth
of tissue attached to the underlying mucosa
• Detection accuracy depends on experience and skills
• average miss rates of approx. 20%
• large inter- and intra-clinician variations (e.g., a Norwegian study shows variations between 36-65% for polyps)
• should reach a high (>85%) accuracy threshold to be acceptable
• Current technology may potentially enable
automated algorithmic assisted examinations
• Introduce a digital “third eye”
(with high accuracy and real-time processing)
Standard endoscopy: Live Polyp Detection
A complete System
System Overview
Medical Knowledge Transfer
(Data Collection)
Available GI Datasets
Name | Contains | Annotation | Size | Type | Usage
CVC-ClinicDB | Polyps | GT masks | 12000 images (several versions) | Trad. | ©, by permission
ETIS-Larib Polyp DB | Polyps, Normal | GT masks | 1500 images | Trad. | ©, by permission
ASU-Mayo Clinic DB | Polyps, Normal | GT masks | 20 videos | Trad. | ©, by permission
Colonoscopy Videos DB | Various Lesions | Sorted | 76 videos | Trad. | Academic
Capsule Endoscopy DB | Various Lesions and Findings | Sorted | 3170 images, 47 videos | VCE | Academic, by request
GastroAtlas | Various Lesions and Findings | Sorted, Text annotations | 4449 videos | Trad. | Academic
WEO Atlas | Various Lesions and Findings | Sorted, Text annotations | ? | Trad. | Academic
GASTROLAB | Various Lesions and Findings | Sorted, Text annotations | ? | Trad. | Academic
Atlas of GE | Various Lesions | Sorted, Text annotations | 669 images | Trad. | ©, by permission
KID | Various Lesions | Sorted | 2500 + 47 videos | Trad. | ©, by permission
ASU-Mayo dataset: POLYPS
• 20 videos
• 10 with polyps, 10 without
• 8-64 seconds long
• varying resolution
• ~18.000 frames/images
• image mask of polyp (ground truth)
• (currently) restricted use
CVC: POLYPS
• CVC-356 – 356 polyp images, 1350 normal frames
• CVC-612 – 612 polyp images, 1350 normal frames
• CVC-968 – 968 polyp images, 1350 normal frames
• CVC-12K – 10025 polyp images, 1929 normal frames
• image mask of polyp (ground truth)
• (currently) restricted use
Need more data to transfer the medical knowledge, and thus tools …
• Which image is not from the same class?
… and it gets worse …
• Making a mistake between cats and dogs may not matter,
but a misclassification here may have lethal consequences
Why Can’t CS People Do the Annotation!?
Pylorus Z-line Z-line Z-line Z-line Z-line
Available time of the clinicians?
• Simple and efficient
• Web-based
• Assisted object tracking
Video Annotation Subsystem
"Expert Driven Semi-Supervised Elucidation Tool for Medical Endoscopic Videos"
Zeno Albisser, et al.
Proceedings of MMSys, Portland, OR, USA, March 2015
• For large collection of images
• VV / Kvasir dataset
• Fully cleaned
• Feature extraction
mechanisms
• Different unsupervised
clustering algorithms
• Hierarchical image collection
visualization
• Open source: ClusterTag
https://bitbucket.org/mpg_projects/clustertag
ClusterTag: Image Clustering and Tagging Tool
"ClusterTag: Interactive Visualization, Clustering and Tagging Tool for Big Image Collections"
Konstantin Pogorelov, et al.
Proceedings of ICMR, Bucharest, Romania, June 2017
• Still need even more efficient tools and data of entire procedures
1. “Annotation” during examination
2. Video with bookmarks
3. Annotate bookmarks
4. Automatically annotate
neighboring frames using
object tracking – and verify
Next version of the annotation tool
• Multi-Class Image Dataset for Computer Aided GI Disease Detection
• GI endoscopy images
• Some images contain the position and configuration of the endoscope (scope guide)
• 8 different anomalies and anatomical landmarks
• v1: 500 images per class, 6 pre-extracted global features
• v2: 1000 images per class
• v3: 16 classes, multi-label – to be released soon
• Open source: http://datasets.simula.no/kvasir/
The Kvasir Dataset
"Kvasir: A Multi-Class Image-Dataset for Computer Aided Gastrointestinal Disease Detection"
Konstantin Pogorelov, et al.
Proceedings of MMSYS, Taiwan, June 2017
• Bowel Preparation Quality Video
• 21 GI endoscopy videos of colon
• Some frames contain the position and
configuration of the endoscope (scope
guide)
• 4 classes showing the four-score Boston
Bowel Preparation Scale (BBPS)-defined
bowel-preparation quality
• 0 - very dirty
• …
• 3 - very clean
• Open source:
http://datasets.simula.no/nerthus/
The Nerthus Dataset
"Nerthus: A Bowel Preparation Quality Video Dataset"
Konstantin Pogorelov, et al.
Proceedings of MMSYS, Taiwan, June 2017
• Kvasir does not contain segmentation masks
• 1000 pixel-accurate masks of the polyps in Kvasir
• Some similar datasets exist
(e.g., CVC-356, CVC-612, ETIS-Larib Polyp DB),
but small, restricted, etc.
• http://datasets.simula.no/kvasir-seg/
The Kvasir-SEG Dataset
"Kvasir-SEG: A Segmented Polyp Dataset"
Debesh Jha, et al.
Proceedings of MMM, Korea, January 2020
GI Anomaly
Detection System
• Common approaches
• Handcrafted features
• Convolutional neural network
• Generative Adversarial Networks
• Easy to extend with new diseases
• Easy to extend with new algorithms
• Easy to train
• Results are explainable?
• Disease Localization?
• Real-time?
Requirements: Detection and Automatic Analysis Subsystem
State-of-The-Art: Example Detection Systems – 5 years ago
Polyp-Alert
• detects polyps using edges and texture
• near real-time feedback during colonoscopy (10fps)
• detected 97.7% (42 of 43) of polyp shots on 53 randomly selected videos (not per-frame detection)
• one of the few end-to-end systems
• Wallapak Tavanapong – from MM community
Hundreds of new approaches in recent years, many with good detection results…
Performance
(accuracy and speed)
• Mayo dataset (18781 images/frames)
• Masks for all polyps
• GF (global features):
• JCD and Tamura
• recall 98.50%, precision 93.88%, fps ~300
• CNN:
• Modified Inception v3: recall 95.86%, precision 80.78%, fps: ~30
• Inception v3 + WEKA: recall: 88.87%, precision: 89.16%, fps: ~30
ASU Mayo Dataset: Polyp Detection
"EIR - Efficient Computer Aided Diagnosis Framework for Gastrointestinal Endoscopies"
Michael Riegler, et al.
Proceedings of CBMI, Bucharest, Romania, June 2016
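The recall and precision values reported on this slide follow the standard definitions based on true positives, false positives, and false negatives. The sketch below shows those definitions together with the F1 score; the counts are hypothetical and are not taken from the EIR evaluation.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 polyp frames found, 10 false alarms, 30 polyp frames missed.
precision, recall = precision_recall(tp=90, fp=10, fn=30)
f1 = f1_score(precision, recall)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.3f}")
```

For screening, recall is usually the more critical of the two, because a missed polyp is costlier than a false alarm that a clinician can dismiss.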
• Resource consumption and processing performance of GF:
• CNNs (also including GPU support)?
• tests so far: ~30 fps (same GPU as above)
• but adding layers, more networks, … !?? (newer GPU)
• Inception v3: 66 fps, plain CNN: ~40-45 fps
• GAN: ~12 fps (for 160x160)
ASU Mayo Dataset: Polyp Detection
• Vestre Viken (VV) multi-disease dataset (250 images per class)
• GF:
• recall: 90.60%
• precision: 91.40%
• fps: ~30
• CNN:
• recall: 87.20%
• precision: 87.90%
• fps: ~30
VV Dataset: Multi-Disease Detection
"Efficient disease detection in gastrointestinal videos - global features versus neural networks"
Konstantin Pogorelov, et al.
Multimedia Tools and Applications, 2017
• GF
• CNN
VV Dataset: Multi-Disease Detection
"Efficient disease detection in gastrointestinal videos - global features versus neural networks"
Konstantin Pogorelov, et al.
Multimedia Tools and Applications, 2017
• 7 different algorithms
• Convolutional neural networks (CNN) (2) – trained from scratch
• 3-layers
• 6-layers
• Transfer learning (1) – retrained Inception v3
• Global features (4)
• 2 global features (JCD, Tamura)
• 6 global features (JCD, Tamura, Color Layout, Edge Histogram, Auto Color Correlogram and PHOG)
• 2 different algorithms (Random forest and logistic model tree)
• 2 baselines
• Random Forest with one global feature
• Majority class
• 2-fold cross-validation
Kvasir Dataset v1: Multi-Disease Detection
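The majority-class baseline and the 2-fold cross-validation listed above can be sketched in plain Python. This is an illustrative sketch only: the function names, the random split and the toy labels are assumptions, not the setup used for the Kvasir experiments.

```python
import random
from collections import Counter


def two_fold_split(n_samples: int, seed: int = 0):
    """Shuffle the sample indices and cut them into two folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    half = n_samples // 2
    return indices[:half], indices[half:]


def majority_class_accuracy(train_labels, test_labels) -> float:
    """Predict the most frequent training label for every test sample."""
    majority = Counter(train_labels).most_common(1)[0][0]
    hits = sum(1 for label in test_labels if label == majority)
    return hits / len(test_labels)


def two_fold_majority_baseline(labels, seed: int = 0) -> float:
    """Train on one fold, test on the other, swap, and average the accuracy."""
    fold_a, fold_b = two_fold_split(len(labels), seed)
    scores = []
    for train_idx, test_idx in ((fold_a, fold_b), (fold_b, fold_a)):
        train = [labels[i] for i in train_idx]
        test = [labels[i] for i in test_idx]
        scores.append(majority_class_accuracy(train, test))
    return sum(scores) / len(scores)


# Hypothetical toy labels (not Kvasir data): 4 classes with 10 samples each.
toy_labels = [f"class_{c}" for c in range(4) for _ in range(10)]
print(two_fold_majority_baseline(toy_labels))
```

Any real classifier should beat this number clearly; if it does not, the features carry little information about the classes.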
Kvasir Dataset v1: Multi-Disease Detection
Dyed and Lifted Polyp, Dyed Resection Margin
Kvasir Dataset v1: Multi-Disease Detection
Cecum, Pylorus
• Using the same GF and some new deep features, i.e.,
• Inception v3 and ResNet50 models pre-trained on the ImageNet dataset
• Used different ML classifiers:
• random tree (RT)
• random forest (RF)
• logistic model tree (LMR) – performed best
• Uses the weights of the 1000 pre-defined concepts as features
• Top-layer input as feature vector
(16384 for Inception v3 and 2048 for ResNet50)
Kvasir Dataset v1 → v2: Multi-Disease Detection
[Figure: processing pipeline. Pretrained model → output or top-layer input weights → WEKA for classification]
• Multiclass: 16 classes of anomalies and landmarks
• Widely varying numbers of images for the different classes
• Combination of retrained networks
Kvasir Dataset v2 → v3: Multi-Disease Detection
• MICCAI
• …
• Medico @ MediaEval
• BioMedia @ ACM MM
Competitions
Team | Approach | F1 | FPS
SCL-UMD | Global-feature and deep-feature extraction, Inception-V3 and VGGNet CNN models, followed by machine-learning-based classification using RT, RF, SVM and LMR classifiers | 0.848 | 1.3
FAST-NU-DS | Global and local features combined, followed by data size reduction by applying K-means clustering and then using a logistic regression model for the classification | 0.767 | 2.3
ITEC-AAU | Two different custom Inception-like CNN models | 0.755 | 1.4
HKBU | A manifold learning method (bidirectional marginal Fisher analysis) learning a compact representation of the data, then a machine-learning-based multi-class support vector machine is used for the classification | 0.703 | 2.2
SIMULA | GF-feature extraction, ResNet50 and Inception-V3 CNN models, followed by machine-learning-based classification using RT, RF and LMR classifiers | 0.826 | 46.0

Team and Run Name | F1 | MCC | Average Processing Speed
HCMUS | 0.934236452 | 0.931232439 | Fast enough
S@M (Simula) | 0.929733339 | 0.928383755 |
LesCats (Simula) | 0.923640116 | 0.922827982 |
RUNE (Simula) | 0.855590739 | 0.855590694 |
UMM-SIM_detection_InResV2-Van_3712 | 0.836795839 | 0.836636058 |
ParaNoMundo_detection_kt12dense201_3808 | 0.811417906 | 0.814635359 |
AAUITEC_detection_LSVM-comb2_5293 | 0.866259873 | 0.864100277 |
SIMULA_detection_run1_5293 | 0.814535427 | 0.811510687 |
FASTNUCES_detection_ver1_300 | 0.586802677 | 0.602579617 |
NOAT_detection_1_5293 | 0.391347034 | 0.390125827 |
HKBU_detection_1_5293 | 0.482962822 | 0.460894862 |
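The second table reports F1 together with the Matthews correlation coefficient (MCC). As a reminder of what the two numbers measure, here is a minimal sketch for the binary case. The competition tasks were multi-class, so this is a simplification, and the confusion-matrix counts are made up for illustration.

```python
import math


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 expressed directly in confusion-matrix counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary MCC: +1 is perfect, 0 is chance level, -1 is total disagreement."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, not taken from any of the submitted runs.
tp, tn, fp, fn = 90, 80, 20, 10
print(f"F1  = {f1_from_counts(tp, fp, fn):.3f}")
print(f"MCC = {mcc_from_counts(tp, tn, fp, fn):.3f}")
```

Unlike F1, MCC also uses the true negatives, which is why the two columns can rank runs slightly differently.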
Compared:
• Handcrafted global features (GF-D) using LIRE
• Retrained and fine-tuned existing DL architectures (RT-D)
• Generative adversarial network (GAN)
• Combined various datasets captured by different equipment
in different hospitals.
• With our best-performing GAN-based detection approach, we reached a detection specificity of ~94% and an accuracy of ~90% with
only 356 training and 6,000 test samples
(slightly better when increasing the training size)
• though with a bit too many false positives (a bit low sensitivity)
The Next Level: Comparing Handcrafted and Deep
Learning Features – Cross Datasets
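For reference, the specificity, sensitivity and accuracy figures quoted above are all derived from the same four confusion-matrix counts. The sketch below shows the standard definitions; the counts are hypothetical and do not reproduce the cross-dataset experiment.

```python
def detection_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard detection metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        # Share of frames with a finding that were flagged.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        # Share of normal frames that were correctly left unflagged.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        # Share of all frames classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Hypothetical counts for illustration only.
metrics = detection_metrics(tp=90, tn=94, fp=6, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```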
Detecting Bowel Cleanness
Levels
• 7 different algorithms
• Convolutional neural networks (CNN) (2) – trained from scratch
• 3-layers
• 6-layers
• Transfer learning (1) – retrained Inception v3
• Global features (4)
• 2 global features
(JCD, Tamura)
• 6 global features
(JCD, Tamura, Color Layout, Edge Histogram, Auto Color Correlogram and PHOG)
• 2 different algorithms (Random forest and logistic model tree)
• 2 baselines
• Random Forest with one global feature
• Majority class
• 2-fold cross-validation
Nerthus Dataset: Bowel Cleanness Level
Localization / Segmentation
• Detection first, then process only frames containing polyps
• Image enhancements
• Detects curve-shaped objects and local maxima
• Builds energy map and selects 4 possible locations
• Localization performance:
• recall: 31.83%
• precision: 32.07%
• ~30 fps
• later, on a better GPU: ~75 fps (detection: 300 fps; localization: 100 fps)
ASU Mayo Dataset: Polyp Localization
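One way to read the ~75 fps figure: if every frame passes through detection and then localization back to back, the per-frame times add up, and 1 / (1/300 + 1/100) = 75 fps. The sketch below assumes exactly this sequential model (the slide does not state how the number was measured). polyp_fraction is a hypothetical parameter for the case where only frames flagged by detection are localized.

```python
def sequential_fps(*stage_fps: float) -> float:
    """Throughput when every frame runs through all stages one after another."""
    seconds_per_frame = sum(1.0 / fps for fps in stage_fps)
    return 1.0 / seconds_per_frame


def expected_fps(detection_fps: float, localization_fps: float,
                 polyp_fraction: float) -> float:
    """Average throughput when only a fraction of frames needs localization."""
    seconds_per_frame = 1.0 / detection_fps + polyp_fraction / localization_fps
    return 1.0 / seconds_per_frame


print(sequential_fps(300, 100))        # every frame is localized
print(expected_fps(300, 100, 0.10))    # hypothetical: 10% of frames flagged
```

Under this model the combined rate can never exceed the slowest stage, so speeding up localization matters more than speeding up detection.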
• Can we generate the full mask,
not only pointing to one of the affected pixels?
• Extended and improved the ResUNet
architecture and compared to several other
segmentation systems
Kvasir-SEG: Generate the anomaly mask
[Figure: architecture diagram of the extended ResUNet, with an encoding and a decoding path between Input and Outputs. Building blocks shown: Conv2D (3×3), BN, ReLU, Addition, Squeeze & Excite, Atrous Spatial Pyramidal Pooling (ASPP), Upsampling, Attention, Concatenate, and a final Conv2D (1×1) with Sigmoid.]
• Can we generate the full mask,
not only pointing to one of the affected pixels?
• Extended and improved the ResUNet
architecture and compared to several other
segmentation systems:
• U-Net
• ResUNet
• ResUNet-mod
• ResUNet++
Kvasir-SEG: Generate the anomaly mask
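Comparing segmentation systems requires a score for how well a predicted mask overlaps the ground-truth mask. The slides do not name the metric at this point; the Dice coefficient and intersection over union (IoU) are assumed here because they are the usual choices. The masks are tiny made-up examples stored as nested lists of 0/1.

```python
def dice_and_iou(pred, truth):
    """Overlap scores for two same-sized binary masks (lists of 0/1 rows)."""
    intersection = pred_pixels = truth_pixels = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += p & t
            pred_pixels += p
            truth_pixels += t
    union = pred_pixels + truth_pixels - intersection
    # Two empty masks agree perfectly, so both scores are defined as 1.0.
    dice = 2 * intersection / (pred_pixels + truth_pixels) if union else 1.0
    iou = intersection / union if union else 1.0
    return dice, iou


# Hypothetical 3x3 masks for illustration.
truth_mask = [[0, 1, 1],
              [0, 1, 1],
              [0, 0, 0]]
pred_mask = [[0, 0, 1],
             [0, 1, 1],
             [0, 0, 1]]

dice, iou = dice_and_iou(pred_mask, truth_mask)
print(f"Dice = {dice:.2f}, IoU = {iou:.2f}")
```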
Kvasir-SEG: Generate the anomaly mask
[Figure: trained and tested on Kvasir-SEG. Columns: Original, Ground truth, U-Net, ResUNet, ResUNet-mod, ResUNet++]
Kvasir-SEG: Generate the anomaly mask
[Figure: trained on CVC-612 and tested on Kvasir-SEG. Columns: Original, Ground truth, U-Net, ResUNet, ResUNet-mod, ResUNet++]
Preprocessing &
Augmentation
• Too little data!!
• Blurry images due to camera motion
• Objects too close to camera
• Under- or over-lit scenes
• Flares
• Artificial objects and natural “contaminations”
• Low resolution of capsule endoscopes
• …
Data Challenges: Preprocessing
Data Enhancements for CNN Training
• Artifacts in the images can
influence the algorithm
• Understanding of what the
algorithm reacts to is crucial
Borders and Overlays
• Results on Kvasir + CVC-968
• Accuracy improved for almost all models with some preprocessing
(F1 improved by 0.7% to 4.4%)
Borders and Overlays
• Replacing artifacts in the video/image
• Different methods
• Clipping
• Autoencoders
• Context encoder
• Context Conditional (CC)-GAN
• Some differences, but marginal
GAN inpainting of Navigation Box
Automatic Detection of
Angiectasia
Video Capsule Endoscopy
Video Capsule (PillCam)
• Standard colonoscopy:
• expensive
• does not scale
• intrusive
• Wireless video capsule endoscopy:
• scales better
• less intrusive
• possible to combine examinations!?
• watch hours of video
• less expensive?
(detection might lead to an endoscopy)
• Angiectasia is a vascular lesion that can cause GI bleeding
• Medical specialists reach a detection accuracy of about 69%
• Medical systems should reach an 85% threshold to be
acceptable for clinical use
Angiectasia Detection
Angiectasia Detection: Varying Difficulty
• By far, GANs give the best
detection:
• sensitivity: 98%
• specificity: 100%
• BUT, sloooooow…
• Several approaches are better
than the average doctor (69%)
• Most of the approaches have too low a detection rate,
but are still better than the baseline
• Compromise between
accuracy and speed
Detection Compared
• VCE dataset from GIANA 2017
(300 with angiectasia and 300 without)
• 10-fold cross validation
Detection Feedback
Detection Subsystem Outputs
• Visualize the output of the system to the medical doctors
• Simple and easy to understand (most important)
• Easy to integrate in hospitals
• Live support
• Usable for automatic reports, etc.
Real-time Detection Feedback
Increasing Understanding
&
Assisting Administrative Work
• Understanding:
A black box will not work – neither for patients nor clinicians
• Reporting:
Critical for communication and evidence, but a huge overhead
• Inconsistent descriptions of abnormalities
• Poor adoption of existing standards
• Time-consuming (up to 15 minutes or more)
• Boring, which reduces job satisfaction
Understanding and Reporting
Mimir: Reporting of Endoscopies
• A way of interpreting the output of a neural network
• deeper analysis of why the model produces a given result
• class discriminatory visualizations based on selected class
and layer.
• tools for uploading and managing various models.
• Automatic generation of modifiable medical reports
• Produced Visualizations
• Grad-CAM technique
• saliency and class activation maps
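As a rough illustration of the Grad-CAM idea named above: each feature map of a chosen layer is weighted by the average gradient of the class score with respect to that map, the weighted maps are summed, and negative values are clipped. The sketch uses plain nested lists and made-up numbers. In practice the activations and gradients would come from the deep learning framework, and nothing here reflects how Mimir itself is implemented.

```python
def grad_cam(activations, gradients):
    """Class activation heatmap from K feature maps and their gradients.

    Both arguments are lists of K maps, each an H x W list of floats.
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weights: global average of the gradients for each feature map.
    weights = [sum(sum(row) for row in grad) / (height * width)
               for grad in gradients]
    heatmap = [[0.0] * width for _ in range(height)]
    for weight, feature_map in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * feature_map[i][j]
    # ReLU: keep only regions that increase the score of the selected class.
    return [[max(0.0, value) for value in row] for row in heatmap]


# Hypothetical 2x2 feature maps and gradients for two channels.
toy_activations = [[[1.0, 0.0], [0.0, 0.0]],
                   [[0.0, 0.0], [0.0, 1.0]]]
toy_gradients = [[[1.0, 1.0], [1.0, 1.0]],
                 [[-1.0, -1.0], [-1.0, -1.0]]]

for row in grad_cam(toy_activations, toy_gradients):
    print(row)
```

The resulting low-resolution map is normally upsampled to the image size and overlaid on the endoscopy frame.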
Human Reproduction Case
Study
Why semen analysis?
• Every year, over 45 million couples
experience involuntary childlessness,
with 40% of cases due in some part to
male fertility problems
• Semen analysis is one of the first
procedures performed when determining
infertility.
• Current methods are either time-
consuming and prone to human error,
or require expensive laboratory
equipment.
What is semen quality?
• When analyzing semen quality, we often
look at multiple visual features of the
spermatozoa (sperm) together with
information about the patient.
• The problem is that we know that patient
parameters impact semen quality, but we
don’t know how.
• This is a true multimodal problem where
our expertise could have great impact.
• But right now, let’s look at some visual
features which are commonly used to
determine quality.
Sperm count
Morphology is used to assess the shape and size of a sperm,
focusing on the tail, midpiece and head.
Morphology Examples
Motility is used to assess the movements of each sperm, which can be
grouped into progressive, non-progressive and immotile.
[Figure labels: Progressive, Non-progressive, Immotile]
Non-Progressive Example
Progressive Example
Immotile Example
VISEM – A Multimodal Video Dataset of Human Spermatozoa
• We have a dataset consisting of 85 microscopic
videos of human semen, all from different
participants.
• Each video comes with a preliminary semen
analysis done according to WHO standards.
• The dataset also contains information about each
participant (such as age and BMI), sex hormone
levels of the participant, and some parameters
extracted from existing sperm analysis machines.
• Data is open-source and available at
datasets.simula.no/visem.
[Figure labels: Low, Mid-low, Mid-high, High]
How should we tackle this?
• Determining the different visual
features requires different
approaches.
• Morphology focuses more on the
spatial features of a frame, while
motility requires the temporal
dimension.
• Not all videos are created equal; some
videos are not properly focused or
include fluid drift.
• We must find clever solutions which
incorporate the temporal information
of the frames together with the
participant-related data.
Baseline Approach
• No other methods to compare against
directly.
• To create a baseline, we calculate
the ZeroR across our collected
dataset.
• The metrics used to measure
performance are the mean
absolute error (MAE) and the
root mean squared error (RMSE).
Morphology Baseline
Motility Baseline
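The baseline described above can be sketched as follows. For a numeric target, ZeroR simply predicts the mean of the training values for every sample; MAE and RMSE then measure how far that constant is from the true values. The numbers below are hypothetical percentages, not VISEM data.

```python
import math


def zero_r_prediction(train_targets):
    """ZeroR for regression: always predict the mean of the training targets."""
    return sum(train_targets) / len(train_targets)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical values for illustration (e.g. percentage of progressive sperm).
train_targets = [40.0, 50.0, 60.0]
test_targets = [45.0, 55.0, 70.0]

constant = zero_r_prediction(train_targets)
predictions = [constant] * len(test_targets)
print("ZeroR prediction:", constant)
print("MAE :", mean_absolute_error(test_targets, predictions))
print("RMSE:", root_mean_squared_error(test_targets, predictions))
```

RMSE penalizes large errors more strongly than MAE, so reporting both shows whether a model fails badly on a few videos.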
3D Convolutional Neural Networks
• Using multiple frames to predict
quality with 3D convolutional
neural networks.
Using an autoencoder to extract temporal features into images.
• We use an autoencoder which takes
multiple frames to extract temporal
features into an RGB image.
• Used to predict both morphology
and motility.
Generate optical flow from the video frames.
• Compress the temporal
information of sequential frames
by using sparse or dense optical
flow.
• Clearly see the movement of the
different sperm across time.
• Using synthetic sperm videos to
accurately estimate optical flow
using GANs.
MediaEval 2019 – Medico Multimedia Task
• Three different tasks related to analyzing human
semen.
• Main task is predicting motility and morphology
using the VISEM dataset.
• 5 submissions using a variety of approaches.
• Tune in next week and join the fun in 2020!
Open challenges and future directions.
• How should we combine the patient
data with the visual features to
better predict semen quality?
• What data to use and how to include it.
• Tracking individual sperm cells to
find the “best” spermatozoon.
• Combining semen analysis with
embryo data to better understand
the relationship between sperm and
successful egg fertilization.
• Next step…
Embryo analysis and prediction
• Analyzing time-lapse videos of
embryo development.
• Get a better understanding of
early embryo development and
the health of offspring.
• Increase success rate of in vitro
fertilization.
• Ethical and legal challenges.
Predicting Performance of Soccer Players
Initial challenge: Logging and Monitoring
pmSys: Reporting using a mobile app
Coach Web Portal
• See team overviews
• all, averages
• planned load
• Send reminders
• See individual views
• Simple automatic “predictions”
Would like to perform proper predictions!
• 2008-09, Spanish 1st division: 24,360 player-days absent
• 2017-19:
• Premier league clubs paid £217m in wages to injured players
• Manchester United has an average cost of £870.00 per injury (high salaries)
• Champions Manchester City suffered the second fewest number of injuries
• 2018-19
• Manchester City won PL with a minimum margin:
98 vs 97 points (2nd and 3rd highest ever)
Important to find an optimal training regime, avoid injuries and
pick the right players for the game
Would like to perform proper predictions!
• Recurrent neural networks – Long Short-Term Memory (LSTM)
• handles the complexity of sequences – well-suited to classifying, processing and
making predictions based on time series data
• Motivating example: Airline passengers
LSTM
• Dataset
• two professional Norwegian teams
• data from 2017 and 2018
• 6000 days of reports
• many parameters, but our initial experiments used “readiness to train”
• LSTM
• sequence length of 36
• 30 epochs
• batch size of 4
• 4 layers – input, 2 hidden, output
• rmsprop optimizer
• Model training
• training and predicting on the same player
• training on all players but one
• Aim: detecting positive and negative peaks
Initial Experiments
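A minimal sketch of how a daily "readiness to train" series could be cut into LSTM training samples, assuming that the sequence setting of 36 means a look-back window of 36 reports used to predict the next value. The helper names and the toy series are hypothetical; the actual model code is not shown in the slides.

```python
import math


def make_windows(series, window: int = 36):
    """Turn one time series into (input window, next value) training pairs."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


def batches_per_epoch(n_samples: int, batch_size: int = 4) -> int:
    """Number of weight updates in one pass over the training samples."""
    return math.ceil(n_samples / batch_size)


# Hypothetical series: 40 daily scores (here just 0..39 to make the
# window boundaries easy to check).
toy_series = list(range(40))
windows, next_values = make_windows(toy_series, window=36)

print(len(windows), "samples of length", len(windows[0]))
print("updates over 30 epochs:", 30 * batches_per_epoch(len(windows), batch_size=4))
```

With only a few dozen reports, a single player yields very few samples, which is one reason for pooling the reports of several players.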
Analyzing player data using LSTM (training on one player)
Needs more data than training on
just ONE player provides…
Team 1
Team 2
Analyzing player data using LSTM (training on all players but one)
Predicting the positive and negative peaks with
a precision and recall above 90%
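The evaluation of peak prediction can be illustrated as follows. This sketch assumes that a peak is a strict local maximum (positive) or minimum (negative), and that a predicted peak counts as correct if it lies within `tolerance` days of a real one. Both are assumptions, since the slides do not give the exact definition, and the two series are made up.

```python
def find_peaks(series):
    """Indices of strict local maxima (positive) and minima (negative)."""
    positive, negative = [], []
    for i in range(1, len(series) - 1):
        if series[i] > series[i - 1] and series[i] > series[i + 1]:
            positive.append(i)
        elif series[i] < series[i - 1] and series[i] < series[i + 1]:
            negative.append(i)
    return positive, negative


def peak_precision_recall(predicted, actual, tolerance: int = 0):
    """Match peak positions, allowing a shift of up to `tolerance` days."""
    hit_predicted = sum(1 for p in predicted
                        if any(abs(p - a) <= tolerance for a in actual))
    hit_actual = sum(1 for a in actual
                     if any(abs(p - a) <= tolerance for p in predicted))
    precision = hit_predicted / len(predicted) if predicted else 0.0
    recall = hit_actual / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical reported and predicted readiness scores.
reported = [5, 7, 6, 3, 4, 8, 6]
forecast = [5, 6, 7, 4, 3, 7, 6]

real_pos, real_neg = find_peaks(reported)
pred_pos, pred_neg = find_peaks(forecast)
print("positive peaks:", peak_precision_recall(pred_pos, real_pos, tolerance=1))
print("negative peaks:", peak_precision_recall(pred_neg, real_neg, tolerance=1))
```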
• If we manage to give good predictions…
• better training
• fewer injuries
• better results
• Challenges
• enough data
• detect all corner cases
• making the users believe in the predictions
Consequences and challenges
So, MEDICAL MULTIMEDIA -
all problems solved!!??
• Still improve accuracy and system performance
1. Full system integration
2. Exploiting domain expert knowledge – build datasets
3. Integration of various data, multi-modality – new sensors
4. Explainable AI
5. Patient context information
6. Visualization (AR/VR)
7. Decision support and administrative aids
8. …
• The potential for real impact is HUGE!!
• screening / diagnosis
• personalized medicine
• automatic treatment
• improving exercise, rehabilitation and sport performance
• autonomous and remote surgeries
• …
Open Challenges & Potential
"Multimedia and Medicine: Teammates for Better Disease Detection and Survival"
Michael Riegler, et al.
Proceedings ACM MM, Amsterdam, The Netherlands, October 2016
• We have given several case-specific examples, but in general, the underlying challenges are common to all of them
• Doctors want to use all the data for general support:
analysis, diagnostics, reporting, teaching, statistics, similarity search / comparisons, …
• Currently, …
• more and more high-quality data is recorded / produced
• data analysis methods are promising
• multimodal data analysis is not very common
• good visualization tools exist, but are not used (e.g., AR, VR, …)
• some tools are missing
• many (other) areas produce separate (isolated) methods
• …
• and, we need a complete integrated system!
→ Our multimedia community is needed
Summary
ImageCLEF 2020
CLEF, 22-25 September, Thessaloniki, Greece
http://www.imageclef.org/2020
#ImageCLEFlifelog2020# (4th edition) An increasingly wide range of personal devices that allow capturing pictures, videos, and audio clips for every moment of our lives, are becoming available. In this context, the task addresses the problems of lifelogging data retrieval and summarization.
Organizers: Duc-Tien Dang-Nguyen (University of Bergen), Luca Piras (Pluribus One & University of Cagliari), Michael Riegler & Pål Halvorsen (Simula Research Laboratory), Minh-Triet Tran (University of Science), Cathal Gurrin (Dublin City University), Mathias Lux (Klagenfurt University).
#ImageCLEFcoral2020# (2nd edition) The increasing use of structure-from-motion photogrammetry for modelling large-scale environments from action cameras has driven the next generation of visualization techniques. The task addresses the problem of automatically segmenting and labeling a collection of images that can be used in combination to create 3D models for the monitoring of coral reefs.
Organizers: Jon Chamberlain, Adrian Clark, & Alba García Seco de Herrera (University of Essex), Antonio Campello (Wellcome Trust).
#ImageCLEFmedical2020# (2nd edition) Medical images can be used in a variety of scenarios and this task will combine the most popular medical tasks of ImageCLEF and continue the last year idea of mixing various applications, namely: automatic image captioning and scene understanding, medical visual question answering and decision support on tuberculosis. This allows to explore synergies between tasks.
Organizers: Asma Ben Abacha & Dina Demner-Fushman (National Library of Medicine), Sadid A. Hasan, Vivek Datla & Joey Liu (Philips Research Cambridge), Obioma Pelka & Christoph M. Friedrich (University of Applied Sciences and Arts Dortmund), Alba García Seco de Herrera (University of Essex), Yashin Dicente Cid (University of Warwick), Serge Kozlovski, Vitali Liauchuk, & Vassili Kovalev (United Institute of Informatics Problems), Henning Müller (HES-SO).
#ImageCLEFdrawnUI2020# (new) Enabling people to create websites by drawing them on a piece of paper would make the webpage building process more accessible. The task addresses the problem of automatically recognizing hand drawn objects representing website UIs, which will be further translated into automatic website code.
Organizers: Paul Brie & Fichou Dimitri (teleportHQ), Mihai Dogariu, Liviu Daniel Ștefan, Mihai Gabriel Constantin, & Bogdan Ionescu (University Politehnica of Bucharest).
Contact on social media
Facebook: https://www.facebook.com/ImageClef
Twitter: https://twitter.com/imageclef
ImageCLEF 2020 is an evaluation campaign that is being organized as part of the CLEF (Conference and Labs of the Evaluation Forum) labs.
The campaign offers several research tasks that welcome participation from teams around the world.
The results of the campaign appear in the working notes, published by CEUR (CEUR-WS.org) and are presented in the CLEF conference.
Selected contributions among the participants will be invited for publication in the following year in the Springer Lecture Notes in Computer Science (LNCS), together with the annual lab overviews.
Target communities involve (but are not limited to): information retrieval (e.g., text, vision, audio, multimedia, social media, sensor data), machine learning, deep learning, data mining, natural language processing, image and video processing; with special emphasis on the challenges of multi-modality, multi-linguality, and interactive search.
Overall coordination
Bogdan Ionescu, University Politehnica of Bucharest, Romania
Henning Müller, HES-SO, Sierre, Switzerland
Renaud Péteri, University of La Rochelle, France
Important Dates (depending on tasks)
end of April, 2020: registration closes;
beginning of May, 2020: runs due;
end of May, 2020: working notes due.
#imageclef20 #clef2020
The End…

More Related Content

What's hot

Machine Learning for Medical Image Analysis: What, where and how?
Machine Learning for Medical Image Analysis:What, where and how?Machine Learning for Medical Image Analysis:What, where and how?
Machine Learning for Medical Image Analysis: What, where and how?
Debdoot Sheet
 
2014 Medical Imaging
2014 Medical Imaging2014 Medical Imaging
2014 Medical Imaging
Engku Fahmi
 
Medical image analysis and big data evaluation infrastructures
Medical image analysis and big data evaluation infrastructuresMedical image analysis and big data evaluation infrastructures
Medical image analysis and big data evaluation infrastructures
Institute of Information Systems (HES-SO)
 
BreastScreening: On the Use of Multi-Modality in Medical Imaging Diagnosis
BreastScreening: On the Use of Multi-Modality in Medical Imaging DiagnosisBreastScreening: On the Use of Multi-Modality in Medical Imaging Diagnosis
BreastScreening: On the Use of Multi-Modality in Medical Imaging Diagnosis
Instituto Superior Técnico
 
Medical Image Analysis and Its Application
Medical Image Analysis and Its ApplicationMedical Image Analysis and Its Application
Medical Image Analysis and Its Application
Subarno Pal
 
FoCAS Newsletter Issue 1: Septemeber 2013
FoCAS Newsletter Issue 1: Septemeber 2013FoCAS Newsletter Issue 1: Septemeber 2013
FoCAS Newsletter Issue 1: Septemeber 2013
FoCAS Initiative
 
Nessos
NessosNessos
Nessos
fcleary
 
140123 Workshop bioprinting Sirris
140123 Workshop bioprinting Sirris140123 Workshop bioprinting Sirris
140123 Workshop bioprinting Sirris
batgreg
 
Cloud Services for Education - HNSciCloud applied to the UP2U project
Cloud Services for Education - HNSciCloud applied to the UP2U projectCloud Services for Education - HNSciCloud applied to the UP2U project
Cloud Services for Education - HNSciCloud applied to the UP2U project
Helix Nebula The Science Cloud
 
Presentation
PresentationPresentation
Presentation
Videoguy
 
Dcb1419 f1 ultrasound
Dcb1419   f1 ultrasoundDcb1419   f1 ultrasound
Dcb1419 f1 ultrasound
Domhnall Macinnes
 
GUGC Info Session - Informatics and Bioinformatics
GUGC Info Session - Informatics and BioinformaticsGUGC Info Session - Informatics and Bioinformatics
GUGC Info Session - Informatics and Bioinformatics
Wesley De Neve
 

What's hot (12)

Machine Learning for Medical Image Analysis: What, where and how?
Machine Learning for Medical Image Analysis:What, where and how?Machine Learning for Medical Image Analysis:What, where and how?
Machine Learning for Medical Image Analysis: What, where and how?
 
2014 Medical Imaging
2014 Medical Imaging2014 Medical Imaging
2014 Medical Imaging
 
Medical image analysis and big data evaluation infrastructures
Medical image analysis and big data evaluation infrastructuresMedical image analysis and big data evaluation infrastructures
Medical image analysis and big data evaluation infrastructures
 
BreastScreening: On the Use of Multi-Modality in Medical Imaging Diagnosis
BreastScreening: On the Use of Multi-Modality in Medical Imaging DiagnosisBreastScreening: On the Use of Multi-Modality in Medical Imaging Diagnosis
BreastScreening: On the Use of Multi-Modality in Medical Imaging Diagnosis
 
Medical Image Analysis and Its Application
Medical Image Analysis and Its ApplicationMedical Image Analysis and Its Application
Medical Image Analysis and Its Application
 
FoCAS Newsletter Issue 1: Septemeber 2013
FoCAS Newsletter Issue 1: Septemeber 2013FoCAS Newsletter Issue 1: Septemeber 2013
FoCAS Newsletter Issue 1: Septemeber 2013
 
Nessos
NessosNessos
Nessos
 
140123 Workshop bioprinting Sirris
140123 Workshop bioprinting Sirris140123 Workshop bioprinting Sirris
140123 Workshop bioprinting Sirris
 
Cloud Services for Education - HNSciCloud applied to the UP2U project
Cloud Services for Education - HNSciCloud applied to the UP2U projectCloud Services for Education - HNSciCloud applied to the UP2U project
Cloud Services for Education - HNSciCloud applied to the UP2U project
 
Presentation
PresentationPresentation
Presentation
 
Dcb1419 f1 ultrasound
Dcb1419   f1 ultrasoundDcb1419   f1 ultrasound
Dcb1419 f1 ultrasound
 
GUGC Info Session - Informatics and Bioinformatics
GUGC Info Session - Informatics and BioinformaticsGUGC Info Session - Informatics and Bioinformatics
GUGC Info Session - Informatics and Bioinformatics
 

Medical Multimedia Systems and Applications

  • 1. Medical Multimedia Systems and Applications. Steven Hicks [1] / Michael Riegler [1], Pål Halvorsen [1], Klaus Schoeffmann [2]. [1] Simula Research Laboratory, Norway. [2] Institute of Information Technology, Klagenfurt University, Austria.
  • 2. Agenda • Introduction & Overview • Multimedia Data in Medicine • Characteristics of Endoscopic Video • Different Fields and Communities • Application 1: Post-Procedural Usage of Surgery Videos • Domain-Specific Storage for Long-Term Archiving • Medical Video Content Analysis and Datasets • Medical Video Interaction • Application 2: Diagnostic Decision Support and Case Studies • Knowledge Transfer • Analysis • Feedback • Explainability and Trust • Conclusions & Outlook (ACM Multimedia 2019 Tutorial: Medical Multimedia Systems and Applications)
  • 3. Notice: This presentation contains images and videos from medical surgeries, which you may find disturbing!
  • 4. Introduction
  • 5. Multimedia Data in Medicine: Medical inspections/interventions produce many kinds of data • Medical text • OR reports, patient records, … • Sensor signals • ECG, EEG, vital signs • Medical images (radiology) • Ultrasound, X-ray • CT, MRI, PET, … • Medical video • Screenings • Surgery → Signal Processing → Medical Imaging → Robotics → Multimedia → Data Mining
  • 6. Video Data Sources in Medicine • Traditional open surgery? • Minimally-invasive surgery • Interventions with endoscopes • Reduced trauma for patient • Less invasive and faster • Less rehabilitation time • Microscopic surgery
  • 7. Therapeutic Endoscopy • Rigid endoscope • Small incisions • Therapy / Surgery • Laparoscopy • Cholecystectomy • Gynecological Surgery • Urological Surgery • … • Arthroscopy • …
  • 8. Diagnostic Endoscopy • Flexible endoscope • Natural orifices • Diagnosis / Inspections • Gastroenterology (colonoscopy, gastroscopy) • Bronchoscopy • Hysteroscopy • … • WCE (Wireless capsule endoscopy)
  • 9. Endoscopic Video Examples
  • 10. Domain-specific Characteristics & Challenges • Full HD or 4K (even stereo 3D) • One-shot recordings • Up to multiple hours • Homogeneous color distribution • Visually very similar content • Circular content area • Fast motion • Geometric distortion • Specular reflections • Occlusions • Smoke, motion blur, blood, flying particles • Size!
  • 11. Literature Overview • Münzer, Bernd, Klaus Schoeffmann, and Laszlo Böszörmenyi. "Content-based processing and analysis of endoscopic images and videos: A survey." Multimedia Tools and Applications (2017): 1-40.
  • 12. Pre-Processing • Image Enhancement • Contrast enhancement, color misalignment correction, … • Camera calibration and distortion correction • Specular reflection removal • Comb structure removal & super resolution • … • Information Filtering • Frame filtering • Image segmentation
  • 13. Real-time Support at Intervention Time. Applications → Diagnosis support → Robot-assisted surgery → Context awareness → Augmented reality
  • 14. Post-Procedural Applications. Management and Retrieval • Compression and storage • Content-based retrieval • Temporal video segmentation • Video summarization • Visualization & Interaction. Quality Assessment → Skills assessment → Education & Training → Error Rating → Assessment of intervention quality
  • 15. Post-Procedural Use of Surgery Videos
  • 16. Motivation for Video Documentation • Video documentation of endoscopic procedures is on the rise • "A picture paints a thousand words", a moving picture paints millions! • In some countries it is already mandatory • Current documentation practice poses many problems • Relevant information is hard to retrieve • Huge amounts of storage space • High ratio of irrelevant data ("rubbish") • Very inefficient encoding (especially for HD content)
  • 17. Post-Procedural Use of Surgical Videos • Later inspection of specific moments • Discussion of critical moments (e.g., with the OR team) • Information to patients • Preparation of future interventions • Forensics & investigations (e.g., comparisons) • Training & teaching • Surgical quality assessment (technical errors)
  • 18. Full Storage of Endoscopic Videos • Exemplary hospital • 5 departments (Lap, Gyn, Arthro, GI, ENT) • 2 operating rooms per department, each with 4 operations/day, each operation ca. 1-2 h → i.e. 40 interventions per day, each ~90 min • 60 hours of video per day! • Assumption: HD 1920x1080, H.264/AVC • 270 GB / day (1 h = 4.5 GB) • 1.9 TB / week • 100 TB / year (200 TB with MPEG-2) • 4K: even more • Great challenge for a hospital's IT department!
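The storage figures on slide 18 can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function and parameter names are hypothetical, and it assumes 2 operating rooms per department (inferred from the 40 interventions per day), recording on 7 days a week and 365 days a year, and 1 TB = 1000 GB.

```python
def storage_estimate(departments=5, rooms_per_department=2,
                     ops_per_room_per_day=4, hours_per_op=1.5,
                     gb_per_hour=4.5):
    """Estimate video volume for the exemplary hospital on slide 18.

    All defaults come from the slide; hours_per_op is the ~90 min average,
    gb_per_hour is the HD 1920x1080 H.264/AVC figure (1 h = 4.5 GB).
    """
    interventions = departments * rooms_per_department * ops_per_room_per_day
    video_hours = interventions * hours_per_op
    gb_per_day = video_hours * gb_per_hour
    return {
        "interventions_per_day": interventions,
        "video_hours_per_day": video_hours,
        "gb_per_day": gb_per_day,
        # Assumption: recording every day of the week and of the year.
        "tb_per_week": gb_per_day * 7 / 1000,
        "tb_per_year": gb_per_day * 365 / 1000,
    }


if __name__ == "__main__":
    for key, value in storage_estimate().items():
        print(f"{key}: {value}")
```

Under these assumptions the estimate gives 40 interventions, 60 hours and 270 GB per day, about 1.9 TB per week and just under 100 TB per year, which matches the rounded values on the slide.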
  • 19. How to Reduce Storage Requirements? Exploit domain-specific characteristics: 1. Spatial compression optimization 2. Temporal compression optimization 3. Perceptual quality based optimization 4. Long-term archiving strategy • Transcoding • Savings shown on the slide: up to 30%, up to 40%, up to 93%
  • 20. Study on Video Quality • Subjective quality assessment • Catharina Hospital Eindhoven, NL • 37 participants • 19 experienced surgeons and 18 trainees • 7 women, 30 men, average age: 40 years • Subjective tests regarding maximum compression • Session 1: Perceivable quality loss • Double Stimulus (ITU-R BT.500-11) • Switch between reference and test video • Session 2: Perceivable semantic information loss • Single Stimulus (ITU-T P.910) • Assessing random videos (incl. reference) • Münzer, B., Schoeffmann, K., Böszörmenyi, L., Smulders, J. F., & Jakimowicz, J. J. (2014, May). Investigation of the impact of compression on the perceptional quality of laparoscopic videos. In 2014 IEEE 27th International Symposium on Computer-Based Medical Systems (pp. 153-158). IEEE.
  • 21. Assessment of Video Quality (Session 1) [Figure: average bitrate (Kb/s) and rating difference (Difference Mean Opinion Score, DMOS) per test condition; the conditions are crf (constant rate factor) values between 18 and 28 at resolutions 1920x1080, 1280x720, 960x540 and 640x360. Annotations: "subjectively better than reference"; reference video (MPEG-2, HD, 20 (35) Mbit/s), "lossless".]
  • 22. Assessment of Video Quality (Session 2) 1. Q1: Visually lossless with 8 Mbit/s (in comparison to 20 Mbit/s). Reduction: 60% data vs. 0% MOS 2. Q2: Good quality with 2.5 Mbit/s and reduced resolution (1280x720). Reduction: 88% data vs. 7% MOS 3. Q3: Acceptable quality with 1.4 Mbit/s and lower resolution (640x360). Reduction: 93% data vs. 31% MOS
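The data reductions quoted on slide 22 follow directly from the bitrates when the 20 Mbit/s reference is taken as the baseline. A minimal sketch, assuming that baseline; the function name is illustrative and not taken from the study:

```python
REFERENCE_MBIT_S = 20.0  # reference bitrate named on the slide


def data_reduction_percent(bitrate_mbit_s, reference_mbit_s=REFERENCE_MBIT_S):
    """Percentage of data saved relative to the reference bitrate."""
    return (reference_mbit_s - bitrate_mbit_s) / reference_mbit_s * 100.0


if __name__ == "__main__":
    # Quality levels and bitrates as listed on the slide.
    for label, bitrate in [("Q1", 8.0), ("Q2", 2.5), ("Q3", 1.4)]:
        saved = data_reduction_percent(bitrate)
        print(f"{label}: {bitrate} Mbit/s saves {saved:.1f}% of the data")
```

The three levels come out at 60%, 87.5% (shown as 88% on the slide) and 93%.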
  • 23. Example Videos • 1280x720, weak compression (crf 18): 16 MB • 640x360, strong compression (crf 26): 0.8 MB • Size ratio: 20x
  • 24. Medical Video Analysis
  • 25. Medical Videos • With several hours of videos each day • Manual search in archive becomes impractical! • Automatic content analysis • Filter for relevant scenes in the videos • Anatomical structures • Surgical actions • Instruments • Operation phases • Irregular/Adverse events • … • Content classification (e.g., with neural networks) • Video Retrieval/Interaction systems • Example frame labels on the slide: Suture, Cutting, Injection, Coagulation, and unlabeled frames marked "?"
  • 26. 1000 frames (sampled from 17 min at 1 fps)
  • 27. Content Relevance Filtering / Instrument Recognition • Out-of-patient scenes • Blurry scenes • Border area • Instrument detection/segmentation for better content understanding (e.g., op phase segmentation, following instruments in robot-assisted surgery) • Münzer, B., Schoeffmann, K., & Böszörmenyi, L. (2013, December). Relevance segmentation of laparoscopic videos. In Multimedia (ISM), 2013 IEEE International Symposium on (pp. 84-91). IEEE. • Primus, M. J., Schoeffmann, K., & Böszörmenyi, L. (2015, June). Instrument classification in laparoscopic videos. In Content-Based Multimedia Indexing (CBMI), 2015 13th International Workshop on (pp. 1-6). IEEE.
  • 28. Smoke Detection
  • 29. Smoke Detection • Cauterization in 90% of surgeries • Instruments: laser or HF (100 °C to 1200 °C) • Filtration system (manual) → Automatic smoke detection & removal? (real-time)
  • 30. Automatic Smoke Detection: Achievable Performance with Saturation Peak Analysis (SPA)
  • 31. Automatic Smoke Detection - Performance • Datasets: 20K images (DS A), 10K images (DS A), 4.5K images (DS B) • SPA: Saturation Peak Analysis • Deep learning: GLN RGB (GoogLeNet using RGB images) and GLN SAT (GoogLeNet using saturation channel only)
  • 32. Real-Time Smoke Detection Prototype • Andreas Leibetseder, Manfred J. Primus, Stefan Petscharnig, and Klaus Schoeffmann. "Image-based Smoke Detection in Laparoscopic Videos". Proceedings of Computer Assisted and Robotic Endoscopy and Clinical Image-Based Procedures: 4th International Workshop, CARE 2017, and 6th International Workshop, CLIP 2017, held in conjunction with MICCAI 2017, Quebec City, QC, Canada, September 14, 2017, pp. 70-87.
  • 33. Surgical Action Classification
  • 34. Gynecologic Laparoscopy: Relevant Surgical Actions • Dissection: 58 segments / 35,517 pictures • Coagulation: 212 segments / 84,786 pictures • Cutting cold: 271 segments / 26,388 pictures • Cutting: 106 segments / 92,653 pictures • Hysterectomy: 25 segments / 68,466 pictures • Injection: 52 segments / 52,355 pictures • Suturing: 92 segments / 321,851 pictures • Suction & Irrigation: 173 segments / 73,977 pictures • 1,105 segments (823,000 frames) • 9 h annotated video of 111 interventions • 10-fold cross-validation • Stefan Petscharnig and Klaus Schoeffmann. 2018. Learning Laparoscopic Video Shot Classification for Gynecological Surgery. Multimedia Tools and Applications (MTAP), 77, 7, Springer US, 8061-8079.
  • 35. Gynecologic Laparoscopy: Surgical Actions Classification (R: Recall, P: Precision)
  • 36. Fusing Temporal Information with CNNs • Early fusion • Integrate motion information from consecutive frames • Feed into CNN as additional input channel(s) • Compare two approaches • Block-Based Motion Estimation (BBME): using block matching • Residual Motion (ResM): local motion • Late fusion • Assume the scene boundaries are already known and classify all frames of each segment • Temporal aggregation of single-frame classifications • Majority vote (most frequent class among the frames of a scene) • Average confidence • S. Petscharnig, K. Schöffmann, J. Benois-Pineau, S. Chaabouni and J. Keckstein, "Early and Late Fusion of Temporal Information for Classification of Surgical Actions in Laparoscopic Gynecology," 2018 IEEE 31st International Symposium on Computer-Based Medical Systems (CBMS), Karlstad, 2018, pp. 369-374.
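The two late-fusion rules on slide 36 can be illustrated with a small sketch. It assumes that a single-frame classifier has already produced one list of class confidences per frame of a segment; the function names and the toy numbers are hypothetical and are not taken from the cited paper.

```python
from collections import Counter


def majority_vote(frame_confidences):
    """Label a segment with the class that wins in the most frames."""
    per_frame_labels = [
        max(range(len(conf)), key=conf.__getitem__)
        for conf in frame_confidences
    ]
    return Counter(per_frame_labels).most_common(1)[0][0]


def average_confidence(frame_confidences):
    """Label a segment with the class of highest mean confidence."""
    n_frames = len(frame_confidences)
    n_classes = len(frame_confidences[0])
    means = [
        sum(conf[c] for conf in frame_confidences) / n_frames
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=means.__getitem__)


if __name__ == "__main__":
    # Toy segment: two frames weakly favour class 0, one strongly favours class 1.
    segment = [[0.55, 0.45], [0.55, 0.45], [0.05, 0.95]]
    print("majority vote:", majority_vote(segment))
    print("average confidence:", average_confidence(segment))
```

On this toy segment the two rules disagree: the majority vote picks class 0, while averaging picks class 1 because the single confident frame outweighs the two uncertain ones.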
  • 37. Gynecologic Laparoscopy: Surgical Actions Classification • Petscharnig, S., & Schöffmann, K. (2017). Learning laparoscopic video shot classification for gynecological surgery. Multimedia Tools and Applications, 1-19.
  • 38. Instrument Segmentation/Recognition
  • 39. Instrument Segmentation/Recognition • INPUT: Video recordings of laparoscopic procedures in gynecology • OUTPUT: Position and category of each instrument in the video
  • 40. Surgical Instrument Segmentation/Recognition • Use a region-based CNN for 1. Binary instrument segmentation • Distinguish between instrument instances and background (without recognizing the actual instrument) 2. Multi-class instrument recognition • Labeling different instrument segments • We approach this task by using • Mask R-CNN • Very small dataset (only about 50 examples/instrument; 12 classes) • Several data augmentation techniques • Sabrina Kletz, Klaus Schoeffmann, Jenny Benois-Pineau, and Heinrich Husslein. 2019. Identifying Surgical Instruments in Laparoscopy Using Deep Learning Instance Segmentation. Proceedings of the International Conference on Content-Based Multimedia Indexing (CBMI 2019). IEEE, Los Alamitos, CA, USA, 6 pages.
  • 41. Instrument Segmentation/Recognition: Dataset • 11 different instrument types and one class covering unspecified instruments.
  • 42. Instrument Segmentation/Recognition: Experimental Setup • Settings • Training from scratch and transfer learning from the COCO dataset • 60/20/20 split for training, validation, and test • SGD as optimizer, with learning rates LR = {0.01, 0.001, 0.0001} • Evaluation • Average precision with IoU (Jaccard index) for every instance, with ground truth G and detected region D: $IoU = \frac{|G \cap D|}{|G \cup D|}$ • COCO metrics • Average precision at different IoU thresholds • AP50 and AP50:95
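The IoU criterion from slide 42 can be sketched for binary masks as follows. The tiny masks are hypothetical, and `is_match` only illustrates how an IoU threshold decides whether a detection counts as correct. The code does not compute the full COCO average precision.

```python
def iou(gt_mask, det_mask):
    """Intersection over union (Jaccard index) of two equally sized binary masks."""
    inter = 0
    union = 0
    for g_row, d_row in zip(gt_mask, det_mask):
        for g, d in zip(g_row, d_row):
            inter += 1 if (g and d) else 0
            union += 1 if (g or d) else 0
    return inter / union if union else 0.0

def is_match(iou_value, threshold=0.5):
    """A detection counts as a true positive if its IoU reaches the threshold."""
    return iou_value >= threshold

# IoU thresholds behind AP50:95 (0.50 to 0.95 in steps of 0.05); AP50 uses only 0.50.
COCO_THRESHOLDS = [0.50 + 0.05 * i for i in range(10)]

# Hypothetical 3x3 masks: ground truth G and detected region D.
G = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 0]]
D = [[0, 1, 1],
     [0, 1, 1],
     [0, 0, 0]]

value = iou(G, D)  # 2 shared pixels out of 6 covered pixels
```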
  • 43. Instrument Segmentation/Recognition: Quantitative Results ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 43
  • 44. Quantitative Results of Multi-Class Segmentation ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 44 Classification performance after 50th epoch
  • 45. Instrument Segmentation/Recognition: Qualitative Results ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 45 Sabrina Kletz, Klaus Schoeffmann, Jenny Benois-Pineau, and Heinrich Husslein. 2019. Identifying Surgical Instruments in Laparoscopy Using Deep Learning Instance Segmentation. Proceedings of the International Conference on Content-Based Multimedia Indexing (CBMI 2019). IEEE, Los Alamitos, CA, USA, 6 pages
  • 46. Medical Video Datasets ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 46
  • 47. LapGyn4: Laparoscopic Gynecology Dataset • Surgical Actions (~31K images) • Anatomical Structures (~3K images) • Instrument Count (~22K images) • Suturing on Anatomy (~1K images) • Over 57,000 images • 500+ surgeries • Baseline evaluations: GoogLeNet • 5-fold cross-validation over 100 epochs Andreas Leibetseder, Stefan Petscharnig, Manfred Jürgen Primus, Sabrina Kletz, Bernd Münzer, Klaus Schoeffmann, and Jörg Keckstein. 2018. Lapgyn4: a dataset for 4 automatic content analysis problems in the domain of laparoscopic gynecology. In Proceedings of the 9th ACM Multimedia Systems Conference (MMSys '18). ACM, New York, NY, USA, 357-362.
  • 48. GLENDA: Gynecologic Laparoscopy Endometriosis Dataset • Dataset with annotations of endometriosis • Benign but potentially painful anomaly affecting women of child-bearing age • Dislocation of uterine-like tissue; cicatrization and enclosed bleeding • Serious and painful disease • Often hard to diagnose
  • 49. GLENDA Dataset – 25682 Frames from 400+ Surgeries • Many of the frames show endometriosis cases of varying severity • Pathology: peritoneum, ovary, uterus, deep infiltrating endometriosis (DIE) • No pathology • Region-based and temporal expert annotations • hand-drawn sketches Andreas Leibetseder, Sabrina Kletz, Klaus Schoeffmann, Simon Keckstein, and Jörg Keckstein. 2020. GLENDA: Gynecologic Laparoscopy Endometriosis Dataset. Proceedings of the 26th International Conference on Multimedia Modeling 2020 (MMM2020). Lecture Notes in Computer Science, Springer International Publishing, Cham, 12 pages. To appear. http://www.itec.aau.at/ftp/datasets/GLENDA/
  • 50. GLENDA Dataset – Endometriosis Examples ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 50
  • 51. • Cataract-101 • Videos recorded from 101 cataract surgeries in 2017 and 2018 • Only surgeries without any serious complications • Comes with phase segmentation ground truth (11 phases) Cataract-101 Video Dataset ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 51 Klaus Schoeffmann, Mario Taschwer, Stephanie Sarny, Bernd Münzer, Manfred Jürgen Primus, and Doris Putzgruber. 2018. Cataract-101: video dataset of 101 cataract surgeries. In Proceedings of the 9th ACM Multimedia Systems Conference (MMSys '18). ACM, New York, NY, USA, 421-425. http://www-itec.aau.at/ftp/datasets/ovid/cat-101/
  • 52. Classification of Cataract OP Phases ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 52 Manfred J. Primus, Doris Putzgruber-Adamitsch, Mario Taschwer, Bernd Münzer, Yosuf El-Shabrawi, Laszlo Böszörmenyi, and Klaus Schoeffmann. 2018. Frame-Based Classification of Operation Phases in Cataract Surgery Videos. In Proceedings of the 24th International Conference on Multimedia Modeling 2018 (MMM2018). Lecture Notes in Computer Science, vol 10704, Springer, Cham, 241-253.
  • 53. Cataract Instrument Recognition (Cat-101 Dataset) Typical instruments used in cataract surgery: • Primary incision knife (pik) • Secondary incision knife (sik) • Katena forceps (kf) • Capsulorhexis forceps (cf) • Cannula (c) • 27 gauge cannula (27gc) • Phacoemulsifier handpiece (ph) • Spatula (s) • Irrigation/aspiration handpiece (iah) • Implant injector (ii) [Figure: example frames labeled pik, kf + sik, cf, c, 27gc, ph, s, iah, ii]
  • 54. Cataract Instrument Recognition (Cat-101 Dataset) • Classification study • 26 randomly selected videos • 8000 frames manually annotated for instrument usage (see next slide) • 800 frames for each of the 10 instruments (balanced) • Instrument classification (full frame) and generalization performance • ResNet-50, Inception v3, NASNet Mobile • Multi-label classification, loss = binary cross-entropy, batch size 32, 50 epochs, training from scratch • Tested with different settings (Adam optimizer, SGD, initial learning rate 0.1/0.01/0.001) Natalia Sokolova, Klaus Schoeffmann, Mario Taschwer, Doris Putzgruber-Adamitsch, and Yosuf El-Shabrawi. 2020. Evaluating the Generalization Performance of Instrument Classification in Cataract Surgery Videos. Proceedings of the 26th International Conference on Multimedia Modeling 2020 (MMM2020). Lecture Notes in Computer Science, Springer International Publishing, Cham, 11 pages. To appear.
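Multi-label classification with a binary cross-entropy loss, as listed on slide 54, treats each instrument as an independent yes/no decision. The sketch below uses the ten instrument abbreviations from slide 53; the logits and the 0.5 decision threshold are illustrative assumptions, not values from the study.

```python
import math

INSTRUMENTS = ["pik", "sik", "kf", "cf", "c", "27gc", "ph", "s", "iah", "ii"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binary_cross_entropy(logits, targets, eps=1e-12):
    """Mean binary cross-entropy over all labels of one frame."""
    total = 0.0
    for z, t in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(logits)

def predict(logits, threshold=0.5):
    """Return every instrument whose sigmoid output reaches the threshold."""
    return [name for name, z in zip(INSTRUMENTS, logits) if sigmoid(z) >= threshold]

# A frame showing two instruments at once (kf + sik), encoded as a multi-hot target.
targets = [1 if name in ("kf", "sik") else 0 for name in INSTRUMENTS]

# An untrained model that outputs logit 0 everywhere is maximally unsure:
# every label contributes ln(2) to the loss.
untrained_loss = binary_cross_entropy([0.0] * len(INSTRUMENTS), targets)

# Hypothetical logits of a model that is confident about kf and sik only.
logits = [-4.0, 3.0, 5.0, -3.0, -2.0, -6.0, -5.0, -4.0, -3.0, -2.0]
detected = predict(logits)
```

Unlike a softmax over the ten classes, this setup can report several instruments, or none, for a single frame.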
  • 55. Medical Video Interaction Tools ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 55
  • 56. Past/Current Status [Figure labels: Patient names; File Explorers & Segments to Download; 2014; 2009]
  • 57. Desired Status ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 57 Bernd Münzer, Klaus Schoeffmann and Laszlo Boeszoermenyi. “EndoXplore: A Web-based Video Explorer for Endoscopic Videos“. Proceedings of the IEEE International Symposium on Multimedia 2017 (ISM 2017), Taipei, Taiwan, 2017, pp. 1-2
  • 58. Special Content Visualization ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 58
  • 59. • Clinicians check full video recordings for occurrence of technical errors: • Errors are rated according to standardized schemes (e.g., OSATS, GERT) and surgeons are made aware of them • Studies have shown that this significantly improves surgical quality Surgical Quality Assessment (SQA) ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 59
  • 60. Surgical Quality Assessment (SQA) ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 60
  • 61. Surgical Quality Assessment (SQA) Software ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 61 • Integrating rating features • More efficient video navigation/browsing Marco A. Hudelist, Heinrich Husslein, Bernd Muenzer, Sabrina Kletz and Klaus Schoeffmann. “A Tool to Support Surgical Quality Assessment“, in Proceedings of the Third IEEE International Conference on Multimedia Big Data (BigMM), Laguna Hills, CA, USA, 2017, pp. 238-239.
  • 62. (Diagnostic) Decision Support ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 62
  • 63. Challenges and Requirements ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 63
  • 64. There is a Need for Complete Systems! anomalies are missed detection depends on experience there is a lack of medical personnel for large scale screening programs ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 64
  • 65. There is a Need for Complete Systems! Medical knowledge transfer Automated analysis / detection / classification Feedback / visualization & administrative ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 65
  • 66. • Medical knowledge transfers – need DATA w/Ground Truth • High detection accuracy • Fast and efficient: real-time feedback and large scale • Fit the normal examination procedures • Assist administrative and report writing work • Adhere to ethical, legal, privacy challenges & regulations Key Challenges & Requirements ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 66
  • 67. Gastrointestinal (GI) Case Study (challenges, system support, datasets, diagnostic decision support, ...) ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 67
  • 68. GI Tract Challenges and Potential • Many types of diseases can affect the human gastrointestinal (GI) tract • About 2.8 million new luminal GI cancers (esophagus, stomach, colorectal) are detected yearly • The mortality is about 65% • Screening of the GI tract using different types of endoscopy… • is costly (colonoscopy, according to the NY Times: $1100/patient, $10 billion) • consumes valuable medical personnel time (1-2 hours) • does not scale to large populations • is intrusive to the patient • … • Current technology may enable automatic algorithmic screening and assisted examinations → a truly interdisciplinary activity with a high chance of societal impact
  • 69. Colorectal Cancer [Figure panels: Women, Men] Colorectal cancer is the third most common cause of cancer mortality for both women and men, and early detection is important for survival: the 5-year survival probability rises from a low 10-30% when the cancer is detected in later stages to about 90% when it is detected in early stages. Colonoscopy is not the ideal screening test. On average, 20% of polyps (possible precursors of cancer) are missed or incompletely removed. The risk of getting cancer depends largely on the endoscopist's ability to detect and remove polyps, and there are large inter- and intra-clinician variations. A 1% increase in detection can decrease the risk of cancer by 3%.
  • 70. Automatic Detection of Anomalies Colonoscopy & Gastroscopy ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 70
  • 71. Standard Endoscopy: Live Polyp Detection • A polyp is an abnormal growth of tissue attached to the underlying mucosa • Detection accuracy depends on experience and skills • average miss rates of approx. 20% • large inter- and intra-clinician variations (e.g., a Norwegian study shows variations between 36-65% for polyps) • should reach a high (>85%) accuracy threshold to be acceptable • Current technology may enable automated, algorithm-assisted examinations • Introduce a digital “third eye” (with high accuracy and real-time processing)
  • 72. A complete System ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 72
  • 73. System Overview ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 73
  • 74. Medical Knowledge Transfer (Data Collection) ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 74
  • 75. Available GI Datasets
  Name | Contain | Annotation | Size | Type | Usage
  CVC-ClinicDB | Polyps | GT masks | 12000 images (several versions) | Trad. | ©, by permission
  ETIS-Larib Polyp DB | Polyps, Normal | GT masks | 1500 images | Trad. | ©, by permission
  ASU-Mayo Clinic DB | Polyps, Normal | GT masks | 20 videos | Trad. | ©, by permission
  Colonoscopy Videos DB | Various Lesions | Sorted | 76 videos | Trad. | Academic
  Capsule Endoscopy DB | Various Lesions and Findings | Sorted | 3170 images, 47 videos | VCE | Academic, by request
  GastroAtlas | Various Lesions and Findings | Sorted, Text annotations | 4449 videos | Trad. | Academic
  WEO Atlas | Various Lesions and Findings | Sorted, Text annotations | ? | Trad. | Academic
  GASTROLAB | Various Lesions and Findings | Sorted, Text annotations | ? | Trad. | Academic
  Atlas of GE | Various Lesions | Sorted, Text annotations | 669 images | Trad. | ©, by permission
  KID | Various Lesions | Sorted | 2500 + 47 videos | Trad. | ©, by permission
  ASU-Mayo dataset (polyps): • 20 videos • 10 with polyps, 10 without • 8-64 seconds long • varying resolution • ~18,000 frames/images • image mask of polyp (ground truth) • (currently) restricted use
  CVC (polyps): • CVC-356 – 356 polyp images, 1350 normal frames • CVC-612 – 612 polyp images, 1350 normal frames • CVC-968 – 968 polyp images, 1350 normal frames • CVC-12K – 10025 polyp images, 1929 normal frames • image mask of polyp (ground truth) • (currently) restricted use
  Need more data to transfer the medical knowledge, and thus tools …
  • 76. Why Can’t CS People Do the Annotation!? • Which image is not from the same class? … and it gets worse … • Confusing cats and dogs may not matter, but a misclassification here may have lethal consequences [Figure: six example images labeled Pylorus, Z-line, Z-line, Z-line, Z-line, Z-line]
  • 77. Available time of the clinicians? ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 77
  • 78. Video Annotation Subsystem • Simple and efficient • Web-based • Assisted object tracking "Expert Driven Semi-Supervised Elucidation Tool for Medical Endoscopic Videos" Zeno Albisser, et al. Proceedings of MMSys, Portland, OR, USA, March 2015
  • 79. ClusterTag: Image Clustering and Tagging Tool • For large collections of images • VV / Kvasir dataset • Fully cleaned • Feature extraction mechanisms • Different unsupervised clustering algorithms • Hierarchical image collection visualization • Open source: ClusterTag https://bitbucket.org/mpg_projects/clustertag "ClusterTag: Interactive Visualization, Clustering and Tagging Tool for Big Image Collections" Konstantin Pogorelov, et al. Proceedings of ICMR, Bucharest, Romania, June 2017
  • 80. • Still need even more efficient tools and data of entire procedures 1. “Annotation” during examination 2. Video with bookmarks 3. Annotate bookmarks 4. Automatically annotate neighboring frames using object tracking – and verify Next version of the annotation tool ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 80
  • 81. • Multi-Class Image Dataset for Computer Aided GI Disease Detection • GI endoscopy images • Some images contain the position and configuration of the endoscope (scope guide) • 8 different anomalies and anatomical landmarks • v1: 500 images per class, 6 pre-extracted global features • v2: 1000 images per class • v3: 16 classes, multi-label – to be released soon • Open source: http://datasets.simula.no/kvasir/ The Kvasir Dataset "Kvasir: A Multi-Class Image-Dataset for Computer Aided Gastrointestinal Disease Detection" Konstantin Pogorelov, et al. Proceedings of MMSYS, Taiwan, June 2017 ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 81
  • 82. • Bowel Preparation Quality Video • 21 GI endoscopy videos of colon • Some frames contain the position and configuration of the endoscope (scope guide) • 4 classes showing the four-score Boston Bowel Preparation Scale (BBPS)-defined bowel-preparation quality • 0 - very dirty • … • 3 - very clean • Open source: http://datasets.simula.no/nerthus/ The Nerthus Dataset "Nerthus: A Bowel Preparation Quality Video Dataset" Konstantin Pogorelov, et al. Proceedings of MMSYS, Taiwan, June 2017 ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 82
  • 83. The Kvasir-SEG Dataset • Kvasir does not contain segmentation masks • 1000 pixel-accurate masks of the polyps in Kvasir • Some similar datasets exist (e.g., CVC-356, CVC-612, ETIS-Larib Polyp DB), but they are small, restricted, etc. • http://datasets.simula.no/kvasir-seg/ "Kvasir-SEG: A Segmented Polyp Dataset" Debesh Jha, et al. Proceedings of MMM, Korea, January 2020
  • 84. GI Anomaly Detection System ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 84
  • 85. • Common approaches • Handcrafted features • Convolutional neural network • Generative Adversarial Networks • Easy to extend with new diseases • Easy to extend with new algorithms • Easy to train • Results are explainable? • Disease Localization? • Real-time? Requirements Detection and Automatic Analysis subsystem ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 85
  • 86. State of the Art: Example Detection Systems – 5 Years Ago Polyp-Alert • detects polyps using edges and texture • near real-time feedback during colonoscopy (10 fps) • detected 97.7% (42 of 43) of polyp shots on 53 randomly selected (not per-frame detection) • one of the few end-to-end systems • Wallapak Tavanapong – from the MM community Hundreds of new approaches in recent years, many with good detection results…
  • 87. Performance (accuracy and speed) ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 87
  • 88. ASU Mayo Dataset: Polyp Detection • Mayo dataset (18781 images/frames) • masks for all polyps • GF: • JCD and Tamura • recall 98.50%, precision 93.88%, fps ~300 • CNN: • Modified Inception v3: recall 95.86%, precision 80.78%, fps ~30 • Inception v3 + WEKA: recall 88.87%, precision 89.16%, fps ~30 "EIR - Efficient Computer Aided Diagnosis Framework for Gastrointestinal Endoscopies" Michael Riegler, et al. Proceedings of CBMI, Bucharest, Romania, June 2016
  • 89. • Resource consumption and processing performance of GF: • CNNs (also including GPU support)? • tests so far: ~30 fps (same GPU as above) • but adding layers, more networks, … !?? (newer GPU) • Inception v3: 66 fps, plain CNN: ~40-45 fps • GAN: ~12 fps (for 160x160) ASU Mayo Dataset: Polyp Detection ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 89
  • 90. VV Dataset: Multi-Disease Detection • Vestre Viken (VV) multi-disease dataset (250 images per class) • GF: • recall 90.60% • precision 91.40% • fps ~30 • CNN: • recall 87.20% • precision 87.90% • fps ~30 "Efficient disease detection in gastrointestinal videos - global features versus neural networks" Konstantin Pogorelov, et al. Multimedia Tools and Applications, 2017
  • 91. VV Dataset: Multi-Disease Detection • GF • CNN "Efficient disease detection in gastrointestinal videos - global features versus neural networks" Konstantin Pogorelov, et al. Multimedia Tools and Applications, 2017
  • 92. Kvasir Dataset v1: Multi-Disease Detection • 7 different algorithms • Convolutional neural networks (CNN) (2) – trained from scratch • 3 layers • 6 layers • Transfer learning (1) – retrained Inception v3 • Global features (4) • 2 global features (JCD, Tamura) • 6 global features (JCD, Tamura, Color Layout, Edge Histogram, Auto Color Correlogram and PHOG) • 2 different algorithms (random forest and logistic model tree) • 2 baselines • Random forest with one global feature • Majority class • 2-fold cross-validation
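The majority-class baseline from slide 92 sets the floor that every real classifier has to beat. The sketch below assumes a perfectly balanced 8-class set with 500 images per class, matching the Kvasir v1 figures given elsewhere in the deck, and a single hypothetical 2-fold split; the class names are placeholders.

```python
from collections import Counter

def majority_class_baseline(train_labels, test_labels):
    """Always predict the most frequent training class; return (class, accuracy)."""
    majority = Counter(train_labels).most_common(1)[0][0]
    correct = sum(1 for y in test_labels if y == majority)
    return majority, correct / len(test_labels)

# Placeholder class names; 8 classes with 500 images each.
classes = [f"class_{i}" for i in range(8)]
labels = [c for c in classes for _ in range(500)]

# One 2-fold split that keeps the classes balanced: alternate samples per class.
fold_a = labels[0::2]
fold_b = labels[1::2]

_, acc_ab = majority_class_baseline(fold_a, fold_b)  # train on A, test on B
_, acc_ba = majority_class_baseline(fold_b, fold_a)  # train on B, test on A
mean_accuracy = (acc_ab + acc_ba) / 2
```

On balanced data the baseline can do no better than 1/8 accuracy, so even the weakest listed method has a low bar to clear. On imbalanced data the same baseline can look deceptively strong.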
  • 93. Kvasir Dataset v1: Multi-Disease Detection ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 93
  • 94. Kvasir Dataset v1: Multi-Disease Detection [Figure labels: Dyed and Lifted Polyp, Dyed Resection Margin]
  • 95. Kvasir Dataset v1: Multi-Disease Detection [Figure labels: Cecum, Pylorus]
  • 96. Kvasir Dataset v1 → v2: Multi-Disease Detection • Using the same GF and some new deep features, i.e., • Inception v3 pre-trained on the ImageNet dataset • ResNet50 models • Used different ML classifiers: • random tree (RT) • random forest (RF) • logistic model tree (LMR) – performed best • Uses weights of 1000 pre-defined concepts as features • Top-layer input as feature vector (16384 for Inception v3 and 2048 for ResNet50) [Figure: pretrained model → output or top-layer input weights → WEKA for classification]
  • 97. Kvasir Dataset v2 → v3: Multi-Disease Detection • Multiclass: 16 classes of anomalies and landmarks • Very varying dataset sizes for the different classes • Combination of retrained networks
  • 98. Competitions • MICCAI • … • Medico @ MediaEval • BioMedia @ ACM MM
  Team | Approaches | F1 | FPS
  SCL-UMD | Global-feature and deep-feature extraction, Inception-V3 and VGGNet CNN models, followed by machine-learning-based classification using RT, RF, SVM and LMR classifiers | 0.848 | 1.3
  FAST-NU-DS | Global and local features combined, followed by data size reduction by K-means clustering and then a logistic regression model for the classification | 0.767 | 2.3
  ITEC-AAU | Two different custom Inception-like CNN models | 0.755 | 1.4
  HKBU | A manifold learning method (bidirectional marginal Fisher analysis) learning a compact representation of the data, followed by a multi-class support vector machine for the classification | 0.703 | 2.2
  SIMULA | GF-feature extraction, ResNet50 and Inception-V3 CNN models, followed by machine-learning-based classification using RT, RF and LMR classifiers | 0.826 | 46.0
  Team and Run Name | F1 | MCC | Average Processing Speed
  HCMUS | 0.934236452 | 0.931232439 | Fast enough
  S@M (Simula) | 0.929733339 | 0.928383755 |
  LesCats (Simula) | 0.923640116 | 0.922827982 |
  RUNE (Simula) | 0.855590739 | 0.855590694 |
  UMM-SIM_detection_InResV2-Van_3712 | 0.836795839 | 0.836636058 |
  ParaNoMundo_detection_kt12dense201_3808 | 0.811417906 | 0.814635359 |
  AAUITEC_detection_LSVM-comb2_5293 | 0.866259873 | 0.864100277 |
  SIMULA_detection_run1_5293 | 0.814535427 | 0.811510687 |
  FASTNUCES_detection_ver1_300 | 0.586802677 | 0.602579617 |
  NOAT_detection_1_5293 | 0.391347034 | 0.390125827 |
  HKBU_detection_1_5293 | 0.482962822 | 0.460894862 |
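The competition results on slide 98 are reported as F1 and MCC (Matthews correlation coefficient). The sketch below shows both metrics for the binary case, computed from confusion-matrix counts. The counts are hypothetical, and the challenge tasks are multi-class, so the organizers' exact averaging scheme is not reproduced here.

```python
import math

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; 1 is perfect, 0 is chance level."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical confusion counts for a balanced test set of 200 images.
tp, fp, fn, tn = 90, 10, 10, 90

f1 = f1_score(tp, fp, fn)
m = mcc(tp, fp, fn, tn)
```

F1 ignores true negatives, while MCC uses all four cells of the confusion matrix, which is one reason to report both when class sizes differ.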
  • 99. Compared: • Handcrafted global features (GF-D) using LIRE • Retrained and fine tuned existing DL architectures (RT-D) • Generative adversarial network (GAN) • Combined various datasets captured by different equipment in different hospitals. • With our best working GAN-based detection approach, • we reached detection specificity of ~94% and accuracy of ~90% with only 356 training and 6,000 test samples, slightly better if increasing training size • though a bit too many false positives (a bit low sensitivity) The Next Level: Comparing Handcrafted and Deep Learning Features – Cross Datasets ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 99
  • 100. Detecting Bowel Cleanness Levels ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 100
  • 101. Nerthus Dataset: Bowel Cleanness Level • 7 different algorithms • Convolutional neural networks (CNN) (2) – trained from scratch • 3 layers • 6 layers • Transfer learning (1) – retrained Inception v3 • Global features (4) • 2 global features (JCD, Tamura) • 6 global features (JCD, Tamura, Color Layout, Edge Histogram, Auto Color Correlogram and PHOG) • 2 different algorithms (random forest and logistic model tree) • 2 baselines • Random forest with one global feature • Majority class • 2-fold cross-validation
  • 102. Nerthus Dataset: Bowel Cleanness Level ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 102
  • 103. Nerthus Dataset: Bowel Cleanness Level ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 103
  • 104. Localization / Segmentation ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 104
  • 105. ASU Mayo Dataset: Polyp Localization • Detection first, then process only frames containing polyps • Image enhancements • Detects curve-shaped objects and local maxima • Builds an energy map and selects 4 possible locations • Localization performance: • recall 31.83% • precision 32.07% • ~30 fps • later, with a better GPU: ~75 fps (detection: 300 fps; localization: 100 fps)
  • 106. Kvasir-SEG: Generate the Anomaly Mask • Can we generate the full mask, not only pointing to one of the affected pixels? • Extended and improved the ResUNet architecture and compared it to several other segmentation systems [Figure: architecture diagram with encoding and decoding paths; blocks include Input, Conv2D (3×3), BN, ReLU, Addition, Squeeze & Excite, Atrous Spatial Pyramidal Pooling (ASPP), Upsampling, Attention, Concatenate, and a final Conv2D (1×1) with Sigmoid producing the outputs]
  • 107. • Can we generate the full mask, not only pointing to one of the affected pixels? • Extended and improved the ResUNet architecture and compared to several other segmentation systems: • U-Net • ResUNet • ResUNet-mod • ResUNet++ Kvasir-SEG: Generate the anomaly mask ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 107
  • 108. • Can we generate the full mask, not only pointing to one of the affected pixels? Kvasir-SEG: Generate the anomaly mask Trained and tested on Kvasir-SEG: Original Ground truth UNet ResUNet ResUNet-mod ResUNet++ ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 108
  • 109. • Can we generate the full mask, not only pointing to one of the affected pixels? Kvasir-SEG: Generate the anomaly mask Trained on CVC-612 and tested on Kvasir-SEG: Original Ground truth UNet ResUNet ResUNet-mod ResUNet++ ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 109
  • 110. Preprocessing & Augmentation ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 110
  • 111. Data Challenges: Preprocessing • Too little data!! • Blurry images due to camera motion • Objects too close to the camera • Under- or over-lit scenes • Flares • Artificial objects and natural “contaminations” • Low resolution of capsule endoscopes • …
  • 112. Data Enhancements for CNN Training ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 112
  • 113. Data Enhancements for CNN Training ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 113
  • 114. • Artifacts in the images can influence the algorithm • Understanding of what the algorithm reacts to is crucial Borders and Overlays ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 114
  • 115. Borders and Overlays • Results on Kvasir + CVC-968 • Accuracy improved for almost all models with some preprocessing (F1 gains between 0.7% and 4.4%)
  • 116. GAN Inpainting of Navigation Box • Replacing artifacts in the video/image • Different methods • Clipping • Autoencoders • Context encoder • Context Conditional (CC)-GAN • Some difference, but marginal
  • 117. Automatic Detection of Angiectasia Video Capsule Endoscopy ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 117
  • 118. Video Capsule (PillCam) • Standard colonoscopy: • expensive • does not scale • intrusive
  • 119. Video Capsule (PillCam) • Standard colonoscopy: • expensive • does not scale • intrusive • Wireless video capsule endoscopy: • scales better • less intrusive • possible to combine examinations!? • watch hours of video • less expensive? (detection might lead to an endoscopy)
  • 120. Angiectasia Detection • Angiectasia is a vascular lesion that can cause GI bleeding • Medical specialists reach a detection accuracy of about 69% • Medical systems should reach an 85% threshold to be acceptable in clinical use
  • 121. Angiectasia Detection: Varying Difficulty ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 121
  • 122. Detection Compared • VCE dataset from GIANA 2017 (300 with angiectasia and 300 without) • 10-fold cross-validation • By far, GANs give the best detection: • sensitivity: 98% • specificity: 100% • BUT, sloooooow… • Several approaches are better than the average doctor (69%) • Most of the approaches have a too low detection rate, but are still better than the baseline • Compromise between accuracy and speed
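The evaluation protocol on slide 122 (300 images with angiectasia, 300 without, 10-fold cross-validation) can be sketched as a stratified split, so that every fold keeps the 1:1 class ratio. Whether the original experiments stratified their folds is an assumption here; the function names and the seed are illustrative.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Distribute sample indices over k folds, class by class, after shuffling."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal indices round-robin into the folds
    return folds

def train_test_splits(folds):
    """Yield (train_indices, test_indices), using each fold once as the test set."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 1 = frame with angiectasia, 0 = frame without.
labels = [1] * 300 + [0] * 300
folds = stratified_folds(labels, k=10)
splits = list(train_test_splits(folds))
```

Each of the ten test folds then holds 30 positive and 30 negative frames, and sensitivity and specificity can be averaged over the folds.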
  • 123. Detection Feedback ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 123
  • 124. Detection Subsystem Outputs • Visualize the output of the system to the medical doctors • Simple and easy to understand (most important) • Easy to integrate in hospitals • Live support • Useable for automatic reports, etc. ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 124
  • 125. Real-time Detection Feedback ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 125
  • 126. Real-time Detection Feedback ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 126
  • 127. Increasing Understanding & Assisting Administrative Work ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 127
  • 128. • Understanding: A black box will not work – neither for patients nor clinicians • Reporting: Critical for communication and evidence, but a huge overhead • Inconsistent descriptions of abnormalities • Poor adoption of existing standards • Time consuming (up to 15 minutes or more) • Boring and reduced job satisfaction Understanding and Reporting ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 128
  • 129. Mimir: Reporting of Endoscopies • A way of interpreting the output of a neural network • deeper analysis of why the model produces a given result • class discriminatory visualizations based on selected class and layer. • tools for uploading and managing various models. • Automatic generation of modifiable medical reports • Produced Visualizations • grad-CAM technique • saliency and class activation maps ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 129
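The Grad-CAM technique named on slide 129 weights each feature map of a chosen layer by the spatially averaged gradient of the class score, sums the weighted maps, and keeps only the positive part. The sketch below works on plain nested lists and assumes the activations and gradients were already extracted from a network. The toy values are hypothetical, the final rescaling to [0, 1] is a display convenience, and this is not Mimir's code.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heat map.

    activations, gradients: lists of K feature maps, each an H x W nested list.
    gradients[k][i][j] is d(class score) / d(activations[k][i][j]).
    """
    h = len(activations[0])
    w = len(activations[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for fmap, grad in zip(activations, gradients):
        # Channel weight: global average pooling of the gradient map.
        weight = sum(sum(row) for row in grad) / (h * w)
        for i in range(h):
            for j in range(w):
                cam[i][j] += weight * fmap[i][j]
    # ReLU: keep only regions with a positive influence on the class score.
    cam = [[max(v, 0.0) for v in row] for row in cam]
    # Rescale to [0, 1] for display (skipped if the map is all zeros).
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam

# Two toy 2x2 feature maps: the first supports the class, the second opposes it.
activations = [[[1, 0], [0, 0]],
               [[0, 0], [0, 2]]]
gradients = [[[1, 1], [1, 1]],
             [[-1, -1], [-1, -1]]]

heat_map = grad_cam(activations, gradients)
```

Upsampled to the input resolution and overlaid on the endoscopic frame, such a map shows which image regions pushed the network toward the selected class.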
  • 130. ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 130
  • 131. ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 131
  • 132. Human Reproduction Case Study ACM Multimedia 2019 Tutorial Medical Multimedia Systems and Applications 132
  • 133. Why semen analysis? • Every year, over 45 million couples experience involuntary childlessness, with 40% of cases due in some part to male fertility problems • Semen analysis is one of the first procedures done when determining infertility. • Current methods are either time-consuming and prone to human error, or require expensive laboratory equipment.
  • 134. What is semen quality? • When analyzing semen quality, we often look at multiple visual features of the spermatozoa (sperm) together with information about the patient. • The problem is that we know that patient parameters impact semen quality, but we don’t know how. • This is a true multimodal problem where our expertise could have great impact. • But right now, let’s look at some visual features which are commonly used to determine quality.
  • 135. Sperm count
  • 136. Morphology is used to assess the shape and size of a sperm, focusing on the tail, midpiece and head.
  • 137. Morphology Examples
  • 138. Motility is used to assess the movements of each sperm, which can be grouped into progressive, non-progressive and immotile. [Figure panels: Progressive, Non-progressive, Immotile]
  • 139. Non-Progressive Example
  • 140. Progressive Example
  • 141. Immotile Example
  • 142. VISEM – A Multimodal Video Dataset of Human Spermatozoa • We have a dataset consisting of 85 microscopic videos of human semen, all from different participants. • Each video comes with a preliminary semen analysis done according to WHO standards. • The dataset also contains information about each participant (such as age and BMI), sex hormone levels of the participant, and some parameters extracted from existing sperm analysis machines. • Data is open-source and available at datasets.simula.no/visem. [Figure panels: Low, Mid-low, Mid-high, High]
  • 143. How should we tackle this? • Determining the different visual features requires different approaches. • Morphology is more focused on the spatial features of a frame, while motility requires the temporal dimension. • Not all videos are created equal; some videos are not properly focused or include fluid drift. • We must find clever solutions which incorporate the temporal information of the frames together with the participant-related data.
  • 144. Baseline Approach • No other methods to directly compare against. • To create a baseline, we calculate the ZeroR across our collected dataset. • The metrics used to measure performance are the mean absolute error (MAE) and the root mean squared error (RMSE). [Panels: Morphology Baseline, Motility Baseline]
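The baseline idea on this slide can be sketched in a few lines. This is a minimal illustration, assuming that ZeroR for a regression target means always predicting the mean of the training values; the function names and the numbers are hypothetical and are not taken from VISEM.

```python
import math

def zero_r_predict(train_targets):
    """ZeroR for regression: ignore all inputs and always predict the training mean."""
    return sum(train_targets) / len(train_targets)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical motility percentages, for illustration only.
train = [10.0, 20.0, 30.0]
test = [20.0, 40.0]

baseline = zero_r_predict(train)          # constant prediction
predictions = [baseline] * len(test)      # same value for every test video
baseline_mae = mae(test, predictions)
baseline_rmse = rmse(test, predictions)
```

Any learned model should beat these two numbers to be considered useful; RMSE penalizes large errors more strongly than MAE, which is why both are reported.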
  • 146. 3D Convolutional Neural Networks • Using multiple frames to predict quality with 3D convolutional neural networks.
  • 147. Using an autoencoder to extract temporal features into images. • We use an autoencoder which takes multiple frames to extract temporal features into an RGB image. • Used to predict both morphology and motility.
  • 148. Generate optical flow from the video frames. • Compress the temporal information of sequential frames by using sparse or dense optical flow. • Clearly see the movement of the different sperm across time. • Using synthetic sperm videos to accurately estimate optical flow using GANs.
  • 149. MediaEval 2019 – Medico Multimedia Task • Three different tasks related to analyzing human semen. • Main task is predicting motility and morphology using the VISEM dataset. • 5 submissions using a variety of approaches. • Tune in next week and join the fun in 2020!
  • 150. Open challenges and future directions. • How should we combine the patient data with the visual features to better predict semen quality? • What data to use and how to include it. • Tracking individual sperm cells to find the “best” spermatozoon. • Combining semen analysis with embryo data to better understand the relationship between sperm and successful egg fertilization. • Next step…
  • 151. Embryo analysis and prediction • Analyzing time-lapse videos of embryo development. • Get a better understanding of early embryo development and the health of offspring. • Increase success rate of in vitro fertilization. • Ethical and legal challenges.
  • 152. Predicting Performance of Soccer Players
  • 153. Initial challenge: Logging and Monitoring
  • 154. pmSys: Reporting using a mobile app
  • 155. Coach Web Portal • See team overviews – all, averages – planned load • Send reminders • See individual views • Simple automatic “predictions”
  • 156. Would like to perform proper predictions!
  • 157. Would like to perform proper predictions! • 2008-09, Spanish 1st division: 24,360 player-days absent • 2017-19: • Premier League clubs paid £217m in wages to injured players • Manchester United has an average cost of £870.00 per injury (high salaries) • Champions Manchester City suffered the second fewest injuries • 2018-19: • Manchester City won the PL with a minimum margin: 98 vs 97 points (2nd and 3rd highest ever) • Important to find an optimal training regime, avoid injuries and pick the right players for the game
  • 158. LSTM • Recurrent neural networks – Long Short-Term Memory (LSTM) • handles the complexity of sequences – well-suited to classifying, processing and making predictions based on time series data • Motivating example: Airline passengers
  • 159. Initial Experiments • Dataset • two professional Norwegian teams • data from 2017 and 2018 • 6000 days of reports • many parameters, but our initial experiments used “readiness to train” • LSTM • sequence length of 36 • 30 epochs • batch size of 4 • 4 layers – input, 2 hidden, output • rmsprop optimizer • Model training • training and predicting on the same player • training on all players but one • Aim: detecting positive and negative peaks
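The data preparation implied by this setup can be sketched as follows. This is only an illustration under stated assumptions: it assumes that the value 36 is the length of the input window fed to the LSTM and that the target is the next day's "readiness to train" value. The function names and the toy data are hypothetical; the LSTM itself is not shown.

```python
def make_windows(series, seq_len=36):
    """Turn one player's daily series into (input window, next value) pairs."""
    return [
        (series[i:i + seq_len], series[i + seq_len])
        for i in range(len(series) - seq_len)
    ]

def leave_one_player_out(reports, held_out, seq_len=36):
    """Train on every player except `held_out`; test on the held-out player."""
    train = []
    for player, series in reports.items():
        if player != held_out:
            train.extend(make_windows(series, seq_len))
    test = make_windows(reports[held_out], seq_len)
    return train, test

# Toy stand-in for daily "readiness to train" reports (hypothetical values).
reports = {
    "player_a": list(range(40)),
    "player_b": list(range(50)),
    "player_c": list(range(38)),
}

train_pairs, test_pairs = leave_one_player_out(reports, held_out="player_c")
```

Windows are built per player before being pooled, so no window ever mixes days from two different players. The "training and predicting on the same player" variant would instead split a single player's windows chronologically.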
  • 160. Analyzing player data using LSTM (training on one player) • Needs more data than training on just ONE player… [Figure panels: Team 1, Team 2]
  • 161. Analyzing player data using LSTM (training on all but one player) • Predicting the positive and negative peaks with a precision and recall above 90%
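How peak precision and recall might be computed can be illustrated with a small sketch. The slides do not define a peak or how predicted peaks are matched to actual ones, so this assumes the simplest possible definitions: a peak is a strict local maximum or minimum, and a predicted peak counts as correct only if it falls on exactly the same day. All names and values are hypothetical.

```python
def find_peaks(series):
    """Return (positive, negative) peak indices using strict local extrema."""
    positive, negative = [], []
    for i in range(1, len(series) - 1):
        if series[i] > series[i - 1] and series[i] > series[i + 1]:
            positive.append(i)
        elif series[i] < series[i - 1] and series[i] < series[i + 1]:
            negative.append(i)
    return positive, negative

def precision_recall(predicted_idx, actual_idx):
    """Precision and recall of predicted peak days against actual peak days."""
    predicted, actual = set(predicted_idx), set(actual_idx)
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical reported and predicted readiness values for one player.
actual_series = [5, 6, 8, 6, 3, 4, 7, 5]
predicted_series = [5, 7, 7, 6, 4, 3, 6, 5]

actual_pos, actual_neg = find_peaks(actual_series)
pred_pos, pred_neg = find_peaks(predicted_series)

pos_precision, pos_recall = precision_recall(pred_pos, actual_pos)
neg_precision, neg_recall = precision_recall(pred_neg, actual_neg)
```

In practice a tolerance of a day or so around each peak would probably be more forgiving and more realistic than exact matching; that choice changes the reported numbers considerably.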
  • 162. Consequences and challenges • If we manage to give good predictions… • better training • fewer injuries • better results • Challenges • enough data • detect all corner cases • making the users believe in the predictions
  • 163. So, MEDICAL MULTIMEDIA - all problems solved!!??
  • 164. Open Challenges & Potential • Still improve accuracy and system performance 1. Full system integration 2. Exploiting domain expert knowledge – build datasets 3. Integration of various data, multi-modality – new sensors 4. Explainable AI 5. Patient context information 6. Visualization (AR/VR) 7. Decision support and administrative aids 8. … • The potential for real impact is HUGE!! • screening / diagnosis • personalized medicine • automatic treatment • improving exercise, rehabilitation and sport performance • autonomous and remote surgeries • … "Multimedia and Medicine: Teammates for Better Disease Detection and Survival", Michael Riegler et al., Proceedings ACM MM, Amsterdam, The Netherlands, October 2016
  • 165. Summary • We have given several case-specific examples, but in general, the needs are common • Doctors want to use all the data for general support: analysis, diagnostics, reporting, teaching, statistics, similarity search / comparisons, … • Currently, … • more and more high-quality data is recorded / produced • data analysis methods are promising • multimodal data analysis is not very common • good visualization tools exist, but are not used (e.g., AR, VR, …) • some tools are missing • many (other) areas produce separate (isolated) methods • … • and, we need a complete integrated system! → Our multimedia community is needed
  • 166. ImageCLEF 2020. CLEF, 22-25 September, Thessaloniki, Greece. http://www.imageclef.org/2020
#ImageCLEFlifelog2020# (4th edition) An increasingly wide range of personal devices that allow capturing pictures, videos, and audio clips for every moment of our lives, are becoming available. In this context, the task addresses the problems of lifelogging data retrieval and summarization. Organizers: Duc-Tien Dang-Nguyen (University of Bergen), Luca Piras (Pluribus One & University of Cagliari), Michael Riegler & Pål Halvorsen (Simula Research Laboratory), Minh-Triet Tran (University of Science), Cathal Gurrin (Dublin City University), Mathias Lux (Klagenfurt University).
#ImageCLEFcoral2020# (2nd edition) The increasing use of structure-from-motion photogrammetry for modelling large-scale environments from action cameras has driven the next generation of visualization techniques. The task addresses the problem of automatically segmenting and labeling a collection of images that can be used in combination to create 3D models for the monitoring of coral reefs. Organizers: Jon Chamberlain, Adrian Clark, & Alba García Seco de Herrera (University of Essex), Antonio Campello (Wellcome Trust).
#ImageCLEFmedical2020# (2nd edition) Medical images can be used in a variety of scenarios and this task will combine the most popular medical tasks of ImageCLEF and continue the last year idea of mixing various applications, namely: automatic image captioning and scene understanding, medical visual question answering and decision support on tuberculosis. This allows to explore synergies between tasks. Organizers: Asma Ben Abacha & Dina Demner-Fushman (National Library of Medicine), Sadid A. Hasan, Vivek Datla & Joey Liu (Philips Research Cambridge), Obioma Pelka & Christoph M. Friedrich (University of Applied Sciences and Arts Dortmund), Alba García Seco de Herrera (University of Essex), Yashin Dicente Cid (University of Warwick), Serge Kozlovski, Vitali Liauchuk, & Vassili Kovalev (United Institute of Informatics Problems), Henning Müller (HES-SO).
#ImageCLEFdrawnUI2020# (new) Enabling people to create websites by drawing them on a piece of paper would make the webpage building process more accessible. The task addresses the problem of automatically recognizing hand drawn objects representing website UIs, which will be further translated into automatic website code. Organizers: Paul Brie & Fichou Dimitri (teleportHQ), Mihai Dogariu, Liviu Daniel Ștefan, Mihai Gabriel Constantin, & Bogdan Ionescu (University Politehnica of Bucharest).
Contact on social media: Facebook https://www.facebook.com/ImageClef, Twitter https://twitter.com/imageclef
ImageCLEF 2020 is an evaluation campaign that is being organized as part of the CLEF (Conference and Labs of the Evaluation Forum) labs. The campaign offers several research tasks that welcome participation from teams around the world. The results of the campaign appear in the working notes, published by CEUR (CEUR-WS.org) and are presented in the CLEF conference. Selected contributions among the participants will be invited for publication in the following year in the Springer Lecture Notes in Computer Science (LNCS), together with the annual lab overviews. Target communities involve (but are not limited to): information retrieval (e.g., text, vision, audio, multimedia, social media, sensor data), machine learning, deep learning, data mining, natural language processing, image and video processing; with special emphasis on the challenges of multi-modality, multi-linguality, and interactive search.
Overall coordination: Bogdan Ionescu, University Politehnica of Bucharest, Romania; Henning Müller, HES-SO, Sierre, Switzerland; Renaud Péteri, University of La Rochelle, France.
Important dates (depending on tasks): end of April, 2020: registration closes; beginning of May, 2020: runs due; end of May, 2020: working notes due. #imageclef20 #clef2020
  • 167. The End…

Editor's Notes

  1. 280 references
  2. Objective structured assessment of technical skill (OSATS) The Generic Error Rating Tool: A Novel Approach to ... - NCBI
  3. Well, … a question is whether the automatic system can actually achieve this.
  4. Ethical, legal and privacy challenges: patient privacy; misuse of data; randomization and anonymization; decoupling the analysis from patient names; different laws in different countries. Open source and open data are the way to go! According to a 2014 study from UC Irvine, people are actually willing to share their data.
  5. Costs US - The New York Times. 2013. The $2.7 Trillion Medical Bill.
  6. WHO 2012 CRC, most common cancer for men in Norway, second for women. Estimated age-standardized rates (World) per 100,000 – x out of 100,000 gets the disease http://globocan.iarc.fr/Pages/fact_sheets_cancer.aspx?cancer=colorectal
  7. Miss rates depend on study you cite – have also seen 24%
  8. We got 60,000 cleaned images from our medical partners, but we need ground truth. Z-line: between the esophagus and the stomach. Pylorus: between the stomach and the small bowel (intestine).
  9. Important message here: Keep it simple and give as much support as possible
  10. Implemented in Java/C++. Image operations: OpenCV. Drawing: LWJGL (Lightweight Java Game Library). Database: custom, stored in a binary additive file. Feature extraction using open-source LIRE. Used features: JCD (CEDD + FCTH) and Tamura global features.
  11. In Norse mythology, Kvasir was an extremely wise man who taught and spread knowledge. Took a long time to collect.
  14. Polyp-Alert: the software correctly detected 97.7% (42 of 43) of polyp shots in 53 randomly selected video files of entire colonoscopy procedures. It incorrectly marked only 4.3% of a full-length colonoscopy procedure as showing a polyp when no polyp was present.
  15. JCD is a composition of the CEDD and FCTH features. CEDD (Color and Edge Directivity Descriptor): resistant to changes in lighting; easy to compute. FCTH (Fuzzy Color and Texture Histogram): robust against deformation, noise and smoothing. Tamura: six visual features: coarseness, contrast, directionality (orientation of edges), line-likeness, regularity, roughness.
  16. Konstantin has a Titan X
  17. Inception v3 and ResNet50 are deep CNNs designed for computer vision and image recognition tasks. We use models pre-trained on the subset of the ImageNet image database used in the Large Scale Visual Recognition Challenge, which contains 1000 object categories (concepts). "Concept" = we feed all the images to the models and use the model output (1000 floating-point numbers) as a feature vector for the subsequent machine learning classification approach (the same as for global features). "Features" = we remove the top layer of the models, feed all the images to the models and use the output of the layer before the top one (16384 floating-point values for Inception v3 and 2048 for ResNet50) as a feature vector for the subsequent machine learning classification approach (the same as for global features). "TFL" (transfer learning) = we replace the top layer of the pre-trained Inception v3 model with a fully connected layer with random weights, then retrain only this new top layer with a high learning rate, and then retrain the whole network with a low learning rate to adapt it even better to the input data. Here the new top layer does all the classification instead of traditional machine learning. "CNN" = a custom CNN trained from scratch. No traditional machine learning is used here. MediaEval – if there are several submissions, bold shows the reported (best) results.
  18. The localisation pipeline processes the rectified frames, and multiple pipelines for different abnormalities can run in parallel. For this paper, we have implemented one specific pipeline for our polyp detection system, i.e., a polyp localisation pipeline. The main idea of the localisation algorithm is to use the polyps’ physical shape to find the exact position in the frame. In most cases, the polyps have the shape of a hill located on a relatively flat underlying surface, or the shape of a more or less round rock connected to the underlying surface by a stalk of varying thickness. These polyps can be approximated by an elliptically shaped region that differs from the surrounding tissue. The polyp localisation pipeline implements an image processing algorithm that performs the following steps in sequence: non-local means de-noising [Buades et al. 2011]; 2D Gaussian blur and extraction of 2D image gradient vectors; border extraction by simple threshold binarization of the gradient vectors; removal of isolated binary noise on the borders; estimation of the possible locations of ellipse foci; ellipse size estimation by analyzing the distribution of border pixels; matching of ellipses to the extracted border pixels; selection of a predefined number of non-overlapping local maxima and output of their coordinates as possible polyp locations. For the possible locations of ellipses, we use the coordinates of local maxima in the intensity image created by additively drawing straight lines starting at each border pixel in the direction of its gradient vector. Ellipse matching is then performed using an ellipse fitting function [Fitzgibbon et al. 1996].
  19. More info about the pills… Standard pill – 14% 360 pill – 60%
  21. Dataset from capsule endoscopy data (GIANA 2017 challenge). Training: 300 positive samples + 300 negative samples. Testing: 300 positive samples + 300 negative samples.
  22. CNNs: either extract the features directly (FEA), or classify the images and use the whole range of concepts and their probabilities as input for the classifiers (CON). Metrics: precision (PREC), recall/sensitivity (SENS, true positive rate), specificity (SPEC, true negative rate), accuracy (ACC), F1 score (F1), Matthews correlation coefficient (MCC) and frames per second (FPS).
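The classification metrics listed in this note can all be derived from the four confusion-matrix counts. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not results from the tutorial. FPS is a throughput measure (frames processed divided by elapsed seconds) and is left out here.

```python
import math

def _safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0

def detection_metrics(tp, fp, tn, fn):
    """Compute PREC, SENS, SPEC, ACC, F1 and MCC from confusion-matrix counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "PREC": _safe_div(tp, tp + fp),
        "SENS": _safe_div(tp, tp + fn),              # recall / true positive rate
        "SPEC": _safe_div(tn, tn + fp),              # true negative rate
        "ACC": _safe_div(tp + tn, tp + fp + tn + fn),
        "F1": _safe_div(2 * tp, 2 * tp + fp + fn),
        "MCC": _safe_div(tp * tn - fp * fn, mcc_denominator),
    }

# Hypothetical counts for a polyp / no-polyp frame classifier.
metrics = detection_metrics(tp=90, fp=10, tn=80, fn=20)
```

MCC is worth reporting next to accuracy because it stays informative when polyp frames are much rarer than normal frames, which is the usual situation in full-length procedures.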
  23. First is an extension of the semi-supervised annotation tool
  24. Polyp – ground truth (yellow = polyp). Extracted features for all frames (grey). Detection (purple) = polyp detection. Hit / miss: green = true detection; yellow = false positive; red = false negative (missed polyp).
  40. https://www.researchgate.net/publication/258726802_Economic_costs_estimation_of_soccer_injuries_in_first_and_second_spanish_division_professional_teams https://www.bbc.com/sport/football/45045561 https://www.reuters.com/article/soccer-england-injury/soccer-rising-cost-of-premier-league-injuries-raises-fixture-concerns-study-idUSL8N1Q3493
  41. https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/