Relevance-Based Compression of Cataract Surgery Videos Using
Convolutional Neural Networks
Negin Ghamsarian, Hadi Amirpour, Christian Timmerer, Mario Taschwer, Klaus Schöffmann
1. Introduction
2. Proposed Method
3. Detection Results
4. Compression Results
1. Introduction:
2. Proposed Method:
Scenarios for relevant content:
1. Action
2. Action ∩ (cornea ∪ instrument)
3. Action ∩ cornea (simple)
4. Action ∩ cornea (Luma preference)
5. Action ∩ cornea (removed background)
[Figure: video timeline segmented into alternating Action and Idle phases, labeled Relevant and Irrelevant.]
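The five scenarios can be read as set operations on per-frame detections. The following is a minimal sketch, assuming each frame comes with an action/idle label and pixel sets for the cornea and the instrument. The function name `relevant_region` and the set-based representation are illustrative choices, not taken from the slides. Scenarios 3 to 5 share the same region and differ only in how it is encoded, which is not modeled here.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (row, column)


def relevant_region(
    scenario: int,
    is_action: bool,
    frame: Set[Pixel],
    cornea: Set[Pixel],
    instrument: Set[Pixel],
) -> Set[Pixel]:
    """Return the pixels treated as relevant under one of the five scenarios."""
    if scenario not in (1, 2, 3, 4, 5):
        raise ValueError(f"unknown scenario: {scenario}")
    if not is_action:
        return set()  # every scenario requires an action phase
    if scenario == 1:
        return set(frame)  # the whole frame is relevant
    if scenario == 2:
        return frame & (cornea | instrument)
    return frame & cornea  # scenarios 3, 4, and 5


# Hypothetical 4x4 frame with made-up detections.
frame = {(r, c) for r in range(4) for c in range(4)}
cornea = {(1, 1), (1, 2), (2, 1), (2, 2)}
instrument = {(2, 2), (2, 3)}
print(sorted(relevant_region(2, True, frame, cornea, instrument)))
```

Representing regions as sets keeps the code close to the notation on the slide; a real pipeline would more likely use binary mask arrays.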
Mask R-CNN (Mask Region-based Convolutional Neural Network)
Two sub-problems:
• Object detection, using region proposal networks
• Semantic segmentation, using fully convolutional networks
RPN (Region Proposal Network):
• Input: features from the backbone network
• 9 proposals relative to 9 anchors, and 4 parameters per anchor
• Relative box parameters: 9 × 4 = 36
• Class probabilities: 9 × 2 = 18
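The output sizes of the RPN follow directly from the anchor count, and the relative box parameters only become boxes once they are applied to an anchor. The sketch below illustrates both steps. It assumes the standard Faster R-CNN parameterization (center offsets scaled by anchor size, log-scale width and height), which the slides do not spell out; the function names are hypothetical.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (center_x, center_y, width, height)


def rpn_output_channels(num_anchors: int = 9) -> Tuple[int, int]:
    """Channels per feature-map location: (box parameters, class probabilities)."""
    return num_anchors * 4, num_anchors * 2


def decode_box(anchor: Box, deltas: Box) -> Box:
    """Apply relative parameters (tx, ty, tw, th) to an anchor box.

    Assumption: Faster R-CNN style parameterization.
    """
    xa, ya, wa, ha = anchor
    tx, ty, tw, th = deltas
    return (xa + tx * wa, ya + ty * ha, wa * math.exp(tw), ha * math.exp(th))


print(rpn_output_channels())
# Made-up anchor and deltas, for illustration only.
print(decode_box((50.0, 50.0, 20.0, 10.0), (0.5, -1.0, 0.0, 0.0)))
```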
RPN (Region Proposal Network):
[Figure: class probabilities (Cornea vs. Background, on a scale from 0 to 1) for anchors 1 to 9.]
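Per-anchor class probabilities for cornea and background can be obtained from two raw scores per anchor. The following is a rough sketch, assuming a two-way softmax over each score pair and a fixed probability threshold for keeping an anchor; neither detail is confirmed by the slides, and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def two_class_softmax(cornea_logit: float, background_logit: float) -> Tuple[float, float]:
    """Return (P(cornea), P(background)) for one anchor."""
    m = max(cornea_logit, background_logit)  # subtract the max for numerical stability
    e_c = math.exp(cornea_logit - m)
    e_b = math.exp(background_logit - m)
    total = e_c + e_b
    return e_c / total, e_b / total


def cornea_anchors(logits: List[Tuple[float, float]], threshold: float = 0.5) -> List[int]:
    """Return 1-based indices of anchors whose cornea probability exceeds the threshold."""
    return [
        i
        for i, (c, b) in enumerate(logits, start=1)
        if two_class_softmax(c, b)[0] > threshold
    ]


# Made-up scores for three anchors, for illustration only.
print(cornea_anchors([(2.0, 0.0), (0.0, 2.0), (1.0, 1.0)]))
```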
FCN (Fully Convolutional Network):
• Forward/Inference
• Backward/Learning
3. Detection Results:
[Figure: results for videos V2, V5, V7, and V8; classes: Idle, Action.]
At least 80% IoU for mask segmentation
At least 85% IoU for bounding-box detection
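IoU (intersection over union) is the ratio between the overlap of a prediction and the ground truth and the area they cover together. The sketch below shows one way to compute it for boxes and for binary masks; the corner-based box format and the nested-list mask format are assumptions made for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x2 > x1 and y2 > y1


def box_iou(a: Box, b: Box) -> float:
    """IoU of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mask_iou(a: List[List[int]], b: List[List[int]]) -> float:
    """IoU of two binary masks of equal shape (nonzero means foreground)."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                inter += 1
            if pa or pb:
                union += 1
    return inter / union if union else 0.0


# Made-up boxes and masks, for illustration only.
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
print(mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]]))
```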
4. Compression Results:
[Figure: PSNR values for an exemplary segment using different QP differences: ∆Q = 5, ∆Q = 10, ∆Q = 13, ∆Q = 15.]
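A QP difference can be thought of as an offset added to the quantization parameter of blocks outside the relevant region. The following is a hedged sketch of such a per-block QP map. It assumes that irrelevant blocks receive the higher QP and that QP values are capped at 51, as in H.264/HEVC; the slides do not state the encoder details, and `qp_map` is a hypothetical helper.

```python
from typing import List

MAX_QP = 51  # assumption: H.264/HEVC-style QP range of 0..51


def qp_map(relevant_blocks: List[List[bool]], base_qp: int, delta_q: int) -> List[List[int]]:
    """Assign base_qp to relevant blocks and base_qp + delta_q (capped) to the rest."""
    return [
        [base_qp if relevant else min(base_qp + delta_q, MAX_QP) for relevant in row]
        for row in relevant_blocks
    ]


# Made-up 2x3 block grid with the relevant region in the middle column.
blocks = [[False, True, False], [False, True, False]]
for row in qp_map(blocks, base_qp=27, delta_q=10):
    print(row)
```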
The percentage of bitrate reduction resulting from different scenarios and different QP differences:
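For reference, a percentage of bitrate reduction is conventionally computed relative to the original bitrate. A minimal sketch, with a hypothetical function name and made-up input values that are not results from the slides:

```python
def bitrate_reduction_percent(original_bitrate: float, compressed_bitrate: float) -> float:
    """Percentage by which the compressed bitrate is lower than the original."""
    if original_bitrate <= 0:
        raise ValueError("original_bitrate must be positive")
    return 100.0 * (original_bitrate - compressed_bitrate) / original_bitrate


# Illustrative values only (e.g. in kbit/s).
print(bitrate_reduction_percent(200.0, 150.0))
```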
The PSNR of ROI and output size corresponding to an exemplary video compressed in different
scenarios:
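The PSNR of the ROI restricts the usual PSNR computation to pixels inside the region of interest. The sketch below assumes 8-bit single-channel frames stored as nested lists and a binary ROI mask of the same shape; how the authors computed the metric is not stated on the slides, and `roi_psnr` is a hypothetical name.

```python
import math
from typing import List


def roi_psnr(
    original: List[List[int]],
    compressed: List[List[int]],
    roi: List[List[int]],
    peak: float = 255.0,
) -> float:
    """PSNR in dB over the pixels where the ROI mask is nonzero."""
    squared_error = 0.0
    count = 0
    for row_o, row_c, row_m in zip(original, compressed, roi):
        for o, c, m in zip(row_o, row_c, row_m):
            if m:
                squared_error += (o - c) ** 2
                count += 1
    if count == 0:
        raise ValueError("ROI mask is empty")
    mse = squared_error / count
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Made-up 2x2 frames; only the top-left pixel lies inside the ROI.
original = [[100, 100], [100, 100]]
compressed = [[110, 100], [100, 90]]
roi = [[1, 0], [0, 0]]
print(roi_psnr(original, compressed, roi))
```

Errors outside the mask, such as the bottom-right pixel in the example, do not affect the result, which is what makes the metric suitable for judging quality in the relevant region alone.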
[Figures: side-by-side comparisons of the original frame with Scenario II, Scenario IV, and Scenario V.]
Conclusion:
Main Contributions
1. Relevance-based compression approach using surgical domain knowledge
2. Storage space gain of up to 68%
3. A set of novel scenarios for compressing cataract surgery videos, matched to the needs of the target audience
4. Generalizable to any kind of surgical video where domain knowledge is required
Questions?
negin@itec.aau.at
hadi@itec.aau.at