SeedNet : Automatic Seed Generation with Deep
Reinforcement Learning for Robust Interactive
Segmentation
Gwangmo Song
Dept. of ECE, Seoul National University
Goal : Interactive Segmentation
 Segmentation
 Cutting out a desired object from the image
Goal : Interactive Segmentation
 Interactive Segmentation
 Cutting out a desired object from the image using user input
Camel!
Why Interactive?
 Ambiguity of human intention
Why Interactive?
 User intention
Why Interactive?
 Error correction
Previous Approach
 Early approaches : Graph Cut, GrabCut, Random Walk, Random
Walk with Restart, Geodesic [1,2,3,4,5]
 Using MRF optimization to extract region of interest
 Determine the region by comparing the correspondence between labeled and
unlabeled pixels
[1] Y. Y. Boykov et al. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In ICCV, 2001.
[2] C. Rother et al. Grabcut: Interactive foreground extraction using iterated graph cuts. In ToG. 2004.
[3] L. Grady. Random walks for image segmentation. In PAMI. 2006.
[4] T. H. Kim et al. Generative image segmentation using random walks with restart. In ECCV. 2008.
[5] V. Gulshan et al. Geodesic star convexity for interactive image segmentation. In CVPR. IEEE, 2010.
Previous Approach
 Recent approaches : Deep Learning [1,2] …
 Fine-tuning semantic segmentation network (FCN [3]) for interactive
segmentation task
 End-to-end training by concatenating seed point and image
 Significant performance improvements compared to classical
techniques
[1] N. Xu et al. Deep interactive object selection. In CVPR. 2016.
[2] J. Hao Liew et al. Regional interactive image segmentation networks. In ICCV. 2017.
[3] J. Long et al. Fully convolutional networks for semantic segmentation. In CVPR. 2015
Challenges
 Minimizing user interaction
 Main metrics of segmentation : accuracy (%) and number of clicks (#)
 Reducing the input burden on the user
 Deep learning frameworks still require a large number of clicks
Segmentation       Pascal   Grabcut   Berkeley   MSCOCO (seen)   MSCOCO (unseen)
Graph Cut [1]      15.06    11.10     14.33      18.67           17.80
Random Walk [2]    11.37    12.30     14.02      13.91           11.53
Geodesic [3]       11.73     8.38     12.57      14.37           12.45
iFCN [4]            6.88     6.04      8.65       8.31            7.82
RIS-Net [5]         5.12     5.00      6.03       5.98            6.44
[1] Y. Y. Boykov et al. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In ICCV, 2001.
[2] L. Grady. Random walks for image segmentation. In PAMI. 2006.
[3] V. Gulshan et al. Geodesic star convexity for interactive image segmentation. In CVPR. IEEE, 2010.
[4] N. Xu et al. Deep interactive object selection. In CVPR. 2016.
[5] J. Hao Liew et al. Regional interactive image segmentation networks. In ICCV. 2017.
Motivation
 Expansion of seed information
 Assume minimal user input (e.g., a single seed point)
 Automatically expand the seed information and perform segmentation
(Figure: Initial Seed → RWR result vs. Expanded Seed → RWR result)
Motivation
 Problems of previous works
 Existing works are only interested in segmentation accuracy given static user input
 Even the SOTA gives unsatisfactory results with less user input
 Any model can give good results with sufficient user input
Motivation
 Automatic seed control
 Generate enough seed information from sparse input
 An additional automatic seed control module
 Combine it with segmentation to configure the entire system
 How to design the seed control module?
(Diagram: Seed Control Module → Segmentation Module)
Our Previous Work
 Interactive Segmentation with Seed Expansion
Our Previous Work
 Interactive Segmentation with Seed Expansion
 Input
• Image (RGB)
• Scribble (1 Fore, 3 Back)
• Superpixel
• Saliency map
Our Previous Work
 Interactive Segmentation with Seed Expansion
 Seed Expansion (two stages)
• Superpixel-based
• Pixel level to superpixel level
• RWR & thresholding
• Add pixels with distinct properties to the seed
• Remove ambiguous pixels through a seed-erase step
Our Previous Work
 Interactive Segmentation with Seed Expansion
 Image Pyramid
• Speed up
• Matrix inversion
(Coarsest level)
• Power iteration
(Refinement)
Our Previous Work
 Interactive Segmentation with Seed Expansion
 Refinement
• Saliency-based
Our Previous Work
 Results
 Limitations
 Heuristic approach
 Insufficient accuracy

Method    F-score (%)
RWR [1]   74.48
Ours      80.32

[1] T. H. Kim et al. Generative image segmentation using random walks with restart. In ECCV. 2008.
(Figure panels: Image & Scribble, RWR results, Expanded Seed, Our results)
Our Approach
 Automatic seed generation system with deep reinforcement
learning
 Create additional seed using reinforcement learning
 Simulate the human process
Goal
 Automatically generates the sequence of artificial user input
(Figure: Initial Seed, Steps 1-10, and the GT mask)
Why RL?
 Learning framework
 Supervised : label
 Unsupervised : no label
 Reinforcement : reward
 RL
 Decision making
 In a state, the agent takes an action and receives a reward
(Diagram: Supervised / Unsupervised / Reinforcement Learning)
Why RL?
 Optimal seed point
 Differs by person
 Depends on the current and future state
(Figure: different seed choices by Persons 1-4)
Why RL?
 RL properties
 Use a reward instead of a label that is hard to define
 Instant reward + future reward
 Well suited to sequential decision-making problems (e.g., Atari games)
 Interpret the seed generation problem as a seed-locating game
RL Framework
 Markov Decision Process
 State
 Action
 Reward
(Diagram: Agent ↔ Environment, exchanging State, Action, and Reward)
RL Framework
 Markov Decision Process
 State : Input image, segmentation mask
 Action : New seed point location
 Reward : IoU(Intersection over Union)-based function
(Diagram: Seed Generator issues a new seed location to the Segmentation Module, which returns an image/mask and an IoU score)
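The MDP above can be sketched as a minimal environment loop. This is a sketch with hypothetical class and function names; the segmentation module (e.g., Random Walk) is passed in as a black box, and the reward here is the plain IoU (the refined rewards come later).

```python
import numpy as np

class SeedEnv:
    """Minimal sketch of the MDP on this slide (hypothetical names).
    State = (image, current mask); action = a new labeled seed point;
    reward = IoU of the resulting mask against the ground truth."""

    def __init__(self, image, gt_mask, segment_fn, max_steps=10):
        self.image = image            # part of the state
        self.gt = gt_mask             # ground truth, used only for the reward
        self.segment = segment_fn     # plug-in module: (image, seeds) -> mask
        self.max_steps = max_steps
        self.seeds, self.t = [], 0
        self.mask = np.zeros_like(gt_mask)

    def state(self):
        # State = input image + current segmentation mask
        return self.image, self.mask

    def step(self, action):
        # Action = a new seed point (y, x, label); label 1 = fore, 0 = back
        self.seeds.append(action)
        self.mask = self.segment(self.image, self.seeds)
        self.t += 1
        reward = self._iou(self.mask, self.gt)   # IoU-based reward
        done = self.t >= self.max_steps
        return self.state(), reward, done

    @staticmethod
    def _iou(m, g):
        inter = np.logical_and(m, g).sum()
        union = np.logical_or(m, g).sum()
        return inter / union if union else 0.0
```

With a perfect segmentation module the reward would reach 1.0 immediately; in practice the agent's seeds drive the module toward the GT over the episode.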
Proposed Model
 Propose seed generation network
 Combine with existing segmentation module to configure the entire system
(System diagram: IMAGE + SEED → SEGMENTATION → OBSERVATION UPDATE; DQN → SEED GENERATION → SEED UPDATE; SEGMENTATION + GT MASK → REWARD COMPUTATION → REWARD)
State
 Contain enough information to allow the agent to make the best choice
Action
 Position a new seed point on a 2D grid
 20×20 grid with 2 labels (foreground, background) = 800 actions
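This action space can be sketched as a simple index decoder. `decode_action` is a hypothetical helper; the image size and the choice of placing the seed at the grid-cell center are illustrative assumptions.

```python
def decode_action(a, grid=20, height=400, width=400):
    """Map a DQN action index a in [0, 2*grid*grid) to a labeled seed point.
    The (height, width) of the image is an assumption for illustration."""
    assert 0 <= a < 2 * grid * grid
    label = a // (grid * grid)          # 0 = foreground, 1 = background
    cell = a % (grid * grid)
    row, col = divmod(cell, grid)
    # place the seed at the center of the selected grid cell
    y = int((row + 0.5) * height / grid)
    x = int((col + 0.5) * width / grid)
    return label, y, x
```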
Reward
 Criterion for updating the network given the action and state
 Based on the IoU score
State
 Update the observation for the next step
 Repeat the process
Reward
 Basic IoU reward
 Mask M, ground truth G
   R_IoU = IoU(M, G)
 Change trend of IoU
 Compare with the previous step's result
   R_diff = IoU(M, G) − IoU(M_prev, G)
 Weighted IoU
 Put more weight on high IoU values
   R_exp = (exp(k · IoU(M, G)) − 1) / (exp(k) − 1)
(Plot: R_exp as a convex, increasing function of IoU on [0, 1])
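The three reward variants follow directly from the formulas. A minimal NumPy sketch (the default `k` is an assumed value; the constant used in the paper may differ):

```python
import numpy as np

def iou(m, g):
    """Intersection over Union of two boolean masks."""
    inter = np.logical_and(m, g).sum()
    union = np.logical_or(m, g).sum()
    return inter / union if union else 0.0

def r_iou(m, g):
    # Basic IoU reward: R_IoU = IoU(M, G)
    return iou(m, g)

def r_diff(m, m_prev, g):
    # Change trend: R_diff = IoU(M, G) - IoU(M_prev, G)
    return iou(m, g) - iou(m_prev, g)

def r_exp(m, g, k=5.0):
    # Weighted IoU: exponential weighting rewards near-perfect masks
    # much more strongly; maps IoU in [0, 1] to R_exp in [0, 1]
    return (np.exp(k * iou(m, g)) - 1) / (np.exp(k) - 1)
```

By construction R_exp(0) = 0 and R_exp(1) = 1, with most of the reward concentrated near IoU = 1.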
Reward
 Additional seed information
 Determine whether the seed is located correctly
Reward
 Proposed reward
 Divide the GT mask into 4 regions: Strong Foreground (SF), Weak Foreground (WF), Weak Background (WB), Strong Background (SB)
   R_si = R_exp        if F_seed ∈ SF or B_seed ∈ SB
          R_exp − 1    if F_seed ∈ WF or B_seed ∈ WB
          −1           otherwise
(Figure: GT mask partitioned into the four regions; plots of the three reward branches as functions of IoU)
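The region-aware reward can be sketched as below, assuming the four GT regions are given as boolean masks. How SF/WF/WB/SB are carved out of the GT (e.g., by distance to the object boundary) is not specified on this slide, so the `regions` dict is a hypothetical interface.

```python
import numpy as np

def r_si(seed_yx, label, regions, r_exp_value):
    """Region-aware seed reward (sketch). `regions` maps 'SF', 'WF',
    'WB', 'SB' to boolean masks partitioning the image; `label` is
    1 for a foreground seed, 0 for background."""
    y, x = seed_yx
    fore = (label == 1)
    if (fore and regions['SF'][y, x]) or (not fore and regions['SB'][y, x]):
        return r_exp_value            # strong region: full exponential reward
    if (fore and regions['WF'][y, x]) or (not fore and regions['WB'][y, x]):
        return r_exp_value - 1.0      # weak region: shifted (penalised) reward
    return -1.0                       # seed on the wrong side: fixed penalty
```

A foreground seed thus earns R_exp in SF, R_exp − 1 in WF, and −1 anywhere in the background (and symmetrically for background seeds).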
DQN
 Train seed generation agent with RL
 Deep Q-Network
 Policy Gradient
 Actor-Critic
 Deep Q-Network
 Train the action-value function Q(s, a), the expected reward
 Approximate Q(s, a) with a deep neural network
 Loss function
   Loss(θ) = 𝔼[(r + γ max_a′ Q(s′, a′; θ) − Q(s, a; θ))²]
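For one minibatch, the loss above can be sketched in NumPy. Note this single-θ form matches the formula as written; in practice Q(s′, a′) comes from a frozen target network, and Double/Dueling DQN modify it further.

```python
import numpy as np

def dqn_loss(q, q_next, actions, rewards, gamma=0.99):
    """Squared TD-error over a minibatch (sketch).
    q, q_next : (batch, n_actions) Q-values at s and s' from the network
    actions   : (batch,) indices of the actions taken
    rewards   : (batch,) immediate rewards"""
    q_sa = q[np.arange(len(actions)), actions]        # Q(s, a; θ)
    target = rewards + gamma * q_next.max(axis=1)     # r + γ max_a' Q(s', a'; θ)
    return np.mean((target - q_sa) ** 2)
```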
DQN
 DQN architecture
 Basic architecture from [1]
 Double DQN [2]
 Dueling DQN [3]
[1] V. Mnih et al. Human-level control through deep reinforcement learning. In Nature. 2015.
[2] H. Van Hasselt et al. Deep reinforcement learning with double q-learning. In AAAI, 2016.
[3] Z. Wang et al. Dueling network architectures for deep reinforcement learning. In ICML, 2016.
Experiments
 MSRA10K dataset
 Saliency task
 10,000 images (9,000 train, 1,000 test)
 Initial seed generated from GT mask
 1 foreground, 1 background seed point
 Our model
 Segmentation module : Random Walk [1]
 Stop after 10 seed generations
[1] L. Grady. Random walks for image segmentation. In PAMI. 2006.
MSRA10K
 Quantitative Results
 5 random seed sets
 Intersection-over-Union score comparison
Method Set 1 Set 2 Set 3 Set 4 Set 5 Mean
RW [1] 39.59 39.65 39.71 39.77 39.89 39.72
Ours 60.70 60.12 61.28 61.87 60.90 60.97
[1] L. Grady. Random walks for image segmentation. In PAMI. 2006.
MSRA10K
(Qualitative results: Image, GT, RW result [1] from the initial seed, and our results at Steps 1, 2, 3, and 10)
[1] L. Grady. Random walks for image segmentation. In PAMI. 2006.
MSRA10K
 Comparison with supervised methods
 Same conditions (network configuration, trained from scratch)
 Directly output a segmentation mask
 Input type : image only (FCN), image with seed (iFCN)
Method FCN [1] iFCN [2] Ours
IoU 37.20 44.60 60.97
[1] J. Long et al. Fully convolutional networks for semantic segmentation. In CVPR. 2015
[2] N. Xu et al. Deep interactive object selection. In CVPR. 2016.
Ablation Study
 Reward function
Method   RW [1]   R_IoU   R_diff   R_si
IoU      39.72    42.55   44.45    60.97
[1] L. Grady. Random walks for image segmentation. In PAMI. 2006.
Ablation Study
 Other segmentation module
Method   GC [1]   Ours (GC)   GSC [2]   Ours (GSC)   RWR [3]   Ours (RWR)
IoU      38.44    52.10       58.34     63.48        35.71     53.04

(Qualitative results: Image, GT, Initial, and Ours for GC, GSC, and RWR)
[1] C. Rother et al. Grabcut: Interactive foreground extraction using iterated graph cuts. In ToG. 2004.
[2] V. Gulshan et al. Geodesic star convexity for interactive image segmentation. In CVPR. IEEE, 2010.
[3] T. H. Kim et al. Generative image segmentation using random walks with restart. In ECCV. 2008.
Unseen Dataset
 Train on MSRA10K
 Test on
 GSCSEQ, Weizmann Single Object, Weizmann Horse, iCoseg, IG02
Dataset   GSCSEQ          Weizmann Single   Weizmann Horse   iCoseg          IG02
Method    RW     Ours     RW      Ours      RW      Ours     RW     Ours     RW     Ours
IoU       27.40  38.93    38.29   55.08     25.63   48.33    36.02  43.87    23.87  28.48
Unseen Dataset
(Qualitative results: Image, GT, RW result from the initial seed, and our results at Steps 1, 2, 3, and 10)
Unseen Dataset
(Qualitative results: Image, GT, RW result from the initial seed, and our results at Steps 1, 2, 3, and 10)
Summary
 Proposed a novel seed generation module with deep reinforcement learning
 Improved segmentation accuracy using the automatic seed generation system
 Validated on various segmentation modules and datasets