SeedNet : Automatic Seed Generation with Deep
Reinforcement Learning for Robust Interactive
Segmentation
Gwangmo Song
Dept. of ECE, Seoul National University
Goal : Interactive Segmentation
 Segmentation
 Cutting out a desired object from the image
Goal : Interactive Segmentation
 Interactive Segmentation
 Cutting out a desired object from the image using user input
Camel!
Why Interactive?
 Ambiguity of human intention
Why Interactive?
 User intention
Why Interactive?
 Error correction
Previous Approach
 Early approaches : Graph Cut, GrabCut, Random Walk, Random
Walk with Restart, Geodesic [1,2,3,4,5]
 Using MRF optimization to extract region of interest
 Determine the region by comparing the correspondence between labeled and
unlabeled pixels
[1] Y. Y. Boykov et al. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In ICCV, 2001.
[2] C. Rother et al. Grabcut: Interactive foreground extraction using iterated graph cuts. In ToG. 2004.
[3] L. Grady. Random walks for image segmentation. In PAMI. 2006.
[4] T. H. Kim et al. Generative image segmentation using random walks with restart. In ECCV. 2008.
[5] V. Gulshan et al. Geodesic star convexity for interactive image segmentation. In CVPR. IEEE, 2010.
Previous Approach
 Recent approaches : Deep Learning [1,2] …
 Fine-tuning semantic segmentation network (FCN [3]) for interactive
segmentation task
 End-to-end training by concatenating seed point and image
 Significant performance improvements compared to classical
techniques
[1] N. Xu et al. Deep interactive object selection. In CVPR. 2016.
[2] J. Hao Liew et al. Regional interactive image segmentation networks. In ICCV. 2017.
[3] J. Long et al. Fully convolutional networks for semantic segmentation. In CVPR. 2015
Challenges
 Minimizing user interaction
 Main metrics of segmentation : accuracy (%) and number of clicks (#)
 Reducing the input burden on the user
 Deep learning frameworks still require a large number of clicks
Segmentation       Pascal   Grabcut   Berkeley   MSCOCO (seen)   MSCOCO (unseen)
Graph Cut [1]      15.06    11.10     14.33      18.67           17.80
Random Walk [2]    11.37    12.30     14.02      13.91           11.53
Geodesic [3]       11.73     8.38     12.57      14.37           12.45
iFCN [4]            6.88     6.04      8.65       8.31            7.82
RIS-Net [5]         5.12     5.00      6.03       5.98            6.44
[1] Y. Y. Boykov et al. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In ICCV, 2001.
[2] L. Grady. Random walks for image segmentation. In PAMI. 2006.
[3] V. Gulshan et al. Geodesic star convexity for interactive image segmentation. In CVPR. IEEE, 2010.
[4] N. Xu et al. Deep interactive object selection. In CVPR. 2016.
[5] J. Hao Liew et al. Regional interactive image segmentation networks. In ICCV. 2017.
Motivation
 Expansion of seed information
 Assume minimal user input (e.g., a single seed point)
 Automatically expand the seed information and perform segmentation
(Figure: Initial Seed → RWR result vs. Expanded Seed → RWR result)
Motivation
 Problems of previous works
 Existing works are only interested in segmentation accuracy given static user input
 Even the SOTA gives unsatisfactory results with less user input
 Any model can give good results with sufficient user input
Motivation
 Automatic seed control
 Generate enough seed information from sparse input
 An additional automatic seed control module
 Combine it with segmentation to configure the entire system
 How to design the seed control module?
(Diagram: Seed Control Module → Segmentation Module)
Our Previous Work
 Interactive Segmentation with Seed Expansion
Our Previous Work
 Interactive Segmentation with Seed Expansion
 Input
• Image (RGB)
• Scribble (1 Fore, 3 Back)
• Superpixel
• Saliency map
Our Previous Work
 Interactive Segmentation with Seed Expansion
 Seed Expansion (two stages)
• Superpixel-based
• Pixel level to superpixel level
• RWR & thresholding
• Add pixels with distinct properties to the seed
• Remove ambiguous pixels through a seed-erase step
Our Previous Work
 Interactive Segmentation with Seed Expansion
 Image Pyramid
• Speed up
• Matrix inversion
(Coarsest level)
• Power iteration
(Refinement)
Our Previous Work
 Interactive Segmentation with Seed Expansion
 Refinement
• Saliency-based
Our Previous Work
 Results
 Limitations
 Heuristic approach
 Insufficient accuracy

Method    F-score (%)
RWR [1]   74.48
Ours      80.32

[1] T. H. Kim et al. Generative image segmentation using random walks with restart. In ECCV. 2008.
(Figure panels: Image & Scribble, RWR results, Expanded Seed, Our results)
Our Approach
 Automatic seed generation system with deep reinforcement
learning
 Create additional seed using reinforcement learning
 Simulate the human process
Goal
 Automatically generates the sequence of artificial user input
(Figure: Initial Seed, Steps 1-10, and the GT mask)
Why RL?
 Learning framework
 Supervised : label
 Unsupervised : no label
 Reinforcement : reward
 RL
 Decision making
 In a state, the agent takes an action and receives a reward
(Diagram: Supervised / Unsupervised / Reinforcement Learning)
Why RL?
 Optimal seed point
 Differs by person
 Depends on the current and future state
(Figure: different seed choices by Persons 1-4)
Why RL?
 RL properties
 Use a reward instead of a label that is hard to define
 Instant reward + future reward
 Well suited to sequential decision-making problems (e.g., Atari games)
 Interpret the seed generation problem as a seed-locating game
RL Framework
 Markov Decision Process
 State
 Action
 Reward
(Diagram: Agent ↔ Environment, exchanging State, Action, and Reward)
RL Framework
 Markov Decision Process
 State : Input image, segmentation mask
 Action : New seed point location
 Reward : IoU(Intersection over Union)-based function
(Diagram: Seed Generator issues a new seed location to the Segmentation Module, which returns an image/mask and an IoU score)
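The MDP above can be sketched as a minimal environment loop. This is a sketch with hypothetical class and function names; the segmentation module (e.g., Random Walk) is passed in as a black box, and the reward here is the plain IoU (the refined rewards come later).

```python
import numpy as np

class SeedEnv:
    """Minimal sketch of the MDP on this slide (hypothetical names).
    State = (image, current mask); action = a new labeled seed point;
    reward = IoU of the resulting mask against the ground truth."""

    def __init__(self, image, gt_mask, segment_fn, max_steps=10):
        self.image = image            # part of the state
        self.gt = gt_mask             # ground truth, used only for the reward
        self.segment = segment_fn     # plug-in module: (image, seeds) -> mask
        self.max_steps = max_steps
        self.seeds, self.t = [], 0
        self.mask = np.zeros_like(gt_mask)

    def state(self):
        # State = input image + current segmentation mask
        return self.image, self.mask

    def step(self, action):
        # Action = a new seed point (y, x, label); label 1 = fore, 0 = back
        self.seeds.append(action)
        self.mask = self.segment(self.image, self.seeds)
        self.t += 1
        reward = self._iou(self.mask, self.gt)   # IoU-based reward
        done = self.t >= self.max_steps
        return self.state(), reward, done

    @staticmethod
    def _iou(m, g):
        inter = np.logical_and(m, g).sum()
        union = np.logical_or(m, g).sum()
        return inter / union if union else 0.0
```

With a perfect segmentation module the reward would reach 1.0 immediately; in practice the agent's seeds drive the module toward the GT over the episode.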
Proposed Model
 Propose seed generation network
 Combine with existing segmentation module to configure the entire system
(System diagram: IMAGE + SEED → SEGMENTATION → OBSERVATION UPDATE; DQN → SEED GENERATION → SEED UPDATE; SEGMENTATION + GT MASK → REWARD COMPUTATION → REWARD)
State
 Contain enough information to allow the agent to make the best choice
Action
 Position a new seed point on a 2D grid
 20×20 grid with 2 labels (foreground, background) = 800 actions
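This action space can be sketched as a simple index decoder. `decode_action` is a hypothetical helper; the image size and the choice of placing the seed at the grid-cell center are illustrative assumptions.

```python
def decode_action(a, grid=20, height=400, width=400):
    """Map a DQN action index a in [0, 2*grid*grid) to a labeled seed point.
    The (height, width) of the image is an assumption for illustration."""
    assert 0 <= a < 2 * grid * grid
    label = a // (grid * grid)          # 0 = foreground, 1 = background
    cell = a % (grid * grid)
    row, col = divmod(cell, grid)
    # place the seed at the center of the selected grid cell
    y = int((row + 0.5) * height / grid)
    x = int((col + 0.5) * width / grid)
    return label, y, x
```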
Reward
 Criterion for updating the network given the action and state
 Based on the IoU score
State
 Update the observation for the next step
 Repeat the process
Reward
 Basic IoU reward
 Mask M, ground truth G
   R_IoU = IoU(M, G)
 Change trend of IoU
 Compare with the previous step's result
   R_diff = IoU(M, G) − IoU(M_prev, G)
 Weighted IoU
 Put more weight on high IoU values
   R_exp = (exp(k · IoU(M, G)) − 1) / (exp(k) − 1)
(Plot: R_exp as a convex, increasing function of IoU on [0, 1])
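The three reward variants follow directly from the formulas. A minimal NumPy sketch (the default `k` is an assumed value; the constant used in the paper may differ):

```python
import numpy as np

def iou(m, g):
    """Intersection over Union of two boolean masks."""
    inter = np.logical_and(m, g).sum()
    union = np.logical_or(m, g).sum()
    return inter / union if union else 0.0

def r_iou(m, g):
    # Basic IoU reward: R_IoU = IoU(M, G)
    return iou(m, g)

def r_diff(m, m_prev, g):
    # Change trend: R_diff = IoU(M, G) - IoU(M_prev, G)
    return iou(m, g) - iou(m_prev, g)

def r_exp(m, g, k=5.0):
    # Weighted IoU: exponential weighting rewards near-perfect masks
    # much more strongly; maps IoU in [0, 1] to R_exp in [0, 1]
    return (np.exp(k * iou(m, g)) - 1) / (np.exp(k) - 1)
```

By construction R_exp(0) = 0 and R_exp(1) = 1, with most of the reward concentrated near IoU = 1.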
Reward
 Additional seed information
 Determine whether the seed is located correctly
Reward
 Proposed reward
 Divide the GT mask into 4 regions: Strong Foreground (SF), Weak Foreground (WF), Weak Background (WB), Strong Background (SB)
   R_si = R_exp        if F_seed ∈ SF or B_seed ∈ SB
          R_exp − 1    if F_seed ∈ WF or B_seed ∈ WB
          −1           otherwise
(Figure: GT mask partitioned into the four regions; plots of the three reward branches as functions of IoU)
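The region-aware reward can be sketched as below, assuming the four GT regions are given as boolean masks. How SF/WF/WB/SB are carved out of the GT (e.g., by distance to the object boundary) is not specified on this slide, so the `regions` dict is a hypothetical interface.

```python
import numpy as np

def r_si(seed_yx, label, regions, r_exp_value):
    """Region-aware seed reward (sketch). `regions` maps 'SF', 'WF',
    'WB', 'SB' to boolean masks partitioning the image; `label` is
    1 for a foreground seed, 0 for background."""
    y, x = seed_yx
    fore = (label == 1)
    if (fore and regions['SF'][y, x]) or (not fore and regions['SB'][y, x]):
        return r_exp_value            # strong region: full exponential reward
    if (fore and regions['WF'][y, x]) or (not fore and regions['WB'][y, x]):
        return r_exp_value - 1.0      # weak region: shifted (penalised) reward
    return -1.0                       # seed on the wrong side: fixed penalty
```

A foreground seed thus earns R_exp in SF, R_exp − 1 in WF, and −1 anywhere in the background (and symmetrically for background seeds).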
DQN
 Train seed generation agent with RL
 Deep Q-Network
 Policy Gradient
 Actor-Critic
 Deep Q-Network
 Train the action-value function Q(s, a), the expected reward
 Approximate Q(s, a) with a deep neural network
 Loss function
   Loss(θ) = 𝔼[(r + γ max_a′ Q(s′, a′; θ) − Q(s, a; θ))²]
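For one minibatch, the loss above can be sketched in NumPy. Note this single-θ form matches the formula as written; in practice Q(s′, a′) comes from a frozen target network, and Double/Dueling DQN modify it further.

```python
import numpy as np

def dqn_loss(q, q_next, actions, rewards, gamma=0.99):
    """Squared TD-error over a minibatch (sketch).
    q, q_next : (batch, n_actions) Q-values at s and s' from the network
    actions   : (batch,) indices of the actions taken
    rewards   : (batch,) immediate rewards"""
    q_sa = q[np.arange(len(actions)), actions]        # Q(s, a; θ)
    target = rewards + gamma * q_next.max(axis=1)     # r + γ max_a' Q(s', a'; θ)
    return np.mean((target - q_sa) ** 2)
```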
DQN
 DQN architecture
 Basic architecture from [1]
 Double DQN [2]
 Dueling DQN [3]
[1] V. Mnih et al. Human-level control through deep reinforcement learning. In Nature. 2015.
[2] H. Van Hasselt et al. Deep reinforcement learning with double q-learning. In AAAI, 2016.
[3] Z. Wang et al. Dueling network architectures for deep reinforcement learning. In ICML, 2016.
Experiments
 MSRA10K dataset
 Saliency task
 10,000 images (9,000 train, 1,000 test)
 Initial seed generated from GT mask
 1 foreground, 1 background seed point
 Our model
 Segmentation module : Random Walk [1]
 Stop after 10 seed generations
[1] L. Grady. Random walks for image segmentation. In PAMI. 2006.
MSRA10K
 Quantitative Results
 5 random seed sets
 Intersection-over-Union score comparison
Method Set 1 Set 2 Set 3 Set 4 Set 5 Mean
RW [1] 39.59 39.65 39.71 39.77 39.89 39.72
Ours 60.70 60.12 61.28 61.87 60.90 60.97
[1] L. Grady. Random walks for image segmentation. In PAMI. 2006.
MSRA10K
(Qualitative results: Image, GT, RW result [1] from the initial seed, and our results at Steps 1, 2, 3, and 10)
[1] L. Grady. Random walks for image segmentation. In PAMI. 2006.
MSRA10K
 Comparison with supervised methods
 Same conditions (network configuration, trained from scratch)
 Directly output a segmentation mask
 Input type : image only (FCN), image with seed (iFCN)
Method FCN [1] iFCN [2] Ours
IoU 37.20 44.60 60.97
[1] J. Long et al. Fully convolutional networks for semantic segmentation. In CVPR. 2015
[2] N. Xu et al. Deep interactive object selection. In CVPR. 2016.
Ablation Study
 Reward function
Method   RW [1]   R_IoU   R_diff   R_si
IoU      39.72    42.55   44.45    60.97
[1] L. Grady. Random walks for image segmentation. In PAMI. 2006.
Ablation Study
 Other segmentation module
Method   GC [1]   Ours (GC)   GSC [2]   Ours (GSC)   RWR [3]   Ours (RWR)
IoU      38.44    52.10       58.34     63.48        35.71     53.04

(Qualitative results: Image, GT, Initial, and Ours for GC, GSC, and RWR)
[1] C. Rother et al. Grabcut: Interactive foreground extraction using iterated graph cuts. In ToG. 2004.
[2] V. Gulshan et al. Geodesic star convexity for interactive image segmentation. In CVPR. IEEE, 2010.
[3] T. H. Kim et al. Generative image segmentation using random walks with restart. In ECCV. 2008.
Unseen Dataset
 Train on MSRA10K
 Test on
 GSCSEQ, Weizmann Single Object, Weizmann Horse, iCoseg, IG02
Dataset   GSCSEQ          Weizmann Single   Weizmann Horse   iCoseg          IG02
Method    RW     Ours     RW      Ours      RW      Ours     RW     Ours     RW     Ours
IoU       27.40  38.93    38.29   55.08     25.63   48.33    36.02  43.87    23.87  28.48
Unseen Dataset
(Qualitative results: Image, GT, RW result from the initial seed, and our results at Steps 1, 2, 3, and 10)
Unseen Dataset
(Qualitative results: Image, GT, RW result from the initial seed, and our results at Steps 1, 2, 3, and 10)
Summary
 Proposed a novel seed generation module with deep reinforcement learning
 Improved segmentation accuracy using the automatic seed generation system
 Validated on various segmentation modules and datasets