Learning What to Defer for Maximum Independent Sets (ICML 2020)

Sungsoo Ahn, Younggyo Seo, Jinwoo Shin

arXiv: https://arxiv.org/abs/2006.09607


  1. Learning What to Defer for Maximum Independent Sets. Sungsoo Ahn, Younggyo Seo, Jinwoo Shin. Korea Advanced Institute of Science and Technology (KAIST).
  2. Solving the MIS with a DNN • We train a Deep Neural Network (DNN) to solve the Maximum Independent Set (MIS) problem: finding a maximum set of non-adjacent vertices in a graph. • We use Reinforcement Learning (RL) to train the DNN policy. • Our contribution: designing an "efficient" Markov Decision Process (MDP) such that • the MIS solution is determined by the actions, and • the MIS objective is decomposed into rewards. However, human-designed algorithms exist... why "learn" a DNN-based solver? [Figure: problem (graph) → solver (DNN) → solution]
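As a minimal illustration of the objective above (assuming the networkx library; the helper names are ours, not from the paper), the following sketch checks that a vertex set is independent and reports its size, which is exactly what the MIS problem maximizes.

```python
# Minimal sketch of the MIS objective, assuming networkx and integer vertex labels.
import networkx as nx

def is_independent_set(graph: nx.Graph, vertices: set) -> bool:
    """True if no two vertices in `vertices` share an edge."""
    return not any(graph.has_edge(u, v) for u in vertices for v in vertices if u < v)

def mis_objective(graph: nx.Graph, vertices: set) -> int:
    """The MIS objective: the size of the set, counted only if it is independent."""
    assert is_independent_set(graph, vertices)
    return len(vertices)

G = nx.cycle_graph(5)            # a 5-cycle; its maximum independent set has size 2
print(mis_objective(G, {0, 2}))  # -> 2
```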
  3. Why "learn" a solver? • Our solver learns to solve combinatorial optimization from the training data. • Traditional solvers are fully hand-crafted by humans, without training data. • Based on this work, we can automatically design solvers that are: • specialized to a distribution of problems (e.g., MIS on grid graphs), and • capable of solving new types of problems (e.g., MAXCUT). • We primarily focus on MIS; it is fundamental and practical, e.g., shown to be NP-complete [Karp, 1972] and applied to computer vision [Sander et al., 2008]. Next, we review existing DNN-based solvers for the MIS problem.
  4. Deciding MIS one-by-one • One-shot generation of the MIS is hard: how do we constrain the vertices to be non-adjacent? It also suffers from the credit assignment problem. • Instead, existing work predicts the solution one vertex at a time [Dai et al., 2018]. • Vertices are excluded at the beginning, and then included one by one. • Invalid choices of vertices are masked out at each step. However, selecting 1,000,000 vertices requires 1,000,000 DNN inferences? [Figure: input (graph) → output (vertices)]
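A rough sketch of such a one-by-one construction (not the actual S2V-DQN implementation; `score_vertices` is a hypothetical stand-in for the trained DNN). It makes the cost concrete: one DNN call per included vertex, so a solution with 1,000,000 vertices needs 1,000,000 inferences.

```python
import networkx as nx

def solve_one_by_one(graph: nx.Graph, score_vertices):
    """Greedy one-by-one construction with masking: at each step, include the
    best-scoring feasible vertex and mask out its neighbors."""
    included, masked = set(), set()
    while len(included) + len(masked) < graph.number_of_nodes():
        candidates = set(graph.nodes) - included - masked
        scores = score_vertices(graph, included)       # hypothetical DNN call, one per step
        v = max(candidates, key=lambda u: scores[u])   # pick the best feasible vertex
        included.add(v)
        masked |= set(graph.neighbors(v))              # its neighbors can never be included
    return included
```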
  5. Learning what to defer for MIS • We train a DNN to decide on a subset of vertices at each step. • Decisions on the vertices are "deferred" at the beginning. • At each step, each vertex is decided to be either excluded, included, or deferred. • At each iteration, the DNN chooses one of the 3^(# of deferred vertices) available actions. • The policy repeatedly chooses to "defer" the decision on some vertices. • Our algorithm has an "adaptive" number of iterations; hence, it is applicable to large-scale graphs. • Intuitively, the policy learns to defer the hard decisions; they become easier in later iterations. How to train with RL, and how to parameterize the DNN?
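A sketch of this deferral loop under our own simplifications (`policy` and `clean_up` are hypothetical stand-ins; the graph is assumed to be a networkx graph, and the real method uses the transition clean-up described on the next slide). The key point is that a single DNN call relabels all currently deferred vertices at once.

```python
import torch

DEFERRED, EXCLUDED, INCLUDED = 0, 1, 2   # per-vertex labels

def solve_by_deferring(graph, policy, clean_up, max_steps=32):
    """Iterative deferral: all vertices start DEFERRED; each step, the policy
    relabels every still-deferred vertex as DEFERRED, EXCLUDED, or INCLUDED."""
    labels = torch.full((graph.number_of_nodes(),), DEFERRED)
    for _ in range(max_steps):
        deferred = (labels == DEFERRED).nonzero(as_tuple=True)[0]
        if len(deferred) == 0:
            break                                       # every vertex has been decided
        logits = policy(graph, labels)                  # hypothetical call: (n, 3) logits
        actions = torch.distributions.Categorical(logits=logits[deferred]).sample()
        labels[deferred] = actions                      # only deferred vertices are updated
        labels = clean_up(graph, labels)                # transition-function clean-up (next slide)
    # In this sketch, any vertex still deferred after max_steps is simply not included.
    return (labels == INCLUDED).nonzero(as_tuple=True)[0]
```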
  6. Learning what to defer for MIS • Our algorithm is a combination of a Markov decision process (MDP) and a DNN-based policy and value estimator. • We train the policy using Proximal Policy Optimization (PPO) [Schulman et al. 2017]. • We use the GraphSAGE [Hamilton et al. 2017] architecture to parameterize the DNN. • We use two separate DNNs for the policy and the value estimator. [Figure: state → action (on 3 vertices) → next state via the transition; reward: 2, the number of newly included vertices.] Sometimes, the policy may generate invalid actions.
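A minimal sketch of the two networks under assumed dimensions (using the SAGEConv layer from PyTorch Geometric; the input features, hidden sizes, and depth are our guesses, not the paper's exact configuration).

```python
import torch
import torch.nn as nn
from torch_geometric.nn import SAGEConv

class DeferPolicy(nn.Module):
    """GraphSAGE policy head: per-vertex logits over {defer, exclude, include}."""
    def __init__(self, in_dim=4, hidden=64):
        super().__init__()
        self.conv1, self.conv2 = SAGEConv(in_dim, hidden), SAGEConv(hidden, hidden)
        self.head = nn.Linear(hidden, 3)                # 3 choices per deferred vertex

    def forward(self, x, edge_index):
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        return self.head(h)                             # shape: (num_vertices, 3)

class DeferValue(nn.Module):
    """Separate GraphSAGE value estimator, pooled to a scalar per graph."""
    def __init__(self, in_dim=4, hidden=64):
        super().__init__()
        self.conv1, self.conv2 = SAGEConv(in_dim, hidden), SAGEConv(hidden, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, edge_index):
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        return self.head(h).mean()                      # scalar value estimate
```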
  7. Transition function for clean-up • We use the MDP's transition function to "clean up" sub-optimal actions. • When the policy makes invalid assignments, they are rejected (included vertices must be non-adjacent). • When the policy leaves out trivial assignments, they are processed (a deferred vertex surrounded only by excluded vertices is trivially included). [Figure: policy step followed by transition step, shown on two example graphs.]
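A sketch of the two clean-up rules stated above, on a networkx graph with per-vertex labels (how an invalid pair is resolved is not stated on the slide; reverting both vertices to deferred is our assumption).

```python
import networkx as nx

DEFERRED, EXCLUDED, INCLUDED = 0, 1, 2

def clean_up(graph: nx.Graph, labels: dict) -> dict:
    """Transition-function clean-up: reject invalid assignments, then fill in trivial ones."""
    labels = dict(labels)
    # (1) Invalid: two adjacent INCLUDED vertices. Assumed resolution: revert both to DEFERRED.
    for u, v in graph.edges:
        if labels[u] == INCLUDED and labels[v] == INCLUDED:
            labels[u] = labels[v] = DEFERRED
    # (2) Trivial: a DEFERRED vertex surrounded only by EXCLUDED vertices can safely be included.
    for v in graph.nodes:
        if labels[v] == DEFERRED and all(labels[u] == EXCLUDED for u in graph.neighbors(v)):
            labels[v] = INCLUDED
    return labels
```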
  8. Ablation: is deferring helpful? • Constraining the maximum number of steps in the MDP degrades performance. • As the MDP gets "shorter", the policy needs to decide on more vertices at each step. • We see a tradeoff between the performance and the speed of the algorithm. [Figure: performance when varying the length T of the MDP.] Next, an additional "diversification" bonus for improving performance.
  9. Solution diversification reward • At test time, we generate multiple solutions per graph and take the best one. • Unlike most reinforcement learning, we care about the best outcome, not the average one. • Hence it is better to diversify the solutions than to improve the average. • To this end, we reward the policy for generating diverse solutions. [Figure: problem (graph) → solver (DNN) → multiple solutions]
  10. Solution diversification reward • To reward diverse solutions, we couple and compare a pair of MDPs defined on the same graph. • We reward the policy whenever the two solutions become more diverse on vertices that are NOT deferred. [Figure: MDP 1 and MDP 2 on the same graph; the final solutions are diverse by 7; intermediate rewards: 2, 3, 2.]
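A sketch of the idea behind the bonus (the paper's exact reward may be defined differently; the counting rule here follows the slide, and the coefficient is an assumed hyperparameter): compare the coupled rollouts vertex by vertex and reward disagreement among vertices that both rollouts have already decided.

```python
DEFERRED = 0   # same label encoding as the earlier sketches

def diversification_bonus(labels_1: dict, labels_2: dict, coeff: float = 0.1) -> float:
    """Count vertices that are decided (not deferred) in both coupled rollouts
    but assigned differently, and scale by an assumed coefficient."""
    diverse = sum(
        1 for v in labels_1
        if labels_1[v] != DEFERRED and labels_2[v] != DEFERRED and labels_1[v] != labels_2[v]
    )
    return coeff * diverse
```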
  11. Experiments • We mainly focus on the maximum independent set (MIS) problem. • We also show our framework can be applied to other problems. • We compare with the following algorithms: • Exact integer programming solver (CPLEX; not DNN-based) • State-of-the-art non-deep-learning MIS heuristic (KaMIS; not DNN-based) [Lamm et al. 2017] • DNN-based RL solver that decides the MIS one-by-one (S2V-DQN) [Dai et al. 2018] • DNN-based supervised learning (SL) solver (TGS) [Li et al. 2019] • TGS + Graph Reduction and Local Search heuristics (TGS+GR+LS) • We have two versions of our algorithm: • Learning What to Defer (LwD), as presented. • LwD + Local Search heuristic (LwD+LS).
  12. Experiments: moderate-scale graphs. Results on synthetic graphs with ~500 vertices. [Table: columns for graph type, graph size, algorithm type, our algorithm, objective, and time spent.]
  13. Experiments: moderate-scale graphs. Results on synthetic graphs with ~500 vertices: LwD+LS ≈ KaMIS > TGS+GR+LS > CPLEX > S2V-DQN (ours ≈ SOTA > SL > exact > one-by-one).
  14. Experiments: moderate-scale graphs. Results on synthetic graphs with ~500 vertices (only DNN-based): LwD > S2V-DQN > TGS (ours > one-by-one > SL).
  15. Experiments: moderate-scale graphs. Results on real-world graphs with ~20,100 vertices. • Remark: the one-by-one policy performs poorly on larger graphs.
  16. Experiments: large-scale graphs. Results on large-scale graphs with ~2,000,000 vertices: LwD+LS > LwD > KaMIS > TGS+GR+LS > TGS > CPLEX (ours > SOTA > SL > exact). • Remarks: • The one-by-one policy is not applicable to large graphs. • LwD outperforms the SOTA with up to a 15x speedup.
  17. Experiments: other combinatorial optimizations • Our algorithm can be applied beyond the maximum independent set problem: • Maximum weighted independent set (MWIS) problem • Prize-collecting maximum independent set (PCMIS) problem • MAXCUT problem • Maximum-a-posteriori inference for the Ising model. [Table: results on problems outside the maximum independent set problem.]
  18. Conclusion • We propose the Learning What to Defer (LwD) framework for solving combinatorial problems. • LwD outperforms existing deep learning solvers on the MIS problem. • LwD outperforms the existing solvers on large graphs under a limited time budget.
  19. Thank you very much!
