A Learning-based Iterative
Method for Solving Vehicle
Routing Problem
Hao Lu, Xingwen Zhang and Shuang Yang
Princeton University and Ant Financial Services Group
ICLR 2020
Abstract
• Present “Learn to Improve” (L2I) for solving the capacitated vehicle routing
problem (CVRP).
• Start with an initial solution and refine it iteratively.
• Outperform classical operations research (OR) approaches (e.g.
LKH3).
Introduction
• In recent years, following the Pointer Network, researchers have started to
develop new deep learning and reinforcement learning frameworks to
solve combinatorial optimization problems.
• For the vehicle routing problem, prior learning-based results could not beat the
OR algorithm LKH3.
• Propose a learning-based algorithm for solving CVRP that outperforms
classical solvers.
Introduction (cont’d)
• Propose a hierarchical framework.
  • Separate heuristic operators into two classes: improvement operators and
  perturbation operators.
  • Choose the class first, then choose an operator within the class.
• Propose an ensemble method that trains several RL policies at the same
time.
Background
Capacitated Vehicle Routing Problem (CVRP)
• The CVRP has a depot and a set of 𝑁 customers. Each customer
𝑖 has a demand 𝑑𝑖 to be satisfied.
• A vehicle starts and ends at the depot and can serve a set of
customers whose total demand does not exceed the vehicle
capacity 𝐶.
• Goal: find a set of routes with minimal cost that fulfills the demands of all
customers without violating the vehicle capacity constraint.
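The definition above can be made concrete with a small sketch. It assumes Euclidean travel cost and a depot at index 0; the function names (`route_cost`, `solution_cost`, `is_feasible`) and the toy instance are illustrative and do not come from the paper.

```python
import math

def route_cost(route, coords, depot=0):
    """Length of one route that starts and ends at the depot."""
    stops = [depot] + list(route) + [depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(stops, stops[1:]))

def solution_cost(routes, coords, depot=0):
    """Total length of all routes in a solution."""
    return sum(route_cost(r, coords, depot) for r in routes)

def is_feasible(routes, demands, capacity, depot=0):
    """Every customer is served exactly once and no route exceeds the capacity."""
    visited = sorted(c for r in routes for c in r)
    customers = sorted(i for i in range(len(demands)) if i != depot)
    if visited != customers:
        return False
    return all(sum(demands[c] for c in r) <= capacity for r in routes)

# Toy instance (assumed for illustration): depot at index 0, three customers.
coords = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
demands = [0, 4, 5, 3]
routes = [[1, 2], [3]]
```

Here `routes` uses two vehicles; with capacity 9 the first route is exactly full, and lowering the capacity to 8 makes the solution infeasible.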
Local search and 2-opt
• Start with a feasible solution and look for an improved solution.
• Two TSP tours are called 2-adjacent if one can be obtained from the
other by deleting two edges and adding two edges.
• A TSP tour T is called 2-optimal if no tour 2-adjacent to T has
lower cost than T.
• 2-opt heuristic: repeatedly replace the current tour with a 2-adjacent tour of
lower cost until the tour is 2-optimal.
Source: MIT 15.053/8 The Traveling Salesman Problem and Heuristics
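The 2-opt heuristic described above can be sketched as follows. This is a minimal version assuming Euclidean distances and a first-improvement rule (any improving move is applied as soon as it is found); the names `tour_length` and `two_opt` are illustrative.

```python
import math

def tour_length(tour, coords):
    """Length of a closed tour that returns to its first node."""
    n = len(tour)
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n))

def two_opt(tour, coords):
    """Apply improving 2-opt moves until no 2-adjacent tour is cheaper."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # the two edges share a node, so there is no valid move
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # Delete edges (a, b) and (c, d); add edges (a, c) and (b, d).
                delta = (math.dist(coords[a], coords[c]) + math.dist(coords[b], coords[d])
                         - math.dist(coords[a], coords[b]) - math.dist(coords[c], coords[d]))
                if delta < -1e-12:
                    # Reversing the segment between b and c realizes the edge exchange.
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

# Unit square visited in a crossing order; 2-opt removes the crossing.
square = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
result = two_opt([0, 1, 2, 3], square)
```

The result is 2-optimal but not necessarily globally optimal, which is why local search is usually combined with some form of restart or perturbation.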
Learn to Improve
• An improvement operator tries to improve the current solution.
• A maximal consecutive sequence of improvement operators applied
before a perturbation is called an improvement iteration.
• A perturbation operator destroys and reconstructs the solution to generate a new
starting solution.
• If no cost reduction has been made for 𝐿 improvement steps, perturb
the solution.
• After 𝑇 steps, the algorithm stops and returns the minimum-cost
solution found.
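The control flow above can be sketched as a short loop. This is not the paper's implementation: operators are picked uniformly at random here instead of by a learned policy, and the toy problem (ordering numbers to minimize adjacent differences), the operators `swap_two` and `shuffle_all`, and the function `l2i_loop` are all hypothetical stand-ins.

```python
import random

def l2i_loop(initial, cost, improve_ops, perturb_ops, T, L, rng):
    """Improve until L consecutive steps bring no gain, then perturb; stop after T steps."""
    current, best = initial, initial
    no_gain = 0
    for _ in range(T):
        if no_gain >= L:
            # Perturbation: restart from a new solution, even if it is worse.
            current = rng.choice(perturb_ops)(current, rng)
            no_gain = 0
        else:
            # Improvement: keep the candidate only if it lowers the cost.
            candidate = rng.choice(improve_ops)(current, rng)
            if cost(candidate) < cost(current):
                current, no_gain = candidate, 0
            else:
                no_gain += 1
        if cost(current) < cost(best):
            best = current
    return best

def swap_two(sol, rng):
    """Toy improvement operator: swap two random positions."""
    i, j = rng.sample(range(len(sol)), 2)
    new = list(sol)
    new[i], new[j] = new[j], new[i]
    return new

def shuffle_all(sol, rng):
    """Toy perturbation operator: rebuild the solution in random order."""
    new = list(sol)
    rng.shuffle(new)
    return new

def path_cost(sol):
    """Toy cost: sum of absolute differences between neighbours."""
    return sum(abs(a - b) for a, b in zip(sol, sol[1:]))

start = [5, 1, 4, 2, 3, 0]
best = l2i_loop(start, path_cost, [swap_two], [shuffle_all], T=500, L=6, rng=random.Random(0))
```

In L2I the two `rng.choice` calls are where the learned components would sit; everything else is ordinary local search with restarts.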
L2I hierarchical framework
States for each node
[Table omitted. Surviving annotations: "problem", "solution", and "+1 if action led to reduction, -1 otherwise".]
Reward and Policy network
• Reward
  • Intermediate impact
    • +1 if the operator improves the current solution, -1 otherwise
  • Advantage-based
    • Take the distance for the problem during the first improvement iteration as the baseline.
    • In each subsequent iteration, the reward equals the difference between the current distance
    and the baseline.
• Policy network
  • REINFORCE algorithm
    $\nabla_\theta J(\theta \mid s) = \mathbb{E}_{\pi \sim p_\theta(\cdot \mid s)}\left[\left(L(\pi \mid s) - b(s)\right)\nabla_\theta \log p_\theta(\pi \mid s)\right]$
• Attention network:
  • Transformer
  • 8 heads
  • 64 output units
• Ensemble method: train 6 different policies.
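As a rough illustration of the REINFORCE gradient, the sketch below computes a single-sample estimate for a plain softmax policy over a handful of actions. The softmax-over-logits policy is an assumption made to keep the example small (the slides describe an attention network), and `softmax` and `reinforce_grad` are illustrative names.

```python
import math

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_grad(logits, action, ret, baseline):
    """Single-sample estimate of (ret - baseline) * d/d(logits) log p(action).

    For a softmax policy, d log p(action) / d logits[k] = 1[k == action] - p[k].
    """
    probs = softmax(logits)
    return [(ret - baseline) * ((1.0 if k == action else 0.0) - p)
            for k, p in enumerate(probs)]

# Three equally likely actions; action 0 was sampled with return 2.0 and baseline 0.5.
grad = reinforce_grad([0.0, 0.0, 0.0], action=0, ret=2.0, baseline=0.5)
```

Subtracting the baseline leaves the expected gradient unchanged but reduces its variance; when the return equals the baseline, the estimate is zero.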
Experiments and Analyses
• Three sub-problems with number of customers 𝑁 = 20, 50, 100
• Location of each customer and the depot in $[0,1]^2$.
• Demand of each customer in $\{1, 2, \dots, 9\}$.
• The capacity of a vehicle is 20, 30, 40 for 𝑁 = 20, 50, 100, respectively.
• ADAM optimizer
• 𝑇 = 40000; perturb the solution after 𝐿 = 6 consecutive
non-improvements
• 2000 random samples
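A random instance matching the setup above could be generated as follows. Uniform sampling of locations and demands is an assumption here (the slide only gives the ranges), and `sample_instance` and `CAPACITY` are illustrative names.

```python
import random

# Vehicle capacity per problem size, as listed above.
CAPACITY = {20: 20, 50: 30, 100: 40}

def sample_instance(n, rng):
    """Return (coords, demands, capacity); index 0 is the depot."""
    # Assumption: locations are drawn uniformly from the unit square.
    coords = [(rng.random(), rng.random()) for _ in range(n + 1)]
    # Assumption: demands are drawn uniformly from {1, ..., 9}; the depot has none.
    demands = [0] + [rng.randint(1, 9) for _ in range(n)]
    return coords, demands, CAPACITY[n]

coords, demands, capacity = sample_instance(20, random.Random(0))
```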
[Results table omitted. Surviving annotations: "heuristic", "solver", "SOTA", and "With improvement operators but without perturbation operators".]
Apply to TSP
Use the first node as the depot and set the demand of every customer to zero.
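The reduction above amounts to a small conversion step. In this sketch the function name `tsp_as_cvrp` and the default capacity of 1 are assumptions; with all demands at zero, any positive capacity lets a single vehicle visit every node.

```python
def tsp_as_cvrp(points, capacity=1):
    """Treat a TSP instance as a CVRP instance with zero demands.

    The first point becomes the depot. Because every demand is zero, the
    capacity constraint never binds and one route can cover all nodes.
    """
    coords = list(points)
    demands = [0] * len(coords)
    return coords, demands, capacity

tsp_coords, tsp_demands, tsp_capacity = tsp_as_cvrp([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
```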
Conclusion
• Propose “Learn to Improve” for solving CVRP, together with an ensemble method
that trains several RL policies and chooses the best solution produced by
the policies.
• Combine the strengths of OR with the learning capabilities of RL.
• Achieve new state-of-the-art results on CVRP instances.
