Computational Aspects Of
Vehicle Routing
Victor Pillac
May 20
VRP 2013, Angers, France
1
Agenda
2
• Introduction
• What is complexity? Why is it important?
• Data structures
• How to represent a solution efficiently?
• Algorithmic tricks
• What are the main bottlenecks and how to avoid them?
• Parallelization
• How does parallel computing work? Why, when, and how to parallelize?
• Software engineering
• How to design flexible and reusable code?
• Resources
• How to avoid reinventing the wheel?
INTRODUCTION
3
About Me
• Finished my Ph.D. in 2012 at the Ecole des Mines de Nantes (France)
and Universidad de Los Andes (Colombia)
• Dynamic vehicle routing: solution methods and computational tools
• Since Oct. 2012, researcher at NICTA (Melbourne, Australia)
• Disaster management team
• NICTA in a few numbers:
• 700 staff, 260 PhDs
• 7 research groups, 4 business teams
• 550+ publications in 2012
4
Assumptions
• General knowledge on vehicle routing
• General knowledge of common heuristics
• Local search
• Variable Neighborhood Search (VNS)
• General knowledge of object-oriented programming
• Examples are in Java
5
Time Complexity
6
• Measure the worst case number of operations
• Expressed as a function of the size of the problem n

For i = 1 to n
    a = 1 + i
    b = 2 * a
    c = a * b + 2

Performs n*(1+1+2) = 4n operations
Complexity is O(n)

For S ⊆ {1..n}
    a = 1 + |S|

Performs 2^n operations
Complexity is O(2^n)
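The two loops above can be sketched in Java with an explicit operation counter (the class and method names are illustrative, not part of the tutorial's code); the subset loop enumerates the 2^n subsets of {1..n} as bitmasks:

```java
public class ComplexityDemo {
    // O(n): a constant amount of work, repeated n times
    static long linear(int n) {
        long ops = 0;
        for (int i = 1; i <= n; i++) {
            int a = 1 + i;     // 1 operation
            int b = 2 * a;     // 1 operation
            int c = a * b + 2; // 2 operations
            ops += 4;
        }
        return ops; // 4n
    }

    // O(2^n): one iteration per subset S of {1..n}
    static long exponential(int n) {
        long ops = 0;
        for (long s = 0; s < (1L << n); s++) {
            int a = 1 + Long.bitCount(s); // |S| via popcount
            ops += 1;
        }
        return ops; // 2^n
    }

    public static void main(String[] args) {
        System.out.println(linear(10));      // 40
        System.out.println(exponential(10)); // 1024
    }
}
```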
Space Complexity
7
• Measure the worst case memory usage
• Expressed as a function of the size of the problem n

For i = 1 to n
    a = 1 + i
    b = 2 * a
    c = a * b + 2

Stores at most 4 integers simultaneously
Complexity is O(1)

For S ⊆ {1..n}
    a = 1 + |S|

Stores at most n+1 integers simultaneously
Complexity is O(n)
Complexity In Practice
8
         | 10     | 100    | 1000  | 10,000 | 100,000 | 1,000,000
n        | 0.1 ns | 1 ns   | 10 ns | 100 ns | 1 µs    | 10 µs
n·log(n) | 0.1 ns | 2 ns   | 30 ns | 400 ns | 5 µs    | 60 µs
n^2      | 1 ns   | 100 ns | 10 µs | 1 ms   | 100 ms  | 10 s
n^3      | 1 ns   | 10 µs  | 10 ms | 1 s    | 2.7 h   | 115 d
e^n      | 22 µs  | 8.5·10^24 years | - | - | - | -
Computational time for a single floating point operation
on a recent desktop processor
Complexity In Practice
9
         | 10     | 100    | 1000  | 10,000  | 100,000 | 1,000,000
n        | 320 b  | 3.2 kb | 32 kb | 320 kb  | 3.2 Mb  | 32 Mb
n·log(n) | 320 b  | 64 kb  | 96 kb | 1.28 Mb | 16 Mb   | 190 Mb
n^2      | 3.2 kb | 320 kb | 32 Mb | 3.2 Gb  | 320 Gb  | 32 Tb
n^3      | 32 kb  | 32 Mb  | 32 Gb | 32 Tb   | 32 Pb   | 32 Eb*
e^n      | 700 kb | 8·10^26 Eb | - | - | - | -
Memory requirement to store a single
floating point precision number
(*The world’s storage capacity is estimated to be 300 Eb - or 300 billion Gb)
Local Search & Terminology
10
[Diagram: a move transforms the initial solution S0 into a neighbor; the set of all neighbors of S0 is its neighborhood. Executing a move yields the new current solution (S1, then S3, ...)]
DATA STRUCTURES
11
Representing Routes
12
• Routes are at the core of solving vehicle routing problems
• It is critical to have efficient data structures to store them
• There is no best data structure
• Performance depends on how it is used
• Tradeoff between simplicity and performance
• Choice should be motivated by
• Purpose: prototype vs. state-of-the-art algorithm
• Usage: what are the most common operations?
Dynamic Array List
• Common operation complexity
• Access to the customer by position: O(1)
• Access to the position of customer by id: O(n)
• Iteration: O(1)
• Insertion/deletion: O(n)
• See [ArrayListRoute.java]
13
[Diagram: routes 0→2→3→4→0 and 0→1→6→5→7→0 stored as arrays indexed by position]
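As a rough sketch of what an array-backed route looks like (a hypothetical minimal class, not the tutorial's ArrayListRoute.java):

```java
import java.util.ArrayList;
import java.util.List;

// Array-backed route: O(1) access by position, O(n) lookup by id,
// O(n) insertion/removal (the tail of the array must be shifted)
public class ArrayRoute {
    private final List<Integer> nodes = new ArrayList<>();

    void append(int node)                 { nodes.add(node); }            // amortized O(1)
    int getNodeAt(int position)           { return nodes.get(position); } // O(1)
    void insertAt(int position, int node) { nodes.add(position, node); }  // O(n)
    int indexOf(int node)                 { return nodes.indexOf(node); } // O(n): linear scan

    public static void main(String[] args) {
        ArrayRoute r = new ArrayRoute();
        for (int n : new int[]{0, 1, 6, 5, 7, 0}) r.append(n); // route from the slide
        System.out.println(r.getNodeAt(2)); // 6
        r.insertAt(2, 3);                   // route becomes 0 1 3 6 5 7 0
        System.out.println(r.indexOf(3));   // 2
    }
}
```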
Doubly Linked List
• Common operation complexity
• Access to the customer by position: O(n)
• Access to the position of customer by id: O(n)
• Iteration: O(1)
• Insertion/deletion: O(1)
• See [LinkedListRoute.java]
14
[Diagram: routes 0→2→3→4→0 and 0→1→6→5→7→0 stored as doubly linked lists of nodes]
Doubly Linked List V2
15
• Common operation complexity
• Access to the customer by position: O(n)
• Access to the position of customer by id: O(1)
• Iteration: O(1)
• Insertion/deletion: O(1)
• Implementation can be tricky, especially for repeated nodes (e.g., depot)
• Warning: the implementation in VroomModeling is incomplete

[Diagram: routes 0→2→3→4→0 and 0→1→6→5→7→0 stored as Predecessor and Successor arrays indexed by node id, with First and Last pointers]
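A minimal sketch of the predecessor/successor-array idea, assuming every node id appears at most once (which sidesteps the repeated-depot difficulty noted above); this is illustrative, not the VroomModeling implementation:

```java
import java.util.Arrays;

// Successor-array route: pred/succ indexed by node id, so position-of-node and
// insertion are O(1), while access by position is O(n) (walk from the first node)
public class LinkedRoute {
    private final int[] pred, succ;
    private int first = -1, last = -1;

    LinkedRoute(int maxNodeId) {
        pred = new int[maxNodeId + 1];
        succ = new int[maxNodeId + 1];
        Arrays.fill(pred, -1);
        Arrays.fill(succ, -1);
    }

    void append(int node) { // O(1)
        if (first == -1) first = node;
        else { succ[last] = node; pred[node] = last; }
        last = node;
    }

    void insertAfter(int node, int after) { // O(1): only four links change
        int next = succ[after];
        succ[after] = node; pred[node] = after;
        succ[node] = next;
        if (next != -1) pred[next] = node; else last = node;
    }

    int getNodeAt(int position) { // O(n): walk the successor chain
        int cur = first;
        for (int i = 0; i < position; i++) cur = succ[cur];
        return cur;
    }

    public static void main(String[] args) {
        LinkedRoute r = new LinkedRoute(7);
        for (int n : new int[]{1, 6, 5, 7}) r.append(n);
        r.insertAfter(3, 1);                // route becomes 1 3 6 5 7
        System.out.println(r.getNodeAt(1)); // 3
    }
}
```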
16
/* HANDS ON */
17
Resources / Solutions:
http://victorpillac.com/vrp2013
18
Naming Conventions:
m: prefix for instance fields (e.g., mMyField)
s: prefix for static fields (e.g., sMyStaticField)
I: prefix for interface names (e.g., IMyInterface)
Base: suffix for abstract types (e.g., MyTypeBase)
Logging:
Uses Log4J, see VRPLogging.java
19
[Screenshot: the Eclipse workspace, showing the open files, the documentation and console views, the package/source-file explorer, and the class structure outline]
20
21
22
[Screenshot: a compilation error in Eclipse; clicking the marker opens a quick fix popup with an error explanation and possible fixes]
algorithms package
23
IVRPOptimizationAlgorithm implementations:
• CW: Clarke and Wright heuristic to generate routes
• VNS: explores a number of neighborhoods
• GRASPVND: starts with a solution from CW and applies VND
• ParallelGRASP: parallel implementation of GRASP
• Heuristic Concentration: takes a set of routes and builds a solution
examples package: each class contains a main method that we will use to run the examples
util package: classes to make our life easier
VNS
24
[ExampleRoutesAtomic.java]
• Compares ArrayListRoute and LinkedListRoute
• Append a node
• Get a node at a random position
• Remove the first node
25
ArrayList:  Append: 123.1 ms   GetNodeAt: 18.7 ms   RemoveFirst: 134.2 ms
LinkedList: Append: 129.4 ms   GetNodeAt: 66.6 ms   RemoveFirst: 110.6 ms
[CW.java]
• Clarke and Wright constructive heuristic
26
[Diagram: a depot 0 and customers 1-4, with routes being merged step by step]
Initialization: create one route per node
Each step: merge the two routes that yield the greatest saving
Repeat until there are no more feasible merges
Implemented in VroomHeuristics in package vroom.common.heuristics.cw
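The saving that drives each merge can be computed as s(i,j) = c(0,i) + c(0,j) - c(i,j), with node 0 the depot. A sketch with a toy cost matrix (the class name and the numbers are illustrative):

```java
// Clarke & Wright saving: merging the route ending at i with the route
// starting at j avoids the two depot legs c(0,i) and c(0,j) and adds c(i,j)
public class Savings {
    static double saving(double[][] c, int i, int j) {
        return c[0][i] + c[0][j] - c[i][j];
    }

    public static void main(String[] args) {
        double[][] c = { // toy symmetric cost matrix, node 0 is the depot
            {0, 4, 5, 3},
            {4, 0, 2, 6},
            {5, 2, 0, 7},
            {3, 6, 7, 0},
        };
        System.out.println(saving(c, 1, 2)); // 4 + 5 - 2 = 7.0
    }
}
```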
[ExampleCW.java]
27
[Screenshot: run configuration, with the argument set to "true" and the logging level set to LEVEL_WARN]
Variable Neighborhood Descent
• Explore different neighborhoods sequentially
• The final solution is a local optimum for all neighborhoods
28
[Diagram: start → N1 → N2 → ... → Nn → end; whenever an improvement is found in a neighborhood, the search restarts from N1, otherwise it moves to the next neighborhood]
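The loop above can be sketched abstractly in Java. As a deliberate toy model of the real interfaces, a solution is reduced to its cost (a double) and a neighborhood to an operator that returns an improved cost, or the same cost when it is stuck:

```java
import java.util.List;
import java.util.function.UnaryOperator;

// VND skeleton: restart from the first neighborhood after every improvement,
// stop when no neighborhood improves the current solution
public class VndSketch {
    static double vnd(double cost, List<UnaryOperator<Double>> neighborhoods) {
        int k = 0;
        while (k < neighborhoods.size()) {
            double improved = neighborhoods.get(k).apply(cost);
            if (improved < cost) { cost = improved; k = 0; } // improvement: back to N1
            else k++;                                        // stuck: next neighborhood
        }
        return cost; // a local optimum for all neighborhoods
    }

    static double demo() {
        UnaryOperator<Double> n1 = c -> c > 10 ? c - 5 : c; // improves while cost > 10
        UnaryOperator<Double> n2 = c -> c > 8 ? c - 1 : c;  // improves while cost > 8
        return vnd(20.0, List.of(n1, n2));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 8.0
    }
}
```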
[VND.java]
29
[Screenshot annotations:]
• The constraints are defined separately from the neighborhoods; each constraint is responsible for checking if a move is feasible
• Instantiate the neighborhoods that will be used later
• (One fragment can be ignored for now)
[VND.java]
30
[Screenshot annotations:]
• Performs a local search in the neighborhood of the current solution
• The parameters control how the search is performed, in this case deterministic & best improvement
[ExampleVND.java]
• Run the main method
• Is the ordering of neighborhoods in VND.java logical?
• How to improve it?
• Is the localSearch implementation in VND.java coherent with the definition of VND?
31
[Diagram: start → N1 → N2 → ... → Nn → end, restarting from N1 after each improvement]
[ExampleRoutesOptim.java]
• Compares ArrayListRoute and LinkedListRoute
• Constructive heuristic (CW)
• Variable Neighborhood Descent optimization (VND)
32
ArrayList:  CW: 617.1 ms   VND: 67,576.7 ms
LinkedList: CW: 414.4 ms   VND: 86,443.3 ms
Store Routes
33
• Store routes for future use
• Requirements
• Memory-efficient
• Avoid repeated routes
• Store a minimalistic route representation
• Low computation overhead
• Two approaches
• Exhaustive list
• Issue: repeated routes
• Hash based set
Hash Functions
• Compress the information stored in a route
• Desired characteristics
• Determinism
• Uniformity
• Issues
• Two different routes can have the same hash (hash collision)
• Computational cost of hash evaluation
34
35
/* HANDS ON */
• See Groer et al. 2010 - [GroerSolutionHasher.java]
• Produces a 32-bit integer that depends on the set and sequence of nodes in the route
010 XOR 111 = 101
(2)     (7)   (5)
Sequence Dependent Hash
36
Input:
- rnd: an array of n random integers
- route: a route
Output:
- a hash value for route

1. if route.first > route.last
     route ← reverse ordering of route
2. hash ← 0
3. for each edge (i,j) in route
     hash ← hash XOR rnd[(i+j) % n]
4. return hash
Sequence Independent Hash
• See Pillac et al. 2012 - [NodeSetSolutionHasher.java]
• Produces a 32-bit integer that depends on the set of nodes visited by the route
• Advantage: implicit filtering of duplicated routes
37

Input:
- rnd: an array of n random integers
- route: a route
Output:
- a hash value for route

1. hash ← 0
2. for each node i in route
     hash ← hash XOR rnd[i % n]
3. return hash
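Both pseudocodes above can be sketched in Java (illustrative class with a fixed n and seed; this is not the tutorial's GroerSolutionHasher.java or NodeSetSolutionHasher.java):

```java
import java.util.Random;

// The two route-hashing schemes: sequence-dependent (after Groer et al.) and
// node-set (after Pillac et al.). RND is the shared array of n random integers.
public class RouteHashers {
    static final int N = 16;
    static final int[] RND = new int[N];
    static {
        Random r = new Random(42);
        for (int i = 0; i < N; i++) RND[i] = r.nextInt();
    }

    // XOR one random value per edge (i,j); reverse first so that a route and
    // its mirror image hash to the same value
    static int sequenceHash(int[] route) {
        if (route[0] > route[route.length - 1]) route = reversed(route);
        int hash = 0;
        for (int k = 0; k + 1 < route.length; k++)
            hash ^= RND[(route[k] + route[k + 1]) % N];
        return hash;
    }

    // XOR one random value per visited node: the visiting order is ignored
    static int nodeSetHash(int[] route) {
        int hash = 0;
        for (int i : route) hash ^= RND[i % N];
        return hash;
    }

    static int[] reversed(int[] a) {
        int[] b = new int[a.length];
        for (int i = 0; i < a.length; i++) b[i] = a[a.length - 1 - i];
        return b;
    }

    public static void main(String[] args) {
        int[] r1 = {1, 6, 5, 7}, r2 = {7, 5, 6, 1}, r3 = {6, 1, 5, 7};
        System.out.println(sequenceHash(r1) == sequenceHash(r2)); // true: mirrored route
        System.out.println(nodeSetHash(r1) == nodeSetHash(r3));   // true: same node set
    }
}
```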
Example
• Greedy Randomized Adaptive Search Procedure
38
[Flowchart: Start → Randomized Constructive Heuristic → Local Search → End]
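The flow above can be sketched as a restart loop; the construction and local-search steps below are crude placeholders standing in for CW and VND, and a solution is again reduced to its cost:

```java
import java.util.Random;

// GRASP skeleton: repeat (randomized construction -> local search), keep the best
public class GraspSketch {
    static double randomizedConstruction(Random rnd) {
        return 100 + rnd.nextDouble() * 20; // pretend CW with randomized savings
    }

    static double localSearch(double cost) {
        return cost * 0.9; // pretend VND shaves 10% off
    }

    static double grasp(int iterations, long seed) {
        Random rnd = new Random(seed);
        double best = Double.POSITIVE_INFINITY;
        for (int it = 0; it < iterations; it++) {
            double s = localSearch(randomizedConstruction(rnd));
            if (s < best) best = s; // keep the best of all restarts
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(grasp(50, 0)); // somewhere in [90, 108]
    }
}
```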
[GRASP.java]
• Clarke and Wright construction heuristic
• Variable Neighborhood Descent optimization
39
[ExampleGRASP.java]
• Runs the GRASP procedure on a single instance
40
Heuristic Concentration
41
[Flowchart: Start → Randomized Constructive Heuristic → Local Search → Route pool → Set Covering → End]
Heuristic Concentration
42
• Set covering model:

min  Σ_{p ∈ Ω} c_p x_p
s.t. Σ_{p ∈ Ω} a_p^i x_p ≥ 1   ∀ i ∈ N
     x_p ∈ {0, 1}              ∀ p ∈ Ω

where Ω is the set of routes, N the set of nodes, c_p the cost of route p, a_p^i = 1 if route p visits node i, and x_p = 1 if route p is selected.
• Adapt the GRASP procedure to collect routes
• Add the following fragment where needed
• Hint: we want to collect as many routes as possible
• Experiment with different route pools
• What is the impact on the number of routes and HC time?
[ExampleGRASPHC.java]
43
Heuristic Concentration
44
[Flowchart: Start → Randomized Constructive Heuristic → Local Search → Route pool → Set Covering → End]
ALGORITHMIC TRICKS
45
Bottlenecks In Heuristics For VRP
46
• Size of the neighborhood
• Areas of the neighborhood are not interesting
• Only minor changes are made to the solution at each move
• How different is the new neighborhood?
• How to avoid restarting from scratch?
• Move evaluation
• Cost & Feasibility
• Performed millions of times
• Which is most costly? Which should be done first?
Granular Neighborhoods
47
• Reduce the size of the neighborhoods
• See Toth and Vigo (2003)
• Costly (long) arcs are less likely to be in good solutions
• Filter out moves that involve only costly arcs
• Costly arc threshold: ϑ = β · z0 / (n + K0), where z0 is the cost of a heuristic solution, n + K0 its number of nodes plus number of vehicles, and β a sparsification parameter (e.g., β = 2.5)

[Diagram: in the example, inserting 5 between 3 and 4 involves 2 costly arcs, while inserting 5 between 1 and 2 involves 1 costly arc]
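A sketch of the threshold and the resulting move filter (names and numbers are illustrative): a move survives the filter when at least one of the arcs it would add is below the threshold.

```java
// Granular threshold after Toth and Vigo (2003): beta * z0 / (n + K0)
public class GranularFilter {
    static double threshold(double beta, double z0, int n, int k0) {
        return beta * z0 / (n + k0);
    }

    // Keep a move iff at least one added arc is not costly
    static boolean keepMove(double[] addedArcCosts, double threshold) {
        for (double c : addedArcCosts)
            if (c <= threshold) return true;
        return false;
    }

    public static void main(String[] args) {
        double t = threshold(2.5, 1000.0, 48, 2);                  // 50.0
        System.out.println(keepMove(new double[]{60.0, 45.0}, t)); // true: one cheap arc
        System.out.println(keepMove(new double[]{60.0, 70.0}, t)); // false: only costly arcs
    }
}
```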
Static Move Descriptor (SMD)
• Store information between moves
• See Zachariadis and Kiranoudis (2010)
• Precompute and maintain all moves
• Example with relocate (relocation of a single node)
48
[Diagram: a table indexed by (n1, n2) holding the cost of relocating node n1 after node n2]
Cost of relocating 4 after 3: c3,4 + c0,5 - c5,4 - c3,0
Cost of relocating 1 after 5: c0,5 + c1,4 - c0,1 - c5,4
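The relocate cost stored in an SMD entry can be written as a general six-term delta: remove node v from between p and s, reinsert it between a and b. In the slide's example two terms cancel, which is consistent with the four-term expression above. A sketch with toy costs (illustrative names):

```java
// Delta cost of a relocate move: remove v from between p and s,
// reinsert it between a and b
public class RelocateDelta {
    static double delta(double[][] c, int p, int v, int s, int a, int b) {
        return c[a][v] + c[v][b] - c[a][b]   // arcs added at the insertion point
             - c[p][v] - c[v][s] + c[p][s];  // arcs removed, plus the shortcut p->s
    }

    public static void main(String[] args) {
        double[][] c = new double[6][6];
        for (int i = 0; i < 6; i++)
            for (int j = 0; j < 6; j++)
                c[i][j] = i == j ? 0 : i + j; // toy symmetric costs
        // relocate 4 (currently between 5 and 0) to after 3 (before 0)
        System.out.println(delta(c, 5, 4, 0, 3, 0));
        // the slide's reduced formula for the same move
        System.out.println(c[3][4] + c[0][5] - c[5][4] - c[3][0]);
    }
}
```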
SMD Update
• One static SMD table is created per neighborhood
• Static update rules are predefined to know which SMDs need to be updated after a move is executed
49
[Diagram: after a move is executed, only the affected entries of each SMD table are recomputed]
Selecting The Best Neighbor
• All SMDs are stored in a Fibonacci heap
• O(1) access to the lowest cost SMD
• O(1) insertion
• O(log(n)) amortized deletion
• How to find the best feasible neighbor?
• Pop the lowest cost SMD until a feasible move is found
50
Source: http://en.wikipedia.org/wiki/Fibonacci_heap
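The pop-until-feasible step can be sketched with Java's binary-heap PriorityQueue standing in for the Fibonacci heap (illustrative moves and costs):

```java
import java.util.PriorityQueue;

// SMDs ordered by cost in a min-heap; the first feasible pop is the best feasible move
public class BestNeighbor {
    static class Move {
        final String name; final double cost; final boolean feasible;
        Move(String name, double cost, boolean feasible) {
            this.name = name; this.cost = cost; this.feasible = feasible;
        }
    }

    static Move bestFeasible(PriorityQueue<Move> heap) {
        Move m;
        while ((m = heap.poll()) != null) // pop in increasing cost order
            if (m.feasible) return m;
        return null; // no feasible move left
    }

    static String demo() {
        PriorityQueue<Move> heap =
            new PriorityQueue<>((a, b) -> Double.compare(a.cost, b.cost));
        heap.add(new Move("relocate(4,3)", -5.0, false)); // cheapest, but infeasible
        heap.add(new Move("relocate(1,5)", -3.0, true));  // best feasible
        heap.add(new Move("swap(2,3)", -1.0, true));
        return bestFeasible(heap).name;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // relocate(1,5)
    }
}
```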
SMD In Practice
51
[Chart: CPU time for 50,000 iterations vs. problem size n, classic representation vs. SMD representation (Fig. 8: "The acceleration role of the SMD representation")]
Source: Zachariadis and Kiranoudis (2010)
Comparison of computational times
Sequential Search
• Explore neighborhoods in a smart way
• See Irnich et al. (2006)
• Decompose moves into partial moves
• Example with swap
52
[Diagram: a swap move between two routes decomposed into partial moves on nodes 1-5]
Sequential Search In Practice
• Neighborhoods are explored by considering partial moves
• Exploration is pruned using bounds on the partial move cost
53
[Charts: acceleration factor of sequential search vs. problem size n for the Or-Opt, String-Exchange, Special 2-Opt*, Swap, Relocation, 2-Opt and 3-Opt* neighborhoods, for f = 25, 50, 75, 100, where f is the average number of customers in a route (S. Irnich et al., Computers & Operations Research 33 (2006) 2405-2429, Fig. 8)]
Speedup for swap and 3-Opt* neighborhoods
Source: Irnich et al. (2006)
Store Cumulative Information
• Reduce the complexity of move evaluation
• Store and maintain useful information
• For example: waiting time / forward slack time
• See Savelsbergh (1992)
• Constant-time time window feasibility check
• More details in Module 2
54
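As a flavor of the idea (a simpler cousin of Savelsbergh's forward slack time, not his actual scheme): precomputing prefix sums along a route turns any segment-cost query into O(1).

```java
// Cumulative information: prefix sums of leg costs along a route,
// so a segment cost query is O(1) instead of O(n)
public class CumulativeCost {
    static double[] prefix(double[] legCosts) {
        double[] p = new double[legCosts.length + 1];
        for (int i = 0; i < legCosts.length; i++) p[i + 1] = p[i] + legCosts[i];
        return p;
    }

    // cost of travelling from position i to position j, O(1)
    static double segmentCost(double[] prefix, int i, int j) {
        return prefix[j] - prefix[i];
    }

    public static void main(String[] args) {
        double[] p = prefix(new double[]{2, 3, 1, 4});
        System.out.println(segmentCost(p, 1, 3)); // 3 + 1 = 4.0
    }
}
```

The same pattern (maintain a per-position aggregate, update it when the route changes) underlies the constant-time time window checks mentioned above.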
PARALLELIZATION
55
Moore’s Law
56
Source: http://en.wikipedia.org/wiki/Moore's_law
Moore’s Law
56
Source: http://en.wikipedia.org/wiki/Moore's_law
Doubles every 2
years
Moore’s Law
56
Source: http://en.wikipedia.org/wiki/Moore's_law
Doubles every 2
years
Doubles every 3
years
Clock Frequency
57
Source: http://cpudb.stanford.edu/visualize/clock_frequency
Promises Of Parallelization
58
• Overcome the stalling of CPU performance increase
• Increased availability of parallel computing
• Personal computers with multiple CPUs/cores
• Most universities have access to large grids
• On demand cloud services (e.g., Amazon)

The parallelization illusion:
Parallel CPU time = Sequential CPU time / Number of CPUs
Architecture Overview
59
[Diagram: threads run on the cores of a CPU; each core has its own L1 caches; the CPU shares an L2 cache (~Mb, very fast); below it sit the RAM (~Gb, fast) and the HDD (~Tb, slow to extremely slow)]
Concepts And Limitations
60

My Program:
    executeMySequentialCode()
    thread1 = new Thread(do A, do B)
    thread1.run()
    thread2 = new Thread(do C)
    thread2.run()

Operating System:
    thread1: Create a new thread → Assign it to cpu core #4 → Execute the instructions A, B
    thread2: Create a new thread → Assign it to cpu core #1 → Execute the instructions C

Limitations:
• Creating a thread takes time
• Limited control on the actual execution sequence
• Increased memory usage
• Concurrent access to shared resources
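The pseudocode above, translated to actual Java. One hedge worth noting: in Java the call that hands a thread to the OS is start(); calling run() directly would execute the body sequentially in the calling thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadsDemo {
    static final AtomicInteger done = new AtomicInteger();

    static void doA() { done.incrementAndGet(); }
    static void doB() { done.incrementAndGet(); }
    static void doC() { done.incrementAndGet(); }

    static int demo() {
        done.set(0);
        Thread thread1 = new Thread(() -> { doA(); doB(); });
        Thread thread2 = new Thread(ThreadsDemo::doC);
        thread1.start(); // the OS picks the cores and the interleaving
        thread2.start();
        try {            // wait for both threads to finish
            thread1.join();
            thread2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done.get(); // 3: A, B and C all executed
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 3
    }
}
```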
Sharing Is Caring
61

Thread 1 and Thread 2 share an object holding x = 1:

Thread 1:                    Thread 2:
1. y = object.getX()         1. z = object.getX()
2. y = y + 1                 2. z = z + 2
3. object.setX(y)            3. object.setX(z)
4. y = object.getX()

Both threads read x = 1 before either writes: Thread 1 writes x = 2, Thread 2 then overwrites it with x = 3, and Thread 1's final read returns an unexpected value:
x = 1 + 1 + 2 ≠ 3 ✗ (one update is lost)
Sharing With Care
62
The same two threads, now protected by a lock (x = 1 initially):

Thread 1:
1  lock(object)
2  (waiting)
3  z = object.getX()
4  z = z + 2
5  object.setX(z)
6  release(object)

Thread 2:
1  lock(object)
2  y = object.getX()
3  y = y + 1
4  object.setX(y)
5  release(object)

Here Thread 2 acquires the lock first and sets x = 2; Thread 1 waits, then
acquires the lock, reads x = 2, and sets x = 4. The result is
x = 1 + 1 + 2 = 4 ✓ regardless of which thread wins the lock.
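The lock(object)/release(object) pattern above maps directly onto Java's built-in monitors. A minimal sketch, assuming a shared counter; the SharedX class and its method names are illustrative, not taken from the tutorial's code:

```java
// Minimal sketch of lock(object)/release(object) using Java monitors.
// Only one thread at a time can run a synchronized method of an object,
// so the read-modify-write becomes atomic.
class SharedX {
    private int x = 1;

    synchronized void add(int delta) {
        x = x + delta; // protected read-modify-write
    }

    synchronized int get() {
        return x;
    }
}

public class SharingWithCare {
    public static void main(String[] args) throws InterruptedException {
        SharedX object = new SharedX();
        Thread t1 = new Thread(() -> object.add(2)); // Thread 1: x += 2
        Thread t2 = new Thread(() -> object.add(1)); // Thread 2: x += 1
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Whatever the interleaving, x = 1 + 1 + 2 = 4
        System.out.println(object.get());
    }
}
```

Without the synchronized keyword, the two add() calls could interleave exactly as on the previous slide and lose an update.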
The Limits Of Sharing
63
• Lock/release mechanisms force threads to wait
• In the worst case the execution is sequential
• In general:
• Lock an object while it may be modified
• Do not lock for read-only operations
• Check for inconsistencies at runtime
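The "lock only while an object may be modified, not for read-only operations" advice maps onto java.util.concurrent's ReadWriteLock. A minimal sketch, assuming a shared best-solution holder; the class and field names are illustrative:

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch: many threads may read the best cost concurrently,
// but a writer takes exclusive access while updating it.
class BestSolutionHolder {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private double bestCost = Double.POSITIVE_INFINITY;

    double getBestCost() {
        lock.readLock().lock(); // shared lock: readers do not block each other
        try {
            return bestCost;
        } finally {
            lock.readLock().unlock();
        }
    }

    boolean tryImprove(double cost) {
        lock.writeLock().lock(); // exclusive lock: blocks readers and writers
        try {
            if (cost < bestCost) {
                bestCost = cost;
                return true;
            }
            return false;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The try/finally blocks guarantee the lock is released even if an exception is thrown, which is one of the "inconsistencies at runtime" worth guarding against.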
Parallelization In Practice
64
The parallelization illusion:

Parallel CPU time = (Sequential CPU time / Number of CPUs)
                    × Random(1, +∞)
                    × (1 - e^(- time spent programming))
                    × (1 - e^(- time spent debugging))
                    × (1 - e^(- number of headaches))

Each (1 - e^(-t)) factor converges to 1 as t grows, and the Random(1, +∞)
factor turns out to be not so random in practice.
Amdahl's Law
65
Given:
- A fraction α of the code can be parallelized
- P processors
The speedup S is bounded by:

S = 1 / ((1 - α) + α/P)

Source: http://en.wikipedia.org/wiki/Parallel_computing

"When a task cannot be partitioned because of sequential constraints, the
application of more effort has no effect on the schedule. The bearing of a
child takes nine months, no matter how many women are assigned."
Fred Brooks
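A quick numeric check of the bound (the 90% figure and processor counts are just examples, not taken from the slides):

```java
// Amdahl's law: speedup bound for a fraction alpha of parallelizable code
// running on p processors. The values used below are illustrative.
public class Amdahl {
    static double speedup(double alpha, int p) {
        return 1.0 / ((1.0 - alpha) + alpha / p);
    }

    public static void main(String[] args) {
        // Even with 90% parallelizable code, 8 CPUs give less than a 5x speedup...
        System.out.println(speedup(0.9, 8));         // ≈ 4.71
        // ...and no number of CPUs can beat 1 / (1 - alpha) = 10x.
        System.out.println(speedup(0.9, 1_000_000)); // ≈ 10.0
    }
}
```

This is the quantitative version of the Brooks quote: the sequential 10% caps the schedule no matter how much effort is added.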
Two Approaches
• Run a sequential algorithm in different threads
• E.g., different experiments, or runs of the same algorithm
• No synchronization issues
• Limited shared-resource issues
• Design a parallel algorithm
• Potentially a real speedup of the algorithm
• Increased complexity, harder to debug
66
Learnt From Experience
• Limit the number of shared resources
• Avoid the risk of concurrent modifications
• Use bullet-proof synchronization / locks / error checks
• Limit complex debugging
• Limit communication between threads
• Reduce waiting for other threads to exchange information
• Execute a significant number of operations in each thread
• Execution time ≫ thread creation overhead
67
68
/* HANDS ON */
Thread 1 Thread 2 Thread 3
A Simple Example
• Parallel Greedy Randomized Adaptive Search Procedure
69

Start → [Randomized Constructive Heuristic → Local Search] in each thread → End
(each thread independently builds a randomized solution and improves it with
local search)
[ParallelGRASP.java]
70
• Ask the system how many processors are available
• Create one GRASP instance per iteration
• The executor will be responsible for the creation of threads

[ParallelGRASP.java]
71
• The executor creates threads as needed, executes the GRASP
subprocesses, and returns the results
• Loop through pairs <GRASP subprocess, Best solution>
[ExampleParallelGRASP.java]
• Run it for different instances and compare with the sequential version
• What is the speedup?
• Are the solutions identical?
• Going further...
• Why do we create GRASP instances with a single iteration?
• What are the synchronization issues?
72
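The pattern described on the previous slides can be sketched as follows. This is not the ParallelGRASP.java distributed with the tutorial: the graspIteration() body is a placeholder standing in for the randomized constructive heuristic and the local search, with the solution abstracted to a cost.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of a parallel GRASP: one Callable per GRASP iteration, executed by a
// thread pool sized to the number of available processors.
public class ParallelGraspSketch {

    /** One GRASP iteration: randomized construction followed by local search. */
    static double graspIteration(long seed) {
        Random rnd = new Random(seed);
        double cost = 100 + rnd.nextDouble() * 50; // placeholder: randomized constructive heuristic
        return cost * 0.9;                         // placeholder: local search improvement
    }

    static double solve(int iterations) throws Exception {
        // Ask the system how many processors are available
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            // Create one task per GRASP iteration
            List<Callable<Double>> tasks = new ArrayList<>();
            for (int i = 0; i < iterations; i++) {
                final long seed = i;
                tasks.add(() -> graspIteration(seed));
            }
            // The executor runs the subprocesses and returns the results
            double best = Double.POSITIVE_INFINITY;
            for (Future<Double> f : executor.invokeAll(tasks)) {
                best = Math.min(best, f.get()); // keep the best solution found
            }
            return best;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(solve(100));
    }
}
```

Because each iteration is independent and only the final min-reduction touches shared state, there are no synchronization issues beyond collecting the Futures.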
Variable Neighborhood Search
73
• Similar to the VND
• Random exploration of each neighborhood
• Local search

start → N1 + LS → improvement found? (yes: restart from N1; no: continue)
→ N2 + LS → improvement found? (yes: restart from N1; no: continue)
→ ... → Nn + LS → improvement found? (yes: restart from N1; no: end)

(each Ni + LS pair is a NeighborhoodExplorer)
See [VNS.java]
[VNS.java]
74
• Define 4 string-exchange neighborhoods of increasing size
• Define an explorer for each neighborhood: includes one
neighborhood and a local search
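The VNS loop above can be sketched on a toy problem. Minimizing a one-dimensional function stands in for the routing objective, and the neighborhoods N_k (perturbations of size k) and class names are illustrative, not the string-exchange neighborhoods of VNS.java:

```java
import java.util.Random;

// Toy VNS: neighborhood N_k perturbs an integer solution by a step of size k
// ("shaking"); local search is greedy descent by steps of 1.
// The objective (x - 7)^2 stands in for a routing cost.
public class VnsSketch {
    static long f(long x) {
        return (x - 7) * (x - 7);
    }

    /** Greedy descent: move by ±1 while it improves the objective. */
    static long localSearch(long x) {
        while (true) {
            if (f(x + 1) < f(x)) x = x + 1;
            else if (f(x - 1) < f(x)) x = x - 1;
            else return x;
        }
    }

    static long vns(long start, int maxNeighborhood, Random rnd) {
        long best = localSearch(start);
        int k = 1;
        while (k <= maxNeighborhood) {
            // Random exploration of neighborhood N_k: shake, then local search
            long shaken = best + (rnd.nextBoolean() ? k : -k);
            long candidate = localSearch(shaken);
            if (f(candidate) < f(best)) {
                best = candidate;
                k = 1;     // improvement found: restart from N_1
            } else {
                k = k + 1; // no improvement: try the next neighborhood
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(vns(100, 4, new Random(0))); // converges to 7
    }
}
```

On a real VRP the solution would be a set of routes and each N_k a move neighborhood, but the control flow (shake, local search, restart on improvement) is exactly this loop.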
Parallel Variable Neighborhood Search
75
• Explore all neighborhoods in parallel
• Select best neighbor

start → [N1 + LS | N2 + LS | ... | Nn + LS] explored in parallel
→ select best neighbor → improvement found? (yes: restart; no: end)

(each Ni + LS pair is a NeighborhoodExplorer)
See [ParallelVNS.java]
[ParallelVNS.java]
• In the provided version, neighborhoods are explored in the
same thread
• Exercise: explore each neighborhood in a separate thread
• Hints:
• Use mExecutor
• See ParallelGRASP.java for reference
• Compare the speed-up for small and large instances
76
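One way the exercise above can be approached, using an executor as suggested by the hints. This is a sketch, not the tutorial's solution: the Explorer interface stands in for NeighborhoodExplorer, and the solution is simplified to a cost:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: submit one Callable per neighborhood explorer to the executor,
// wait for all of them, and keep the best neighbor found.
public class ParallelNeighborhoods {

    /** Stand-in for a NeighborhoodExplorer: returns the cost of its best neighbor. */
    public interface Explorer extends Callable<Double> {}

    public static double exploreAll(List<Explorer> explorers) throws Exception {
        ExecutorService executor =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            double best = Double.POSITIVE_INFINITY;
            // invokeAll blocks until every neighborhood has been explored
            for (Future<Double> f : executor.invokeAll(explorers)) {
                best = Math.min(best, f.get());
            }
            return best;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Explorer> explorers = new ArrayList<>();
        explorers.add(() -> 12.0); // N1 + LS
        explorers.add(() -> 9.5);  // N2 + LS
        explorers.add(() -> 11.0); // N3 + LS
        System.out.println(exploreAll(explorers)); // 9.5
    }
}
```

The speedup comparison asked for in the exercise follows from the trade-off on the previous slides: for small instances the thread management overhead dominates, for large instances the parallel exploration pays off.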
Parallel Algorithms Classification
• Classification along three dimensions (Crainic 2008)
• Search control cardinality
• 1-control / p-control
• Search control and communications
• Rigid / Knowledge synchronization
• Collegial / Knowledge collegial
• Search differentiation
• Same initial point / Multiple initial points
• Same search strategy / Different search strategies
• In which categories do ParallelGRASP and ParallelVNS fall?
77
Synchronous 1-Control
78
• A single control coordinates several threads
• The control starts new threads to run part of the optimization in
parallel ("Do your assignments")
• Once all threads are finished, the control gathers the information
("Show me your results") and proceeds with the optimization
Synchronous P-Control
79
• Several controls run in parallel, under a main control
• At fixed points of the optimization, some threads synchronize and
exchange information ("I found a new local optimum!", "I found a
new best solution!")
Asynchronous P-Control
80
• Several controls run in parallel, under a main control, around a
shared-information component
• At arbitrary points of the optimization, each thread exchanges
information with the centralized component ("I found a new best
solution!", "I'm stuck, give me the best solution found so far",
"I found a new local optimum!")
NOTES ON SOFTWARE
DEVELOPMENT
81
Software Development For Research
82
Specifications → Design → Implementation → Test → (Prototype, iterate)
→ Final product

• Specifications: What do I need now? What will I need in the future?
What may I need in the future?
• Design: How to implement what I need, will need, and may need?
How to ensure I will be able to reuse/extend my code?
• Implementation: How to do what I need now?
• Test: Is everything working as expected? Is something that worked
before now broken?
A Typical Design Problem
83
• Current need: two-opt local search for the VRP
• Data model
• How to represent an instance, customer, solution, route?
• Optimization algorithm
• How to represent
• A local search?
• A neighborhood?
• A move?
• How to check the feasibility of a move?
A First Design
84

Instance
  int[] customers
  double[] demands
  double[][] distances
  int fleetSize
  double vehicleCapacity

Solution
  int[]<> routes
  double[] loads

TwoOpt
  twoOpt(Instance, Solution) {
    for each move:
      check if feasible
      evaluate
    return best move
  }

What if I now want to solve the VRPTW?
What if I want an Or-Opt local search?
Some Design Tips
• Identify what is reusable
• For instance, logic common to all neighborhoods
• Separate responsibilities clearly
• An instance stores the data
• A solution stores a solution
• An objective function evaluates a solution and moves
• A constraint evaluates the feasibility of a solution or move
• Keep in mind possible extensions
• What other problems may I have to solve?
• Warning: avoid over-designing
85
Flexible And Extensible Designs
86

Neighborhood
  Constraint<> constraints
  localSearch(instance, solution, objective) {
    for each move in listAllMoves(instance, solution):
      for each constraint in constraints:
        constraint.check(move)
      objective.evaluate(instance, solution, move)
    return best feasible move
  }
  abstract listAllMoves(instance, solution)

TwoOpt
  listAllMoves(instance, solution) { ... }

OrOpt
  listAllMoves(instance, solution) { ... }
Flexible And Extensible Designs
87

Constraint
  abstract check(move)

Capacity
  check(move) { ... }

TimeWindow
  check(move) { ... }

MaxDuration
  check(move) { ... }
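In Java, the design above is a classic template-method hierarchy. A minimal sketch, with the Move reduced to a cost delta and a load delta for brevity; the field names and the Capacity rule are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the Neighborhood/Constraint design: abstract classes own the
// reusable logic, subclasses only supply what varies.
class Move {
    final double costDelta;
    final double loadDelta;
    Move(double costDelta, double loadDelta) {
        this.costDelta = costDelta;
        this.loadDelta = loadDelta;
    }
}

abstract class Constraint {
    abstract boolean check(Move move);
}

class Capacity extends Constraint {
    private final double residualCapacity;
    Capacity(double residualCapacity) { this.residualCapacity = residualCapacity; }
    @Override
    boolean check(Move move) {
        return move.loadDelta <= residualCapacity;
    }
}

abstract class Neighborhood {
    final List<Constraint> constraints = new ArrayList<>();

    /** Reusable local-search logic shared by all neighborhoods. */
    Move bestFeasibleMove() {
        Move best = null;
        for (Move move : listAllMoves()) {
            boolean feasible = true;
            for (Constraint c : constraints) {
                feasible = feasible && c.check(move);
            }
            if (feasible && (best == null || move.costDelta < best.costDelta)) {
                best = move;
            }
        }
        return best;
    }

    /** What varies: each neighborhood enumerates its own moves. */
    abstract List<Move> listAllMoves();
}
```

A TwoOpt or OrOpt subclass then only implements listAllMoves(), and adding a TimeWindow or MaxDuration constraint touches neither the neighborhoods nor the search loop.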
Designing Tools
88
• Create UML diagrams to model the organization of the code
• Generate code from a model
• Once the design is stable
• Generate the whole code skeleton in one click
• Generate a model from code (hazardous)
• Examples
• Visual Paradigm (free community edition)
• Enterprise Architect ($$$)
• Check with your university / computer science department
Implementation
• Document your code and use coherent conventions
• Explain the inputs, outputs, and main steps
• Saves a lot of time when you have to come back to it
• Make your code reusable and extensible
• Use the benefits of object-oriented programming
• Spend time now, save time tomorrow
• Build on top of existing libraries
• Avoid reinventing the wheel
89
Testing
• Create simple test cases that check key functionalities
• Unit test cases
• E.g., check that the methods to manipulate a solution are working
• More elaborate test cases
• E.g., the solution found by a 2-Opt neighborhood
• Profile your code to detect bottlenecks and memory leaks
90
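A unit test for solution manipulation can be a few assertions. A sketch with a hypothetical route representation (a list of customer ids); plain asserts are used so the example has no test-framework dependency:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of a unit test for a 2-opt style segment reversal on a route.
public class RouteTest {

    /** Reverse the segment route[i..j] in place, as a 2-opt move would. */
    static void reverseSegment(List<Integer> route, int i, int j) {
        Collections.reverse(route.subList(i, j + 1));
    }

    public static void main(String[] args) {
        List<Integer> route = new ArrayList<>(List.of(0, 1, 2, 3, 4, 0));
        reverseSegment(route, 1, 4);
        // The segment is reversed...
        assert route.equals(List.of(0, 4, 3, 2, 1, 0)) : "segment not reversed";
        // ...and no customer was lost or duplicated
        assert route.size() == 6 : "customers lost or duplicated";
        // Reversing again restores the original route
        reverseSegment(route, 1, 4);
        assert route.equals(List.of(0, 1, 2, 3, 4, 0)) : "reversal is not an involution";
        System.out.println("All route tests passed");
    }
}
```

Properties like "no customer lost" and "reversal is its own inverse" catch most off-by-one bugs in move implementations, and a test framework such as JUnit makes these checks repeatable.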
Development Process
91
• Define problem
• Relaxation (simplify the problem)
• Select approach
• Design & Implement
• Test & Debug
• Benchmark & Profile
• Restore relaxation
• Adjust parameters
• Publish paper!
(iterate on a prototype until you reach the final product)
LIBRARIES & FRAMEWORKS
92
Vehicle Routing
93
• VROOM (Java) - http://victorpillac.com/vroom/
• VROOM-Modelling
• Library to manipulate VRP instances
• VROOM-Heuristics
• Library of common (meta)heuristics
• CW, [Adaptive] VNS, [Parallel] [Adaptive] LNS, GRASPx[ILS,ELS]
• VROOM-Technicians
• Improved implementations for the TRSP
• VROOM-jMSA
• Event-driven multiple scenario approach for dynamic vehicle routing
Vehicle Routing
• VRPH (C++) - https://sites.google.com/site/vrphlibrary/
• Library of heuristics for the VRP
• CW, VNS
• Symphony-VRP - https://projects.coin-or.org/SYMPHONY
• Exact solver based on the Symphony and Concorde solvers
• CVRPSEP - Lysgaard (2004)
• Valid inequality generation
• Concorde - http://www.tsp.gatech.edu/concorde.html
• Exact solver for the TSP
94
Parallelization
• Java
• Since Java 5: java.util.concurrent framework
• Since Java 7: Fork/Join framework
• C++
• POSIX Threads (http://computing.llnl.gov/tutorials/pthreads)
• OpenMP (http://computing.llnl.gov/tutorials/openMP)
• Boost.Thread (http://www.boost.org/)
• Python
• Parallel Python (http://www.parallelpython.com/)
95
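As an illustration of the Java 7 Fork/Join framework listed above, a divide-and-conquer minimum over an array of route costs (the data and threshold are made up):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Fork/Join sketch: recursively split the array, solve the halves in
// parallel, combine with Math.min. Below a threshold, solve sequentially.
public class MinCostTask extends RecursiveTask<Double> {
    private static final int THRESHOLD = 4;
    private final double[] costs;
    private final int from, to; // half-open range [from, to)

    MinCostTask(double[] costs, int from, int to) {
        this.costs = costs;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Double compute() {
        if (to - from <= THRESHOLD) {
            double min = Double.POSITIVE_INFINITY;
            for (int i = from; i < to; i++) min = Math.min(min, costs[i]);
            return min;
        }
        int mid = (from + to) / 2;
        MinCostTask left = new MinCostTask(costs, from, mid);
        MinCostTask right = new MinCostTask(costs, mid, to);
        left.fork();                       // run the left half asynchronously
        double rightMin = right.compute(); // run the right half in this thread
        return Math.min(left.join(), rightMin);
    }

    public static void main(String[] args) {
        double[] costs = {42.0, 17.5, 99.0, 23.1, 8.4, 56.2, 71.0, 12.9, 33.3};
        double min = new ForkJoinPool().invoke(new MinCostTask(costs, 0, costs.length));
        System.out.println(min); // 8.4
    }
}
```

The THRESHOLD embodies the earlier advice: each task must do enough work to amortize the overhead of creating and scheduling it.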
Logging
• Java
• Log4J
• java.util.logging package
• C++
• Boost Log / Logging
• Log4cpp
• Python
• logging module
96
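For Java, a minimal java.util.logging setup looks like this; the logger name and messages are illustrative:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Minimal java.util.logging usage: one logger per class; levels control
// what is recorded without touching the call sites.
public class LoggingSketch {
    private static final Logger LOG = Logger.getLogger(LoggingSketch.class.getName());

    public static void main(String[] args) {
        LOG.setLevel(Level.FINE);              // accept debug-level records on this logger
        LOG.info("Starting VNS");              // milestone messages
        LOG.fine("Exploring neighborhood 2");  // detail, easy to switch off for benchmarks
        LOG.log(Level.WARNING, "Infeasible move skipped: {0}", "capacity");
    }
}
```

Note that handlers filter too: the default console handler only prints INFO and above, so FINE records also require a handler configured at that level.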
VRPRep Instance Repository
97
• VRPRep website: http://rhodes.ima.uco.fr/vrprep/web/home
• XML schema to describe most vehicle routing problems
• Easy to read for your program
• XML data binding: creates the objects for you
• Repository of existing instances
• Possibility to define your own problem
• Tool to generate a sample XML file
• Upload your instances
WRAP UP
98
Today We Have Seen ...
99
• Introduction
• What is complexity? Why is it important?
• Data structures
• How to represent a solution efficiently?
• Algorithmic tricks
• What are the main bottlenecks and how to avoid them?
• Parallelization
• How do parallel computing work? Why, when, and how to
parallelize?
• Software engineering
• How to design flexible and reusable code?
• Resources
• How to avoid reinventing the wheel?
Take Away
1. Developing efficient optimization algorithms requires careful
software engineering
✓ Complexity of the problems at hand
✓ Efficient data structures, algorithmic tricks, parallelization
2. Invest in developing flexible and extensible code
✓ Detailed design, documentation
✓ Will save you time later
3. Use existing libraries and share your code
✓ Do not reinvent the wheel
✓ Help others (good for your resumé too)
100
Discrete Optimization
101
• Pascal Van Hentenryck - http://www.coursera.org/course/optimization
• Online community of thousands of students
• Topics
• Dynamic programming
• Constraint programming
• Local search
• Linear programming
• Join the challenge to solve TSPs and VRPs!
2nd International Optimisation Summer School
• 12th to 17th January 2014, Kioloa, NSW, Australia
• http://www.cse.unsw.edu.au/~tw/school/
• Lectures
• Constraint programming, Integer programming, Column generation
• Modelling
• Uncertainty
• Vehicle routing, Scheduling, Supply networks
• Research skills
102
NICTA is dedicated to research excellence in ICT and wealth creation for
Australia. NICTA is Australia's pre-eminent national ICT research centre of
excellence and Australia's ICT PhD factory, connecting small business and
commercialising technology.

• 700 of the best ICT scientists and students across Brisbane, Sydney,
Canberra, and Melbourne; 17 partner universities, including UNSW, UoM,
USYD, Monash, and ANU
• 340 graduates and 260 enrolled students: 25% of ICT PhD students in
Australia; NICTA also works with schools to promote opportunities in ICT
• 11 spin-outs, 5 more in the pipeline, creating highly skilled jobs for
Australians: Audinate (a global leader in digital audio networking, used at
the London 2012 Olympics and the Queen's Jubilee Concert), Open Kernel
Labs (crash-proof code, "one of the world's top 10 technologies" according
to MIT; NICTA technology is in 1.5 billion mobile phones around the world),
Saluda Medical (revolutionising pain management through implants in the
spinal cord), Opturion (optimising freight pick-ups and deliveries across
Australia), and Yuruware (making Amazon's business more secure in the cloud)
• Increasing productivity and profit: fleet logistics helping to save 15% of
transport costs for Tip Top; Big Data analytics for the Australian finance
sector; reducing roadside maintenance costs by $60m using computer vision
for Sensis; cutting-edge services and applications for the NBN; helping
diagnose and treat prostate cancer for the Peter Mac Cancer Centre and
helping to build the bionic eye; 30% water cost savings for the Australian
dairy industry

For more information, contact us on INFO@NICTA.COM.AU
References
104
• Crainic, T. (2008). Parallel solution methods for vehicle routing problems. In The Vehicle Routing
Problem: Latest Advances and New Challenges, Operations Research/Computer Science Interfaces,
Vol. 43, pp. 171-198.
• Groër, C., Golden, B. & Wasil, E. (2010). A library of local search heuristics for the vehicle routing
problem. Mathematical Programming Computation, 2, 79-101.
• Irnich, S., Funke, B. & Grünert, T. (2006). Sequential search and its application to vehicle-routing
problems. Computers & Operations Research, 33(8), 2405-2429.
• Lysgaard, J. (2004). CVRPSEP: A package of separation routines for the capacitated vehicle routing
problem. Working Paper 03-04.
• Pillac, V., Guéret, C. & Medaglia, A. L. (2012). A parallel matheuristic for the Technician Routing and
Scheduling Problem. Optimization Letters, doi:10.1007/s11590-012-0567-4.
• Savelsbergh, M. (1992). The vehicle routing problem with time windows: minimizing route duration.
INFORMS Journal on Computing, 4(2), 146-154, doi:10.1287/ijoc.4.2.146.
• Toth, P. & Vigo, D. (2003). The Granular Tabu Search and its application to the vehicle-routing problem.
INFORMS Journal on Computing, 15, 333-346.
• Zachariadis, E. E. & Kiranoudis, C. T. (2010). A strategy for reducing the computational complexity of
local search-based methods for the vehicle routing problem. Computers & Operations Research, 37(12),
2089-2105.
Crash course on data streaming (with examples using Apache Flink)
 
On the Necessity and Inapplicability of Python
On the Necessity and Inapplicability of PythonOn the Necessity and Inapplicability of Python
On the Necessity and Inapplicability of Python
 
On the necessity and inapplicability of python
On the necessity and inapplicability of pythonOn the necessity and inapplicability of python
On the necessity and inapplicability of python
 
Combinatorial optimization and deep reinforcement learning
Combinatorial optimization and deep reinforcement learningCombinatorial optimization and deep reinforcement learning
Combinatorial optimization and deep reinforcement learning
 
Analysis of algorithms
Analysis of algorithmsAnalysis of algorithms
Analysis of algorithms
 
DSJ_Unit I & II.pdf
DSJ_Unit I & II.pdfDSJ_Unit I & II.pdf
DSJ_Unit I & II.pdf
 
Chapter One.pdf
Chapter One.pdfChapter One.pdf
Chapter One.pdf
 
Functional Programming and Composing Actors
Functional Programming and Composing ActorsFunctional Programming and Composing Actors
Functional Programming and Composing Actors
 
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
 
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
 
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
Apache SystemML Optimizer and Runtime techniques by Arvind Surve and Matthias...
 
Optimization in Programming languages
Optimization in Programming languagesOptimization in Programming languages
Optimization in Programming languages
 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging Environments
 
Functional Operations - Susan Potter
Functional Operations - Susan PotterFunctional Operations - Susan Potter
Functional Operations - Susan Potter
 
Online learning, Vowpal Wabbit and Hadoop
Online learning, Vowpal Wabbit and HadoopOnline learning, Vowpal Wabbit and Hadoop
Online learning, Vowpal Wabbit and Hadoop
 
OR Ndejje Univ (1).pptx
OR Ndejje Univ (1).pptxOR Ndejje Univ (1).pptx
OR Ndejje Univ (1).pptx
 
OR Ndejje Univ.pptx
OR Ndejje Univ.pptxOR Ndejje Univ.pptx
OR Ndejje Univ.pptx
 
Asymptotic Notations
Asymptotic NotationsAsymptotic Notations
Asymptotic Notations
 
Data structure and algorithm using java
Data structure and algorithm using javaData structure and algorithm using java
Data structure and algorithm using java
 

VRP2013 - Comp Aspects VRP

  • 1. Computational Aspects Of Vehicle Routing Victor Pillac May 20 VRP 2013, Angers, France 1
  • 2. Agenda 2 • Introduction • What is complexity? Why is it important? • Data structures • How to represent a solution efficiently? • Algorithmic tricks • What are the main bottlenecks and how to avoid them? • Parallelization • How does parallel computing work? Why, when, and how to parallelize? • Software engineering • How to design flexible and reusable code? • Resources • How to avoid reinventing the wheel?
  • 4. About Me • Finished my Ph.D. in 2012 at the Ecole des Mines de Nantes (France) and Universidad de Los Andes (Colombia) • Dynamic vehicle routing: solution methods and computational tools • Since Oct. 2012, researcher at NICTA (Melbourne, Australia) • Disaster management team • NICTA in a few numbers: • 700 staff, 260 PhDs • 7 research groups, 4 business teams • 550+ publications in 2012 4
  • 5. Assumptions • General knowledge of vehicle routing • General knowledge of common heuristics • Local search • Variable Neighborhood Search (VNS) • General knowledge of object-oriented programming • Examples are in Java 5
  • 6-10. Time Complexity 6 • Measure the worst-case number of operations • Expressed as a function of the size of the problem n
    For i = 1 to n
      a = 1 + i
      b = 2 * a
      c = a * b + 2
    Performs n*(1+1+2) = 4n operations; complexity is O(n)
    For S ⊆ {1..n}
      a = 1 + |S|
    Performs 2^n operations (one per subset); complexity is O(2^n)
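The contrast above can be sketched in Java by counting loop iterations; `ComplexityDemo` and its method names are invented for illustration:

```java
public class ComplexityDemo {
    // O(n): one pass over 1..n, constant work per iteration
    // (a, b, c mirror the pseudocode and are otherwise unused).
    public static long linearOps(int n) {
        long ops = 0;
        for (int i = 1; i <= n; i++) {
            int a = 1 + i;
            int b = 2 * a;
            int c = a * b + 2;
            ops++;                             // one constant-cost iteration
        }
        return ops;                            // = n
    }

    // O(2^n): one iteration per subset S of {1..n}, encoded as a bit mask.
    public static long subsetOps(int n) {
        long ops = 0;
        for (long mask = 0; mask < (1L << n); mask++) {
            int a = 1 + Long.bitCount(mask);   // |S| = number of set bits
            ops++;
        }
        return ops;                            // = 2^n
    }
}
```

Already at n = 20 the subset loop runs over a million times while the linear loop runs twenty times, which is the whole point of the table on the next slide.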
  • 11-15. Space Complexity 7 • Measure the worst-case memory usage • Expressed as a function of the size of the problem n
    For i = 1 to n
      a = 1 + i
      b = 2 * a
      c = a * b + 2
    Stores at most 4 integers simultaneously; complexity is O(1)
    For S ⊆ {1..n}
      a = 1 + |S|
    Stores at most n+1 integers simultaneously; complexity is O(n)
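The same two cases can be sketched in Java; `SpaceDemo` and its methods are invented names for illustration:

```java
public class SpaceDemo {
    // O(1) extra space: only a fixed number of scalars live at once,
    // regardless of n.
    public static int constantSpace(int n) {
        int c = 0;
        for (int i = 1; i <= n; i++) c = (1 + i) * 2;
        return c;
    }

    // O(n) extra space: materialises the subset S = {1..n} explicitly,
    // so n+1 cells are alive at the same time.
    public static boolean[] linearSpace(int n) {
        boolean[] inS = new boolean[n + 1];
        for (int i = 1; i <= n; i++) inS[i] = true;
        return inS;
    }
}
```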
  • 16. Complexity In Practice 8 • Computational time for a single floating point operation on a recent desktop processor:
    n:          10       100                1000     10,000   100,000   1,000,000
    n           0.1 ns   1 ns               10 ns    100 ns   1 µs      10 µs
    n.log(n)    0.1 ns   2 ns               30 ns    400 ns   5 µs      60 µs
    n^2         1 ns     100 ns             10 µs    1 ms     100 ms    10 s
    n^3         1 ns     10 µs              10 ms    1 s      2.7 h     115 d
    e^n         22 µs    8.5×10^24 years    -        -        -         -
  • 17. Complexity In Practice 9 • Memory required to store one single-precision floating point number per element:
    n:          10       100          1000     10,000    100,000   1,000,000
    n           320 b    3.2 kb       32 kb    320 kb    3.2 Mb    32 Mb
    n.log(n)    320 b    6.4 kb       96 kb    1.28 Mb   16 Mb     190 Mb
    n^2         3.2 kb   320 kb       32 Mb    3.2 Gb    320 Gb    32 Tb
    n^3         32 kb    32 Mb        32 Gb    32 Tb     32 Pb     32 Eb*
    e^n         700 kb   8×10^26 Eb
    (*The world's storage capacity is estimated to be 300 Eb, or 300 billion Gb)
  • 18-22. Local Search & Terminology 10 (Figure: an initial solution S0 and its neighborhood; a move leads from S0 to a neighbor; executing the move makes that neighbor the current solution S1, then S3, and so on.)
  • 24. Representing Routes 12 • Routes are at the core of solving vehicle routing problems • It is critical to have efficient data structures to store them • There is no single best data structure • Performance depends on how it is used • Tradeoff between simplicity and performance • Choice should be motivated by • Purpose: prototype vs. state-of-the-art algorithm • Usage: what are the most common operations?
  • 25. Dynamic Array List • Common operation complexity • Access to the customer by position: O(1) • Access to the position of a customer by id: O(n) • Iteration: O(1) • Insertion/deletion: O(n) • See [ArrayListRoute.java] 13 (Figure: array representation of the route 0 7 5 6 4 3 2 1)
  • 26. Doubly Linked List • Common operation complexity • Access to the customer by position: O(n) • Access to the position of a customer by id: O(n) • Iteration: O(1) • Insertion/deletion: O(1) • See [LinkedListRoute.java] 14 (Figure: linked representation of the route 0 7 5 6 4 3 2 1)
  • 27-28. Doubly Linked List V2 15 • Common operation complexity • Access to the customer by position: O(n) • Access to the position of a customer by id: O(1) • Iteration: O(1) • Insertion/deletion: O(1) • (Figure: predecessor and successor arrays indexed by node id, plus first/last pointers, for the route 0 7 5 6 4 3 2 1) • Implementation can be tricky, especially for repeated nodes (e.g., the depot) • Warning: the implementation in VroomModeling is full of bugs, officially "incomplete"
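A minimal sketch of this array-backed doubly linked route, assuming unique node ids 0..n-1 (so it deliberately sidesteps the repeated-node case the warning mentions); `ArrayLinkedRoute` is an invented name, not the VroomModeling code:

```java
public class ArrayLinkedRoute {
    private final int[] pred, succ;   // indexed by node id; -1 = none
    private int first = -1, last = -1;

    public ArrayLinkedRoute(int n) {
        pred = new int[n];
        succ = new int[n];
        java.util.Arrays.fill(pred, -1);
        java.util.Arrays.fill(succ, -1);
    }

    /** Append node v at the end of the route: O(1). */
    public void append(int v) {
        if (first == -1) first = v;
        else { succ[last] = v; pred[v] = last; }
        last = v;
    }

    /** Insert node v right after node u: O(1) thanks to the id index. */
    public void insertAfter(int u, int v) {
        int w = succ[u];
        succ[u] = v; pred[v] = u;
        succ[v] = w;
        if (w != -1) pred[w] = v; else last = v;
    }

    /** Remove node v: O(1). */
    public void remove(int v) {
        int p = pred[v], s = succ[v];
        if (p != -1) succ[p] = s; else first = s;
        if (s != -1) pred[s] = p; else last = p;
        pred[v] = succ[v] = -1;
    }

    /** Walk the links in O(n) to expose the visiting sequence. */
    public String sequence() {
        StringBuilder sb = new StringBuilder();
        for (int v = first; v != -1; v = succ[v]) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(v);
        }
        return sb.toString();
    }
}
```

Position-by-index still costs O(n) (you must walk the links), which is the tradeoff the complexity list above states.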
  • 31. 18 Naming Conventions: m: prefix for instance fields (e.g., mMyField) s: prefix for static fields (e.g., sMyStaticField) I: prefix for interface names (e.g., IMyInterface) Base: suffix for abstract types (e.g., MyTypeBase) Logging: Uses Log4J, see VRPLogging.java
  • 40-46. algorithms package 23 • IVRPOptimizationAlgorithm: common interface • CW: Clarke and Wright heuristic to generate routes • VND/VNS: explore a number of neighborhoods • GRASP: start with a solution from CW and apply VND • ParallelGRASP: parallel implementation of GRASP • Heuristic Concentration: takes a set of routes and builds a solution • examples package: each class contains a main method that we will use to run the examples • util package: classes to make our life easier
  • 48-49. [ExampleRoutesAtomic.java] • Compares ArrayListRoute and LinkedListRoute • Append a node • Get a node at a random position • Remove the first node 25
    ArrayList: Append: 123.1 ms; GetNodeAt: 18.7 ms; RemoveFirst: 134.2 ms
    LinkedList: Append: 129.4 ms; GetNodeAt: 66.6 ms; RemoveFirst: 110.6 ms
  • 50-51. [CW.java] • Clarke and Wright constructive heuristic 26 • Initialization: create one route per node • Each step: merge the two routes that yield the greatest saving • Repeat until no feasible merge remains • Implemented in VroomHeuristics, package vroom.common.heuristics.cw
  • 55-59. Variable Neighborhood Descent • Explore different neighborhoods sequentially • The final solution is a local optimum for all neighborhoods 28 (Flowchart: start with N1; whenever an improvement is found, return to N1; when Nk yields no improvement, move on to Nk+1; stop after Nn fails to improve.)
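The flowchart above can be sketched as a generic loop; `VndSketch` and the encoding of a neighborhood as a `UnaryOperator` (returning an improved solution, or null when the neighborhood is exhausted) are simplifications, not the interfaces of VND.java:

```java
import java.util.List;
import java.util.function.UnaryOperator;

public class VndSketch {
    /** Apply neighborhoods in order; restart from N1 after any improvement. */
    public static <S> S vnd(S solution, List<UnaryOperator<S>> neighborhoods) {
        int k = 0;
        while (k < neighborhoods.size()) {
            S improved = neighborhoods.get(k).apply(solution);
            if (improved != null) {   // improvement found in Nk
                solution = improved;
                k = 0;                // go back to the first neighborhood
            } else {
                k++;                  // Nk exhausted, try Nk+1
            }
        }
        return solution;              // local optimum for all neighborhoods
    }
}
```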
  • 62-63. [VND.java] 29 • The constraints are defined separately from the neighborhoods; each constraint is responsible for checking whether a move is feasible (ignore for now) • Instantiate the neighborhoods that will be used later
  • 65-66. [VND.java] 30 • Performs a local search in the neighborhood of the current solution • The parameters control how the search is performed, in this case deterministic and best improvement
  • 67-69. [ExampleVND.java] • Run the main method • Is the ordering of neighborhoods in VND.java logical? How could it be improved? • Is the localSearch implementation in VND.java coherent with the definition of VND? 31
  • 70-71. [ExampleRoutesOptim.java] • Compares ArrayListRoute and LinkedListRoute • Constructive heuristic (CW) • Variable Neighborhood Descent optimization (VND) 32
    ArrayList: CW 617.1 ms; VND 67,576.7 ms
    LinkedList: CW 414.4 ms; VND 86,443.3 ms
  • 72. Store Routes 33 • Store routes for future use • Requirements • Memory-efficient • Avoid repeated routes • Store a minimalistic route representation • Low computation overhead • Two approaches • Exhaustive list • Issue: repeated routes • Hash based set
  • 73. Hash Functions • Compress the information stored in a route • Desired characteristics • Determinism • Uniformity • Issues • Two different routes can have the same hash (hash collision) • Computational cost of hash evaluation 34
  • 75. Sequence Dependent Hash 36 • See Groer et al. 2010 - [GroerSolutionHasher.java] • Produces a 32-bit integer that depends on the set and sequence of nodes in the route • Example: 010 XOR 111 = 101, i.e., (2) XOR (7) = (5)
    Input:
      - rnd: an array of n random integers
      - route: a route
    Output: a hash value for route
    1. if route.first > route.last then route ← reverse ordering of route
    2. hash ← 0
    3. for each edge (i,j) in route: hash ← hash XOR rnd[(i+j) % n]
    4. return hash
  • 76. Sequence Independent Hash 37 • See Pillac et al. 2012 - [NodeSetSolutionHasher.java] • Produces a 32-bit integer that depends on the set of nodes visited by the route • Advantage: implicit filtering of duplicated routes
    Input:
      - rnd: an array of n random integers
      - route: a route
    Output: a hash value for route
    1. hash ← 0
    2. for each node i in route: hash ← hash XOR rnd[i % n]
    3. return hash
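A direct Java transcription of the two pseudocode hashers, assuming a route is an int array of node ids; `RouteHashers` is an invented name, not the actual GroerSolutionHasher/NodeSetSolutionHasher classes:

```java
public class RouteHashers {
    /** Sequence-dependent hash (after Groer et al. 2010). */
    public static int sequenceHash(int[] rnd, int[] route) {
        int n = rnd.length;
        // Canonical orientation so a route and its reverse hash identically.
        if (route[0] > route[route.length - 1]) route = reversed(route);
        int hash = 0;
        for (int k = 0; k + 1 < route.length; k++) {
            int i = route[k], j = route[k + 1];
            hash ^= rnd[(i + j) % n];        // one XOR term per edge (i,j)
        }
        return hash;
    }

    /** Sequence-independent hash (after Pillac et al. 2012). */
    public static int nodeSetHash(int[] rnd, int[] route) {
        int n = rnd.length;
        int hash = 0;
        for (int i : route) hash ^= rnd[i % n];  // one XOR term per node
        return hash;
    }

    private static int[] reversed(int[] a) {
        int[] r = new int[a.length];
        for (int k = 0; k < a.length; k++) r[k] = a[a.length - 1 - k];
        return r;
    }
}
```

Since XOR is commutative, the sequence-independent hash is invariant under any reordering of the visits, which is exactly what makes it filter duplicated routes implicitly.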
  • 77. Example • Greedy Randomized Adaptive Search Procedure 38 (Flowchart: start → randomized constructive heuristic → local search → end)
  • 78. [GRASP.java] • Clarke and Wright construction heuristic • Variable Neighborhood Descent optimization 39
  • 81. [ExampleGRASP.java] • Runs the GRASP procedure on a single instance 40
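The construct-then-improve loop can be sketched as follows; `GraspSketch` and its functional parameters are simplified stand-ins for the CW and VND components wired together in GRASP.java:

```java
import java.util.Random;
import java.util.function.Function;
import java.util.function.ToDoubleFunction;
import java.util.function.UnaryOperator;

public class GraspSketch {
    /** Repeat (randomized construction + local search); keep the best. */
    public static <S> S grasp(int iterations,
                              Function<Random, S> randomizedConstruction,
                              UnaryOperator<S> localSearch,
                              ToDoubleFunction<S> cost,
                              long seed) {
        Random rnd = new Random(seed);   // seeded for reproducibility
        S best = null;
        for (int it = 0; it < iterations; it++) {
            S s = localSearch.apply(randomizedConstruction.apply(rnd));
            if (best == null || cost.applyAsDouble(s) < cost.applyAsDouble(best)) {
                best = s;                // new incumbent
            }
        }
        return best;
    }
}
```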
  • 85-90. Heuristic Concentration 42 • Set covering model:
    min  Σ_{p ∈ Ω} c_p x_p
    s.t. Σ_{p ∈ Ω} a_p^i x_p ≥ 1   ∀ i ∈ N
         x_p ∈ {0, 1}              ∀ p ∈ Ω
    where Ω is the set of routes, N the set of nodes, c_p the cost of route p, x_p = 1 if route p is selected, and a_p^i = 1 if route p visits node i.
  • 91. • Adapt the GRASP procedure to collect routes • Add the following fragment where needed • Hint: we want to collect as many routes as possible • Experiment with different route pools • What is the impact on the number of routes and HC time? [ExampleGRASPHC.java] 43
  • 94. Bottlenecks In Heuristics For VRP 46 • Size of the neighborhood • Some areas of the neighborhood are not interesting • Only minor changes are made to the solution at each move • How different is the new neighborhood? • How to avoid restarting from scratch? • Move evaluation • Cost & feasibility • Performed millions of times • Which is most costly? Which should be done first?
  • 95-97. Granular Neighborhoods 47 • Reduce the size of the neighborhoods • See Toth and Vigo (2003) • Costly (long) arcs are less likely to be in good solutions • Filter out moves that involve only costly arcs • Costly-arc threshold: ϑ = β · z0 / (n + K0), where z0 is the cost of a heuristic solution, n the number of nodes, K0 the number of vehicles, and β a sparsification parameter (e.g., β = 2.5) • (Figure: inserting 5 between 3 and 4 involves 2 costly arcs; inserting 5 between 1 and 2 involves 1 costly arc)
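The threshold and the arc filter can be written down directly; `GranularFilter` is an invented helper, and the choice of ≤ versus < at the threshold is a detail the slide leaves open:

```java
public class GranularFilter {
    /** Costly-arc threshold theta = beta * z0 / (n + K0). */
    public static double threshold(double beta, double z0, int n, int k0) {
        return beta * z0 / (n + k0);
    }

    /** true iff arc (i,j) survives sparsification, i.e., is cheap enough. */
    public static boolean keepArc(double[][] cost, int i, int j, double theta) {
        return cost[i][j] <= theta;
    }
}
```

A granular local search would then skip any candidate move all of whose new arcs fail `keepArc`.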
  • 98-100. Static Move Descriptor (SMD) • Store information between moves • See Zachariadis and Kiranoudis (2010) • Precompute and maintain all moves • Example with relocate (relocation of a single node): the SMD table stores, for each pair (n1, n2), the cost of relocating n1 after n2 48 (Figure: on the route 0 7 5 6 4 3 2 1, relocating 4 after 3 costs c(3,4)+c(0,5)-c(5,4)-c(3,0); relocating 1 after 5 costs c(0,5)+c(1,4)-c(0,1)-c(5,4))
  • 101-104. SMD Update • One static SMD table is created per neighborhood • Static update rules are predefined to know which SMDs need to be updated after a move was executed 49 (Figure: after a move, only the table entries involving the modified part of the solution are recomputed)
  • 105. Selecting The Best Neighbor • All SMDs are stored in a Fibonacci heap • O(1) access to the lowest-cost SMD • O(1) insertion • O(log n) amortized deletion • How to find the best feasible neighbor? • Pop the lowest-cost SMD until a feasible move is found 50 Source: http://en.wikipedia.org/wiki/Fibonacci_heap
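The JDK has no Fibonacci heap, so this sketch uses a binary-heap `PriorityQueue` (O(log n) poll rather than O(1) find-min) just to illustrate the pop-until-feasible selection; `Move` and `BestMoveSelector` are invented names:

```java
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.function.Predicate;

public class BestMoveSelector {
    /** A candidate move: relocate `node` after `after`, changing cost by `delta`. */
    public static class Move {
        public final int node, after;
        public final double delta;
        public Move(int node, int after, double delta) {
            this.node = node; this.after = after; this.delta = delta;
        }
    }

    /** Min-heap ordered by move cost (binary heap standing in for a Fibonacci heap). */
    public static PriorityQueue<Move> newHeap() {
        return new PriorityQueue<>(Comparator.comparingDouble((Move m) -> m.delta));
    }

    /** Pop lowest-delta moves until one passes the feasibility check. */
    public static Move bestFeasible(PriorityQueue<Move> heap, Predicate<Move> feasible) {
        while (!heap.isEmpty()) {
            Move m = heap.poll();
            if (feasible.test(m)) return m;   // cheapest feasible move
        }
        return null;                          // no feasible move left
    }
}
```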
  • 106. SMD In Practice 51 (Figure: CPU time for 50,000 iterations vs. problem size n, classic representation vs. SMD representation; the SMD representation is markedly faster as n grows. Comparison of computational times, source: Zachariadis and Kiranoudis (2010))
  • 107-110. Sequential Search • Explore neighborhoods in a smart way • See Irnich et al. (2006) • Decompose moves into partial moves • Example with swap 52 (Figure: a swap move split into its partial moves)
  • 111. Sequential Search In Practice • Neighborhoods are explored by considering partial moves • Exploration is pruned using bounds on the partial move cost 53 (Figure: acceleration factor of sequential search over lexicographic search for the Or-Opt, String-Exchange, 2-Opt*, Swap, Relocation, 2-Opt, and 3-Opt* neighborhoods vs. instance size n; f = average number of customers in a route. Source: Irnich et al. (2006))
  • 112. Store Cumulative Information • Reduce the complexity of move evaluation • Store and maintain useful information • For example: waiting time / forward slack time • See Savelsbergh (1992) • Constant-time time-window feasibility check • More details in Module 2 54
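One simple instance of the idea, under the assumption of fixed leg distances: prefix sums over a route make any segment distance an O(1) subtraction. `CumulativeRoute` is an invented illustration, not Savelsbergh's forward-slack machinery:

```java
public class CumulativeRoute {
    // cumDist[k] = distance travelled from the start to the k-th stop
    private final double[] cumDist;

    public CumulativeRoute(int[] route, double[][] dist) {
        cumDist = new double[route.length];
        for (int k = 1; k < route.length; k++) {
            cumDist[k] = cumDist[k - 1] + dist[route[k - 1]][route[k]];
        }
    }

    /** Distance travelled between stop positions a and b (a <= b): O(1). */
    public double segmentDistance(int a, int b) {
        return cumDist[b] - cumDist[a];
    }
}
```

The price is maintenance: after a move changes the route, the affected prefix sums must be recomputed, which is exactly the information-maintenance tradeoff the slide is about.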
  • 118-119. Promises Of Parallelization 58 • Overcome the stalling of CPU performance increase • Increased availability of parallel computing • Personal computers with multiple CPUs/cores • Most universities have access to large grids • On-demand cloud services (e.g., Amazon) • The parallelization illusion: Parallel CPU time = Sequential CPU time / Number of CPUs
  • 125. Architecture Overview 59 (Figure: memory hierarchy; each core has its own very fast L1 cache, cores share a fast L2 cache (~Mb), RAM (~Gb) is slow, and the HDD (~Tb) is extremely slow; threads run on the cores)
  • 126-131. CPU Concepts And Limitations 60
    My Program:
      executeMySequentialCode()
      thread1 = new Thread(do A, do B)
      thread1.start()
      thread2 = new Thread(do C)
      thread2.start()
    Operating System: creates each thread (which takes time), assigns it to a CPU core (here cores #4 and #1), and executes its instructions (A, B and C).
    Limitations: thread creation takes time; limited control over the actual execution sequence; increased memory usage; concurrent access to shared resources.
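In Java the pattern above is usually expressed with a thread pool rather than raw `Thread` objects, amortizing the thread-creation cost the slide warns about; `ParallelTasks` and its task split are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelTasks {
    /** Sum an array by splitting it into chunks, one task per worker. */
    public static int sumInParallel(int[] data, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> parts = new ArrayList<>();
            int chunk = (data.length + workers - 1) / workers;
            for (int w = 0; w < workers; w++) {
                final int from = w * chunk;
                final int to = Math.min(data.length, from + chunk);
                parts.add(pool.submit(() -> {
                    int s = 0;                      // partial sum for this worker
                    for (int i = from; i < to; i++) s += data[i];
                    return s;
                }));
            }
            int total = 0;
            for (Future<Integer> f : parts) total += f.get();  // join results
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The OS still decides which core runs which task and in what order; only the join on the futures makes the result deterministic.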
  • 132. Sharing Is Caring 61 Thread 1: (1) y = object.getX() (2) y = y + 1 (3) object.setX(y) — Thread 2: (1) z = object.getX() (2) z = z + 2 (3) object.setX(z) — Starting from x = 1, both threads may read x = 1 before either writes back; thread 1 then sets x = 2 and thread 2 overwrites it with x = 3, so the expected result x = 1 + 1 + 2 = 4 is lost (x = 3) ✗
  • 140. Sharing With Care 62 Thread 1: lock(object); y = object.getX(); y = y + 1; object.setX(y); release(object) — Thread 2: lock(object), waits until thread 1 releases, then z = object.getX(); z = z + 2; object.setX(z); release(object) — The lock serializes the two updates: starting from x = 1, thread 1 sets x = 2, thread 2 then reads 2 and sets x = 4, so x = 1 + 1 + 2 = 4 ✓
  • 147. The Limits Of Sharing 63 • Lock/release mechanisms force threads to wait • In the worst case the execution becomes sequential • In general • Lock an object while it may be modified • Do not lock for read-only operations • Check for inconsistencies at runtime
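The lock/release pattern on these slides corresponds to Java's intrinsic locks (synchronized). A minimal sketch, with an illustrative SharedObject class, showing that the final value is now deterministic whatever the interleaving:

```java
public class SafeSharing {
    // Every access to x goes through synchronized methods, so the
    // read-modify-write sequences of the two threads cannot interleave.
    static class SharedObject {
        private int x = 1;
        synchronized void add(int delta) { x = x + delta; } // get, modify, set under one lock
        synchronized int getX() { return x; }
    }

    static int runThreads() throws InterruptedException {
        SharedObject object = new SharedObject();
        Thread t1 = new Thread(() -> object.add(1)); // y = x + 1
        Thread t2 = new Thread(() -> object.add(2)); // z = x + 2
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return object.getX(); // always 1 + 1 + 2 = 4
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("x = " + runThreads());
    }
}
```

The key design point is that the whole read-modify-write sequence is inside one synchronized method; locking only the getter and setter separately would not prevent the lost update.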
  • 148. Parallelization In Practice 64 The parallelization illusion: Parallel CPU time = Sequential CPU time / Number of CPUs — In practice: Parallel CPU time = (Sequential CPU time / Number of CPUs) × Random(1, +∞) × (1 − e^(−time spent programming)) × (1 − e^(−time spent debugging)) × (1 − e^(−number of headaches)) — The Random(1, +∞) factor is not so random, and each correction factor converges to 1 as the corresponding effort grows
  • 157. Amdahl’s Law 65 Given that a fraction α of the code can be parallelized and P processors are available, the speedup S is bounded by S = 1 / ((1 − α) + α/P) — Source: http://en.wikipedia.org/wiki/Parallel_computing (graph of speedup S versus number of processors P for several values of α) — “When a task cannot be partitioned because of sequential constraints, the application of more effort has no effect on the schedule. The bearing of a child takes nine months, no matter how many women are assigned.” Fred Brooks
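The bound is easy to evaluate numerically; a small sketch (the two printed scenarios are illustrative):

```java
public class Amdahl {
    // Speedup bound S = 1 / ((1 - alpha) + alpha / p)
    // alpha: fraction of the code that can be parallelized; p: number of processors.
    static double speedup(double alpha, int p) {
        return 1.0 / ((1.0 - alpha) + alpha / p);
    }

    public static void main(String[] args) {
        // Even with 90% parallel code, 4 CPUs give barely a 3x speedup,
        // and no number of CPUs can beat the limit 1 / (1 - alpha) = 10x.
        System.out.printf("alpha=0.9, P=4    -> S = %.2f%n", speedup(0.9, 4));
        System.out.printf("alpha=0.9, P=1000 -> S = %.2f%n", speedup(0.9, 1000));
    }
}
```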
  • 160. Two Approaches • Run a sequential algorithm in different threads • E.g., different experiments, or runs of the same algorithm • No synchronization issues • Limited shared-resource issues • Design a parallel algorithm • Potentially a real speedup of the algorithm • Increased complexity, harder to debug 66
  • 161. Learnt From Experience • Limit the number of shared resources • Avoid the risk of concurrent modifications • Use bulletproof synchronization / locks / error checks • Limit complex debugging • Limit communication between threads • Reduce waiting for other threads to exchange information • Execute a significant number of operations in each thread • Execution time ≫ thread creation overhead 67
  • 163. A Simple Example • Parallel Greedy Randomized Adaptive Search Procedure 69 • Each of threads 1, 2, 3 runs: Start → Randomized Constructive Heuristic → Local Search → End
  • 165. [ParallelGRASP.java] 70 Ask the system how many processors are available; create one GRASP instance per iteration; the executor is responsible for the creation of threads
  • 169. [ParallelGRASP.java] 71 The executor creates threads as needed, executes the GRASP subprocesses, and returns the results; loop through pairs <GRASP subprocess, Best solution>
  • 171. [ExampleParallelGRASP.java] • Run it on different instances and compare with the sequential version • What is the speedup? • Are the solutions identical? • Going further ... • Why do we create GRASP instances with a single iteration? • What are the synchronization issues? 72
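[ParallelGRASP.java] itself is not reproduced in this transcript, but the pattern it describes can be sketched with java.util.concurrent. The GraspIteration stub below is illustrative — a real iteration would build a randomized solution and improve it with a local search:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelGraspSketch {
    // One GRASP iteration: randomized construction + local search,
    // stubbed here by a deterministic "objective value" so the sketch runs.
    static class GraspIteration implements Callable<Double> {
        private final long seed;
        GraspIteration(long seed) { this.seed = seed; }
        @Override public Double call() {
            return 100.0 + (seed * 31 % 17); // placeholder objective value
        }
    }

    static double run(int iterations) throws Exception {
        // Ask the system how many processors are available
        int cpus = Runtime.getRuntime().availableProcessors();
        ExecutorService executor = Executors.newFixedThreadPool(cpus);
        List<GraspIteration> tasks = new ArrayList<>();
        for (int i = 0; i < iterations; i++)
            tasks.add(new GraspIteration(i)); // one GRASP instance per iteration
        double best = Double.POSITIVE_INFINITY;
        // invokeAll blocks until every iteration has finished
        for (Future<Double> f : executor.invokeAll(tasks))
            best = Math.min(best, f.get()); // keep the best solution found
        executor.shutdown();
        return best;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("best = " + run(20));
    }
}
```

Creating one single-iteration task per GRASP iteration lets the executor balance the work across cores; since iterations share nothing, there are no synchronization issues beyond collecting the results.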
  • 172. Variable Neighborhood Search • Similar to the VND • Random exploration of each neighborhood • Local search 73 • Neighborhoods N1, N2, ..., Nn are visited in turn, each followed by a local search; if an improvement is found the search restarts from N1, otherwise it moves to the next neighborhood until Nn • See [VNS.java] (NeighborhoodExplorer)
  • 175. [VNS.java] 74 Define 4 string-exchange neighborhoods of increasing size; define an explorer for each neighborhood: it includes one neighborhood and a local search
  • 176. Parallel Variable Neighborhood Search • Explore all neighborhoods N1, N2, ..., Nn in parallel • Select the best neighbor • Apply a local search and repeat while an improvement is found 75 • See [ParallelVNS.java] (NeighborhoodExplorer)
  • 179. [ParallelVNS.java] • In the provided version, neighborhoods are explored in the same thread • Exercise: explore each neighborhood in a separate thread • Hints: • Use mExecutor • See ParallelGRASP.java for reference • Compare the speed-up for small and large instances 76
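One way to attempt the exercise is to submit every neighborhood explorer to the executor at once and pick the best result; a sketch under stated assumptions (the NeighborhoodExplorer stub and the dummy costs are illustrative, not the code from [ParallelVNS.java]):

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelVnsSketch {
    // Stub for a NeighborhoodExplorer: explores one neighborhood (random
    // exploration + local search) and returns the best neighbor's cost.
    interface NeighborhoodExplorer extends Callable<Double> {}

    // Explore all neighborhoods in parallel and select the best neighbor.
    static double bestNeighbor(List<NeighborhoodExplorer> explorers,
                               ExecutorService executor) throws Exception {
        double best = Double.POSITIVE_INFINITY;
        // invokeAll submits every explorer and blocks until all are done
        for (Future<Double> f : executor.invokeAll(explorers))
            best = Math.min(best, f.get());
        return best;
    }

    static double demo() throws Exception {
        ExecutorService executor =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        // Four dummy explorers standing in for the string-exchange neighborhoods
        List<NeighborhoodExplorer> explorers =
            List.of(() -> 42.0, () -> 40.5, () -> 41.0, () -> 43.2);
        double best = bestNeighbor(explorers, executor);
        executor.shutdown();
        return best;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("best neighbor = " + demo());
    }
}
```

Because each explorer works on its own copy of the neighborhood, the only synchronization point is the barrier implied by invokeAll; the speed-up therefore depends on the neighborhoods being large enough to outweigh the thread overhead, as the exercise suggests comparing on small and large instances.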
  • 180. Parallel Algorithms Classification • Classification along three dimensions (Crainic 2008) • Search control cardinality • 1-control / p-control • Search control and communications • Rigid / Knowledge synchronization • Collegial / Knowledge collegial • Search differentiation • Same initial point / Multiple initial points • Same search strategy / Different search strategies • In which category do ParallelGRASP and ParallelVNS fall? 77
  • 182. Synchronous 1-Control 78 The control starts new threads (1-4) to run parts of the optimization in parallel (“Do your assignments”); once all threads are finished, the control gathers the information (“Show me your results”) and proceeds with the optimization
  • 184. Synchronous P-Control 79 Each thread (1-4) runs its own control; at fixed points of the optimization, some threads synchronize and exchange information with each other or the main control (“I found a new local optimum!”, “I found a new best solution!”)
  • 188. Asynchronous P-Control 80 Each thread (1-4) runs its own control; at arbitrary points of the optimization, each thread exchanges information with a centralized shared-information component (“I found a new best solution!”, “I’m stuck, give me the best solution found so far”, “I found a new local optimum!”)
  • 191. Software Development For Research 82 Specifications → Design → Implementation → Test → Prototype → Final product • Specifications: What do I need now? What will I need in the future? What may I need in the future? • Design: How to implement what I need, will need, and may need? How to ensure I will be able to reuse/extend my code? • Implementation: How to do what I need now? • Test: Is everything working as expected? Is something that worked before now broken?
  • 196. A Typical Design Problem 83 • Current need: two-opt local search for the VRP • Data model • How to represent an instance, customer, solution, route? • Optimization algorithm • How to represent • A local search? • A neighborhood? • A move? • How to check the feasibility of a move?
  • 197. A First Design 84 Instance: int[] customers, double[] demands, double[][] distances, int fleetSize, double vehicleCapacity — Solution: int[]<> routes, double[] loads — TwoOpt: twoOpt(Instance, Solution) { for each move: check if feasible, evaluate; return best move } — What if I now want to solve the VRPTW? What if I want an Or-Opt local search?
  • 200. Some Design Tips • Identify what is reusable • For instance, logic common to all neighborhoods • Separate clearly responsibilities • An instance stores the data • A solution stores a solution • An objective function evaluates a solution and moves • A constraint evaluates the feasibility of a solution or move • Keep in mind possible extensions • What other problems may I have to solve? • Warning: avoid over-designing 85
  • 201. Flexible And Extensible Designs 86 Neighborhood: Constraint<> constraints; localSearch(instance, solution, objective) { for each move in listAllMoves(instance, solution): for each constraint in constraints: constraint.check(move); objective.evaluate(instance, solution, move); return best feasible move }; abstract listAllMoves(instance, solution) — subclasses TwoOpt and OrOpt each implement listAllMoves(instance, solution)
  • 203. Flexible And Extensible Designs 87 Constraint: abstract check(move) — implemented by the Capacity, TimeWindow, and MaxDuration subclasses
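The abstract-class pattern from these slides can be sketched in Java as follows; the Move representation and the capacity check are deliberately simplified placeholders, not the course's actual data model:

```java
import java.util.List;

public class ConstraintSketch {
    // A move is reduced here to the extra demand it would add to a route.
    static class Move {
        final double addedDemand;
        Move(double addedDemand) { this.addedDemand = addedDemand; }
    }

    // Each constraint only knows how to check one kind of feasibility.
    static abstract class Constraint {
        abstract boolean check(Move move, double currentLoad);
    }

    static class Capacity extends Constraint {
        private final double vehicleCapacity;
        Capacity(double vehicleCapacity) { this.vehicleCapacity = vehicleCapacity; }
        @Override boolean check(Move move, double currentLoad) {
            return currentLoad + move.addedDemand <= vehicleCapacity;
        }
    }

    // A move is accepted only if every constraint holds — new constraints
    // (time windows, maximum duration, ...) plug in without touching this logic.
    static boolean feasible(Move move, double load,
                            List<? extends Constraint> constraints) {
        for (Constraint c : constraints)
            if (!c.check(move, load)) return false;
        return true;
    }

    public static void main(String[] args) {
        List<Constraint> constraints = List.of(new Capacity(100.0));
        System.out.println(feasible(new Move(30.0), 60.0, constraints)); // fits
        System.out.println(feasible(new Move(50.0), 60.0, constraints)); // exceeds capacity
    }
}
```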
  • 205. Designing Tools 88 • Create UML diagrams to model the organization of the code • Generate code from a model • Once the design is stable • Generate all the code skeleton in one click • Generate a model from code (hazardous) • Examples • Visual Paradigm (free community edition) • Enterprise Architect ($$$) • Check with your university / computer science department
  • 206. Implementation • Document your code and use coherent conventions • Explain what the inputs, outputs, and main steps are • Saves a lot of time when you have to come back to it • Make your code reusable and extensible • Use the benefits of object-oriented programming • Spend time now, save time tomorrow • Build on top of existing libraries • Avoid reinventing the wheel 89
  • 207. Testing • Create simple test cases that check key functionalities • Unit test cases • E.g., check that the methods to manipulate a solution are working • More elaborate test cases • E.g., the solution found by a 2-Opt neighborhood • Profile your code to detect bottlenecks and memory leaks 90
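A unit test in this spirit can be as simple as a main with assertions; routeLength below is an illustrative helper, not code from the course, and a real project would typically use a framework such as JUnit:

```java
public class RouteTest {
    // Illustrative helper: total length of a route over a distance matrix.
    static double routeLength(int[] route, double[][] d) {
        double length = 0;
        for (int i = 0; i < route.length - 1; i++)
            length += d[route[i]][route[i + 1]];
        return length;
    }

    public static void main(String[] args) {
        double[][] d = {
            {0, 2, 9},
            {2, 0, 6},
            {9, 6, 0}};
        // Unit checks on key functionality: catch regressions early
        // (run with java -ea to enable assertions).
        assert routeLength(new int[]{0, 1, 2}, d) == 8.0 : "0-1-2 should cost 8";
        assert routeLength(new int[]{0, 2, 1}, d) == 15.0 : "0-2-1 should cost 15";
        System.out.println("All route tests passed");
    }
}
```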
  • 208. Development Process 91 Define problem → Relaxation (simplify the problem) → Select approach → Design & Implement → Test & Debug → Benchmark & Profile → Restore relaxation / Adjust parameters → Publish paper! (Prototype → Final product)
  • 210. Vehicle Routing 93 • VROOM (Java) - http://victorpillac.com/vroom/ • VROOM-Modelling • Library to manipulate VRP instances • VROOM-Heuristics • Library of common (meta)heuristics • CW, [Adaptive] VNS, [Parallel] [Adaptive] LNS, GRASPx[ILS,ELS] • VROOM-Technicians • Improved implementations for the TRSP • VROOM-jMSA • Event-driven multiple scenario approach for dynamic vehicle routing
  • 211. Vehicle Routing • VRPH (C++) - https://sites.google.com/site/vrphlibrary/ • Library of heuristics for the VRP • CW, VNS • Symphony-VRP - https://projects.coin-or.org/SYMPHONY • Exact solver based on the Symphony and Concorde solvers • CVRPSEP - Lysgaard (2004) • Valid inequality generation • Concorde - http://www.tsp.gatech.edu/concorde.html • Exact solver for the TSP 94
  • 212. Parallelization • Java • Since Java 5: java.util.concurrent framework • Since Java 7: Fork/Join framework • C++ • POSIX  Threads (http://computing.llnl.gov/tutorials/pthreads) • OpenMP (http://computing.llnl.gov/tutorials/openMP) • Boost.Thread (http://www.boost.org/) • Python • Parallel  Python (http://www.parallelpython.com/) 95
  • 213. Logging • Java • Log4J • java.util.logging package • C++ • Boost Log / Logging • Log4cpp • Python • logging module 96
  • 214. VRPRep Instance Repository 97 • VRPRep website: http://rhodes.ima.uco.fr/vrprep/web/home • XML schema to describe most vehicle routing problems • Easy to read for your program • XML data binding: creates the objects for you • Repository of existing instances • Possibility to define your own problem • Tool to generate a sample XML file • Upload your instances
  • 216. Today We Have Seen ... 99 • Introduction • What is complexity? Why is it important? • Data structures • How to represent a solution efficiently? • Algorithmic tricks • What are the main bottlenecks and how to avoid them? • Parallelization • How does parallel computing work? Why, when, and how to parallelize? • Software engineering • How to design flexible and reusable code? • Resources • How to avoid reinventing the wheel?
  • 217. Take Away 1. Developing efficient optimization algorithms requires careful software engineering ✓ Complexity of the problems at hand ✓ Efficient data structures, algorithmic tricks, parallelization 2. Invest in developing flexible and extensible code ✓ Detailed design, documentation ✓ Will save you time later 3. Use existing libraries and share your code ✓ Do not reinvent the wheel ✓ Help others (good for your résumé too) 100
  • 218. Discrete Optimization 101 • Pascal Van Hentenryck - http://www.coursera.org/course/optimization • Online community of thousands of students • Topics • Dynamic programming • Constraint programming • Local search • Linear programming • Join the challenge to solve TSPs and VRPs!
  • 219. 2nd International Optimisation Summer School • 12th to 17th January 2014, Kioloa, NSW, Australia • http://www.cse.unsw.edu.au/~tw/school/ • Lectures • Constraint programming, Integer programming, Column generation • Modelling • Uncertainty • Vehicle routing, Scheduling, Supply networks • Research skills 102
  • 220. [NICTA promotional infographic: Australia's pre-eminent national ICT research centre of excellence — 700 ICT scientists and students, 17 partner universities (including UNSW, UoM, USYD, Monash, and ANU), labs in Brisbane, Sydney, Canberra, and Melbourne, 11 spin-outs with 5 more in the pipeline; contact info@nicta.com.au]
  • 221. References 104 • Crainic, T. (2008). Parallel Solution Methods for Vehicle Routing Problems. In The Vehicle Routing Problem: Latest Advances and New Challenges, Operations Research/Computer Science Interfaces Vol. 43, pp. 171-198 • Groer, C., Golden, B., & Wasil, E. (2010). A library of local search heuristics for the vehicle routing problem. Mathematical Programming Computation, 2, 79-101 • Irnich, S., Funke, B., & Grünert, T. (2006). Sequential search and its application to vehicle-routing problems. Computers & Operations Research, 33(8), 2405-2429 • Lysgaard, J. (2004). CVRPSEP: A package of separation routines for the capacitated vehicle routing problem. Working Paper 03-04 • Pillac, V., Guéret, C., & Medaglia, A. L. (2012). A parallel matheuristic for the Technician Routing and Scheduling Problem. Optimization Letters, doi:10.1007/s11590-012-0567-4 • Savelsbergh, M. (1992). The vehicle routing problem with time windows: minimizing route duration. ORSA Journal on Computing, 4(2), 146-154, doi:10.1287/ijoc.4.2.146 • Toth, P., & Vigo, D. (2003). The Granular Tabu Search and Its Application to the Vehicle-Routing Problem. INFORMS Journal on Computing, 15, 333-346 • Zachariadis, E. E., & Kiranoudis, C. T. (2010). A strategy for reducing the computational complexity of local search-based methods for the vehicle routing problem. Computers & Operations Research, 37(12), 2089-2105