- Large optimization models are increasingly challenging to solve optimally due to super-linear growth in solving effort as model size increases. Parallel heuristic methods provide good quality solutions within practical time limits by solving smaller submodels simultaneously on multiple processor threads.
- Testing on scheduling, supply chain, and telecommunications models showed that parallel heuristics found high quality solutions for most models within hours, while optimal solutions could not be obtained within days for some of the larger models. However, using too many threads showed diminishing returns and even degradation in solution quality due to memory bus bandwidth limitations.
If you are new to Kanban, this presentation is for you. It briefly covers lean principles, the elements of the Kanban Method, and the difference between Kanban with a capital K and kanban with a small k.
Agenda
Kanban Self Assessment
Lean Principles
What is Kanban?
Motivation to use Kanban
Elements of the Kanban Method
Kanban Practices
Kanban and Scrum
Lean Kanban India 2018 | WIP decides Lead Time, Delivery Rate and Flow Efficiency by LeanKanbanIndia
Session Title: WIP decides Lead Time, Delivery Rate and Flow Efficiency
Session Overview:
We all are familiar with Kanban metrics. When we adopt Kanban, we want to achieve shorter Lead Time, higher delivery rate and better flow efficiency. How do we achieve all of this? The master key is WIP limit. Together, let’s look at different scenarios to understand impact of WIP limit on all three... Is there some magic formula? What does the data say... what are the findings so far?
Literally, Kanban is a Japanese word that means "visual card". At Toyota, Kanban is the term used for the visual and physical signaling system that ties together the whole Lean Production system. Kanban as used in Lean Production is over half a century old, and it is now being adopted by new disciplines such as software development.
From Specs To App 10X Faster Using Reactive Programming by Espresso Logic
Webinar on how to build mobile, web and integration backends 10X faster using reactive programming. For transactional database applications, generate an API in minutes, then add business rules using reactive programming. For the typical 5% of logic that remains, code the rest in server-side JavaScript.
Altus Alliance 2016 - Transitioning From On-Premise to the Cloud by Sparkrock
Presentation by Diana Budreau and Wilkin Shum on February 5th, 2016.
Are you considering moving your NAV or CRM solution to the cloud? This session will walk you through the steps you'll need to take to make sure it's the right decision, get your organization ready, and build a project plan to ensure that the move is a success.
Cpk: indispensable index or misleading measure? by PQ Systems (Blackberry&Cross)
Capability analysis is a set of calculations used to assess whether a system is able to meet a set of requirements. Customers, engineers, or managers usually set the requirements, which can be specifications, goals, aims, or standards.
The primary reason for doing a capability analysis is to answer the question: Can we meet customer requirements? To be more specific: Can our system produce consistently within tolerances required by the customer now and in the future?
Capability analysis involves two entities: 1) the producer and 2) the consumer. The consumer sets the requirements and the producer must be able to meet the requirements.
PQ Systems and Blackberry&Cross have been partners since 2006.
Visit: http://www.blackberrycross.com
Cross-department Kanban Systems - 3 dimensions of scaling #llkd15 by Andy Carmichael
Describes Clearvision's journey of adopting Kanban, not just in the software development team but in Marketing and other departments. Uses 3 dimensions of scaling - Width (before and after); Height (different sizes, timescales, decision-making); Depth (interdependent services at the same level)
Presented in this short document is a description of the classic "Pooling Optimization Problem", first described in Haverly (1978), who modeled a small distillate blending problem with three component materials (A, B, C), one pool for mixing or blending only two of the components, two products (P1, P2), one property (sulfur, S) and a single time-period. The GAMS file for this exact problem is found in Appendix A, which describes all of the sets, lists, parameters, variables and constraints required to represent it. Related types of NLP sub-models can also be found in Kelly and Zyngier (2015), where other sub-types of continuous processes such as blenders, splitters, separators, reactors, fractionators and black-boxes for ad hoc or custom sub-models are formulated.
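To make the structure of Haverly's instance concrete, below is a minimal, self-contained sketch of the bilinear pooling model in Python, using SciPy's SLSQP as a stand-in for a proper NLP/global solver. The data values are the commonly cited Haverly (1978) numbers, not taken from the GAMS file in Appendix A, and a local solver like this may stop at a local optimum.

```python
# Haverly pooling sketch: one pool blending A and B, component C blended directly,
# two products with sulfur specs. Solved locally with SciPy SLSQP (illustrative only).
import numpy as np
from scipy.optimize import minimize

# x = [a, b, px, py, cx, cy, p]
# a, b   : component A (3% S, $6) and B (1% S, $16) fed to the pool
# px, py : pool flow to products P1 and P2
# cx, cy : component C (2% S, $10) blended directly into P1 and P2
# p      : pool sulfur quality (%)
cost = np.array([6.0, 16.0, 10.0])
price = np.array([9.0, 15.0])            # P1, P2 selling prices
cap = np.array([100.0, 200.0])           # max demand for P1, P2
spec = np.array([2.5, 1.5])              # max sulfur % in P1, P2

def profit(x):
    a, b, px, py, cx, cy, p = x
    revenue = price[0] * (px + cx) + price[1] * (py + cy)
    expense = cost[0] * a + cost[1] * b + cost[2] * (cx + cy)
    return revenue - expense

cons = [
    {"type": "eq",   "fun": lambda x: x[0] + x[1] - x[2] - x[3]},                  # pool balance
    {"type": "eq",   "fun": lambda x: 3*x[0] + 1*x[1] - x[6]*(x[2] + x[3])},       # pool quality (bilinear)
    {"type": "ineq", "fun": lambda x: cap[0] - (x[2] + x[4])},                     # P1 demand
    {"type": "ineq", "fun": lambda x: cap[1] - (x[3] + x[5])},                     # P2 demand
    {"type": "ineq", "fun": lambda x: spec[0]*(x[2]+x[4]) - (x[6]*x[2] + 2*x[4])}, # P1 sulfur spec
    {"type": "ineq", "fun": lambda x: spec[1]*(x[3]+x[5]) - (x[6]*x[3] + 2*x[5])}, # P2 sulfur spec
]
x0 = np.array([50.0, 50.0, 50.0, 50.0, 10.0, 10.0, 2.0])   # arbitrary starting point
bounds = [(0, None)] * 6 + [(1.0, 3.0)]                    # pool sulfur between A and B
res = minimize(lambda x: -profit(x), x0, method="SLSQP", bounds=bounds, constraints=cons)
print("local profit:", round(profit(res.x), 2), "(known global optimum is 400)")
```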
At present the range includes more than 3,000 consumer products from the world's leading brands.
Practically all of eCosway's products come from independent manufacturers, which allows the company to use advanced global production technologies and the latest achievements of the scientific community, and also strongly distinguishes it from typical MLM companies.
Chronological Decomposition Heuristic: A Temporal Divide-and-Conquer Strateg... by Alkis Vazacopoulos
The chronological decomposition heuristic (CDH) is a simple divide-and-conquer strategy intended to rapidly find integer-feasible solutions to production scheduling optimization problems of practical scale. It is not an exact algorithm in that it will not find the global optimum, although it does use either branch-and-bound or branch-and-cut. The CDH is specifically designed for production scheduling optimization problems formulated using a uniform discretization of time, where a time grid is pre-specified with fixed time-period spacing. The basic premise of the CDH is to slice the scheduling time horizon into aggregate time-intervals or "time-chunks" which are some multiple of the base time-period. Each time-chunk is solved using mixed-integer linear programming (MILP) techniques, starting from the first time-chunk and moving forward in time using the technique of chronological backtracking if required (Marriott and Stuckey, 1998; for more details see the extensive literature on constraint logic programming). The efficiency of the heuristic comes from decomposing the temporal dimension into smaller time-chunks which are solved in succession instead of solving one large problem over the entire scheduling horizon. The basic idea of such a decomposition strategy was partially presented in Bassett et al. (1996), who provided a hierarchical interaction or collaboration between a planning layer and a temporally decomposed scheduling layer. For the CDH, we focus on the time-based decomposition of the scheduling layer without the need for a higher-level coordinating or planning layer.
For many industrial-size problems, solving the MILP using commercial branch-and-bound or branch-and-cut optimization can be a somewhat futile exercise even for well-formulated problems of practical interest. Instead, many researchers such as Kudva et al. (1994), Wolsey (1998), Nott and Lee (1999), Blomer and Gunther (2000) and Kelly (2002) have devised elaborate primal heuristic techniques to enable the solution of problems of large scale and complexity; these techniques can also be augmented by other decomposition strategies such as Lagrangean and Benders relaxation. Unfortunately, with these heuristics global optimality or even global feasibility cannot be guaranteed; however, these methods and others not mentioned have proven useful for problems which are sometimes too large to be solved using conventional methods alone. Therefore, the CDH should be considered as a step in the direction of aiding the scheduling user in finding integer-feasible solutions of reasonable quality quickly.
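To illustrate the chunk-by-chunk idea (a sketch of the general approach, not the authors' implementation), here is a minimal, runnable CDH-style solve of a toy single-item lot-sizing MILP, using the open-source PuLP/CBC stack as a stand-in for a commercial MILP solver; all data are invented for illustration.

```python
# CDH-style sketch: slice the horizon into "time-chunks", solve each chunk as a MILP
# in chronological order, and freeze the decisions of solved chunks before moving on.
import pulp

T, CHUNK = 12, 4                                   # base time-periods and chunk size
demand = [20, 30, 0, 40, 10, 50, 30, 0, 20, 40, 10, 30]
setup_cost, hold_cost, big_m = 100.0, 1.0, 200.0

inv_in, total_cost, schedule = 0.0, 0.0, {}
for start in range(0, T, CHUNK):
    periods = list(range(start, min(start + CHUNK, T)))
    prob = pulp.LpProblem(f"chunk_{start}", pulp.LpMinimize)
    x = {t: pulp.LpVariable(f"x_{t}", lowBound=0) for t in periods}        # production
    y = {t: pulp.LpVariable(f"y_{t}", cat="Binary") for t in periods}      # setup decision
    s = {t: pulp.LpVariable(f"s_{t}", lowBound=0) for t in periods}        # end inventory

    prob += pulp.lpSum(setup_cost * y[t] + hold_cost * s[t] for t in periods)
    prev = inv_in
    for t in periods:
        prob += x[t] <= big_m * y[t]                  # production only after a setup
        prob += prev + x[t] - demand[t] == s[t]       # inventory balance
        prev = s[t]

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    assert pulp.LpStatus[prob.status] == "Optimal"    # a real CDH would backtrack on failure

    total_cost += pulp.value(prob.objective)
    schedule.update({t: (x[t].value(), y[t].value()) for t in periods})    # freeze this chunk
    inv_in = s[periods[-1]].value()                   # carry inventory into the next chunk

print(f"CDH integer-feasible cost over {T} periods: {total_cost:.1f}")
```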
Presented in this short document is a description of what we call "Advanced" Property Tracking or Tracing (APT). APT is the term given to the technique of predicting, simulating, calculating or estimating the properties (e.g., densities, compositions, conditions, qualities, etc.) in a network or superstructure with significant inventory using statistical data reconciliation and regression (DRR).
CPLEX Optimization Studio, Modeling, Theory, Best Practices and Case Studies by Optimization Direct
Recent advancements in Linear and Mixed-Integer Programming give us the capability to solve larger optimization problems. CPLEX Optimization Studio solves large-scale optimization problems and enables better business decisions and resulting financial benefits in areas such as supply chain management, operations, healthcare, retail, transportation, logistics and asset management. In this workshop using CPLEX Optimization Studio we will discuss modeling practices and case studies and demonstrate good practices for solving hard optimization problems. We will also discuss recent CPLEX performance improvements and recently added features.
Solving Large Scale Optimization Problems using CPLEX Optimization Studio by Optimization Direct
Recent advancements in Linear and Mixed-Integer Programming give us the capability to solve larger optimization problems. In this talk using CPLEX Optimization Studio we will discuss modeling practices and case studies and demonstrate good practices for solving hard optimization problems. We will also discuss recent CPLEX performance improvements and recently added features.
Scaling with sync_replication using Galera and EC2 by Marco Tusa
Challenging architecture design and proof of concept on a real case study using a synchronous replication solution.
A customer asked me to investigate and design a MySQL architecture to support his application serving shops around the globe, scaling out and scaling in according to sales seasons.
This is a summary of the sessions I attended at PASS Summit 2017. Out of the week-long conference, I put together these slides to summarize the conference and present it at my company. The slides cover my favorite sessions, the ones I found had the most value, and include screenshotted demos that I personally developed and tested, just as the speakers did at the conference.
Parallel Distributed Deep Learning on HPCC Systems by HPCC Systems
As part of the 2018 HPCC Systems Community Day event:
The training process for modern deep neural networks requires big data and large amounts of computational power. Combining HPCC Systems and Google’s TensorFlow, Robert created a parallel stochastic gradient descent algorithm to provide a basis for future deep neural network research, thereby helping to enhance the distributed neural network training capabilities of HPCC Systems.
Robert Kennedy is a first year Ph.D. student in CS at Florida Atlantic University with research interests in Deep Learning and parallel and distributed computing. His current research is in improving distributed deep learning by implementing and optimizing distributed algorithms.
This is the story of how we managed to scale and improve Tappsi’s RoR RESTful API to handle our ever-growing load - told from different perspectives: infrastructure, data storage tuning, web server tuning, RoR optimization, monitoring and architecture design.
Designing, Building, and Maintaining Large Cubes using Lessons Learned by Denny Lee
This is the 2007 SQL PASS Summit presentation by Nicholas Dritsas, Eric Jacobsen, and me on designing, building, and maintaining large Analysis Services cubes.
AWS Webcast - An Introduction to High Performance Computing on AWS by Amazon Web Services
High Performance Computing (HPC) allows scientists and engineers to solve complex science, engineering, and business problems using applications that require high bandwidth, low latency networking, and very high compute capabilities. Learn how the AWS cloud can cost- effectively provide the scalable computing resources, storage services, and analytic tools that enable running various kinds of HPC workloads. Who should attend? Engineers, architects, product managers, data scientists, high performance computing specialists, and researchers from industry and academia, along with technically-minded business stakeholders looking to put data to work for their organization.
Simulation of Heterogeneous Cloud Infrastructures by CloudLightning
In recent years, in addition to traditional CPU-based hardware servers, hardware accelerators have become widely used in various HPC application areas. More specifically, Graphics Processing Units (GPUs), Many Integrated Cores (MICs) and Field-Programmable Gate Arrays (FPGAs) have shown great potential in HPC and have been widely adopted in supercomputing and in HPC clouds. This presentation focuses on the development of a cloud simulation framework that supports hardware accelerators. The design and implementation of the framework are also discussed.
This presentation was given by Dr. Konstantinos Giannoutakis (CERTH) at the CloudLightning Conference on 11th April 2017.
JavaOne2016 - Microservices: Terabytes in Microseconds [CON4516] by Malin Weiss
By leveraging memory-mapped files, Speedment and the Chronicle Engine support large Java maps that can easily exceed the size of your server's RAM. Because the Java maps are mapped onto files, these maps can be shared instantly between several microservice JVMs, and new microservice instances can be added, removed, or restarted very quickly. Data can be retrieved with predictable ultra-low latency for a wide range of operations. The solution can be synchronized with an underlying database so that your in-memory maps will be consistently "alive". The mapped files can be tens of terabytes, which has been done in real-world deployment cases, and a large number of microservices can share these maps simultaneously. Learn more in this session.
Low-latency Multi-threaded Ensemble Learning for Dynamic Big Data Streams by Diego Marrón Vida
Real-time mining of evolving data streams involves new challenges when targeting today's application domains such as the Internet of Things: increasing volume, velocity and volatility requires data to be processed on-the-fly with fast reaction and adaptation to changes. This paper presents a high-performance scalable design for decision trees and ensemble combinations that makes use of the vector SIMD and multicore capabilities available in modern processors to provide the required throughput and accuracy. The proposed design offers very low latency and good scalability with the number of cores on commodity hardware when compared to other state-of-the-art implementations. On an Intel i7-based system, processing a single decision tree is 6x faster than MOA (Java) and 7x faster than StreamDM (C++), two well-known reference implementations. On the same system, using the 6 cores (and 12 hardware threads) available allows an ensemble of 100 learners to be processed 85x faster than MOA while providing the same accuracy. Furthermore, our solution is highly scalable: on an Intel Xeon socket with large core counts, the proposed ensemble design achieves up to 16x speed-up when employing 24 cores with respect to a single-threaded execution.
We tested ODH|CPLEX 4.24 on the MIPLIB Open-v7 models, a public collection of 286 models for which an optimal solution has not been proven. 257 of these are known to have a feasible solution.
ODH|CPLEX proved optimality on 6 models and, within 2 hours, found better solutions than the best previously known for 40% of the models with 12 threads and 35% with 8 threads. ODH|CPLEX matched the best known solution on 21% of the models.
CPLEX Optimization Studio solves large-scale optimization problems and enables better business decisions and resulting financial benefits in areas such as supply chain management, operations, healthcare, retail, transportation, logistics and asset management. It has been applied in sectors as diverse as manufacturing, processing, distribution, retailing, transport, finance and investment. CPLEX Optimization Studio is an analytical decision support toolkit for rapid development and deployment of optimization models using mathematical and constraint programming. It combines an integrated development environment (IDE) with the powerful Optimization Programming Language (OPL) and high-performance ILOG CPLEX optimizer solvers. CPLEX Optimization Studio enables clients to: optimize business decisions with high-performance optimization engines; develop and deploy optimization models quickly by using flexible interfaces and prebuilt deployment scenarios; and create real-world applications that can significantly improve business outcomes. Optimization Direct has partnered with and entered into a technology licensing and distribution agreement with IBM. Combining the founders' industry and software experience with IBM's CPLEX Optimization Studio product and IBM's arsenal of optimization modeling and solving tools provides customers with the most powerful capabilities in the industry.
Missing-Value Handling in Dynamic Model Estimation using IMPL by Alkis Vazacopoulos
Presented in this short document is a description of how IMPL handles missing-values or missing-data when estimating dynamic models, which inherently involve time-lagged or time-shifted input and output variables. Missing-values in a data set imply that for some reason the data is not available, most likely due to a malfunctioning instrument or even a lack of proper accounting. Missing-data handling is relatively well-studied, especially for time-series or dynamic data, given that it is not as easy as removing, ignoring or deleting bad sections of data as can be done when static or steady-state models are calibrated (Honaker and King, 2010; Smits and Baggelaar, 2010; Fisher and Waclawski, 2015). Unfortunately, all of their methods involve what is known as "imputation", i.e., replacing or substituting missing-data with some reasonably assumed value, which is at the very least a biased estimate. When regression techniques such as PLS and PCR are used (Nelson et al., 2006), missing-data can be handled without imputation by computing the input-output covariance matrices excluding the contribution from the missing-values, given the temporal and structural redundancy in the system. However, it is shown in Dayal (1996) that using PLS and other types of regression techniques, such as Canonical Correlation Regression (CCR) and Reduced Rank Regression (RRR), to fit non-parsimonious and non-parametric finite impulse/step response models (FIR/FSR) is not as reliable as fitting lower-ordered transfer functions, especially considering the robust stability of the resulting model predictive controller if that is its intended use.
Finite Impulse Response Estimation of Gas Furnace Data in IMPL Industrial Mod... by Alkis Vazacopoulos
Presented in this short document is a description of how to estimate deterministic and stochastic non-parametric finite impulse response (FIR) models in IMPL, applied to industrial gas furnace data identical to that found in TSE-GFD-IMF using parametric transfer-functions. The methodology of time-series analysis or system identification involves essentially three (3) stages (Box and Jenkins, 1976): (1) model structure identification, (2) model parameter estimation and (3) model checking and diagnostics. We do not address (1), which requires stationarity and seasonality assessment/adjustment, auto-, cross- and partial-correlation, etc. to establish the parametric transfer function polynomial degrees, especially when we are using non-parametric FIR estimation. Instead we focus only on the parameter estimation and diagnostics. These types of parameter estimation problems involve dynamic and nonlinear relationships shown below, and we solve them using IMPL's Sequential Equality-Constrained QP Engine (SECQPE) and Supplemental Observability, Redundancy and Variability Estimator (SORVE). Another type of non-parametric identification, known as Subspace Identification (Qin, 2006), can be used to estimate state-space models.
Our Industrial Modeling Service (IMS) involves several important (but rarely implemented) methods to significantly improve and advance your existing models and data. Since it is well-known that good decision-making requires good models and data, IMS is ideally suited to support this continuous-improvement endeavour. IMS is specifically designed to either co-exist with your existing design, planning, scheduling, etc. applications, or these same models and data can be used seamlessly within our Industrial Modeling and Programming Language (IMPL) to create new value-added applications. The following techniques form the basis of our IMS offering.
This short note describes a relatively simple methodology, procedure or approach to increase the performance of already installed industrial models used for optimization, control, simulation and/or monitoring purposes. The method is called Excess or X-Model Regression (XMR) where the concept of “excess modeling” or an X-model is taken from the field of thermodynamics to describe the departure or residual behaviour of real (non-ideal) gases and liquids from their ideal state (Kyle, 1999; Poling et. al., 2001; Smith et. al., 2001). It has also been applied to model the non-ideal or nonlinear behaviour of blending motor gasoline octanes with its synergistic and antagonistic interactional effects (Muller, 1992).
The fundamental idea of XMR is to calibrate, train, fit or estimate, using actual data and multiple linear regression (MLR) or ordinary least squares (OLS), the deviations of the measured responses from the existing model responses. The existing model may be a glass-, grey- or black-box model (known or unknown, linear or nonlinear, implicit/open or explicit/closed) depending on the use of the model. That is, for optimization and control the model structure and parameters are available, given that derivative information is required, whereas for simulation and monitoring the model may only be observed through the dependent output variables given the necessary independent input variables.
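As a small illustration of the excess-model idea (a sketch on synthetic data, not the authors' implementation), the fragment below fits, by ordinary least squares, a linear-plus-interaction correction to the deviations between "measured" responses and a deliberately mis-calibrated existing model.

```python
# X-Model Regression (XMR) sketch: fit an "excess model" to the residuals of an
# installed model so that installed model + excess model tracks the measurements.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 2))             # independent inputs

def existing_model(X):
    # the installed model: linear and slightly mis-calibrated
    return 2.0 * X[:, 0] + 1.0 * X[:, 1]

def true_plant(X):
    # the real (mildly nonlinear) behaviour that generates the measurements
    return 2.3 * X[:, 0] + 0.8 * X[:, 1] + 0.05 * X[:, 0] * X[:, 1]

y_meas = true_plant(X) + rng.normal(0.0, 0.1, size=len(X))
excess = y_meas - existing_model(X)                    # departure from the existing model

# OLS fit of the excess on a simple basis (inputs + bilinear interaction + bias)
basis = np.column_stack([X, X[:, 0] * X[:, 1], np.ones(len(X))])
coef, *_ = np.linalg.lstsq(basis, excess, rcond=None)

y_corrected = existing_model(X) + basis @ coef         # existing model + excess model
print("RMSE before:", round(np.sqrt(np.mean((y_meas - existing_model(X)) ** 2)), 3))
print("RMSE after: ", round(np.sqrt(np.mean((y_meas - y_corrected) ** 2)), 3))
```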
Presented in this short document is a description of how to model and solve multi-utility scheduling optimization (MUSO) problems in IMPL. Multi-utility systems (co/tri-generation) are typically found in petroleum refineries and petrochemical plants (multi-commodity systems) especially when fuel-gas (i.e., off-gases of methane and ethane) is a co- or by-product of the production from which multi-pressure heating-, motive- and process-steam are generated on-site. Other utilities include hydrogen, electricity, water, cooling media, air, nitrogen, chemicals, etc. where a multi-utility system is shown in Figure 1 with an intermediate or integrated utility (both produced and consumed) such as fuel-gas, steam or electricity. Itemized benefit areas just for better management of an integrated steam network can be found in Pelham (2013) where his sample multi-pressure steam utility flowsheet is found in Figure 2.
Advanced Parameter Estimation (APE) for Motor Gasoline Blending (MGB) Indust... by Alkis Vazacopoulos
Presented in this short document is a description of how to model and solve advanced parameter estimation (APE) problems in IMPL. APE is the term given to the application of estimating, fitting or calibrating parameters in models involving a network, topology, superstructure or flowsheet. When estimating parameters with multiple linear regression (MLR), ordinary least squares (OLS), ridge regression (RR), principal component regression (PCR) and partial least squares (PLS) there is no explicit model but simply an X-block and Y-block of data. Hence, these methods are referred to as “non-parametric” or “data-based” methods as opposed to the “parametric” or “model-based” method used here. To solve these types of problems we use what is commonly referred to as “error-in-variables” (EIV) regression which is conveniently implemented as nonlinear data reconciliation and regression (NDRR) using the technology found in Kelly (1998a; 1998b; 1999) and Kelly and Zyngier (2008a). The primary benefit of using EIV (NDRR) over the other regression methods is that we can easily handle the inclusion of conservation laws and constitutive relations, explicitly, a must for any industrial estimation problem (IEP).
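To show the flavour of including a conservation law explicitly in the estimation (a minimal sketch with invented numbers, not IMPL's NDRR code), the fragment below reconciles three noisy flow measurements by weighted least squares subject to an exact mass balance, using the standard closed-form projection.

```python
# Data reconciliation sketch: adjust measurements as little as possible (weighted
# least squares) while enforcing the balance f_in = f_out1 + f_out2 exactly.
import numpy as np

y = np.array([101.3, 63.8, 39.5])        # measured flows: f_in, f_out1, f_out2
sigma = np.array([1.0, 0.8, 0.6])        # measurement standard deviations
A = np.array([[1.0, -1.0, -1.0]])        # balance: f_in - f_out1 - f_out2 = 0

# Closed form of  min (x - y)' W (x - y)  s.t.  A x = 0,  with W = diag(1/sigma^2)
V = np.diag(sigma ** 2)                  # = W^{-1}
lam = np.linalg.solve(A @ V @ A.T, A @ y)
x = y - V @ A.T @ lam
print("reconciled flows:", np.round(x, 2), " balance residual:", (A @ x).item())
```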
Presented in this short document is a description of modeling and solving partial differential equations (PDE’s) in both the temporal and spatial dimensions using IMPL. The sample PDE problem is taken from Cutlip and Shacham (1999 and 2014) and models the process of unsteady-state heat transfer or conduction in a one dimensional (1D) slab with one face insulated and constant thermal conductivity as discussed by Geankoplis (1993).
Techniques to optimize the PageRank algorithm usually fall into two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before PageRank computation to improve performance; final ranks of chain nodes can be easily calculated. This could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order. This could help reduce the iteration time and the number of iterations, and also enable multi-iteration concurrency in PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
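As a small illustration of the first of these techniques (skipping vertices whose rank has stopped changing), here is a minimal power-iteration PageRank sketch on a toy graph; it is an illustrative toy under assumed data, not the STICD implementation.

```python
# Power-iteration PageRank that marks vertices as converged once their rank change
# drops below a tolerance and skips them in later iterations.
import numpy as np

graph = {0: [1, 2], 1: [2], 2: [0], 3: [2], 4: [3, 0]}   # node -> out-neighbours (no dead ends)
n = len(graph)
damping, tol = 0.85, 1e-10

in_edges = {v: [] for v in graph}                 # reverse adjacency for pull-style updates
for u, outs in graph.items():
    for v in outs:
        in_edges[v].append(u)
out_deg = {u: len(outs) for u, outs in graph.items()}

rank = np.full(n, 1.0 / n)
converged = np.zeros(n, dtype=bool)
for it in range(100):
    new_rank = rank.copy()
    for v in range(n):
        if converged[v]:
            continue                              # skip vertices that have stopped changing
        r = sum(rank[u] / out_deg[u] for u in in_edges[v])
        new_rank[v] = (1.0 - damping) / n + damping * r
        if abs(new_rank[v] - rank[v]) < tol:
            converged[v] = True
    rank = new_rank
    if converged.all():
        break
print("iterations:", it + 1, "ranks:", np.round(rank, 4))
```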
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2... by pchutichetpong
M Capital Group ("MCG") expects demand to grow and supply to keep evolving, facilitated by institutional investment rotating out of offices and into work from home ("WFH"), and by the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Levelwise PageRank with Loop-Based Dead End Handling Strategy: SHORT REPORT ... by Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables ranks to be calculated in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It does, however, come with the precondition that the input graph contains no dead ends. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... by John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Opendatabay - Open Data Marketplace.pptx by Opendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
2. Summary
• Challenges of Large Scale Optimization
• Heuristic Software Approach
• Scheduling, supply chain and telecomms examples
• Benefits and problems of parallelism
3. Large Scale Optimization
• Many models now solved routinely which would have been impossible (‘unsolvable’) a few years ago
• BUT: have super-linear growth of solving effort as model size/complexity increases
• AND: customer models keep getting larger
• Globalized business has larger and more complex supply chains
• Optimization expanding into new areas, especially scheduling
• Detailed models easier to sell to management and end-users
4. The Curse of Dimensionality: Size Matters
• Super-linear solve time growth often supposed
• The reality is worse
• Few data sets available to support this
• Look at randomly selected sub-models of two scheduling models
• Simple basic model
• More complex model with additional entity types
• Two hour time limit on each solve
• 8 threads on 4-core hyperthreaded Intel i7-4790K
• See how solve time varies with integers after presolve
7. Why size matters
• Solver has to
• (Presolve, probe and) solve LP relaxation
• Find and apply cuts
• Branch on remaining infeasibilities (and find and apply cuts too)
• Look for feasible solutions with heuristics all the while
• Simplex relaxation solves are theoretically non-polynomial, but in practice effort increases between linearly and quadratically
• Barrier solver effort grows more slowly, but:
• cross-over still grows quickly
• usually get more integer infeasibilities
• can’t use last solution/basis to accelerate
• Cutting grows faster than quadratic: each cut requires more effort, more cuts/round, more rounds of cuts, each round harder to apply
• Branching is exponential: 2^n in the number of (say) binaries n (a worked example follows below)
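As an illustrative calculation (not from the slides): with just 60 unfixed binaries the worst-case branch-and-bound tree has 2^60 ≈ 1.2 x 10^18 nodes, so even at a million nodes per second full enumeration would take tens of thousands of years; presolve, cuts and pruning have to eliminate essentially all of that tree for the solve to finish.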
8. What can be done?
• Decomposition
• Solve smaller models
• Use ‘good’ formulations
• As tight as possible
• Avoid symmetry
• Settle for a good solution
• Use heuristics
• Use more powerful hardware
• Can usually make minor improvements by having a faster processor and memory
• Big potential is from parallel processing
• Use 4, 8, 12, 24,… or more threads on separate ‘processors’
9. Parallel Processing
• CPU clock speed is not improving
• Power consumed (and heat generated) grows according to the square of the speed
• Memory speed is not improving (much)
• Can fit more processors onto a single chip
• Can get wider registers
• Vectorization
• do several (up to 8) fp calculations at once on a 512-bit register (SIMD)
• of limited use in sparse optimization
• Cannot use full processor capability if using many cores
10. Heuristic Solution Software
• Proof of optimality may be impractical, but want good solutions to (say) 20%
• Uses CPLEX callable library
• First-feasible-solution heuristics
• Improvement heuristic
• Solves sequence of smaller sub-model(s)
• approach used e.g. by RINS and local branching
• Use model structure to create sub-models and combine solutions
• Assess solution quality by a very aggressive root solve of whole model
• Multiple instances can be run concurrently with different seeds (see the sketch after this slide)
• Can run on only one core
• Can interrupt at any point and take best solution so far (time limit / call-back / SIGINT)
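A minimal sketch of one such seeded heuristic instance, written with the CPLEX Python API rather than the C callable library used by the actual software; the file name, seed and limits are placeholders, and the real heuristic wraps many sub-model solves rather than a single call.

```python
# Sketch of a single-core, seeded CPLEX run of the kind described above.
import cplex

def heuristic_instance(model_file: str, seed: int, time_limit: float):
    c = cplex.Cplex(model_file)
    c.parameters.randomseed.set(seed)        # each concurrent instance gets its own seed
    c.parameters.threads.set(1)              # keep the instance on a single core
    c.parameters.timelimit.set(time_limit)   # could also stop via a callback or SIGINT
    c.parameters.emphasis.mip.set(1)         # 1 = emphasize finding feasible solutions
    c.solve()
    return (c.solution.get_objective_value(),
            c.solution.MIP.get_mip_relative_gap())

obj, gap = heuristic_instance("model.mps", seed=12345, time_limit=3600.0)
print(f"incumbent {obj:.6g}, within {100.0 * gap:.1f}% of the best bound")
```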
11. Parallel Heuristic Approach
• Run several heuristic threads with different seeds simultaneously (see the sketch after this slide)
• CPLEX callable library very flexible, so
• Exchange solution information between runs
• Kill sub-model solves when done better elsewhere
• Improves sub-model selection
• Opportunistic or deterministic
• 5 instances run on a 24-core Intel 2 x Xeon E5-2690v3 running at up to 3.5 GHz (DDR4-2133 memory)
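The arrangement above can be sketched with standard Python multiprocessing: several seeded heuristic instances run concurrently and the best incumbent is kept. The worker below is a stub standing in for a real seeded solver run (such as the one sketched after slide 10); a production implementation would additionally exchange incumbents between runs and kill sub-model solves that are beaten elsewhere.

```python
# Run several seeded heuristic instances in parallel and keep the best incumbent.
# heuristic_instance() is a stub so the sketch runs without any solver installed.
import random
from concurrent.futures import ProcessPoolExecutor

def heuristic_instance(seed: int) -> tuple:
    """Stand-in for one seeded heuristic run returning (seed, incumbent objective)."""
    rng = random.Random(seed)
    incumbent = min(1000.0 + rng.uniform(-50.0, 50.0) for _ in range(1000))
    return seed, incumbent

if __name__ == "__main__":
    seeds = [11, 22, 33, 44, 55]                        # e.g. 5 instances on a 24-core box
    with ProcessPoolExecutor(max_workers=len(seeds)) as pool:
        results = list(pool.map(heuristic_instance, seeds))
    best_seed, best_obj = min(results, key=lambda r: r[1])
    print(f"best incumbent {best_obj:.2f} from the instance seeded with {best_seed}")
```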
12. Example: Large Scale Scheduling, Supply Chain and Telecomms Models

Model       entities   rows       cols        integers
Easy        314        299288     57804       57804
Medium†     314        389560     94200       94200
Difficult†  406        371964     149132      149132
Large       302965     2836736    4892396     18271400
Huge        27000      2577916    12944400    12944400

† No usable (say within 30% gap) solution after 3 days run time on fastest hardware (Intel i7 4790K ‘Devil’s Canyon’)
13. Heuristic Results on Scheduling Models
8 cores on Intel 2 x Xeon E5-2690v3 3GHz

Model       Solution    Time       Gap
Easy        96          231.06     0%
Medium      113         28800      ≤ 13%
Difficult   773.8       28800      ≤ 56%
Large       1.1493E+7   28800      ≤ 1%
Huge        370         28800      ≤ 61%

Lower bounds established by separate CPLEX runs. Times are in seconds.
20. What’s Gone Wrong?
• Size (or rather density) matters again
• More threads can give worse results
• Used 24 core Intel Xeon E5-2690v3 server
• Hyperthreading not used
• Memory transfer speed is ~20GB/sec
• MIP solves become memory bus bound
22. Conclusions
• Hard size barriers to solve-to-optimality times
• May have to be satisfied with solutions of unproven quality/optimality
• Heuristic methods demand serious consideration
• Parallel solution methods are the best way of exploiting modern hardware
• Even these are limited by memory bus speeds