1. Knowledge Mavens
Who we are and what we do
We’re a group of Scientists,
Artists, Engineers, Musicians
…aka Polymaths
We “Show and Tell” every
Saturday 1pm in Beaverton
Our mission is “Free Knowledge”
2. https://www.kaggle.com/c/traveling-santa-2018-prime-paths
A version of the Traveling Salesman
problem but with reindeer, Santa, and a
carrot issue to make it more interesting.
In short: visit every dot in the picture
exactly once by the shortest path, but on
every tenth step, if the point number is not a
prime number, that step is 10% longer
(the reindeer did not get the
expected carrot reward and thus are a
bit slower).
The reward is $7000 for the best score, with other
prizes.
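A minimal sketch of the scoring rule as described above. It assumes the penalty applies to every tenth step and is judged on the city the step leaves from; the names `is_prime`, `path_length`, and `coords` are made up for illustration and are not taken from the contest code.

```python
import math

def is_prime(n):
    """Trial-division primality test; fine for illustration."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True

def path_length(path, coords, penalty=1.1):
    """Sum the step lengths. Every tenth step costs 10% more when
    the city it leaves from is not prime (assumed reading of the rule)."""
    total = 0.0
    for step, (a, b) in enumerate(zip(path, path[1:]), start=1):
        d = math.dist(coords[a], coords[b])
        if step % 10 == 0 and not is_prime(a):
            d *= penalty
        total += d
    return total
```

Here `coords` is assumed to be a dict mapping each CityID to an `(x, y)` tuple, and `path` a list of CityIDs in visiting order.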
3. http://www.math.uwaterloo.ca/tsp/concorde.html
“Concorde is a computer code for
the symmetric traveling salesman
problem (TSP) and some related
network optimization problems.
The code is written in the ANSI C
programming language and it is
available for academic research
use; for other uses,
contact William Cook for licensing
options.”
Concorde produces a path of about 1.5 million units for the contest. About 900
teams have about the same score.
4. https://github.com/alohawild/Raindeer
The first job for our team, Wildteam, coding in Python 3 was just to get the
data in and out (and to remember how to code in Python 3).
Our first run created the first basic dataset. We
managed to get our first rating, about 450 million units. We
shared our results with the reindeer folks and they were less
than excited. That would require the reindeer to travel, in the
four-hour period of our normal allowed delivery window, about 1.9
million units a minute (roughly 31,000 units a second).
We then created a list sorted by X,Y and inserted a prime
every tenth step. This scored about 203 million. Run time was less
than a minute.
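One way the sorted list with a prime on every tenth step could look is sketched below. This is a guess at the idea, not the code in the repository: the function name, the `coords` dict (CityID to `(x, y)`), and the rule for which prime to pull in are all assumptions.

```python
import math

def is_prime(n):
    """Trial-division primality test; fine for illustration."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def sorted_path_with_primes(coords):
    """Sweep the cities in (x, y) order, but hold primes back so that
    the city at path index 9, 19, 29, ... (the one a tenth step
    leaves from) is prime whenever a prime is still available."""
    cities = sorted((c for c in coords if c != 0), key=lambda c: coords[c])
    primes = [c for c in cities if is_prime(c)][::-1]      # pop() takes the
    others = [c for c in cities if not is_prime(c)][::-1]  # next city in sweep order
    path = [0]                                             # start at the North Pole
    while primes or others:
        want_prime = len(path) % 10 == 9
        if (want_prime and primes) or not others:
            path.append(primes.pop())
        else:
            path.append(others.pop())
    path.append(0)                                         # and return to it
    return path
```

Pulling the next prime out of sweep order makes the path jump around, which may be part of why a plain sorted list scores so poorly.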
We then wrote a Monte Carlo program with simple greedy
selection: 73 million. This took about an hour on my Apple
with more than 100 epochs. The code is improvedeer.py.
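A Monte Carlo pass with greedy selection can be sketched as: swap two random cities and keep the swap only if the path gets shorter. The version below is an illustration under that assumption, not the repository code. It ignores the prime penalty and recomputes the whole length on every trial, which is far too slow for the full dataset but keeps the idea visible.

```python
import math
import random

def tour_length(path, coords):
    """Plain Euclidean length of the path (no prime penalty)."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))

def monte_carlo_improve(path, coords, trials=10_000, seed=0):
    """Randomly swap two interior cities; keep the swap only if it helps."""
    rng = random.Random(seed)
    best = list(path)
    best_len = tour_length(best, coords)
    for _ in range(trials):
        # Never move the first or last entry (the North Pole).
        i, j = rng.sample(range(1, len(best) - 1), 2)
        best[i], best[j] = best[j], best[i]
        new_len = tour_length(best, coords)
        if new_len < best_len:
            best_len = new_len                   # greedy: accept improvements only
        else:
            best[i], best[j] = best[j], best[i]  # undo the swap
    return best, best_len
```

One call would correspond to one epoch; because only improving swaps are kept, the gains flatten out once the easy swaps are used up.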
5. https://github.com/alohawild/Raindeer
Multiple runs showed no worthwhile improvement past 100
epochs.
Path tracing was next.
We created a 10x10 matrix of the dataset (“CityID” list in the
contest terms), then calculated the centroid for each focus
area defined by the matrix (0-99). We then created a path by
starting with the North Pole (CityID of zero) and adding in each focus
area, ordered by distance from the centroid. We looped through all the
areas, starting with the one that contained the North Pole, and
connected them all to the new path. This included trying to find a
prime and assigning it to the tenth step.
The score was 35 million, and then with a “snake” loop, 28 million! Run time
about an hour.
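The grid, centroid, and “snake” steps above can be sketched roughly as follows. All function names are hypothetical, the prime handling is left out, and the snake simply starts at cell 0 rather than at the cell containing the North Pole, so treat it as an outline of the idea only.

```python
import math
from collections import defaultdict

def grid_cells(coords, n=10):
    """Bucket every city into an n x n grid; cells are numbered 0..n*n-1."""
    xs = [p[0] for p in coords.values()]
    ys = [p[1] for p in coords.values()]
    min_x, min_y = min(xs), min(ys)
    w = (max(xs) - min_x) / n or 1.0   # guard against zero width
    h = (max(ys) - min_y) / n or 1.0
    cells = defaultdict(list)
    for city, (x, y) in coords.items():
        col = min(int((x - min_x) / w), n - 1)
        row = min(int((y - min_y) / h), n - 1)
        cells[row * n + col].append(city)
    return cells

def centroid(cities, coords):
    """Mean position of the given cities."""
    return (sum(coords[c][0] for c in cities) / len(cities),
            sum(coords[c][1] for c in cities) / len(cities))

def snake_order(n=10):
    """Cell numbers row by row, reversing direction on every other row."""
    order = []
    for row in range(n):
        cols = range(n) if row % 2 == 0 else range(n - 1, -1, -1)
        order.extend(row * n + col for col in cols)
    return order

def snake_path(coords, n=10):
    """Visit the cells in snake order; inside a cell, take cities
    in order of distance from that cell's centroid."""
    cells = grid_cells(coords, n)
    path = [0]
    for cell in snake_order(n):
        members = [c for c in cells.get(cell, []) if c != 0]
        if not members:
            continue
        center = centroid(members, coords)
        members.sort(key=lambda c: math.dist(coords[c], center))
        path.extend(members)
    path.append(0)
    return path
```

The snake order matters because neighbouring cells are visited back to back, which avoids the long jump from the end of one row to the start of the next.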
6. https://github.com/alohawild/Raindeer
Again, no real improvements could be made, including by running
improvedeer.py.
We expanded the selection and allowed a return to greedy testing.
The logic selects the unused CityIDs that are closest to the
centroid. This allows it to find the next best one. The code is
arranged to use a parameter for this. A new “testing” mode
was added to the routines to allow debugging; this gets a bit
obscure.
The result was a stunning 2.8 million with a 500-wide scan
(about 1/8 of the size of the list).
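A sketch of the windowed greedy selection described above, assuming the “scan” width is the number of unused cities nearest the centroid that are considered at each step. The function name and the `width` parameter are illustrative, not taken from deerpath.py.

```python
import math

def greedy_with_window(start, cities, coords, center, width=500):
    """Order `cities` greedily: at each step, look only at the `width`
    unused cities closest to `center` and move to the one nearest
    the current position."""
    unused = sorted(cities, key=lambda c: math.dist(coords[c], center))
    path, current = [], start
    while unused:
        window = unused[:width]
        nxt = min(window, key=lambda c: math.dist(coords[current], coords[c]))
        unused.remove(nxt)
        path.append(nxt)
        current = nxt
    return path
```

With `width=1` this falls back to plain centroid ordering; a wider window costs more time per step but lets the greedy choice find a closer next city.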
After a comment from a data scientist, we dropped the prime
logic (likely making the reindeer unhappy) and scored 1.9 million on an
hour-long run on my Apple!
Code is deerpath.py and includes some commented out
prime code.
7. The Future…
Concorde is still beating us… we could use it… Run times
of 7+ hours get 1.5M.
We know from an article on Kaggle that a pure greedy process that
computes the distance to every point gets 1.8 million. It
runs for a long time.
We are starting on creating and checking sub-paths and copying the
best sub-path into the path. We are calling this “quilting.”
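One possible reading of “quilting” is sketched below: take a short stretch of the path, try every ordering of its interior cities with both ends held fixed, and copy the best ordering back in. This is a guess at the plan, not existing code. The names are hypothetical, the prime penalty is ignored, and the brute-force search only works for small stretches.

```python
import math
from itertools import permutations

def seg_length(seq, coords):
    """Euclidean length of a sequence of cities."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(seq, seq[1:]))

def quilt(path, coords, start, size=6):
    """Return a copy of `path` in which the stretch of `size` cities
    beginning at index `start` has its interior reordered optimally."""
    end = min(start + size, len(path))
    if end - start < 3:              # nothing to reorder
        return list(path)
    head, tail = path[start], path[end - 1]
    middle = path[start + 1:end - 1]
    best = min(permutations(middle),
               key=lambda m: seg_length((head, *m, tail), coords))
    return path[:start + 1] + list(best) + path[end - 1:]
```

Sliding `start` along the path and calling `quilt` at each position would patch the path piece by piece; the cost grows factorially with `size`, so the stretch has to stay short.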
We are looking at
https://en.wikipedia.org/wiki/Branch_and_bound as this may
be of some help. Again, we will code it ourselves. Resisting using a
package…resisting….resisting…