2. This talk
Using R to grab, visualize, and analyse data from a mini mountain
marathon
getting data from web tables with rvest
manipulating the data
visualisation with ggplot2
mapping with ggmap
analysis of speeds and leg lengths
genetic algorithms for optimisation with GA
15. Mapping
library(ggmap)
bbox <- make_bbox(LON, LAT, G, f = .8)
map <- get_map(location = bbox,maptype='terrain')
mp <- ggmap(map) + geom_point(data=G,aes(x=LON,y=LAT),shape
geom_text(data=G,aes(x=LON,y=LAT,label=Value),size=4,hj
scale_shape_discrete(solid=FALSE) + ylim(c(min(G$LA
xlim(c(min(G$LON)-eps,max(G$LON)+eps))
NB: geom_leg was needed rather than arrow, for some reason.
21. Mapping
Lessons:
easy to use & integrates well with ggplot2
OSM can be used rather than Google, but the service only runs in
the middle of the night!
discrete zoom levels can be awkward if you’re in between
bboxes are square, although you can trim the display with xlim
etc.
usual shenanigans with different coordinate systems
23. Analysis
Need a way of standardizing the leg durations, given splits from
different runners.
One might imagine, for leg i and runner j, a split $T_{ij}$:
$T_{ij} = d_i / s_j$
where $d_i$ is an inherent leg duration and $s_j$ is a relative speed. Then
$\log T_{ij} = \log d_i - \log s_j$
Suggests linear regression:
lm(log(split) ~ leg + Pos + 0)
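A minimal sketch of this regression on simulated splits, to show how the leg coefficients recover the leg durations. The data frame, the column name runner (standing in for Pos), and all the numbers are illustrative assumptions, not the talk's data.

```r
set.seed(1)

n_legs    <- 6
n_runners <- 8

# Hypothetical "true" values: leg durations (minutes) and relative speeds.
# Runner 1 has speed 1, so it acts as the baseline in the regression.
d <- c(12, 20, 8, 15, 30, 10)
s <- c(1, seq(0.7, 1.3, length.out = n_runners - 1))

# One row per (leg, runner) pair, with split = d_i / s_j times small noise
splits <- expand.grid(leg    = factor(seq_len(n_legs)),
                      runner = factor(seq_len(n_runners)))
splits$split <- d[as.integer(splits$leg)] / s[as.integer(splits$runner)] *
  exp(rnorm(nrow(splits), sd = 0.02))

# No intercept: each leg gets its own coefficient (log d_i for the baseline
# runner); runner coefficients are -log s_j relative to runner 1
fit <- lm(log(split) ~ leg + runner + 0, data = splits)

leg_hat   <- exp(coef(fit)[paste0("leg", seq_len(n_legs))])
speed_hat <- exp(-coef(fit)[paste0("runner", 2:n_runners)])

round(leg_hat, 1)    # standardized leg durations
round(speed_hat, 2)  # speeds relative to runner 1
```

Dropping the intercept with + 0 means the leg coefficients can be read directly as log durations for the baseline runner, rather than as offsets from a reference leg.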
25. Optimization problem
Combinatorial optimisation problem, similar to the travelling
salesman problem
These problems are (NP-)hard
Need to encode routes and their fitness (including penalties)
sample a route as a permutation of 1:no_checkpoints,
extracting the points between 1 and no_checkpoints
$26! \approx 4 \times 10^{26}$
(cf Avogadro’s number)
⇒ need heuristics; not true solutions
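One possible reading of the encoding above, sketched in base R: the route is the stretch of the permutation lying between checkpoint 1 and checkpoint no_checkpoints, and its fitness is the points collected minus a penalty for exceeding a distance budget. The coordinates, point values, budget, and penalty rate are all hypothetical; the talk's actual fitness function is not shown.

```r
set.seed(42)

no_checkpoints <- 26
factorial(no_checkpoints)  # about 4.03e26 orderings

# Hypothetical checkpoint positions and point values
pts <- data.frame(x = runif(no_checkpoints),
                  y = runif(no_checkpoints),
                  value = sample(c(10, 20, 30), no_checkpoints, replace = TRUE))

# Keep only the part of the permutation between checkpoints 1 and N,
# oriented so that the route starts at checkpoint 1
extract_route <- function(perm) {
  ends  <- sort(c(which(perm == 1), which(perm == length(perm))))
  route <- perm[ends[1]:ends[2]]
  if (route[1] != 1) route <- rev(route)
  route
}

# Straight-line length of a route (ignores terrain)
route_length <- function(route) {
  xy <- as.matrix(pts[route, c("x", "y")])
  sum(sqrt(rowSums(diff(xy)^2)))
}

# Points collected, minus an assumed penalty per unit of distance over budget
fitness <- function(perm, budget = 2, penalty = 100) {
  route <- extract_route(perm)
  over  <- max(0, route_length(route) - budget)
  sum(pts$value[route]) - penalty * over
}

perm  <- sample(no_checkpoints)
route <- extract_route(perm)
fitness(perm)
```

Encoding the route this way lets every permutation map to a valid route of variable length, so standard permutation operators can be used without repair steps.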
26. Genetic algorithms
A class of optimization algorithms inspired by evolution.
There is a population of solutions with:
a notion of fitness
generations
mutation
crossover (sex)
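For permutation-encoded solutions, mutation and crossover have to return valid permutations. A rough base-R sketch of two common choices, a swap mutation and a simplified order-style crossover, follows; these are illustrative and not necessarily the operators used in the talk or by any particular package.

```r
# Swap mutation: exchange the values at two randomly chosen positions
swap_mutation <- function(perm) {
  idx <- sample(length(perm), 2)
  perm[idx] <- perm[rev(idx)]
  perm
}

# Simplified order-style crossover: copy a random slice from parent 1, then
# fill the remaining positions with the missing values in parent 2's order
order_crossover <- function(p1, p2) {
  n     <- length(p1)
  cut   <- sort(sample(n, 2))
  keep  <- cut[1]:cut[2]
  child <- rep(NA_integer_, n)
  child[keep]  <- p1[keep]
  child[-keep] <- setdiff(p2, p1[keep])
  child
}

set.seed(7)
p1 <- sample(10)
p2 <- sample(10)
child  <- order_crossover(p1, p2)
mutant <- swap_mutation(p1)
```

Both operators only rearrange existing values, so the child and the mutant are always permutations of the same set as their parents.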
27. Package GA
Makes using these techniques pretty easy:
GA <- ga(type = "permutation", fitness = fitness,
min = 1, max = length(ptz),
popSize = 1e2, maxiter = 1e4,
run = 5e3, pmutation = 0.2)
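A small self-contained run on a toy closed tour, to show how the result can be inspected. This assumes a recent version of the GA package, where the bounds are passed as lower and upper; the eight random points and the tour-length fitness are made up for illustration.

```r
if (requireNamespace("GA", quietly = TRUE)) {
  set.seed(1)
  n  <- 8
  xy <- cbind(x = runif(n), y = runif(n))  # hypothetical checkpoints
  D  <- as.matrix(dist(xy))                # pairwise distances

  # Length of the closed tour visiting the points in the order given
  tour_length <- function(perm) {
    nxt <- c(perm[-1], perm[1])
    sum(D[cbind(perm, nxt)])
  }

  # ga() maximizes fitness, so use the negative tour length
  res <- GA::ga(type = "permutation",
                fitness = function(p) -tour_length(p),
                lower = 1, upper = n,
                popSize = 50, maxiter = 200, run = 50,
                pmutation = 0.2, monitor = FALSE, seed = 1)

  best <- res@solution[1, ]  # first of possibly several tied solutions
  tour_length(best)          # length of the best tour found
}
```

The run argument stops the search early once the best fitness has not improved for that many consecutive generations, which is why it is set below maxiter.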