5. The problem
I want to go hiking at a time/day that
works for me, but that also minimizes
the size of the crowds.
I would like to predict the crowd size for
a specific location and a range of
future dates.
Then I can use that prediction to make
an intelligent choice about when to
take my trip.
8. How do we predict crowds right now?
Government data: often aggregated, not always immediately accessible
Check-ins: sparse coverage
Prior knowledge/intuition: not always validated
12. CrowdSkippr: Inner workings
From flickr.com, extract
the total number of
photos taken at a given
time/place.
Extract data on
temperatures from
NOAA.gov for a
given time/place.
Using this information, create a prediction of how
heavy the crowds will be at a given future
time/place.
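As a rough illustration of the first two steps, the sketch below counts photos per day and pairs each day with its temperature. It assumes the photo timestamps and daily temperatures have already been downloaded; the sample values, the function names, and the ISO-8601 timestamp format are hypothetical and are not taken from the Flickr or NOAA services.

```python
from collections import Counter
from datetime import date, datetime


def daily_photo_counts(timestamps):
    """Count photos per calendar day from ISO-8601 timestamp strings."""
    return Counter(datetime.fromisoformat(ts).date() for ts in timestamps)


def join_with_temperature(counts, temps):
    """Pair each day's temperature with its photo count (0 if no photos)."""
    return [(day, temps[day], counts.get(day, 0)) for day in sorted(temps)]


# Hypothetical sample data standing in for the Flickr and NOAA extracts.
photo_times = [
    "2014-07-04T09:15:00",
    "2014-07-04T13:40:00",
    "2014-07-05T10:05:00",
]
temperatures = {
    date(2014, 7, 4): 24.0,
    date(2014, 7, 5): 26.5,
    date(2014, 7, 6): 19.0,
}

rows = join_with_temperature(daily_photo_counts(photo_times), temperatures)
for day, temp, n_photos in rows:
    print(day, temp, n_photos)
```

Days with a temperature reading but no photos are kept with a count of zero, so quiet days remain in the table.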
13. Gradient Boosting Regression
Predictors
Day of week (Flickr)
Holiday flag (Flickr)
Day of year (Flickr)
Daily temperature (NOAA)
Response
Number of photos taken (Flickr)
(proxy for size of crowd)
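The slides do not say which implementation or settings were used, so the following is only a minimal from-scratch sketch of gradient boosting regression with squared-error loss and depth-1 trees (stumps), using the four predictors listed above. The data, the holiday dates, the number of rounds, and the learning rate are all made up for illustration. In practice a library implementation such as scikit-learn's GradientBoostingRegressor would be the usual choice.

```python
import math
from datetime import date, timedelta


def features(day, temp, holidays):
    """Build the four predictors for one day."""
    return [day.weekday(), int(day in holidays), day.timetuple().tm_yday, temp]


def fit_stump(X, residuals):
    """Find the one-feature threshold split that best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:
        raise ValueError("no valid split: all predictor values are identical")
    return best[1:]


def stump_value(stump, row):
    j, t, left_mean, right_mean = stump
    return left_mean if row[j] <= t else right_mean


def fit_gbr(X, y, n_rounds=40, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_value(stump, row) for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_value(s, row) for s in stumps)


# Hypothetical training data: 90 days of synthetic photo counts.
holidays = {date(2014, 5, 26), date(2014, 7, 4)}
days = [date(2014, 5, 1) + timedelta(days=i) for i in range(90)]
X, y = [], []
for d in days:
    temp = 15 + 10 * math.sin(2 * math.pi * (d.timetuple().tm_yday - 100) / 365)
    X.append(features(d, temp, holidays))
    y.append(20 + 30 * (d.weekday() >= 5) + 40 * (d in holidays) + 2 * temp)

model = fit_gbr(X, y)
mean_y = sum(y) / len(y)
mse_baseline = sum((yi - mean_y) ** 2 for yi in y) / len(y)
mse_model = sum((yi - predict(model, row)) ** 2 for yi, row in zip(y, X)) / len(y)
print(mse_baseline, mse_model)
```

Each round fits a stump to the current residuals and adds a shrunken copy of it to the running prediction, which is the core idea of boosting under squared-error loss. The comparison at the end is on training data only and says nothing about how well the model generalizes.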
18. Validation: Rocky Mountain National Park
For all 28-day windows in a given year, the median difference between crowd size on the predicted and actual best days is 4.6%.
(On the days that are predicted to have the lowest crowds, the crowd size is 29% of the worst possible crowds within that window.)
[Figure: predicted crowd size vs. actual crowd size (test data)]
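One way a sliding-window check like this could be computed is sketched below. The slide does not define how the difference is normalized or how the 29% figure is aggregated, so this sketch assumes both quantities are expressed as a fraction of the largest actual crowd in each window and summarized with the median. The function name and the sample numbers are hypothetical.

```python
from statistics import median


def window_scores(predicted, actual, window=28):
    """Compare the predicted-best day with the actual-best day in each window.

    Returns (median difference, median ratio), both as fractions of the
    largest actual crowd in the window (an assumed normalization).
    """
    diffs, ratios = [], []
    for start in range(len(actual) - window + 1):
        pred_w = predicted[start:start + window]
        act_w = actual[start:start + window]
        worst = max(act_w)
        if worst == 0:
            continue  # no crowds at all in this window; nothing to compare
        # Actual crowd on the day the model predicted to be least crowded.
        chosen = act_w[pred_w.index(min(pred_w))]
        best = min(act_w)
        diffs.append((chosen - best) / worst)
        ratios.append(chosen / worst)
    return median(diffs), median(ratios)


# Hypothetical toy series with a 3-day window so the result is easy to follow.
actual = [10, 20, 5, 40]
predicted = [12, 18, 15, 35]
median_diff, median_ratio = window_scores(predicted, actual, window=3)
print(median_diff, median_ratio)
```

In the first toy window the model picks day 0 (actual crowd 10) although day 2 (actual crowd 5) was best, so the miss costs 5 out of a window maximum of 20. In the second window the prediction and the true best day agree.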
Editor's Notes
Everyone has a cell phone, and people leave a digital footprint of themselves on the internet wherever they go. Let's see if we can use these footprints to predict where people will be in the future.