3. Block diagram of the system
[Block diagram: user location, user mode of transport, user's free time and allowed margin, user's requested number of sequences, Flickr photos and the user's own photos all feed into the recommendation system, which outputs the requested number of recommended landmark sequences of the given travel time.]
4. Necessary components
● Identifying landmarks in the area (no given list) and naming them
● Estimation of travel time between landmarks using the given transportation method
● Estimation of time spent visiting each landmark
● Recommending landmark sequences for the user
5. Previous Work and Innovation
● Previous Work:
– Crandall et al. – extract landmarks at various granularity levels from Flickr photos using mean-shift, and name them
– Popescu et al. – popular trips within a city from photo data-sets
– Choudhury et al. – constructing representative travel routes linking popular landmarks within a city using popularity of landmarks, stay times and transit times
– Popescu et al. – deducing the typical visit duration of a landmark
● Main innovation here:
– Personalized recommendations based on the user's location history and implicit interests
– Estimation of traveling times between landmarks using different modes of transport
– Building a complete recommender system implementing the ideas above
6. Flickr
● Many digital cameras and phones add a geo-location tag to images automatically.
● Flickr houses at least 221,883,830 geo-tagged, time-stamped photos from over 51 million users. http://www.flickr.com/map
● Geo-tags will be used for landmark extraction and recommendation
● Time-stamps will be used for travel time estimation
● Textual tags are used to name the extracted landmarks
● NOT using the actual photos at all, just the meta-data (fast)
● The Flickr API allows searching for public images taken in a given geo-box or geo-circle for non-commercial use. http://www.flickr.com/services/api/flickr.photos.search.html
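As a sketch of what such a meta-data-only query looks like, the snippet below builds the parameters for a `flickr.photos.search` call over a geo-circle. The helper name `search_params` and the placeholder API key are illustrative, not from the slides; parameter names follow the API documentation linked above.

```python
# Sketch of a Flickr API query for geo-tagged photo meta-data only
# (no image downloads). "YOUR_API_KEY" is a placeholder.
FLICKR_REST = "https://api.flickr.com/services/rest/"

def search_params(lat, lon, radius_km, api_key="YOUR_API_KEY"):
    """Build the query for public photos inside a geo-circle, asking only
    for the meta-data the system needs: geo-tag, time-stamp and tags."""
    return {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "lat": lat,
        "lon": lon,
        "radius": radius_km,
        "radius_units": "km",
        "has_geo": 1,
        "extras": "geo,date_taken,tags",
        "format": "json",
        "nojsoncallback": 1,
    }
```

The resulting dict would be sent as query parameters in a GET request to `FLICKR_REST`; since only meta-data is requested, a crawl of an entire city stays cheap.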
7. Assumptions
● Taking a picture of a place and uploading it to Flickr constitutes a recommendation.
(Not many “This museum was boring” photos)
● Geo-locations of the camera and of the photographed object are equivalent.
(The lookout point is recommended, not the view)
● NOT assuming absolute time-stamps of photos are correct, since many camera clocks aren't set. Image time-deltas are used instead.
8. From photos to landmarks
● Clustering points in a two-dimensional space
9. Landmark Extraction Assumptions
● The probability of taking a photograph of a landmark falls off normally (a Gaussian with bandwidth ω) as a function of the distance (>0) from the landmark
● Each photo is of one landmark.
(A photo of a close object against the background of a distant one is a photo of the closer object)
10. The Mean-Shift procedure:
● Estimates the local maximum of the probability distribution of each cluster of photos – the location of a landmark
● Its only parameter is the bandwidth ω
● Iteratively compute for each photo, until it converges, the kernel-weighted mean of all photo locations:
x ← Σi xi K(‖x − xi‖/ω) / Σi K(‖x − xi‖/ω)
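The iteration referred to above is the standard Gaussian-kernel mean-shift step. A minimal NumPy sketch (function name, tolerance and iteration cap are illustrative assumptions, not from the slides):

```python
import numpy as np

def mean_shift(points, bandwidth, tol=1e-6, max_iter=300):
    """Shift every point to the local density maximum of its cluster.

    Gaussian-kernel mean-shift: each iteration replaces a point x with the
    kernel-weighted mean of all points,
        x <- sum_i K(||x - x_i|| / w) * x_i / sum_i K(...)
    where K is a Gaussian with bandwidth w.
    """
    modes = points.astype(float).copy()
    for _ in range(max_iter):
        shifted = np.empty_like(modes)
        for j, x in enumerate(modes):
            d2 = np.sum((points - x) ** 2, axis=1)
            weights = np.exp(-d2 / (2 * bandwidth ** 2))  # Gaussian kernel
            shifted[j] = weights @ points / weights.sum()
        if np.max(np.abs(shifted - modes)) < tol:  # converged
            return shifted
        modes = shifted
    return modes
```

Photos of the same landmark converge to the same mode, which becomes the landmark's estimated location; the bandwidth ω controls how far apart two photos can be and still end up at one mode.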
11. From photos to landmarks
● Substitute the geo-location of each photo with the landmark it captures.
● Group successive user photos of the same landmark as one photo.
(Taking many pictures of a place isn't considered a stronger recommendation)
● The time-stamp of a grouped photo is the average of the time-stamps of the first and last successive photos of the landmark.
● The textual representation of each discovered landmark is the most common tag among all the photos of the landmark.
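The grouping step above amounts to collapsing runs of consecutive same-landmark photos. A small sketch under the stated rule (the helper name and the `(landmark, timestamp)` pair representation are assumptions for illustration):

```python
from itertools import groupby

def collapse_successive(photos):
    """Collapse runs of successive photos of the same landmark into one
    photo whose time-stamp is the average of the run's first and last
    time-stamps.

    `photos` is a list of (landmark, timestamp) pairs in time order.
    """
    collapsed = []
    for landmark, run in groupby(photos, key=lambda p: p[0]):
        run = list(run)  # consecutive photos of this landmark
        collapsed.append((landmark, (run[0][1] + run[-1][1]) / 2))
    return collapsed
```

Note that `groupby` only merges *consecutive* duplicates, so a return visit to the same landmark later in the sequence stays a separate entry, as the model requires.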
12. Photographer behavior model
● We want to estimate P(lt | <lt−1 lt−2 ... l1>, hu), the probability that:
– user u
– with location history hu
– at landmark lt−1 at time t−1, lt−2 at time t−2, etc.
– visits lt at time t
● We assume the photographer's decision on the next landmark to visit is a function of:
– The photographer's current location (sequence)
– The photographer's topics of interest
13. Location-based Model
● Using a Markov model:
– For simplicity, a first-order Markov model is used:
P(lt | <lt−1 ... l1>) ≈ P(lt | lt−1)
– Maximum likelihood estimation:
P(lt | lt−1) = #(transitions lt−1 → lt) / #(transitions out of lt−1)
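The maximum-likelihood estimate is just transition counting over the observed photo sequences. A minimal sketch (function name and input format are illustrative):

```python
from collections import Counter, defaultdict

def markov_mle(sequences):
    """First-order Markov transition probabilities by maximum likelihood:
    P(l_t | l_{t-1}) = #(l_{t-1} -> l_t) / #(transitions out of l_{t-1}).

    `sequences` is a list of landmark sequences, one per user trip.
    """
    transitions = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            transitions[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(counts.values()) for nxt, c in counts.items()}
        for prev, counts in transitions.items()
    }
```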
14. Topic Model (PLSA)
● Each user is a distribution over topics Z
● Each topic is a distribution over the landmarks
● Using the law of total probability:
P(lt | hu) = Σz P(lt | z, hu) P(z | hu)
● Assuming lt and hu are independently conditioned on z, we get:
P(lt | hu) = Σz P(lt | z) P(z | hu)
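Once the PLSA distributions are trained, the mixture above is a single matrix-vector product. A toy sketch with made-up numbers (2 topics, 3 landmarks, one user – all values are hypothetical, for illustration only):

```python
import numpy as np

# Hypothetical trained PLSA parameters:
p_l_given_z = np.array([[0.7, 0.2, 0.1],   # topic 0's distribution over landmarks
                        [0.1, 0.3, 0.6]])  # topic 1's distribution over landmarks
p_z_given_u = np.array([0.25, 0.75])       # this user's topic mixture

# Law of total probability: P(l | u) = sum_z P(l | z) P(z | u)
p_l_given_u = p_z_given_u @ p_l_given_z
```

The result is the user's personalized distribution over landmarks, driven entirely by which topics the user's photo history places weight on.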
16. Markov-Topic Model
● Assuming lt−1 and hu are independently conditioned on lt we get, after derivation:
P(lt | lt−1, hu) = P(lt | hu) · P(lt | lt−1) / (P(lt) · C(lt−1, hu))
– the Topic term P(lt | hu), the Markov term P(lt | lt−1), and a “normalizing factor” C(lt−1, hu) that does not depend on lt
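Because the normalizing factor is constant over candidate landmarks, the combined model can be computed by scoring each candidate and renormalizing. A sketch (function name and dict-based interface are assumptions):

```python
def markov_topic(p_topic, p_markov, p_prior):
    """Combine per-candidate topic, Markov and prior terms into a
    normalized next-landmark distribution:
        P(lt | lt-1, hu)  ∝  P(lt | hu) * P(lt | lt-1) / P(lt)
    All three arguments are dicts keyed by candidate landmark.
    """
    scores = {l: p_topic[l] * p_markov[l] / p_prior[l] for l in p_topic}
    total = sum(scores.values())  # plays the role of the normalizing factor
    return {l: s / total for l, s in scores.items()}
```

Dividing by the prior P(lt) prevents globally popular landmarks from being double-counted by both the topic and Markov terms.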
17. Generating travel routes
● Naive method:
– Compute the probability of all possible routes of the given time, based on the user's location and history
– Choose the most probable ones
● Instead, a best-first search is used on the probability tree:
[Probability tree rooted at the current landmark l0: edges to candidates l1, l2, l3 weighted by P(l1|l0,hu), P(l2|l0,hu), P(l3|l0,hu); deeper edges by, e.g., P(l2|<l1 l0>,hu).]
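A best-first search over that tree expands the most probable partial route first, so complete routes come out in roughly decreasing probability without enumerating every possibility. A sketch under stated assumptions: `successors(route)` is a hypothetical callback yielding `(next_landmark, transition_probability, added_time)` triples, and a route is emitted when it can no longer be extended within the budget.

```python
import heapq

def best_first_routes(start, time_budget, successors, n_routes):
    """Best-first search on the route-probability tree.

    Heap entries are (-probability, route, time_used); heapq is a min-heap,
    so negating the probability pops the most probable partial route first.
    """
    heap = [(-1.0, [start], 0.0)]
    results = []
    while heap and len(results) < n_routes:
        neg_prob, route, used = heapq.heappop(heap)
        extended = False
        for landmark, p, dt in successors(route):
            if used + dt <= time_budget:
                heapq.heappush(heap, (neg_prob * p, route + [landmark], used + dt))
                extended = True
        if not extended and len(route) > 1:
            # Route can't grow within the budget: emit it with its probability
            results.append((route, -neg_prob))
    return results
```

With the Markov-Topic model, `successors` would score candidates with P(lt | lt−1, hu) and add the estimated travel plus visit time as `dt`.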
19. Travel time estimation
● The time-delta between consecutive landmarks in a sequence represents the travel time between them, using a specific mode of transport
– (and sometimes includes some of the visit time at both locations)
20. K-means
● K-means is used on each pair of landmarks.
– Identifies K typical travel times between them using different transportation methods
● K = 3 was chosen
– Three peaks are visible in the travel-time histogram
– Google Maps gives estimates for walking, using public transportation and using a car
● Walking is assumed to be slowest, followed by public transport, then private car
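Since the travel-time deltas are scalars, a plain 1-D Lloyd's k-means suffices to find the K typical times. A sketch (function name and the deterministic quantile-style initialization are assumptions made for reproducibility, not from the slides):

```python
def kmeans_1d(values, k=3, iters=50):
    """Lloyd's k-means on scalar travel-time deltas: returns the k cluster
    centers (the typical travel times), sorted ascending."""
    vals = sorted(values)
    # Deterministic init: spread initial centers across the sorted values
    centers = [vals[(2 * i + 1) * len(vals) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vals:
            # Assign each delta to its nearest center
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # Recompute centers; keep the old center if a cluster emptied
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)
```

Sorting the centers lets the smallest be read as the car time and the largest as the walking time, per the speed ordering assumed above.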
21. Experiments
● 696,394 photographs
● 71,718 users
● Photos taken within 20 km from the center of:
– Washington D.C., New York City, Philadelphia and Boston on the East Coast
– Los Angeles, San Francisco and Las Vegas on the West Coast
22. Choosing the number of topics
● Rating by precision of prediction of the last landmark of each sequence, over 5-fold cross-validation
28. Routes chosen by topic
Routes suggested by the
Markov model alone:
Routes suggested by the
Markov-Topic model:
29. Future Work
● Using the photographer's social network profile and friends list
● Consideration of opening hours, congestion and fees
● Evaluation in the field
30. Take-away points
● Creatively looking for data
● Building a complete system is a teaching experience.
● To build a system, it's frequently necessary to use a variety of (AI) methods – it's good to have a diverse mental toolbox, or a diverse team.
● Testing is important – quantitative experiments on a large-scale dataset.
● Statistically significantly better than the competitors.
● Mainly suited for heavily photographed areas frequented by Flickr users
Crandall D, Backstrom L, Huttenlocher D, Kleinberg J (2009) Mapping the world's photos. In: Proceedings of the 18th international conference on world wide web, pp 761–770
Popescu A, Grefenstette G, Moëllic P-A (2009) Mining tourist information from user-supplied collections. In: Proceedings of the 18th ACM conference on information and knowledge management, pp 1713–1716
Choudhury MD, Feldman M, Amer-Yahia S, et al (2010) Constructing travel itineraries from tagged geo-temporal breadcrumbs. In: Proceedings of the 19th international conference on world wide web, pp 1083–1084
Popescu A, Grefenstette G (2009) Deducing trip related information from Flickr. In: Proceedings of the 18th international conference on world wide web, pp 1183–1184
The number of Flickr users was taken from a 2011 advertising pitch: http://advertising.yahoo.com/article/flickr.html.
Note: the model assumes the visiting time at all landmarks is ZERO (“Japanese tourism”) – not exactly correct.
The larger the bandwidth, the more images the procedure groups together, since fewer of them fall outside the tails of the probability distribution function.
Video from http://www.youtube.com/watch?v=nuLYrSZ3fRo
Note the models are NOT independent. The current location is possibly affected by interests.
Probabilistic latent semantic analysis
A and B are independently conditioned on z if they are independent once z is known: <z, B> gives no more information about A than z alone.
Knowing the topic of a landmark is sufficient, no need to know the topic and the user
No need to go back two steps in the above graph
Proving the algorithm's convergence takes about six hours. It's an impressive proof; I recommend it.
Assume conditional independence: p(hu, lt-1 | lt) = p(hu | lt) p(lt-1 | lt).
p(lt | lt-1, hu) = p(lt-1, hu | lt) p(lt) / p(lt-1, hu)
using the assumption: = p(hu | lt) p(lt-1 | lt) p(lt) / p(lt-1, hu)
using Bayes on p(lt-1 | lt): = p(hu | lt) p(lt | lt-1) p(lt-1) / p(lt-1, hu)
using Bayes on p(hu | lt): = p(lt | hu) p(lt | lt-1) / p(lt) · p(hu) p(lt-1) / p(lt-1, hu)
The last factor does not depend on lt, so it can be folded into a normalizing constant; writing C(lt-1, hu) for its reciprocal:
p(lt | lt-1, hu) = topic · markov / (p(lt) · C(lt-1, hu)), i.e. P(lt | lt-1, hu) ∝ P(lt | hu) P(lt | lt-1) / P(lt).
“Sometimes” = if consecutive photos were merged.
If two locations are very close, everybody walks between them rather than taking a car or a train, so the estimates for the 3 transportation methods would be very similar.
5-fold cross-validation to avoid over-fitting
Compared strategies:
Multinomial model – simply recommend the most popular landmark
Markov model – considers user's location only
Topic model – considers user's interests only
Markov-Topic model
Trained with all sequences, removing the last landmark
Tested with all sequences
Tested by slicing off the given time period from all the sequences.
A bit less data since some sequences were shorter than the time period and were dismissed
(Assuming Google's estimation is good)
Estimated travel times are a little longer than Google's, since they include some of the visit time and since they are affected by weather, traffic, etc.
The system's estimates are actually based on real data from these specific routes, more so than Google's estimates are.