Tag Maps

Generating Summaries and Visualization for Large Collections of Geo-referenced Photographs

Published in: Travel, Business
4 Comments
  • I saw that. Cool.
  • Update: Check out the live version!
    http://tagmaps.research.yahoo.com
  • It's an internal Y! font, I removed them from the rest of this presentation but some remains were left... so, I don't think it's a problem :)
  • Dude, this slide got hosed!
    Sorry, looks like it's a font problem. We're looking into it...
  • Mor… joint work with Set up live demo? http://techdev3.search.corp.yahoo.com/semanticzoom/sf2.html
  • Transcript of "Tag Maps"

    1. Generating Summaries and Visualization for Large Collections of Geo-referenced Photographs
       Alexander Jaffe*, Mor Naaman*, Tamir Tassa†, Marc Davis$
       * Yahoo! Research Berkeley   † Open University of Israel   $ Yahoo! Research
    2. Attraction Map of Paris
         • Stanley Milgram, 1976.
         • Psychological Maps of Paris
    3. Attraction Map of London
         • Jaffe et al, 2006.
    4. Information Overload?
         • Flickr “geotagged”
    5. Overview
         • Problem definition
         • Intuition for solution
         • Algorithm for summarization
         • Visualizing the dataset
         • Evaluation
         • Demo?
    6. Problem Definition
         • Dataset:
             • (photo_id, user_id, latitude, longitude)
             • (photo_id, tag)
         • Result:
             • (photo_id, rank)
         • Given all photos from a geographic region, find a “representative” summary set
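The input and output tuples on the Problem Definition slide could be modeled as follows. This is a minimal sketch, not the authors' code: the names `Photo`, `group_tags`, and `to_result` are hypothetical, and lowercasing tags is an assumption made here for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    """One (photo_id, user_id, latitude, longitude) row of the dataset."""
    photo_id: int
    user_id: str
    latitude: float
    longitude: float


def group_tags(tag_rows):
    """Collapse (photo_id, tag) rows into a photo_id -> set-of-tags mapping.

    Lowercasing is an assumption of this sketch, not something the slides state.
    """
    tags = defaultdict(set)
    for photo_id, tag in tag_rows:
        tags[photo_id].add(tag.lower())
    return dict(tags)


def to_result(ordered_photo_ids):
    """Turn an ordered list of photo ids into (photo_id, rank) rows, rank 1 first."""
    return [(pid, rank) for rank, pid in enumerate(ordered_photo_ids, start=1)]
```

Keeping tags in a separate mapping mirrors the two-relation layout on the slide, where a photo can carry any number of tags.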
    7. Issues to Tackle
         • Noisy data
           Example tags: Whatever, color, city, spectrum, santa barbara, california, usa, Lookatme, Herbert Bayer Chromatic Gate
         • Photographer biases
             • In locations
             • In tags
         • Wrong data
    8. Intuition
         • More “activity” in a certain location indicates importance of that location
         • Tags that are unique to a certain location can suggest importance of that location
    9. (Very) Simple Example
    10. Algorithm Overview
         • Hierarchical clustering of the location data
         • For each cluster, generate a cluster score
         • Recursively generate an ordering of all photos in each cluster, based on subcluster scores and orderings
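The recursive ordering step can be sketched as below. The slides do not spell out how subcluster orderings are merged, so the merge rule here is an assumption: each subcluster first contributes its top photo in descending score order, and the remaining photos are then taken in rounds, with each subcluster's share per round proportional to its score. The `Cluster` class and `order_photos` function are hypothetical names, and cluster scores are taken as given.

```python
from dataclasses import dataclass, field
from math import ceil


@dataclass
class Cluster:
    score: float
    photos: list = field(default_factory=list)    # used by leaf clusters
    children: list = field(default_factory=list)  # used by internal clusters


def order_photos(cluster):
    """Return all photo ids under `cluster` as one ranked list."""
    if not cluster.children:
        return list(cluster.photos)

    # Rank each subcluster recursively, visiting higher scores first.
    kids = sorted(cluster.children, key=lambda c: c.score, reverse=True)
    queues = [(c.score, order_photos(c)) for c in kids]
    queues = [(s, q) for s, q in queues if q]
    if not queues:
        return []

    # Assumed merge rule, part 1: the top photo of every subcluster comes first.
    ranked = [q.pop(0) for _, q in queues]

    # Assumed merge rule, part 2: take the rest in rounds, giving each
    # subcluster a quota proportional to its score (at least one per round).
    low = min(s for s, _ in queues)
    while any(q for _, q in queues):
        for s, q in queues:
            quota = ceil(s / low) if low > 0 else 1
            ranked.extend(q[:quota])
            del q[:quota]
    return ranked
```

For instance, under this assumed rule a parent with subclusters ordered `[4, 6, 5]` (score 20) and `[8, 7]` (score 10) yields `[4, 8, 6, 5, 7]`. Those inputs are one reading of the numbers on the example slide that follows, not something the slides state explicitly.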
    11. The Clustered Return of the (Very) Simple Example!
         (figure labels: 4, 6, 5; 8, 7; 4, 8, 6, 5, 7; 20; 10)
    12. Generating a Summary
         • A complete ranking is produced for all photos in the dataset
         • An n-photo summary is simply the first n photos in this ranking.
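Because the summary is a prefix of the full ranking, producing it is a single slice. A minimal sketch follows; the function name `summarize` is hypothetical.

```python
def summarize(ranking, n):
    """Return the first n photo ids of a complete ranking."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return list(ranking[:n])
```

One consequence of this design is that summaries are nested: the n-photo summary is always contained in the (n+1)-photo summary, so the ranking only has to be computed once for every summary size.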
    13. Generating Cluster Scores
         • Main factors:
             • Number of photos
             • Relevance (bias) factors
             • “Tag Distinguishability”
             • “Photographer Distinguishability”
    14. Tag Distinguishability
         • A measure of uniqueness of concepts represented in the cluster (“document”)
         • TF/IDF based
             • Compute frequency of each tag (TF)
             • Compute (inverse) frequency of tag in the rest of the dataset (IDF)
             • Aggregate TF/IDF over all tags in cluster using L2 norm
         • Or, if you like formulas:
           Read the damn paper!
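The TF/IDF steps listed on the Tag Distinguishability slide can be sketched as below. The exact formula is in the paper, not on the slide, so the definitions here are assumptions: TF is the fraction of cluster photos carrying a tag, IDF is the total photo count divided by the number of photos carrying the tag anywhere in the dataset, and the per-tag products are combined with an L2 norm. The function name `tag_distinguishability` is hypothetical.

```python
from collections import Counter
from math import sqrt


def tag_distinguishability(cluster_tags, all_tags):
    """Score how distinctive a cluster's tags are.

    cluster_tags: list of tag sets, one per photo in the cluster.
    all_tags:     list of tag sets, one per photo in the whole dataset
                  (assumed to include the cluster's photos).
    """
    if not cluster_tags:
        return 0.0
    in_cluster = Counter(t for tags in cluster_tags for t in tags)
    in_dataset = Counter(t for tags in all_tags for t in tags)
    total = len(all_tags)

    weights = []
    for tag, count in in_cluster.items():
        tf = count / len(cluster_tags)   # assumed TF definition
        idf = total / in_dataset[tag]    # assumed IDF definition (no log)
        weights.append(tf * idf)

    # L2 norm over all tags in the cluster, as the slide describes.
    return sqrt(sum(w * w for w in weights))
```

With four photos tagged `{"bridge", "sf"}`, `{"bridge", "sf"}`, `{"sf"}`, `{"sf"}`, the cluster made of the first two scores higher than the cluster made of the last two, since "bridge" is rarer across the dataset than "sf".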
    15. Summary of San Francisco
         (photo labels: Golden Gate Bridge, TransAmerica, AT&T Baseball Park, Golden Gate, Twin Peaks, Golden Gate, Bay Bridge, Ocean Beach, Chinatown)
    16. Progress Bar (almost done)
         • Problem definition
         • Intuition for solution
         • Algorithm for summarization
         • Visualizing the dataset
         • Evaluation
         • Demo?
    17. Tag Maps
         • Observation:
             • The algorithm identifies “representative” locations
             • The algorithm identifies unique, important tags
         • Can be used to visualize the dataset!
    18. Tag Maps
    19. Tag Maps
    20. Ok, how do we evaluate this?
         • Direct human evaluation of algorithmic results
             • Evaluated Tag Maps with various weighting options
             • Compared summaries to 3 base conditions
         • Compared chosen locations to top 15 locations selected by humans (Milgram-style)
    21. Maybe we have time for a demo
    22. Maybe we have time for Q’s
         • http://zonetag.research.yahoo.com
           (applied in prototype cameraphone app)
         • http://blog.yahooresearchberkeley.com
           (more on this and other topics)
         • Become an intern, get involved: Email me.
           Mor Naaman
           [email_address]
