Extracting event and place semantics from Flickr tags

Presentation at SIGIR 2007.

  • Dog photo used under a CC license, from http://flickr.com/photos/pellegrini/5419830/
  • Mor… star interns One way to extract knowledge from Flickr tags. This is not about image search - it’s not even about photos

Transcript

  • 1. Towards Automatic Extraction of Event and Place Semantics from Flickr Tags
    • Tye Rattenbury, Nathan Good, Mor Naaman*
    • Yahoo! Research Berkeley (Yahoo! Advanced Development Division)
  • 2. This Talk is Not About Photos
    • Geo-referenced data
      • Photos
      • Blog posts (GeoRSS)
      • Geo-annotated Web pages
      • KML-based collections
    • Similar issues, similar potential
  • 3. Find the Differences: “Yahoo! Headquarters”, “Dog”, “SIGIR 2007”
  • 4. Data Description
  • 5. Simple Model
    • (photo_id, user_id, time, latitude, longitude)
    • (photo_id, tag)
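
The two relations on this slide could be written down as typed records. This is only a sketch: the field types, the choice of a Unix timestamp for `time`, and the sample values are assumptions, not something the slides specify.

```python
from typing import NamedTuple


class Photo(NamedTuple):
    """One row of (photo_id, user_id, time, latitude, longitude)."""
    photo_id: int
    user_id: str
    time: float       # assumed: Unix timestamp in seconds
    latitude: float   # assumed: decimal degrees
    longitude: float  # assumed: decimal degrees


class PhotoTag(NamedTuple):
    """One row of (photo_id, tag)."""
    photo_id: int
    tag: str


# Made-up sample rows, for illustration only.
photos = [Photo(1, "u1", 1185235200.0, 37.7749, -122.4194)]
photo_tags = [PhotoTag(1, "dog"), PhotoTag(1, "san francisco")]
```
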
  • 6. Information?
    • Flickr “geotagged” in San Francisco
  • 7. Tag Maps - San Francisco
  • 8. Tag-based Semantics
    • Derive meaningful data about individual tags
    • Based on the tag’s metadata patterns
    • E.g., “Yahoo! Headquarters”, “SIGIR2007”, “Dog”.
  • 9. Tag Semantics: Why?
    • Improved image (and web!) search through query semantics
    • Automatic place- and event-gazetteers
    • Association of missing time/place data based on tags
    • Collection visualization
  • 10. Next Few Minutes
    • Extended model
    • Sample data
    • Problem definition
  • 11. Extended Model
    • (photo_id, user_id, time, latitude, longitude)
    • (photo_id, tag)
    • (tag, location)
    • (tag, time)
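
One plausible way to obtain the (tag, location) and (tag, time) relations is to join the tag rows with the photo rows on photo_id. The sketch below assumes plain tuples in the order shown on the slide; the function name `extend_model` is hypothetical.

```python
from collections import defaultdict


def extend_model(photos, photo_tags):
    """Join (photo_id, tag) rows with photo rows to get per-tag usage.

    photos: iterable of (photo_id, user_id, time, latitude, longitude)
    photo_tags: iterable of (photo_id, tag)
    Returns (tag -> list of (lat, lon), tag -> list of times).
    """
    by_id = {p[0]: p for p in photos}
    tag_locations = defaultdict(list)
    tag_times = defaultdict(list)
    for photo_id, tag in photo_tags:
        photo = by_id.get(photo_id)
        if photo is None:
            continue  # tag row refers to a photo we do not have
        _, _, time, lat, lon = photo
        tag_locations[tag].append((lat, lon))
        tag_times[tag].append(time)
    return dict(tag_locations), dict(tag_times)
```

Each tag then carries its own usage distribution in time and space, which is the input the later slides work with.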
  • 12. Tag Patterns
  • 13. Tag Patterns
  • 14. Tag Patterns
  • 15. Definitions
    • Event tag: expected to exhibit significant time-based patterns
    • Location tag: expected to exhibit significant place-based patterns
  • 16. Problem Definition
    • Can we extract event/place semantics from unstructured tags given their usage distribution (time and location patterns)?
  • 17. Fundamental Issues
    • “Regions” of semantic relevance
      • “church” in a neighborhood is a place
      • “carnival” in Rio de Janeiro is an event
      • “California” in San Francisco is not a place
    • Semantic ambiguity
      • “apple”
      • “turkey”
      • “pride”
  • 18. Proposed Solutions
    • Naïve Scan
      • Burst detection
    • Spatial Scan Statistics
      • Borrowed from disease outbreak detection
    • Scale-structure method
      • Examine the “structure” of the tag’s patterns in multiple scales
  • 19. Next Few Minutes
    • Intuition behind the three methods
    • Evaluation
    • Summary
  • 20. Naïve Scan
  • 21. Spatial Scan
  • 22. Issues
    • First two methods just look for a local burst in usage (in either time or space)
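
To make "a local burst in usage" concrete, here is a rough sketch for the time dimension. It is not the authors' Naïve Scan: the fixed-width bins and the mean-plus-k-standard-deviations rule are assumptions chosen only for illustration, and `burst_bins` is a hypothetical name.

```python
from statistics import mean, pstdev


def burst_bins(timestamps, bin_width, k=2.0):
    """Return indices of time bins whose usage count is unusually high.

    A bin counts as a burst when its count exceeds
    mean + k * standard deviation of all bin counts (assumed rule).
    """
    if not timestamps:
        return []
    start = min(timestamps)
    n_bins = int((max(timestamps) - start) // bin_width) + 1
    counts = [0] * n_bins
    for t in timestamps:
        counts[int((t - start) // bin_width)] += 1
    threshold = mean(counts) + k * pstdev(counts)
    return [i for i, c in enumerate(counts) if c > threshold]


# Made-up usage times: steady use with a spike between 50 and 55.
usage = [0, 10, 20, 30, 40, 50, 51, 52, 53, 54, 55, 60, 70, 80, 90]
print(burst_bins(usage, bin_width=10))
```

A check like this depends on the chosen bin width, so it inspects usage at one scale only.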
  • 23. Scale-structure Identification
    • Select a scale
      • Find the structure of the data for the scale
      • Calculate the entropy of the structure
    • Increase scale, repeat
    • Is the information “well organized”?
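
The loop above can be sketched for timestamps as follows. The slides do not say how "structure" is found, so the gap-based clustering here is an assumption, as are the base-2 logarithm and all function names.

```python
import math


def clusters_at_scale(timestamps, scale):
    """Cluster sizes: a gap larger than `scale` starts a new cluster."""
    ordered = sorted(timestamps)
    if not ordered:
        return []
    sizes = [1]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > scale:
            sizes.append(1)
        else:
            sizes[-1] += 1
    return sizes


def entropy(sizes):
    """Shannon entropy (base 2, assumed) of the cluster-size distribution."""
    if len(sizes) <= 1:
        return 0.0
    total = sum(sizes)
    return -sum((s / total) * math.log2(s / total) for s in sizes)


def scale_structure(timestamps, scales):
    """For each scale: find the structure, then compute its entropy."""
    return [(r, entropy(clusters_at_scale(timestamps, r))) for r in scales]


# Made-up usage times: two groups, one larger than the other.
usage = [0, 1, 2, 100, 101, 102, 103, 104]
for scale, h in scale_structure(usage, [0.5, 5, 200]):
    print(scale, round(h, 3))
```

Under this reading, entropy that stays low across scales suggests the tag's usage is concentrated in few clusters, which is one way to interpret "well organized".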
  • 24. Scale-structure Identification
  • 25. Scale-structure Identification (“Barry Bonds” tag)
    • Entropy = 1.421621
    • Entropy = 0.766263
    • Entropy = 0.555915
    • Entropy = 0.062760
  • 26. San Francisco Experiments
    • ~43k photos
    • ~800 tags
    • San Francisco dataset: 42,000 photos, 800+ popular tags
  • 27. Experiments. Tag: “byobw”. Ground Truth: BYOBW?
  • 28. Experiments
    • Given a ground truth:
      • Run the three methods, vary thresholds
      • Precision vs. recall
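
The evaluation loop on this slide might look like the sketch below: each method is assumed to assign every tag a score, and a tag is predicted as an event (or place) tag when its score reaches the threshold. The scores and the ground-truth labels here are invented for illustration; they are not results from the talk.

```python
def precision_recall(scores, ground_truth, threshold):
    """Precision and recall of the tags scoring at or above `threshold`."""
    predicted = {tag for tag, s in scores.items() if s >= threshold}
    true_pos = len(predicted & ground_truth)
    # Convention (assumed): an empty prediction set has precision 1.0.
    precision = true_pos / len(predicted) if predicted else 1.0
    recall = true_pos / len(ground_truth) if ground_truth else 1.0
    return precision, recall


def sweep(scores, ground_truth, thresholds):
    """One (threshold, precision, recall) point per threshold."""
    return [(t, *precision_recall(scores, ground_truth, t)) for t in thresholds]


# Hypothetical per-tag scores and hypothetical event labels.
scores = {"carnival": 0.9, "pride": 0.7, "dog": 0.4, "church": 0.1}
event_tags = {"carnival", "pride"}
for point in sweep(scores, event_tags, [0.8, 0.5, 0.3]):
    print(point)
```

Plotting these points for each method gives one precision-recall curve per method.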
  • 29. The Obligatory Slide Where I Show You that Our Curve is Above the Other Curves
  • 30. Experiments
  • 31. Future Work
    • Multiple regions
    • Different spatially and temporally encoded data (e.g., blogs)
    • Other metadata domains besides time and space
      • Visual features: shape primitives, color
      • Semantic features
  • 32. Questions?
    • Tye Rattenbury, Nathan Good, Mor Naaman
    • Y!RB / Y!A.D.D.
    • Read more, follow: http://www.whyrb.com
    • Slides: http://slideshare.net/mor
    • Contact: mor@yahoo-inc.com
  • 33. Tag Maps - Paris - Les Blogs?
  • 34. Issues
    • Noisy data
    • Photographer biases
      • In locations
      • In Tags
    • Wrong data
    Whatever, color, city, spectrum, santa barbara, california, usa, Lookatme, Herbert Bayer Chromatic Gate