Kelm Overview 2013



  1. Pascal Kelm, Kelm@nue.tu-berlin.de, Communication Systems Group, www.nue.tu-berlin.de, Technische Universität Berlin. Thursday, 24 January 2013
  2. Overview. Kelm: “Where in the World?: The State of Automatic Geotagging of Video”
  3. Motivation – Where in the world is it?
  4. Example. http://www.flickr.com/photos/zebandrews/7414117752/in/pool-18038320@N00/ Fact: only 3% of the content on online sharing platforms is available with geographic coordinates (latitude, longitude).
  5. State of the Art. How would you estimate the location of unknown content? Textual information: tags (Paris, France, twilight, grand blue, Europe, Hasselblad, film, …); gazetteers like geonames.org; textual similarity – finding the similarity to a group of toponyms. Visual information: local features – interest points on the object are extracted to provide a “feature description” of the object (SIFT, SURF); low-level features – propagate the location by finding visually similar images (texture, color, shape, …). [Pascal Kelm: “Where in the World?: The State of Automatic Geotagging of Video”, invited lecture, DGA workshop 2012] [Pascal Kelm et al.: “Georeferencing in Social Networks”, in Social Media Retrieval, Springer, 2012]
  6. Relevant Research 1. 2008: James Hays, Alexei A. Efros. IM2GPS: estimating geographic information from a single image. Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR, “Where am I?”). Purely data-driven scene-matching approach (over 6 million GPS-tagged images, 5 low-level descriptors). Visual ambiguity; low precision; high computational cost (cluster of 400 processors, 3 days).
  7. Relevant Research 2. 2009: Pavel Serdyukov, Vanessa Murdock, Roelof van Zwol: Placing Flickr Photos on a Map. In: 32nd International ACM SIGIR. Images with “palma” tag falsely mapped near Palma de Mallorca, Spain. Textually annotated language model (ranking). Geographical/textual ambiguity; high precision; high computational cost.
  8. Research Questions. What is the limitation of an automatic algorithm? Which feature (text, video) performs best? Is a fusion possible to eliminate geographical ambiguity? Do I need a CPU cluster to estimate the location? Does low performance imply low precision? Is it possible for a human to estimate the location of a video using textual, visual and audio information?
  9. Placing Task. The task requires participants to assign geographical coordinates to each provided test video. Participants can make use of metadata, audio and visual features, as well as external resources. Organizers: Pascal Kelm, TU Berlin; Adam Rae, Yahoo! Research. [Adam Rae, Pascal Kelm: “Working Notes for the Placing Task at MediaEval 2012”, Working Notes Proceedings (ISSN 1613-0073) of MediaEval 2012]
  10. Image Distribution. Flickr database: 3.6 million training images, 10,000 training videos, 5,091 test videos. Descriptors: 1. Color and Edge Directivity Descriptor, 2. Gabor, 3. Fuzzy Color and Texture Histogram, 4. Color Histogram, 5. Scalable Color, 6. Auto Color Correlogram, 7. Tamura, 8. Edge Histogram, 9. Color Layout. Metadata: all information about uploader and video.
  11. Overview Framework. National borders are extracted from the metadata; textual and visual features are used in a hierarchical framework to predict the most likely location. [Pascal Kelm, Sebastian Schmiedeke, Thomas Sikora: “Multimodal Geo-tagging in Social Media Websites using Hierarchical Spatial Segmentation”, Proceedings of the 20th ACM SIGSPATIAL 2012]
  12. Collaborative Systems: Example. 這是我上次去巴黎。在那裡,我得到了我的城堡在迪斯尼樂園看。…
  13. Geographical Ambiguity. 這是我上次去巴黎。在那裡,我得到了我的城堡在迪斯尼樂園看。… Which language is it? Chinese: “This was my last trip to Paris. I visited the castle in Disneyland…” Which words give us information? Tags? Trip, Paris, Castle, Disneyland. Which of these nouns carry geographical information? Paris, Disneyland.
  14. Geographical Ambiguity. Candidate countries per toponym – Paris: France, China, Canada, USA, …; Disneyland: France, USA, Puerto Rico, …. The detected country maximizes the rank sum over all toponyms: c_detected = argmax_{c_i} Σ_{j=0}^{N−1} R_j(c_i), where R_j(c_i) is the rank of country c_i for toponym j, and N is the number of toponyms. [Pascal Kelm, Sebastian Schmiedeke, Thomas Sikora: “A Hierarchical, Multi-modal Approach for Placing Videos on the Map using Millions of Flickr Photographs”, ACM Multimedia 2011]
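The rank-sum disambiguation on this slide can be sketched as follows. This is a hedged illustration, not the paper's implementation: the toy gazetteer below is invented, and positional ranks (0 = best) are summed and minimized, which corresponds to the slide's argmax over rank sums when higher rank means better.

```python
# Hypothetical sketch of rank-sum country disambiguation: each toponym
# contributes a ranked list of candidate countries; the country with the
# best (lowest) total positional rank wins. Gazetteer data is invented.
gazetteer = {
    "paris":      ["France", "USA", "Canada", "China"],
    "disneyland": ["France", "USA", "Puerto Rico"],
}

def detect_country(toponyms):
    """Sum each candidate country's positional rank over all toponyms
    (countries missing from a toponym's list get a penalty rank)."""
    candidates = {c for t in toponyms for c in gazetteer.get(t, [])}
    def rank_sum(country):
        total = 0
        for t in toponyms:
            ranks = gazetteer.get(t, [])
            total += ranks.index(country) if country in ranks else len(ranks) + 1
        return total
    return min(candidates, key=rank_sum)

print(detect_country(["paris", "disneyland"]))  # "France"
```

With both "Paris" and "Disneyland" ranking France first, the ambiguity between Disneyland in the USA and Disneyland Paris is resolved toward France.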
  15. Overview Framework. National borders extracted from the metadata. Textual and visual features are used in a hierarchical framework to predict the most likely location.
  16. Example. http://www.flickr.com/photos/62285085@N00/3484324495
  17. Textual Region Model. Segmenting the world map into regions according to the meridians and parallels. Stemming: reducing inflected words to their root form (example: Bounds Crossing, Florida, USA). Text → Porter stemmer: Bream Vortex → Bream Vortex; Swimming → Swim; Ocean → Ocean; Beach → Beach; Springs Vortex → Springs Vortex; Scuba Diving → Scuba Dive; Scuba Underwater → Scuba Underwat; …
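The suffix-stripping idea behind the Porter stemmer on this slide can be illustrated with a deliberately tiny sketch. This is not the real Porter algorithm (which has many more rules; in practice one would use e.g. `nltk.stem.PorterStemmer`), just a toy showing the principle of stripping inflectional suffixes and collapsing doubled consonants.

```python
# Toy suffix stripper in the spirit of the Porter stemmer; the real
# algorithm has many more rules and conditions. For illustration only.
def toy_stem(word):
    w = word.lower()
    for suffix in ("ing", "ed", "er", "s"):
        # only strip if a plausible root (>= 3 letters) remains
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            w = w[: -len(suffix)]
            break
    # collapse a doubled final consonant left over after stripping
    if len(w) >= 2 and w[-1] == w[-2] and w[-1] not in "aeiou":
        w = w[:-1]
    return w

for tag in ["Swimming", "Springs", "Ocean", "Beach"]:
    print(tag, "->", toy_stem(tag))  # e.g. Swimming -> swim
```

Mapping "Swimming" and "Swim" to the same root is what lets the region model count them as the same term.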
  18. Textual Region Model. Term–location distribution (Laplace-smoothed): P(t | l) = (N_{t,l} + 1) / Σ_{t ∈ V} (N_{t,l} + 1). Term frequency–inverse document frequency: tfidf_t = N_{t,l} · log(N / n_t). The predicted location maximizes P(l | d) ∝ Σ_{i=0}^{N−1} log P(t_i | l).
  19. Textual Region Model. Bernoulli model: P(t | c) = (N_{t,c} + 1) / Σ_{t ∈ V} (N_{t,c} + 1), where t = tag and c = class/region. Example tags: Bream Vortex, Swim, Ocean, Beach, Springs Vortex, Scuba Dive, Scuba Underwat, …
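The smoothed term-region distribution from the last two slides can be sketched in Naive Bayes fashion. This is a minimal illustration under invented data: the two regions and their training tags below are made up, and only the Laplace-smoothed P(t | c) with a log-sum decision is shown, not the full hierarchical model.

```python
# Sketch of the slide's smoothed term-region model:
#   P(t|c) = (N_{t,c} + 1) / sum_{t' in V} (N_{t',c} + 1)
# with a Naive-Bayes-style argmax over regions. Training data invented.
import math

training = {
    "florida": ["beach", "ocean", "scuba", "dive", "spring", "vortex"],
    "alps":    ["mountain", "snow", "ski", "hike"],
}

vocab = {t for tags in training.values() for t in tags}

def p_term_given_region(term, region):
    counts = training[region]
    num = counts.count(term) + 1                       # Laplace smoothing
    den = sum(counts.count(t) + 1 for t in vocab)
    return num / den

def most_likely_region(tags):
    def log_score(region):
        return sum(math.log(p_term_given_region(t, region)) for t in tags)
    return max(training, key=log_score)

print(most_likely_region(["scuba", "dive", "ocean"]))  # "florida"
```

The +1 smoothing keeps unseen tags from zeroing out a region's probability, which matters with sparse user-supplied tags.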
  20. Visual Region Model. Returns the visually most similar areas, which are represented by a mean feature vector of all training images and videos of the respective area.
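The mean-vector matching just described can be sketched as follows. The two-dimensional "feature vectors" below are invented stand-ins for real descriptors such as color histograms or Tamura features, and the region names are hypothetical.

```python
# Sketch of the visual region model: each spatial segment is summarized
# by the mean feature vector of its training images, and a query is
# assigned to the region with the closest mean. Data is invented.
import math

region_features = {
    "region_1": [[0.9, 0.1], [0.8, 0.2]],
    "region_2": [[0.1, 0.9], [0.2, 0.8]],
}

def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

region_means = {r: mean_vector(vs) for r, vs in region_features.items()}

def most_similar_region(query):
    # Euclidean distance to each region's mean feature vector
    return min(region_means, key=lambda r: math.dist(query, region_means[r]))

print(most_similar_region([0.85, 0.15]))  # "region_1"
```

Reducing each region to a single mean vector is what keeps the visual lookup cheap compared to matching against every training image.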
  21. What is meant by Spatial Segmentation? The world map is iteratively divided into segments of different sizes. Each segment is considered as a class for our probabilistic model. [Pascal Kelm, Sebastian Schmiedeke, Thomas Sikora: “How Spatial Segmentation improves the Multimodal Geo-Tagging”, Working Notes Proceedings of MediaEval 2012]
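The iterative division into segments of different sizes can be sketched as a quadtree over latitude/longitude. This is an assumption-laden illustration: the split criterion here (a fixed photo-count threshold) and the depth limit are invented, not the paper's exact rule.

```python
# Illustrative quadtree-style spatial segmentation: a cell is split into
# four sub-cells while it holds more than max_photos training photos.
# Split criterion and depth limit are assumptions for illustration.
def segment(bounds, photos, max_photos=2, depth=0, max_depth=5):
    """bounds = (lat_min, lat_max, lon_min, lon_max); photos = [(lat, lon)]."""
    inside = [(la, lo) for la, lo in photos
              if bounds[0] <= la < bounds[1] and bounds[2] <= lo < bounds[3]]
    if len(inside) <= max_photos or depth >= max_depth:
        return [bounds] if inside else []   # drop empty cells entirely
    la0, la1, lo0, lo1 = bounds
    lam, lom = (la0 + la1) / 2, (lo0 + lo1) / 2
    cells = []
    for sub in [(la0, lam, lo0, lom), (la0, lam, lom, lo1),
                (lam, la1, lo0, lom), (lam, la1, lom, lo1)]:
        cells += segment(sub, inside, max_photos, depth + 1, max_depth)
    return cells

# Three photos near Paris, one near New York: dense areas get small
# cells, sparse areas stay coarse.
photos = [(48.8, 2.3), (48.9, 2.4), (48.85, 2.35), (40.7, -74.0)]
cells = segment((-90, 90, -180, 180), photos)
print(len(cells))
```

Each resulting cell then serves as one class for the probabilistic region models, so densely photographed areas get finer-grained classes.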
  22. Fusion: Example. Confidence scores of the visual approach (right) restricted to be in the most likely spatial segment determined by the textual approach (left).
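The restriction step can be sketched in a few lines. All scores, region names, and the region-to-cell mapping below are invented for illustration; the point is only the mechanism of letting the textual model veto visual candidates.

```python
# Sketch of the fusion on this slide: visual confidence scores are only
# considered inside the spatial segment the textual model ranked highest.
# All scores and names below are invented.
textual_scores = {"region_A": 0.7, "region_B": 0.2, "region_C": 0.1}
visual_scores  = {"cell_A1": 0.3, "cell_A2": 0.6, "cell_B1": 0.9}
cells_in_region = {"region_A": ["cell_A1", "cell_A2"],
                   "region_B": ["cell_B1"], "region_C": []}

best_region = max(textual_scores, key=textual_scores.get)
allowed = cells_in_region[best_region]
best_cell = max(allowed, key=visual_scores.get)
print(best_region, best_cell)  # region_A cell_A2
```

Note that the globally best visual cell (cell_B1, score 0.9) is rejected because it lies outside the textually most likely region; this is exactly how the fusion eliminates geographical ambiguity and shrinks the search space for the visual step.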
  23. Results. [UNICAMP] O. A. B. Penatti, L. T. Li, J. Almeida, R. da S. Torres: A visual approach for video geocoding using bag-of-scenes. ICMR 2012. [QMUL] X. Sevillano, T. Piatrik, K. Chandramouli, Q. Zhang, E. Izquierdo: Geo-tagging online videos using semantic expansion and visual analysis.
  24. Conclusion. Hierarchical approach for automatic estimation of geo-tags in social media websites. Detailed analysis of textual and visual features using different spatial granularities (national border detection). Fusion of textual and visual methods is important to eliminate geographical ambiguities and reduces the computing time in the subsequent classification step. Half of the test set is correctly located within a radius of 10 km.
  25. Web demonstrator: http://geotagging.de.im
  26. Geo-Location Human Baseline Project
  27. Geo-Location Human Baseline Project. http://geotagging.de.im/game.php [Gottlieb, Choi, Kelm, Friedland, Sikora: “Pushing the Limits of Mechanical Turk: Qualifying the Crowd for Video Geo-Location”, ACM Workshop on Crowdsourcing for Multimedia, held in conjunction with ACM Multimedia 2012] [Gottlieb, Choi, Kelm, Friedland, Sikora: “On Pushing the Limits of Mechanical Turk: Qualifying the Crowd for Video Geolocation”, Multimedia Communications Technical Committee, IEEE Communications Society, Vol. 8, No. 1, January 2013]
  28. Object Detection. Frame 370, Frame 35.
  29. Augmented Object Detection. OpenCV for Android; feature detectors: FAST, ORB, BRISK, SURF; geo-referenced database; business-card example. SURF timings – CPU: 192 ms, GPU: 87 ms, Android: 9,990 ms.
  30. Object Detection. Depth map, matching map.
  31. Graph-based Object Detection. Matching.
  32. DFG Proposal. Housebreaking, cyber-stealing, cyber-mobbing, cyber-stalking.
  33. DFG Proposal: Geo-Privacy
  34. Questions. Thanks for your attention. Dipl.-Ing. Pascal Kelm, Communication Systems Group, Technische Universität Berlin, Sekr. EN1, Einsteinufer 17, 10587 Berlin, Germany. E-mail: Kelm@nue.tu-berlin.de. Phone: (+49) 30 / 314 28504.
  35. DFG: Geo-Tagging
  36. Spatial Segmentation
  37. Twitter-based Placing Sub-Task (New York)
  38. Spatial Segmentation
  39. Extracted geographical items. Raw tags: kauii hawaii usa → extracted: 00001: hawaii, kauai, usa.
  40. Textual Features + Naive Bayes
  41. Visual Features. What will you do if you do not have any textual information?
  42. Fusion. The textual region model and the visual region model each produce a ranking over regions 1 … N; after geographical boundaries extraction, the final ranking covers only the remaining candidate regions together with their matching pictures.
