Event Clusters Detection on Flickr Images using a Suffix-Tree Structure


  1. Event Cluster Detection on Flickr Images using a Suffix-Tree Structure
     Massimiliano Ruocco and Heri Ramampiaro
     Dept. of Computer and Information Science, Norwegian University of Science and Technology
     ruocco@idi.ntnu.no
     Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  3. Outline
     1. Introduction
        1. Problem Statement
        2. Related Works
        3. Contributions
     2. Proposed approach
        1. Problem definition
        2. Preliminary
        3. Algorithm Overview
     3. Evaluation
     4. Conclusions
  7. Problem Statement - Event Detection
     - Event detection originates from the TDT (Topic Detection and Tracking) project (1)
     - Objective: aggregate stories over time into a single event topic
     - Event: "Something happening in a certain place at a certain time" [Yang, Pierce, Carbonell 1999]
     (1) http://projects.ldc.upenn.edu/TDT/
  11. Problem Statement - Event Detection
      - Most previous work focuses on time-tagged document streams and can be classified as:
        - Retrospective detection: discover unidentified events in a collection of news [Yang et al. 1998]
        - Online detection: detect events in real time from a stream of news [Brants et al. 2003]
  15. Problem Statement - Web Photo-Sharing Apps: New Needs
      - Huge amount of pictures, each with metadata such as:
        - Time: 26 Oct 2010
        - User: RMax
        - Location: 26:12, 23:14
        - Tags: Roma, Sky, Bridge, …
      - New needs: Retrieve, Browse, Knowledge Extraction
  18. Problem Statement - Challenges
      - Event detection on tagged pictures from photo-sharing apps
      - Web-scale environment
      - Use of contextual information
      - Noisy annotations
  22. Related Works
      - Event Clustering (visual/temporal information) [Loui, Savakis 2002]
        - Albuming of user photo collections
        - Not scalable to large datasets
        - Limited to user photo collections
        - No location information
      - Event/Place Semantic Identification (temporal information) [Rattenbury et al. 2007]
        - Extraction of event and place semantics for tags assigned to Flickr photos
        - Scale-Structure Identification (SSI) method to analyze the tag usage distribution
        - SSI is limited on large datasets
        - Location information is not considered
      - Event Tag Detection (spatial/temporal information) [Chen, Roy 2009]
        - Detects event tags from Flickr photos
        - Like [Rattenbury et al. 2007], uses the SSI method to analyze the tag usage distribution
        - SSI is applied over the spatial and temporal distributions simultaneously
  27. Problem Definition - Hypothesis
      - "Something happening in a certain place at a certain time" [Yang, Pierce, Carbonell 1999]
      - Something happening in a certain place at a certain time with a certain tag
      - Event cluster: $e_k \Rightarrow \{\, t_i = t_j,\ dt_i = dt_j,\ g_i = g_j \quad \forall\, I_i, I_j \in e_k \,\}$
      - Not the opposite
  30. Problem Definition - Hypothesis: Landmark Clusters
      [Figure: location vs. time plot; images tagged "colosseo" at location g, time interval dt marked; landmark clusters and event cluster e_k]
      - Event cluster: $e_k \Rightarrow \{\, t_i = t_j,\ dt_i = dt_j,\ g_i = g_j \quad \forall\, I_i, I_j \in e_k \,\}$
      - Not the opposite
  36. Problem Definition - Hypothesis: Event Clusters
      [Figure: location vs. time plot; images tagged "applepies" at location g, time interval dt marked; landmark clusters and event clusters]
      - Event cluster: $e_k \Rightarrow \{\, t_i = t_j,\ dt_i = dt_j,\ g_i = g_j \quad \forall\, I_i, I_j \in e_k \,\}$
      - The opposite is true
  38. Problem Definition - New Formulation
      [Figure: two location vs. time plots for the tag "applepies" at location g, one labelled S_dgt (time interval dt marked) and one labelled S_gt, shown as equal]
      - Event cluster $e_k$: $\exists\, (dt, g, t) : S_{dgt} = S_{gt}$, with $S_{dgt} = e_k$
  47. Preliminary - Suffix-Tree Clustering (STC) [Zamir 1998]
      - Suffix-tree based
      - Mainly used in text (web) document clustering
      - Three-step process:
        1. Document cleaning
        2. Base cluster identification
        3. Base cluster merging
      - Incremental clustering
      - Cluster label inferred from the tree structure
      - Phrase-based model
      - Snippet-tolerant
      - Overlapping clusters
  51. Preliminary - Suffix-Tree
      - Given a string S, its suffix tree is a compact trie containing all the suffixes of S
        - Rooted directed tree
        - Each internal node other than the root has at least two children
        - Each edge leaving a particular node is labelled with a non-empty substring of S
      - Example: S = "Papua", with suffixes 'apua', 'pua', 'ua', 'a'
      - Suffix-tree construction runs in linear time, O(n) [Ukkonen 1995]
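To make the definition concrete, here is a minimal sketch that lists the suffixes of the example string (lower-cased) and stores them in an uncompressed suffix trie built by naive insertion. It is a quadratic-time stand-in for illustration only: it is neither a compact trie nor Ukkonen's linear-time construction, and the function names are hypothetical.

```python
def suffixes(s):
    """Return all non-empty suffixes of s, longest first."""
    return [s[i:] for i in range(len(s))]


def build_suffix_trie(s):
    """Naive O(n^2) suffix trie as nested dicts, one character per edge."""
    root = {}
    for suffix in suffixes(s):
        node = root
        for ch in suffix:
            node = node.setdefault(ch, {})
        node["$"] = True  # end-of-suffix marker (assumes "$" is not in s)
    return root


def contains_substring(trie, pattern):
    """Every substring of s is a prefix of some suffix, so walk from the root."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("papua")
print(suffixes("papua"))
print(contains_substring(trie, "pua"), contains_substring(trie, "aa"))
```

A compact trie would additionally merge every chain of single-child nodes into one edge labelled with a substring, which is what gives the "at least two children" property listed above.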
  53. Algorithm Overview
      - Input: images $I_i = (T, g, dt)$
      - Steps: Data cleaning → Data extension → Suffix-tree construction → Event cluster extraction → Event cluster merge
      - Output: labelled event clusters, e.g. {Primary, Party, Election, Campaign, …}, {Concert, Music, John, …}
  60. Algorithm Overview - Data Cleaning and Extension
      - Cleaning: $I_i = (T, g, dt) \rightarrow I_i' = (T', g, dt)$
        - Stopword removal (with extended vocabulary) + stemming
      - Extension: $I_i' = (T', g, dt) \rightarrow I_i'' = (T'', g, dt)$
        - Spatial and temporal information is encoded in the annotation set T:
          $T'' = \{ t''_1, \dots, t''_l \}$, with $t''_i = [\, s_1(dt) + s_2(g) + t_i \,]$
        - $s_1$ and $s_2$ are encoding functions from date and location to string
        - $s_1$ and $s_2$ define the granularity in space (geographical grid) and time
      - Example, with s1(26/10/2010) = 26Oct2010 and s2(43.777864, 11.249029) = 43.77:11.24:
          T'            T''
          acmm2010      26Oct2010 43.77:11.24 acmm2010
          florence      26Oct2010 43.77:11.24 florence
          multimedia    26Oct2010 43.77:11.24 multimedia
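The extension step can be sketched as follows. The slides give only one input/output pair for each encoder, so the details here are assumptions chosen to reproduce that pair: `s1` formats a date at one-day granularity, and `s2` floors each coordinate to a grid step (flooring rather than rounding, since 43.777864 maps to 43.77). The names `extend_tags` and `precision` are hypothetical.

```python
from datetime import date
from decimal import Decimal, ROUND_FLOOR

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def s1(d):
    """Encode a date at one-day granularity, e.g. 26Oct2010 (assumed format)."""
    return f"{d.day:02d}{MONTHS[d.month - 1]}{d.year}"


def s2(lat, lon, precision="0.01"):
    """Encode a location as a grid cell by flooring to multiples of `precision`."""
    step = Decimal(precision)

    def cell(value):
        # Floor to the nearest lower multiple of the grid step, using exact
        # decimal arithmetic to avoid binary floating-point artefacts.
        index = (Decimal(str(value)) / step).to_integral_value(rounding=ROUND_FLOOR)
        return (index * step).quantize(step)

    return f"{cell(lat)}:{cell(lon)}"


def extend_tags(tags, d, lat, lon, precision="0.01"):
    """Prefix every cleaned tag with the encoded date and grid cell."""
    prefix = f"{s1(d)} {s2(lat, lon, precision)}"
    return [f"{prefix} {tag}" for tag in tags]


for phrase in extend_tags(["acmm2010", "florence", "multimedia"],
                          date(2010, 10, 26), 43.777864, 11.249029):
    print(phrase)
```

Changing `precision` (for example to "0.005") changes the spatial granularity; a coarser temporal granularity would need a different `s1`.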
  68. Algorithm Overview - ST Construction and Event Extraction
      - Image $I_i''$ is treated as a document snippet:
        $I_i'' = (T'', g, dt)$, $T'' = \{ t''_1, \dots, t''_l \}$, $t''_i = [\, s_1(dt) + s_2(g) + t_i \,]$
      - Extract candidate event clusters $\Psi_l$: $\Psi_l([\, s_1(dt) + s_2(g) + t_i \,])$
      - Event cluster $e_k$: $\exists\, (dt, g, t) : S_{dgt} = S_{gt}$, with $S_{dgt} = e_k$
      - Extract $\Psi'_l([\, s_2(g) + t_i \,])$
      - Compare $\Psi_l$ and $\Psi'_l$
      - If $\Psi_l = \Psi'_l$, then $\Psi_l([\, s_1(dt) + s_2(g) + t_i \,])$ is an event cluster
      - Label inferred from the structure
      [Figure labels: Ψ'_l, Ψ_l]
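The comparison of the two cluster sets can be illustrated without a suffix tree. The sketch below swaps in plain dictionaries for the tree and assumes that S_dgt is the set of images sharing date code dt, grid cell g and tag t, while S_gt is the set sharing only g and t; this reading is inferred from the figure labels, not stated in the slides. The function name and the input tuple layout are hypothetical.

```python
from collections import defaultdict


def extract_event_clusters(images):
    """images: iterable of (image_id, date_code, cell_code, tags).

    Returns {(dt, g, t): image ids} for every triple whose image set
    equals the image set of (g, t) taken over all dates.
    """
    s_dgt = defaultdict(set)  # (dt, g, t) -> image ids
    s_gt = defaultdict(set)   # (g, t)     -> image ids
    for image_id, dt, g, tags in images:
        for t in tags:
            s_dgt[(dt, g, t)].add(image_id)
            s_gt[(g, t)].add(image_id)
    # A tag that appears at a cell in exactly one time slot is treated as an event.
    return {key: ids for key, ids in s_dgt.items()
            if ids == s_gt[(key[1], key[2])]}


images = [
    (1, "26Oct2010", "43.77:11.24", ["acmm2010", "florence"]),
    (2, "26Oct2010", "43.77:11.24", ["acmm2010"]),
    (3, "27Oct2010", "43.77:11.24", ["florence"]),
]
print(extract_event_clusters(images))
```

In this toy input "acmm2010" occurs at the cell on a single day and is kept, whereas "florence" occurs on two days at the same cell and is rejected, mirroring the landmark case.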
  72. Algorithm Overview - Extraction and Merge
      - Extracted event clusters: $\{e_1, \dots, e_n\}$
      - Merge semantically similar clusters:
        $\theta(e_i, e_j) = \dfrac{|e_i \cap e_j|}{\min(|e_i|, |e_j|)}$
      [Figure labels: Ψ'_l, Ψ_l]
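A small sketch of the overlap measure and one possible way to use it for merging. The slides do not state a threshold or a merge order, so the threshold of 0.5 and the single-pass greedy strategy below are assumptions for illustration, and `merge_clusters` is a hypothetical name.

```python
def overlap(a, b):
    """theta(a, b) = |a & b| / min(|a|, |b|), for non-empty sets."""
    return len(a & b) / min(len(a), len(b))


def merge_clusters(clusters, threshold=0.5):
    """Single-pass greedy merge: each new cluster absorbs every
    already-kept cluster whose overlap with it exceeds the threshold."""
    merged = []
    for cluster in clusters:
        current = set(cluster)
        remaining = []
        for other in merged:
            if overlap(current, other) > threshold:
                current |= other
            else:
                remaining.append(other)
        merged = remaining + [current]
    return merged


print(merge_clusters([{1, 2, 3}, {2, 3, 4}, {7, 8}]))
```

Because the pass is greedy, the result can depend on input order; a connected-components merge over the pairwise overlap graph would be an order-independent alternative.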
  74. Evaluation - Dataset
      - Dataset collected from Flickr
      - Only geo-tagged pictures
      - 12 June 2008 to 11 June 2010 (729 days)
      - San Francisco area
      - #Images ~ 350K, #Tags ~ 3M
  80. Evaluation - Measure
      - List of ranked clusters: $\{e_1, e_2, \dots\}$
      - Ranking according to cluster size: $|e_i|$
      - Drawback: lack of ground truth (recall measure)
      - Precision over the first K returned clusters: $\text{Precision} = \dfrac{R_K}{K}$, where $R_K$ is the number of relevant clusters in the top K
      - Top-20 (K = 20)
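The measure can be written down directly. In this sketch the relevance judgement is passed in as a callable because the slides do not say how relevance was decided; `is_relevant` and the function names are hypothetical.

```python
def rank_by_size(clusters):
    """Rank clusters by size |e_i|, largest first."""
    return sorted(clusters, key=len, reverse=True)


def precision_at_k(ranked_clusters, is_relevant, k=20):
    """Precision = R_K / K, where R_K counts relevant clusters in the top K."""
    relevant = sum(1 for cluster in ranked_clusters[:k] if is_relevant(cluster))
    return relevant / k


clusters = [set(range(n)) for n in (5, 3, 8, 1)]
ranked = rank_by_size(clusters)
# Toy relevance rule for demonstration only: clusters with more than two images.
print(precision_at_k(ranked, lambda c: len(c) > 2, k=4))  # 0.75
```

Note that the denominator is always K, so a ranking with fewer than K clusters is penalised for the missing positions.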
  81. Evaluation
      - Experiments at different granularities in time and space
      - Time:
          Granularity   Example
          1 day         2008Oct12
          1 week        2008:43
      - Space:
          Latitude precision   Longitude precision   Square size (meters)
          0.01                 0.01                  1000 m x 1000 m
          0.005                0.005                 500 m x 500 m
          0.002                0.002                 200 m x 200 m
          0.001                0.001                 100 m x 100 m
  82. Evaluation - Results
      Each cell gives #Ev. (Prec.) for the top #Clusters at the given grid size and time granularity.

      #Clusters   100 m        100 m        200 m        200 m        500 m        500 m        1000 m       1000 m
                  1 Day        1 Week       1 Day        1 Week       1 Day        1 Week       1 Day        1 Week
      1           1 (100%)     1 (100%)     1 (100%)     1 (100%)     1 (100%)     1 (100%)     1 (100%)     1 (100%)
      2           2 (100%)     2 (100%)     2 (100%)     2 (100%)     2 (100%)     2 (100%)     2 (100%)     1 (50%)
      3           3 (100%)     3 (100%)     3 (100%)     3 (100%)     3 (100%)     3 (100%)     3 (100%)     2 (67%)
      …
      20          15 (75%)     14 (70%)     15 (75%)     14 (70%)     14 (70%)     13 (65%)     13 (65%)     14 (70%)
  83. Evaluation - Results
      [Figure: Top-20 precision]
  91. Conclusion
      - Novel algorithm for event cluster extraction:
        - From a large amount of Flickr images
        - Multi-user photo collections
        - Incremental clustering algorithm
      - Extension of STC, previously used only to cluster text documents
      - Based on a suffix tree (construction in O(n))
      - Automatic annotation of clusters
      - Noise reduction in the tags using an extended vocabulary for stopword removal
      - Spatial and temporal information considered
      - Analysis of different granularities of time and space
  93. Thanks for the attention! Questions?
      http://www.idi.ntnu.no/~ruocco/
