    Event Cluster Detection on Flickr Images
          using a Suffix-Tree Structure
           Massimiliano Ruocco and Heri Ramampiaro

               Dept. of Computer and Information Science
             Norwegian University of Science and Technology
                          ruocco@idi.ntnu.no

           Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
    Outline
     1.    Introduction
             1.  Problem Statement
             2.  Related Works
             3.  Contributions
     2.    Proposed approach
             1.  Problem definition
             2.  Preliminary
             3.  Algorithm Overview
     3.    Evaluation
     4.    Conclusions




    Problem Statement
    Event Detection
      -    Event detection as a research topic originates from the TDT (Topic
           Detection and Tracking) project (1):
               -    Objective: aggregate stories over time into single event topics
               -    Event: "something happening in a certain place at a certain time"
                    [Yang, Pierce, Carbonell 1999]

           (1) http://projects.ldc.upenn.edu/TDT/


     Problem Statement
     Event Detection
       -    Most previous work focuses on time-tagged document streams and can
            be classified as:
              -    Retrospective detection: discover unidentified events in a
                   collection of news [Yang et al. 1998]
              -    Online detection: detect events in real time from a stream of
                   news [Brants et al. 2003]




     Problem Statement
     Web Photo-Sharing Apps – New Needs



       -    Huge amount of pictures, each with contextual metadata, e.g.:
               -    Time: 26 Oct 2010
               -    User: RMax
               -    Location: 26:12, 23:14
               -    Tags: Roma, Sky, Bridge
               -    …
       -    New needs:
               -    Retrieve
               -    Browse
               -    Knowledge Extraction




     Problem Statement
     Challenges
       -    Event detection on tagged pictures from photo-sharing apps
       -    Web-scale environment
       -    Use of contextual information
       -    Noisy annotations




     Related Works
      -    Event Clustering (visual/temporal information) [Loui, Savakis 2002]
              -     Albuming of user photo collections
              -     Not scalable to large datasets
              -     Limited to user photo collections
              -     No location information


      -    Event/Place Semantic Identification (temporal information) [Rattenbury et al. 2007]
              -     Extraction of event and place semantics for tags assigned to Flickr photos
              -     Scale-Structure Identification (SSI) method to analyze the tag usage distribution
              -     SSI is limited on large datasets
              -     Location information is not considered


      -    Event Tag Detection (spatial/temporal information) [Chen, Roy 2009]
              -     Detects event tags from Flickr photos
              -     Like [Rattenbury et al. 2007], uses the SSI method to analyze the tag usage distribution
              -     SSI is applied over the spatial and temporal distributions simultaneously




      Problem Definition
      Hypothesis

              "Something happening in a certain place at a certain time"
                                       [Yang, Pierce, Carbonell 1999]

              "Something happening in a certain place at a certain time with a certain tag"

              Event cluster $e_k \Rightarrow \{\, t_i = t_j,\ dt_i = dt_j,\ g_i = g_j \quad \forall\, I_i, I_j \in e_k \,\}$

              Not the opposite!




     Problem Definition
     Hypothesis – Landmark clusters
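      [Figure: location (vertical axis) vs. time (horizontal axis) plot for the tag "colosseo", with a location cell g and a time interval dt marked; label: Landmark Clusters]

      Event cluster $e_k \Rightarrow \{\, t_i = t_j,\ dt_i = dt_j,\ g_i = g_j \quad \forall\, I_i, I_j \in e_k \,\}$

      Not the opposite!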


     Problem Definition
     Hypothesis – Event clusters
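      [Figure: location (vertical axis) vs. time (horizontal axis) plot for the tag "applepies", with a location cell g and a time interval dt marked; labels: Landmark Clusters, Event Clusters]

      Event cluster $e_k \Rightarrow \{\, t_i = t_j,\ dt_i = dt_j,\ g_i = g_j \quad \forall\, I_i, I_j \in e_k \,\}$

      The opposite is true!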

         Problem Definition
         New Formulation

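      [Figure: two location vs. time plots for the tag "applepies", joined by an equals sign. The left plot marks a location cell g and a time interval dt and is labelled S_dgt; the right plot marks only the location cell g and is labelled S_gt. Label: Event Clusters]

      Event cluster $e_k$: $\{\, \exists\, (dt, g, t) : S_{dgt} = S_{gt} \,\}$ with $S_{dgt} = e_k$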


     Preliminary
     Suffix-Tree Clustering [Zamir 1998]
       -    Suffix-tree based
       -    Mainly used in text (web) document clustering
       -    Three-step process:
               1.   Document cleaning
               2.   Base cluster identification
               3.   Base cluster merging
       -    Incremental clustering
       -    Cluster labels inferred from the tree structure
       -    Phrase-based model
       -    Snippet-tolerant
       -    Overlapping clusters
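
The third step, base cluster merging, can be sketched as follows. The slides do not give the merge rule, so this sketch assumes the rule usually attributed to the original STC description: two base clusters are linked when their overlap exceeds half of each cluster, and the final clusters are the connected components of that graph. The function name and the 0.5 default are assumptions of this illustration, not taken from the talk.

```python
def merge_base_clusters(base_clusters, threshold=0.5):
    """Merge base clusters (sets of document ids) that overlap heavily.

    Two clusters A and B are linked when |A & B| > threshold * |A| and
    |A & B| > threshold * |B| (assumed rule). Linked clusters are merged
    transitively with a small union-find structure.
    """
    n = len(base_clusters)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(n):
        for b in range(a + 1, n):
            common = len(base_clusters[a] & base_clusters[b])
            if (common > threshold * len(base_clusters[a])
                    and common > threshold * len(base_clusters[b])):
                parent[find(a)] = find(b)

    merged = {}
    for i, docs in enumerate(base_clusters):
        merged.setdefault(find(i), set()).update(docs)
    return list(merged.values())


base = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
result = merge_base_clusters(base)
```

Because merging is transitive, a document can end up grouped with documents it shares no phrase with, which is one reason the resulting clusters may overlap or grow large.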




     Preliminary
     Suffix-Tree
       -    Given a string S, its suffix tree is a compact trie containing all the
            suffixes of S
               -    Rooted directed tree
               -    Each internal node other than the root has at least two children
               -    Each edge leaving a particular node is labelled with a non-empty
                    substring of S
       -    Example: the suffixes of "papua" are 'papua', 'apua', 'pua', 'ua', 'a'
       -    Suffix-tree construction runs in linear time, O(n) [Ukkonen 1995]
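
To make the definition concrete, the sketch below enumerates the suffixes of a string and stores them in a plain, uncompacted suffix trie built naively in O(n^2) time. This is deliberately not Ukkonen's linear-time construction and it does not compact single-child chains into labelled edges; it only shows why every substring of S can be found by walking down from the root. All function names are hypothetical.

```python
def suffixes(s):
    """Return all non-empty suffixes of s, longest first."""
    return [s[i:] for i in range(len(s))]


def build_suffix_trie(s):
    """Insert every suffix of s into a nested-dict trie (naive, O(n^2))."""
    root = {}
    for suffix in suffixes(s):
        node = root
        for ch in suffix:
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """True if pattern is a substring of the indexed string."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("papua")
```

A compact suffix tree stores the same information but merges each chain of single-child nodes into one edge labelled with a substring, which is what keeps the node count linear in the length of S.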




     Algorithm Overview

       -    Input: images Ii = (T, g, dt)
       -    Steps:
               1.   Data cleaning
               2.   Data extension
               3.   Suffix tree construction
               4.   Event cluster extraction
               5.   Event cluster merging
       -    Output: event clusters labelled with their tags, e.g.:
               -    … Primary, Party, Election, Campaign
               -    … Concert, Music, John
               -    …
     Algorithm Overview

     Data Cleaning and Extension
       -    Cleaning: Ii = (T, g, dt) → Ii’ = (T’, g, dt)
               -     Stopword removal (with extended vocabulary) + stemming
       -    Extension: Ii’ = (T’, g, dt) → Ii’’ = (T’’, g, dt)
               -     Spatial and temporal information is encoded in the annotation set T

                      T’’ = {t’’1, …, t’’l}

                      t’’i = [s1(dt) + s2(g) + ti]

                      where s1 and s2 are encoding functions from date and location to string;
                      they define the granularity in time and in space (geographical grid)

       -    Example, with s1(26/10/2010) = 26Oct2010 and s2(43.777864, 11.249029) = 43.77:11.24:

                      T’                  T’’
                      acmm2010            26Oct2010 43.77:11.24 acmm2010
                      florence            26Oct2010 43.77:11.24 florence
                      multimedia          26Oct2010 43.77:11.24 multimedia
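
The extension step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `snap` and `extend_tags` are hypothetical, the day format and the 0.01-degree cell are taken from the slide example, and flooring coordinates onto the grid is inferred from that example (43.777864 becomes 43.77, 11.249029 becomes 11.24).

```python
from datetime import date
from decimal import Decimal, ROUND_FLOOR

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def s1(dt: date) -> str:
    """Encode a date at one-day granularity, e.g. 26Oct2010."""
    return f"{dt.day:02d}{MONTHS[dt.month - 1]}{dt.year}"


def snap(value: float, cell: str) -> Decimal:
    """Floor a coordinate onto a grid whose cell size is `cell` degrees."""
    step = Decimal(cell)
    cells = (Decimal(str(value)) / step).to_integral_value(rounding=ROUND_FLOOR)
    # quantize() fixes the number of decimals so that every point of a
    # cell yields exactly the same string
    return (cells * step).quantize(step)


def s2(lat: float, lon: float, cell: str = "0.01") -> str:
    """Encode a location as the grid cell containing it, e.g. 43.77:11.24."""
    return f"{snap(lat, cell)}:{snap(lon, cell)}"


def extend_tags(tags, dt, lat, lon, cell="0.01"):
    """Build T'' from T': prefix every tag with the time and space codes."""
    prefix = f"{s1(dt)} {s2(lat, lon, cell)}"
    return [f"{prefix} {tag}" for tag in tags]


extended = extend_tags(["acmm2010", "florence", "multimedia"],
                       date(2010, 10, 26), 43.777864, 11.249029)
```

Changing `cell` (for example to "0.005") or swapping `s1` for a weekly encoding changes the granularity without touching the rest of the pipeline.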


     Algorithm Overview

     ST Construction and Event Extraction
       -     Image Ii’’ : document snippet
                 Ii’’ = (T’’, g, dt)
                 T’’ = {t’’1, …, t’’l}
                 t’’i = [s1(dt) + s2(g) + ti]
       -     Extract candidate event clusters Ψl:
                  -     Ψl([s1(dt) + s2(g) + ti])
       -     Event cluster ek:
                 {∃ (dt, g, t) : Sdgt = Sgt} with (Sdgt = ek)
       -     Extract Ψ’l([s2(g) + ti])
       -     Compare Ψl and Ψ’l
       -     IF (Ψl = Ψ’l) → Ψl([s1(dt) + s2(g) + ti]) is an event cluster
       -     Label inferred from the structure

             [Figure labels: Ψl, Ψ’l]
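
The comparison of Ψl and Ψ’l can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it uses a naive, uncompacted token-level suffix trie instead of a linear-time suffix-tree construction, and the names `Node`, `insert_suffixes`, `lookup` and `event_clusters` are hypothetical. Each extended tag is treated as the three-token phrase [s1(dt), s2(g), ti], so the node reached by (date, cell, tag) holds Sdgt and the node reached by the suffix (cell, tag) holds Sgt.

```python
class Node:
    """Trie node: outgoing edges plus the images whose phrase passes through."""

    def __init__(self):
        self.children = {}
        self.images = set()


def insert_suffixes(root, tokens, image_id):
    """Insert every suffix of `tokens`, recording image_id on each node visited."""
    for start in range(len(tokens)):
        node = root
        for token in tokens[start:]:
            node = node.children.setdefault(token, Node())
            node.images.add(image_id)


def lookup(root, path):
    """Return the image set stored at the node reached by `path` (empty if absent)."""
    node = root
    for token in path:
        node = node.children.get(token)
        if node is None:
            return set()
    return node.images


def event_clusters(extended):
    """extended: iterable of (image_id, date_code, cell_code, tag) tuples."""
    root = Node()
    triples = set()
    for image_id, d, g, t in extended:
        insert_suffixes(root, (d, g, t), image_id)
        triples.add((d, g, t))
    clusters = {}
    for d, g, t in sorted(triples):
        s_dgt = lookup(root, (d, g, t))   # candidate cluster, Psi_l
        s_gt = lookup(root, (g, t))       # same place and tag, any date, Psi'_l
        if s_dgt == s_gt:                 # the tag occurs here on this date only
            clusters[(d, g, t)] = frozenset(s_dgt)
    return clusters


photos = [
    (1, "26Oct2010", "41.89:12.49", "colosseo"),
    (2, "27Oct2010", "41.89:12.49", "colosseo"),
    (3, "26Oct2010", "43.77:11.24", "applepies"),
    (4, "26Oct2010", "43.77:11.24", "applepies"),
]
found = event_clusters(photos)
```

In this toy input the landmark tag is used at the same place on two different days, so its Sdgt is smaller than Sgt and it is rejected, while the second tag appears at its place on a single day and is kept. The path (date, cell, tag) that identifies the node doubles as the cluster label. The sketch assumes that date codes, cell codes and tags never collide as strings.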


     Algorithm Overview

     Extraction and Merge
       -    Extracted event clusters: {e1, …, en}
       -    Merge semantically similar clusters, using the similarity

                      θ(ei, ej) = |ei ∩ ej| / min(|ei|, |ej|)

            [Figure labels: Ψl, Ψ’l]
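
A minimal sketch of the merge step, reading the numerator and denominator of θ as set cardinalities. The slide gives neither a threshold nor a merge strategy, so both are assumptions here: the value 0.5 is hypothetical, and clusters are merged as connected components of the graph that links every pair with θ above the threshold. The function names are illustrative.

```python
def theta(a, b):
    """Overlap similarity: |a ∩ b| / min(|a|, |b|); 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def merge_clusters(clusters, threshold=0.5):
    """Merge clusters linked, directly or transitively, by theta > threshold."""
    clusters = [set(c) for c in clusters]
    parent = list(range(len(clusters)))

    def find(i):
        # union-find lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            if theta(clusters[i], clusters[j]) > threshold:
                parent[find(i)] = find(j)

    merged = {}
    for i, cluster in enumerate(clusters):
        merged.setdefault(find(i), set()).update(cluster)
    return list(merged.values())


result = merge_clusters([{1, 2, 3}, {2, 3, 4, 5}, {9}])
```

Dividing by the smaller cluster means a small cluster fully contained in a large one scores 1.0, so near-subsets merge even when the sizes differ widely.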




     Evaluation - Dataset
      -    Dataset collected from Flickr
      -    Only geo-tagged pictures
      -    12 June 2008 – 11 June 2010 (729 days)
      -    San Francisco area
      -    #Images ≈ 350K
      -    #Tags ≈ 3M




     Evaluation - Measure

      -    List of ranked clusters: {e1, e2, …}
      -    Ranking according to cluster size: |ei|
      -    Drawback: lack of ground truth (recall cannot be measured)
      -    Top-K precision: |RK| / K, where RK is the set of relevant clusters among the first K returned
      -    Top-20 (K = 20)
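
The measure is simple enough to state as code. In this minimal sketch the relevance judgments are assumed to come from manual inspection of the returned clusters, since no ground truth exists; the cluster labels and function names are hypothetical.

```python
def rank_by_size(clusters):
    """Order cluster labels by decreasing cluster size |e_i|."""
    return sorted(clusters, key=lambda label: len(clusters[label]), reverse=True)


def precision_at_k(ranked, relevant, k):
    """Top-K precision: relevant clusters among the first k, divided by k."""
    return sum(1 for label in ranked[:k] if label in relevant) / k


clusters = {
    "concert": {1, 2, 3, 4},
    "bridge": {5, 6, 7},
    "parade": {8, 9},
}
ranked = rank_by_size(clusters)
judged_relevant = {"concert", "parade"}  # hypothetical manual judgments
scores = [precision_at_k(ranked, judged_relevant, k) for k in (1, 2, 3)]
```

With one relevant cluster among the first two and two among the first three, the precision is 1/2 and 2/3, the same pattern as the 50% and 67% entries in the results table.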





     Evaluation
      -    Experiments on different granularities in time and space
      -    Time:

             Granularity           1 day                  1 week
             Example               2008Oct12              2008:43

      -    Space:

             Latitude precision    Longitude precision    Square size (meters)
             0.01                  0.01                   1000 m × 1000 m
             0.005                 0.005                  500 m × 500 m
             0.002                 0.002                  200 m × 200 m
             0.001                 0.001                  100 m × 100 m
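
The square sizes in the table read as round figures. The sketch below estimates the actual extent of a grid cell under a spherical-Earth approximation; the mean radius constant, the function name and the latitude of about 37.77 degrees for San Francisco are assumptions added here, not values from the slides.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def cell_size_m(cell_deg, latitude_deg):
    """Return the (north-south, east-west) extent of a grid cell in metres."""
    metres_per_degree = math.pi * EARTH_RADIUS_M / 180
    north_south = cell_deg * metres_per_degree
    # meridians converge towards the poles, so east-west extent shrinks
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns, ew = cell_size_m(0.01, 37.77)
```

Under these assumptions a 0.01-degree cell is a little over 1100 m north to south and somewhat under 900 m east to west, so the cells are rectangles of roughly, not exactly, the listed size.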








     Evaluation - Results

                        100 m                     200 m                     500 m                     1000 m
                  1 Day       1 Week        1 Day       1 Week        1 Day       1 Week        1 Day       1 Week
     #Clusters  #Ev.  Prec.  #Ev.  Prec.  #Ev.  Prec.  #Ev.  Prec.  #Ev.  Prec.  #Ev.  Prec.  #Ev.  Prec.  #Ev.  Prec.
         1         1   100%     1   100%     1   100%     1   100%     1   100%     1   100%     1   100%     1   100%
         2         2   100%     2   100%     2   100%     2   100%     2   100%     2   100%     2   100%     1    50%
         3         3   100%     3   100%     3   100%     3   100%     3   100%     3   100%     3   100%     2    67%
         …
        20        15    75%    14    70%    15    75%    14    70%    14    70%    13    65%    13    65%    14    70%








     Evaluation - Results
       Top-20 precision




     Conclusion
      -    Novel algorithm for event cluster extraction:
             -    From a large collection of Flickr images
             -    Multi-user photo collection
             -    Incremental clustering algorithm
      -    Extension of STC, previously used only to cluster text documents
      -    Based on a suffix tree (construction in O(n))
      -    Automatic annotation of clusters
      -    Noise reduction in the tags, using an extended vocabulary for stopword
           removal
      -    Spatial and temporal information considered
      -    Analysis of different granularities of time and space




               Thanks for your attention!

                                   QUESTIONS?




     http://www.idi.ntnu.no/~ruocco/




Event Clusters Detection on Flickr Images using a Suffix-Tree Structure

  • 1.
    1 Event Cluster Detection on Flickr Images using a Suffix-Tree Structure Massimiliano Ruocco and Heri Ramampiaro Dept. Of Computer and Information Science Norwegian University of Science and Technology ruocco@idi.ntnu.no Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 2.
    2 Outline Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 3.
    3 Outline 1.  Introduction 1.  Problem Statement 2.  Related Works 3.  Contributions 2.  Proposed approach 1.  Problem definition 2.  Preliminary 3.  Algorithm Overview 3.  Evaluation 4.  Conclusions Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 4.
    4 Problem Statement Event Detection Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 5.
    5 Problem Statement Event Detection -  Event detection topic has its origin from the TDT (Topic Detection and Tracking) project(1): (1) http://projects.ldc.upenn.edu/TDT/! Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 6.
    6 Problem Statement Event Detection -  Event detection topic has its origin from the TDT (Topic Detection and Tracking) project(1): -  Objective: aggregate stories over time into single event topic (1) http://projects.ldc.upenn.edu/TDT/! Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 7.
    7 Problem Statement Event Detection -  Event detection topic has its origin from the TDT (Topic Detection and Tracking) project(1): -  Objective: aggregate stories over time into single event topic Something happening in a certain place at a certain time [Yang, Pierce, Carbonell 1999] (1) http://projects.ldc.upenn.edu/TDT/! Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 8.
    8 Problem Statement Event Detection Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 9.
    9 Problem Statement Event Detection -  Most previous works focus on time-tagged document streams can be classified as: Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 10.
    10 Problem Statement Event Detection -  Most previous works focus on time-tagged document streams can be classified as: -  Retrospective Detection : discover unidentified events in a collection of news [Yang et al. 1998] Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 11.
    11 Problem Statement Event Detection -  Most previous works focus on time-tagged document streams can be classified as: -  Retrospective Detection : discover unidentified events in a collection of news [Yang et al. 1998] -  Online Detection : detect events in real-time from a stream of news [Brants et al. 2003] Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 12.
    12 Problem Statement Web Photo-Sharing Apps – New Needs Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 13.
    13 Problem Statement Web Photo-Sharing Apps – New Needs Huge Amount of Pictures Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 14.
    14 Problem Statement Web Photo-Sharing Apps – New Needs Time! 26 Oct 2010 User! RMax Location! 26:12, 23:14 Tags! Roma, Sky, Bridge …! Huge Amount of Pictures Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 15.
    15 Problem Statement Web Photo-Sharing Apps – New Needs Retrieve New Needs Browse Time! 26 Oct 2010 User! RMax Location! 26:12, 23:14 Knowledge Extraction Tags! Roma, Sky, Bridge …! Huge Amount of Pictures Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 16.
    16 Problem Statement Challenges Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 17.
    17 Problem Statement Challenges -  Event detection on Tagged Picture from Photo-Sharing Apps -  Web-scale environment -  Use of contextual information -  Noisy annotation Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 18.
    18 Problem Statement Challenges -  Event detection on Tagged Picture from Photo-Sharing Apps -  Web-scale environment -  Use of contextual information -  Noisy annotation Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 19.
    19 Related Works Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 20.
    20 Related Works -  Event Clustering (Visual/Temporal information) [Loui, Savakis 2002] -  Albuming user photo collections -  Not scalable to large dataset! -  Limited to user photo collection! -  No Locational Information! Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 21.
    21 Related Works -  Event Clustering (Visual/Temporal information) [Loui, Savakis 2002] -  Albuming user photo collections -  Not scalable to large dataset! -  Limited to user photo collection! -  No Locational Information! -  Event/Place Semantic Identification (Temporal information) [Rattenbury et al. 2007] -  Extraction of event and place semantics for tags assigned to Flickr photos -  Scale-Structure Identification (SSI) method to analyze the tag usage distribution -  SSI is limited for large dataset! -  Location information is not considered! Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 22.
    22 Related Works -  Event Clustering (Visual/Temporal information) [Loui, Savakis 2002] -  Albuming user photo collections -  Not scalable to large dataset! -  Limited to user photo collection! -  No Locational Information! -  Event/Place Semantic Identification (Temporal information) [Rattenbury et al. 2007] -  Extraction of event and place semantics for tags assigned to Flickr photos -  Scale-Structure Identification (SSI) method to analyze the tag usage distribution -  SSI is limited for large dataset! -  Location information is not considered! -  Event Tag Detection (Spatial/Temporal information) [Chen, Roy 2009] -  Detect event tags from Flickr photos -  As [Rattenbury et al. 2007] use SSI method to analyze the tag usage distribution -  SSI is used over locational and spatial distributions simultaneously Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 23.
    23 Problem Definition Hypothesis Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 24.
    24 Problem Definition Hypothesis Something happening in a certain place at a certain time [Yang, Pierce, Carbonell 1999] Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 25.
    25 Problem Definition Hypothesis Something happening in a certain place at a certain time [Yang, Pierce, Carbonell 1999] Something happening in a certain place at a certain time with a certain tag Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 26.
    26 Problem Definition Hypothesis Something happening in a certain place at a certain time [Yang, Pierce, Carbonell 1999] Something happening in a certain place at a certain time with a certain tag Event Cluster ej  {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 27.
    27 Problem Definition Hypothesis Something happening in a certain place at a certain time [Yang, Pierce, Carbonell 1999] Something happening in a certain place at a certain time with a certain tag Event Cluster ej  {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } Not the opposite  ! Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 28.
    28 Problem Definition Hypothesis – Landmark clusters Location Event Cluster ek … {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } g colosseo! time Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 29.
    29 Problem Definition Hypothesis – Landmark clusters Location Event Cluster ek … {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } g colosseo! dt time Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 30.
    30 Problem Definition Hypothesis – Landmark clusters Landmark Location Clusters Event Cluster ek … {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } g colosseo! dt time Event Cluster ek  {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } Not the opposite  ! Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 31.
    31 Problem Definition Hypothesis – Event clusters Location g time Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 32.
    32 Problem Definition Hypothesis – Event clusters Location g dt time Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 33.
    33 Problem Definition Hypothesis – Event clusters Location g applepies! dt time Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 34.
    34 Problem Definition Hypothesis – Event clusters Landmark Clusters Location Event Cluster ek Event Clusters {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } g applepies! dt time Event Cluster ek  {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 35.
    35 Problem Definition Hypothesis – Event clusters Landmark Clusters Location Event Cluster ek Event Clusters {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } g applepies! dt time Event Cluster ek  {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } The opposite is true  ! Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 36.
    36 Problem Definition Hypothesis – Event clusters Landmark Clusters Location Event Cluster ek Event Clusters {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } g applepies! time Event Cluster ek  {tj=tj, dti=dtj, gi=gj, Ii,Ij ek } The opposite is true  ! Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 37.
    37 Problem Definition New Formulation Event Event Cluster ek Clusters Location Sdgt Location Sgt = g applepies! g applepies! dt time time Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 38.
    38 Problem Definition New Formulation Event Event Cluster ek Clusters { ∃ (dt, g, t) : Sdgt = Sgt} with (Sdgt = ek) Location Sdgt € Location Sgt = g applepies! g applepies! dt time time Event Cluster ek  { ∃ (dt, g, t) : Sdgt = Sgt} with (Sdgt = ek) Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
  • 47.
    47 Preliminary
    Suffix-Tree Clustering [Zamir 1998]
      -  Suffix-tree based
      -  Mainly used in text (web) document clustering
      -  Three-step process:
           1.  Document cleaning
           2.  Base cluster identification
           3.  Base cluster merging
      -  Incremental clustering
      -  Cluster label inferred from the tree structure
      -  Phrase-based model
      -  Snippet-tolerant
      -  Overlapping clusters
  • 51.
    51 Preliminary
    Suffix-Tree
      -  Given a string S, its suffix tree is a compact trie containing all the suffixes of S
      -  Rooted directed tree
      -  Each internal node other than the root has at least two children
      -  Each edge leaving a particular node is labelled with a non-empty substring of S
      -  Example: "Papua" has the proper suffixes ‘apua’, ‘pua’, ‘ua’, ‘a’
      -  Suffix-tree construction runs in linear time, O(n) [Ukkonen 1995]
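The suffix-tree definition on this slide can be illustrated with a small sketch. It builds the compact trie naively, in quadratic time, and is not the linear-time construction of [Ukkonen 1995]. The function names and the "$" terminator (which keeps a suffix such as "a" from disappearing inside "apua") are assumptions made for illustration.

```python
def build_suffix_trie(s, terminator="$"):
    """Insert every suffix of s + terminator into a character trie (nested dicts)."""
    text = s + terminator
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def compact(node):
    """Collapse chains of single-child nodes so each edge carries a substring."""
    out = {}
    for ch, child in node.items():
        label = ch
        while len(child) == 1:
            (next_ch, grandchild), = child.items()
            label += next_ch
            child = grandchild
        out[label] = compact(child)
    return out


# Lower-cased version of the slide's example string.
tree = compact(build_suffix_trie("papua"))
```

In the resulting tree every internal node other than the root has at least two children, and every edge label is a non-empty substring of "papua$".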
  • 53.
    53 Algorithm Overview
      Pipeline, with each image given as Ii = (T, g, dt):
           1.  Data cleaning
           2.  Data extension
           3.  Suffix tree construction
           4.  Event clusters extraction
           5.  Event clusters merge
      Example output clusters: {Primary, Party, Election, Campaign, …}, {Concert, Music, John, …}, …
  • 60.
    60 Algorithm Overview
    Data Cleaning and Extension
      -  Cleaning: Ii = (T, g, dt) → Ii’ = (T’, g, dt)
           -  Stopword removal (with extended vocabulary) + stemming
      -  Extension: Ii’ = (T’, g, dt) → Ii’’ = (T’’, g, dt)
           -  Spatial and temporal information is encoded in the annotation set T
           -  T’’ = {t’’1, …, t’’l}, with t’’i = [s1(dt) + s2(g) + ti]
           -  s1 and s2 are encoding functions from date and location to string
           -  s1 and s2 define the granularity in space (geographical grid) and time
      -  Example, with s1(26/10/2010) = 26Oct2010 and s2(43.777864, 11.249029) = 43.77:11.24:

           T’            T’’
           acmm2010      26Oct2010 43.77:11.24 acmm2010
           florence      26Oct2010 43.77:11.24 florence
           multimedia    26Oct2010 43.77:11.24 multimedia
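The extension step on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function bodies, the `snap` helper, the `cell` parameter and the zero-padded day format are assumptions, chosen only so that the slide's example (26Oct2010, 43.77:11.24) is reproduced. Coordinates are snapped down to the lower edge of their grid cell, which matches the truncated values shown on the slide.

```python
from datetime import date
from decimal import Decimal, ROUND_FLOOR

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def s1(dt):
    """Encode a date at one-day granularity, e.g. 26Oct2010 (assumed format)."""
    return f"{dt.day:02d}{MONTHS[dt.month - 1]}{dt.year}"


def snap(value, cell):
    """Snap a coordinate down to the lower edge of its grid cell."""
    cell = Decimal(cell)
    steps = (Decimal(str(value)) / cell).to_integral_value(rounding=ROUND_FLOOR)
    return steps * cell


def s2(lat, lon, cell="0.01"):
    """Encode a location as 'lat:lon' on a grid with the given cell size."""
    return f"{snap(lat, cell)}:{snap(lon, cell)}"


def extend(tags, dt, lat, lon, cell="0.01"):
    """Prefix every cleaned tag with the encoded date and grid cell."""
    prefix = f"{s1(dt)} {s2(lat, lon, cell)}"
    return [f"{prefix} {tag}" for tag in tags]


extended = extend(["acmm2010", "florence", "multimedia"],
                  date(2010, 10, 26), 43.777864, 11.249029)
```

Changing `cell` (for example to "0.005" or "0.001") changes the spatial granularity; a weekly `s1` would change the temporal one.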
  • 68.
    68 Algorithm Overview
    ST Construction and Event Extraction
      -  Image Ii’’ : document snippet
           Ii’’ = (T’’, g, dt), T’’ = {t’’1, …, t’’l}, t’’i = [s1(dt) + s2(g) + ti]
      -  Extract candidate event clusters Ψl:
           -  Ψl([s1(dt) + s2(g) + ti])
      -  Event cluster ek: { ∃ (dt, g, t) : Sdgt = Sgt } with (Sdgt = ek)
      -  Extract Ψ’l([s2(g) + ti])
      -  Compare Ψl and Ψ’l
      -  IF (Ψl = Ψ’l) ⇒ Ψl([s1(dt) + s2(g) + ti]) is an event cluster
      -  Label inferred from the structure
      [Figure: suffix tree with the nodes Ψl and Ψ’l highlighted]
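The comparison of Ψl and Ψ’l can be illustrated without a suffix tree. The sketch below swaps in plain dictionaries keyed by (time, cell, tag) and (cell, tag); it shows the test "the images carrying tag t in cell g all fall in the single time slot dt", not the suffix-tree traversal used in the talk. The function name and the input tuple layout are hypothetical.

```python
from collections import defaultdict


def extract_event_clusters(images):
    """images: iterable of (image_id, time_key, cell_key, tags).

    Returns {(time_key, cell_key, tag): image_ids} for every candidate
    whose image set equals the set for (cell_key, tag) over all times.
    """
    by_time_cell_tag = defaultdict(set)   # plays the role of Ψl
    by_cell_tag = defaultdict(set)        # plays the role of Ψ’l
    for image_id, time_key, cell_key, tags in images:
        for tag in tags:
            by_time_cell_tag[(time_key, cell_key, tag)].add(image_id)
            by_cell_tag[(cell_key, tag)].add(image_id)
    return {
        key: ids
        for key, ids in by_time_cell_tag.items()
        if ids == by_cell_tag[(key[1], key[2])]
    }


toy_images = [
    ("a", "d1", "c1", ["concert"]),
    ("b", "d1", "c1", ["concert", "bridge"]),
    ("c", "d2", "c1", ["bridge"]),
]
events = extract_event_clusters(toy_images)
```

In the toy data "concert" appears in cell c1 only on day d1, so it forms an event cluster; "bridge" appears on two different days and is rejected. The dictionary key doubles as the cluster label.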
  • 72.
    72 Algorithm Overview
    Extraction and Merge
      -  Extracted event clusters: {e1, …, en}
      -  Merge semantically similar clusters:
           θ(ei, ej) = |ei ∩ ej| / min(|ei|, |ej|)
      [Figure: suffix tree with the nodes Ψl and Ψ’l highlighted]
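The similarity θ on this slide is the overlap between two clusters relative to the smaller one. A minimal sketch follows; the slide does not state a merge threshold or a merge order, so the `threshold` value and the greedy single pass are assumptions for illustration only. Clusters are assumed to be non-empty sets of image ids.

```python
def theta(a, b):
    """Overlap of two non-empty clusters relative to the smaller one."""
    return len(a & b) / min(len(a), len(b))


def merge_clusters(clusters, threshold=0.5):
    """Greedy single pass: each new cluster absorbs the already-merged
    clusters it overlaps with by at least `threshold` (assumed value)."""
    merged = []
    for cluster in clusters:
        cluster = set(cluster)
        kept = []
        for other in merged:
            if theta(cluster, other) >= threshold:
                cluster |= other
            else:
                kept.append(other)
        kept.append(cluster)
        merged = kept
    return merged


result = merge_clusters([{1, 2, 3}, {2, 3, 4}, {9}])
```

Here the first two clusters share two of three images (θ = 2/3) and are merged, while the third stays separate.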
  • 74.
    74 Evaluation - Dataset
      -  Dataset collected from Flickr
      -  Only geo-tagged pictures
      -  12 June 2008 to 11 June 2010 (729 days)
      -  San Francisco area
      -  #Images ~ 350K, #Tags ~ 3M
  • 80.
    80 Evaluation - Measure
      -  List of ranked clusters: {e1, e2, …}
      -  Ranking according to cluster size: |ei|
      -  Drawback: lack of ground truth (recall cannot be measured)
      -  Precision over the first K returned clusters: Precision = Rk / K
           Rk: number of relevant clusters in the Top-K
      -  Top-20 (K = 20)
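The measure on this slide can be written as a short sketch. The function names are hypothetical, and the relevance judgments are assumed to be supplied as one boolean per ranked cluster.

```python
def rank_by_size(clusters):
    """Order clusters from largest to smallest, as on the slide."""
    return sorted(clusters, key=len, reverse=True)


def precision_at_k(ranked_relevance, k):
    """Fraction of the first k returned clusters judged relevant (Rk / K)."""
    top = ranked_relevance[:k]
    return sum(1 for relevant in top if relevant) / k


# 15 relevant clusters among the first 20 returned.
p20 = precision_at_k([True] * 15 + [False] * 5, 20)
```

With 15 relevant clusters in the Top-20 the precision is 0.75.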
  • 81.
    81 Evaluation
      -  Experiments at different granularities in time and space
      -  Time:

           Granularity    Example
           1 day          2008Oct12
           1 week         2008:43

      -  Space:

           Latitude precision    Longitude precision    Square size (meters)
           0.01                  0.01                   1000m x 1000m
           0.005                 0.005                  500m x 500m
           0.002                 0.002                  200m x 200m
           0.001                 0.001                  100m x 100m
  • 82.
    82 Evaluation - Results

                  100 m                   200 m                   500 m                   1000 m
                  1 Day       1 Week      1 Day       1 Week      1 Day       1 Week      1 Day       1 Week
      #Clusters   #Ev. Prec.  #Ev. Prec.  #Ev. Prec.  #Ev. Prec.  #Ev. Prec.  #Ev. Prec.  #Ev. Prec.  #Ev. Prec.
      1           1    100%   1    100%   1    100%   1    100%   1    100%   1    100%   1    100%   1    100%
      2           2    100%   2    100%   2    100%   2    100%   2    100%   2    100%   2    100%   1    50%
      3           3    100%   3    100%   3    100%   3    100%   3    100%   3    100%   3    100%   2    67%
      …
      20          15   75%    14   70%    15   75%    14   70%    14   70%    13   65%    13   65%    14   70%
  • 83.
    83 Evaluation - Results
      [Figure: Top-20 precision]
  • 91.
    91 Conclusion
      -  Novel algorithm for event cluster extraction:
           -  From a large amount of Flickr images
           -  Multi-user photo collection
           -  Incremental clustering algorithm
      -  Extension of STC, previously used only to cluster text documents
      -  Based on a suffix tree (construction in O(n))
      -  Automatic annotation of clusters
      -  Noise reduction in the tags using an extended vocabulary for stopword removal
      -  Spatial and temporal information considered
      -  Analysis of different granularities of time and space
  • 93.
    93 Thanks for the attention!
      QUESTIONS?
      http://www.idi.ntnu.no/~ruocco/