Web Page Clustering Using a Fuzzy Logic Based
   Representation and Self-organizing Maps

    Alberto P. García-Plaza, Víctor Fresno, Raquel Martínez
                     NLP & IR Group, UNED

                       December 12, 2008


                                          Table of Contents



             1   Objectives
             2   Our Approach: Extended Fuzzy Combination of Criteria
                 (EFCC)
             3   Experiment Description
             4   Results
             5   Conclusion






                                                  Objectives


              Group HTML documents by content similarity.
              Self-Organizing Maps (SOM) to organize, visualize and
              navigate through the collection.
              Term weighting function taking advantage of HTML tags
                      Combining, by means of fuzzy logic, heuristic criteria based on
                      the inherent semantics of some HTML tags and word positions
                      in the document.

       Hypothesis
       An improvement in document representation will lead to an
       increase in map quality.





                                                 Fuzzy logic



              Capturing human expert knowledge.
              Close to natural language.
              Knowledge base: defined by a set of IF-THEN rules.
              Linguistic variables
                      Defined using natural language words and fuzzy sets.
                      These sets allow the description of the membership degree of
                      an object to a particular class.
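
A minimal sketch of how a linguistic variable and one IF-THEN rule might be evaluated. The variable name `FREQUENCY`, the triangular set shapes, and the use of the minimum for AND are illustrative assumptions; the slides do not specify the actual fuzzy sets or inference operators.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set (0.0 outside [left, right])."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic variable over a value normalised to [0, 1].
# The three labels and their breakpoints are assumptions for illustration.
FREQUENCY = {
    "low": lambda x: triangular(x, -0.5, 0.0, 0.5),
    "medium": lambda x: triangular(x, 0.0, 0.5, 1.0),
    "high": lambda x: triangular(x, 0.5, 1.0, 1.5),
}


def memberships(variable, x):
    """Membership degree of x in every fuzzy set of a linguistic variable."""
    return {label: round(mu(x), 3) for label, mu in variable.items()}


def rule_and(*degrees):
    """Firing strength of a rule whose antecedents are joined by AND (minimum)."""
    return min(degrees)


print(memberships(FREQUENCY, 0.25))  # {'low': 0.5, 'medium': 0.5, 'high': 0.0}
```

A value of 0.25 belongs partly to "low" and partly to "medium", which is the graded class membership the slide describes. A rule such as "IF frequency is high AND another criterion is high THEN ..." would fire with strength `rule_and(FREQUENCY["high"](x), other_degree)`.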






                   Extended Fuzzy Combination of Criteria






                                        Linguistic Variables






                                           Knowledge Base






                                  Dimensionality Reduction


              Input vector dimensions ranging from 100 to 5000.
              Stopwords, punctuation marks, suffixes, and words occurring
              fewer than 50 times in the whole corpus were removed.
              Two well known methods:
                      Document frequency reduction.
                      Random projection method.
              Three proposed rank-based methods:
                      Most Valued Terms.
                      Fixed reduction method.
                      More Frequent Terms until n level.
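
As an illustration of the first well-known method listed above, document frequency reduction can be sketched as follows. This assumes the variant that keeps the terms occurring in the most documents; the slides do not say which variant or threshold was used, and the function names are hypothetical.

```python
from collections import Counter


def document_frequency_reduction(docs, n_terms):
    """Return the n_terms terms that occur in the largest number of documents.

    docs is a list of token lists. Ties are broken alphabetically so the
    result is deterministic (an assumption made for this sketch).
    """
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    ranked = sorted(df, key=lambda term: (-df[term], term))
    return ranked[:n_terms]


def to_vectors(docs, vocabulary):
    """Represent each document as raw term counts over the reduced vocabulary."""
    index = {term: i for i, term in enumerate(vocabulary)}
    vectors = []
    for tokens in docs:
        vector = [0] * len(vocabulary)
        for token in tokens:
            if token in index:
                vector[index[token]] += 1
        vectors.append(vector)
    return vectors


docs = [["bank", "loan", "bank"], ["bank", "rate"], ["loan", "rate", "sport"]]
vocabulary = document_frequency_reduction(docs, n_terms=2)
vectors = to_vectors(docs, vocabulary)
```

In the experiments described here, `n_terms` would range from 100 to 5000, and the raw counts would be replaced by the chosen term weighting function.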






                              Document Map Construction



              Benchmark dataset for clustering: Banksearch [1]
                      10000 documents
                      10 classes
              The SOM size was set to 5x2 so that the number of units
              equals the number of classes (10), which allows clustering
              results to be compared directly.

            [1] M. P. Sinka and D. W. Corne. A large benchmark dataset for web document clustering. Soft Computing
       Systems: Design, Management, and Applications, 2002.
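
A rough sketch of training a 5x2 map on document vectors. The learning-rate schedule, Gaussian neighbourhood, initialisation from random inputs, and epoch count are all assumptions chosen for brevity; the slides do not state the SOM parameters or implementation used in the experiments.

```python
import math
import random


def best_matching_unit(units, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(units)), key=lambda u: sq_dist(units[u]))


def train_som(vectors, rows=5, cols=2, epochs=20, lr0=0.5, seed=0):
    """Train a rows x cols self-organizing map and return its unit weight vectors."""
    rng = random.Random(seed)
    # One weight vector per unit, initialised from randomly chosen inputs.
    units = [list(rng.choice(vectors)) for _ in range(rows * cols)]
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    radius0 = max(rows, cols) / 2.0
    order = list(range(len(vectors)))
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x = vectors[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                     # linearly decaying learning rate
            radius = max(radius0 * (1.0 - frac), 0.5)   # shrinking neighbourhood radius
            bmu_r, bmu_c = coords[best_matching_unit(units, x)]
            for u, (r, c) in enumerate(coords):
                grid_d2 = (r - bmu_r) ** 2 + (c - bmu_c) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius * radius))
                # Pull the unit towards x, scaled by its grid distance to the winner.
                units[u] = [w + lr * h * (xi - w) for w, xi in zip(units[u], x)]
            step += 1
    return units
```

After training, each document is mapped to `best_matching_unit(units, vector)`, which yields one of the ten units of the 5x2 grid.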


                                        Evaluation Methods



              Weighted average of the F-measure for each class.
              After mapping the collection onto the trained map, the class
              with the greatest number of documents mapped to a neuron is
              selected to label that unit.
              All document vectors in a neuron whose class differs from
              the neuron label are counted as errors.
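
The evaluation procedure above might be sketched as follows. This assumes the per-class F-measure is the usual harmonic mean of precision and recall, with the unit label taken as the predicted class and classes weighted by their size; the slides do not give the exact formula, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def label_units(assignments, classes):
    """Label each map unit with the majority class of the documents mapped to it."""
    per_unit = defaultdict(Counter)
    for unit, cls in zip(assignments, classes):
        per_unit[unit][cls] += 1
    return {unit: counts.most_common(1)[0][0] for unit, counts in per_unit.items()}


def weighted_f_measure(assignments, classes):
    """Average of per-class F-measures, weighted by class size.

    assignments[i] is the unit document i was mapped to; classes[i] is its
    true class. A document whose class differs from its unit label is an error.
    """
    labels = label_units(assignments, classes)
    predicted = [labels[unit] for unit in assignments]
    total = len(classes)
    score = 0.0
    for cls, size in Counter(classes).items():
        tp = sum(1 for p, c in zip(predicted, classes) if p == cls and c == cls)
        n_pred = sum(1 for p in predicted if p == cls)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / size
        if precision + recall:
            f = 2 * precision * recall / (precision + recall)
        else:
            f = 0.0
        score += (size / total) * f
    return score


assignments = [0, 0, 0, 1, 1, 1]
classes = ["a", "a", "b", "b", "b", "a"]
score = weighted_f_measure(assignments, classes)
```

In this toy case each unit holds two documents of its majority class and one error, so every class has precision and recall of 2/3.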






             Best reduction for each term weighting function






                         MFTn reduction provides stability






             EFCC+MFTn obtains its best results with the
                   smallest number of features






                                                 Conclusion


              An unsupervised document representation method, based on
              fuzzy logic, for clustering HTML documents by means of
              self-organizing maps.
              MFTn reduction is the most stable reduction in all cases.
              The EFCC representation obtains better results using a
              smaller vocabulary.
              Fewer features are needed to represent the input documents
              and SOM unit vectors, which reduces computational cost.






                                            Thank You!






                                                 Related Work

       Work                                                    VSM   Topic Information   Document Type   Weighting Function   Modifies SOM
       Self organization of a Massive Document Collection [2]  Yes   Yes                 Text            Shannon's entropy    No
       Document Clustering using Phrases [3]                   Yes   No                  Text            Binary, TF, TF-IDF   No
       Document Clustering using WordNet [4]                   Yes   Yes                 Text            ESVM, HSVM, HyM      No
       Conceptional SOM [5]                                    Yes   No                  Text            TF                   Yes

       [2] T. Kohonen, S. Kaski, K. Lagus, J. Salojarvi, J. Honkela, V. Paatero, and A. Saarela. Self organization of a
           massive document collection. IEEE Trans. on Neural Networks, 2000.
       [3] J. Bakus, M. Hussin, and M. Kamel. A SOM-based document clustering using phrases. In ICONIP, 2002.
       [4] C. Hung and S. Wermter. Neural network based document clustering using WordNet ontologies. Int. J.
           Hybrid Intell. Syst., 2004.
       [5] Y. Liu, X. Wang, and C. Wu. ConSOM: A conceptional SOM model for text clustering. Neurocomputing,
           2008.
Alberto P. Garc´
               ıa-Plaza, V´
                          ıctor Fresno, Raquel Mart´
                                                   ınez, NLP & IR Group, UNED                                                slide 44

More Related Content

Viewers also liked

Fuzzy logic
Fuzzy logicFuzzy logic
Fuzzy logic
vini89
 
Analysing Web GIS apps
Analysing Web GIS appsAnalysing Web GIS apps
Analysing Web GIS apps
M.Muneeb Ashraf
 
Developing Efficient Web-based GIS Applications
Developing Efficient Web-based GIS ApplicationsDeveloping Efficient Web-based GIS Applications
Developing Efficient Web-based GIS Applications
Swetha A
 
Introduction to sar-marjolaine_rouault
Introduction to sar-marjolaine_rouaultIntroduction to sar-marjolaine_rouault
Introduction to sar-marjolaine_rouault
Naivedya Mishra
 
Synthetic aperture radar
Synthetic aperture radarSynthetic aperture radar
Synthetic aperture radar
Mahesh pawar
 
MISSION TO PLANETS (CHANDRAYAAN,MAVEN,CURIOSITY,MANGALYAAN,CASSINI SOLSTICE M...
MISSION TO PLANETS (CHANDRAYAAN,MAVEN,CURIOSITY,MANGALYAAN,CASSINI SOLSTICE M...MISSION TO PLANETS (CHANDRAYAAN,MAVEN,CURIOSITY,MANGALYAAN,CASSINI SOLSTICE M...
MISSION TO PLANETS (CHANDRAYAAN,MAVEN,CURIOSITY,MANGALYAAN,CASSINI SOLSTICE M...
Swetha A
 
Synthetic aperture radar (sar) 20150930
Web Page Clustering Using a Fuzzy Logic Based Representation and Self-Organizing Maps

  • 1. Web Page Clustering Using a Fuzzy Logic Based Representation and Self-organizing Maps. Alberto P. García-Plaza, Víctor Fresno, Raquel Martínez. NLP & IR Group, UNED. December 12, 2008
  • 2. Table of Contents: 1 Objectives; 2 Our Approach: Extended Fuzzy Combination of Criteria (EFCC); 3 Experiment Description; 4 Results; 5 Conclusion
  • 4. Objectives: Group HTML documents by content similarity. Use Self-Organizing Maps (SOM) to organize, visualize and navigate through the collection. Define a term weighting function that takes advantage of HTML tags by combining, by means of fuzzy logic, heuristic criteria based on the inherent semantics of some HTML tags and on word positions in the document. Hypothesis: an improvement in document representation will lead to an increase in map quality.
  • 5. Outline of section 2, Our Approach: Extended Fuzzy Combination of Criteria (EFCC): 1 Fuzzy Logic; 2 EFCC; 3 Linguistic Variables; 4 Knowledge Base
  • 6. Fuzzy logic: Captures human expert knowledge. Close to natural language. Knowledge base: defined by a set of IF-THEN rules. Linguistic variables: defined using natural language words and fuzzy sets. These sets describe the degree of membership of an object in a particular class.
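The ideas on this slide (fuzzy sets, membership degrees, IF-THEN rules) can be illustrated with a minimal Python sketch. The transcript does not preserve the actual fuzzy sets or rules used by EFCC, so everything here is an assumption made for illustration: the ramp-shaped sets, the breakpoints 0.25 and 0.75, the input names `title_freq` and `body_freq`, and the single rule are all hypothetical.

```python
def rising(x, lo, hi):
    """Membership in a 'high' fuzzy set: 0 at or below lo, 1 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def falling(x, lo, hi):
    """Membership in a 'low' fuzzy set, the complement of rising()."""
    return 1.0 - rising(x, lo, hi)


def importance_is_high(title_freq, body_freq):
    """Firing strength of one hypothetical rule (not taken from the slides):
    IF title_freq IS high AND body_freq IS high THEN importance IS high.
    The fuzzy AND is taken as the minimum of the two membership degrees.
    """
    return min(rising(title_freq, 0.25, 0.75), rising(body_freq, 0.25, 0.75))


# A term that is moderately frequent in the title and very frequent in the body
strength = importance_is_high(0.5, 1.0)
```

Taking the minimum for AND is one common choice in fuzzy rule systems; a full system would also aggregate several rules and defuzzify the result into a single term weight.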
  • 8-18. Extended Fuzzy Combination of Criteria (diagram slides; only the slide title survives in the transcript)
  • 20-25. Linguistic Variables (diagram slides; only the slide title survives in the transcript)
  • 27-30. Knowledge Base (diagram slides; only the slide title survives in the transcript)
  • 31. Outline of section 3, Experiment Description: 1 Dimensionality Reduction; 2 Document Map; 3 Evaluation Methods
  • 32. Dimensionality Reduction: Input vector dimension ranging from 100 to 5000. Stopwords, punctuation marks, suffixes, and words occurring fewer than 50 times in the whole corpus were removed. Two well-known methods: document frequency reduction; random projection. Three proposed rank-based methods: Most Valued Terms; fixed reduction; More Frequent Terms until n level (MFTn).
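Of the methods listed, document frequency reduction is the simplest to sketch. The slides do not specify the exact variant used, so the version below is an assumption: it ranks terms by the number of documents they appear in and keeps the top `n_terms`. The function name, the tie-breaking rule and the toy corpus are hypothetical.

```python
from collections import Counter


def document_frequency_reduction(docs, n_terms):
    """Keep the n_terms terms that occur in the largest number of documents.

    docs is a list of token lists, one per document (already preprocessed).
    """
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term at most once per document
    # Highest document frequency first; alphabetical order breaks ties
    ranked = sorted(df, key=lambda term: (-df[term], term))
    return ranked[:n_terms]


docs = [
    ["fuzzy", "logic", "map"],
    ["fuzzy", "map", "cluster"],
    ["fuzzy", "web", "page"],
]
vocabulary = document_frequency_reduction(docs, 2)
```

In the experiments the target vocabulary size would range from 100 to 5000 terms rather than 2; the toy value only keeps the example readable.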
  • 34. Document Map Construction: Benchmark dataset for clustering: Banksearch [1], 10000 documents, 10 classes. SOM size was set equal to the number of classes of the input documents, i.e. 5x2, in order to compare clustering results. [1] M. P. Sinka and D. W. Corne. A large benchmark dataset for web document clustering. Soft Computing Systems: Design, Management, and Applications, 2002.
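A 5x2 map of the kind described here can be sketched in a few lines of plain Python. Only the grid size comes from the slides; the random initialization, the linearly decaying learning rate, the Gaussian neighbourhood, the number of epochs and the toy two-dimensional data are assumptions chosen for illustration, not the settings used in the experiments.

```python
import math
import random

ROWS, COLS = 5, 2  # one unit per class, as on the slide (10 classes)


def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda k: sq_dist(weights[k]))


def train_som(data, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a ROWS x COLS map on data (a list of equal-length vectors)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(ROWS * COLS)]
    coords = [(k // COLS, k % COLS) for k in range(ROWS * COLS)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            br, bc = coords[best_matching_unit(weights, x)]
            for k, (r, c) in enumerate(coords):
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Move unit k towards x, scaled by its grid distance to the winner
                weights[k] = [w + lr * h * (xi - w) for w, xi in zip(weights[k], x)]
            step += 1
    return weights


data = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
weights = train_som(data)
row, col = divmod(best_matching_unit(weights, [0.0, 0.0]), COLS)
```

With real document vectors, each input would have between 100 and 5000 dimensions, and every document would be assigned to the unit returned by `best_matching_unit`.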
  • 36. Evaluation Methods: Weighted average of the F-measure for each class. After mapping the collection onto the trained map, the class with the greatest number of documents mapped to a neuron is selected to label that unit. All document vectors in a neuron whose class differs from the neuron label are counted as errors.
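The evaluation procedure can be sketched as follows. The slide does not give the exact formulas, so this is one plausible reading and should be treated as an assumption: each neuron is labelled with its majority class, per-class precision and recall are computed from those labels, and the per-class F-measures are averaged with weights proportional to class size. The function name and input format are hypothetical.

```python
from collections import Counter, defaultdict


def weighted_f_measure(assignments):
    """assignments: list of (neuron_id, true_class) pairs, one per document."""
    per_neuron = defaultdict(Counter)
    for neuron, cls in assignments:
        per_neuron[neuron][cls] += 1
    # Label each neuron with its majority class (ties broken arbitrarily)
    labels = {n: counts.most_common(1)[0][0] for n, counts in per_neuron.items()}

    class_size = Counter(cls for _, cls in assignments)
    predicted = Counter(labels[n] for n, _ in assignments)
    correct = Counter(cls for n, cls in assignments if labels[n] == cls)

    total = len(assignments)
    score = 0.0
    for cls, size in class_size.items():
        tp = correct[cls]
        if tp == 0:
            continue  # a class with no correctly placed documents contributes 0
        precision = tp / predicted[cls]
        recall = tp / size
        f = 2 * precision * recall / (precision + recall)
        score += (size / total) * f
    return score


# Neuron 0 holds two A documents and one B; neuron 1 holds two B documents
example = [(0, "A"), (0, "A"), (0, "B"), (1, "B"), (1, "B")]
score = weighted_f_measure(example)
```

In this toy case the single B document in neuron 0 is the only error, and both classes end up with an F-measure of 0.8.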
  • 38. Results: Best reduction for each term weighting function (chart slide; only the title survives in the transcript)
  • 39. Results: MFTn reduction provides stability (chart slide; only the title survives in the transcript)
  • 40. Results: EFCC+MFTn obtains its best results with the smallest number of features (chart slide; only the title survives in the transcript)
  • 42. Conclusion: An unsupervised document representation method, based on fuzzy logic, focused on clustering HTML documents by means of self-organizing maps. MFTn reduction is the most stable reduction in all cases. The EFCC representation obtains better results using a smaller vocabulary. A smaller number of features is needed to represent the input documents and SOM unit vectors, which implies a lower computational cost.
  • 43. Thank You!
  • 44. Related Work
      Work | VSM | Topic Information | Document Type | Weighting Function | Modifies SOM
      Self organization of a Massive Document Collection [2] | Yes | Yes | Text | Shannon's Entropy | No
      Document Clustering using Phrases [3] | Yes | No | Text | Binary, TF, TF-IDF | No
      Document Clustering using WordNet [4] | Yes | Yes | Text | ESVM, HSVM, HyM | No
      Conceptional SOM [5] | Yes | No | Text | TF | Yes
      [2] T. Kohonen, S. Kaski, K. Lagus, J. Salojarvi, J. Honkela, V. Paatero, and A. Saarela. Self organization of a massive document collection. IEEE Trans. on Neural Networks, 2000.
      [3] J. Bakus, M. Hussin, and M. Kamel. A SOM-based document clustering using phrases. In ICONIP, 2002.
      [4] C. Hung and S. Wermter. Neural network based document clustering using WordNet ontologies. Int. J. Hybrid Intell. Syst., 2004.
      [5] Y. Liu, X. Wang, and C. Wu. ConSOM: A conceptional SOM model for text clustering. In Neurocomputing, 2008.