Weblog Extraction with Fuzzy Classification Methods. Edy Portmann, University of Fribourg, Switzerland
Content: Introduction (Weblog extraction, Folksonomies, Fuzzy logic, Fuzzy data clustering); Fuzzy weblog extraction (Building blocks, Interface, Query engine, Meta search engine, Aggregated documents); Example; Concluding remarks; Questions and answers
Weblog extraction: A weblog is a website with regular (reverse-chronological) entries of comments, descriptions of events, or other material. Weblogs provide instant news on a particular subject, and readers can leave comments. Data extraction is the act or process of retrieving data out of unstructured data sources.
Folksonomies: The practice and technique of collaboratively creating and manipulating tags to annotate and categorize content. Freely chosen keywords are used instead of a controlled vocabulary. User-generated taxonomy.
To generate an ontology.
Hard vs. fuzzy clustering: In hard clustering, data is divided into distinct clusters, where each data element belongs to exactly one cluster. In fuzzy clustering, data elements can belong to more than one cluster, and associated with each element is a set of membership levels.
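The contrast between the two assignments can be sketched in a few lines of Python. The membership levels below are invented for illustration, and the function names are hypothetical rather than taken from the slides.

```python
# Hypothetical membership levels of one data element in three clusters
# (illustrative values only, not taken from the presentation).
memberships = {"green": 0.7, "red": 0.2, "blue": 0.1}


def hard_assignment(levels):
    """Hard clustering: the element belongs to exactly one cluster."""
    return max(levels, key=levels.get)


def fuzzy_assignment(levels, minimum=0.0):
    """Fuzzy clustering: keep every cluster together with its membership level."""
    return {cluster: level for cluster, level in levels.items() if level > minimum}


hard = hard_assignment(memberships)
fuzzy = fuzzy_assignment(memberships)
print(hard)
print(fuzzy)
```

Hard assignment discards the information that the element is also partly "red" and "blue"; the fuzzy assignment keeps it.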
Building blocks [Figure: diagram of four numbered building blocks, 1 to 4.]
Interface: Blogretrievr (www.blogretrievr.com/) [Screenshot: Blogretrievr™ search page with the query "Yo-yo".] Caption: 1. Search box 2. Fuzzyness Factor 3. Go!
Query engine: Grassroots tagging. Three tag sets appear: "Yo-yo, Triangle, Green", "Yo-yo, Triangle, Red" and "Yo-yo, Triangle, Blue". According to these tags, yo-yo, triangle and the colours green, red and blue must be related in some way. But in which way?
Query engine: Jaccard coefficient [Figure: Venn diagrams of sets A and B, labelled "Not at all similar", "Somewhat similar" and "Quite similar".]
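The Jaccard coefficient of two sets A and B is the size of their intersection divided by the size of their union. A minimal sketch in Python follows, applied to the tag sets from the grassroots tagging slide. Returning 0.0 for two empty sets is a convention assumed here, not something the slides specify.

```python
def jaccard(a, b):
    """Jaccard coefficient |A intersect B| / |A union B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(union)


green_tags = {"yo-yo", "triangle", "green"}
red_tags = {"yo-yo", "triangle", "red"}

similarity = jaccard(green_tags, red_tags)  # 2 shared tags out of 4 distinct tags
print(similarity)
```

A value of 0 corresponds to "not at all similar" and a value of 1 to identical sets; the two tag sets above share two of four distinct tags.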
Query engine: fuzzy c-means (FCM). FCM is a method of clustering which allows one piece of data to belong to two or more clusters.
Query engine: fuzzy c-means (FCM). For each term, the algorithm determines its degree of membership in each cluster. A term can therefore belong to more than one cluster.
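A minimal sketch of the standard fuzzy c-means iteration, written in plain Python for one-dimensional data. The slides cluster terms, so the numeric points, the initial centers, the fuzzifier `m=2.0` and the fixed iteration count are all illustrative assumptions, not the presentation's actual setup.

```python
def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Standard FCM updates on 1-D data.

    Returns (centers, memberships), where memberships[i][j] is the degree
    to which points[i] belongs to cluster j. Each row sums to 1.
    """
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: closer centers receive higher degrees.
        memberships = []
        for x in points:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The point coincides with a center: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                row = [
                    1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                    for d_j in dists
                ]
            memberships.append(row)
        # Center update: mean of the points weighted by membership^m.
        centers = [
            sum((u[j] ** m) * x for u, x in zip(memberships, points))
            / sum(u[j] ** m for u in memberships)
            for j in range(len(centers))
        ]
    return centers, memberships


# Two obvious groups plus one point in between (illustrative data).
points = [0.9, 1.0, 1.2, 3.0, 4.9, 5.0, 5.2]
centers, memberships = fuzzy_c_means(points, centers=[0.0, 6.0])
middle = memberships[3]  # membership degrees of the point 3.0
print(centers)
print(middle)
```

The point between the two groups ends up with a substantial degree in both clusters, which is the behaviour the slide describes for a term that belongs to more than one cluster.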
Query engine: iterative FCM. Identical terms that belong to different clusters are linked together. The clusters and the membership degrees remain unchanged. [Figure: membership levels for the clusters Green, Red and Blue.]
Query engine: iterative FCM (ontology). Each term is linked with other terms. Each of those terms is in turn linked with further terms. Every newly tagged source on the Internet creates new term links. [Figure: membership of a term in the clusters Green, Red and Blue.]
Query engine: dendrogram [Figure: dendrogram showing membership levels for the clusters Red, Blue and Green.]
Meta search engine [Figure: a fuzzy set search query passes from the meta search engine to blog search engines (Technorati, Blogdigger, etc.) and on to the blogosphere.] Action:
2. The meta search engine sends the fuzzy set search query to other blog search engines.
3. Each blog search engine sends the query to the blogosphere…
4. …and gathers the results.
5. The meta search engine collects all results…
6. …and aggregates them.
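The send, collect and aggregate steps can be sketched as follows. Everything in this block is hypothetical: the two engines are local stubs that return canned results, not calls to Technorati or Blogdigger, and the choice to keep the highest score per URL is just one possible aggregation rule.

```python
def meta_search(fuzzy_query, engines):
    """Send the query to every engine, collect the results and aggregate them.

    fuzzy_query maps each term to its membership degree.
    Each engine is a callable returning (url, score) pairs.
    """
    aggregated = {}
    for engine in engines:
        for url, score in engine(fuzzy_query):
            # Assumed aggregation rule: keep the best score per URL.
            aggregated[url] = max(score, aggregated.get(url, 0.0))
    # Highest score first; URL as tie-breaker for a stable order.
    return sorted(aggregated.items(), key=lambda item: (-item[1], item[0]))


def stub_engine_a(query):
    """Hypothetical blog search engine returning canned results."""
    return [
        ("blog-a.example/oled", query.get("OLED", 0.0)),
        ("blog-b.example/led", query.get("LED", 0.0)),
    ]


def stub_engine_b(query):
    """Second hypothetical engine; overlaps with the first on one URL."""
    return [
        ("blog-b.example/led", query.get("LED", 0.0)),
        ("blog-c.example/led-tv", query.get("LED", 0.0)),
    ]


query = {"OLED": 1.0, "LED": 0.9}
results = meta_search(query, [stub_engine_a, stub_engine_b])
print(results)
```

The URL returned by both stubs appears only once in the aggregated list.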
Aggregated documents: Blogretrievr (www.blogretrievr.com/) [Screenshot: Blogretrievr™ result view for the terms "Yo-yo" and "Hand puppet", with the FuzzynessFactor control.] Caption: 1. Search Map 2. Search Results 3. Map Rotation 4. Zoom in/out 5. New search
Example: problem specifications. What is coming around the corner? Samsung is screening the competitors for new killer applications. In the blogosphere, new technologies are discussed earlier than in other media. Terms: OLED, LCD, LED, OEL.
Example: Pre-search. Terms related to OLED and their membership degrees: OLED [1], LED [0.9,1], OEL [0.6,1].
Example: The search. Search for a weblog with new OLED technology. The membership degree is [0.8,1]. This includes OLED [1] and LED [0.9,1], but not OEL [0.6,1]. [Figure: FuzzynessFactor set to [0.8..1] over the terms OEL [0.6,1], LED [0.9,1] and OLED [1].]
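The selection of terms in this example can be sketched as a threshold on membership degrees. The sketch assumes that each interval such as [0.9,1] can be represented by its lower bound, and the function name `expand_query` is hypothetical.

```python
# Membership degrees from the pre-search, using the lower bound of each interval.
related_terms = {"OLED": 1.0, "LED": 0.9, "OEL": 0.6}


def expand_query(memberships, fuzzyness_factor):
    """Keep every term whose membership degree reaches the chosen threshold."""
    return {term for term, degree in memberships.items() if degree >= fuzzyness_factor}


fuzzy_terms = expand_query(related_terms, 0.8)
boolean_terms = expand_query(related_terms, 1.0)
print(sorted(fuzzy_terms))
print(sorted(boolean_terms))
```

With a threshold of 0.8 the query covers OLED and LED but not OEL, matching the slide. A threshold of 1.0 keeps only the exact term OLED, which is how a Boolean search would behave under this assumption.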
Example: Results [Figure: two diagrams of the terms OLED, LCD, LED and OEL. Legend: found with Boolean search; found with fuzzy search [0.8..1]; not found with fuzzy search [0.8..1].]
Concluding remarks: The boundaries in fuzzy set theory are not sharply defined.
The membership function takes values in the interval [0,1]. Membership in a fuzzy set is gradual rather than abrupt. As a result, it is possible to find more relevant documents.
Aggregated documents aim to organize the search results into several meaningful categories (clusters). A cluster is a group of similar topics that are related to the original. The user benefits include: [list not preserved in the transcript]

C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
 
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 
How to Fix the Import Error in the Odoo 17
How to Fix the Import Error in the Odoo 17How to Fix the Import Error in the Odoo 17
How to Fix the Import Error in the Odoo 17
 
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
 

Weblog Extraction With Fuzzy Classification Methods

  • 1. Weblog Extraction with Fuzzy Classification Methods Edy Portmann - University of Fribourg - Switzerland
  • 2. Content: Introduction (weblog extraction, folksonomies, fuzzy logic, fuzzy data clustering); Fuzzy weblog extraction (building blocks, interface, query engine, meta search engine, aggregated documents); Example; Concluding remarks; Questions and answers
  • 3. Weblog extraction. A weblog is a website with regular (reverse-chronological) entries of comments, descriptions of events, or other material. Weblogs provide instant news on a particular subject, and readers can leave comments. Data extraction is the act or process of retrieving data out of unstructured data sources.
  • 4.
  • 5.
  • 6. Hard vs. fuzzy clustering. In hard clustering, data is divided into distinct clusters, where each data element belongs to exactly one cluster. In fuzzy clustering, data elements can belong to more than one cluster, and associated with each element is a set of membership levels.
  • 9. Interface. Blogretrievr (www.blogretrievr.com/), shown with the example query "Yo-yo". Caption: 1. Search box; 2. Fuzzyness Factor; 3. Go!
  • 10. Query engine: Grassroots Tagging. Three sources carry the tags {Yo-yo, Triangle, Green}, {Yo-yo, Triangle, Red} and {Yo-yo, Triangle, Blue}. According to these tags, yo-yo, triangle and the colours green, red and blue must be related in some way. But in which way?
  • 11. Query engine: Jaccard coefficient. J(A, B) = |A ∩ B| / |A ∪ B|. [Figure: Venn diagrams of tag sets that are not at all similar, somewhat similar, and quite similar.]
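The Jaccard coefficient from slide 11 can be tried on the tag sets from slide 10. The following is only a minimal sketch: the function name `jaccard`, the extra `puppet` tag set and the choice to return 0.0 for two empty sets are assumptions made for illustration, not part of the slides.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two tag sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(union)


# Tag sets from slide 10
green = {"yo-yo", "triangle", "green"}
red = {"yo-yo", "triangle", "red"}
# Hypothetical tag set that shares no tag with the others
puppet = {"hand puppet", "string"}

print(jaccard(green, puppet))  # no shared tags: not at all similar
print(jaccard(green, red))     # two of four distinct tags shared: somewhat similar
print(jaccard(green, green))   # identical sets: as similar as possible
```

Two sources that share "yo-yo" and "triangle" but differ in colour score 2/4 = 0.5, which is the kind of graded similarity the clustering step can work with.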
  • 12. Query engine: fuzzy c-means (FCM). FCM is a method of clustering which allows one piece of data to belong to two or more clusters.
  • 13. Query engine: fuzzy c-means (FCM). For each term, the algorithm determines its degree of membership in each cluster. A term can belong to more than one cluster.
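To make slides 12 and 13 concrete, here is a small pure-Python sketch of the standard fuzzy c-means update rules. It is not the Blogretrievr implementation: the function name, the fuzzifier `m = 2`, the stopping tolerance and the one-dimensional toy data are all assumptions chosen for illustration.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iterations=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for a list of equal-length tuples."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iterations):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / total
                 for d in range(dim)]
            )
        # Memberships follow from the relative distances to all centers.
        new_u = []
        for p in points:
            dist = [
                max(sum((p[d] - ctr[d]) ** 2 for d in range(dim)) ** 0.5, 1e-12)
                for ctr in centers
            ]
            new_u.append(
                [1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                           for k in range(c))
                 for j in range(c)]
            )
        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break
    return centers, u


# Toy data: two groups and one point halfway between them.
data = [(0.0,), (1.0,), (5.0,), (9.0,), (10.0,)]
centers, memberships = fuzzy_c_means(data, c=2)
for point, row in zip(data, memberships):
    print(point, [round(v, 2) for v in row])
```

The point in the middle ends up with a noticeable membership in both clusters, while the outer points belong almost entirely to one cluster. That is the behaviour the slides describe for terms.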
  • 14. Query engine: iterative FCM. Identical terms that belong to different clusters are linked together. The clusters and the membership degrees remain unchanged. [Figure: membership levels for the clusters Green, Red and Blue.]
  • 15. Query engine: iterative FCM (ontology). Each term is linked with other terms. Every other term is in turn linked with further terms. Every newly tagged source (on the Internet) creates new term links. [Figure: term A and its membership in the clusters Green, Red and Blue.]
  • 16. Query engine: dendrogram. [Figure: dendrogram of the clusters Red, Blue and Green, drawn against membership level.]
  • 17. Meta search engine. Input: fuzzy set search query. 2. The meta search engine sends the fuzzy set search query to other blog search engines (Technorati, Blogdigger, etc.). 3. Each blog search engine sends the query to the blogosphere… 4. …and gathers the results. 5. The meta search engine collects all results… 6. …and aggregates them.
  • 18. Aggregated documents. Blogretrievr (www.blogretrievr.com/), shown with the terms "Yo-yo" and "Hand puppet". Caption: 1. Search Map; 2. Search Results; 3. Map Rotation; 4. Zoom in/out; 5. New search.
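The fan-out and aggregation steps on slide 17 might look roughly like the sketch below. The slides do not say how results are scored or merged, so everything here is an assumption: the engines are in-memory stubs (no real Technorati or Blogdigger API is called), the URLs are made up, and duplicates are merged by keeping the highest score per URL.

```python
def make_engine(index):
    """Build a stub blog search engine over a {url: set_of_terms} index."""
    def search(fuzzy_query):
        hits = []
        for url, terms in index.items():
            # Assumed scoring: best membership degree among the matching terms.
            score = max(
                (deg for term, deg in fuzzy_query.items() if term in terms),
                default=0.0,
            )
            if score > 0.0:
                hits.append((url, score))
        return hits
    return search


def meta_search(fuzzy_query, engines):
    """Send the query to every engine, collect and aggregate the results."""
    merged = {}
    for engine in engines:
        for url, score in engine(fuzzy_query):
            merged[url] = max(score, merged.get(url, 0.0))
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical engines and documents
engine_a = make_engine({
    "blog-a.example/1": {"OLED"},
    "blog-b.example/2": {"LED", "LCD"},
})
engine_b = make_engine({
    "blog-b.example/2": {"LED"},
    "blog-c.example/3": {"OEL"},
})

query = {"OLED": 1.0, "LED": 0.9}  # fuzzy set search query
results = meta_search(query, [engine_a, engine_b])
print(results)
```

The document returned by both stub engines appears once in the aggregated list, and the document tagged only with "OEL" is not returned because that term is not part of the query.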
  • 20. Example: problem specifications. What is coming around the edge? Samsung is screening the competitors for new killer applications. In the blogosphere, new technologies are discussed earlier than in other media. Terms: OLED, LCD, LED, OEL.
  • 21. Example: Pre-search. Terms related to OLED and their membership degrees: OLED [1], LED [0.9,1], OEL [0.6,1].
  • 22. Example: The search. Search for a weblog with new OLED technology. The membership degree is [0.8,1] (set with the Fuzzyness Factor). This includes OLED [1] and LED [0.9,1], but not OEL [0.6,1].
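The selection on slide 22 amounts to keeping every related term whose membership degree reaches the lower bound of the chosen interval. A minimal sketch, using the degrees from slide 21; the function name `expand_query` and the dictionary layout are assumptions.

```python
# Membership degrees of terms related to OLED (slide 21)
related = {"OLED": 1.0, "LED": 0.9, "OEL": 0.6}


def expand_query(memberships, threshold):
    """Return the terms whose membership degree is at least `threshold`."""
    return {term for term, degree in memberships.items() if degree >= threshold}


fuzzy_terms = expand_query(related, 0.8)    # Fuzzyness Factor [0.8..1]
boolean_terms = expand_query(related, 1.0)  # exact match only
print(sorted(fuzzy_terms))
print(sorted(boolean_terms))
```

With the lower bound at 0.8 the query covers OLED and LED but not OEL. Raising the bound to 1 leaves only OLED, which corresponds to a plain Boolean search for the original term.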
  • 23.
  • 24. [Figure: two panels over the terms OLED, LCD, LED and OEL, marking what is found with Boolean search, what is found with fuzzy search [0.8..1], and what is not found with fuzzy search [0.8..1].]
  • 26.
  • 27. This function (the membership function of a fuzzy set) takes values in the interval [0,1]. Membership in a fuzzy set is intrinsically gradual instead of abrupt. As a result, it is possible to find more relevant documents.
  • 28.
  • 29. Concluding remarks. View similar results together in folders rather than scattered throughout a list.