+ 
Extending Faceted Search to the 
General Web 
2014/11/25 (Tue.) 
Weize Kong, James Allan 
CIKM '14
Chang Wei-Yuan @ MakeLab Group Meeting
+Outline
• Introduction
• Method
  • Facet Generation
  • Facet Feedback
  • Evaluation
• Experiment
• Conclusion
• Thought
+Introduction
• Faceted search helps users by offering drill-down options as a complement to the keyword input box.
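+Introduction
• However, this idea is not well explored for general web search.
  • Reason: the heterogeneous nature of the Web
[Figure: annotated search page for the query "baggage allowance". On-screen labels (translated from Chinese): All routes, International routes, Domestic routes, Freight carrier, Baggage type. Annotations: ← query, ← facet, ← facet term, ↓ search result (document).]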
+Introduction
• Goal:
  • query-dependent automatic facet generation
  • incorporating user feedback on these query facets into document ranking
+Flow Chart
[Figure: flow chart of the pipeline. Data boxes: Query, Search Result, Candidate Facets, Facets, Selected Terms, Top-ranked Documents. Process steps: Extracting Candidates, Refining Candidates, Facet Feedback.]
+Facet Generation
• Input: Query and Search Result
• Step 1: Extracting Candidates
• Step 2: Refining Candidates
• Output: Query Facet
+Facet Generation
• Step 1: Extracting Candidates
  • Apply both textual and HTML patterns to the top search results.
+Facet Generation
• Step 1: Extracting Candidates
  • query: “mars landing”
  • search results
    • “Mars rovers such as Curiosity, Opportunity and Spirit”
  • candidate facets
    • C: { Curiosity, Opportunity, Spirit }
+Facet Generation
• Step 1: Extracting Candidates
  • The extracted candidate query facets are noisy:
    • some are not relevant to the issued query
    • some contain terms that are not members of the same class
+Facet Generation
• Step 1: Extracting Candidates
  • query: “mars landing”
  • candidate facets: [Figure: list of extracted candidate facets; not recoverable from the extracted text]
  • Refine
+Facet Generation
• Step 2: Refining Candidates
  • Re-cluster the candidate query facets or their facet terms into higher-quality query facets.
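To make "re-clustering" concrete, here is a deliberately simple stand-in: candidate lists that share enough terms are merged greedily. It is not pLSA, LDA, QDMiner, QF-I, or QF-J; the Jaccard threshold, the function names, and the extra term "Sojourner" in the sample data are assumptions made purely for illustration.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap between two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_candidates(candidates: list[set[str]], threshold: float = 0.3) -> list[set[str]]:
    """Greedily merge each candidate into the first facet it overlaps enough with."""
    facets: list[set[str]] = []
    for cand in candidates:
        for facet in facets:
            if jaccard(cand, facet) >= threshold:
                facet |= cand  # grow the existing facet in place
                break
        else:
            facets.append(set(cand))  # no sufficiently close facet: start a new one
    return facets


# Hypothetical candidate lists for the query "mars landing".
candidates = [
    {"Curiosity", "Opportunity", "Spirit"},
    {"Curiosity", "Spirit", "Sojourner"},
    {"2007", "2011", "2012"},
]
print(merge_candidates(candidates))
```

The two rover lists share two of four distinct terms, so they collapse into one facet, while the list of years stays separate.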
+Facet Generation
• Step 2: Refining Candidates
  • Topic modeling
    • pLSA, LDA
  • Unsupervised clustering method
    • QDMiner (QDM)
  • Supervised methods based on a graphical model
    • QF-I, QF-J
+Facet Generation
• Input: Query and Search Result
• Step 1: Extracting Candidates
• Step 2: Refining Candidates
• Output: Facet: { a set of terms }
  • Year: { 2007, 2011, 2012 }
  • Lab: { NASA, Mars Science Lab, Curiosity Lab }
+Facet Feedback
• Input: Document, Query, User Selection
  • Document = one of the search results
• Boolean Filtering Model
• Soft Ranking Model
• Output: the score of each document
+Facet Feedback
• Boolean Filtering Model
  • Fu denotes the set of feedback facets that the user selected
  • condition B can be AND, OR, or A+O
  • S(D, Q) is the score returned by the original retrieval model
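The slide's formula did not survive extraction, so the following is a sketch under stated assumptions rather than the paper's definition. It assumes that a document keeping its original score S(D, Q) when it satisfies condition B, and being pushed to the bottom otherwise, is what "filtering" means here, and it reads A+O as "OR within a facet, AND across facets". The function names and the dictionary layout of the feedback are hypothetical.

```python
import math


def satisfies(doc_terms: set[str], feedback: dict[str, set[str]], condition: str) -> bool:
    """Check a document against the selected terms (facet name -> selected terms)."""
    selected = [terms for terms in feedback.values() if terms]
    if condition == "AND":  # every selected term must appear
        return all(t in doc_terms for terms in selected for t in terms)
    if condition == "OR":  # at least one selected term must appear
        return any(t in doc_terms for terms in selected for t in terms)
    if condition == "A+O":  # assumed reading: OR within a facet, AND across facets
        return all(any(t in doc_terms for t in terms) for terms in selected)
    raise ValueError(f"unknown condition: {condition}")


def boolean_filter_score(original_score: float, doc_terms: set[str],
                         feedback: dict[str, set[str]], condition: str) -> float:
    """Keep S(D, Q) for documents that pass the filter; demote all others."""
    return original_score if satisfies(doc_terms, feedback, condition) else -math.inf


doc_terms = {"mars", "curiosity", "2012"}
feedback = {"rover": {"curiosity", "spirit"}, "year": {"2012"}}
for condition in ("AND", "OR", "A+O"):
    print(condition, boolean_filter_score(0.7, doc_terms, feedback, condition))
```

In this toy case the document fails AND (it lacks "spirit") but passes OR and A+O, which shows how strict the three conditions are relative to each other.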
+Facet Feedback
• Soft Ranking Model
  • λ is a parameter for adjusting the weight between the original score and the expansion part
  • SE(D, Fu) is the expansion part, which captures the relevance between the document and the feedback facets
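As with the Boolean model, the formula itself is missing from the extracted slide. The sketch below assumes a linear interpolation, λ · S(D, Q) + (1 − λ) · SE(D, Fu), which is consistent with the two bullet points but is not confirmed by the slide text. The `expansion_score` function is a placeholder (fraction of selected terms found in the document), not the paper's expansion model, and all names and sample scores are hypothetical.

```python
def expansion_score(doc_terms: set[str], feedback_terms: set[str]) -> float:
    """Placeholder for SE(D, Fu): fraction of selected terms found in the document."""
    if not feedback_terms:
        return 0.0
    return len(doc_terms & feedback_terms) / len(feedback_terms)


def soft_rank_score(original_score: float, doc_terms: set[str],
                    feedback_terms: set[str], lam: float = 0.5) -> float:
    """Assumed combination: lam * S(D, Q) + (1 - lam) * SE(D, Fu)."""
    return lam * original_score + (1.0 - lam) * expansion_score(doc_terms, feedback_terms)


# Hypothetical documents: id -> (original score S(D, Q), terms in the document).
docs = {
    "d1": (0.9, {"mars", "landing", "history"}),
    "d2": (0.7, {"mars", "curiosity", "2012"}),
}
feedback_terms = {"curiosity", "2012"}
ranked = sorted(docs, key=lambda d: soft_rank_score(*docs[d], feedback_terms), reverse=True)
print(ranked)
```

Unlike Boolean filtering, no document is discarded: a document that misses the selected terms only loses the expansion part of its score.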
+Evaluation
• Intrinsic Evaluation
  • Ground truth: query facets constructed by human annotators
    • The facets generated by the different systems are pooled.
    • Annotators are asked to group or re-group the terms in the pool into preferred query facets.
  • The ground truth is then compared with the facets generated by the different systems.
+Evaluation
• Extrinsic Evaluation
  • User Model
    • The user model describes how a user selects feedback terms from facets, based on which we can estimate the time cost for the user.
    [Equation image not recovered; its two annotated parts are the time for selecting terms and the time for scanning facets.]
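A rough way to picture the time-cost estimate is to add the two annotated parts together. The sketch assumes that the user scans every term of every presented facet and that both costs are linear; the per-term costs `t_scan` and `t_select` and the function name are hypothetical and are not taken from the paper.

```python
def feedback_time_cost(facets: list[list[str]], selected: set[str],
                       t_scan: float = 1.0, t_select: float = 2.0) -> float:
    """Estimated user time: scanning cost plus selection cost (both assumed linear)."""
    scanned = sum(len(facet) for facet in facets)  # assumed: every term is read
    chosen = sum(1 for facet in facets for term in facet if term in selected)
    return t_scan * scanned + t_select * chosen


facets = [["Curiosity", "Opportunity", "Spirit"], ["2007", "2011", "2012"]]
print(feedback_time_cost(facets, selected={"Curiosity", "2012"}))
```

With a cost like this, retrieval gains can be reported against the time a user is assumed to spend, so that long facet lists are not rewarded for free.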
+Evaluation
• Extrinsic Evaluation
  • Oracle Feedback and Annotator Feedback
    • Oracle feedback: only effective terms are selected as feedback.
    • Annotator feedback: the annotator is asked to select all the terms from the facets that would help address the information need.
+Experiment Settings
• Dataset
  • For the document corpus, we use the ClueWeb09 Category-B collection.
  • 196 queries and 678 query subtopics
• Facet Generation Models
  • pLSA, LDA, QDM, QF-I and QF-J
• Facet Feedback Models
  • Boolean filtering models, soft ranking models
• Baseline Retrieval Model
  • SDM, with MAP (mean average precision) = 0.185
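For readers unfamiliar with the metric quoted for the baseline, mean average precision can be computed as below. This is the standard textbook definition written as a minimal sketch; the query and document ids are made up, and nothing here reproduces the reported 0.185.

```python
def average_precision(ranked_ids: list[str], relevant_ids: set[str]) -> float:
    """Average of precision@k taken at each rank k where a relevant document appears."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(runs: dict[str, tuple[list[str], set[str]]]) -> float:
    """Mean of average precision over queries (query id -> (ranking, relevant set))."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs.values()) / len(runs)


# Hypothetical rankings and relevance judgments for two queries.
runs = {
    "q1": (["d1", "d2", "d3"], {"d1", "d3"}),
    "q2": (["d4", "d5"], {"d5"}),
}
print(round(mean_average_precision(runs), 4))
```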
+Facet Generation Models
[Figure: results for the facet generation models. Captions: "based on annotator feedback and SF feedback model" and "based on oracle feedback and SF feedback model".]
• Our experiments testify to the potential of Faceted Web Search.
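+Facet Feedback Models
[Figure: results for the facet feedback models; not recoverable from the extracted text.]
• Our experiments show that the feedback models are effective.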
+Conclusion
• This paper proposed Faceted Web Search:
  • an extension of faceted search to the general Web
  • query-dependent automatic facet generation
  • incorporation of user feedback on these query facets into document ranking
+ 
Thanks for listening. 
2014 / 11 / 25 (Tue.) @ MakeLab Group Meeting 
v123582@gmail.com

Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 

Extending faceted search to the general web

  • 1. Extending Faceted Search to the General Web. 2014/11/25 (Tue.). Weize Kong, James Allan, CIKM '14. Chang Wei-Yuan @ MakeLab Group Meeting
  • 2. Outline: Introduction; Method (Facet Generation, Facet Feedback, Evaluation); Experiment; Conclusion; Thought
  • 4. Introduction: Faceted search helps users by offering drill-down options as a complement to the keyword input box.
  • 5. Introduction: However, this idea is not well explored for general web search, because of the heterogeneous nature of the web.
  • 6. Introduction: Example (screenshot labels translated from Chinese). Query: "baggage allowance". Labels shown: All routes; All routes; International routes; Domestic routes; Cargo company; Baggage type.
  • 7. Introduction: The same example with annotations marking the query, a facet, a facet term, and the search results (documents) listed below them.
  • 8. Introduction. Goal: (1) query-dependent automatic facet generation; (2) incorporating user feedback on these query facets into document ranking.
  • 10. Flow Chart (figure). Data: Query, Search Result, Candidate Facets, Facets, Selected Terms, Top-ranked Documents. Steps: Extracting Candidates, Refining Candidates, Facet Feedback.
  • 12. [Facet Generation] Input: query and search results. Step 1: Extracting Candidates. Step 2: Refining Candidates. Output: query facets.
  • 13. [Facet Generation] Step 1: Extracting Candidates. Both textual and HTML patterns are applied to the top search results.
  • 14. [Facet Generation] Step 1: Extracting Candidates. Query: "mars landing". Search result snippet: "Mars rovers such as Curiosity, Opportunity and Spirit". Candidate facet: C = { Curiosity, Opportunity, Spirit }.
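The "such as" example above can be sketched in a few lines of Python. This is only an illustration of what a textual pattern might look like: the slides do not list the actual textual or HTML patterns, so the regular expression and the function name `extract_candidate_facet` are assumptions.

```python
import re

# Hypothetical textual pattern: "<head noun> such as A, B and C".
# The real pattern set used in the paper is not given in the slides.
SUCH_AS = re.compile(r"such as\s+(.+)", re.IGNORECASE)


def extract_candidate_facet(sentence):
    """Return the items listed after 'such as', or [] if the pattern is absent."""
    match = SUCH_AS.search(sentence)
    if not match:
        return []
    tail = match.group(1).strip().rstrip(".")
    # Split on commas (optionally followed by and/or) and on a bare and/or.
    parts = re.split(r"\s*,\s*(?:and\s+|or\s+)?|\s+and\s+|\s+or\s+", tail)
    return [p.strip() for p in parts if p.strip()]


snippet = "Mars rovers such as Curiosity, Opportunity and Spirit"
print(extract_candidate_facet(snippet))
```

A single pattern like this over-generates on real text, which is consistent with the talk's point that extracted candidates are noisy and need a refining step.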
  • 15. [Facet Generation] Step 1: Extracting Candidates. The extracted candidate query facets are noisy: some are not relevant to the issued query, and some contain terms that are not members of the same class.
  • 16. [Facet Generation] Step 1: Extracting Candidates. Query: "mars landing". Candidate facets: (figure).
  • 18. [Facet Generation] Step 1: Extracting Candidates. Query: "mars landing". Candidate facets: (figure) → Refine.
  • 19. [Facet Generation] Step 2: Refining Candidates. Re-cluster the candidate query facets, or their facet terms, into higher-quality query facets.
  • 20. [Facet Generation] Step 2: Refining Candidates. Topic modeling: pLSA, LDA. Unsupervised clustering: QDMiner (QDM). Supervised methods based on a graphical model: QF-I, QF-J.
  • 21. [Facet Generation] Input: query and search results. Step 1: Extracting Candidates. Step 2: Refining Candidates. Output: facets, each a set of terms, e.g. Year: { 2007, 2011, 2012 }; Lab: { NASA, Mars Science Lab, Curiosity Lab }.
  • 23. [Facet Feedback] Input: document (one of the search results), query, user selection. Models: Boolean Filtering Model; Soft Ranking Model. Output: the score of each document.
  • 24. [Facet Feedback] Boolean Filtering Model. Fu denotes the set of feedback facets that the user selected. The condition B can be AND, OR, or A+O. S(D, Q) is the score returned by the original retrieval model.
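A minimal sketch of Boolean filtering, under stated assumptions: a document keeps its original score S(D, Q) when it satisfies the condition B over the selected terms and is otherwise pushed to the bottom of the ranking. The slide does not show the formula or define A+O, so the negative-infinity score, the reading of A+O as "OR within a facet, AND across facets", and the names `satisfies` and `boolean_filter_score` are all assumptions made for illustration.

```python
import math


def satisfies(doc_terms, feedback, condition):
    """Check condition B for one document.

    doc_terms: set of terms occurring in the document.
    feedback:  dict mapping a facet name to the set of terms the user selected.
    """
    any_per_facet = [bool(terms & doc_terms) for terms in feedback.values()]
    all_per_facet = [terms <= doc_terms for terms in feedback.values()]
    if condition == "AND":
        return all(all_per_facet)   # every selected term must appear
    if condition == "OR":
        return any(any_per_facet)   # at least one selected term appears
    if condition == "A+O":
        # Assumed reading: OR within a facet, AND across facets.
        return all(any_per_facet)
    raise ValueError(f"unknown condition: {condition}")


def boolean_filter_score(original_score, doc_terms, feedback, condition="AND"):
    """Keep S(D, Q) for documents that pass the filter, else rank them last."""
    if not feedback:  # no feedback given: leave the ranking unchanged
        return original_score
    if satisfies(doc_terms, feedback, condition):
        return original_score
    return -math.inf


doc = {"curiosity", "nasa", "2012"}
selected = {"rover": {"curiosity", "spirit"}, "year": {"2012"}}
for cond in ("AND", "OR", "A+O"):
    print(cond, boolean_filter_score(1.7, doc, selected, cond))
```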
  • 25. [Facet Feedback] Soft Ranking Model. λ is a parameter that adjusts the weight of the expansion part. SE(D, Fu) is the expansion part, which captures the relevance between the document and the feedback facets.
  • 26. [Facet Feedback] Input: documents, query, user selection. Models: Boolean Filtering Model; Soft Ranking Model. Output: the score of each document.
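A minimal sketch of soft ranking, assuming the combined score is additive: the original score S(D, Q) plus λ times the expansion part SE(D, Fu). The formula itself is not preserved in the transcript, so the additive form, the choice of averaging per-term scores to obtain SE, and the name `soft_ranking_score` are assumptions.

```python
def soft_ranking_score(original_score, term_scores, lam=0.5):
    """Combine S(D, Q) with an expansion score weighted by lambda.

    term_scores: relevance scores between the document and each selected
                 feedback term (how these are computed is left open here).
    lam:         weight of the expansion part (hypothetical default).
    """
    # Assumed expansion part: the mean of the per-term scores.
    expansion = sum(term_scores) / len(term_scores) if term_scores else 0.0
    return original_score + lam * expansion


print(soft_ranking_score(1.0, [0.2, 0.4], lam=0.5))
```

Unlike Boolean filtering, a document that matches none of the selected terms is only demoted relative to others rather than removed from the ranking.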
  • 27. [Evaluation] Intrinsic Evaluation. Ground truth: query facets constructed by human annotators. Facets generated by the different systems are pooled, and annotators are asked to group or re-group the terms in the pool into preferred query facets. The ground truth is then compared with the facets generated by each system.
  • 28. [Evaluation] Extrinsic Evaluation: User Model. The user model describes how a user selects feedback terms from facets, based on which we can estimate the time cost for the user. The time cost has two parts: the time for scanning facets and the time for selecting terms.
  • 29. [Evaluation] Extrinsic Evaluation: Oracle Feedback and Annotator Feedback. The oracle feedback model selects only effective terms as feedback. In annotator feedback, the annotator is asked to select all the terms from the facets that would help address the information need.
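The two-part time cost can be sketched as a simple sum. The slide's formula is not preserved, so the linear form, the per-facet and per-term constants, and the name `feedback_time_cost` are assumptions used only to make the idea concrete.

```python
def feedback_time_cost(facets_scanned, terms_selected, t_facet=1.0, t_term=1.0):
    """Estimated user time: scanning facets plus selecting terms.

    t_facet and t_term are hypothetical per-item costs in arbitrary time units.
    """
    return facets_scanned * t_facet + terms_selected * t_term


# A user who scans 2 facets and selects 3 terms.
print(feedback_time_cost(2, 3, t_facet=2.0, t_term=1.5))
```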
  • 31. Experiment Settings. Dataset: the document corpus is the ClueWeb09 Category-B collection, with 196 queries and 678 query subtopics. Facet generation models: pLSA, LDA, QDM, QF-I and QF-J. Facet feedback models: Boolean filtering models, soft ranking models. Baseline retrieval model: SDM, with MAP (mean average precision) = 0.185.
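For reference, mean average precision can be computed as below. This is the standard definition, not code from the talk; the function names and the toy relevance lists are illustrative, and the toy data has no connection to the reported 0.185.

```python
def average_precision(ranked_relevance, num_relevant):
    """Average precision for one query.

    ranked_relevance: 0/1 relevance flags in ranked order.
    num_relevant:     total number of relevant documents for the query.
    """
    if num_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / num_relevant


def mean_average_precision(runs):
    """Mean of average precision over (ranked_relevance, num_relevant) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, n) for r, n in runs) / len(runs)


toy_runs = [([1, 0, 1], 2), ([0, 1], 1)]
print(mean_average_precision(toy_runs))
```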
  • 33. Facet Generation Models (two result plots): one based on annotator feedback with the SF feedback model, and one based on oracle feedback with the SF feedback model.
  • 34. Facet Generation Models (same two plots): Our experiments testify to the potential of Faceted Web Search.
  • 36. Facet Feedback Models: Our experiments show that the feedback models are effective.
  • 38. Conclusion: This paper proposed Faceted Web Search, an extension of faceted search to the general Web. It combines query-dependent automatic facet generation with the incorporation of user feedback on these query facets into document ranking.
  • 40. Thanks for listening. 2014/11/25 (Tue.) @ MakeLab Group Meeting. v123582@gmail.com