Explass: Exploring Associations between Entities 
via Top-K Ontological Patterns and Facets 
Gong Cheng, Yanan Zhang, Yuzhong Qu 
Websoft Research Group 
State Key Laboratory for Novel Software Technology 
Nanjing University, China
Association search 
air pollution ? autism 
You ? ?
Association search on the Web of documents 
associations hidden in text
Association search on an entity-relation graph 
[Figure: graph connecting Alice and Bob through paper-A, paper-B, paper-C, paper-D, article-A, conf-A, and conf-B, with edges labeled firstAuthor, secondAuthor, inProcOf, cites, extends, reviewer, and chair] 
associations exposed as graph
association = path 
(each path alternates properties and entities between Alice and Bob) 
Alice secondAuthor paper-A inProcOf conf-A reviewer Bob 
Alice firstAuthor paper-B inProcOf conf-B chair Bob 
Alice firstAuthor paper-B cites paper-C firstAuthor Bob 
Alice secondAuthor paper-D cites paper-C firstAuthor Bob 
Alice secondAuthor paper-D extends article-A firstAuthor Bob
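The idea that an association is a path can be sketched as a bounded depth-first search. This is only an illustration: the edge list is transcribed from the example graph, edges are treated as undirected because the slide does not show directions, and the helper name `find_associations` and its `max_hops` parameter are hypothetical rather than part of Explass.

```python
from collections import defaultdict

# Example graph from the slides as (node, property, node) triples.
# Edge directions are not recoverable from the slide, so this sketch
# treats every edge as undirected (an assumption).
EDGES = [
    ("Alice", "secondAuthor", "paper-A"),
    ("paper-A", "inProcOf", "conf-A"),
    ("conf-A", "reviewer", "Bob"),
    ("Alice", "firstAuthor", "paper-B"),
    ("paper-B", "inProcOf", "conf-B"),
    ("conf-B", "chair", "Bob"),
    ("paper-B", "cites", "paper-C"),
    ("paper-C", "firstAuthor", "Bob"),
    ("Alice", "secondAuthor", "paper-D"),
    ("paper-D", "cites", "paper-C"),
    ("paper-D", "extends", "article-A"),
    ("article-A", "firstAuthor", "Bob"),
]


def find_associations(edges, source, target, max_hops=4):
    """Enumerate simple paths from source to target with at most max_hops edges."""
    adjacency = defaultdict(list)
    for a, prop, b in edges:
        adjacency[a].append((prop, b))
        adjacency[b].append((prop, a))

    results = []

    def extend(node, path, visited):
        if node == target:
            results.append(tuple(path))
            return
        # path holds 1 + 2 * hops elements, so hops == len(path) // 2
        if len(path) // 2 >= max_hops:
            return
        for prop, neighbor in adjacency[node]:
            if neighbor not in visited:
                extend(neighbor, path + [prop, neighbor], visited | {neighbor})

    extend(source, [source], {source})
    return results


if __name__ == "__main__":
    for association in find_associations(EDGES, "Alice", "Bob"):
        print(" ".join(association))
```

With the default limit of 4 hops, the search returns the five 3-hop associations listed on the slide; longer detours through paper-C need 5 hops and are cut off.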
Challenge 
over 1,000 associations 
in DBpedia 
(within 4 hops) 
How to explore them?
Exploration methods (1) 
• Clustering 
• Facets
cluster = pattern 
Position 1      Position 2   Position 3   Position 4   Position 5 
author          Paper        inProcOf     Conference   role           (pattern) 
secondAuthor    paper-A      inProcOf     conf-A       reviewer       (matching association) 
firstAuthor     paper-B      inProcOf     conf-B       chair          (matching association) 
Pattern elements: a common super-property (positions 1, 3, 5) or a common class (positions 2, 4)
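A minimal sketch of what "a pattern matches an association" could mean in code. Each association is written in position order (property, entity, property, entity, property), following the Position 1 to Position 5 labels. The toy ontology (`SUPER_CLASS`, `SUPER_PROPERTY`, `ENTITY_CLASS`) is an assumption inferred from the slide example, and `matches` is a hypothetical helper, not the authors' implementation.

```python
# Assumed toy ontology, inferred from the slide example (not given explicitly).
SUPER_CLASS = {"ConfPaper": "Paper"}
SUPER_PROPERTY = {
    "firstAuthor": "author",
    "secondAuthor": "author",
    "reviewer": "role",
    "chair": "role",
}
# Assumed entity types; only paper-A and conf-A are typed on the slides.
ENTITY_CLASS = {
    "paper-A": "ConfPaper",
    "paper-B": "ConfPaper",
    "paper-C": "Paper",
    "paper-D": "Paper",
    "article-A": "Article",
    "conf-A": "Conference",
    "conf-B": "Conference",
}


def ancestors(term, parent_of):
    """Return the term itself plus all of its ancestors in the hierarchy."""
    found = {term}
    while term in parent_of:
        term = parent_of[term]
        found.add(term)
    return found


def matches(pattern, association):
    """True if every pattern element generalizes the association element."""
    if len(pattern) != len(association):
        return False
    for index, (expected, actual) in enumerate(zip(pattern, association)):
        if index % 2 == 0:  # positions 1, 3, 5 hold properties
            allowed = ancestors(actual, SUPER_PROPERTY)
        else:  # positions 2, 4 hold entities, compared by class
            allowed = ancestors(ENTITY_CLASS[actual], SUPER_CLASS)
        if expected not in allowed:
            return False
    return True


if __name__ == "__main__":
    pattern = ("author", "Paper", "inProcOf", "Conference", "role")
    print(matches(pattern, ("secondAuthor", "paper-A", "inProcOf", "conf-A", "reviewer")))
    print(matches(pattern, ("firstAuthor", "paper-B", "cites", "paper-C", "firstAuthor")))
```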
Problem: To recommend k patterns 
secondAuthor paper-A inProcOf conf-A reviewer 
firstAuthor paper-B inProcOf conf-B chair 
firstAuthor paper-B cites paper-C firstAuthor 
secondAuthor paper-D cites paper-C firstAuthor 
secondAuthor paper-D extends article-A firstAuthor
Step 1: Mining all significant patterns 
pattern: author Paper inProcOf Conference role 
frequency = 2/5 > threshold (the pattern matches the first two of the five associations) 
secondAuthor paper-A inProcOf conf-A reviewer 
firstAuthor paper-B inProcOf conf-B chair 
firstAuthor paper-B cites paper-C firstAuthor 
secondAuthor paper-D cites paper-C firstAuthor 
secondAuthor paper-D extends article-A firstAuthor
Formulated as frequent itemset mining 
1. transaction = association 
item = <position, class> or <position, property> 
2. Mining frequent itemsets 
3. itemset → pattern 
Example transaction: secondAuthor paper-A inProcOf conf-A reviewer 
Position 1: <1, secondAuthor> <1, author> 
Position 2: <2, ConfPaper> <2, Paper> 
Position 3: <3, inProcOf> 
Position 4: <4, Conference> 
Position 5: <5, reviewer> <5, role>
Formulated as frequent itemset mining (continued) 
2. Mining frequent itemsets 
<1, author> <2, ConfPaper> <2, Paper> <3, inProcOf> <4, Conference> <5, role> 
3. itemset → pattern 
author Paper inProcOf Conference role
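The transaction encoding can be sketched as follows, with associations written in position order (property, entity, property, entity, property). Instead of a real frequent itemset miner such as Apriori or FP-growth, this sketch uses brute-force counting of itemsets that contain exactly one item per position, which is a simplification. The ontology tables, the entity types other than those of paper-A and conf-A, the threshold value 0.3, and all function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Assumed toy ontology and entity types (only partly visible on the slides).
SUPER = {
    "ConfPaper": "Paper",
    "firstAuthor": "author",
    "secondAuthor": "author",
    "reviewer": "role",
    "chair": "role",
}
ENTITY_CLASS = {
    "paper-A": "ConfPaper", "paper-B": "ConfPaper",
    "paper-C": "Paper", "paper-D": "Paper", "article-A": "Article",
    "conf-A": "Conference", "conf-B": "Conference",
}

ASSOCIATIONS = [
    ("secondAuthor", "paper-A", "inProcOf", "conf-A", "reviewer"),
    ("firstAuthor", "paper-B", "inProcOf", "conf-B", "chair"),
    ("firstAuthor", "paper-B", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "extends", "article-A", "firstAuthor"),
]


def generalizations(term):
    """The term followed by its super-classes or super-properties."""
    chain = [term]
    while chain[-1] in SUPER:
        chain.append(SUPER[chain[-1]])
    return chain


def to_transaction(association):
    """Encode an association as a set of <position, class/property> items."""
    items = set()
    for position, element in enumerate(association, start=1):
        # odd positions hold properties, even positions hold entities
        term = element if position % 2 == 1 else ENTITY_CLASS[element]
        items.update((position, t) for t in generalizations(term))
    return items


def mine_patterns(associations, threshold):
    """Return {pattern: frequency} for patterns with frequency > threshold."""
    counts = Counter()
    for association in associations:
        by_position = {}
        for position, term in to_transaction(association):
            by_position.setdefault(position, []).append(term)
        # every combination of one item per position is a candidate pattern
        for pattern in product(*(by_position[p] for p in sorted(by_position))):
            counts[pattern] += 1
    total = len(associations)
    return {p: c / total for p, c in counts.items() if c / total > threshold}


if __name__ == "__main__":
    for pattern, frequency in sorted(mine_patterns(ASSOCIATIONS, 0.3).items()):
        print(" ".join(pattern), frequency)
```

Under these assumptions, the first association yields exactly the eight items shown for it on the slide, and the pattern author Paper inProcOf Conference role reaches the frequency 2/5 stated there.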
Step 2: Finding k frequent, informative, and 
small-overlapping patterns 
• Frequency (as previous) 
• Informativeness 
• Overlap
• Informativeness 
  • informativeness of a class = self-information of its occurrence 
    (more informative = having fewer instances) 
    e.g. ConfPaper > Paper 
  • informativeness of a property = entropy of its values 
    (more informative = having more diverse values) 
    e.g. is-author-of > nationality
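The two informativeness measures can be illustrated with the standard definitions of self-information and Shannon entropy. The slides do not give the exact formulas, so the base-2 logarithm, the probability estimate (instances of the class divided by all entities), and the function names are assumptions; the counts and values in the usage part are made up.

```python
import math
from collections import Counter


def class_informativeness(instance_count, total_entities):
    """Self-information -log2(p), where p is the share of entities in the class."""
    return -math.log2(instance_count / total_entities)


def property_informativeness(values):
    """Shannon entropy (in bits) of the observed value distribution."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Made-up counts: a class with fewer instances is more informative.
    print(class_informativeness(250, 1000))   # e.g. ConfPaper
    print(class_informativeness(500, 1000))   # e.g. Paper
    # Made-up values: more diverse values give higher entropy.
    print(property_informativeness(["p1", "p2", "p3", "p4"]))   # e.g. is-author-of
    print(property_informativeness(["CN", "CN", "CN", "US"]))   # e.g. nationality
```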
• Overlap 
  • Ontological overlap: holding subClassOf/subPropertyOf relations 
  • Contextual overlap: matched by common associations in the results 
  e.g. ontological overlap between the patterns 
  author ConfPaper inProcOf Conference role 
  firstAuthor Paper cites Paper author
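One possible way to quantify the two kinds of overlap, with patterns written in position order (property, class, property, class, property). The slides do not say how overlap is measured, so counting related positions and counting shared matched associations are both assumptions, as are the toy hierarchy `SUPER` and the function names. Identical terms are counted as related here, which is also an assumption.

```python
# Assumed toy hierarchy covering both subClassOf and subPropertyOf.
SUPER = {
    "ConfPaper": "Paper",
    "firstAuthor": "author",
    "secondAuthor": "author",
    "reviewer": "role",
    "chair": "role",
}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    found = {term}
    while term in SUPER:
        term = SUPER[term]
        found.add(term)
    return found


def ontological_overlap(pattern_a, pattern_b):
    """Number of positions whose terms are identical or sub/super related."""
    return sum(
        1
        for a, b in zip(pattern_a, pattern_b)
        if a in ancestors(b) or b in ancestors(a)
    )


def contextual_overlap(matched_a, matched_b):
    """Number of associations in the results matched by both patterns."""
    return len(set(matched_a) & set(matched_b))


if __name__ == "__main__":
    first = ("author", "ConfPaper", "inProcOf", "Conference", "role")
    second = ("firstAuthor", "Paper", "cites", "Paper", "author")
    print(ontological_overlap(first, second))
    print(contextual_overlap({0, 1}, {1, 2}))
```

For the two patterns from the slide, the related positions are 1 (firstAuthor under author) and 2 (ConfPaper under Paper).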
Formulated as multidimensional 0-1 knapsack 
• Find k patterns that 
  maximize frequency × informativeness (goal) 
  and do not share considerably large overlap (constraints) 
• Solved by a greedy algorithm
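A hedged sketch of one greedy strategy for this selection: rank candidates by frequency × informativeness and accept a pattern only if its overlap with every pattern already chosen stays within a limit. The slides do not describe the actual greedy algorithm, so this is a generic heuristic; `select_top_k`, `shared_positions`, `max_overlap`, and the scores in the example are all hypothetical.

```python
def select_top_k(candidates, k, overlap, max_overlap):
    """Greedily pick up to k patterns.

    candidates: list of (pattern, frequency, informativeness) tuples.
    overlap: function returning an overlap score for two patterns.
    """
    ranked = sorted(candidates, key=lambda c: c[1] * c[2], reverse=True)
    selected = []
    for pattern, _frequency, _informativeness in ranked:
        if len(selected) == k:
            break
        if all(overlap(pattern, chosen) <= max_overlap for chosen in selected):
            selected.append(pattern)
    return selected


def shared_positions(pattern_a, pattern_b):
    """Stand-in overlap measure: number of positions with identical terms."""
    return sum(1 for a, b in zip(pattern_a, pattern_b) if a == b)


if __name__ == "__main__":
    # Made-up frequency and informativeness values for illustration only.
    candidates = [
        (("author", "Paper", "inProcOf", "Conference", "role"), 0.4, 2.0),
        (("author", "ConfPaper", "inProcOf", "Conference", "role"), 0.4, 3.0),
        (("author", "Paper", "cites", "Paper", "author"), 0.4, 1.5),
    ]
    for pattern in select_top_k(candidates, k=2, overlap=shared_positions, max_overlap=2):
        print(" ".join(pattern))
```

In this example the second-ranked pattern is skipped because it shares four positions with the top-ranked one, so the third candidate is chosen instead.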
Exploration methods (2) 
• Clustering 
• Facets 
  • facet values = classes of entities and properties 
    appearing in associations in the results 
  • Problem: To recommend k facet values 
    (solved in a similar way) 
e.g. the association secondAuthor paper-A inProcOf conf-A reviewer has the class facet values ConfPaper, Paper (paper-A) and Conference (conf-A)
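Facet values and facet-based filtering could be sketched as below, with associations in position order (property, entity, property, entity, property). The entity types other than those of paper-A and conf-A are assumptions, as are the helper names; whether super-properties also appear as property facet values is not stated on the slide, so this sketch uses only the properties as they occur.

```python
from collections import Counter

# Assumed class hierarchy and entity types (only partly visible on the slides).
SUPER_CLASS = {"ConfPaper": "Paper"}
ENTITY_CLASS = {
    "paper-A": "ConfPaper", "paper-B": "ConfPaper",
    "paper-C": "Paper", "paper-D": "Paper", "article-A": "Article",
    "conf-A": "Conference", "conf-B": "Conference",
}

ASSOCIATIONS = [
    ("secondAuthor", "paper-A", "inProcOf", "conf-A", "reviewer"),
    ("firstAuthor", "paper-B", "inProcOf", "conf-B", "chair"),
    ("firstAuthor", "paper-B", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "extends", "article-A", "firstAuthor"),
]


def class_facets(association):
    """Classes (including super-classes) of the entities in an association."""
    classes = set()
    for entity in association[1::2]:  # entities sit at positions 2 and 4
        term = ENTITY_CLASS[entity]
        classes.add(term)
        while term in SUPER_CLASS:
            term = SUPER_CLASS[term]
            classes.add(term)
    return classes


def facet_counts(associations):
    """Count in how many associations each class or property facet value occurs."""
    classes, properties = Counter(), Counter()
    for association in associations:
        classes.update(class_facets(association))
        properties.update(set(association[0::2]))  # properties at positions 1, 3, 5
    return classes, properties


def filter_by_class(associations, facet_class):
    """Keep only the associations that contain an entity of the given class."""
    return [a for a in associations if facet_class in class_facets(a)]


if __name__ == "__main__":
    classes, properties = facet_counts(ASSOCIATIONS)
    print(classes.most_common())
    print(properties.most_common())
    print(filter_by_class(ASSOCIATIONS, "Conference"))
```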
Demo based on DBpedia 
ws.nju.edu.cn/explass
[Screenshot annotations: facet values (classes); facet values (properties); a collapsed pattern; an expanded pattern; associations not matching any pattern above]
User study 
• 26 association exploration tasks over DBpedia 
  • Derived from QALD queries and “People also search for” 
  • Example (from QALD): Suppose you will write an article about the associations between Abraham Lincoln and George Washington. Use the given system to explore their associations and identify several themes to discuss in the article. 
• 20 subjects 
• 3 approaches 
  • Explass: clustering + facets 
  • RelClus: clustering into a hierarchy of patterns 
  • RF: facets only (similar to RelFinder)
Post-task questionnaire results
Usability scores (SUS)
User behavior
Conclusion 
1. Provide patterns wisely. 
  • To avoid deep, complicated hierarchies 
  • To avoid very general, almost meaningless concepts 
2. Combine patterns and facets wisely. 
  • Patterns as meaningful summaries of results 
  • Facets as filters for refining the search
Future work 
• Performance optimization 
  • (online) path finding 
  • (online) frequent itemset mining 
• Exploring associations between several entities, or a data set
Questions?
