
Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Presented at ISWC 2014, Riva del Garda, Italy.

  1. Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets
     Gong Cheng, Yanan Zhang, Yuzhong Qu
     Websoft Research Group, State Key Laboratory for Novel Software Technology, Nanjing University, China
  2. Association search
  3. Association search: air pollution ? autism
  4. Association search: You ? ? ? ?
  5. Association search on the Web of documents: associations hidden in text
  6. Association search on an entity-relation graph: associations exposed as graph
     [Figure: graph over Alice, Bob, paper-A, paper-B, paper-C, paper-D, article-A, conf-A and conf-B, with edges labeled firstAuthor, secondAuthor, inProcOf, cites, extends, reviewer and chair]
  7. association = path. Five paths between Alice and Bob:
     • Alice -[secondAuthor]- paper-A -[inProcOf]- conf-A -[reviewer]- Bob
     • Alice -[firstAuthor]- paper-B -[inProcOf]- conf-B -[chair]- Bob
     • Alice -[firstAuthor]- paper-B -[cites]- paper-C -[firstAuthor]- Bob
     • Alice -[secondAuthor]- paper-D -[cites]- paper-C -[firstAuthor]- Bob
     • Alice -[secondAuthor]- paper-D -[extends]- article-A -[firstAuthor]- Bob
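The "association = path" idea can be sketched as a bounded path search. This is a minimal illustration, not the authors' implementation: the edge list is read off the toy graph on the slides, the edge directions are assumptions, and `find_associations` is a hypothetical helper name. Edges are traversed in both directions, and only simple paths (no repeated entity) are returned.

```python
from collections import defaultdict

# Toy graph from the slides; edge directions are assumed for illustration.
EDGES = [
    ("Alice", "secondAuthor", "paper-A"),
    ("paper-A", "inProcOf", "conf-A"),
    ("conf-A", "reviewer", "Bob"),
    ("Alice", "firstAuthor", "paper-B"),
    ("paper-B", "inProcOf", "conf-B"),
    ("conf-B", "chair", "Bob"),
    ("paper-B", "cites", "paper-C"),
    ("paper-C", "firstAuthor", "Bob"),
    ("Alice", "secondAuthor", "paper-D"),
    ("paper-D", "cites", "paper-C"),
    ("paper-D", "extends", "article-A"),
    ("article-A", "firstAuthor", "Bob"),
]


def find_associations(edges, source, target, max_hops):
    """Return all simple paths from source to target with at most max_hops edges.

    Each path is a tuple alternating entities and properties:
    (entity, property, entity, ..., entity).
    """
    adjacency = defaultdict(list)
    for subject, prop, obj in edges:
        adjacency[subject].append((prop, obj))
        adjacency[obj].append((prop, subject))  # ignore edge direction

    results = []

    def walk(node, path, visited):
        if node == target:
            results.append(tuple(path))
            return
        hops_so_far = (len(path) - 1) // 2
        if hops_so_far >= max_hops:
            return
        for prop, neighbor in adjacency[node]:
            if neighbor not in visited:
                walk(neighbor, path + [prop, neighbor], visited | {neighbor})

    walk(source, [source], {source})
    return results


associations = find_associations(EDGES, "Alice", "Bob", max_hops=3)
for path in associations:
    print(" ".join(path))
```

On this toy graph the search returns the five Alice-Bob paths listed on slide 7; on a graph the size of DBpedia an exhaustive search like this would need pruning or indexing.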
  8. Challenge: over 1,000 associations in DBpedia (within 4 hops). How to explore them?
  9. Exploration methods (1)
     • Clustering
     • Facets
  10. cluster = pattern. A pattern matches associations position by position:
      Position:     1             2        3         4           5
      Pattern:      author        Paper    inProcOf  Conference  role
      Association:  secondAuthor  paper-A  inProcOf  conf-A      reviewer
      Association:  firstAuthor   paper-B  inProcOf  conf-B      chair
      Properties generalize to a common super-property (author, role); entities generalize to a common class (Paper, Conference).
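A small sketch of what "a pattern matches an association" could mean in code. The hierarchy in `SUPER` and the entity types in `TYPE` are assumptions chosen to agree with the slide example (ConfPaper as a subclass of Paper appears on later slides; the type of paper-B is a guess), and `matches` is a hypothetical helper, not part of Explass.

```python
# Assumed toy ontology: child -> parent for classes and properties.
SUPER = {
    "secondAuthor": "author",
    "firstAuthor": "author",
    "reviewer": "role",
    "chair": "role",
    "ConfPaper": "Paper",
}

# Assumed entity types.
TYPE = {
    "paper-A": "ConfPaper",
    "paper-B": "ConfPaper",
    "paper-C": "Paper",
    "conf-A": "Conference",
    "conf-B": "Conference",
}


def is_a(label, general):
    """True if label equals general or is a descendant of it in SUPER."""
    while label != general:
        if label not in SUPER:
            return False
        label = SUPER[label]
    return True


def matches(pattern, association):
    """Check a pattern against an association, position by position.

    Both are 5-tuples; indices 0, 2, 4 hold properties and indices 1, 3
    hold entities (in the association) or classes (in the pattern).
    """
    if len(pattern) != len(association):
        return False
    for index, (general, name) in enumerate(zip(pattern, association)):
        label = TYPE[name] if index % 2 == 1 else name
        if not is_a(label, general):
            return False
    return True


PATTERN = ("author", "Paper", "inProcOf", "Conference", "role")
print(matches(PATTERN, ("secondAuthor", "paper-A", "inProcOf", "conf-A", "reviewer")))
print(matches(PATTERN, ("firstAuthor", "paper-B", "cites", "paper-C", "firstAuthor")))
```

The first call matches because every position generalizes to the pattern; the second fails at position 3, where cites is not inProcOf.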
  11. Problem: To recommend k patterns for the five associations between Alice and Bob (positions 1 to 5):
      • secondAuthor, paper-A, inProcOf, conf-A, reviewer
      • firstAuthor, paper-B, inProcOf, conf-B, chair
      • firstAuthor, paper-B, cites, paper-C, firstAuthor
      • secondAuthor, paper-D, cites, paper-C, firstAuthor
      • secondAuthor, paper-D, extends, article-A, firstAuthor
  12. Step 1: Mining all significant patterns
      Pattern (author, Paper, inProcOf, Conference, role): frequency = 2/5 > threshold
      It matches the first two of the five associations above.
  13. Formulated as frequent itemset mining
      1. transaction = association; item = <position, class> or <position, property>
      2. Mining frequent itemsets
      3. itemset → pattern
      Transaction for the association (secondAuthor, paper-A, inProcOf, conf-A, reviewer):
      <1, secondAuthor> <1, author> <2, ConfPaper> <2, Paper> <3, inProcOf> <4, Conference> <5, reviewer> <5, role>
  14. Formulated as frequent itemset mining (same slide, with the items reduced to):
      <1, author> <2, ConfPaper> <2, Paper> <3, inProcOf> <4, Conference> <5, role>
  15. Formulated as frequent itemset mining (same slide, with the itemset shown as a pattern):
      <1, author> <2, ConfPaper> <2, Paper> <3, inProcOf> <4, Conference> <5, role>
      Pattern: (author, Paper, inProcOf, Conference, role)
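The three steps on slides 13 to 15 can be sketched with a brute-force frequent itemset miner. This is only an illustration under stated assumptions: the `SUPER` hierarchy and the `TYPE` assignments (for example paper-B as ConfPaper and article-A as Article) are guesses that fit the slide example, all helper names are hypothetical, and enumerating every subset is a stand-in for whatever mining algorithm the authors actually use. An itemset is turned into a pattern here only if it has exactly one item per position, which is a simplification.

```python
from collections import Counter
from itertools import combinations

SUPER = {"secondAuthor": "author", "firstAuthor": "author",
         "reviewer": "role", "chair": "role", "ConfPaper": "Paper"}
TYPE = {"paper-A": "ConfPaper", "paper-B": "ConfPaper", "paper-C": "Paper",
        "paper-D": "Paper", "article-A": "Article",
        "conf-A": "Conference", "conf-B": "Conference"}

# The five associations, written in position order 1..5.
ASSOCIATIONS = [
    ("secondAuthor", "paper-A", "inProcOf", "conf-A", "reviewer"),
    ("firstAuthor", "paper-B", "inProcOf", "conf-B", "chair"),
    ("firstAuthor", "paper-B", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "extends", "article-A", "firstAuthor"),
]


def generalize(label):
    """Return the label followed by all its ancestors in SUPER."""
    chain = [label]
    while chain[-1] in SUPER:
        chain.append(SUPER[chain[-1]])
    return chain


def to_transaction(association):
    """Step 1: one item <position, class-or-property> per label and ancestor."""
    items = set()
    for position, name in enumerate(association, start=1):
        label = TYPE.get(name, name)  # entities map to their class
        items.update((position, g) for g in generalize(label))
    return frozenset(items)


def frequent_itemsets(transactions, threshold):
    """Step 2: count every non-empty subset; keep those above the threshold."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            counts.update(frozenset(c) for c in combinations(items, size))
    total = len(transactions)
    return {s: c / total for s, c in counts.items() if c / total > threshold}


def to_pattern(itemset, length=5):
    """Step 3: an itemset with exactly one item per position becomes a pattern."""
    ordered = sorted(itemset)
    if [position for position, _ in ordered] != list(range(1, length + 1)):
        return None
    return tuple(label for _, label in ordered)


def mine_patterns(associations, threshold):
    transactions = [to_transaction(a) for a in associations]
    patterns = {}
    for itemset, frequency in frequent_itemsets(transactions, threshold).items():
        pattern = to_pattern(itemset)
        if pattern is not None:
            patterns[pattern] = frequency
    return patterns


for pattern, frequency in mine_patterns(ASSOCIATIONS, threshold=0.3).items():
    print(pattern, frequency)
```

With the assumed types, the pattern (author, Paper, inProcOf, Conference, role) comes out with frequency 2/5, as on slide 12. Subset enumeration is exponential in transaction size, so a real system would use an Apriori- or FP-growth-style miner instead.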
  16. Step 2: Finding k frequent, informative, and small-overlapping patterns
      • Frequency (as previous)
      • Informativeness
      • Overlap
  17. Step 2: Finding k frequent, informative, and small-overlapping patterns
      • Frequency (as previous)
      • Informativeness
        • informativeness of a class = self-information of its occurrence (more informative = having fewer instances), e.g. ConfPaper > Paper
        • informativeness of a property = entropy of its values (more informative = having more diverse values), e.g. is-author-of > nationality
      • Overlap
      [Figure: pattern (author, Paper, inProcOf, Conference, role)]
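The two informativeness measures on slide 17 correspond to standard information-theoretic quantities. The sketch below assumes base-2 logarithms and hypothetical function names; the slides do not say which base is used or how the per-position scores are combined into the informativeness of a whole pattern, so that step is left out.

```python
import math
from collections import Counter


def class_informativeness(instance_count, total_entities):
    """Self-information of a class: -log2 of the share of entities in it.

    Fewer instances means a smaller probability and a higher score.
    """
    return -math.log2(instance_count / total_entities)


def property_informativeness(values):
    """Shannon entropy (in bits) of the observed values of a property.

    More diverse values mean a higher score.
    """
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Made-up counts: a narrow class scores higher than a broad one.
print(class_informativeness(25, 100))   # narrow class
print(class_informativeness(50, 100))   # broad class

# Made-up value lists: diverse values score higher than repeated ones.
print(property_informativeness(["p1", "p2", "p3", "p4"]))
print(property_informativeness(["US", "US", "US", "US"]))
```

The counts and value lists are invented for illustration; in practice they would come from statistics over the underlying data set.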
  18. Step 2: Finding k frequent, informative, and small-overlapping patterns
      • Frequency (as previous)
      • Informativeness
      • Overlap
        • Ontological overlap: holding subClassOf/subPropertyOf relations
        • Contextual overlap: matched by common associations in the results
      [Figure: ontological overlap between two patterns, one over ConfPaper, Conference, author, inProcOf, role and one over Paper, Paper, firstAuthor, cites, author]
  19. Formulated as multidimensional 0-1 knapsack
      • Find k patterns that maximize frequency × informativeness (goal) and do not share considerably large overlap (constraints)
      • Solved by a greedy algorithm
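The slides only say that the selection is solved by a greedy algorithm, so the following is one plausible greedy heuristic rather than the authors' method. It assumes a pairwise overlap threshold as the constraint and uses the Jaccard similarity of matched associations as a stand-in for contextual overlap; `greedy_top_k`, `contextual_overlap`, and the candidate dictionaries are all hypothetical.

```python
def contextual_overlap(a, b):
    """Jaccard similarity of the sets of associations two patterns match."""
    union = a["matches"] | b["matches"]
    return len(a["matches"] & b["matches"]) / len(union) if union else 0.0


def greedy_top_k(candidates, k, max_overlap, overlap=contextual_overlap):
    """Pick up to k patterns in order of frequency * informativeness.

    A candidate is skipped if its overlap with any already chosen
    pattern exceeds max_overlap.
    """
    ranked = sorted(
        candidates,
        key=lambda c: c["frequency"] * c["informativeness"],
        reverse=True,
    )
    chosen = []
    for candidate in ranked:
        if len(chosen) == k:
            break
        if all(overlap(candidate, other) <= max_overlap for other in chosen):
            chosen.append(candidate)
    return chosen


# Made-up candidates; "matches" holds indices of matched associations.
CANDIDATES = [
    {"name": "P1", "frequency": 0.4, "informativeness": 3.0, "matches": {0, 1}},
    {"name": "P2", "frequency": 0.4, "informativeness": 2.0, "matches": {0, 1}},
    {"name": "P3", "frequency": 0.4, "informativeness": 1.5, "matches": {2, 3}},
]

selected = greedy_top_k(CANDIDATES, k=2, max_overlap=0.5)
print([c["name"] for c in selected])
```

Here P2 is skipped because it matches exactly the same associations as the higher-scoring P1, so the second slot goes to P3. An ontological overlap check could be passed in through the `overlap` parameter in the same way.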
  20. Exploration methods (2)
      • Clustering
      • Facets
        • facet values = classes of entities and properties appearing in associations in the results
        • Problem: To recommend k facet values (solved in a similar way)
      [Figure: association (secondAuthor, paper-A, inProcOf, conf-A, reviewer) with class facet values ConfPaper, Paper, Conference]
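A sketch of how facet values could be collected and used as filters. The entity-to-class sets in `TYPES` are assumptions (superclasses are listed explicitly), and both helper functions are hypothetical; ranking the facet values to pick the top k is not shown.

```python
from collections import Counter

# Assumed classes per entity, including superclasses.
TYPES = {
    "paper-A": {"ConfPaper", "Paper"},
    "paper-B": {"ConfPaper", "Paper"},
    "paper-C": {"Paper"},
    "paper-D": {"Paper"},
    "article-A": {"Article"},
    "conf-A": {"Conference"},
    "conf-B": {"Conference"},
}

# Associations in position order: properties at even indices, entities at odd.
ASSOCIATIONS = [
    ("secondAuthor", "paper-A", "inProcOf", "conf-A", "reviewer"),
    ("firstAuthor", "paper-B", "inProcOf", "conf-B", "chair"),
    ("firstAuthor", "paper-B", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "extends", "article-A", "firstAuthor"),
]


def classes_in(association):
    """All classes of the entities appearing in an association."""
    found = set()
    for entity in association[1::2]:
        found |= TYPES[entity]
    return found


def facet_values(associations):
    """Count, per class and per property, how many associations contain it."""
    class_counts, property_counts = Counter(), Counter()
    for association in associations:
        class_counts.update(classes_in(association))
        property_counts.update(set(association[0::2]))
    return class_counts, property_counts


def filter_by_facet(associations, value):
    """Keep associations containing the given class or property."""
    return [a for a in associations
            if value in a[0::2] or value in classes_in(a)]


class_counts, property_counts = facet_values(ASSOCIATIONS)
print(class_counts.most_common())
print(property_counts.most_common())
print(filter_by_facet(ASSOCIATIONS, "Conference"))
```

Selecting the facet value Conference narrows the five associations to the two that pass through a conference, which is the filtering role the conclusion slide assigns to facets.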
  21. Demo based on DBpedia: ws.nju.edu.cn/explass
  22. Demo based on DBpedia: ws.nju.edu.cn/explass
      [Screenshot callouts: facet values (classes); facet values (properties)]
  23. Demo based on DBpedia: ws.nju.edu.cn/explass
      [Screenshot callouts: a collapsed pattern; an expanded pattern; associations not matching any pattern above]
  24. User study
      • 26 association exploration tasks over DBpedia
        • Derived from QALD queries and “People also search for”
        • Example: Suppose you will write an article about the associations between Abraham Lincoln and George Washington. Use the given system to explore their associations and identify several themes to discuss in the article.
      • 20 subjects
      • 3 approaches
        • Explass: clustering + facets
        • RelClus: clustering into a hierarchy of patterns
        • RF: facets only (similar to RelFinder)
      [Slide callout: from QALD]
  25. Post-task questionnaire results
  26. Usability scores (SUS)
  27. User behavior
  28. Conclusion
      1. Provide patterns wisely.
         • To avoid deep, complicated hierarchy
         • To avoid very general, almost meaningless concepts
      2. Combine patterns and facets wisely.
         • Patterns as meaningful summaries of results
         • Facets as filters for refining the search
      [Figure labels: Filters; Summaries of results]
  29. Future work
      • Performance optimization
        • (online) path finding
        • (online) frequent itemset mining
      • Exploring associations between several entities, or a data set
  30. Questions?
