Towards Exploratory Relationship Search: A Clustering-based Approach

Presented at JIST2013, Seoul, Korea.

Published in: Technology, Health & Medicine

  1. Towards Exploratory Relationship Search: A Clustering-Based Approach. Yanan Zhang, Gong Cheng, Yuzhong Qu (Nanjing University, China)
  2. Outline: Motivation, Challenges, Approach, Evaluation, Conclusion
  3. Outline: Motivation, Challenges, Approach, Evaluation, Conclusion
  4. Relationship search
  5. Searching graph-structured data: relationship = path
  6. Too many results!
  7. Exploratory relationship search
     • Exploring a set of relationships interactively and continuously
       – Faceted categories (RelFinder)
       – Clustering (our solution: RelClus)
  8. Outline: Motivation, Challenges, Approach, Evaluation, Conclusion
  9. Challenges
     • How to meaningfully label a cluster?
     • How to make sense of a cluster hierarchy?
     • How to measure similarity between clusters?
     Agglomerative hierarchical clustering
     • Initially: each relationship forms a singleton cluster
     • Then: progressively merge the most similar pair of clusters
  10. Outline: Motivation, Challenges, Approach, Evaluation, Conclusion
  11. Relationship pattern
      • High-level abstraction of relationships
        – Vertices: entities or classes
        – Edges: properties (undirected)
  12. How to meaningfully label a cluster?
      • Using a least common relationship pattern
        – Vertices: least common classes (or entities)
        – Edges: least common properties
      • Example (figure labels: Person, P1, R4, R5): label({R4, R5}) = P1
  13. How to make sense of a cluster hierarchy?
      • subPatternOf (⊑)
        – Vertices: corresponding vertices satisfy subClassOf (or instance-type)
        – Edges: corresponding edges satisfy subPropertyOf
      • Example (figure labels: P3, P2, P1): P2 ⊑ P3, P1 ⊑ P3
  14. How to measure similarity between clusters?
      • sim(Ci, Cj) = how many commonalities they share, which are exactly captured by label(Ci ∪ Cj)
        – Measure: -log(probability of seeing label(Ci ∪ Cj)), i.e. the information content associated with label(Ci ∪ Cj)
        – Probability estimation: based on the data set
      • Figure labels: P3, P2, P1
  15. A running example (figure showing patterns P1, P2, P3 and relationships R1 to R5)
  16. Outline: Motivation, Challenges, Approach, Evaluation, Conclusion
  17. Design
      • Data set: DBpedia
      • Systems
        – RList: just a list of all results
        – RFacet: with faceted categories (similar to RelFinder)
        – RClus: with hierarchical clustering (our solution)
      • Participants and tasks
        – 2 participants provide search tasks
          • 3 (well-defined) lookup tasks
          • 3 (open) exploratory search tasks
        – 15 participants carry out tasks
      • Metrics
        – Questionnaire
        – SUS
        – User feedback
  18. Questionnaire results
  19. Some inspiring user feedback
      • Dislike deep hierarchies
      • Expect more concise visualization
      • Need more cognitive support
  20. Performance testing
  21. Outline: Motivation, Challenges, Approach, Evaluation, Conclusion
  22. Conclusion
      • Goal: clustering-based exploratory relationship search
      • Approach: pattern-centric
      • Future work
        – Combining faceted categories and hierarchical clustering
        – Going beyond them
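
The ideas on slides 9 and 12 to 14 (least common pattern as a cluster label, subPatternOf, information-content similarity, agglomerative merging) can be illustrated with a small sketch. This is not the authors' implementation. It assumes several simplifications that the slides do not state: every relationship is a path of the same length, written as a tuple alternating vertices and properties with a fixed orientation; the class and property hierarchies are single-parent trees; and the probability of a pattern is estimated as the fraction of relationships in the result set that match it. All entity, class, and property names are hypothetical toy data.

```python
import math
from itertools import combinations

# Hypothetical toy hierarchy (child -> parent) covering entities, classes,
# and properties. Assumption: each term has at most one parent.
PARENT = {
    "MIT": "University", "Stanford": "University", "Oxford": "University",
    "Louvre": "Museum",
    "University": "Organization", "Museum": "Organization",
    "worksAt": "affiliatedWith", "studiedAt": "affiliatedWith",
    "affiliatedWith": "relatedTo", "visited": "relatedTo",
}

def ancestors(term):
    """Return [term, parent, grandparent, ...] up to the root."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def least_common(a, b):
    """Most specific term that generalizes both a and b."""
    above_a = set(ancestors(a))
    return next(t for t in ancestors(b) if t in above_a)

def label(p, q):
    """Least common pattern of two equal-length paths or patterns."""
    return tuple(least_common(x, y) for x, y in zip(p, q))

def matches(specific, general):
    """True if `specific` (a path or pattern) is a sub-pattern of `general`."""
    return all(g in ancestors(s) for s, g in zip(specific, general))

def info_content(pattern, data):
    """-log of the fraction of relationships in `data` matching `pattern`."""
    hits = sum(matches(r, pattern) for r in data)
    return -math.log(hits / len(data))

def agglomerate(data):
    """Repeatedly merge the most similar pair of clusters; return the merges."""
    clusters = [([r], r) for r in data]  # (members, label); singletons first
    history = []
    while len(clusters) > 1:
        # Similarity of a pair = information content of the merged label.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: info_content(
                label(clusters[ij[0]][1], clusters[ij[1]][1]), data),
        )
        merged = (clusters[i][0] + clusters[j][0],
                  label(clusters[i][1], clusters[j][1]))
        history.append(merged)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return history

# Four hypothetical relationships (paths) between the entities Alice and Bob.
R1 = ("Alice", "worksAt", "MIT", "worksAt", "Bob")
R2 = ("Alice", "worksAt", "Stanford", "worksAt", "Bob")
R3 = ("Alice", "studiedAt", "Oxford", "studiedAt", "Bob")
R4 = ("Alice", "visited", "Louvre", "visited", "Bob")
DATA = [R1, R2, R3, R4]

history = agglomerate(DATA)
for members, pattern in history:
    print(len(members), "relationships labelled", pattern)
```

In this toy run the most specific shared pattern is merged first, and each later label generalizes the earlier ones, so `matches` applied to two labels acts as the subPatternOf check between a cluster and its parent. A real system would also need to handle paths of different lengths, multiple inheritance, and undirected edges, which this sketch leaves out.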
