Towards Exploratory Relationship Search: A Clustering-based Approach

Presented at JIST2013, Seoul, Korea.

Published in: Technology, Health & Medicine

  1. 1. Towards Exploratory Relationship Search: A Clustering-Based Approach • Yanan Zhang, Gong Cheng, Yuzhong Qu • Nanjing University, China
  2. 2. Outline • Motivation • Challenges • Approach • Evaluation • Conclusion
  3. 3. Outline • Motivation • Challenges • Approach • Evaluation • Conclusion
  4. 4. Relationship search
  5. 5. Searching graph-structured data • relationship = path
  6. 6. Too many results!
  7. 7. Exploratory relationship search • Exploring a set of relationships interactively and continuously – faceted categories (RelFinder) – clustering (our solution: RelClus)
  8. 8. Outline • Motivation • Challenges • Approach • Evaluation • Conclusion
  9. 9. Challenges • How to meaningfully label a cluster? • How to make sense of a cluster hierarchy? • How to measure similarity between clusters? • Agglomerative hierarchical clustering – Initially: relationships → singleton clusters – Then: progressively merge the most similar pair
  10. 10. Outline • Motivation • Challenges • Approach • Evaluation • Conclusion
  11. 11. Relationship pattern • High-level abstraction of relationships – Vertices: entities or classes – Edges: properties (undirected)
  12. 12. How to meaningfully label a cluster? • Using a least common relationship pattern – Vertices: least common classes (or entities) – Edges: least common properties • Example (figure labels: Person, P1, R4, R5): label({R4, R5}) = P1
  13. 13. How to make sense of a cluster hierarchy? • subPatternOf (⊑) – Vertices: s.t. subClassOf (or instance-type) – Edges: s.t. subPropertyOf • Example (figure labels: P1, P2, P3): P2 ⊑ P3, P1 ⊑ P3
  14. 14. How to measure similarity between clusters? • sim(Ci, Cj) = how many commonalities Ci and Cj share, which are exactly those captured by label(Ci∪Cj) – Measure: −log(probability of seeing label(Ci∪Cj)), i.e. the information content associated with label(Ci∪Cj) – Probability estimation: based on the data set • (figure labels: P1, P2, P3)
  15. 15. A running example • (figure: patterns P1, P2, P3 over relationships R1, R2, R3, R4, R5)
  16. 16. Outline • Motivation • Challenges • Approach • Evaluation • Conclusion
  17. 17. Design • Data set: DBpedia • Systems – RList: just a list of all results – RFacet: with faceted categories (similar to RelFinder) – RClus: with hierarchical clustering (our solution) • Participants and tasks – 2 participants provide search tasks: 3 (well-defined) lookup tasks and 3 (open) exploratory search tasks – 15 participants carry out tasks • Metrics – Questionnaire – SUS – User feedback
  18. 18. Questionnaire results
  19. 19. Some inspiring user feedback • Dislike deep hierarchies • Expect more concise visualization • Need more cognitive support
  20. 20. Performance testing
  21. 21. Outline • Motivation • Challenges • Approach • Evaluation • Conclusion
  22. 22. Conclusion • Goal: clustering-based exploratory relationship search • Approach: pattern-centric • Future work – Combining faceted categories and hierarchical clustering – Going beyond them
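
The similarity measure from slide 14 can be written out explicitly. The first line restates the slide. The second line is only one plausible frequency-based estimator and is an assumption, not a formula given in the slides: D is a hypothetical symbol for the set of relationships in the data set, and r ⊑ P is read here as "relationship r matches pattern P".

```latex
\begin{align}
\mathrm{sim}(C_i, C_j) &= -\log \Pr\bigl(\mathrm{label}(C_i \cup C_j)\bigr) \\
\Pr(P) &\approx \frac{\bigl|\{\, r \in D : r \sqsubseteq P \,\}\bigr|}{|D|}
\qquad \text{(assumed estimator)}
\end{align}
```

Under this reading, a more specific label matches fewer relationships, has lower probability, and therefore yields a higher similarity for the pair of clusters it describes.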

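
The approach on slides 9 to 14 (singleton clusters, merge the most similar pair, label each cluster with a least common pattern, score by information content) can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions, not the RelClus implementation: the class hierarchy `PARENT` is a made-up toy tree, a pattern is reduced to a tuple of vertex classes of equal length (edges and properties are ignored), and the probability is estimated over the clustered relationships themselves rather than over DBpedia. All function names are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical toy class hierarchy (child -> parent); "Thing" is the root.
PARENT = {
    "Actor": "Person", "Director": "Person", "Person": "Thing",
    "Film": "Work", "Book": "Work", "Work": "Thing",
}

def ancestors(cls):
    """Return cls followed by its superclasses, most specific first."""
    chain = [cls]
    while cls in PARENT:
        cls = PARENT[cls]
        chain.append(cls)
    return chain

def least_common_class(a, b):
    """Most specific class in the toy tree that subsumes both a and b."""
    seen = set(ancestors(a))
    for c in ancestors(b):
        if c in seen:
            return c
    raise ValueError(f"no common superclass for {a!r} and {b!r}")

def least_common_pattern(p, q):
    """Position-wise least common class of two equal-length vertex tuples."""
    if len(p) != len(q):
        raise ValueError("this sketch only compares paths of equal length")
    return tuple(least_common_class(a, b) for a, b in zip(p, q))

def is_sub_pattern(p, general):
    """True if every vertex class of p is subsumed by the one in general."""
    return len(p) == len(general) and all(
        g in ancestors(c) for c, g in zip(p, general)
    )

def information_content(pattern, data):
    """-log of the fraction of data matching pattern (assumed estimator).

    Raises ValueError if nothing in data matches the pattern.
    """
    matching = sum(is_sub_pattern(r, pattern) for r in data)
    return -math.log(matching / len(data))

def cluster(relationships):
    """Agglomerative clustering; returns the merges as (label, members)."""
    clusters = [(r, [r]) for r in relationships]  # singleton clusters
    merges = []
    while len(clusters) > 1:
        def score(pair):
            label = least_common_pattern(clusters[pair[0]][0],
                                         clusters[pair[1]][0])
            return information_content(label, relationships)
        # Merge the pair whose joint label carries the most information.
        i, j = max(combinations(range(len(clusters)), 2), key=score)
        merged = (least_common_pattern(clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        merges.append(merged)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

Calling `cluster` on a list of vertex-class tuples returns one `(label, members)` entry per merge, so the last entry is the root of the hierarchy. Because the toy hierarchy is a tree, the label of a merged cluster can be computed from the two child labels alone; a real class hierarchy with multiple inheritance would need a more careful definition of "least common".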