
Edgard Marx, Amrapali Zaveri, Diego Moussallem and Sandro Rautenberg | DBtrends: Exploring query logs for ranking RDF data

http://2016.semantics.cc/edgard-marx


  1. 1. DBtrends: Exploring Query Logs for Ranking RDF Data • AKSW • Edgard Marx, Amrapali Zaveri, Diego Moussallem, Sandro Rautenberg • 12th International Conference on Semantic Systems
  2. 2. Outline • Motivation • Background • Ranking using Query Logs • Evaluation • Results • Discussion • Conclusion • Future Works
  3. 3. Motivation • Personal Data • Enterprise Data • Open Data
  4. 4. Motivation: We Have Data • "The size of LOD by 2014 was 31 billion triples" (http://linkeddatacatalog.dws.informatik.uni-mannheim.de/state/) • "Facebook users generates 2.7 billion Like actions per day and 300 million new photos are uploaded daily" (Josh Constine, 2012, techcrunch.com) • "Google Processing 20,000 Terabytes A Day, And Growing" (Erick Schonfeld, 2008, techcrunch.com)
  5. 5. Motivation: We Have Data • Not all of the data is relevant
  6. 6. Motivation: We Have Data
  7. 7. Motivation: We Have Data
  8. 8. Motivation: Ranking
  9. 9. Motivation: Scenarios • Search • Machine Learning • Link Discovery
  10. 10. Background: Web of Data • Resource Description Framework (RDF) • Concrete • E=MC² • Abstract
  11. 11. Background: Web of Data • Things • E=MC² • Uses of RDF data: Semantic Search, Entity Search, Question Answering, Named Entity Recognition, Link Discovery, Machine Learning
  12. 12. Background: Ranking Functions (Types) • "Give me all persons" • Retrieve • Processing & Ranking • ...
  13. 13. Background: Ranking Functions (Types) • "Give me all persons" • Retrieve • Persons • Sort • Processing & Ranking • Answer • ...
  14. 14. Background: Ranking Functions (Types) • "Give me all persons" • Retrieve • Persons • Sort • Processing & Ranking • Answer • ... • Query dependent • Query independent
  15. 15. Background: Ranking • 1999: Page et al.
  16. 16. Background: Ranking • 1999: Page et al. • 2001: Lee et al. (Web of Data)
  17. 17. Background: Ranking RDF Data • 1999: Page et al. • 2001: Lee et al. (Web of Data) • 2011: Cheng et al. (Property)
  18. 18. Background: Ranking RDF Data • 1999: Page et al. • 2001: Lee et al. (Web of Data) • 2011: Cheng et al. (Property) • 2014: Thalhammer et al.
  19. 19. Background: Benchmarks • DBtrends Benchmark (Marx, 2016): 60 users from different countries (USA, India); 9 entity ranking functions applied to the DBpedia Knowledge Base; users sort relevant classes, properties and entities extracted from the top twenty entities belonging to the top four classes; tasks were executed using Amazon Mechanical Turk • Previous Benchmarks: not publicly available; evaluate performance of 30 profiles
  20. 20. Ranking using Query Logs: Why use query logs?
  21. 21. Ranking using Query Logs: Why use query logs?
  22. 22. Ranking using Query Logs: Why use query logs? • [Figure: Query Logs, search...]
  23. 23. Ranking using Query Logs: Why use query logs?
  24. 24. Ranking using Query Logs: Why use query logs? • Query logs provide relevant information about users' preferences • They refer to real-world entities • [Figure: E=MC²]
  25. 25. Ranking using Query Logs: Questions • How to map real-world entities to the Web of Data? • How to measure their relevance? • Where to find a good and trustworthy query log?
  26. 26. Ranking using Query Logs: How to map real-world resources? • Rocha et al. (2004) • Ding et al. (2005) • Hogan et al. (2006) • Alsarem et al. (2015) • [Figure: Query Logs, search..., Web of Data]
  27. 27. Ranking using Query Logs: How to measure a resource's relevance? • Users search more often for things that are relevant • Query logs register how often something is searched • Query logs can therefore be used to better estimate a resource's relevance by looking at how often it is searched
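The idea on this slide, that search frequency can stand in for relevance, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the case-insensitive substring matching, and the sample log are all assumptions made for illustration.

```python
from collections import Counter


def rank_by_search_frequency(query_log, labels):
    """Rank labels by how many logged queries mention them.

    Matching is a case-insensitive substring test, which is an
    assumption of this sketch and not something the slides specify.
    """
    counts = Counter()
    for query in query_log:
        lowered = query.lower()
        for label in labels:
            if label.lower() in lowered:
                counts[label] += 1
    # Most frequently searched first; ties are broken alphabetically.
    return sorted(labels, key=lambda label: (-counts[label], label))


# Hypothetical query log and entity labels, invented for this example.
sample_log = [
    "new york weather",
    "flights to new york",
    "leipzig zoo",
    "New York pizza",
]
sample_labels = ["Leipzig", "New York", "Bonn"]
ranking = rank_by_search_frequency(sample_log, sample_labels)
```

A real query log would need entity linking rather than substring matching, which is the mapping question raised on the previous slide.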
  28. 28. Ranking using Query Logs: Where to find a good and trustworthy query log?
  29. 29. Ranking using Query Logs: Where to find a good and trustworthy query log?
  30. 30. Ranking using Query Logs: Where to find a good and trustworthy query log? • Public API • Filters: Geographic (Country, State, City); Period (Day, Week, Month, Year)
  31. 31. Ranking using Query Logs: DBtrends Ranking Function
  32. 32. Ranking using Query Logs: DBtrends Ranking Function • First, the labels of the entities are extracted and used to acquire the search history in query logs, e.g. Google Trends (steps 1-2) • [Figure: dbr:New_York_City with label "New York", a Trends value of 36, and the classes dbo:City and dbo:Place]
  33. 33. Ranking using Query Logs: DBtrends Ranking Function • First, the labels of the entities are extracted and used to acquire the search history in query logs, e.g. Google Trends (steps 1-2) • Thereafter, the entity ranks are used as a base to propagate the rank to the classes (steps 3-4) • [Figure: dbr:New_York_City with label "New York", a Trends value of 36, and the classes dbo:City and dbo:Place; the values 18 and 9 also appear]
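The two steps described on this slide can be sketched roughly as follows. This is an illustration under stated assumptions, not the DBtrends library: the function names are hypothetical, the trend lookup is a stand-in dictionary instead of a real query-log service, and summing entity scores into each class is an assumed aggregation, since the slides do not give the propagation formula.

```python
from collections import defaultdict


def entity_scores(labels_by_entity, trend_score):
    """Step 1: look up a search-popularity score for each entity label."""
    return {
        entity: trend_score(label)
        for entity, label in labels_by_entity.items()
    }


def propagate_to_classes(scores, classes_by_entity):
    """Step 2: push entity scores up to the classes of each entity.

    A plain sum is used here; that choice is an assumption of this sketch.
    """
    class_scores = defaultdict(float)
    for entity, score in scores.items():
        for cls in classes_by_entity.get(entity, ()):
            class_scores[cls] += score
    return dict(class_scores)


# Hypothetical inputs. The value 36 for "New York" is taken from the slide;
# the Leipzig entry and its score are invented for illustration.
labels = {"dbr:New_York_City": "New York", "dbr:Leipzig": "Leipzig"}
types = {
    "dbr:New_York_City": ["dbo:City", "dbo:Place"],
    "dbr:Leipzig": ["dbo:City", "dbo:Place"],
}
fake_trends = {"New York": 36, "Leipzig": 4}

scores = entity_scores(labels, lambda label: fake_trends.get(label, 0))
class_scores = propagate_to_classes(scores, types)
```

Passing the lookup in as a function keeps the sketch independent of any particular query-log source.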
  34. 34. Evaluation: Entity Ranking Functions • DBtrends • MIXED-RANK • DB-IN • DB-OUT • DB-RANK • PAGE-IN • PAGE-OUT • PAGE-RANK • E-PAGE-IN • SEO-PA • SHARED-LINKS
  35. 35. Evaluation: Property/Class Ranking Functions • Property: Relin, RandomRank, Instances, Instances • Class: Instances, Instances
  36. 36. Evaluation: Results (Entity) • PAGE-RANK • E-PAGE-IN • SHARED-LINKS • SEO-PA • DB-OUT • PAGE-IN • PAGE-OUT • DB-IN • DB-RANK
  37. 37. Evaluation: Results (Entity) • MIXED-RANK • PAGE-RANK • E-PAGE-IN • SHARED-LINKS • SEO-PA • DB-OUT • PAGE-IN • DBtrends • PAGE-OUT • DB-IN • DB-RANK
  38. 38. Evaluation: Discussion (Entity) • Functions that take external information into consideration provide more insight into a resource's relevance • RDF links reflect natural connections rather than a resource's relevance • MIXED-RANK • PAGE-RANK • E-PAGE-IN • SHARED-LINKS • SEO-PA • DB-OUT • PAGE-IN • DBtrends • PAGE-OUT • DB-IN • DB-RANK
  39. 39. Evaluation: Discussion (Entity) • There is no pattern in the impact distribution of query logs • Queries do not necessarily help to improve a ranking function • Internal agreement ~63%
  40. 40. Evaluation: Results • Property: RandomRank, Relin, Instances, Instances • Class: Instances, Instances
  41. 41. Evaluation: Discussion (Property) • RandomRank • Relin • Instances • Instances • Internal agreement ~37% • Ranks are very sparse • Not conclusive
  42. 42. Evaluation: Discussion (Class) • Instances • Instances • Internal agreement ~67%
  43. 43. Evaluation: Discussion (Class) • A simple sort can be very effective • Instances • Instances • dbo:PopulatedPlace, dbo:Settlement, dbo:Place, owl:Thing • dbo:PopulatedPlace, dbo:Settlement, dbo:Place, owl:Thing
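The "simple sort" referred to here, ranking classes by their number of instances, might look like the sketch below. The function name and the sample data are hypothetical; the input is assumed to be a list of (entity, class) pairs such as those obtained from rdf:type statements.

```python
from collections import Counter


def rank_classes_by_instances(type_pairs):
    """Rank classes by how many (entity, class) pairs mention them."""
    counts = Counter(cls for _entity, cls in type_pairs)
    # Largest instance count first; ties are broken alphabetically.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [cls for cls, _count in ordered]


# Hypothetical type assertions, invented for this example.
sample_types = [
    ("dbr:New_York_City", "dbo:City"),
    ("dbr:New_York_City", "dbo:Place"),
    ("dbr:Leipzig", "dbo:City"),
    ("dbr:Leipzig", "dbo:Place"),
    ("dbr:Alps", "dbo:Place"),
]
class_ranking = rank_classes_by_instances(sample_types)
```

Because general classes accumulate the instances of their subclasses, a count-based sort tends to place them first, which seems to be what the hierarchy on the slide illustrates.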
  44. 44. Evaluation: Discussion (Caveats) • Confidence in executing the tasks: Indians 90%, Americans 60% • Ranks produced by Indians were more sparse • Abstract entities appear before entities
  45. 45. Evaluation: Summary • Entity ranking functions produce better results when considering external information • A simple sort by the number of instances can be very effective for ranking classes • Query logs can, but do not necessarily, improve entity ranking functions
  46. 46. Evaluation: Benchmark (dbtrends.aksw.org) • Benchmark • Ranking functions • Library (Java)
  47. 47. Future Works • Extend the evaluation to other countries and ranking functions • Evaluate the impact of context-aware ranking functions • Use other similarity ranking functions
  48. 48. Acknowledgements • Contact: http://emarx.org
