Discovery hub : an exploratory search engine on the top of DBpedia

Discovery Hub is an exploratory search engine (http://en.wikipedia.org/wiki/Exploratory_search) that helps you discover things you might like or be interested in. It widens your cultural and knowledge horizons by revealing and explaining unexpected information.

Want a film recommendation related to writers you like? Want to discover bands at the crossroads of an electro label and a rock label you like? Interested in more complex, composite recommendations based on your deepest interests, such as a combination of a writer, a film and a band? Or maybe something simpler? If you have a thirst for discovery and knowledge, Discovery Hub has answers for you.

Discovery Hub is based on leading-edge semantic web technologies. It lets you discover new and unknown items of interest starting from what you like. With Discovery Hub you interactively explore DBpedia, a huge knowledge graph derived from Wikipedia data. It is composed of approximately 4 million entities linked by more than 270 million connections. DBpedia covers many topics such as arts, technology, sciences and sport.

Discovery Hub lets you perform queries in an innovative way and helps you navigate rich results. As a hub, it redirects you to other platforms (YouTube, Deezer and more) so you can benefit from your discoveries. The results are explained in depth through 3 explanatory features. It supports composite explorations, i.e. starting from several items of interest, and offers advanced exploration modes such as serendipitous, multi-lingual, and fine-grained ones.

Discovery Hub V2 is more social! You can like a topic and share it on Twitter but, more importantly, you can now share the searches and collections you have made with your Discovery Hub followers. And of course you can also follow your friends and other interesting people if you find them!

Transcript

  • 1. COPYRIGHT © 2011 ALCATEL-LUCENT. ALL RIGHTS RESERVED. Discovery hub - a discovery engine on the top of DBpedia Nicolas Marie, Fabien Gandon, Damien Legrand, Myriam Ribière
  • 2. Context; Research question; Research (proposition, implementation, operational prototypes, user evaluations); Publication
  • 3. Context; Research question; Research (proposition, implementation, operational prototypes, user evaluations); Publication
  • 4. Search is only a partially solved problem [White, 2006]
  • 5. Gary Marchionini, 2006
  • 6. Lookup today: « Claude Monet » + birthday. Exploration/discovery today: « Claude Monet » + impressionism
  • 7. Discovery Hub
  • 8. Discovery Hub
  • 9. Search is only a partially solved problem [White, 2006]. The degree of structure of the web content is the determining factor for the type of functionality that search engines can provide [Bizer et al., 2012]
  • 10. Semantic Web: “The Semantic Web is a mesh of information linked up in such a way as to be easily processable by machines, on a global scale. You can think of it as being an efficient way of representing data on the World Wide Web, or as a globally linked database.” Marianna Sigala, Luisa Mich, Jamie Murphy
  • 11. Tim Berners-Lee, WWW1994 [Stankovic, 2012]
  • 12. http://en.wikipedia.org/wiki/Claude_Monet: 3 000+ words, 13 p. http://dbpedia.org/resource/Claude_Monet: 191 triples
  • 13. Accessible through: browsers, dumps, SPARQL endpoint. Example query: SELECT * WHERE { <http://dbpedia.org/resource/Claude_Monet> <http://dbpedia.org/property/influencedBy> ?x }
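The query on slide 13 can be sent to a SPARQL endpoint over HTTP. The sketch below only builds the request URL and makes no network call. It assumes the public DBpedia endpoint at https://dbpedia.org/sparql (the slides do not name one) and the `query` parameter defined by the SPARQL protocol. The helper names `build_request_url` and `extract_query` are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: public DBpedia endpoint; the deck does not say which one was used.
ENDPOINT = "https://dbpedia.org/sparql"

# The query shown on slide 13: who influenced Claude Monet?
QUERY = """SELECT * WHERE {
  <http://dbpedia.org/resource/Claude_Monet>
  <http://dbpedia.org/property/influencedBy> ?x
}"""


def build_request_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


def extract_query(url: str) -> str:
    """Decode the SPARQL query back out of a request URL (round-trip check)."""
    return parse_qs(urlparse(url).query)["query"][0]


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

Fetching `url` with any HTTP client and an `Accept` header for SPARQL results would then return the bindings for `?x`.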
  • 14. DBpedia (open): 3.77 million things, 270 million facts. Google Knowledge Graph (closed): 500 million things, 3.5 billion facts. Linked Open Data cloud (open): 31+ billion facts
  • 15. Google Knowledge Graph, 2012
  • 16. Since 1995; since 2007; 2001; 2007
  • 17. Linked-data based exploratory search systems: user interest, start point; interactive result space; results choice; ranking; sorting/categorization; explanation. dbpedia: Claude Monet
  • 18. State of the art: Seevl, 2010
  • 19. State of the art: Yovisto, 2010
  • 20. State of the art: LED, 2010
  • 21. State of the art: MORE, 2012
  • 22. State of the art: Aemoo, 2011
  • 23. State of the art: Kaminskas et al., 2011
  • 24. State of the art: Google Knowledge Panel, 2012
  • 25. Context; Research question; Research (proposition, implementation, operational prototypes, user evaluations); Publication
  • 26. Challenge: enable on-the-fly linked data processing for exploratory search. 3 major benefits: results freshness; composite exploration enablement; fine-grained querying capabilities
  • 27. Freshness
  • 28. 3.77 million resources. Possible combinations of 2 resources: 14,212,900,000,000. Possible combinations of 3 resources: 53,582,633,000,000,000,000. Composite interest exploration: knowing my interest for X and Y, what can I discover/learn that is related to all these resources?
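The counts on slide 28 appear to correspond to ordered tuples of resources, i.e. N^2 and N^3 for N = 3.77 million. A quick check, as a sketch: the variable names are illustrative, and the unordered count is shown only for comparison and does not appear on the slide.

```python
from math import comb

# Number of DBpedia resources quoted on the slide.
N = 3_770_000

# Ordered tuples (a resource may repeat): N^2 and N^3.
ordered_pairs = N ** 2
ordered_triples = N ** 3

# For comparison only (not on the slide): unordered pairs of distinct resources.
unordered_pairs = comb(N, 2)

print(f"{ordered_pairs:,}")    # 2-resource combinations as quoted
print(f"{ordered_triples:,}")  # 3-resource combinations as quoted
print(f"{unordered_pairs:,}")
```

Either way the space is far too large to precompute results for every combination, which is the motivation given for on-the-fly processing.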
  • 29. Fine-grained querying capabilities: Artists_from_Paris; French_painters (--); Impressionist_painters (++); painted (++); influenced by (++)
  • 30. Fine-grained querying: 1960s_science_fiction_films; American_epic_films; Films_set_on_the_Moon; Artificial_intelligence_in_fiction; Space_adventure_films, …; directed by
  • 31. Context; Research question; Research (proposition, implementation, operational prototypes, user evaluations); Publication
  • 32. Refer to publications for the complete algorithm
  • 33. Spreading activation basis, monocentric: Claude_Monet, …; Iteration 0, Iteration 1, Iteration 2
  • 34. Spreading activation basis, polycentric: Claude_Monet, Musée d’Orsay, Musée de l’Orangerie, Vincent Van Gogh, …
  • 35. Nodes: Wheeler_School, Art Institute of Chicago, Gustave_Courbet, Cadmium_sulfide, Farmington_Mountain. Types: DBO:School, DBO:Museum, DBO:Artist, DBO:ChemicalSubstance, DBO:Mountain. Categories: cat:Impressionist_painters, cat:Alumni_of_Beaux_Arts. Weights shown: 2, 0, 0, 0, 3, +2. Propagation domain: artist, book, film, museum, river, television show, university, writer, …
  • 36. How to be fast? How to execute it fast on a very large graph?
  • 37. Very large graph. Locate the processing
  • 38. Discovery Hub
  • 39. sparql endpoint = http://xxx/sparql
    seed(s) = xxx_Beatles
    compute the propagation domain (w(i,o))
    (find a path between the seeds)
    import path nodes & their neighbors
    for (i = 1; i <= maxPulse; i++) {
      pulse
      if (sampleSize <= maxSampleSize) {
        extend the sample
      }
    }
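The pulse loop on slide 39 can be illustrated with a generic spreading activation sketch. This is not the authors' algorithm (slide 32 refers to the publications for that). It assumes a small in-memory toy graph instead of incremental imports from a SPARQL endpoint, an equal split of each node's activation among its neighbors, and a 0/1 node weight standing in for the propagation domain w(i,o). All names and data below are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy graph, not real DBpedia data.
EDGES = [
    ("Claude_Monet", "Impressionism"),
    ("Claude_Monet", "Musee_d_Orsay"),
    ("Impressionism", "Pierre-Auguste_Renoir"),
    ("Musee_d_Orsay", "Vincent_van_Gogh"),
    ("Impressionism", "Cadmium_sulfide"),
]

GRAPH = defaultdict(list)
for a, b in EDGES:
    GRAPH[a].append(b)
    GRAPH[b].append(a)

# Assumption: nodes outside the propagation domain get weight 0.
OUT_OF_DOMAIN = {"Cadmium_sulfide"}


def weight(node):
    """Stand-in for w(i, o): 1.0 inside the propagation domain, else 0.0."""
    return 0.0 if node in OUT_OF_DOMAIN else 1.0


def spread(graph, seeds, weight, max_pulse):
    """Run max_pulse pulses; each node sends its activation to its neighbors."""
    activation = {seed: 1.0 for seed in seeds}
    for _ in range(max_pulse):
        incoming = defaultdict(float)
        for node, value in activation.items():
            neighbors = graph[node]
            if not neighbors:
                continue
            share = value / len(neighbors)
            for neighbor in neighbors:
                incoming[neighbor] += share * weight(neighbor)
        for node, value in incoming.items():
            activation[node] = activation.get(node, 0.0) + value
    return activation


def top_results(activation, seeds, k):
    """Rank activated non-seed nodes, highest activation first."""
    candidates = [(n, v) for n, v in activation.items() if n not in seeds and v > 0]
    return sorted(candidates, key=lambda pair: (-pair[1], pair[0]))[:k]


SEEDS = {"Claude_Monet"}
activation = spread(GRAPH, SEEDS, weight, max_pulse=2)
names = [name for name, _ in top_results(activation, SEEDS, k=10)]
print(names)
```

Passing several seeds gives a rough analogue of the polycentric case. The sketch omits the path search between seeds and the sample extension bounded by maxSampleSize.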
  • 40. Path query using Kgram for polycentric spreading activation (SA): select distinct ?x ?y where { service <sparqlEndpoint> { select * where { ?a (<…wikiPageWikiLink> | ^<…wikiPageWikiLink>){0,X} :: $path ?b filter (?a = <resource1> && ?b = <resource2>) } } graph $path { ?x ?p ?y } filter (?x != <resource1> && ?x != <resource2>) }
  • 41. Context; Research question; Research (proposition, implementation, operational prototypes, user evaluations); Publication
  • 42. Analysis method: analysis performed on a set of 100,000 queries
  • 43. [Figure: Similarity of top 100 results (Kendall-Tau) from one loading limit to another, parameter maxSampleSize. Axes: triples loading limit; KT; ms]
  • 44. [Figure: Similarity of top 100 results (shared results, KT) from one iteration to another, parameter maxPulse. Axes: iterations; Kendall-Tau; shared results]
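Slides 43 and 44 compare successive top-100 lists by shared results and Kendall-Tau. A minimal sketch of both measures follows. It assumes rankings without duplicates and computes Kendall-Tau only over the items the two lists share. The deck does not say how non-shared items were handled, so that restriction is an assumption, and the function names are hypothetical.

```python
from itertools import combinations


def shared_results(ranking_a, ranking_b):
    """Number of items present in both rankings."""
    return len(set(ranking_a) & set(ranking_b))


def kendall_tau(ranking_a, ranking_b):
    """Kendall-Tau over the shared items: (concordant - discordant) / pairs."""
    in_b = set(ranking_b)
    common = [item for item in ranking_a if item in in_b]
    if len(common) < 2:
        raise ValueError("need at least two shared items")
    position_b = {item: index for index, item in enumerate(ranking_b)}
    concordant = discordant = 0
    # `common` follows the order of ranking_a, so first precedes second in A.
    for first, second in combinations(common, 2):
        if position_b[first] < position_b[second]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


previous = ["Renoir", "Manet", "Degas", "Sisley"]
current = ["Renoir", "Degas", "Manet", "Pissarro"]
print(shared_results(previous, current), kendall_tau(previous, current))
```

A value close to 1 from one setting to the next means that raising maxSampleSize or maxPulse further barely changes the ranking.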
  • 45. [Figure: Queries response time histogram. Axis: milliseconds]
  • 46. Algorithm visualization, available @ http://www.youtube.com/user/wearediscoveryhub
  • 47. Polycentric query propagation visualization, iteration 0. In red: Claude Monet. In blue: Musée d’Orsay. In purple: overlap
  • 48. Polycentric query propagation visualization, iteration 6. In red: Claude Monet. In blue: Musée d’Orsay. In purple: overlap
  • 49. Polycentric queries, average distances of top results from each seed.
    Semantic spreading activation | Distance A | Distance B | Gap A - B | Max distance A | Max distance B
    poly1 - top 10 | 1.53 | 1.68 | 0.34 | / | /
    poly2 - top 10 | 1.52 | 1.66 | 0.33 | / | /
    poly1 - top 100 | 1.90 | 2.12 | 0.49 | 2.60 | 2.60
    poly2 - top 100 | 1.88 | 2.11 | 0.48 | 2.58 | 2.58
  • 50. Influence of graph diameter on algorithm convergence. We studied the convergence of the algorithm according to various graph metrics. For this purpose we generated many graphs with the GraphStream graph library (http://graphstream-project.org/doc/Generators/). Conclusion: the diameter is crucial.
  • 51. [Figure: Influence of graph diameter on algorithm convergence. Results similarity over iterations 1 to 20, one curve per diameter: 4.14, 6.72, 9.94, 13.28, 15.43, 19.59, 22.03, 24.87, 28.85]
  • 52. [Figure: Influence of graph diameter on algorithm convergence. Average rank variation over iterations 1 to 20, one curve per diameter: 4.14, 6.72, 9.94, 13.28, 15.43, 19.59, 22.03, 24.87, 28.85]
  • 53. Discovery Hub is an exploratory search engine that helps you discover things you might like or be interested in. It widens your cultural and knowledge horizons by revealing and explaining unexpected information. It lets you perform queries in an innovative way and helps you navigate rich results. As a hub, it redirects you to other platforms (YouTube, Deezer and more) so you can benefit from your discoveries. The results are explained in depth through 3 explanatory features. Discovery Hub supports simple and composite explorations, i.e. starting from one or several items of interest. It offers, and is able to combine, advanced exploration modes such as serendipitous, multi-lingual, and fine-grained ones.
  • 54. Context; Research question; Research (proposition, implementation, operational prototypes, user evaluations); Publication
  • 55. 1. Start from what you like or are interested in. 2. Explore, discover, understand. 3. Be redirected to great platforms to experience your discoveries
  • 56. Discovery Hub V1
  • 57. Discovery Hub V2
  • 58. 3 features to understand the results: common properties
  • 59. 3 features to understand the results: Wikipedia crossed references
  • 60. 3 features to understand the results: explanatory graph
  • 61. Internationalization
  • 62. Serendipitous mode: Claude_Monet, …; nodes marked « ? »; Iteration 0, Iteration 1, Iteration 2
  • 63. Multi-lingual mode: dbpedia:Claude_Monet sameAs fr.dbpedia:Claude_Monet
  • 64. Fine search mode: 1960s_science_fiction_films; Films_set_on_the_Moon; Artificial_intelligence_in_fiction; Space_adventure_films. Top 4 films
  • 65. Fine search mode: directed by Stanley Kubrick. Top 4 films
  • 66. Fine search mode: 1960s_science_fiction_films; Films_set_on_the_Moon; Artificial_intelligence_in_fiction; Space_adventure_films; directed by Stanley Kubrick. Top 4 films
  • 67. Multi-criteria mode
  • 68. « Surprise mode »; Multi-lingual; Fine-search; Multi-criteria mode
  • 69. Demo videos, available @ http://www.youtube.com/user/wearediscoveryhub
  • 70. Context; Research question; Research (proposition, implementation, operational prototypes, user evaluations); Publication
  • 71. Mono-centric queries: evaluated positively on the movie domain against another algorithm, the sSVM implemented in the MORE movie recommender. Rating labels: Very interesting; Not interesting at all
  • 72. Scores for partial lists
  • 73. Films_about_criticism_and_refusal_of_work; Anti-modernist_films; Fiction_with_unreliable_narrators. [Chart labels: Neutral; Personalized; Interesting]
  • 74. Poly-centric queries: evaluated positively. [Chart: Relevance (Very interesting / Not interesting at all) and Discovery (Very surprising / Not surprising at all), scale 0 to 3]
  • 75. Overlap between relevance and unexpectedness: 61.6% of results were rated as strongly relevant or relevant by the participants; 65% of results were rated as strongly unexpected or unexpected; 35.42% of results were rated both as strongly relevant or relevant and as strongly unexpected or unexpected.
  • 76. Explanation features. [Chart: common properties, Wikipedia-based, graph-based and overall scores for monocentric and polycentric queries, scale 0 to 3 from « Not helpful at all » to « Very helpful »]
  • 77. Publications:
    - Nicolas Marie, Fabien Gandon, Myriam Ribière, Florentin Rodio. Discovery Hub: on-the-fly linked data exploratory search. I-Semantics 2013, Graz, 4-6 September (paper).
    - Nicolas Marie, Fabien Gandon, Damien Legrand, Myriam Ribière. Exploratory Search on the top of DBpedia chapters with the Discovery Hub Application. ESWC 2013, Montpellier, 26-30 May (demo + poster).
    - Nicolas Marie, Olivier Corby, Fabien Gandon, Myriam Ribière. Composite interests exploration thanks to on-the-fly linked data spreading activation. Hypertext 2013, Paris, 1-3 May (paper).
    - Clare J. Hooper, Nicolas Marie, Evangelos Kalampokis. Dissecting the Butterfly: Representation of Disciplines Publishing at the Web Science Conference Series. Web Science 2012, Northwestern University, Evanston, United States, 22-24 June (paper).
    - Nicolas Marie, Fabien Gandon. Advanced social objects recommendation in multidimensional social networks. Social Object Workshop 2011, MIT, Boston, USA (paper).
  • 78. Context; Research question; Research (proposition, implementation, operational prototypes, user evaluation); Publication
  • 79. Discovery Hub
  • 80. Thank you! ncmarie3&@gmail.com http://ncmarie.tumblr.com http://discoveryhub.co werarediscoveryhub@gmail.com