Evaluating Entity Summarization Using a Game-Based Ground Truth

In recent years, strategies for Linked Data consumption have caught attention in Semantic Web research. For direct consumption by users, Linked Data mashups, interfaces, and visualizations have become a popular research area. Many approaches in this field aim to make Linked Data interaction more user friendly to improve its accessibility for nontechnical users. A subtask for Linked Data interfaces is to present entities and their properties in a concise form. In general, these summaries take individual attributes and sometimes user contexts and preferences into account. But the objective evaluation of the quality of such summaries is an expensive task. In this paper we introduce a game-based approach aiming to establish a ground truth for the evaluation of entity summarization. We exemplify the applicability of the approach by evaluating two recent summarization approaches.

http://iswc2012.semanticweb.org/sites/default/files/76500342.pdf

  • Putcha, thank you for your comment. I would be happy to get in touch. You can contact me via thalhammer(at)kit.edu.

    Andreas
  • Hai Andreas Thalhammer:

    Your PPT popped up while I was reviewing my PPTs on slideshare. Good, faithful and factual summarization is very tough, though some highly capable humans can do it effortlessly and impressively. I liked your informal definitions and observations, particularly 'everyone knows the main points and everyone misses key minor points'. A good summary should also point out all the minor points without stretching the summary.

    The next complicated aspect of summary is to point out what is vital but missing. This requires knowledge and mastery of the subject and cannot depend only on the article reviewed for summary. I am working on this. It is very exciting. I am using color coding of concepts and clustering them into themes. I have a trial document for assessment. Let me know your email ID. I would like you to review and give me your views.

    Thanks and regards,
    23 MAY 14

    Evaluating Entity Summarization Using a Game-Based Ground Truth: Presentation Transcript

    • Evaluating Entity Summarization Using a Game-Based Ground Truth
      Andreas Thalhammer¹, Magnus Knuth², and Harald Sack²
      ¹ University of Innsbruck, Austria
      ² Hasso Plattner Institute Potsdam, Germany
      13 Nov. 2012, ISWC 2012, Boston
    • Google: “Get the best summary” [1]
      • Inglourious Basterds (Movie)
        – Freebase: 1279 triples
        – DBpedia: 217 triples
        – Google Knowledge Graph summary: 14 triples
    • Entity Summarization
      • First attempt towards a definition: “... not just represent the main themes of the original data, but rather, can best identify the underlying entity” [2]
      • Is this a good definition?
    • Entity Summarization (cont.)
      “A summary can be loosely defined as a text that is produced from one or more texts, that conveys important information in the original text(s), and that is no longer than half of the original text(s) and usually significantly less than that.” [3]
      A summary is
      • short
      • and conveys important information.
    • Entity Summarization (cont.)
      • Our (loose) definition: “Entity summarization is the task of producing a summary that conveys important facts about the entity while reducing the number of shown facts significantly.”
    • The Problem: Evaluation
      • How do we make different summarization systems comparable?
      • Sub-question: How do we grasp the idea of “important facts”?
    • Related Work
      • RELIN: Relatedness and Informativeness-based Centrality for Entity Summarization [2]
        – Intrinsic: 24 users compiled summaries of 149 entities, forming a gold standard (intersection-based similarity).
        – Extrinsic: 47 pairs of Freebase and DBpedia entities were selected (24 correct ones, 23 incorrect ones). Users judge whether pairs are correct or not.
    • Related Work (cont.)
      • Towards exploratory video search using linked data [4]
        – Quantitative evaluation of heuristics
        – Ground truth containing 115 entities summarized by 72 users
        – Precision/recall similarity measure
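
The precision/recall comparison used in this kind of gold-standard evaluation can be sketched as follows. This is a minimal illustration, assuming that a summary is a set of (property, value) pairs; the function name and the example facts are hypothetical and not taken from the cited work.

```python
def precision_recall(summary, gold):
    """Compare a system summary with a gold summary.

    Both arguments are collections of (property, value) pairs.
    Returns (precision, recall) based on the size of their intersection.
    """
    summary, gold = set(summary), set(gold)
    overlap = len(summary & gold)
    precision = overlap / len(summary) if summary else 0.0
    recall = overlap / len(gold) if gold else 0.0
    return precision, recall


# Illustrative facts only; not data from the evaluation.
gold = {
    ("director", "Quentin Tarantino"),
    ("actor", "Brad Pitt"),
    ("genre", "War film"),
}
system = {
    ("director", "Quentin Tarantino"),
    ("actor", "Brad Pitt"),
    ("runtime", "153"),
}

precision, recall = precision_recall(system, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

The intersection size computed inside the function is also the basis of an intersection-based similarity such as the one mentioned for RELIN.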
    • Related Work (cont.)
      • It is hard to find participants.
      • Generating summaries is a cumbersome process.
      • Only a subset of property-value pairs is ranked by the users.
      • Up to this point, neither of the two evaluation datasets is publicly available.
    • Our Idea
      • Important facts are commonly known.
      • Unimportant facts are rarely known.
      • How to find out? A Linked Data quiz game!
    • Hypothesis
      “A game-based ground truth is suitable for evaluating the performance of summarization approaches in the movie domain.”
      Assumption: implemented approaches correlate with the game-based ground truth while random summaries do not.
    • Dataset
      • 60 arbitrarily selected movies from the IMDb Top 250
      • RDF descriptions from Freebase
      • Usage of a property whitelist
      • Triple store: Ontotext’s OWLIM with OWL2-RL reasoning enabled
      • Property chains:
        <http://some-name.space/hasActor>
          <http://www.w3.org/2002/07/owl#propertyChainAxiom>
            ( <http://rdf.freebase.com/ns/film.film.starring>
              <http://rdf.freebase.com/ns/film.performance.actor> ).
      All data is available at: http://yovisto.com/labs/iswc2012
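
To make the effect of the property chain concrete, the sketch below materializes the shortcut property over a small in-memory set of triples. In the presented setup this inference is done by the OWLIM reasoner; the plain-Python function here is only a stand-in, and the `ex:` identifiers are hypothetical.

```python
from collections import defaultdict

STARRING = "http://rdf.freebase.com/ns/film.film.starring"
ACTOR = "http://rdf.freebase.com/ns/film.performance.actor"
HAS_ACTOR = "http://some-name.space/hasActor"


def apply_property_chain(triples, first, second, derived):
    """Infer (s, derived, o) wherever (s, first, x) and (x, second, o) hold."""
    # Index the second hop by its subject so each first hop is a dict lookup.
    second_hop = defaultdict(set)
    for s, p, o in triples:
        if p == second:
            second_hop[s].add(o)

    inferred = set()
    for s, p, o in triples:
        if p == first:
            for target in second_hop.get(o, ()):
                inferred.add((s, derived, target))
    return inferred


# Hypothetical identifiers: a film linked to an actor via a performance node.
triples = {
    ("ex:Pulp_Fiction", STARRING, "ex:performance_1"),
    ("ex:performance_1", ACTOR, "ex:John_Travolta"),
}

inferred = apply_property_chain(triples, STARRING, ACTOR, HAS_ACTOR)
print(inferred)
```

The chain collapses the intermediate performance node, so a film and its actor are connected by a single triple that a quiz question can use directly.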
    • WhoKnows?Movies!
        S                     P            O
        :The_Princess_Bride   prop:actor   :Billy_Crystal, ...
        :Braveheart           prop:actor   :Mel_Gibson, ...
        :Pulp_Fiction         prop:actor   :John_Travolta .
      • Question types:
        – One-to-One
        – One-to-N
      • Questions are composed upside down: ‘Object is the property of subject1, subject2, subject3’
      Play the game at: http://bit.ly/WhoKnowsMovies
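
A One-to-One question of the "upside down" form can be sketched as below. This is an assumption about how such a question might be built from triples, not the game's actual implementation; the function name, the dictionary layout, and the distractor selection are hypothetical.

```python
import random

# Triples from the slide, reduced to one actor per movie.
TRIPLES = [
    (":The_Princess_Bride", "prop:actor", ":Billy_Crystal"),
    (":Braveheart", "prop:actor", ":Mel_Gibson"),
    (":Pulp_Fiction", "prop:actor", ":John_Travolta"),
]


def one_to_one_question(triples, prop, rng, n_choices=3):
    """Build a question with one object and exactly one correct subject."""
    candidates = [(s, o) for s, p, o in triples if p == prop]
    subject, obj = rng.choice(candidates)

    # Distractors are subjects that do not carry this property-value pair.
    holders = {s for s, o in candidates if o == obj}
    all_subjects = sorted({s for s, _, _ in triples})
    distractors = [s for s in all_subjects if s not in holders]

    choices = rng.sample(distractors, n_choices - 1) + [subject]
    rng.shuffle(choices)
    return {
        "text": f"{obj} is the {prop} of ...",
        "object": obj,
        "choices": choices,
        "answer": subject,
    }


question = one_to_one_question(TRIPLES, "prop:actor", random.Random(0))
print(question["text"], question["choices"])
```

Each answered question yields one observation of whether a player knew the underlying triple, which is the raw material for the ground truth.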
    • Frequency == Importance???
      • Information retrieval: Luhn (1958), “resolving power of words” [5]
        [Figure: words ranked by word frequency, with an upper and a lower frequency cut-off]
      • Game supports half-knowledge in general
        – e.g. Which movie was released in 1994? Monsters, Inc. / Pulp Fiction / Casablanca / ...
        – ... but the human brain performs better with pictures (actors), sounds (film music), ...
    • Evaluated Systems
      • UBES (Usage-based Entity Summarization) [5]
        – Combine Freebase with HetRec2011 MovieLens2k [6]
        – Use item-based collaborative filtering to form neighborhoods for each movie
        – Find out which property-value pairs a movie shares with its neighbors
        – Use a TF-IDF related weighting scheme

                       Bob  Alice  Marc  Elena  John  Mary
        Pulp Fiction    1     0     1      0     1     1
        Heat            0     0     1      1     0     0
        Kill Bill       1     0     1      0     1     0
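
The neighborhood step can be illustrated with the usage matrix from the slide. The sketch below uses cosine similarity between item vectors, which is a common choice for item-based collaborative filtering but is an assumption here; the slide does not state which similarity measure UBES uses.

```python
import math

# Usage matrix from the slide; columns: Bob, Alice, Marc, Elena, John, Mary.
USAGE = {
    "Pulp Fiction": [1, 0, 1, 0, 1, 1],
    "Heat": [0, 0, 1, 1, 0, 0],
    "Kill Bill": [1, 0, 1, 0, 1, 0],
}


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def neighbors(movie, k=1):
    """Return the k movies whose user vectors are most similar to `movie`."""
    scored = [
        (cosine(USAGE[movie], vector), other)
        for other, vector in USAGE.items()
        if other != movie
    ]
    return [other for _, other in sorted(scored, reverse=True)[:k]]


print(neighbors("Pulp Fiction"))
```

Under this measure Kill Bill is the nearest neighbor of Pulp Fiction, so property-value pairs shared by the two movies would be the ones weighted in the next step.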
    • Evaluated Systems (cont.)
      • GKG (Google Knowledge Graph) [1]
        – Enables semi-automatic transformation to Freebase
        – Example search URL: /search?hl=en&q=quentin+tarantino&stick=H4sIAAAAAAAAAONgVuLQz9U3MLM0zgEA_sQyxwwAAAA&sa=X&ei=FnjTT7rXN8jftAaAhPWIDw&ved=0CKwBEJsTKAA
        – The stick parameter is base64 + gzip encoded and decodes to /m/0693l
        – http://www.freebase.com/view/m/0693l redirects to: http://www.freebase.com/view/en/quentin_tarantino
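
The "base64 + gzip" step can be reproduced with the Python standard library. The sketch below decodes the `stick` value from the slide; treating it as URL-safe base64 without padding and skipping the fixed-size gzip header and trailer are assumptions made for this example, and the function name is hypothetical.

```python
import base64
import re
import zlib

# Value of the `stick` URL parameter shown on the slide.
STICK = "H4sIAAAAAAAAAONgVuLQz9U3MLM0zgEA_sQyxwwAAAA"


def decode_stick(stick):
    """Decode a `stick` value into its raw payload bytes."""
    # The value uses the URL-safe alphabet and omits padding.
    padded = stick + "=" * (-len(stick) % 4)
    raw = base64.urlsafe_b64decode(padded)
    # Skip the 10-byte gzip header and the 8-byte trailer, then inflate the
    # raw deflate stream. The trailer checksum is not verified in this sketch.
    return zlib.decompress(raw[10:-8], wbits=-15)


payload = decode_stick(STICK)
# The payload holds a few binary bytes followed by the Freebase identifier.
match = re.search(rb"/m/[0-9a-z_]+", payload)
mid = match.group().decode("ascii")
print(mid)  # /m/0693l
```

The extracted identifier is the one that appears in the Freebase URL on the slide.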
    • Results
      • 690 sessions, 8308 questions
      • 217 players (135 players played only once)
      • 2314 of 2829 triples were played more than 3 times
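
One way to turn such game logs into a ranking is sketched below. It assumes that a fact's importance is estimated by its share of correct answers and that facts played 3 times or fewer are left out; the slides do not give the exact scoring formula, so both the formula and the example answers are hypothetical.

```python
from collections import Counter


def rank_facts(answers, min_plays=4):
    """Rank facts by their share of correct answers.

    `answers` is an iterable of (fact, is_correct) pairs, one per question.
    Facts played fewer than `min_plays` times are excluded.
    """
    plays, correct = Counter(), Counter()
    for fact, is_correct in answers:
        plays[fact] += 1
        correct[fact] += int(is_correct)

    scores = {
        fact: correct[fact] / plays[fact]
        for fact in plays
        if plays[fact] >= min_plays
    }
    # Highest ratio first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda fact: (-scores[fact], fact))


# Hypothetical answer log, for illustration only.
answers = (
    [("director", True)] * 4
    + [("actor", True)] * 3 + [("actor", False)]
    + [("runtime", True)] + [("runtime", False)] * 3
    + [("editor", True)] * 2
)

ranking = rank_facts(answers)
print(ranking)
```

With `min_plays=4` the sketch mirrors the "played more than 3 times" threshold reported above, so the rarely played `editor` fact does not enter the ranking.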
    • Result: Kendall’s τ
      • Property ranking [figure]
      • Feature (property-value) ranking [figure]
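
Kendall's τ compares two rankings by counting item pairs that appear in the same or in the opposite order. The sketch below implements the tie-free variant (tau-a) for two rankings of the same items; whether the evaluation used this variant or one with tie handling is not stated on the slide, and the example rankings are hypothetical.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a for two rankings of the same items, without ties."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    if set(pos_a) != set(pos_b) or len(pos_a) < 2:
        raise ValueError("rankings must contain the same two or more items")

    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        # A pair is concordant when both rankings order it the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1

    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical rankings, for illustration only.
ground_truth = ["director", "actor", "genre", "runtime"]
system = ["actor", "director", "genre", "runtime"]

tau = kendall_tau(ground_truth, system)
print(round(tau, 3))
```

A value of 1 means identical order, -1 means reversed order, and values near 0 are what a random summary would be expected to produce against the ground truth.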
    • Conclusion
      • The results indicate that a game-based ground truth is suitable for evaluating entity summarization.
      • The current dataset is too sparse to make valid assumptions about the importance of single facts.
    • Future Work
      • Increase the number of players
      • Score the exclusion principle
      • Increase the number of movies
      • Application to additional domains
      • Publish new versions of the evaluation dataset on a regular basis
    • Questions?
      Help collecting data: http://bit.ly/WhoKnowsMovies
      Andreas Thalhammer (andreas.thalhammer@sti2.at)
      Magnus Knuth (magnus.knuth@hpi.uni-potsdam.de)
      Harald Sack (harald.sack@hpi.uni-potsdam.de)
    • References
      [1] Singhal, A.: Introducing the knowledge graph: things, not strings (2012), http://googleblog.blogspot.com/2012/05/introducing-knowledge-graph-things-not.html
      [2] Cheng, G., Tran, T., Qu, Y.: RELIN: Relatedness and Informativeness-Based Centrality for Entity Summarization. In: Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N., Blomqvist, E. (eds.) ISWC 2011, Part I. LNCS, vol. 7031, pp. 114–129. Springer, Heidelberg (2011)
      [3] Radev, D.R., Hovy, E., McKeown, K.: Introduction to the special issue on summarization. Computational Linguistics 28(4), 399–408 (2002), http://dx.doi.org/10.1162/089120102762671927
      [4] Waitelonis, J., Sack, H.: Towards exploratory video search using linked data. Multimedia Tools and Applications 59, 645–672 (2012), 10.1007/s11042-011-0733-1
      [5] Thalhammer, A., Toma, I., Roa-Valverde, A.J., Fensel, D.: Leveraging usage data for linked data movie entity summarization. In: Proc. of the 2nd Int. Ws. on Usage Analysis and the Web of Data (USEWOD 2012) co-located with WWW 2012, Lyon, France, vol. abs/1204.2718 (2012)
      [6] Cantador, I., Brusilovsky, P., Kuflik, T.: 2nd Ws. on Information Heterogeneity and Fusion in Recommender Systems (HetRec 2011). In: Proc. of 5th ACM Conf. on Recommender Systems, RecSys 2011. ACM, New York (2011)