What knowledge bases know (and what they don't)

  1. What knowledge bases know (and what they don't)
     Simon Razniewski
     Free University of Bozen-Bolzano, Italy
     Max Planck Institute for Informatics (starting November 2017)
  2. About myself
     • Assistant professor at FU Bozen-Bolzano, South Tyrol, Italy (since 2014)
     • PhD from FU Bozen-Bolzano (2014)
     • Diplom from TU Dresden, Germany (2010)
     • Research visits at UCSD (2012), AT&T Labs-Research (2013), UQ (2015), MPII (2016)
     • South Tyrol: trilingual, home of the Alps' oldest criminal case (Ötzi) and of 1/8th of the EU's apples
  3. What do knowledge bases know?
     What is a knowledge base? A collection of general world knowledge:
     • Common sense: apples are sweet or sour; cats are smaller than cars
     • Activities: "whisper" and "shout" are implementations of "talk"
     • Facts: Saarbrücken is the capital of the Saarland; Ötzi has blood type O
  4. Factual KBs: An old dream of AI
     • Early manual efforts (CYC, 1980s)
     • Structured extraction (YAGO, DBpedia, 2000s)
     • Text mining and extraction (NELL, Prospera, Textrunner, 2000s)
     • Back to the roots: Wikidata (2012)
  5. KBs are useful (1/2): QA
     Q: What is the capital of the Saarland?
     Try yourself:
     • When was Trump born?
     • What is the nickname of Ronaldo?
     • Who invented the light bulb?
  6. KBs are useful (2/2): Language generation
     • Wikipedia in the world's most spoken language: 1/10 as many articles as English Wikipedia
     • World's fourth most spoken language: 1/100
     ⇒ Wikidata is intended to help resource-poor languages
  7. KB construction: Current state
     • More than 2300 papers with titles containing "information extraction" in the last 4 years [Google Scholar]
     • Large KBs at Google, Microsoft, Alibaba, Bloomberg, …
     • Progress visible downstream:
       • IBM Watson beats humans in a trivia game in 2011
       • Entity linking systems close to human performance on popular news corpora
       • Systems pass 8th-grade science tests in the AllenAI Science challenge in 2016
     • But how good are the KBs themselves?
  8. How good are the KBs that we build?
     • Is what they know true? (precision or correctness)
     ⇒ Do they know what is true? (recall or completeness)
  9. KBs know much of what is true
     • Google Knowledge Graph: 39 out of 48 Tarantino movies ✓
     • DBpedia: 167 out of 204 Nobel laureates in Physics ✓
     • Wikidata: 2 out of 2 children of Obama ✓
  10. Affiliations (https://query.wikidata.org/)
      Query: SELECT (COUNT(?p) AS ?result) WHERE { ?p wdt:P108 wd:Q700758 . }
      (employer = wdt:P108, Saarland University = wd:Q700758)
      • Saarland University: 325
      • MPI-INF: 2
      • MPI-SWS: 0
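      A minimal Python sketch of running the count query above against the public Wikidata SPARQL endpoint (https://query.wikidata.org/sparql). It assumes only the requests library; wdt:P108 (employer) and wd:Q700758 (Saarland University) are the identifiers shown on the slide, and current counts will differ from the 2017 numbers.

        # Count how many people Wikidata lists with employer (wdt:P108) = a given organization.
        import requests

        def count_employees(org_qid: str) -> int:
            query = f"SELECT (COUNT(?p) AS ?result) WHERE {{ ?p wdt:P108 wd:{org_qid} . }}"
            resp = requests.get(
                "https://query.wikidata.org/sparql",
                params={"query": query, "format": "json"},
                headers={"User-Agent": "kb-recall-demo/0.1 (example script)"},
            )
            resp.raise_for_status()
            return int(resp.json()["results"]["bindings"][0]["result"]["value"])

        print(count_employees("Q700758"))  # Saarland University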
  11. KBs know little of what is true
      • DBpedia: contains 6 out of 35 Dijkstra Prize winners ✗
      • Google Knowledge Graph: "Points of Interest" – completeness?
      • Wikidata: does not know much about the employees here (see the affiliation counts above) ✗
  12. So, how complete are KBs?
  13. What previous work says
      [Dong et al., KDD 2014]
      "There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don't know we don't know."
      KB engineers have only tried to make KBs bigger. The point, however, is to understand what they are trying to approximate.
  14. Outline – Assessing KB recall
      1. Logical foundations
      2. Rule mining
      3. Information extraction
      4. Data presence heuristic
  15. Outline (next: 1. Logical foundations)
  16. Closed- and open-world assumption
      worksIn:
        Name   Department
        John   D1
        Mary   D2
        Bob    D3
      worksIn(John, D1)?   Closed-world assumption: Yes    Open-world assumption: Yes
      worksIn(Ellen, D3)?  Closed-world assumption: No     Open-world assumption: Maybe
      • (Relational) databases traditionally employ the closed-world assumption
      • KBs necessarily operate under the open-world assumption
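      A minimal Python sketch of the distinction, using the worksIn table from the slide; the data structure and function names are illustrative only.

        # Toy worksIn relation from the slide.
        WORKS_IN = {("John", "D1"), ("Mary", "D2"), ("Bob", "D3")}

        def answer_cwa(person, dept):
            # Closed world: everything not stated is false.
            return "Yes" if (person, dept) in WORKS_IN else "No"

        def answer_owa(person, dept):
            # Open world: everything not stated is unknown.
            return "Yes" if (person, dept) in WORKS_IN else "Maybe"

        print(answer_cwa("John", "D1"), answer_owa("John", "D1"))    # Yes Yes
        print(answer_cwa("Ellen", "D3"), answer_owa("Ellen", "D3"))  # No Maybe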
  17. Open-world assumption
      • Q: Hamlet written by Goethe?           KB: Maybe
      • Q: Schwarzenegger lives in Dudweiler?  KB: Maybe
      • Q: Trump brother of Kim Jong Un?       KB: Maybe
      ⇒ The open-world assumption is often too cautious
  18. Teaching KBs to say "no"
      • Need the power to express both maybe and no = partial-closed world assumption
      • Approach: completeness statements [Motro 1989]
      Completeness statement: worksIn is complete for employees of D1
      worksIn:
        Name   Department
        John   D1
        Mary   D2
        Bob    D3
      worksIn(John, D1)?   Yes
      worksIn(Ellen, D1)?  No
      worksIn(Ellen, D3)?  Maybe
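      Continuing the sketch from slide 16, a completeness statement for department D1 lets the KB answer "No" there while staying open-world everywhere else; the code is a sketch under these toy assumptions, not a real reasoner.

        WORKS_IN = {("John", "D1"), ("Mary", "D2"), ("Bob", "D3")}
        COMPLETE_DEPARTMENTS = {"D1"}  # "worksIn is complete for employees of D1"

        def answer_pcwa(person, dept):
            if (person, dept) in WORKS_IN:
                return "Yes"
            if dept in COMPLETE_DEPARTMENTS:
                return "No"     # the statement guarantees no worksIn fact for D1 is missing
            return "Maybe"      # open world for all other departments

        print(answer_pcwa("John", "D1"))   # Yes
        print(answer_pcwa("Ellen", "D1"))  # No
        print(answer_pcwa("Ellen", "D3"))  # Maybe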
  19. Completeness statements
      • Assertions that the available database contains all information on a certain topic:
        "worksIn is complete for employees of D1"
      • Formally, constraints between an ideal database and the available database:
        ∀x: worksIn_i(x, D1) → worksIn_a(x, D1)   (i = ideal, a = available)
      • Can have expressivity ranging from simple selections up to first-order logic
  20. If you have completeness statements, you can do wonderful things…
      • Develop techniques for deciding whether a conjunctive query answer is complete [VLDB 2011]
      • Assign unambiguous semantics to SQL nulls [CIKM 2012]
      • Create an algebra for propagating completeness [SIGMOD 2015]
      • Ensure the soundness of queries with negation [ICWE 2016]
      • …
  21. Where would completeness statements come from?
      • Data creators should pass them along as metadata
      • Or editors should add them in curation steps
      • Developed a plugin and an external tool, COOL-WD (a completeness tool for Wikidata)
  22. (Screenshot slide; no textual content.)
  23. But…
      • Requires human effort
      • Editors are lazy
      • Automatically created KBs do not even have editors
      Remainder of this talk: how to automatically acquire information about KB completeness/recall
  24. Outline (next: 2. Rule mining)
  25. Rule mining: Idea (1/2)
      Certain patterns in the data hint at completeness/incompleteness:
      • People with a death date but no death place are incomplete for death place
      • Movies with a producer are complete for directors
      • People with less than two parents are incomplete for parents
  26. Rule mining: Idea (2/2)
      • The examples can be expressed as Horn rules:
        dateOfDeath(X, Y) ∧ lessThan1(X, placeOfDeath) ⇒ incomplete(X, placeOfDeath)
        movie(X) ∧ producer(X, Z) ⇒ complete(X, director)
        lessThan2(X, hasParent) ⇒ incomplete(X, hasParent)
      • Can such patterns be discovered with association rule mining?
  27. Rule mining: Implementation
      • We extended the AMIE association rule mining system with predicates for
        • complete/incomplete:  complete(X, director)
        • object counts:        lessThan2(X, hasParent)
        • popularity:           popular(X)
        • negated classes:      person(X) ∧ ¬adult(X)
      • Then mined rules with complete/incomplete in the head for 20 YAGO/Wikidata relations
      • Result: can predict (in-)completeness with 46-100% F-score
      [Galarraga et al., WSDM 2017]
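      A minimal Python sketch of how one such mined rule, lessThan2(X, hasParent) ⇒ incomplete(X, hasParent), makes predictions over a toy triple set. This is not the AMIE system itself; the facts and entity names are made up for illustration.

        from collections import defaultdict

        TRIPLES = [  # (subject, predicate, object)
            ("Obama", "hasParent", "Ann_Dunham"),
            ("Obama", "hasParent", "Barack_Obama_Sr"),
            ("Trump", "hasParent", "Fred_Trump"),
        ]

        def object_counts(triples, predicate):
            counts = defaultdict(int)
            for s, p, _ in triples:
                if p == predicate:
                    counts[s] += 1
            return counts

        def predict_incomplete_parents(triples, entities):
            # Rule body lessThan2(X, hasParent) fires whenever X has < 2 parents in the KB.
            counts = object_counts(triples, "hasParent")
            return {e for e in entities if counts.get(e, 0) < 2}

        print(predict_incomplete_parents(TRIPLES, {"Obama", "Trump", "Merkel"}))
        # {'Trump', 'Merkel'}: both have fewer than two hasParent facts in the toy KB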
  28. Rule mining: Challenges
      • Consensus between conflicting rules:
        human(x) ⇒ complete(x, graduatedFrom)
        schoolteacher(x) ⇒ incomplete(x, graduatedFrom)
        professor(x) ⇒ complete(x, graduatedFrom)
        John ∈ {human, schoolteacher, professor} ⇒ complete(John, graduatedFrom)?
      • Rare properties require very large training data
        • E.g., monks being complete for spouses
        • Annotated ~3000 rows at 10 ct/row ⇒ 0 monks
  29. Outline (next: 3. Information extraction)
  30. Information extraction: Idea
      Text: "… Barack and Michelle have two children …"
      • KB contains 0 children ⇒ recall 0%
      • KB contains 1 child    ⇒ recall 50%
      • KB contains 2 children ⇒ recall 100%
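      A minimal Python sketch of the idea: a relation cardinality extracted from text ("have two children") is compared with the number of objects the KB already stores; the function name and numbers are illustrative only.

        def estimated_recall(kb_object_count: int, stated_cardinality: int) -> float:
            if stated_cardinality <= 0:
                raise ValueError("stated cardinality must be positive")
            return min(1.0, kb_object_count / stated_cardinality)

        stated = 2  # extracted from "Barack and Michelle have two children"
        for kb_count in (0, 1, 2):
            print(f"KB: {kb_count} -> recall {estimated_recall(kb_count, stated):.0%}")
        # KB: 0 -> recall 0%, KB: 1 -> recall 50%, KB: 2 -> recall 100%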
  31. Information extraction: Implementation
      • Developed a CRF-based classifier for identifying numbers that express relation cardinalities
      • Works for a variety of topics, such as
        • Family relations:  "has 2 siblings"
        • Geopolitics:       "is composed of seven boroughs"
        • Artwork:           "consists of three episodes"
      • Finds evidence for the existence of 178% more children than currently in Wikidata
      [Mirza et al., ISWC 2016 + ACL 2017]
  32. Information extraction: Challenges
      • Cardinalities are frequently expressed non-numerically:
        • Nouns:                 "has twins", "is a trilogy"
        • Indefinite articles:   "They have a daughter"
        • Negation/adjectives:   "have no children", "is childless"
        • Often requires reasoning: "has 3 children from Ivana and one from Marla"
      • Training (distant supervision) struggles with false positives
        • KBs used for training are themselves incomplete
        • President Garfield: Wikidata knows only 4 out of 7 children
  33. Vision: Make IE recall-aware
      Textual information extraction usually gives precision estimates:
      • "John was born in Malmö, Sweden."  ⇒ citizenship(John, Sweden) – precision 95%
      • "John grew up in Malmö, Sweden."   ⇒ citizenship(John, Sweden) – precision 70%
      Can we also produce recall estimates?
      • "John has a son, Tom, and a daughter, Susan."         ⇒ child(John, Tom), child(John, Susan) – recall 90%
      • "John brought his children Susan and Tom to school."  ⇒ child(John, Tom), child(John, Susan) – recall 30%
  34. Outline (next: 4. Data presence heuristic)
  35. Data presence heuristic: Idea
      KB: dateOfBirth(John, 17.5.1983)
      Q: dateOfBirth(John, 31.12.1999)?   A: Probably not
      For single-value properties:
      • Having one value ⇒ the property is complete
      • Looking at the data alone suffices
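      A minimal Python sketch of the heuristic for single-valued properties, with a made-up triple store: once a dateOfBirth value is present, any other candidate date is rejected, while unfilled properties stay open-world.

        SINGLE_VALUED = {"dateOfBirth"}
        KB = {("John", "dateOfBirth", "1983-05-17")}

        def answer(subject, prop, value):
            if (subject, prop, value) in KB:
                return "Yes"
            has_value = any(s == subject and p == prop for s, p, _ in KB)
            if prop in SINGLE_VALUED and has_value:
                return "Probably not"  # one value present -> treat the property as complete
            return "Maybe"             # otherwise stay open-world

        print(answer("John", "dateOfBirth", "1999-12-31"))  # Probably not
        print(answer("John", "placeOfBirth", "Malmö"))      # Maybe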
  36. What are single-value properties?
      Extreme cases exist, but also more commonly:
      • Multiple citizenships
      • More parents due to adoption
      • Several Twitter accounts due to the presidency
  37. All hope lost?
      • Presence of a value is better than nothing
      • Even better: for non-functional attributes, data is still frequently added in batches
        • All clubs Diego Maradona played for
        • All ministers of Merkel's new cabinet
        • …
      • Checking data presence is a common heuristic among Wikidata editors
  38. Value presence heuristic - example: https://www.wikidata.org/wiki/Wikidata:Wikivoyage/Lists/Embassies
  39. Data presence heuristic: Challenges
      4.1: Which properties to look at?
      4.2: How to quantify data presence?
  40. 4.1: Which properties to look at? (1/2)
      • Is Wikidata complete for Putin?
      • There are more than 3000 properties one can assign to Putin…
      • Not all properties are relevant to everyone (think of goals scored or monastic order)
      • Are at least all relevant properties there?
      • What do you mean by relevant?
  41. 4.1: Which properties to look at? (2/2)
      • We used crowdsourcing to annotate 350 random (person, property1, property2) triples with the human perception of interestingness
      • The state-of-the-art approach gets 61% of high-agreement triples right (it mistakes frequency for interestingness)
      • Our method, which also uses linguistic similarity, achieves 75%
      [Razniewski et al., ADMA 2017]
  42. 4.2: How to quantify data presence?
      We have values for 46 out of 77 relevant properties for Putin ⇒ hard to interpret
      Proposal: quantify based on comparison with other similar entities
      Ingredients:
      • Similarity metric:    who is similar to Trump?
      • Data quantification:  how much data is good/bad?
      • Deployed on Wikidata, but evaluation is difficult
      [Ahmeti et al., ESWC 2017]
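      A minimal Python sketch of quantifying data presence relative to similar entities: properties that most class peers have are treated as relevant, and the entity's coverage of those is reported. This is a simplified illustration of the idea, not the actual Recoin implementation; the entities, classes and the 50% threshold are made up.

        from collections import Counter

        ENTITIES = {
            "Putin":  {"class": "politician", "props": {"dateOfBirth", "position", "spouse"}},
            "Merkel": {"class": "politician", "props": {"dateOfBirth", "position", "party", "spouse"}},
            "Macron": {"class": "politician", "props": {"dateOfBirth", "position", "party"}},
        }

        def relative_completeness(name, min_peer_freq=0.5):
            entity = ENTITIES[name]
            peers = [e for n, e in ENTITIES.items() if n != name and e["class"] == entity["class"]]
            usage = Counter(p for e in peers for p in e["props"])
            # Properties that enough peers have are considered relevant for this entity.
            relevant = {p for p, c in usage.items() if c / len(peers) >= min_peer_freq}
            covered = relevant & entity["props"]
            return len(covered) / len(relevant), sorted(relevant - entity["props"])

        score, missing = relative_completeness("Putin")
        print(f"coverage of peer-relevant properties: {score:.0%}, missing: {missing}")
        # coverage of peer-relevant properties: 75%, missing: ['party']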
  43. Recoin: https://www.wikidata.org/wiki/User:Ls1g/Recoin
  44. Quantifying groups
  45. Outline (next: 5. Summary)
  46. Summary (1/3)
      • Increasing KB quality can to some extent be noticed downstream
      • Precision: easy to evaluate
      • Recall: largely unknown
  47. Summary (2/3)
      • The ideal is human-curated completeness information
        • Created in conjunction with the data (COOL-WD tool)
        • Not really scalable
      • Automated alternatives:
        • Association rule mining
        • Information extraction
      • Looking at the existence of data is a useful start
  48. Summary (3/3)
      • Recall-aware information extraction is an open challenge
      • The concepts of relevance and relative completeness in KBs are little understood to date
      • I look forward to fruitful collaborations with UdS, MPI-SWS and MPI-INF

Editor's Notes

  • O-like letter - otto
  • 350 man years to complete, estimate 1986
  • Google launched 1998 (1995 other name)
  • First Chinese, fourth Hindi
  • Marx point: see what you are actually trying to approximate
  • -> rule mining with constraints?
  • Here multiple claims, but so when do we have all?
  • Sl – sitelink yes or no, www yes or no, img yes or no
    Coordinate yes or no
    Phone yes or no
  • What is good/bad: Problem could be that very few are good/bad
  • Question: What are/how to find interesting facets?
  • Much work on entity and fact ranking, little on predicate ranking
