Open hpi semweb-06-part6


  1. 1. Semantic Web Technologies Lecture 6: Applications in the Web of Data 06: Named Entity Recognition Dr. Harald Sack Hasso Plattner Institute for IT Systems Engineering University of Potsdam Spring 2013 This file is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0)
  2. 2. Lecture 6: Applications in the Web of Data Open HPI - Course: Semantic Web Technologies Semantic Web Technologies, Dr. Harald Sack, Hasso-Plattner-Institut, Universität Potsdam
  3. 3. 06 - Named Entity Recognition Open HPI - Course: Semantic Web Technologies - Lecture 6: Applications in the Web of Data
  4. 4. Meaning sender Experience receiver Context Concept symbolizes refers to Experience Symbol Object stands for "Jaguar" Pragmatics Ogden, Richards: The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism (1923)
  5. 5. Armstrong
  6. 6. 'Armstrong' is more than just a character string
  7. 7. 'Armstrong' is more than just a character string Neil Armstrong
  8. 8. 'Armstrong' is more than just a character string Neil Armstrong is a Astronaut
  9. 9. 'Armstrong' is more than just a character string Neil Armstrong is a Astronaut subClassOf Science Occupation
  10. 10. 'Armstrong' is more than just a character string Neil Armstrong is a Astronaut subClassOf Science Occupation subClassOf Employment
  11. 11. 'Armstrong' is more than just a character string Neil Armstrong is a is a Astronaut Person subClassOf Science Occupation subClassOf Employment
  12. 12. 'Armstrong' is more than just a character string Neil Armstrong is a is a Astronaut Person subClassOf Science Occupation subClassOf has an Employment
  13. 13. 'Armstrong' is more than just a character string Neil Armstrong Entities is a is a Ontologies Astronaut Person subClassOf Science Occupation subClassOf has an Employment
  14. 14. 'Armstrong' is more than just a character string Neil Armstrong Entities is a is a Ontologies Astronaut Person subClassOf is NOT a Science Occupation subClassOf has an Employment
  15. 15. 'Armstrong' is more than just a character string Neil Armstrong Entities is a is a Ontologies same as Cosmonaut Astronaut Person subClassOf is NOT a Science Occupation subClassOf has an Employment
  16. 16. 'Armstrong' is more than just a character string Juri Gagarin is a Neil Armstrong Entities is a is a Ontologies same as Cosmonaut Astronaut Person subClassOf is NOT a Science Occupation subClassOf has an Employment
  17. 17. 'Armstrong' is more than just a character string Juri Gagarin is a Neil Armstrong Entities is a is a Ontologies same as Cosmonaut Astronaut Person subClassOf is NOT a Science Occupation subClassOf has an Employment Named Entity Recognition (also Entity Identification or Entity Extraction) refers to "locating and classifying atomic elements ... in predefined categories such as names, persons, organizations, locations, expressions of time, quantities, monetary values, etc." C.J. Rijsbergen, Information Retrieval (1979)
  18. 18. Where does the knowledge come from...?
  21. 21. Web of Data = Linked Open Data
  22. 22. Armstrong
  23. 23. Armstrong Entity Mapping Neil Armstrong is a is a Astronaut Person subClassOf Science Occupation subClassOf Employment
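
The entity mapping on this slide can be read as a small set of subject-predicate-object statements. The following is a minimal sketch, assuming plain strings instead of real URIs and a hand-written list of triples transcribed from the slide; the helper names `superclasses` and `types_of` are hypothetical and only illustrate how the `subClassOf` chain lets further types be inferred.

```python
# Triples transcribed from the slide; the identifiers are illustrative
# strings, not real ontology URIs.
TRIPLES = [
    ("Neil Armstrong", "is a", "Astronaut"),
    ("Neil Armstrong", "is a", "Person"),
    ("Astronaut", "subClassOf", "Science Occupation"),
    ("Science Occupation", "subClassOf", "Employment"),
]


def superclasses(cls, triples):
    """Return every class reachable from cls via subClassOf (transitively)."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for subj, pred, obj in triples:
            if subj == current and pred == "subClassOf" and obj not in found:
                found.add(obj)
                stack.append(obj)
    return found


def types_of(entity, triples):
    """Return the asserted types of entity plus all inferred superclasses."""
    direct = {obj for subj, pred, obj in triples
              if subj == entity and pred == "is a"}
    inferred = set()
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return direct | inferred


armstrong_types = types_of("Neil Armstrong", TRIPLES)
print(sorted(armstrong_types))
```

Under these assumptions the string 'Armstrong', once mapped to the entity, carries four types: the two asserted ones and the two reached through the class hierarchy.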
  24. 24. Meaning sender Experience receiver Concept symbolizes refers to Experience http://commons.wikimedia.org/wiki/User:McSmit Symbol Object stands for Armstrong Pragmatics Ogden, Richards: The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism (1923)
  25. 25. Meaning sender Experience receiver Context Concept symbolizes refers to Experience http://commons.wikimedia.org/wiki/User:McSmit Symbol Object stands for Armstrong Pragmatics Ogden, Richards: The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism (1923)
  26. 26. Armstrong landed the Eagle on the Moon.
  27. 27. Armstrong landed the Eagle on the Moon. Determine all possible Entity Mapping Candidates • linguistic analysis (POS tagging) • normalization • encoding and spelling • special (language dependent) characters • language dependent spellings • abbreviations, acronyms • type dependent spellings • alternative names and synonyms • fuzzy string mapping • ...
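
Two of the steps listed above, normalization and fuzzy string mapping, can be sketched with the Python standard library. This is only an illustration under stated assumptions: the label list is a handful of names taken from the slides, the `normalize` and `candidates` helpers are hypothetical, and the similarity threshold of 0.8 is an arbitrary choice, not a value from the lecture.

```python
import difflib
import unicodedata

# A few entity labels taken from the slides; a real system would draw
# these from a knowledge base.
LABELS = [
    "Neil Armstrong",
    "Louis Armstrong",
    "Lance Armstrong",
    "Armstrong, Ontario",
    "Armstrong (moon crater)",
    "Eagle (lunar module)",
]


def normalize(text):
    """Strip accents, lowercase, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.lower().split())


def candidates(surface, labels, threshold=0.8):
    """Return labels containing a token that is similar to the surface form."""
    term = normalize(surface)
    hits = []
    for label in labels:
        cleaned = normalize(label)
        for ch in ",()":
            cleaned = cleaned.replace(ch, " ")
        tokens = cleaned.split()
        best = max(
            (difflib.SequenceMatcher(None, term, tok).ratio() for tok in tokens),
            default=0.0,
        )
        if best >= threshold:
            hits.append(label)
    return hits


armstrong_candidates = candidates("Armstrong", LABELS)
misspelled_candidates = candidates("Armstrng", LABELS)
print(armstrong_candidates)
```

The misspelled surface form still reaches the same candidates because the comparison is fuzzy rather than exact; abbreviations, synonyms and language-dependent spellings would need additional lookup tables that this sketch leaves out.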
  28. 28. Armstrong landed the Eagle on the Moon. Determine all possible Entity Mapping Candidates
  29. 29. Armstrong landed the Eagle on the Moon. Determine all possible Entity Mapping Candidates Anton Armstrong Armstrong, Ontario Armstrong Tools Ian Armstrong Armstrong, Ontario Armstrong (car) Edward Armstrong Armstrong, Florida Neil Armstrong Armstrong (moon crater) Armstrong County, Texas Gary Armstrong The Armstrongs George Armstrong Armstrong Tunnel Armstrong Bridge Louis Armstrong Craig Armstrong The Armstrong Twins Lance Armstrong + 400 more...
  30. 30. Armstrong landed the Eagle on the Moon. Entity Selection process is determined by • context • ambiguity of source data / mapping • accuracy / reliability of source data / mapping
  31. 31. Armstrong landed the Eagle on the Moon. Entity Selection process is determined by • context • ambiguity of source data / mapping • accuracy / reliability of source data / mapping Anton Armstrong Armstrong, Ontario Armstrong Tools Ian Armstrong Armstrong, Ontario Armstrong (car) Edward Armstrong Armstrong, Florida Neil Armstrong Armstrong (moon crater) Armstrong County, Texas Gary Armstrong The Armstrongs George Armstrong Armstrong Tunnel Armstrong Bridge Louis Armstrong Craig Armstrong The Armstrong Twins Lance Armstrong + 400 more...
  32. 32. Armstrong landed the Eagle on the Moon. Consider all entities within the same context. Armstrong (448 entities): George Armstrong Custer; Neil Armstrong; The Armstrong Twins; Armstrong, Florida; Craig Armstrong; Armstrong, Ontario; Armstrong (Moon Crater); Armstrong Gun; Armstrong's Theorem; Louis Armstrong International Airport; Armstrong County, Texas; Joe Armstrong; Ian Armstrong; Armstrong Tunnel; Armstrong Automobile; Sir Thomas Armstrong; Louis Armstrong; Armstrong (British Columbia); Karen Armstrong; Curtis Armstrong; Hilary Armstrong; Gillian Armstrong; William L. Armstrong. Eagle (95 entities): Eagle (Bird); Eagle (heraldry); USCGC Eagle; The Eagle (2011 film); Eagle (comic); Eagle (song); Eagle (lunar module); The Eagle (newspaper); War Eagle; Eagle (Moon Crater); The Eagle (Pub); Eagle TV; Eagle Falls (Washington); Eagle (racehorse); John H. Eagle; Eagle (typeface); Angela Eagle; Linda Eagle; James Philipp Eagle. Moon (156 entities): Man on the Moon (film); Moon (song); Moon Son-Ri; Moon 44; C Moon; The Moon (Tarot card); Man on the Moon (soundtrack); Moon; Man on the Moon (musical); Mr. Moon (song); Moon (Band); Moon OS; Moon 83; Lottie Moon; Edgar Moon; Darvin Moon; Gary Moon; William Moon; Francis Moon; Robert Charles Moon; Allan Moon; Fly me to the Moon (song); Black Moon; Ban-Ki Moon.
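
The candidate counts on this slide give a feel for the size of the problem. As a rough worked example, assuming that each of the three terms is resolved to exactly one entity and that all combinations are considered jointly (an assumption made here for illustration, not a claim from the slide), the number of possible joint interpretations is the product of the three counts.

```python
from math import prod

# Candidate counts as stated on the slide.
candidate_counts = {"Armstrong": 448, "Eagle": 95, "Moon": 156}

# Number of ways to pick one entity per term.
joint_interpretations = prod(candidate_counts.values())
print(joint_interpretations)  # 6639360
```

With more than six million combinations for a three-term sentence, exhaustive comparison quickly becomes impractical, which is one motivation for narrowing the candidates by context first.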
  33. 33. Named Entity Recognition Strategies Entity Selection Process Select matching entities from all possible candidate entities: • Popularity based strategies • Linguistic strategies • Statistical strategies • Semantic based strategies Resources: • reference text corpus (Wikipedia) • link graph (Wikipedia) • semantic graph (DBpedia) General Approach 1. Make an assumption 2. Do the strategies support or contradict your assumption? 3. Make a decision according to logical and probabilistic rules/constraints N. Ludwig, H. Sack, "Named entity recognition for user-generated tags", TIR 2011
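
The three-step general approach can be sketched as a simple weighted vote. This is a minimal sketch, not the method of the cited paper: the strategy names, the evidence values in the range -1 (contradicts) to +1 (supports), and the weights are all hypothetical toy numbers chosen for illustration.

```python
# Toy evidence per strategy: +1 supports the assumption, -1 contradicts it.
# All numbers are invented for illustration.
EVIDENCE = {
    "popularity": {"Neil Armstrong": 0.6, "Louis Armstrong": 0.8,
                   "Armstrong, Ontario": 0.1},
    "semantic": {"Neil Armstrong": 0.9, "Louis Armstrong": -0.5,
                 "Armstrong, Ontario": -0.5},
}
WEIGHTS = {"popularity": 1.0, "semantic": 2.0}


def score(candidate, evidence, weights):
    """Step 2: sum up how strongly each strategy supports the assumption."""
    return sum(weights[name] * table.get(candidate, 0.0)
               for name, table in evidence.items())


def decide(candidates, evidence, weights):
    """Steps 1 and 3: assume each candidate in turn, keep the best supported."""
    return max(candidates, key=lambda cand: score(cand, evidence, weights))


candidates = ["Neil Armstrong", "Louis Armstrong", "Armstrong, Ontario"]
decision = decide(candidates, EVIDENCE, WEIGHTS)
print(decision, score(decision, EVIDENCE, WEIGHTS))
```

In this toy setup popularity alone would pick Louis Armstrong, while the semantic evidence from the sentence context outweighs it; the relative weights are a design choice that a real system would have to tune.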
  34. 34. Armstrong landed the Eagle on the Moon. Consider all entities within the same context
  40. 40. Armstrong landed the Eagle on the Moon. Entity Selection Process (Semantic) Graph Analysis. Armstrong (448 entities): George Armstrong Custer; Neil Armstrong; The Armstrong Twins; Armstrong, Florida; Craig Armstrong; Armstrong, Ontario; Armstrong (Moon Crater); Armstrong Gun; Armstrong's Theorem; Louis Armstrong International Airport; Armstrong County, Texas; Joe Armstrong; Ian Armstrong; Armstrong Tunnel; Armstrong Automobile; Sir Thomas Armstrong; Louis Armstrong; Armstrong (British Columbia); Karen Armstrong; Curtis Armstrong; Hilary Armstrong; Gillian Armstrong; William L. Armstrong. Eagle (95 entities): Eagle (Bird); Eagle (heraldry); USCGC Eagle; The Eagle (2011 film); Eagle (comic); Eagle (song); Eagle (lunar module); The Eagle (newspaper); War Eagle; Eagle (Moon Crater); The Eagle (Pub); Eagle TV; Eagle Falls (Washington); Eagle (racehorse); John H. Eagle; Eagle (typeface); Angela Eagle; Linda Eagle; James Philipp Eagle. Moon (156 entities): Man on the Moon (film); Moon (song); Moon Son-Ri; Moon 44; C Moon; The Moon (Tarot card); Man on the Moon (soundtrack); Moon; Man on the Moon (musical); Mr. Moon (song); Moon (Band); Moon OS; Moon 83; Lottie Moon; Edgar Moon; Darvin Moon; Gary Moon; William Moon; Francis Moon; Robert Charles Moon; Allan Moon; Fly me to the Moon (song); Black Moon; Ban-Ki Moon. N. Steinmetz, H. Sack: Semantic Multimedia Information Retrieval Based on Contextual Descriptions, 2013
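
One way to picture graph-based selection is to pick, for each term, the candidate that is best connected to the candidates chosen for the other terms. The sketch below is an assumption-laden illustration, not the algorithm of the cited paper: the candidate lists are a small subset of the slide, the `LINKS` set is a hand-made toy graph, and coherence is simply the number of linked pairs in an assignment.

```python
from itertools import combinations, product

# A small subset of the candidates shown on the slide.
CANDIDATES = {
    "Armstrong": ["Louis Armstrong", "Armstrong (Moon Crater)", "Neil Armstrong"],
    "Eagle": ["Eagle (Bird)", "Eagle (Moon Crater)", "Eagle (lunar module)"],
    "Moon": ["Moon (song)", "Lottie Moon", "Moon"],
}

# Toy undirected link graph, written by hand for illustration only.
LINKS = {
    frozenset(("Neil Armstrong", "Eagle (lunar module)")),
    frozenset(("Neil Armstrong", "Moon")),
    frozenset(("Eagle (lunar module)", "Moon")),
    frozenset(("Armstrong (Moon Crater)", "Moon")),
    frozenset(("Eagle (Moon Crater)", "Moon")),
}


def coherence(assignment):
    """Count how many pairs of chosen entities are linked in the graph."""
    return sum(frozenset(pair) in LINKS for pair in combinations(assignment, 2))


def best_assignment(candidates):
    """Try every combination of one entity per term, keep the most coherent."""
    return max(product(*candidates.values()), key=coherence)


selected = best_assignment(CANDIDATES)
print(dict(zip(CANDIDATES, selected)))
```

In this toy graph the two crater entities together with the Moon form a partially connected alternative (two links), but the astronaut, the lunar module and the Moon are fully connected (three links) and win. Exhaustive enumeration only works for such tiny candidate sets; it is used here for clarity.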
  41. 41. Armstrong landed the Eagle on the Moon. http://dbpedia.org/resource/Neil_Armstrong http://dbpedia.org/resource/Apollo_Lunar_Module http://dbpedia.org/resource/Moon
  42. 42. 07 - Semantic Search Open HPI - Course: Semantic Web Technologies - Lecture 6: Applications in the Web of Data
