Entity Typing and Event Extraction
Marieke van Erp

http://mariekevanerp.com
VU WP3 team
Isa Maks, Antske Fokkens, Marieke van Erp, Piek Vossen

Presentation about entity typing and event extraction in the context of CLARIAH, given at the CLARIAH tech day 2016 in Utrecht on 7 October 2016.

  1. Entity Typing and Event Extraction
     Marieke van Erp
     http://mariekevanerp.com
  2. VU WP3 team
     Isa Maks, Antske Fokkens, Marieke van Erp, Piek Vossen
  3. Entity typing
  4. What is entity typing?
     • Entity typing is the task of classifying an entity mention
     • An entity mention is a recognised name in a text that refers to a real world person, location, organisation or other interesting ‘thing’
  5. What is the added value of entity typing?
     • It allows you to query for fine-grained entity types: give me all electricians in the dataset, give me all historic buildings
     • Entity typing often includes linking an entity to background knowledge
     • The background knowledge provides additional filters: give me all politicians born after 1900 in the dataset
     • Caveat: the background knowledge is not complete
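
As a rough illustration of the "politicians born after 1900" filter, the sketch below joins typed mentions with a small background-knowledge table. All names, types and birth years are hypothetical, as are `MENTIONS`, `BACKGROUND` and `politicians_born_after`; none of this comes from the project's tools. It mainly shows how the completeness caveat surfaces: an entity without a known birth year can be neither confirmed nor ruled out.

```python
from typing import Optional

# Hypothetical typed mentions: (surface form, fine-grained type).
MENTIONS = [
    ("Person A", "politician"),
    ("Person B", "politician"),
    ("Person C", "electrician"),
    ("Person D", "politician"),
]

# Hypothetical background knowledge: birth year per entity.
# "Person D" is deliberately missing to mimic incomplete knowledge.
BACKGROUND: dict[str, Optional[int]] = {
    "Person A": 1875,
    "Person B": 1923,
    "Person C": 1950,
}


def politicians_born_after(year: int):
    """Return (matches, unknown): confirmed hits and entities lacking a birth year."""
    matches, unknown = [], []
    for name, entity_type in MENTIONS:
        if entity_type != "politician":
            continue
        birth_year = BACKGROUND.get(name)
        if birth_year is None:
            unknown.append(name)  # cannot be confirmed or excluded
        elif birth_year > year:
            matches.append(name)
    return matches, unknown


matches, unknown = politicians_born_after(1900)
print(matches)  # ['Person B']
print(unknown)  # ['Person D']
```

Returning the unknown entities separately, rather than silently dropping them, is one possible way to keep the gap in the background knowledge visible to the person querying.
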
  6. New synonym/concept lists are easy to plug in
  8. Brouwers:
     concept-100350 (ponstypiste) isRelatedTo class-Schrijfkunst
     concept-100343 (tachygraaf) isRelatedTo class-Schrijfkunst
     concept-100313 (schrijver) isRelatedTo class-Schrijfkunst
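
To give an idea of what "easy to plug in" could look like, here is a minimal sketch that loads the three Brouwers entries shown on the slide into a lookup table and tags tokens with their class. The one-entry-per-line text format, the regular expression and the function names `load_concepts` and `tag_tokens` are assumptions made for illustration; the actual CLARIAH data format and tooling may differ.

```python
import re
from collections import defaultdict

# The three entries from the slide, assumed here to be one entry per line.
BROUWERS_LINES = [
    "concept-100350 (ponstypiste) isRelatedTo class-Schrijfkunst",
    "concept-100343 (tachygraaf) isRelatedTo class-Schrijfkunst",
    "concept-100313 (schrijver) isRelatedTo class-Schrijfkunst",
]

# Assumed line layout: <concept id> (<term>) isRelatedTo <class id>
LINE_RE = re.compile(r"^(concept-\d+) \((.+?)\) isRelatedTo (class-\S+)$")


def load_concepts(lines):
    """Map each lower-cased term to the set of classes it is related to."""
    term_to_classes = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        _concept_id, term, class_id = match.groups()
        term_to_classes[term.lower()].add(class_id)
    return term_to_classes


def tag_tokens(tokens, lexicon):
    """Pair every token with the (possibly empty) list of classes it maps to."""
    return [(tok, sorted(lexicon.get(tok.lower(), ()))) for tok in tokens]


lexicon = load_concepts(BROUWERS_LINES)
tagged = tag_tokens(["De", "schrijver", "en", "de", "tachygraaf"], lexicon)
print(tagged)
```

Plugging in another list would then amount to calling `load_concepts` on a different set of lines; this exact-match lookup does no lemmatisation or disambiguation.
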
  11. Named Entity Recognition & Linking
      • We are creating links between HISCO and Brouwers
      • We are building on entity and concept linkers that can recognise concepts from HISCO and Brouwers in texts
      • We are developing a new general purpose entity linker that allows for use of datasets other than DBpedia and is less sensitive to general entity popularity
      • Discovering more about Dark and NIL entities is also ongoing work (cf. Van Erp & Vossen (2016) Entity Typing using Distributional Semantics and DBpedia. To appear in: Proceedings of the 4th NLP&DBpedia workshop. Kobe, Japan 18 October 2016)
  12. Event Extraction
  13. Event Extraction
      • Event Extraction is the task of recognising and classifying mentions of ‘things that happen’ in text
      • Events are multifaceted: they take place at a certain time and place and have participants involved
      • By recognising participants, times and places, we can generate event descriptions and compare events
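
One way to picture "generate event descriptions and compare events" is a small record with a type, participants, time and place, plus a comparison function. The sketch below is only an assumption about how such a comparison might look: `EventDescription` and `same_event` are hypothetical names, the matching rule is deliberately naive, and the values are loosely modelled on the Porsche/Qatar example used elsewhere in the deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventDescription:
    event_type: str
    participants: frozenset
    time: str          # ISO date string
    place: str = ""    # empty when no place was recognised


def same_event(a: EventDescription, b: EventDescription) -> bool:
    """Naive rule: same type, same date, and at least one shared participant."""
    return (
        a.event_type == b.event_type
        and a.time == b.time
        and bool(a.participants & b.participants)
    )


first = EventDescription(
    "buy/sell", frozenset({"dbp:QatarHolding", "dbp:Porsche_family"}), "2013-06-17"
)
second = EventDescription("buy/sell", frozenset({"dbp:Porsche_family"}), "2013-06-17")
other = EventDescription("buy/sell", frozenset({"dbp:Porsche_family"}), "2014-01-01")

print(same_event(first, second))  # True
print(same_event(first, other))   # False
```
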
  14. From words to concepts
      • Linking terms to synonyms to obtain a higher level of abstraction
      • Word-sense disambiguation + WordNet + Multilingual Central Repository + Framenet + PropBank
      • Stop, quit, leave, relinquish, bow out -> all linked to the concept wn:leave_office
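
The mapping from the five terms to `wn:leave_office` can be sketched as a plain lookup. This is a simplification under stated assumptions: it matches surface strings only, performs no word-sense disambiguation (so "leave" in the sense of "holiday" would be linked too) and does not handle inflected forms. The names `TERM_TO_CONCEPT` and `link_concepts` are hypothetical.

```python
import re

# Terms from the slide, all linked to the same concept.
TERMS = ["stop", "quit", "leave", "relinquish", "bow out"]
TERM_TO_CONCEPT = {term: "wn:leave_office" for term in TERMS}

# Longest terms first, so multi-word terms such as "bow out" are tried first.
PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(t) for t in sorted(TERMS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def link_concepts(text):
    """Return (matched term, concept) pairs in order of appearance."""
    return [
        (m.group(1), TERM_TO_CONCEPT[m.group(1).lower()])
        for m in PATTERN.finditer(text)
    ]


hits = link_concepts("The minister decided to bow out; her deputy will quit too.")
print(hits)  # [('bow out', 'wn:leave_office'), ('quit', 'wn:leave_office')]
```

Once both sentences carry the same concept label, a query for `wn:leave_office` retrieves them regardless of which verb the author used.
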
  15. Why link to WordNet/ConceptNet/etc?
      • It allows you to query for types rather than instances: give me all lawsuits in the dataset
      • In the context of CLARIAH, we are converting various diachronic lexicons to Linked Data
        • integrate resources
        • tag interesting concepts in text
        • query expansion
  16. Semantic Role Labelling
      • Detecting the agent, patient, recipient and theme of a sentence
      • Mary sold the book to John
        • Agent: Mary
        • Recipient: John
        • Theme: the book
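
The role assignment for "Mary sold the book to John" can be written down as a small data structure. The sketch below fills it with a toy pattern that only covers sentences of the form "X sold Y to Z"; it is not how a semantic role labeller works in practice (those rely on parsing and trained models), and `Roles`, `SOLD_RE` and `label_sold` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Roles:
    predicate: str
    agent: str
    theme: str
    recipient: str


# Toy pattern for exactly one construction: "<agent> sold <theme> to <recipient>".
SOLD_RE = re.compile(r"^(?P<agent>.+?) sold (?P<theme>.+?) to (?P<recipient>.+?)\.?$")


def label_sold(sentence: str) -> Optional[Roles]:
    """Return the roles for an 'X sold Y to Z' sentence, or None if it does not fit."""
    match = SOLD_RE.match(sentence.strip())
    if match is None:
        return None
    return Roles("sell", match["agent"], match["theme"], match["recipient"])


roles = label_sold("Mary sold the book to John")
print(roles)
# Roles(predicate='sell', agent='Mary', theme='the book', recipient='John')
```
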
  17. [Figure: event graph for Event12 (buy/sell, fn:Commerce_money_transfer). Roles: fn:Seller, fn:Buyer, fn:Goods, fn:Money. Nodes: dbp:Porsche_family, dbp:QatarHolding, ?Entity23 (type: 10% stake). sem:hasTime 2013-06-17. Source headlines: "Qatar Holding sells 10% stake in Porsche to founding families" and "Porsche family buys back 10pc stake from Qatar". Sources: http://english.alarabiya.net and http://www.telegraph.co.uk, both dated 2013-06-17.]
  18. Event abstractions
      • Enable searches such as: Give me all lawsuits in which a politician was involved between 1990 and 2000.
      • Current developments: expand resources to the historic domain, devise new crystallisation strategies for aggregating event information
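
A search such as "all lawsuits in which a politician was involved between 1990 and 2000" combines an event type, a participant type and a time window. The sketch below runs that query over a handful of invented events; the `Event` class, the `find_events` function and all data are hypothetical and only meant to show which pieces of information such a query needs.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Event:
    event_id: str
    event_type: str
    when: date
    # participant name -> set of entity types assigned to that participant
    participant_types: dict[str, set[str]] = field(default_factory=dict)


# Invented events for illustration only.
EVENTS = [
    Event("ev1", "lawsuit", date(1994, 3, 2), {"Person A": {"politician"}}),
    Event("ev2", "lawsuit", date(2005, 6, 1), {"Person B": {"politician"}}),
    Event("ev3", "lawsuit", date(1997, 11, 20), {"Company X": {"organisation"}}),
    Event("ev4", "sale", date(1995, 1, 15), {"Person A": {"politician"}}),
]


def find_events(events, event_type, participant_type, start, end):
    """Events of the given type, within [start, end], with a participant of the given type."""
    return [
        ev
        for ev in events
        if ev.event_type == event_type
        and start <= ev.when <= end
        and any(participant_type in types for types in ev.participant_types.values())
    ]


hits = find_events(EVENTS, "lawsuit", "politician", date(1990, 1, 1), date(2000, 12, 31))
print([ev.event_id for ev in hits])  # ['ev1']
```

The query only works because each event carries an abstract type ("lawsuit") and each participant a type ("politician"), which is what the entity typing and concept linking steps are meant to supply.
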
  19. Find out more
      • All modules and evaluations are described in: http://kyoto.let.vu.nl/newsreader_deliverables/NWR-D4-2-3.pdf (158 pages!)
      • Selection to be adapted within CLARIAH: https://github.com/CLARIAH/wp3-semantic-parsing-Dutch
      • New developments: http://www.clariah.nl & https://github.com/clariah
  20. Discussion
      • It’s research software (no fancy interface)
      • Currently not adapted to deal with old spelling variants, OCR, etc.
      • NLP isn’t perfect (but humans don’t always agree either!)
      • What would it take for you to start using such tools?
      • What types of analyses are most interesting to the community?
      • What use cases are most useful to the community at this point in time?
