Using Suffix Arrays for Efficient Recognition of Named Entities in Large Scale
 

    Presentation Transcript

    • Using Suffix Arrays for Efficient Recognition of Named Entities in Large Scale. Benjamin Adrian, Sven Schwarz. http://www.dfki.de/~lastname
    • A huge Web of Data. The Semantic Web offers techniques for representing, formalizing, and reasoning about information on the WWW in order to make information transferable, portable, and interpretable for machine consumption. ∑ 9,363,625 distinct literal values.
    • Wouldn't it be great to link entity references in text to referents in RDF graphs? Example: "Benjamin works at DFKI, Kaiserslautern." Goal: enrich natural language text with formal facts.
    • How to recognize entity references? Natural language text (e.g. "Benjamin works at DFKI, Kaiserslautern.") and the RDF source both need an efficient representation → application of relational databases and suffix arrays.
    • Entity Recognition Process: text → (noun-phrase chunking) → suffix array → (prefix hashing) → hashes → (query) → database ← RDF graph. The query returns candidates with matching prefixes; an exact match then yields the exact matches.
    • RDF statements. Symbols: <#19810211> <rdfs:label> “Benjamin Adrian”; <#67478302> <rdfs:label> “DFKI”. Relation: <#19810211> <#employedAt> <#67478302>.
    • Represent RDF data: separate storage of symbols and relations. Tables: SYMBOLS (SUBJECT, PREDICATE, OBJECT) and RELATIONS (SUBJECT, PREDICATE, OBJECT). Dictionaries: RESOURCE INDEX (URI, INDEX) and LITERAL INDEX (INDEX, LITERAL, HASH).
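The storage layout on this slide can be sketched with an in-memory SQLite database. This is only an illustration under assumptions: the table and column names follow the slide, but the SQL types, the `build_store` helper, the use of CRC32 as the hash, and the reading of "prefix size = 4" as four characters are not from the talk.

```python
import sqlite3
import zlib

PREFIX_SIZE = 4  # assumption: the hashed prefix is counted in characters


def prefix_hash(text, size=PREFIX_SIZE):
    """Hash the first `size` characters (CRC32 is a stand-in hash)."""
    return zlib.crc32(text[:size].encode("utf-8"))


def build_store(triples):
    """Load (subject, predicate, object, object_is_literal) tuples."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE resource_index (uri TEXT UNIQUE, idx INTEGER PRIMARY KEY);
        CREATE TABLE literal_index (idx INTEGER PRIMARY KEY,
                                    literal TEXT UNIQUE, hash INTEGER);
        CREATE INDEX literal_hash ON literal_index (hash);
        CREATE TABLE symbols (subject INTEGER, predicate INTEGER, object INTEGER);
        CREATE TABLE relations (subject INTEGER, predicate INTEGER, object INTEGER);
    """)

    def resource_id(uri):
        # Dictionary lookup: map a URI to a compact integer index.
        db.execute("INSERT OR IGNORE INTO resource_index (uri) VALUES (?)", (uri,))
        row = db.execute("SELECT idx FROM resource_index WHERE uri = ?", (uri,))
        return row.fetchone()[0]

    def literal_id(value):
        # Literals are stored once, together with the hash of their prefix.
        db.execute("INSERT OR IGNORE INTO literal_index (literal, hash) VALUES (?, ?)",
                   (value, prefix_hash(value)))
        row = db.execute("SELECT idx FROM literal_index WHERE literal = ?", (value,))
        return row.fetchone()[0]

    for s, p, o, is_literal in triples:
        if is_literal:   # symbol: the object is a literal value
            db.execute("INSERT INTO symbols VALUES (?, ?, ?)",
                       (resource_id(s), resource_id(p), literal_id(o)))
        else:            # relation: the object is another resource
            db.execute("INSERT INTO relations VALUES (?, ?, ?)",
                       (resource_id(s), resource_id(p), resource_id(o)))
    return db


# The three statements from the "RDF statements" slide.
TRIPLES = [
    ("#19810211", "rdfs:label", "Benjamin Adrian", True),
    ("#67478302", "rdfs:label", "DFKI", True),
    ("#19810211", "#employedAt", "#67478302", False),
]
db = build_store(TRIPLES)
```

Keeping literals in their own dictionary with an indexed hash column is what would let a later lookup by prefix hash avoid scanning every label.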
    • Suffix Array. Text: “Benjamin Adrian works in DFKI, Kaiserslautern”. Suffix array (sorted list of suffixes): “Adrian works in DFKI, Kaiserslautern”; “Benjamin Adrian works in DFKI, Kaiserslautern”; “DFKI, Kaiserslautern”; “in DFKI, Kaiserslautern”; “Kaiserslautern”; “works in DFKI, Kaiserslautern”.
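The suffixes listed on this slide start at word boundaries and appear to be sorted without regard to case. A minimal sketch that reproduces the list, assuming a simple `\w+` tokenizer and a naive sort (the function name `word_suffix_array` is illustrative, and the talk does not say how the array is actually built):

```python
import re


def word_suffix_array(text):
    """Return start offsets of all word-aligned suffixes, sorted by suffix text."""
    starts = [m.start() for m in re.finditer(r"\w+", text)]
    # Naive construction: compares whole suffixes, fine for short texts only.
    return sorted(starts, key=lambda i: text[i:].lower())


TEXT = "Benjamin Adrian works in DFKI, Kaiserslautern"
suffixes = [TEXT[i:] for i in word_suffix_array(TEXT)]
for s in suffixes:
    print(s)
```

Storing offsets rather than the suffix strings keeps the array small; the suffix text is recovered by slicing the original text.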
    • Suffix Array (continued). Phrases in text: “Benjamin Adrian”; “DFKI”; “Kaiserslautern”. Reduced suffix array: “Adrian works in DFKI, Kaiserslautern”; “Benjamin Adrian works in DFKI, Kaiserslautern”; “DFKI, Kaiserslautern”; “Kaiserslautern”.
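The reduced array on this slide keeps only the suffixes that begin inside one of the phrases. A sketch of that filtering step, assuming the noun phrases are already known (the chunker itself is not shown in the transcript, so the phrase list is passed in by hand, and `reduced_suffix_array` is an illustrative name):

```python
import re


def reduced_suffix_array(text, noun_phrases):
    """Keep only word-aligned suffixes that start inside a noun phrase."""
    spans = []
    for phrase in noun_phrases:
        for m in re.finditer(re.escape(phrase), text):
            spans.append((m.start(), m.end()))
    starts = [
        m.start()
        for m in re.finditer(r"\w+", text)
        if any(lo <= m.start() < hi for lo, hi in spans)
    ]
    return sorted(starts, key=lambda i: text[i:].lower())


TEXT = "Benjamin Adrian works in DFKI, Kaiserslautern"
PHRASES = ["Benjamin Adrian", "DFKI", "Kaiserslautern"]  # taken from the slide
reduced = [TEXT[i:] for i in reduced_suffix_array(TEXT, PHRASES)]
```

Dropping suffixes such as “works in …” and “in …” means fewer database lookups later, since only positions that can start an entity reference remain.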
    • Noun phrases in natural language text
    • Hashing prefixes. Suffix array (hashed prefix size = 4): “Adrian works in DFKI, Kaiserslautern”; “Benjamin Adrian works in DFKI, Kaiserslautern”; “DFKI, Kaiserslautern”; “Kaiserslautern”. LITERAL INDEX columns: INDEX, LITERAL, HASH.
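One way to read this slide: each suffix and each stored literal is reduced to a hash of its first four units, so that a suffix and a literal sharing a prefix end up with the same hash. The sketch below assumes the prefix is measured in characters and uses CRC32 as a stand-in; neither choice is confirmed by the transcript.

```python
import zlib

PREFIX_SIZE = 4  # "hashed prefix size = 4"; counting characters is an assumption


def prefix_hash(text, size=PREFIX_SIZE):
    """Hash only the first `size` characters of `text` (CRC32 as a stand-in)."""
    return zlib.crc32(text[:size].encode("utf-8"))


REDUCED_SUFFIXES = [
    "Adrian works in DFKI, Kaiserslautern",
    "Benjamin Adrian works in DFKI, Kaiserslautern",
    "DFKI, Kaiserslautern",
    "Kaiserslautern",
]
hashed = [(prefix_hash(s), s) for s in REDUCED_SUFFIXES]

# A literal and a suffix that begin with the same four characters collide on purpose.
same = prefix_hash("Benjamin Adrian") == prefix_hash(REDUCED_SUFFIXES[1])
```

Under this reading, a literal shorter than the prefix size would hash differently from a longer suffix that starts with it, so such literals would need separate handling.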
    • Select candidates from database
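Candidate selection followed by the exact-match step might look like the sketch below. It assumes a LITERAL INDEX table with an integer hash column, a four-character CRC32 prefix hash, and that "exact match" means the literal is a prefix of the suffix. The literal "Benjamin Franklin" is a made-up entry, added only to show a candidate that shares a prefix hash but fails the exact match.

```python
import sqlite3
import zlib

PREFIX_SIZE = 4  # assumption: prefix measured in characters


def prefix_hash(text, size=PREFIX_SIZE):
    return zlib.crc32(text[:size].encode("utf-8"))


def find_entities(db, suffixes):
    """Return (literal, index) pairs whose literal starts one of the suffixes."""
    matches = []
    for suffix in suffixes:
        # Step 1: cheap lookup of candidates that share the hashed prefix.
        candidates = db.execute(
            "SELECT idx, literal FROM literal_index WHERE hash = ?",
            (prefix_hash(suffix),),
        ).fetchall()
        # Step 2: exact match, the full literal must begin the suffix.
        for idx, literal in candidates:
            if suffix.startswith(literal):
                matches.append((literal, idx))
    return matches


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE literal_index (idx INTEGER PRIMARY KEY, literal TEXT, hash INTEGER)")
for literal in ["Benjamin Adrian", "DFKI", "Benjamin Franklin"]:  # last one is hypothetical
    db.execute("INSERT INTO literal_index (literal, hash) VALUES (?, ?)",
               (literal, prefix_hash(literal)))

SUFFIXES = [
    "Adrian works in DFKI, Kaiserslautern",
    "Benjamin Adrian works in DFKI, Kaiserslautern",
    "DFKI, Kaiserslautern",
    "Kaiserslautern",
]
matches = find_entities(db, SUFFIXES)
```

The `startswith` check does not verify that the literal ends at a word boundary; a fuller version would add that test before accepting a match.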
    • Response time
    • Summary: text → (noun-phrase chunking) → suffix array → (prefix hashing) → hashes → (query) → database ← RDF graph. The query returns candidates with matching prefixes; an exact match then yields the exact matches.
    • Thank you. Questions? Benjamin Adrian, Sven Schwarz.