Large-Scale Information Extraction from Textual Definitions through Deep Syntactic and Semantic Analysis
1. Large-Scale Information Extraction from Textual Definitions through Deep Syntactic and Semantic Analysis
TACL 2015
Claudio Delli Bovi, Luca Telesca and Roberto Navigli
Presentation: Koji Matsuda (Tohoku University)
Some of the figures are borrowed from the authors' slides:
http://wwwusers.di.uniroma1.it/~dellibovi/talks/talk_OIE.pdf
8. Knowledge Acquisition from the Syntactic-Semantic Graph
DefIE: How it works
http://lcl.uniroma1.it/defie
[Figure: example extractions, labeled Extraction 1 and Extraction 2]
1. Extracting relation instances
Take the shortest path between each pair of entities.
This also picks up a lot of unwanted knowledge, so the extractions are scored.
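The shortest-path step can be sketched as follows. This is a minimal illustration, not the DefIE implementation: the toy sentence, the edge list, and the helper names `build_graph` and `shortest_path` are all assumptions made for the example, and the graph is treated as undirected with unlabeled edges.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency list from (head, dependent) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph

def shortest_path(graph, source, target):
    """Breadth-first search; returns the node sequence or None if unreachable."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None

# Hypothetical graph for the toy definition
# "Mozart is a composer born in Salzburg" (function words omitted).
EDGES = [
    ("is", "Mozart"),
    ("is", "composer"),
    ("composer", "born"),
    ("born", "in"),
    ("in", "Salzburg"),
]

graph = build_graph(EDGES)
path = shortest_path(graph, "Mozart", "Salzburg")
# The interior of the path becomes the relation pattern between X and Y.
pattern = " ".join(["X"] + path[1:-1] + ["Y"])
print(pattern)  # X is composer born in Y
```

Every entity pair in a definition yields such a path, which is why many of the resulting patterns are noisy and need scoring.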
9. Scoring with the Knowledge Base
2. Relation typing and scoring
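For each relation 𝑅, compute the score of 𝑅 from three quantities:
• Total number of extracted instances for 𝑅 (pattern frequency)
• Length of the relation pattern of 𝑅 (pattern length)
• Domain and range entropy of 𝑅 (ambiguity of the pattern's arguments): the hypernyms of the domain and range arguments are looked up in BabelNet, and the ambiguity is computed over those hypernyms
Because the extractions are grounded in the knowledge base, the knowledge base itself can be used to judge how good a relation is.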
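A rough sketch of how the three quantities might be combined is shown below. The slide text does not reproduce the formula itself, so the combining form used here (instance count divided by entropy plus one, times pattern length) and the choice to sum domain and range entropy are assumptions for illustration only. The function names and the hypernym labels are hypothetical as well.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of the label distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def relation_score(num_instances, pattern_length, domain_hypernyms, range_hypernyms):
    """Higher for frequent, short, unambiguous patterns.

    ASSUMPTION: the combining form below is illustrative, not taken
    from the slide; only the three input quantities are.
    """
    h = entropy(domain_hypernyms) + entropy(range_hypernyms)
    return num_instances / ((h + 1.0) * pattern_length)

# Hypothetical relation whose arguments always share the same hypernyms.
focused = relation_score(
    num_instances=4,
    pattern_length=2,
    domain_hypernyms=["person"] * 4,
    range_hypernyms=["city"] * 4,
)

# Hypothetical relation with the same frequency and length but mixed argument types.
noisy = relation_score(
    num_instances=4,
    pattern_length=2,
    domain_hypernyms=["person", "city", "album", "band"],
    range_hypernyms=["person", "city", "album", "band"],
)

print(focused, noisy)  # 2.0 0.4
```

With frequency and length held equal, the relation whose domain and range hypernyms are concentrated on one type scores higher than the one whose arguments are spread over many types.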
18. BabelNet
• Multilingual Encyclopedic Dictionary
– Lexicographic & Encyclopedic knowledge
– Based on automatic integration of:
• WordNet, Wikipedia, Wiktionary, …
• 50 languages, 21M definitions, 62M entries
[Figure: concepts from WordNet; named entities and specialized concepts from Wikipedia; concepts integrated from both resources]
19. Lexical Knowledge Base / Encyclopedic Knowledge Base / Integrated Knowledge Base
Input Text: "Thomas and Mario are strikers playing in Munich. They are …"
Semantic Signature
Semantic Interpretation Graph
→ Select the most suitable meaning on the graph
[Figure: semantic interpretation graph with nodes Thomas Muller, Thomas Millan, Mario Gomez, striker, playing, Munich, FC Bayern Munich]
[Moro+, 2013]
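The idea of selecting a meaning on the graph can be illustrated with a deliberately simplified stand-in: for each mention, pick the candidate meaning with the most connections to other candidates. This degree-based rule is not the cited method, only a sketch of the intuition. The candidate lists and the edges below are hypothetical, built from the node names in the slide's example.

```python
def select_meanings(candidates, edges):
    """For each mention, choose the candidate meaning with the highest degree.

    candidates: dict mapping a mention to its list of candidate meanings.
    edges: iterable of (meaning_a, meaning_b) pairs judged semantically related.
    """
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return {
        mention: max(meanings, key=lambda m: degree.get(m, 0))
        for mention, meanings in candidates.items()
    }

# Hypothetical candidates for "Thomas and Mario are strikers playing in Munich."
CANDIDATES = {
    "Thomas": ["Thomas Muller", "Thomas Millan"],
    "Mario": ["Mario Gomez"],
    "strikers": ["striker"],
    "Munich": ["Munich", "FC Bayern Munich"],
}

# Hypothetical relatedness edges between candidate meanings.
EDGES = [
    ("Thomas Muller", "striker"),
    ("Thomas Muller", "FC Bayern Munich"),
    ("Mario Gomez", "striker"),
    ("Mario Gomez", "FC Bayern Munich"),
    ("striker", "FC Bayern Munich"),
    ("Munich", "FC Bayern Munich"),
]

selected = select_meanings(CANDIDATES, EDGES)
print(selected["Thomas"], "|", selected["Munich"])  # Thomas Muller | FC Bayern Munich
```

In this toy graph the football-related meanings reinforce one another, so "Thomas" resolves to Thomas Muller and "Munich" to FC Bayern Munich, while the unconnected candidate Thomas Millan is discarded.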