Presented by: Jennifer D’Souza, Postdoc in the ORKG team
http://orkg.org | @orkg_org
Technische Informationsbibliothek (TIB)
Welfengarten 1B // 30167 Hannover
Perspectives on Mining Knowledge Graphs
from Text
● A critical scientific document digitalization initiative in this
digital age
○ Capturing scholarly article contributions in machine-interpretable Knowledge Graphs
● The ORKG is hosted at TIB
○ https://www.orkg.org/
○ @orkg_org
● Led by TIB director Prof. (Dr.) Sören Auer
The Open Research Knowledge Graph (ORKG) is ...
Perspectives on Mining Knowledge Graphs from Text
(Overview figure: a taxonomy of knowledge graph research, after Ji et al., with the Knowledge Graph branching into Knowledge Representation Learning and Knowledge Acquisition)
● Knowledge Representation Learning
○ Representation Space: point-wise, manifold, complex, Gaussian, discrete
○ Scoring Function: distance-based, similarity matching
○ Encoding Models: linear/bilinear, factorization, neural nets (CNN, RNN, Transformers, GCN)
● Knowledge Acquisition
○ Entity Discovery: recognition, typing, linking, alignment
○ Relation Extraction: neural nets, attention, GCN, GAN, RL, others
○ Knowledge Graph Completion: embedding-based ranking, path-based reasoning, rule-based reasoning, meta relational learning, triple classification
References
Ji, Shaoxiong, et al. "A survey on knowledge graphs: Representation, acquisition, and applications." IEEE Transactions on Neural Networks and Learning Systems (2021).
(I) Entity Linking and
(II) KG Completion
Jennifer D’Souza
Technische Informationsbibliothek (TIB)
Welfengarten 1B // 30167 Hannover
● Given an entity mention in a text document and a
knowledge base (KB) of entities,
○ find the entity in the KB the entity mention refers to
or
○ determine that such entity does not exist in the KB
Entity Linking
Entity Linking
What is the birthdate of the famous basketball player
Michael Jordan?
Entity Linking
What is the birthdate of the famous basketball player
Michael Jordan?
Entity Linking
What is the birthdate of the famous basketball player
Michael Jordan?
Knowledge Bases
Entity Linking
What is the birthdate of the famous basketball player
Michael Jordan?
Knowledge Bases
Entity Linking
What is the birthdate of the famous basketball player
Michael Jordan?
Knowledge Bases
Entity Linking
● challenging because
○ entity ambiguity: mentions with the same word/phrase can have various entity
candidates
■ E.g., Michael Jordan: Basketball player or Berkeley professor?
○ name variations: mentions with different words/phrases can refer to the same
entity
■ E.g., New York City or Big Apple
Entity Linking
● challenging because
○ entity ambiguity: mentions with the same word/phrase can have various entity
candidates
■ E.g., Michael Jordan: Basketball player or Berkeley professor?
○ name variations: mentions with different words/phrases can refer to the same entity
■ E.g., New York City or Big Apple
● Aside 1: alternatively called Named Entity Disambiguation
○ However, Named Entity Disambiguation (NED) and Entity Linking (EL) can
sometimes be treated as separate tasks.
■ NED: determine which named entity a mention refers to.
● E.g., the mention “Trump” can refer to either a person, a corporation or a building;
■ EL: provide a standard IRI for each disambiguated entity.
● IRIs (Internationalized Resource Identifiers) used as subjects, predicates, and objects can be taken from well-defined vocabularies or ontologies in the Linked Open Data (LOD) cloud
Entity Linking
● challenging because
○ entity ambiguity: mentions with the same word/phrase can have various entity
candidates
■ E.g., Michael Jordan: Basketball player or Berkeley professor?
○ name variations: mentions with different words/phrases can refer to the same entity
■ E.g., New York City or Big Apple
● Aside 1: alternatively called Named Entity Disambiguation
○ However, Named Entity Disambiguation (NED) and Entity Linking (EL) can
sometimes be treated as separate tasks.
■ NED: determine which named entity a mention refers to.
● E.g., the mention “Trump” can refer to either a person, a corporation or a building;
■ EL: provide a standard IRI for each disambiguated entity.
● E.g., Trump-the-president can be linked to the IRI that represents him in Wikidata:
https://www.wikidata.org/entity/Q22686
Entity Linking
● challenging because
○ entity ambiguity: mentions with the same word/phrase can have various entity
candidates
■ E.g., Michael Jordan: Basketball player or Berkeley professor?
○ name variations: mentions with different words/phrases can refer to the same entity
■ E.g., New York City or Big Apple
● Aside 1: alternatively called Named Entity Disambiguation
○ In this talk, NED and EL are treated as the same task, i.e. NED, which finds the entity a mention like “Trump” refers to, and EL, which provides the LOD IRI for that entity, are considered one step
Entity Linking
● challenging because
○ entity ambiguity: mentions with the same word/phrase can have various entity
candidates
■ E.g., Michael Jordan: Basketball player or Berkeley professor?
○ name variations: mentions with different words/phrases can refer to the same entity
■ E.g., New York City or Big Apple
● Aside 2: commonly known as normalization for the biomedical domain
○ Map a word/phrase in a document to a concept in an ontology after disambiguating potentially ambiguous words/phrases
Entity Linking
● challenging because
○ entity ambiguity: mentions with the same word/phrase can have various entity
candidates
■ E.g., Michael Jordan: Basketball player or Berkeley professor?
○ name variations: mentions with different words/phrases can refer to the same entity
■ E.g., New York City or Big Apple
● Aside 2: commonly known as normalization for the biomedical domain
○ This talk will focus on the open-domain EL task, i.e. involving data from newswire or the
Web.
■ While the approaches for open-domain EL can be imported to biomedical entity
normalization, the latter task may be amenable to strong rule-based resolution1,2
as well
References
1 D’Souza, J., & Ng, V. (2015, July). Sieve-based entity linking for the biomedical domain. In Proceedings of the 53rd Annual Meeting of the Association for
Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers) (pp. 297-302)
2. D. Kim et al., "A Neural Named Entity Recognition and Multi-Type Normalization Tool for Biomedical Text Mining," in IEEE Access, vol. 7, pp. 73729-73740,
2019, doi: 10.1109/ACCESS.2019.2920708.
Plan for Part I of II of the Talk
● Datasets & Knowledge Bases
● (Neural) Approaches
○ since 2015
● Evaluations
Datasets
● Open-domain Evaluation datasets from various genres
○ News, Tweets, Web pages, Blog, Encyclopedia
Datasets: Details & Statistics
Dataset Name        Genre   Mentions   KB
AIDA                news    34,587     YAGO/Freebase/Wikipedia
KBP’2010            news    4,338      Wikipedia
MSNBC               news    656        Wikipedia
AQUAINT             news    449        Wikipedia
ACE-2004            news    257        Wikipedia
WNED-CWEB (CWEB)    news    11,154     Wikipedia
WNED-WIKI (WW)      news    6,821      Wikipedia
Datasets: Details & Statistics
Dataset Name        Genre   Mentions   KB
AIDA                news    34,587     YAGO/Freebase/Wikipedia
KBP’2010            news    4,338      Wikipedia
MSNBC               news    656        Wikipedia
AQUAINT             news    449        Wikipedia
ACE-2004            news    257        Wikipedia
WNED-CWEB (CWEB)    news    11,154     Wikipedia
WNED-WIKI (WW)      news    6,821      Wikipedia
AIDA is the largest human-annotated dataset; each of its 34,587 mentions was checked against entities in the YAGO knowledge base.
Datasets: Details & Statistics
Dataset Name        Genre   Mentions   KB
AIDA                news    34,587     YAGO/Freebase/Wikipedia
KBP’2010            news    4,338      Wikipedia
MSNBC               news    656        Wikipedia
AQUAINT             news    449        Wikipedia
ACE-2004            news    257        Wikipedia
WNED-CWEB (CWEB)    news    11,154     Wikipedia
WNED-WIKI (WW)      news    6,821      Wikipedia
The Knowledge Base Population (KBP) track, conducted as part of the NIST Text Analysis Conference (TAC), is an international entity linking competition held every year since 2009. Entity linking is regarded as one of the two subtasks in this track. These public entity linking competitions have provided benchmark datasets to evaluate and compare different entity linking systems.
Datasets: Details & Statistics
Dataset Name        Genre   Mentions   KB
AIDA                news    34,587     YAGO/Freebase/Wikipedia
KBP’2010            news    4,338      Wikipedia
MSNBC               news    656        Wikipedia
AQUAINT             news    449        Wikipedia
ACE-2004            news    257        Wikipedia
WNED-CWEB (CWEB)    news    11,154     Wikipedia
WNED-WIKI (WW)      news    6,821      Wikipedia
Then there is a dataset from the MSNBC news source.
Datasets: Details & Statistics
Dataset Name        Genre   Mentions   KB
AIDA                news    34,587     YAGO/Freebase/Wikipedia
KBP’2010            news    4,338      Wikipedia
MSNBC               news    656        Wikipedia
AQUAINT             news    449        Wikipedia
ACE-2004            news    257        Wikipedia
WNED-CWEB (CWEB)    news    11,154     Wikipedia
WNED-WIKI (WW)      news    6,821      Wikipedia
The WNED datasets (WNED stands for Walking Named Entity Disambiguation, the name of the algorithm developed for EL) are the largest automatically created datasets.
Others: NEEL (tweets; 8,665 mentions; DBpedia); OKE-2015 (encyclopedia; DBpedia); WES2015
(blog; DBpedia); WikiNews (news; DBpedia); OKE2016 (Web pages; 1,043 mentions; DBpedia)
Datasets: Details & Statistics
Dataset Name        Genre   Mentions   KB
AIDA                news    34,587     YAGO/Freebase/Wikipedia
KBP’2010            news    4,338      Wikipedia
MSNBC               news    656        Wikipedia
AQUAINT             news    449        Wikipedia
ACE-2004            news    257        Wikipedia
WNED-CWEB (CWEB)    news    11,154     Wikipedia
WNED-WIKI (WW)      news    6,821      Wikipedia
Others: NEEL (tweets; 8,665 mentions; DBpedia); OKE-2015 (encyclopedia; DBpedia); WES2015
(blog; DBpedia); WikiNews (news; DBpedia); OKE2016 (Web pages; 1,043 mentions; DBpedia)
Datasets: Details & Statistics
Dataset Name        Genre   Mentions   KB
AIDA                news    34,587     YAGO/Freebase/Wikipedia
KBP’2010            news    4,338      Wikipedia
MSNBC               news    656        Wikipedia
AQUAINT             news    449        Wikipedia
ACE-2004            news    257        Wikipedia
WNED-CWEB (CWEB)    news    11,154     Wikipedia
WNED-WIKI (WW)      news    6,821      Wikipedia
● A fundamental component for Entity Linking
● Knowledge bases provide the information about the world’s entities (e.g., the entities of
Albert Einstein and Ulm), their semantic categories (e.g., Albert Einstein has a type of
Scientist and Ulm has a type of City), and the mutual relationships between entities (e.g.,
Albert Einstein has a relation named bornIn with Ulm).
● Examples:
○ Wikipedia (6,195,675 English articles)1
■ a free online multilingual encyclopedia created through decentralized, collective efforts of thousands of volunteers around
the world.
■ The basic entry in Wikipedia is an article, which defines and describes an entity or a topic, and each article in Wikipedia is
uniquely referenced by an identifier.
■ Wikipedia provides a set of useful features for entity linking, such as entity pages, article categories, redirect pages,
disambiguation pages, and hyperlinks in Wikipedia articles.
○ DBpedia (4.58 million things in English version)2
■ multilingual knowledge base constructed by extracting structured information from Wikipedia such as infobox templates,
categorization information, geo-coordinates, and links to external Web pages
References
1 https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia
2 https://wiki.dbpedia.org/about/facts-figures
Knowledge Bases for Entities
● A fundamental component for Entity Linking
● Knowledge bases provide the information about the world’s entities (e.g., the entities of
Albert Einstein and Ulm), their semantic categories (e.g., Albert Einstein has a type of
Scientist and Ulm has a type of City), and the mutual relationships between entities (e.g.,
Albert Einstein has a relation named bornIn with Ulm).
● Examples:
○ YAGO (50 million entities and 2 billion facts)1
■ YAGO combines Wikidata and the schema.org ontology as the top level ontology for information organization, thus getting
the best from both worlds: a huge repository of facts, together with an ontology that is simple and used as a standard by a
large community.
References
1 https://yago-knowledge.org/getting-started
Knowledge Bases for Entities
We have:
1. released a novel multidisciplinary corpus of scholarly abstracts annotated for scientific
entities under a generic conceptual formalism that bridges 10 different STEM scientific
disciplines
a. The STEM domains we consider are Agriculture, Astronomy, Biology, Chemistry, Computer Science,
Earth Science, Engineering, Materials Science, and Mathematics.
b. The generic conceptual formalism involves four entity types
i. Process, Method, Material, and Data
c. The terms underlying the Process, Method, Material, and Data entities are linked to Wikipedia; thereby, our entities are disambiguated for their scientific sense and grounded in the real world.
d. The STEM-ECR v1.0 corpus is publicly available: https://doi.org/10.25835/0017546 (ISLRN
749-555-840-571-2)
References
● Brack, Arthur, Jennifer D’Souza, Anett Hoppe, Sören Auer, and Ralph Ewerth. "Domain-independent extraction of scientific concepts from
research articles." In European Conference on Information Retrieval, pp. 251-266. Springer, Cham, 2020.
● D’Souza, Jennifer, Anett Hoppe, Arthur Brack, Mohmad Yaser Jaradeh, Sören Auer, and Ralph Ewerth. "The STEM-ECR Dataset: Grounding
Scientific Entity References in STEM Scholarly Content to Authoritative Encyclopedic and Lexicographic Sources." In Proceedings of The 12th
Language Resources and Evaluation Conference, pp. 2192-2203. 2020.
Datasets & Knowledge Bases
New Resource Highlight: Scholarly Knowledge Linked Entities across STEM
Disciplines
Plan for Part I of II of the Talk
● Datasets & Knowledge Bases
● (Neural) Approaches
○ since 2015
● Evaluations
(Neural) Approaches to Entity Linking
● EL has three main subtasks:
○ candidate-entity generation;
■ aims to retrieve all possible entities in the KB that may refer to an entity mention
○ candidate-entity ranking or disambiguation;
■ aims to rank the candidate entities and return the most likely one for each targeted mention
○ NIL clustering
■ handles those mentions that cannot be matched with an entity in the KB
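To make the division of labour concrete, below is a minimal Python sketch of how the three subtasks could be composed into one pipeline. The data structures, the toy word-overlap scorer, and the threshold are illustrative assumptions, not the design of any particular system.

from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Mention:
    text: str      # surface form, e.g. "Michael Jordan"
    context: str   # surrounding sentence or document

def generate_candidates(mention: Mention, kb_aliases: dict) -> List[str]:
    """Candidate-entity generation: retrieve KB entities that may match the mention."""
    return kb_aliases.get(mention.text.lower(), [])

def rank_candidates(mention: Mention, candidates: List[str]) -> List[Tuple[str, float]]:
    """Candidate ranking/disambiguation with a toy word-overlap scorer."""
    ctx = set(mention.context.lower().split())
    def score(c: str) -> float:
        tokens = {t.strip("().") for t in c.lower().split("_")}
        return float(len(ctx & tokens))
    return sorted(((c, score(c)) for c in candidates), key=lambda x: x[1], reverse=True)

def link(mention: Mention, kb_aliases: dict, nil_threshold: float = 1.0) -> Optional[str]:
    """Generation -> ranking -> NIL decision, mirroring the three subtasks above."""
    candidates = generate_candidates(mention, kb_aliases)
    if not candidates:
        return None                               # NIL: no matching entity in the KB
    best, score = rank_candidates(mention, candidates)[0]
    return best if score >= nil_threshold else None

kb = {"michael jordan": ["Michael_Jordan_(basketball)", "Michael_I._Jordan"]}
m = Mention("Michael Jordan", "the famous basketball player Michael Jordan")
print(link(m, kb))   # Michael_Jordan_(basketball)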
Approaches to Entity Linking: From the 3 subtasks perspective
Reference: Figure 7 in T. Al-Moslmi, M. Gallofré Ocaña, A. L. Opdahl and C. Veres, "Named Entity Extraction for Knowledge
Graphs: A Literature Overview," in IEEE Access, vol. 8, pp. 32862-32881, 2020, doi: 10.1109/ACCESS.2020.2973928.
Approaches to Entity Linking: From the systems perspective
Reference: Part of Figure 8 in T. Al-Moslmi, M. Gallofré Ocaña, A. L. Opdahl and C. Veres, "Named Entity Extraction for
Knowledge Graphs: A Literature Overview," in IEEE Access, vol. 8, pp. 32862-32881, 2020, doi:
10.1109/ACCESS.2020.2973928.
Three Non-Neural Approaches
● AIDA1
○ Mention Detection using Stanford NER Tagger
○ Linking as a graph-based technique with weighted edges computed as degree of links
between pages
● DBpedia Spotlight2
○ Mention Detection as a lightweight heuristics-based model with syntactic parsers to generate
mention candidates
○ Linking as a generative probabilistic model using maximum likelihood estimates
● Babelfy3
○ Mention Detection as named entities (e.g., Major League Soccer) and overlapping nominals
(e.g., major league, soccer)
○ A unified graph-based approach relying on encyclopedic and lexicographic knowledge
References
1. AIDA: An Online Tool for Accurate Disambiguation of Named Entities in Text and Tables
2. Daiber, J., Jakob, M., Hokamp, C., & Mendes, P. N. (2013, September). Improving efficiency and accuracy in multilingual entity extraction. In Proceedings of the
9th International Conference on Semantic Systems (pp. 121-124).
3. A. Moro, A. Raganato, R. Navigli. Entity Linking meets Word Sense Disambiguation: a Unified Approach. Transactions of the Association for Computational
Linguistics (TACL), 2, pp. 231-244, 2014
Approaches to Entity Linking: From the systems perspective
prior to 2015
(Neural) Approaches to Entity Linking: General Architecture
since 2015
Reference: Figure 2 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey
of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
(Neural) Approaches to Entity Linking: General Architecture
since 2015
Reference: Figure 2 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey
of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
(Figure callouts) Mention detection: mentions in plain text are distinguished; entity disambiguation: the corresponding entity is predicted for the given mention.
(Neural) Approaches to Entity Linking: General Architecture
since 2015
Reference: Figure 2 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey
of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
(Figure callouts) Candidate generation: possible entities are produced for the mention; entity ranking: a context/mention - candidate similarity score is computed from the representations.
Three prominent methods:
● surface form matching
○ a candidate list is composed of entities that simply match the surface forms of mentions in the text; does not work well if the referent entity does not contain the mention string
● dictionary lookup
○ a dictionary of additional aliases is constructed using KB metadata such as disambiguation/redirect pages of Wikipedia or lexical synonymy relations
● prior probability computation
○ candidates are generated based on precalculated prior probabilities of correspondence between certain mentions and entities, derived from mention-entity hyperlink count statistics [1,2,3,4,5, etc.] (see the sketch below)
References
1. Stefan Zwicklbauer, Christin Seifert, and Michael Granitzer. 2016. Robust and collective entity disambiguation through semantic embeddings. In Proceedings of
the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’16, pages 425–434, New York, NY, USA. ACM.
2. Chen-Tse Tsai and Dan Roth. 2016. Cross-lingual Wikification using multilingual embeddings. In Proceedings of the 2016 Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 589–598, San Diego, California, USA. ACL.
3. Octavian-Eugen Ganea and Thomas Hofmann. 2017. Deep joint entity disambiguation with local neural attention. In Proceedings of the 2017 Conference on
Empirical Methods in Natural Language Processing, pages 2619–2629, Copenhagen, Denmark. ACL.
4. Nikolaos Kolitsas, Octavian-Eugen Ganea, and Thomas Hofmann. 2018. End-to-end neural entity linking. In Proceedings of the 22nd Conference on
Computational Natural Language Learning, pages 519–529, Brussels, Belgium. Association for Computational Linguistics.
5. Avirup Sil, Gourab Kundu, Radu Florian, and Wael Hamza. 2018. Neural cross-lingual entity linking. In The 32 AAAI, New Orleans, Louisiana, USA. AAAI Press.
Candidate Generation
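Below is a small sketch of the dictionary-lookup and prior-probability ideas above: a lookup table and priors p(entity | mention) computed from mention-entity hyperlink counts. The anchor/entity pairs and entity names are made up for illustration; real systems harvest them from, e.g., Wikipedia anchors, redirect and disambiguation pages.

from collections import defaultdict

# Made-up (anchor text, target entity) pairs standing in for Wikipedia hyperlink statistics.
anchor_entity_pairs = [
    ("michael jordan", "Michael_Jordan_(basketball)"),
    ("michael jordan", "Michael_Jordan_(basketball)"),
    ("michael jordan", "Michael_I._Jordan"),
    ("big apple", "New_York_City"),
]

counts = defaultdict(lambda: defaultdict(int))   # mention -> entity -> hyperlink count
for anchor, entity in anchor_entity_pairs:
    counts[anchor][entity] += 1

def candidates_with_priors(mention: str, top_k: int = 30):
    """Dictionary lookup plus prior probabilities p(entity | mention) from link counts."""
    stats = counts.get(mention.lower(), {})
    total = sum(stats.values())
    if total == 0:
        return []   # unseen surface form: fall back to fuzzy matching or predict NIL
    ranked = sorted(stats.items(), key=lambda kv: kv[1], reverse=True)
    return [(entity, count / total) for entity, count in ranked[:top_k]]

print(candidates_with_priors("Michael Jordan"))
# [('Michael_Jordan_(basketball)', 0.66...), ('Michael_I._Jordan', 0.33...)]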
Given a list of candidate entities from a KB and a context containing a mention, the goal of this stage is to rank these entities by assigning a score to each of them.
Entity Ranking
General Architecture of a Neural Entity Ranking Component
Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey
of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
General Architecture of a Neural Entity Ranking Component
Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey
of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
Three parts 1.Encoding the mention
General Architecture of a Neural Entity Ranking Component
Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey
of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
● To correctly disambiguate an
entity mention, it is crucial to
thoroughly capture the
information from its context.
● A contextualized vector
representation of a mention
is generated by an encoder
network.
Two approaches prevail:
● recurrent networks with LSTM cells
● self-attention
Mention Encoding Subcomponent
Two approaches prevail:
● recurrent networks with LSTM cells
○ concatenating the outputs of two LSTM networks that independently encode the left and right contexts of a mention (including the mention itself) [1] (see the sketch below);
○ encode left and right local contexts via LSTMs but also pool the results across all mentions in
a coreference chain and postprocess left and right representations with a tensor network [2];
○ modification of LSTM–GRU in conjunction with an attention mechanism to encode left and
right context of a mention [3];
○ run a bidirectional LSTM network on words complemented with embeddings of word positions
relative to a target mention [4]
References
1 Nitish Gupta, Sameer Singh, and Dan Roth. 2017. Entity linking via joint encoding of types, descriptions, and context. In 2017 EMNLP, pages 2681–2690,
Copenhagen, Denmark. ACL.
2 Avirup Sil, Gourab Kundu, Radu Florian, and Wael Hamza. 2018. Neural cross-lingual entity linking. In The 32 AAAI, New Orleans, Louisiana, USA. AAAI Press.
3. Yotam Eshel, Noam Cohen, Kira Radinsky, Shaul Markovitch, Ikuya Yamada, and Omer Levy. 2017. Named entity disambiguation for noisy text. In CoNLL 2017,
pages 58–68, Vancouver, Canada. ACL.
4. Phong Le and Ivan Titov. 2019b. Distant learning for entity linking with automatic noise detection. In Proceedings of the 57th Annual Meeting of the Association for
Computational Linguistics, pages 4081–4090, Florence, Italy, July. ACL.
Mention Encoding Subcomponent
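A minimal PyTorch sketch of the first LSTM-based variant above: two LSTMs encode the left and right contexts (each including the mention itself) and their final states are concatenated into the mention representation. Vocabulary size, dimensions, and the randomly initialised embeddings are illustrative assumptions.

import torch
import torch.nn as nn

class MentionEncoder(nn.Module):
    def __init__(self, vocab_size=10_000, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.left_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.right_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)

    def forward(self, left_ids, right_ids):
        # left_ids: tokens of the left context up to (and including) the mention
        # right_ids: tokens of the right context back to (and including) the mention
        _, (h_left, _) = self.left_lstm(self.emb(left_ids))
        _, (h_right, _) = self.right_lstm(self.emb(right_ids))
        return torch.cat([h_left[-1], h_right[-1]], dim=-1)   # mention representation

encoder = MentionEncoder()
left = torch.randint(0, 10_000, (1, 12))    # dummy token ids
right = torch.randint(0, 10_000, (1, 9))
print(encoder(left, right).shape)           # torch.Size([1, 256])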
Two approaches prevail:
● recurrent networks with LSTM cells
● self-attention: encoding methods based on self-attention rely on the outputs from pre-trained
BERT layers for context and mention encoding.
Mention Encoding Subcomponent
Two approaches prevail:
● recurrent networks with LSTM cells
● self-attention: encoding methods based on self-attention rely on the outputs from pre-trained
BERT layers for context and mention encoding.
○ a mention representation is modeled by pooling over word pieces in a mention span. The
authors also put an additional self-attention block over all mention representations that
encode interactions between several entities in a sentence [1].
○ reduce a sequence by keeping the representation of the special pooling symbol ‘[CLS]’
inserted at the beginning of a sequence [2].
○ mark positions of a mention span by summing embeddings of words within the span with a
special vector [3] and use the same reduction strategy as [2].
○ concatenate text with all mentions in it and jointly encode this sequence via a self-attention
model based on pre-trained BERT [4].
References
1 Matthew E. Peters, Mark Neumann, Robert Logan, Roy Schwartz, Vidur Joshi, Sameer Singh, and Noah A. Smith. 2019. Knowledge enhanced contextual word
representations. In Proceedings of the 2019 EMNLP-IJCNLP, pages 43–54, Hong Kong, China. ACL.
2 Ledell Yu Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, and Luke Zettlemoyer. 2020. Zero-shot entity linking with dense entity retrieval. ArXiv,
abs/1911.03814.
3 Lajanugen Logeswaran, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, Jacob Devlin, and Honglak Lee. 2019. Zero-shot entity linking by reading entity
descriptions. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3449–3460, Florence, Italy. ACL.
4 Ikuya Yamada, Koki Washio, Hiroyuki Shindo, and Yuji Matsumoto. 2020. Global entity disambiguation with pretrained contextualized embeddings of words and
entities. arXiv preprint arXiv:1909.00426v2.
Mention Encoding Subcomponent
General Architecture of a Neural Entity Ranking Component
Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey
of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
2.Encoding the candidate entities
Three parts
General Architecture of a Neural Entity Ranking Component
Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey
of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
● The linking decision is based on how well candidate entities match the corresponding mention or context, using the entities' structured or textual information.
● Low-dimensional semantic
representations of entities
account for this in such a
way that spatial proximity of
entities in a vector space
correlates with their
semantic similarity.
Three parts
Aim to obtain vector representations for entities:
● capture different kinds of entity information, including entity type, description page, linked mention, and contextual information, and therefore build a larger encoder that uses a CNN for the entity description and alignment functions for the others [1].
● encode entities based on their title, description page, and category information; whereas all the previously mentioned models rely on annotated data, a few studies of this kind aim for less resource dependence [2].
● derive entity embeddings using pre-trained word2vec word vectors through description-page words, surface-form words, and entity category words [3,4] (see the sketch below).
● depend on the BERT architecture to create representations from the description pages [5,6].
References
1 Nitish Gupta, Sameer Singh, and Dan Roth. 2017. Entity linking via joint encoding of types, descriptions, and context. In Proceedings of the 2017 EMNLP, pages
2681–2690, Copenhagen, Denmark. ACL.
2 Daniel Gillick, Sayali Kulkarni, Larry Lansing, Alessandro Presta, Jason Baldridge, Eugene Ie, and Diego Garcia-Olano. 2019. Learning dense representations for
entity retrieval. In Proceedings of the 23rd CoNLL, pages 528–537, Hong Kong, China. ACL.
3 Yaming Sun, Lei Lin, Duyu Tang, Nan Yang, Zhenzhou Ji, and XiaolongWang. 2015. Modeling mention, context and entity with neural networks for entity
disambiguation. In Proceedings of the 24th, IJCAI’15, pages 1333–1339. AAAI Press.
4 Avirup Sil, Gourab Kundu, Radu Florian, and Wael Hamza. 2018. Neural cross-lingual entity linking. In 32 AAAI, New Orleans, Louisiana, USA. AAAI Press.
5 Lajanugen Logeswaran, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, Jacob Devlin, and Honglak Lee. 2019. Zero-shot entity linking by reading entity
descriptions. In Proceedings of the 57th ACL, pages 3449–3460, Florence, Italy. ACL.
6 Ledell Yu Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, and Luke Zettlemoyer. 2020. Zero-shot entity linking with dense entity retrieval. ArXiv,
abs/1911.03814.
Entity Encoding Subcomponent
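A small sketch of the word-vector averaging idea in [3,4]: an entity embedding is derived by pooling pre-trained word vectors over its description-page words (surface-form and category words could be added analogously). The tiny random 5-d "word vectors" below stand in for real word2vec vectors.

import numpy as np

# Illustrative stand-ins for pre-trained word2vec vectors.
word_vectors = {w: np.random.rand(5) for w in
                ["professor", "machine", "learning", "basketball", "player", "nba"]}

def entity_embedding(description: str) -> np.ndarray:
    """Average the word vectors of the entity's description words."""
    vecs = [word_vectors[w] for w in description.lower().split() if w in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(5)

e_scientist = entity_embedding("Professor of machine learning")   # e.g., Michael I. Jordan
e_athlete = entity_embedding("NBA basketball player")             # e.g., Michael Jordan (basketball)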
General Architecture of a Neural Entity Ranking Component
Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey
of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
3. Comparing Mention and Candidate Entity Representations
Three parts
● Most of the state-of-the-art studies compare mention and entity representations using a dot product [1,2,3,4] or cosine similarity [5,6,7].
● The calculated similarity score is often combined with mention-entity priors obtained during the candidate generation phase [1,3,6] or other features including various similarities, a string-matching indicator, and entity types [6,8,9,10].
● Commonly, an additional one- or two-layer feedforward network [1,6,9] is used. The final disambiguation decision is inferred via a probability distribution, usually a softmax function over the candidates (see the sketch below). The local similarity score or probability distribution can be further utilized for global scoring.
References
1 Octavian-Eugen Ganea and Thomas Hofmann. 2017. Deep joint entity disambiguation with local neural attention. In Proceedings of the 2017 Conference on Empirical
Methods in Natural Language Processing, pages 2619–2629, Copenhagen, Denmark. ACL.
2 Nitish Gupta, Sameer Singh, and Dan Roth. 2017. Entity linking via joint encoding of types, descriptions, and context. In Proceedings of the 2017 EMNLP, pages
2681–2690, Copenhagen, Denmark. ACL.
3 Nikolaos Kolitsas, Octavian-Eugen Ganea, and Thomas Hofmann. 2018. End-to-end neural entity linking. In 22nd CoNLL, pages 519–529, Brussels, Belgium. ACL.
4 Matthew E. Peters, Mark Neumann, Robert Logan, Roy Schwartz, Vidur Joshi, Sameer Singh, and Noah A. Smith. 2019. Knowledge enhanced contextual word
representations. In Proceedings of the 2019 EMNLP-IJCNLP, pages 43–54, Hong Kong, China. ACL.
5 Yaming Sun, Lei Lin, Duyu Tang, Nan Yang, Zhenzhou Ji, and XiaolongWang. 2015. Modeling mention, context and entity with neural networks for entity disambiguation. In
Proceedings of the 24th International Conference on Artificial Intelligence, IJCAI’15, pages 1333–1339. AAAI Press.
6 Matthew Francis-Landau, Greg Durrett, and Dan Klein. 2016. Capturing semantic similarity for entity linking with convolutional neural networks. In Proceedings of the 2016
NAACL: Human Language Technologies, pages 1256–1261, San Diego, California, USA.
7 Daniel Gillick, Sayali Kulkarni, Larry Lansing, Alessandro Presta, Jason Baldridge, Eugene Ie, and Diego Garcia-Olano. 2019. Learning dense representations for entity
retrieval. In Proceedings of the 23rd CoNLL, pages 528–537, Hong Kong, China. ACL.
8 Avirup Sil, Gourab Kundu, Radu Florian, and Wael Hamza. 2018. Neural cross-lingual entity linking. In 32 AAAI, New Orleans, Louisiana, USA. AAAI Press.
9 Hamed Shahbazi, Xiaoli Z Fern, Reza Ghaeini, Rasha Obeidat, and Prasad Tadepalli. 2019. Entity-aware elmo:Learning contextual entity representation for entity
disambiguation. arXiv preprint arXiv:1908.05762.
Comparing Mention and Candidate Entity Representations
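A hedged PyTorch sketch of the comparison step described above: dot-product similarity between the mention and each candidate is combined with the candidate prior, passed through a small feedforward layer, and turned into a probability distribution with softmax. All dimensions and feature choices are illustrative.

import torch
import torch.nn as nn

class LocalScorer(nn.Module):
    def __init__(self):
        super().__init__()
        # two features per candidate: dot-product similarity and the p(e|m) prior
        self.ffn = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, mention_vec, entity_vecs, priors):
        sim = entity_vecs @ mention_vec                 # dot product per candidate
        feats = torch.stack([sim, priors], dim=-1)      # combine similarity with the prior
        scores = self.ffn(feats).squeeze(-1)
        return torch.softmax(scores, dim=-1)            # probability distribution over candidates

scorer = LocalScorer()
probs = scorer(torch.randn(256),                        # mention/context representation
               torch.randn(4, 256),                     # 4 candidate entity embeddings
               torch.tensor([0.60, 0.30, 0.05, 0.05]))  # candidate priors p(e|m)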
● Optionally addressed in some systems
● Aim to equip EL systems to recognize cases in which the referent entities of some mentions are absent from the KB. This is known as NIL prediction.
Unlinkable Mention Prediction
● Optionally addressed in some systems
● Aim to equip EL systems to recognize cases in which the referent entities of some mentions are absent from the KB. This is known as NIL prediction.
● Four common ways to perform NIL prediction:
○ a candidate generator does not yield any corresponding entities for a mention [1,2]
○ set a threshold on the (best) linking probability, below which the mention is left unlinked [1,2] (see the sketch below)
○ introduce an additional special ‘NIL’ entity in the ranking phase, so some models predict it as the best match for the mention [3]
○ train an additional binary classifier that accepts mention-entity pairs after the ranking phase, as well as several additional features (best linking score, whether mentions are also detected by a dedicated NER system, etc.), and makes the final decision about whether a mention is linkable or not [4,5].
References
1 Matthew E. Peters, Mark Neumann, Robert Logan, Roy Schwartz, Vidur Joshi, Sameer Singh, and Noah A. Smith. 2019. Knowledge enhanced contextual word
representations. In Proceedings of the 2019 EMNLP-IJCNLP, pages 43–54, Hong Kong, China. Association for Computational Linguistics.
2 Nevena Lazic, Amarnag Subramanya, Michael Ringgaard, and Fernando Pereira. 2015. Plato: A selective context model for entity resolution. TACL, 3:503–515.
3 Nikolaos Kolitsas, Octavian-Eugen Ganea, and Thomas Hofmann. 2018. End-to-end neural entity linking. In 22nd CoNLL, pages 519–529, Brussels, Belgium. ACL.
4 Jose G. Moreno, Romaric Besanc¸on, Romain Beaumont, Eva D’hondt, Anne-Laure Ligozat, Sophie Rosset, Xavier Tannier, and Brigitte Grau. 2017. Combining word
and entity embeddings for entity linking. In Extended Semantic Web Conference (1), volume 10249 of Lecture Notes in Computer Science, pages 337–352.
5 Pedro Henrique Martins, Zita Marinho, and Andr´e F. T. Martins. 2019. Joint learning of named entity recognition and entity linking. In Proceedings of the 57th Annual
Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 190–196, Florence, Italy. Association for Computational Linguistics.
Unlinkable Mention Prediction
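A minimal sketch of the threshold-based NIL strategy from the list above; the threshold value and the candidate names are illustrative, and the threshold would in practice be tuned on validation data.

def link_or_nil(probs, candidates, threshold=0.5):
    """Return the best candidate, or None (NIL) if its probability stays below the threshold."""
    best = max(range(len(candidates)), key=lambda i: probs[i])
    return candidates[best] if probs[best] >= threshold else None

print(link_or_nil([0.42, 0.38, 0.20],
                  ["Michael_Jordan_(basketball)", "Michael_I._Jordan", "Michael_Jordan_(footballer)"]))
# None: no candidate clears the 0.5 threshold, so the mention is left unlinked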
(Neural) Approaches to Entity Linking: General Architecture
since 2015
Reference: Figure 2 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey
of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
(Figure callouts) Candidate generation: possible entities are produced for the mention; entity ranking: a context/mention - candidate similarity score is computed from the representations.
Modifications of the General Architecture:
● Joint Entity Recognition and Disambiguation Architectures
○ observe that the interaction between recognition and disambiguation is beneficial to the overall model
■ E.g., a multi-task learning framework that integrates recognition and linking [1]
● Global Context Architectures
○ global EL seen as a sequential decision task where the disambiguation of new entities is based on the already disambiguated ones
■ E.g., apply an LSTM to maintain long-term memory of previous decisions [2]
● Cross-lingual Architectures
○ leverage supervision signals from multiple languages for training a model in a target language
■ E.g., the inter-lingual links in Wikipedia are utilized for alignment of entities in multiple languages. With this alignment, the annotated data from high-resource languages like English can help to improve the quality of text processing for the low-resource ones [3]
References
1 Pedro Henrique Martins, Zita Marinho, and Andr´e F. T. Martins. 2019. Joint learning of named entity recognition and entity linking. In 57th ACL: Student Research
Workshop, pages 190–196, Florence, Italy. ACL.
2 Zheng Fang, Yanan Cao, Qian Li, Dongjie Zhang, Zhenyu Zhang, and Yanbing Liu. 2019. Joint entity linking with deep reinforcement learning. In The World Wide
Web Conference, WWW ’19, pages 438–447, New York, NY, USA. ACM.
3 Heng Ji, Joel Nothman, Ben Hachey, and Radu Florian. 2015. Overview of TAC-KBP2015 tri-lingual entity discovery and linking. In Proceedings of the 2015 Text
Analysis Conference, TAC 2015, pages 16–17, Gaithersburg, Maryland, USA. NIST.
(Neural) Approaches to Entity Linking: General Architecture
Modifications
Plan for Part I of II of the Talk
● Datasets & Knowledge Bases
● (Neural) Approaches
○ since 2015
● Evaluations
Results are described in terms of accuracy and micro F1 scores
Evaluations: Metrics
References
Ikuya Yamada, Koki Washio, Hiroyuki Shindo, and Yuji Matsumoto. 2020. Global entity disambiguation with pretrained contextualized embeddings of words and entities.
arXiv preprint arXiv:1909.00426v2.
Nikolaos Kolitsas, Octavian-Eugen Ganea, and Thomas Hofmann. 2018. End-to-end neural entity linking. In 22nd CoNNL, pages 519–529, Brussels, Belgium. ACL.
Ledell Yu Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, and Luke Zettlemoyer. 2020. Zero-shot entity linking with dense entity retrieval. ArXiv, abs/1911.03814.
Evaluations: Entity Linking Results
Dataset     Accuracy   Micro F1   System
AIDA        0.950      -          Yamada et al. (2020)
AIDA        -          0.824      Kolitsas et al. (2018)
KBP’10      0.940      -          Wu et al. (2020)
MSNBC       -          0.963      Yamada et al. (2020)
AQUAINT     -          0.935      Yamada et al. (2020)
ACE-2004    -          0.919      Yamada et al. (2020)
CWEB        -          0.789      Yamada et al. (2020)
WW          -          0.891      Yamada et al. (2020)
Open Source Tools and Resources for Entity Linking
Year System name NER NED URL
2010 Tagme Y Y https://tagme.d4science.org/tagme/
2011 DBpedia Spotlight Y Y https://www.dbpedia-spotlight.org/
2011 AIDA Y Y https://gate.d5.mpi-inf.mpg.de/webaida/
2013 TwitIE Y Y https://gate.ac.uk/wiki/twitie.html
2014 Babelfy Y Y http://babelfy.org/
2014 Stanford CoreNLP Y N https://stanfordnlp.github.io/CoreNLP/
2015 SpaCy Y N https://spacy.io/
since 2015 neural models - - https://paperswithcode.com/task/entity-linking
(I) Entity Linking and
(II) KG Completion
Jennifer D’Souza
Technische Informationsbibliothek (TIB)
Welfengarten 1B // 30167 Hannover
● Involves the construction of Knowledge Graphs (KG) from
unstructured text and other structured or semi-structured
sources.
○ Core tasks are relation and entity extraction
Knowledge Acquisition
A KG is typically a multi-relational graph containing entities as nodes and relations as edges. Each edge is
represented as a triplet (head entity, relation, tail entity) ((h; r; t) for short), indicating the relation between two
entities, e.g., (Albert Einstein, WinnerOf, Nobel Prize in Physics).
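For concreteness, below is a minimal sketch of the (h, r, t) triple representation just described, using the Albert Einstein example; the tiny in-memory "KG" is purely illustrative.

from typing import NamedTuple

class Triple(NamedTuple):
    head: str
    relation: str
    tail: str

kg = {
    Triple("Albert Einstein", "WinnerOf", "Nobel Prize in Physics"),
    Triple("Albert Einstein", "bornIn", "Ulm"),
}
entities = {t.head for t in kg} | {t.tail for t in kg}   # the nodes of the multi-relational graph
relations = {t.relation for t in kg}                     # the (directed) edge labels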
● Involves the construction of Knowledge Graphs (KG) from
unstructured text and other structured or semi-structured
sources.
○ Core tasks are relation and entity extraction
● Powered by KGs, many real-world applications such as recommendation systems and question answering have seen significant progress with their new capacity for commonsense understanding and reasoning.
○ Search powered by Google’s Knowledge Graph
Knowledge Acquisition
● Involves the construction of Knowledge Graphs (KG) from
unstructured text and other structured or semi-structured
sources.
○ Core tasks are relation and entity extraction
● Powered by KGs, many real-world applications such as recommendation systems and question answering have seen significant progress with their new capacity for commonsense understanding and reasoning.
○ Given knowledge: (Y,gender,Male) and (X,hasChild,Y)
Knowledge Acquisition
● Involves the construction of Knowledge Graphs (KG) from
unstructured text and other structured or semi-structured
sources.
○ Core tasks are relation and entity extraction
● Powered by KGs, many real-world applications such as recommendation systems and question answering have seen significant progress with their new capacity for commonsense understanding and reasoning.
○ Given knowledge: (Y,gender,Male) and (X,hasChild,Y)
Then, inferences such as (Y,sonOf,X) are possible.
Knowledge Acquisition
● Also involves completing an existing knowledge graph,
and other entity-oriented acquisition tasks such as entity
resolution and alignment.
● Thus, the main tasks of knowledge acquisition include
relation extraction to convert unstructured text to
structured knowledge, knowledge graph completion
(KGC), and other entity-oriented acquisition tasks such as
entity recognition and entity alignment.
○ KGC and relation extraction can be treated jointly. Han et
al. [1] proposed a joint learning framework with mutual
attention for data fusion between knowledge graphs and
text, which solves both KGC and relation extraction from
text.
References
1 X. Han, Z. Liu, and M. Sun, “Neural knowledge acquisition via mutual attention between knowledge
graph and text,” in AAAI, 2018, pp. 4832–4839.
Knowledge Acquisition
Knowledge Graph Completion
● Knowledge Graphs constructed from unstructured text or
acquired from other sources are by nature incomplete.
○ Why? Created at scale from millions of documents or at Web scale, they are easily prone to noise.
An example of an incomplete Knowledge Graph with a missing relation
Img src: https://towardsdatascience.com/embedding-models-for-knowledge-graph-completion-a66d4c01d588
Knowledge Graph Completion
● Knowledge Graphs constructed from unstructured text or
acquired from other sources are by nature incomplete.
○ Why? Created at scale from millions of documents or at Web scale, they are easily prone to noise.
is a university in
An example of an incomplete Knowledge Graph with a missing relation
Img src: https://towardsdatascience.com/embedding-models-for-knowledge-graph-completion-a66d4c01d588
The Knowledge Graph Completion Task
A KG has edges specified as triplets of elements (h, r, t) ∈ E × R × E, where the head (h) and tail (t) entities are elements of E and r is a relation type from R. Note that relations can be directed.
Formally, we define KGC as the task that tries to predict any
missing element of the triplet (h, r, t). In particular, we talk
about:
● link (entity) prediction when an element between h or t is
missing ((?, r, t) or (h, r, ?));
● relation prediction when r is missing (h, ?, t)
The Knowledge Graph Completion Task
Formally, we define KGC as the task that tries to predict any
missing element of the triplet (h, r, t). In particular, we talk
about:
● link (entity) prediction when an element between h or t is
missing ((?, r, t) or (h, r, ?));
● relation prediction when r is missing (h, ?, t)
The Knowledge Graph Completion Task
Formally, we define KGC as the task that tries to predict any
missing element of the triplet (h, r, t). In particular, we talk
about:
● link (entity) prediction when an element between h or t is
missing ((?, r, t) or (h, r, ?));
● relation prediction when r is missing (h, ?, t);
● Aside: triplet classification when an algorithm recognizes
whether a given triplet (h, r, t) is correct or not.
Knowledge Graph Completion
● challenging because:
○ it is not trivial to create a KG;
○ every entity could have a variable number of attributes
(non-unique specification);
○ R could contain different types of relation (multi-layer
network, hierarchical network);
○ a KG changes over time (evolution over time).
Plan for Part II of II of the Talk
● Approaches
● Datasets and Toolkits
1. Embedding-based (ranking) methods
○ involve learning low-dimensional embeddings, i.e. adopting Knowledge Graph Embedding (KGE) methods originally used for triple prediction
2. Relational path reasoning
○ Embedding-based methods, however, fail to capture multi-step relationships.
○ Relational path reasoning methods explore multi-step relation paths
○ The two approaches below follow the same multi-step paradigm but additionally incorporate logical rules
3. Logical rule reasoning
4. Meta relational learning
Approaches to Knowledge Graph Completion
1. Embedding-based (ranking) methods
○ For the link prediction KGC task, i.e. for the KGC task with triples (h, r, t) with h or t missing:
■ learn embedding vectors based on existing triples:
● during test, the missing h or t entity is predicted from the existing set E of entities in the KG;
● during training, triple instances are created by replacing h or t with each entity in E; scores are calculated for all candidate entities, and the top k entities are ranked.
■ all Knowledge Graph Embedding methods that represent inputs and candidates in a unified
embedding space are applicable. E.g., TransE [1], TransH [2], TransR [3], HolE [4].
References
1 A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston, and O. Yakhnenko, “Translating embeddings for modeling multi-relational data,” in NIPS, 2013, pp. 2787–2795.
2 Z. Wang, J. Zhang, J. Feng, and Z. Chen, “Knowledge graph embedding by translating on hyperplanes,” in AAAI, 2014, pp. 1112–1119.
3 Y. Lin, Z. Liu, M. Sun, Y. Liu, and X. Zhu, “Learning entity and relation embeddings for knowledge graph completion,” in AAAI, 2015, pp. 2181–2187.
4 M. Nickel, L. Rosasco, and T. Poggio, “Holographic embeddings of knowledge graphs,” in AAAI, 2016, pp. 1955–1961.
Approaches to Knowledge Graph Completion
1. Embedding-based (ranking) methods
○ For the link prediction KGC task, i.e. for the KGC task with triples (h, r, t) with h or t missing:
■ TransE model [Bordes et al., 2013]
References
1 A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston, and O. Yakhnenko, “Translating embeddings for modeling multi-relational data,” in NIPS, 2013, pp. 2787–2795.
Approaches to Knowledge Graph Completion
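A minimal sketch of the TransE idea cited above: a valid triple should satisfy h + r ≈ t, so candidates for a missing tail are ranked by the negative distance -||h + r - t||. The random embeddings below stand in for trained ones; in a real model they would be learned with a margin-based ranking loss.

import torch

num_entities, num_relations, dim = 1000, 50, 64          # illustrative sizes
entity_emb = torch.nn.functional.normalize(torch.randn(num_entities, dim), dim=-1)
relation_emb = torch.randn(num_relations, dim)

def score_tails(h_id: int, r_id: int) -> torch.Tensor:
    """TransE score -||h + r - t|| for every entity t in E as a candidate tail."""
    translated = entity_emb[h_id] + relation_emb[r_id]
    return -torch.norm(translated - entity_emb, dim=-1)

def predict_tail(h_id: int, r_id: int, k: int = 5):
    """Link prediction for (h, r, ?): return the ids of the top-k ranked candidate tails."""
    return torch.topk(score_tails(h_id, r_id), k).indices.tolist()

print(predict_tail(h_id=0, r_id=3))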
1. Embedding-based (ranking) methods
○ For the link prediction KGC task, i.e. for the KGC task with triples (h, r, t) with h or t missing:
■ learn embedding vectors based on existing triples:
● during test, the missing h or t entity is predicted from the existing set E of entities in the KG;
● during training, triple instances are created by replacing h or t with each entity in E; scores are calculated for all candidate entities, and the top k entities are ranked.
■ Unlike representing inputs and candidates in a unified embedding space, ProjE [1] proposes a combined embedding by space projection of the known parts of input triples, i.e., (h, r, ?) or (?, r, t), and the candidate entities with a candidate-entity matrix Wc ∈ R^(s×d), where s is the number of candidate entities. Its embedding projection function includes a neural combination layer and an output projection layer.
References
1 Shi and T. Weninger, “ProjE: Embedding projection for knowledge graph completion,” in AAAI, 2017, pp. 1236–1242.
Approaches to Knowledge Graph Completion
1. Embedding-based (ranking) methods
○ For the link prediction KGC task, i.e. for the KGC task with triples (h, r, t) with h or t missing:
■ learn embedding vectors based on existing triples:
● during test, the missing h or t entity is predicted from the existing set E of entities in the KG;
● during training, triple instances are created by replacing h or t with each entity in E; scores are calculated for all candidate entities, and the top k entities are ranked.
■ ConMask [1] proposes relationship-dependent content masking over the entity description to
select relevant snippets of given relations, and CNN-based target fusion to complete the
knowledge graph. It can only make a prediction when query relations and entities are explicitly
expressed in the text description.
References
1 B. Shi and T. Weninger, “Open-world knowledge graph completion,” in AAAI, 2018, pp. 1957–1964.
Approaches to Knowledge Graph Completion
2. Relation path reasoning
○ A limitation of embedding-based methods is that they do not model complex relation paths, e.g. one-to-many or many-to-many relations.
■ Relation path reasoning leverages path information over the graph structure.
Approaches to Knowledge Graph Completion
2. Relation path reasoning
○ A limitation of embedding-based methods is that they do not model complex relation paths.
■ Relation path reasoning leverages path information over the graph structure.
○ Random walk inference has been investigated.
■ E.g., the Path-Ranking Algorithm (PRA) [1] chooses a relational path under a combination of
path constraints and conducts maximum-likelihood classification.
○ Neural multi-hop relational path modeling is also studied.
■ Neelakantan et al. [2] model complex relation paths by applying compositionality recursively over the relations in the path (see the sketch below).
References
1 N. Lao and W. W. Cohen, “Relational retrieval using a combination of path-constrained random walks,” Machine learning, vol. 81, no. 1, pp. 53–67, 2010.
2 A. Neelakantan, B. Roth, and A. McCallum, “Compositional vector space models for knowledge base completion,” in ACL-IJCNLP, vol. 1, 2015, pp. 156–166.
Approaches to Knowledge Graph Completion
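A small sketch, in the spirit of the recursive composition above: the relation embeddings along a path are fed through a recurrent cell, and the composed path vector is compared against the embedding of a single target relation. The GRU cell, dimensions, and relation ids are illustrative assumptions, not the exact model of [2].

import torch
import torch.nn as nn

num_relations, dim = 50, 32                      # illustrative sizes
relation_emb = nn.Embedding(num_relations, dim)  # stand-in for learned relation vectors
cell = nn.GRUCell(dim, dim)                      # recurrent composition cell

def compose_path(relation_ids):
    """Recursively compose the relation embeddings along a path into one vector."""
    state = torch.zeros(1, dim)
    for r in relation_ids:
        state = cell(relation_emb(torch.tensor([r])), state)
    return state.squeeze(0)

path_vec = compose_path([3, 17])                 # e.g., a two-hop path such as bornIn -> locatedIn
target_vec = relation_emb(torch.tensor(8))       # e.g., a single relation such as nationality
similarity = torch.dot(path_vec, target_vec)     # high similarity suggests the path implies the relation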
2. Relation path reasoning
○ Chains-of-Reasoning [1], a neural attention mechanism enabling reasoning over multiple paths, represents logical composition across all relations, entities, and text.
○ DIVA [2] proposes a unified variational inference framework that takes multi-hop reasoning as
two sub-steps of path-finding (a prior distribution for underlying path inference) and
path-reasoning (a likelihood for link classification).
References
1 R. Das, A. Neelakantan, D. Belanger, and A. McCallum, “Chains of reasoning over entities, relations, and text using recurrent neural networks,” in EACL, vol. 1, 2017,
pp. 132–141.
2 W. Chen, W. Xiong, X. Yan, and W. Y. Wang, “Variational knowledge graph reasoning,” in NAACL, 2018, pp. 1823–1832.
Approaches to Knowledge Graph Completion
2. Reinforcement-learning based path finding
○ Deep reinforcement learning (RL) is introduced for multi-hop reasoning by formulating path-finding between entity pairs as sequential decision making, specifically a Markov decision process (MDP). The policy-based RL agent learns to choose the next relation step that extends the reasoning path through interaction with the knowledge graph environment, and the policy gradient is utilized for training the agent.
■ KGC based on the RL concepts of State, Action, Reward, and Policy Network (see the sketch below)
○ DeepPath [1] firstly applies RL into relational path learning and develops a novel reward
function to improve accuracy, path diversity, and path efficiency. It encodes states in the
continuous space via a translational embedding method and takes the relation space as its
action space.
○ Similarly, MINERVA [2] takes path walking to the correct answer entity as a sequential
optimization problem by maximizing the expected reward. It excludes the target answer entity
and provides more capable inference.
References
1 W. Xiong, T. Hoang, and W. Y. Wang, “DeepPath: A reinforcement learning method for knowledge graph reasoning,” in EMNLP, 2017, pp. 564–573.
2 R. Das, S. Dhuliawala, M. Zaheer, L. Vilnis, I. Durugkar, A. Krishnamurthy, A. Smola, and A. McCallum, “Go for a walk and arrive at the answer: Reasoning over paths
in knowledge bases using reinforcement learning,” in ICLR, 2018, pp. 1–18.
Approaches to Knowledge Graph Completion
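A toy sketch of the MDP formulation above: the state tracks the current entity, the actions are the outgoing (relation, entity) edges, and a reward of 1 is given when the target entity is reached. The random action choice stands in for a trained policy network, and the miniature KG is made up.

import random

# Toy KG: entity -> list of outgoing (relation, target entity) edges.
kg_edges = {
    "Albert Einstein": [("bornIn", "Ulm"), ("WinnerOf", "Nobel Prize in Physics")],
    "Ulm": [("locatedIn", "Germany")],
    "Germany": [("partOf", "Europe")],
}

def rollout(start: str, target: str, max_hops: int = 3):
    """One episode: walk the KG from `start`; reward 1.0 if `target` is reached."""
    path, current = [], start
    for _ in range(max_hops):
        actions = kg_edges.get(current, [])     # action space = outgoing edges of the current state
        if not actions:
            break
        relation, nxt = random.choice(actions)  # stand-in for sampling from a policy network
        path.append((current, relation, nxt))
        current = nxt
        if current == target:
            return path, 1.0                    # terminal reward
    return path, 0.0

print(rollout("Albert Einstein", "Germany"))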
2. Reinforcement-learning based path finding
○ Instead of using a binary reward function, MultiHop [1] proposes a soft reward mechanism.
Action dropout is also adopted to mask some outgoing edges during training to enable more
effective path exploration.
○ M-Walk [2] applies an RNN controller to capture the historical trajectory and uses the Monte
Carlo Tree Search (MCTS) for effective path generation.
○ Leveraging a text corpus with the sentence bag of the current entity, denoted b_et, CPL [3] proposes collaborative policy learning for path finding and fact extraction from text.
○ For the policy networks, DeepPath uses a fully-connected network, the extractor of CPL employs a CNN, while the rest use recurrent networks.
References
1 X. V. Lin, R. Socher, and C. Xiong, “Multi-hop knowledge graph reasoning with reward shaping,” in EMNLP, 2018, pp. 3243–3253.
2 Y. Shen, J. Chen, P.-S. Huang, Y. Guo, and J. Gao, “M-Walk: Learning to walk over graphs using monte carlo tree search,” in NeurIPS, 2018, pp. 6786–6797.
3 C. Fu, T. Chen, M. Qu, W. Jin, and X. Ren, “Collaborative policy learning for open knowledge graph reasoning,” in EMNLP, 2019, pp. 2672–2681.
Approaches to Knowledge Graph Completion
3. Rule-based Reasoning
○ Another direction for Knowledge Graph Completion, one that makes use of the symbolic nature of knowledge, is logical rule learning.
○ E.g., the inference rule (Y; sonOf; X) <-- (X; hasChild; Y) ^ (Y; gender; Male) derives the relation ‘sonOf’, which did not exist in the KG earlier (see the sketch below).
■ Logical rules can be extracted by rule mining tools like AMIE [1]
○ RLvLR [2] proposes a scalable rule mining approach with efficient rule searching and pruning, and uses the extracted rules for relation prediction.
References
1 L. A. Gal´arraga, C. Teflioudi, K. Hose, and F. Suchanek, “AMIE: association rule mining under incomplete evidence in ontological knowledge bases,” in WWW, 2013,
pp. 413–422.
2 P. G. Omran, K. Wang, and Z. Wang, “An embedding-based approach to rule learning in knowledge graphs,” IEEE TKDE, pp. 1–12, 2019.
Approaches to Knowledge Graph Completion
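A minimal sketch of applying the sonOf rule quoted above to a tiny, made-up set of triples; in practice such rules would be mined automatically (e.g., with AMIE) rather than hard-coded.

# Rule: (Y, sonOf, X) <- (X, hasChild, Y) AND (Y, gender, Male)
kg = {
    ("Anna", "hasChild", "Tom"),
    ("Tom", "gender", "Male"),
    ("Anna", "hasChild", "Lena"),
    ("Lena", "gender", "Female"),
}

def apply_son_of_rule(triples):
    inferred = set()
    for (x, r, y) in triples:
        if r == "hasChild" and (y, "gender", "Male") in triples:
            inferred.add((y, "sonOf", x))   # a new fact not present in the KG before
    return inferred

print(apply_son_of_rule(kg))   # {('Tom', 'sonOf', 'Anna')}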
3. Rule-based Reasoning
○ A different research direction on this topic focuses on injecting logical rules into embeddings to improve reasoning, with joint learning, for example, applied to incorporate first-order logic rules.
■ E.g., KALE [1] proposes a unified joint model with t-norm fuzzy logical connectives defined for
compatible triples and logical rules embedding.
■ Specifically, three compositions of logical conjunction, disjunction, and negation are defined to
compose the truth value of a complex formula.
References
1 S. Guo, Q. Wang, L. Wang, B. Wang, and L. Guo, “Jointly embedding knowledge graphs and logical rules,” in EMNLP, 2016, pp. 192–202.
Approaches to Knowledge Graph Completion
4. Meta Relational Learning
○ Consider that real-world knowledge is, in fact, dynamic, and unseen triples are continually acquired.
○ This scenario is called meta relational learning or few-shot relational learning
■ requires models to predict new relational facts with only very few samples
○ GMatching [1] develops a metric based few-shot learning method with entity embeddings and
local graph structures.
■ It encodes one-hop neighbors to capture the structural information with R-GCN and then takes
the structural entity embedding for multistep matching guided by long short-term memory
(LSTM) networks to calculate the similarity scores.
References
1 W. Xiong, M. Yu, S. Chang, X. Guo, and W. Y. Wang, “One-shot relational learning for knowledge graphs,” in EMNLP, 2018, pp. 1980–1990.
Approaches to Knowledge Graph Completion
Plan for Part II of II of the Talk
● Approaches
● Datasets and Toolkits
Datasets
Dataset Original Data # Rel. # Ent. # Train # Valid. # Test
WN18 WordNet 18 40,943 141,442 5,000 5,000
FB15K Freebase 1,345 14,951 483,142 50,000 59,071
WN11 WordNet 11 38,696 112,581 2,609 10,544
FB13 Freebase 13 75,043 316,232 5,908 23,733
WN18RR WordNet 11 40,943 86,835 3,034 3,134
FB15k-237 Freebase 237 14,541 272,115 17,535 20,466
FB5M Freebase 1,192 5,385,322 19,193,556 50,000 59,071
FB40K Freebase 1,336 39,528 370,648 67,946 96,678
Datasets for Tasks on Knowledge Graphs
Reference: Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P. S. (2020). A survey on knowledge graphs: Representation, acquisition and applications.
arXiv preprint arXiv:2002.00388.
A popular way of generating task-specific datasets is to sample subsets from large general datasets.
Datasets
Dataset Original Data # Rel. # Ent. # Train # Valid. # Test
WN18 WordNet 18 40,943 141,442 5,000 5,000
FB15K Freebase 1,345 14,951 483,142 50,000 59,071
WN11 WordNet 11 38,696 112,581 2,609 10,544
FB13 Freebase 13 75,043 316,232 5,908 23,733
WN18RR WordNet 11 40,943 86,835 3,034 3,134
FB15k-237 Freebase 237 14,541 272,115 17,535 20,466
FB5M Freebase 1,192 5,385,322 19,193,556 50,000 59,071
FB40K Freebase 1,336 39,528 370,648 67,946 96,678
Datasets for Tasks on Knowledge Graphs
Reference: Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P. S. (2020). A survey on knowledge graphs: Representation, acquisition and applications.
arXiv preprint arXiv:2002.00388.
E.g., the WN-prefixed datasets are subsets of the WordNet knowledge base.
Datasets
Dataset Original Data # Rel. # Ent. # Train # Valid. # Test
WN18 WordNet 18 40,943 141,442 5,000 5,000
FB15K Freebase 1,345 14,951 483,142 50,000 59,071
WN11 WordNet 11 38,696 112,581 2,609 10,544
FB13 Freebase 13 75,043 316,232 5,908 23,733
WN18RR WordNet 11 40,943 86,835 3,034 3,134
FB15k-237 Freebase 237 14,541 272,115 17,535 20,466
FB5M Freebase 1,192 5,385,322 19,193,556 50,000 59,071
FB40K Freebase 1,336 39,528 370,648 67,946 96,678
Datasets for Tasks on Knowledge Graphs
● WordNet is designed to produce an intuitively usable dictionary and thesaurus, and support automatic text
analysis.
● Its entities (termed synsets) correspond to word senses, and relationships define lexical relations between
them. Examples of triplets are (score_NN_1, hypernym, evaluation_NN_1) or (score_NN_2, has_part,
musical_notation_NN_1).
Datasets
Dataset Original Data # Rel. # Ent. # Train # Valid. # Test
WN18 WordNet 18 40,943 141,442 5,000 5,000
FB15K Freebase 1,345 14,951 483,142 50,000 59,071
WN11 WordNet 11 38,696 112,581 2,609 10,544
FB13 Freebase 13 75,043 316,232 5,908 23,733
WN18RR WordNet 11 40,943 86,835 3,034 3,134
FB15k-237 Freebase 237 14,541 272,115 17,535 20,466
FB5M Freebase 1,192 5,385,322 19,193,556 50,000 59,071
FB40K Freebase 1,336 39,528 370,648 67,946 96,678
Datasets for Tasks on Knowledge Graphs
Reference: Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P. S. (2020). A survey on knowledge graphs: Representation, acquisition and applications.
arXiv preprint arXiv:2002.00388.
On the other hand, the FB-prefixed datasets are subsets of the Freebase knowledge base.
Datasets
Dataset Original Data # Rel. # Ent. # Train # Valid. # Test
WN18 WordNet 18 40,943 141,442 5,000 5,000
FB15K Freebase 1,345 14,951 483,142 50,000 59,071
WN11 WordNet 11 38,696 112,581 2,609 10,544
FB13 Freebase 13 75,043 316,232 5,908 23,733
WN18RR WordNet 11 40,943 86,835 3,034 3,134
FB15k-237 Freebase 237 14,541 272,115 17,535 20,466
FB5M Freebase 1,192 5,385,322 19,193,556 50,000 59,071
FB40K Freebase 1,336 39,528 370,648 67,946 96,678
Datasets for Tasks on Knowledge Graphs
● Freebase is a huge and growing KB of general facts; there are currently around 1.2 billion triplets and
more than 80 million entities.
Datasets
Dataset Original Data # Rel. # Ent. # Train # Valid. # Test
WN18 WordNet 18 40,943 141,442 5,000 5,000
FB15K Freebase 1,345 14,951 483,142 50,000 59,071
WN11 WordNet 11 38,696 112,581 2,609 10,544
FB13 Freebase 13 75,043 316,232 5,908 23,733
WN18RR WordNet 11 40,943 86,835 3,034 3,134
FB15k-237 Freebase 237 14,541 272,115 17,535 20,466
FB5M Freebase 1,192 5,385,322 19,193,556 50,000 59,071
FB40K Freebase 1,336 39,528 370,648 67,946 96,678
Datasets for Tasks on Knowledge Graphs
● Freebase is a huge and growing KB of general facts; there are currently around 1.2 billion triplets and
more than 80 million entities.
● The smaller dataset (FB15K) was made by selecting the subset of entities that are also present in the Wikilinks database and that have at least 100 mentions in Freebase (for both entities and relationships).
Datasets
Dataset Original Data # Rel. # Ent. # Train # Valid. # Test
WN18 WordNet 18 40,943 141,442 5,000 5,000
FB15K Freebase 1,345 14,951 483,142 50,000 59,071
WN11 WordNet 11 38,696 112,581 2,609 10,544
FB13 Freebase 13 75,043 316,232 5,908 23,733
WN18RR WordNet 11 40,943 86,835 3,034 3,134
FB15k-237 Freebase 237 14,541 272,115 17,535 20,466
FB5M Freebase 1,192 5,385,322 19,193,556 50,000 59,071
FB40K Freebase 1,336 39,528 370,648 67,946 96,678
Datasets for Tasks on Knowledge Graphs
● The large-scale dataset (FB5M) was created by selecting the 5 million most frequently occurring entities in Freebase.
Datasets
Dataset Original Data # Rel. # Ent. # Train # Valid. # Test
WN18 WordNet 18 40,943 141,442 5,000 5,000
FB15K Freebase 1,345 14,951 483,142 50,000 59,071
WN11 WordNet 11 38,696 112,581 2,609 10,544
FB13 Freebase 13 75,043 316,232 5,908 23,733
WN18RR WordNet 11 40,943 86,835 3,034 3,134
FB15k-237 Freebase 237 14,541 272,115 17,535 20,466
FB5M Freebase 1,192 5,385,322 19,193,556 50,000 59,071
FB40K Freebase 1,336 39,528 370,648 67,946 96,678
Datasets for Tasks on Knowledge Graphs
Reference: Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P. S. (2020). A survey on knowledge graphs: Representation, acquisition and applications.
arXiv preprint arXiv:2002.00388.
● The datasets WN18 and FB15k suffer from test set leakage through inverse relations, where a large
number of test triples could be obtained by inverting triples in the training set.
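A minimal sketch of how this leakage can be quantified, assuming triples are stored as (head, relation, tail) tuples; the function name and the toy demo triples are illustrative only.

```python
from collections import defaultdict

def inverse_leakage_rate(train, test):
    """Fraction of test triples (h, r, t) whose inverse form (t, r', h),
    for any relation r', already appears in the training set."""
    inverted = defaultdict(set)              # (tail, head) -> relations seen in training
    for h, r, t in train:
        inverted[(t, h)].add(r)
    leaked = sum(1 for h, r, t in test if (h, t) in inverted)
    return leaked / max(len(test), 1)

# Tiny demo: the test triple is just the inverse of a training triple, so the rate is 1.0.
train = [("Paris", "capitalOf", "France")]
test = [("France", "hasCapital", "Paris")]
print(inverse_leakage_rate(train, test))     # 1.0
# On WN18/FB15k this rate is high; on WN18RR/FB15k-237 it should be close to zero.
```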
Datasets
Dataset Original Data # Rel. # Ent. # Train # Valid. # Test
WN18 WordNet 18 40,943 141,442 5,000 5,000
FB15K Freebase 1,345 14,951 483,142 50,000 59,071
WN11 WordNet 11 38,696 112,581 2,609 10,544
FB13 Freebase 13 75,043 316,232 5,908 23,733
WN18RR WordNet 11 40,943 86,835 3,034 3,134
FB15k-237 Freebase 237 14,541 272,115 17,535 20,466
FB5M Freebase 1,192 5,385,322 19,193,556 50,000 59,071
FB40K Freebase 1,336 39,528 370,648 67,946 96,678
Datasets for Tasks on Knowledge Graphs
Reference: Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P. S. (2020). A survey on knowledge graphs: Representation, acquisition and applications.
arXiv preprint arXiv:2002.00388.
● FB15k-237 was then introduced as a subset of FB15k from which such inverse relations were removed.
Toolkits
Task Library Language URL
General Grakn Python github.com/graknlabs/kglib
General AmpliGraph TensorFlow github.com/Accenture/AmpliGraph
General GraphVite Python graphvite.io
Database Akutan Go github.com/eBay/akutan
KRL OpenKE PyTorch github.com/thunlp/OpenKE
KRL Fast-TransX C++ github.com/thunlp/Fast-TransX
KRL scikit-kge Python github.com/mnick/scikit-kge
KRL LibKGE PyTorch github.com/uma-pi1/kge
KRL PyKEEN Python github.com/SmartDataAnalytics/PyKEEN
RE OpenNRE PyTorch github.com/thunlp/OpenNRE
Table: Summary of open-source libraries for knowledge graph construction
Reference: Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P. S. (2020). A survey on knowledge graphs: Representation, acquisition and applications.
arXiv preprint arXiv:2002.00388.
Toolkits
Task Library Language URL
General Grakn Python github.com/graknlabs/kglib
General AmpliGraph TensorFlow github.com/Accenture/AmpliGraph
General GraphVite Python graphvite.io
Database Akutan Go github.com/eBay/akutan
KRL OpenKE PyTorch github.com/thunlp/OpenKE
KRL Fast-TransX C++ github.com/thunlp/Fast-TransX
KRL scikit-kge Python github.com/mnick/scikit-kge
KRL LibKGE PyTorch github.com/uma-pi1/kge
KRL PyKEEN Python github.com/SmartDataAnalytics/PyKEEN
RE OpenNRE PyTorch github.com/thunlp/OpenNRE
Table: Summary of open-source libraries for knowledge graph construction
● AmpliGraph for knowledge representation learning
Toolkits
Task Library Language URL
General Grakn Python github.com/graknlabs/kglib
General AmpliGraph TensorFlow github.com/Accenture/AmpliGraph
General GraphVite Python graphvite.io
Database Akutan Go github.com/eBay/akutan
KRL OpenKE PyTorch github.com/thunlp/OpenKE
KRL Fast-TransX C++ github.com/thunlp/Fast-TransX
KRL scikit-kge Python github.com/mnick/scikit-kge
KRL LibKGE PyTorch github.com/uma-pi1/kge
KRL PyKEEN Python github.com/SmartDataAnalytics/PyKEEN
RE OpenNRE PyTorch github.com/thunlp/OpenNRE
● Akutan for knowledge graph storage and querying
Toolkits
Task Library Language URL
General Grakn Python github.com/graknlabs/kglib
General AmpliGraph TensorFlow github.com/Accenture/AmpliGraph
General GraphVite Python graphvite.io
Database Akutan Go github.com/eBay/akutan
KRL OpenKE PyTorch github.com/thunlp/OpenKE
KRL Fast-TransX C++ github.com/thunlp/Fast-TransX
KRL scikit-kge Python github.com/mnick/scikit-kge
KRL LibKGE PyTorch github.com/uma-pi1/kge
KRL PyKEEN Python github.com/SmartDataAnalytics/PyKEEN
RE OpenNRE PyTorch github.com/thunlp/OpenNRE
● Three examples of useful toolkits released by the research community:
○ scikit-kge and OpenKE for knowledge graph embedding
Toolkits
Task Library Language URL
General Grakn Python github.com/graknlabs/kglib
General AmpliGraph TensorFlow github.com/Accenture/AmpliGraph
General GraphVite Python graphvite.io
Database Akutan Go github.com/eBay/akutan
KRL OpenKE PyTorch github.com/thunlp/OpenKE
KRL Fast-TransX C++ github.com/thunlp/Fast-TransX
KRL scikit-kge Python github.com/mnick/scikit-kge
KRL LibKGE PyTorch github.com/uma-pi1/kge
KRL PyKEEN Python github.com/SmartDataAnalytics/PyKEEN
RE OpenNRE PyTorch github.com/thunlp/OpenNRE
● Three examples of useful toolkits released by the research community (a usage sketch of one toolkit from the table follows):
○ OpenNRE for relation extraction
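As an illustration of how such a toolkit is typically used, the sketch below trains and evaluates a link-prediction model with PyKEEN's pipeline. The dataset key, keyword arguments, and output call are assumptions based on PyKEEN's documented API and may differ between versions.

```python
from pykeen.pipeline import pipeline

# Train TransE on the FB15k-237 benchmark split and evaluate it on link prediction.
result = pipeline(
    dataset="fb15k237",                      # assumed built-in dataset key
    model="TransE",
    training_kwargs=dict(num_epochs=5),      # deliberately tiny for a quick demo run
)
result.save_to_directory("transe_fb15k237")  # stores the trained model and evaluation metrics
```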
● Entity Linking is a long-researched topic in the NLP community
● Neural models have enabled systems to cross the 95% performance barrier for the task
● Knowledge Graph Completion is an active and relatively new research area for neural models
○ It uses machine learning and neural networks to ‘vectorize’ entities and relationships (a minimal sketch follows after this slide)
● Implementations can be slow, but this has recently started to change
Conclusion: Takeaways
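To make the ‘vectorize’ point concrete, here is a toy, self-contained sketch of a TransE-style plausibility score over randomly initialised vectors. Real systems learn these vectors from training triples; the entity and relation names below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50
# Illustrative vocabulary only; in practice the embeddings are learned, not sampled.
entities = {e: rng.normal(size=dim) for e in ("Albert_Einstein", "Ulm", "Nobel_Prize_in_Physics")}
relations = {r: rng.normal(size=dim) for r in ("bornIn", "winnerOf")}

def transe_score(h, r, t):
    """TransE plausibility: the smaller ||h + r - t||, the more plausible the triple (h, r, t)."""
    return -float(np.linalg.norm(entities[h] + relations[r] - entities[t]))

print(transe_score("Albert_Einstein", "bornIn", "Ulm"))
```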
Happy to take Questions
Thank you for your attention!

Perspectives on mining knowledge graphs from text

  • 1.
    Presented by: JenniferD’Souza, Postdoc in the ORKG team http://orkg.org | @orkg_org Technische Informationsbibliothek (TIB) Welfengarten 1B // 30167 Hannover Perspectives on Mining Knowledge Graphs from Text
  • 2.
    ● A criticalscientific document digitalization initiative in this digital age ○ Capturing scholarly article contributions in machine interpretable Knowledge Graphs ● The ORKG is hosted at TIB ○ https://www.orkg.org/ ○ @orkg_org ● Led by TIB director Prof. (Dr.) Sören Auer The Open Research Knowledge Graph (ORKG) is ...
  • 3.
    Perspectives on MiningKnowledge Graphs from Text Knowledge Graph Knowledge Representation Learning Representation Space Scoring Function Encoding Models Knowledge Acquisition Entity Discovery Relation Extraction Knowledge Graph Completion *Point-wise *Manifold *Complex *Gaussian *Discrete *Linear/Bilinear *RNN *Factorization *Transformers *Neural Nets *GCN *CNN *Recognition *Typing *Linking *Alignment *Neural Nets *Attention *GCN *GAN *RL *Others *Embedding-based Ranking *Path-based Reasoning *Rule-based Reasoning *Meta Relational Learning *Triple Classification References Ji, Shaoxiong, et al. "A survey on knowledge graphs: Representation, acquisition, and applications." IEEE Transactions on Neural Networks and Learning Systems (2021). *Distance-based *Similarity Matching
  • 4.
    (I) Entity Linkingand (II) KG Completion Jennifer D’Souza Technische Informationsbibliothek (TIB) Welfengarten 1B // 30167 Hannover
  • 5.
    ● Given anentity mention in a text document and a knowledge base (KB) of entities, ○ find the entity in the KB the entity mention refers to or ○ determine that such entity does not exist in the KB Entity Linking
  • 6.
    Entity Linking What isthe birthdate of the famous basketball player Michael Jordan?
  • 7.
    Entity Linking What isthe birthdate of the famous basketball player Michael Jordan?
  • 8.
    Entity Linking What isthe birthdate of the famous basketball player Michael Jordan? Knowledge Bases
  • 9.
    Entity Linking What isthe birthdate of the famous basketball player Michael Jordan? Knowledge Bases
  • 10.
    Entity Linking What isthe birthdate of the famous basketball player Michael Jordan? Knowledge Bases
  • 11.
    Entity Linking ● challengingbecause ○ entity ambiguity: mentions with the same word/phrase can have various entity candidates ■ E.g., Michael Jordan: Basketball player or Berkeley professor? ○ name variations: mentions with different words/phrases can refer to the same entity ■ E.g., New York City or Big Apple
  • 12.
    Entity Linking ● challengingbecause ○ entity ambiguity: mentions with the same word/phrase can have various entity candidates ■ E.g., Michael Jordan: Basketball player or Berkeley professor? ○ name variations: mentions with different words/phrases can refer to the same entity ■ E.g., New York City or Big Apple ● Aside 1: alternatively called Named Entity Disambiguation ○ However, Named Entity Disambiguation (NED) and Entity Linking (EL) can sometimes be treated as separate tasks. ■ NED: determine which named entity a mention refers to. ● E.g., the mention “Trump” can refer to either a person, a corporation or a building; ■ EL: provide a standard IRI for each disambiguated entity. ● IRIs (Internationalized Resource Identifier) used as subjects, predicates, and objects can be taken from well-defined vocabularies or ontologies in the Linked Open Data (LOD) cloud
  • 13.
    Entity Linking ● challengingbecause ○ entity ambiguity: mentions with the same word/phrase can have various entity candidates ■ E.g., Michael Jordan: Basketball player or Berkeley professor? ○ name variations: mentions with different words/phrases can refer to the same entity ■ E.g., New York City or Big Apple ● Aside 1: alternatively called Named Entity Disambiguation ○ However, Named Entity Disambiguation (NED) and Entity Linking (EL) can sometimes be treated as separate tasks. ■ NED: determine which named entity a mention refers to. ● E.g., the mention “Trump” can refer to either a person, a corporation or a building; ■ EL: provide a standard IRI for each disambiguated entity. ● E.g., Trump-the-president can be linked to the IRI that represents him in Wikidata: https://www.wikidata.org/entity/Q22686
  • 14.
    Entity Linking ● challengingbecause ○ entity ambiguity: mentions with the same word/phrase can have various entity candidates ■ E.g., Michael Jordan: Basketball player or Berkeley professor? ○ name variations: mentions with different words/phrases can refer to the same entity ■ E.g., New York City or Big Apple ● Aside 1: alternatively called Named Entity Disambiguation ○ In this talk, NED and EL are treated as the same task, i.e. NED that finds which entity a mention like “Trump” refers to, and the EL providing the LOD IRI for that entity, are considered as one step
  • 15.
    Entity Linking ● challengingbecause ○ entity ambiguity: mentions with the same word/phrase can have various entity candidates ■ E.g., Michael Jordan: Basketball player or Berkeley professor? ○ name variations: mentions with different words/phrases can refer to the same entity ■ E.g., New York City or Big Apple ● Aside 2: commonly known as normalization for the biomedical domain ○ Map a word/phrase in a document to a concept in an ontology after disambiguating potential ambiguous words/phrases
  • 16.
    Entity Linking ● challengingbecause ○ entity ambiguity: mentions with the same word/phrase can have various entity candidates ■ E.g., Michael Jordan: Basketball player or Berkeley professor? ○ name variations: mentions with different words/phrases can refer to the same entity ■ E.g., New York City or Big Apple ● Aside 2: commonly known as normalization for the biomedical domain ○ This talk will focus on the open-domain EL task, i.e. involving data from newswire or the Web. ■ While the approaches for open-domain EL can be imported to biomedical entity normalization, the latter task may be amenable to strong rule-based resolution1,2 as well References 1 D’Souza, J., & Ng, V. (2015, July). Sieve-based entity linking for the biomedical domain. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers) (pp. 297-302) 2. D. Kim et al., "A Neural Named Entity Recognition and Multi-Type Normalization Tool for Biomedical Text Mining," in IEEE Access, vol. 7, pp. 73729-73740, 2019, doi: 10.1109/ACCESS.2019.2920708.
  • 17.
    Plan for PartI of II of the Talk ● Datasets & Knowledge Bases ● (Neural) Approaches ○ since 2015 ● Evaluations
  • 18.
    Datasets ● Open-domain Evaluationdatasets from various genres ○ News, Tweets, Web pages, Blog, Encyclopedia
  • 19.
    Datasets: Details &Statistics Dataset Name Genre Mentions KB AIDA news 34,587 YAGO/Freebase/W ikipedia KBP’2010 news 4,338 Wikipedia MSNBC news 656 Wikipedia AQUAINT news 449 Wikipedia ACE-2004 news 257 Wikipedia WNED-CWEB (CWEB) news 11,154 Wikipedia WNED-WIKI (WW) news 6,821 Wikipedia
  • 20.
    Datasets: Details &Statistics Dataset Name Genre Mentions KB AIDA news 34,587 YAGO/Freebase/W ikipedia KBP’2010 news 4,338 Wikipedia MSNBC news 656 Wikipedia AQUAINT news 449 Wikipedia ACE-2004 news 257 Wikipedia WNED-CWEB (CWEB) news 11,154 Wikipedia WNED-WIKI (WW) news 6,821 Wikipedia AIDA is the largest human-annotated dataset, where each of the 34,587 mentions were checked for entities in the YAGO knowledge base.
  • 21.
    Datasets: Details &Statistics Dataset Name Genre Mentions KB AIDA news 34,587 YAGO/Freebase/W ikipedia KBP’2010 news 4,338 Wikipedia MSNBC news 656 Wikipedia AQUAINT news 449 Wikipedia ACE-2004 news 257 Wikipedia WNED-CWEB (CWEB) news 11,154 Wikipedia WNED-WIKI (WW) news 6,821 Wikipedia The Knowledge Base Population (KBP) track conducted as part of NIST Text Analysis Conference (TAC) is an international entity linking competition held every year since 2009. Entity linking is regarded as one of the two subtasks in this track. These public entity linking competitions provided some benchmark data sets to evaluate and compare different entity linking systems.
  • 22.
    Datasets: Details &Statistics Dataset Name Genre Mentions KB AIDA news 34,587 YAGO/Freebase/W ikipedia KBP’2010 news 4,338 Wikipedia MSNBC news 656 Wikipedia AQUAINT news 449 Wikipedia ACE-2004 news 257 Wikipedia WNED-CWEB (CWEB) news 11,154 Wikipedia WNED-WIKI (WW) news 6,821 Wikipedia Then there is a dataset from the MSNBC news source.
  • 23.
    Datasets: Details &Statistics Dataset Name Genre Mentions KB AIDA news 34,587 YAGO/Freebase/W ikipedia KBP’2010 news 4,338 Wikipedia MSNBC news 656 Wikipedia AQUAINT news 449 Wikipedia ACE-2004 news 257 Wikipedia WNED-CWEB (CWEB) news 11,154 Wikipedia WNED-WIKI (WW) news 6,821 Wikipedia WNED datasets where WNED stands for Walking Named Entity Disambiguation as a name of the algorithm developed for EL are the largest automatically created datasets.
  • 24.
    Others: NEEL (tweets;8,665 mentions; DBpedia); OKE-2015 (encyclopedia; DBpedia); WES2015 (blog; DBpedia); WikiNews (news; DBpedia); OKE2016 (Web pages; 1,043 mentions; DBpedia) Datasets: Details & Statistics Dataset Name Genre Mentions KB AIDA news 34,587 YAGO/Freebase/W ikipedia KBP’2010 news 4,338 Wikipedia MSNBC news 656 Wikipedia AQUAINT news 449 Wikipedia ACE-2004 news 257 Wikipedia WNED-CWEB (CWEB) news 11,154 Wikipedia WNED-WIKI (WW) news 6,821 Wikipedia
  • 25.
    Others: NEEL (tweets;8,665 mentions; DBpedia); OKE-2015 (encyclopedia; DBpedia); WES2015 (blog; DBpedia); WikiNews (news; DBpedia); OKE2016 (Web pages; 1,043 mentions; DBpedia) Datasets: Details & Statistics Dataset Name Genre Mentions KB AIDA news 34,587 YAGO/Freebase/W ikipedia KBP’2010 news 4,338 Wikipedia MSNBC news 656 Wikipedia AQUAINT news 449 Wikipedia ACE-2004 news 257 Wikipedia WNED-CWEB (CWEB) news 11,154 Wikipedia WNED-WIKI (WW) news 6,821 Wikipedia
  • 26.
    ● A fundamentalcomponent for Entity Linking ● Knowledge bases provide the information about the world’s entities (e.g., the entities of Albert Einstein and Ulm), their semantic categories (e.g., Albert Einstein has a type of Scientist and Ulm has a type of City), and the mutual relationships between entities (e.g., Albert Einstein has a relation named bornIn with Ulm). ● Examples: ○ Wikipedia (6,195,675 English articles)1 ■ a free online multilingual encyclopedia created through decentralized, collective efforts of thousands of volunteers around the world. ■ The basic entry in Wikipedia is an article, which defines and describes an entity or a topic, and each article in Wikipedia is uniquely referenced by an identifier. ■ Wikipedia provides a set of useful features for entity linking, such as entity pages, article categories, redirect pages, disambiguation pages, and hyperlinks in Wikipedia articles. ○ DBpedia (4.58 million things in English version)2 ■ multilingual knowledge base constructed by extracting structured information from Wikipedia such as infobox templates, categorization information, geo-coordinates, and links to external Web pages References 1 https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia 2 https://wiki.dbpedia.org/about/facts-figures Knowledge Bases for Entities
  • 27.
    ● A fundamentalcomponent for Entity Linking ● Knowledge bases provide the information about the world’s entities (e.g., the entities of Albert Einstein and Ulm), their semantic categories (e.g., Albert Einstein has a type of Scientist and Ulm has a type of City), and the mutual relationships between entities (e.g., Albert Einstein has a relation named bornIn with Ulm). ● Examples: ○ YAGO (50 million entities and 2 billion facts)1 ■ YAGO combines Wikidata and the schema.org ontology as the top level ontology for information organization, thus getting the best from both worlds: a huge repository of facts, together with an ontology that is simple and used as a standard by a large community. References 1 https://yago-knowledge.org/getting-started Knowledge Bases for Entities
  • 28.
    We have: 1. releaseda novel multidisciplinary corpus of scholarly abstracts annotated for scientific entities under a generic conceptual formalism that bridges 10 different STEM scientific disciplines a. The STEM domains we consider are Agriculture, Astronomy, Biology, Chemistry, Computer Science, Earth Science, Engineering, Materials Science, and Mathematics. b. The generic conceptual formalism involves four entity types i. Process, Method, Material, and Data c. The terms underlying the Process, Method, Material, and Data entities are linked in Wikipedia, thereby, our entities are disambiguated for their scientific sense and grounded in the real world. d. The STEM-ECR v1.0 corpus is publicly available: https://doi.org/10.25835/0017546 (ISLRN 749-555-840-571-2) References ● Brack, Arthur, Jennifer D’Souza, Anett Hoppe, Sören Auer, and Ralph Ewerth. "Domain-independent extraction of scientific concepts from research articles." In European Conference on Information Retrieval, pp. 251-266. Springer, Cham, 2020. ● D’Souza, Jennifer, Anett Hoppe, Arthur Brack, Mohmad Yaser Jaradeh, Sören Auer, and Ralph Ewerth. "The STEM-ECR Dataset: Grounding Scientific Entity References in STEM Scholarly Content to Authoritative Encyclopedic and Lexicographic Sources." In Proceedings of The 12th Language Resources and Evaluation Conference, pp. 2192-2203. 2020. Datasets & Knowledge Bases New Resource Highlight: Scholarly Knowledge Linked Entities across STEM Disciplines
  • 29.
    Plan for PartI of II of the Talk ● Datasets & Knowledge Bases ● (Neural) Approaches ○ since 2015 ● Evaluations
  • 30.
    (Neural) Approaches toEntity Linking ● EL has three main subtasks: ○ candidate-entity generation; ■ aims to retrieve all possible entities in the KB that may refer to an entity mention ○ candidate-entity ranking or disambiguation; ■ aims to rank the candidate entities and return the most likely one for each targeted mention ○ NIL clustering ■ handles those mentions that cannot be matched with an entity in the KB
  • 31.
    Approaches to EntityLinking: From the 3 subtasks perspective Reference: Figure 7 in T. Al-Moslmi, M. Gallofré Ocaña, A. L. Opdahl and C. Veres, "Named Entity Extraction for Knowledge Graphs: A Literature Overview," in IEEE Access, vol. 8, pp. 32862-32881, 2020, doi: 10.1109/ACCESS.2020.2973928.
  • 32.
    Approaches to EntityLinking: From the systems perspective Reference: Part of Figure 8 in T. Al-Moslmi, M. Gallofré Ocaña, A. L. Opdahl and C. Veres, "Named Entity Extraction for Knowledge Graphs: A Literature Overview," in IEEE Access, vol. 8, pp. 32862-32881, 2020, doi: 10.1109/ACCESS.2020.2973928.
  • 33.
    Three Non-Neural Approaches ●AIDA1 ○ Mention Detection using Stanford NER Tagger ○ Linking as a graph-based technique with weighted edges computed as degree of links between pages ● DBpedia Spotlight2 ○ Mention Detection as a lightweight heuristics-based model with syntactic parsers to generate mention candidates ○ Linking as a generative probabilistic model using maximum likelihood estimates ● Babelfy3 ○ Mention Detection as named entities (e.g., Major League Soccer) and overlapping nominals (e.g., major league, soccer) ○ A unified graph-based approach relying on encyclopedic and lexicographic knowledge References 1. AIDA: An Online Tool for Accurate Disambiguation of Named Entities in Text and Tables 2. Daiber, J., Jakob, M., Hokamp, C., & Mendes, P. N. (2013, September). Improving efficiency and accuracy in multilingual entity extraction. In Proceedings of the 9th International Conference on Semantic Systems (pp. 121-124). 3. A. Moro, A. Raganato, R. Navigli. Entity Linking meets Word Sense Disambiguation: a Unified Approach. Transactions of the Association for Computational Linguistics (TACL), 2, pp. 231-244, 2014 Approaches to Entity Linking: From the systems perspective prior to 2015
  • 34.
    (Neural) Approaches toEntity Linking: General Architecture since 2015 Reference: Figure 2 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
  • 35.
    (Neural) Approaches toEntity Linking: General Architecture since 2015 Reference: Figure 2 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey of Models based on Deep Learning. arXiv preprint arXiv:2006.00575. mentions in a plain text are distinguished corresponding entity is predicted for the given mention
  • 36.
    (Neural) Approaches toEntity Linking: General Architecture since 2015 Reference: Figure 2 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey of Models based on Deep Learning. arXiv preprint arXiv:2006.00575. possible entities are produced for the mention context/mention - candidate similarity score is computed through the representations
  • 37.
    Three prominent methods: ●a surface form matching ○ a candidate list is composed of entities, which simply match surface forms of mentions in the text; does not work well if referent entity does not contain mention string ● a dictionary lookup ○ a dictionary of additional aliases is constructed using KB metadata like disambiguation/redirect pages of Wikipedia or lexical synonymy relations ● and a prior probability computation ○ the candidates are generated based on precalculated prior probabilities of correspondence between certain mentions and entities; based on mention-entity hyperlink count statistics [1,2,3,4,5,etc.] ○ References 1. Stefan Zwicklbauer, Christin Seifert, and Michael Granitzer. 2016. Robust and collective entity disambiguation through semantic embeddings. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’16, pages 425–434, New York, NY, USA. ACM. 2. Chen-Tse Tsai and Dan Roth. 2016. Cross-lingual Wikification using multilingual embeddings. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 589–598, San Diego, California, USA. ACL. 3. Octavian-Eugen Ganea and Thomas Hofmann. 2017. Deep joint entity disambiguation with local neural attention. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2619–2629, Copenhagen, Denmark. ACL. 4. Nikolaos Kolitsas, Octavian-Eugen Ganea, and Thomas Hofmann. 2018. End-to-end neural entity linking. In Proceedings of the 22nd Conference on Computational Natural Language Learning, pages 519–529, Brussels, Belgium. Association for Computational Linguistics. 5. Avirup Sil, Gourab Kundu, Radu Florian, and Wael Hamza. 2018. Neural cross-lingual entity linking. In The 32 AAAI, New Orleans, Louisiana, USA. AAAI Press. Candidate Generation
  • 38.
    The goal ofthis stage is given a list of entity candidates from a KB and a context with a mention to rank these entities assigning a score to each of them. Entity Ranking
  • 39.
    General Architecture ofa Neural Entity Ranking Component Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey of Models based on Deep Learning. arXiv preprint arXiv:2006.00575.
  • 40.
    General Architecture ofa Neural Entity Ranking Component Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey of Models based on Deep Learning. arXiv preprint arXiv:2006.00575. Three parts 1.Encoding the mention
  • 41.
    General Architecture ofa Neural Entity Ranking Component Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey of Models based on Deep Learning. arXiv preprint arXiv:2006.00575. ● To correctly disambiguate an entity mention, it is crucial to thoroughly capture the information from its context. ● A contextualized vector representation of a mention is generated by an encoder network.
  • 42.
    Two approaches prevail: ●recurrent networks with LSTM cells ● self-attention Mention Encoding Subcomponent
  • 43.
    Two approaches prevail: ●recurrent networks with LSTM cells ○ concatenating outputs of two LSTM networks that independently encode left and right contexts of a mention (including the mention itself) [1]; ○ encode left and right local contexts via LSTMs but also pool the results across all mentions in a coreference chain and postprocess left and right representations with a tensor network [2]; ○ modification of LSTM–GRU in conjunction with an attention mechanism to encode left and right context of a mention [3]; ○ run a bidirectional LSTM network on words complemented with embeddings of word positions relative to a target mention [4] References 1 Nitish Gupta, Sameer Singh, and Dan Roth. 2017. Entity linking via joint encoding of types, descriptions, and context. In 2017 EMNLP, pages 2681–2690, Copenhagen, Denmark. ACL. 2 Avirup Sil, Gourab Kundu, Radu Florian, and Wael Hamza. 2018. Neural cross-lingual entity linking. In The 32 AAAI, New Orleans, Louisiana, USA. AAAI Press. 3. Yotam Eshel, Noam Cohen, Kira Radinsky, Shaul Markovitch, Ikuya Yamada, and Omer Levy. 2017. Named entity disambiguation for noisy text. In CoNLL 2017, pages 58–68, Vancouver, Canada. ACL. 4. Phong Le and Ivan Titov. 2019b. Distant learning for entity linking with automatic noise detection. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4081–4090, Florence, Italy, July. ACL. Mention Encoding Subcomponent
  • 44.
    Two approaches prevail: ●recurrent networks with LSTM cells ● self-attention: encoding methods based on self-attention rely on the outputs from pre-trained BERT layers for context and mention encoding. Mention Encoding Subcomponent
  • 45.
    Two approaches prevail: ●recurrent networks with LSTM cells ● self-attention: encoding methods based on self-attention rely on the outputs from pre-trained BERT layers for context and mention encoding. ○ a mention representation is modeled by pooling over word pieces in a mention span. The authors also put an additional self-attention block over all mention representations that encode interactions between several entities in a sentence [1]. ○ reduce a sequence by keeping the representation of the special pooling symbol ‘[CLS]’ inserted at the beginning of a sequence [2]. ○ mark positions of a mention span by summing embeddings of words within the span with a special vector [3] and use the same reduction strategy as [2]. ○ concatenate text with all mentions in it and jointly encode this sequence via a self-attention model based on pre-trained BERT [4]. References 1 Matthew E. Peters, Mark Neumann, Robert Logan, Roy Schwartz, Vidur Joshi, Sameer Singh, and Noah A. Smith. 2019. Knowledge enhanced contextual word representations. In Proceedings of the 2019 EMNLP-IJCNLP, pages 43–54, Hong Kong, China. ACL. 2 Ledell Yu Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, and Luke Zettlemoyer. 2020. Zero-shot entity linking with dense entity retrieval. ArXiv, abs/1911.03814. 3 Lajanugen Logeswaran, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, Jacob Devlin, and Honglak Lee. 2019. Zero-shot entity linking by reading entity descriptions. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3449–3460, Florence, Italy. ACL. 4 Ikuya Yamada, Koki Washio, Hiroyuki Shindo, and Yuji Matsumoto. 2020. Global entity disambiguation with pretrained contextualized embeddings of words and entities. arXiv preprint arXiv:1909.00426v2. Mention Encoding Subcomponent
  • 46.
    General Architecture ofa Neural Entity Ranking Component Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey of Models based on Deep Learning. arXiv preprint arXiv:2006.00575. 2.Encoding the candidate entities Three parts
  • 47.
    General Architecture ofa Neural Entity Ranking Component Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey of Models based on Deep Learning. arXiv preprint arXiv:2006.00575. ● Linking decision will be based on how accurately candidate entities match a corresponding mention or context based on the entity structured or textual information. ● Low-dimensional semantic representations of entities account for this in such a way that spatial proximity of entities in a vector space correlates with their semantic similarity. Three parts
  • 48.
    Aim to obtainvector representations for entities: ● capture different kinds of entity information, including entity type, description page, linked mention, and contextual information, and therefore, generate a large encoder, which involves CNN for the entity description and alignment function for the others [1]. ● encode entities based on their title, description page, and category information. All previously mentioned models rely on the annotated data, and a few studies are challenged with less resource dependence [2]. ● derive entity embeddings using pre-trained word2vec word vectors through description page words, surface forms words, and entity category words [3,4]. ● depend on the BERT architecture to create representations through the description pages [5,6]. References 1 Nitish Gupta, Sameer Singh, and Dan Roth. 2017. Entity linking via joint encoding of types, descriptions, and context. In Proceedings of the 2017 EMNLP, pages 2681–2690, Copenhagen, Denmark. ACL. 2 Daniel Gillick, Sayali Kulkarni, Larry Lansing, Alessandro Presta, Jason Baldridge, Eugene Ie, and Diego Garcia-Olano. 2019. Learning dense representations for entity retrieval. In Proceedings of the 23rd CoNLL, pages 528–537, Hong Kong, China. ACL. 3 Yaming Sun, Lei Lin, Duyu Tang, Nan Yang, Zhenzhou Ji, and XiaolongWang. 2015. Modeling mention, context and entity with neural networks for entity disambiguation. In Proceedings of the 24th, IJCAI’15, pages 1333–1339. AAAI Press. 4 Avirup Sil, Gourab Kundu, Radu Florian, and Wael Hamza. 2018. Neural cross-lingual entity linking. In 32 AAAI, New Orleans, Louisiana, USA. AAAI Press. 5 Lajanugen Logeswaran, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, Jacob Devlin, and Honglak Lee. 2019. Zero-shot entity linking by reading entity descriptions. In Proceedings of the 57th ACL, pages 3449–3460, Florence, Italy. ACL. 6 Ledell Yu Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, and Luke Zettlemoyer. 2020. Zero-shot entity linking with dense entity retrieval. ArXiv, abs/1911.03814. Entity Encoding Subcomponent
  • 49.
    General Architecture ofa Neural Entity Ranking Component Reference: Figure 3 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey of Models based on Deep Learning. arXiv preprint arXiv:2006.00575. 3. Comparing Mention and Candidate Entity Representations Three parts
  • 50.
    ● Most ofthe state-of-the-art studies compare mention and entity representations using a dot product [1,2,3,4] or cosine similarity [5,6,7]. ● The calculated similarity score is often combined with mention-entity priors obtained during the candidate generation phase [1,3,6] or other features including various similarities, string matching indicator, and entity types [6,8,9,10]. ● Commonly an additional one or two-layer feedforward network [1,6,9] is used. The final disambiguation decision is inferred via a probability distribution, usually by a softmax function over the candidates. The local similarity score or a probability distribution can be further utilized for global scoring. References 1 Octavian-Eugen Ganea and Thomas Hofmann. 2017. Deep joint entity disambiguation with local neural attention. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2619–2629, Copenhagen, Denmark. ACL. 2 Nitish Gupta, Sameer Singh, and Dan Roth. 2017. Entity linking via joint encoding of types, descriptions, and context. In Proceedings of the 2017 EMNLP, pages 2681–2690, Copenhagen, Denmark. ACL. 3 Nikolaos Kolitsas, Octavian-Eugen Ganea, and Thomas Hofmann. 2018. End-to-end neural entity linking. In 22nd CoNLL, pages 519–529, Brussels, Belgium. ACL. 4 Matthew E. Peters, Mark Neumann, Robert Logan, Roy Schwartz, Vidur Joshi, Sameer Singh, and Noah A. Smith. 2019. Knowledge enhanced contextual word representations. In Proceedings of the 2019 EMNLP-IJCNLP, pages 43–54, Hong Kong, China. ACL. 5 Yaming Sun, Lei Lin, Duyu Tang, Nan Yang, Zhenzhou Ji, and XiaolongWang. 2015. Modeling mention, context and entity with neural networks for entity disambiguation. In Proceedings of the 24th International Conference on Artificial Intelligence, IJCAI’15, pages 1333–1339. AAAI Press. 6 Matthew Francis-Landau, Greg Durrett, and Dan Klein. 2016. Capturing semantic similarity for entity linking with convolutional neural networks. In Proceedings of the 2016 NAACL: Human Language Technologies, pages 1256–1261, San Diego, California, USA. 7 Daniel Gillick, Sayali Kulkarni, Larry Lansing, Alessandro Presta, Jason Baldridge, Eugene Ie, and Diego Garcia-Olano. 2019. Learning dense representations for entity retrieval. In Proceedings of the 23rd CoNLL, pages 528–537, Hong Kong, China. ACL. 8 Avirup Sil, Gourab Kundu, Radu Florian, and Wael Hamza. 2018. Neural cross-lingual entity linking. In 32 AAAI, New Orleans, Louisiana, USA. AAAI Press. 9 Hamed Shahbazi, Xiaoli Z Fern, Reza Ghaeini, Rasha Obeidat, and Prasad Tadepalli. 2019. Entity-aware elmo:Learning contextual entity representation for entity disambiguation. arXiv preprint arXiv:1908.05762. Comparing Mention and Candidate Entity Representations
  • 51.
    ● Optionally addressedin some systems ● Aim to equip EL systems to recognize cases when referent entities of some mentions can be absent in the KBs. This is known as NIL prediction. Unlinkable Mention Prediction
  • 52.
    ● Optionally addressedin some systems ● Aim to equip EL systems to recognize cases when referent entities of some mentions can be absent in the KBs. This is known as NIL prediction. ● Four common ways to perform NIL prediction. ○ a candidate generator does not yield any corresponding entities for a mention by setting a threshold for linking probability [1,2] ○ introduce an additional special ‘NIL’ entity in the ranking phase, so some models predict it as the best match for the mention [3] ○ train an additional binary classifier that accepts mention-entity pairs after the ranking phase, as well as several additional features (best linking score, whether mentions are also detected by a dedicated NER system, etc.), and makes the final decision about whether a mention is linkable or not [4,5]. References 1 Matthew E. Peters, Mark Neumann, Robert Logan, Roy Schwartz, Vidur Joshi, Sameer Singh, and Noah A. Smith. 2019. Knowledge enhanced contextual word representations. In Proceedings of the 2019 EMNLP-IJCNLP, pages 43–54, Hong Kong, China. Association for Computational Linguistics. 2 Nevena Lazic, Amarnag Subramanya, Michael Ringgaard, and Fernando Pereira. 2015. Plato: A selective context model for entity resolution. TACL, 3:503–515. 3 Nikolaos Kolitsas, Octavian-Eugen Ganea, and Thomas Hofmann. 2018. End-to-end neural entity linking. In 22nd CoNLL, pages 519–529, Brussels, Belgium. ACL. 4 Jose G. Moreno, Romaric Besanc¸on, Romain Beaumont, Eva D’hondt, Anne-Laure Ligozat, Sophie Rosset, Xavier Tannier, and Brigitte Grau. 2017. Combining word and entity embeddings for entity linking. In Extended Semantic Web Conference (1), volume 10249 of Lecture Notes in Computer Science, pages 337–352. 5 Pedro Henrique Martins, Zita Marinho, and Andr´e F. T. Martins. 2019. Joint learning of named entity recognition and entity linking. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 190–196, Florence, Italy. Association for Computational Linguistics. Unlinkable Mention Prediction
  • 53.
    (Neural) Approaches toEntity Linking: General Architecture since 2015 Reference: Figure 2 in Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., & Biemann, C. (2020). Neural Entity Linking: A Survey of Models based on Deep Learning. arXiv preprint arXiv:2006.00575. possible entities are produced for the mention context/mention - candidate similarity score is computed through the representations
  • 54.
    Modifications of theGeneral Architecture: ● Joint Entity Recognition and Disambiguation Architectures ○ Observe that interaction between recognition and disambiguation is beneficial to improve overall model ■ E.g., multi-task learning framework that integrates recognition and linking [1] ● Global Context Architectures ○ global EL seen as sequential decision task where disambiguation of new entities is based on the already disambiguated ones ■ E.g., apply LSTM to be able to maintain long term memory for previous decisions [2] ● Cross-lingual Architectures ○ leverage supervision signals from multiple languages for training a model in a target language ■ E.g., the inter-lingual links in Wikipedia utilized for alignment of entities in multiple languages. With this alignment, the annotated data from high-resource languages like English can help to improve the quality of text processing for the low-resource ones [3] References 1 Pedro Henrique Martins, Zita Marinho, and Andr´e F. T. Martins. 2019. Joint learning of named entity recognition and entity linking. In 57th ACL: Student Research Workshop, pages 190–196, Florence, Italy. ACL. 2 Zheng Fang, Yanan Cao, Qian Li, Dongjie Zhang, Zhenyu Zhang, and Yanbing Liu. 2019. Joint entity linking with deep reinforcement learning. In The World Wide Web Conference, WWW ’19, pages 438–447, New York, NY, USA. ACM. 3 Heng Ji, Joel Nothman, Ben Hachey, and Radu Florian. 2015. Overview of TAC-KBP2015 tri-lingual entity discovery and linking. In Proceedings of the 2015 Text Analysis Conference, TAC 2015, pages 16–17, Gaithersburg, Maryland, USA. NIST. (Neural) Approaches to Entity Linking: General Architecture Modifications
  • 55.
    Plan for PartI of II of the Talk ● Datasets & Knowledge Bases ● (Neural) Approaches ○ since 2015 ● Evaluations
  • 56.
    Results are describedin terms of accuracy and micro F1 scores Evaluations: Metrics
  • 57.
    References Ikuya Yamada, KokiWashio, Hiroyuki Shindo, and Yuji Matsumoto. 2020. Global entity disambiguation with pretrained contextualized embeddings of words and entities. arXiv preprint arXiv:1909.00426v2. Nikolaos Kolitsas, Octavian-Eugen Ganea, and Thomas Hofmann. 2018. End-to-end neural entity linking. In 22nd CoNNL, pages 519–529, Brussels, Belgium. ACL. Ledell Yu Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, and Luke Zettlemoyer. 2020. Zero-shot entity linking with dense entity retrieval. ArXiv, abs/1911.03814. Evaluations: Entity Linking Results Dataset Accuracy Micro F1 System AIDA 0.950 - Yamada et al. (2020) - 0.824 Kolitsas et al. (2018) KBP’10 0.940 - Wu et al. (2020) MSNBC - 0.963 Yamada et al. (2020) AQUAINT - 0.935 Yamada et al. (2020) ACE-2004 - 0.919 Yamada et al. (2020) CWEB - 0.789 Yamada et al. (2020) WW - 0.891 Yamada et al. (2020)
  • 58.
    Open Source Toolsand Resources for Entity Linking Year System name NER NED URL 2010 Tagme Y Y https://tagme.d4science.org/tagme/ 2011 DBpedia Spotlight Y Y https://www.dbpedia-spotlight.org/ 2011 AIDA Y Y https://gate.d5.mpi-inf.mpg.de/webaida/ 2013 TwitIE Y Y https://gate.ac.uk/wiki/twitie.html 2014 Babelfy Y Y http://babelfy.org/ 2014 Stanford CoreNLP Y N https://stanfordnlp.github.io/CoreNLP/ 2015 SpaCy Y N https://spacy.io/ since 2015 neural models - - https://paperswithcode.com/task/entity-linking
  • 59.
    (I) Entity Linkingand (II) KG Completion Jennifer D’Souza Technische Informationsbibliothek (TIB) Welfengarten 1B // 30167 Hannover
  • 60.
    ● Involves theconstruction of Knowledge Graphs (KG) from unstructured text and other structured or semi-structured sources. ○ Core tasks are relation and entity extraction Knowledge Acquisition A KG is typically a multi-relational graph containing entities as nodes and relations as edges. Each edge is represented as a triplet (head entity, relation, tail entity) ((h; r; t) for short), indicating the relation between two entities, e.g., (Albert Einstein, WinnerOf, Nobel Prize in Physics).
  • 61.
    ● Involves theconstruction of Knowledge Graphs (KG) from unstructured text and other structured or semi-structured sources. ○ Core tasks are relation and entity extraction ● Powered by KGs, many real-world applications such as recommendation systems and question answering has seen significant progress with the their new capacity for commonsense understanding and reasoning. ○ Search powered by Google’s Knowledge Graph Knowledge Acquisition
  • 62.
    ● Involves theconstruction of Knowledge Graphs (KG) from unstructured text and other structured or semi-structured sources. ○ Core tasks are relation and entity extraction ● Powered by KGs, many real-world applications such as recommendation systems and question answering has seen significant progress with the their new capacity for commonsense understanding and reasoning. ○ Given knowledge: (Male,gender,Y) and (X,hasChild,Y) Knowledge Acquisition
  • 63.
    ● Involves theconstruction of Knowledge Graphs (KG) from unstructured text and other structured or semi-structured sources. ○ Core tasks are relation and entity extraction ● Powered by KGs, many real-world applications such as recommendation systems and question answering has seen significant progress with the their new capacity for commonsense understanding and reasoning. ○ Given knowledge: (Y,gender,Male) and (X,hasChild,Y) Then, inferences such as (Y,sonOf,X) are possible. Knowledge Acquisition
  • 64.
    ● Also involvescompleting an existing knowledge graph, and other entity-oriented acquisition tasks such as entity resolution and alignment. ● Thus, the main tasks of knowledge acquisition include relation extraction to convert unstructured text to structured knowledge, knowledge graph completion (KGC), and other entity-oriented acquisition tasks such as entity recognition and entity alignment. ○ KGC and relation extraction can be treated jointly. Han et al. [1] proposed a joint learning framework with mutual attention for data fusion between knowledge graphs and text, which solves both KGC and relation extraction from text. References 1 X. Han, Z. Liu, and M. Sun, “Neural knowledge acquisition via mutual attention between knowledge graph and text,” in AAAI, 2018, pp. 4832–4839. Knowledge Acquisition
  • 65.
    Knowledge Graph Completion ●Knowledge Graphs constructed from unstructured text or acquired from other sources are by nature incomplete. ○ Why? Created at scale from millions of documents or at Web scale they are easily amenable to noise. A example of an incomplete Knowledge Graph with a missing relation Img src: https://towardsdatascience.com/embedding-models-for-knowledge-graph-completion-a66d4c01d588
  • 66.
    Knowledge Graph Completion ●Knowledge Graphs constructed from unstructured text or acquired from other sources are by nature incomplete. ○ Why? Created at scale from millions of documents or at Web scale they are easily amenable to noise. is a university in A example of an incomplete Knowledge Graph with a missing relation Img src: https://towardsdatascience.com/embedding-models-for-knowledge-graph-completion-a66d4c01d588
  • 67.
    The Knowledge GraphCompletion Task Given a KG having edges specified with a triplet of elements (h, r, t) ∈ E × R × E where the head (h) and the tail (t) entities are elements of E and r is a type of relation of R. Note relations can be directed. Formally, we define KGC as the task that tries to predict any missing element of the triplet (h, r, t). In particular, we talk about: ● link (entity) prediction when an element between h or t is missing ((?, r, t) or (h, r, ?)); ● relation prediction when r is missing (h, ?, t)
  • 68.
    The Knowledge GraphCompletion Task Formally, we define KGC as the task that tries to predict any missing element of the triplet (h, r, t). In particular, we talk about: ● link (entity) prediction when an element between h or t is missing ((?, r, t) or (h, r, ?)); ● relation prediction when r is missing (h, ?, t)
  • 69.
    The Knowledge GraphCompletion Task Formally, we define KGC as the task that tries to predict any missing element of the triplet (h, r, t). In particular, we talk about: ● link (entity) prediction when an element between h or t is missing ((?, r, t) or (h, r, ?)); ● relation prediction when r is missing (h, ?, t); ● Aside: triplet classification when an algorithm recognizes whether a given triplet (h, r, t) is correct or not.
  • 70.
    Knowledge Graph Completion ●challenging because: ○ it is not trivial to create a KG; ○ every entity could have a variable number of attributes (non-unique specification); ○ R could contain different types of relation (multi-layer network, hierarchical network); ○ a KG changes over time (evolution over time).
  • 71.
    Plan for PartII of II of the Talk ● Approaches ● Datasets and Toolkits
  • 72.
    1. Embedding-based (ranking)methods ○ involves learning low-dimensional embeddings, i.e. adopting Knowledge Graph Embedding (KGE) method used originally for triple prediction 2. Relational path reasoning ○ Embedding-based methods however failed to capture multi-step relationships. ○ Relational path reasoning methods explore multi-step relation paths ○ The two approaches below also have the same information capture paradigm but incorporating logical rules 3. Logical rule reasoning 4. Meta relational learning Approaches to Knowledge Graph Completion
  • 73.
    1. Embedding-based (ranking)methods ○ For the link prediction KGC task, i.e. for the KGC task with triples (h, r, t) with h or t missing: ■ learn embedding vectors based on existing triples: ● during test, the missing h or t entity is predicted from the existing set E of entities in the KG; ● during training, triple instances are created by replacing h or t with each entity in E, scores are calculated of all candidate entities, and the top k entities are ranked. ■ all Knowledge Graph Embedding methods that represent inputs and candidates in a unified embedding space are applicable. E.g., TransE [1], TransH [2], TransR [3], HolE [4]. References 1 A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston, and O. Yakhnenko, “Translating embeddings for modeling multi-relational data,” in NIPS, 2013, pp. 2787–2795. 2 Z. Wang, J. Zhang, J. Feng, and Z. Chen, “Knowledge graph embedding by translating on hyperplanes,” in AAAI, 2014, pp. 1112–1119. 3 Y. Lin, Z. Liu, M. Sun, Y. Liu, and X. Zhu, “Learning entity and relation embeddings for knowledge graph completion,” in AAAI, 2015, pp. 2181–2187. 4 M. Nickel, L. Rosasco, and T. Poggio, “Holographic embeddings of knowledge graphs,” in AAAI, 2016, pp. 1955–1961. Approaches to Knowledge Graph Completion
  • 74.
    1. Embedding-based (ranking)methods ○ For the link prediction KGC task, i.e. for the KGC task with triples (h, r, t) with h or t missing: ■ TransE model [Bordes et al., 2013] References 1 A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston, and O. Yakhnenko, “Translating embeddings for modeling multi-relational data,” in NIPS, 2013, pp. 2787–2795. Approaches to Knowledge Graph Completion
  • 75.
    1. Embedding-based (ranking)methods ○ For the link prediction KGC task, i.e. for the KGC task with triples (h, r, t) with h or t missing: ■ learn embedding vectors based on existing triples: ● during test, the missing h or t entity is predicted from the existing set E of entities in the KG; ● during training, triple instances are created by replacing h or t with each entity in E, scores are calculated of all candidate entities, and the top k entities are ranked. ■ Unlike representing inputs and candidates in a unified embedding space, ProjE [1] proposes a combined embedding by space projection of the known parts of input triples, i.e., (h; r; ?) or (?; r; t), and the candidate entities with the candidate-entity matrix Wc belongs to Rsxd, where s is the number of candidate entities. Their embedding projection function includes a neural combination layer and a output projection layer. References 1 Shi and T. Weninger, “ProjE: Embedding projection for knowledge graph completion,” in AAAI, 2017, pp. 1236–1242. Approaches to Knowledge Graph Completion
  • 76.
    1. Embedding-based (ranking)methods ○ For the link prediction KGC task, i.e. for the KGC task with triples (h, r, t) with h or t missing: ■ learn embedding vectors based on existing triples: ● during test, the missing h or t entity is predicted from the existing set E of entities in the KG; ● during training, triple instances are created by replacing h or t with each entity in E, scores are calculated of all candidate entities, and the top k entities are ranked. ■ ConMask [1] proposes relationship-dependent content masking over the entity description to select relevant snippets of given relations, and CNN-based target fusion to complete the knowledge graph. It can only make a prediction when query relations and entities are explicitly expressed in the text description. References 1 B. Shi and T. Weninger, “Open-world knowledge graph completion,” in AAAI, 2018, pp. 1957–1964. Approaches to Knowledge Graph Completion
  • 77.
    2. Relation pathreasoning ○ A limitation of the embedding based method is that they do not model complex relation paths. E.g. one-to-many, or many-to-many relations ■ Relation path reasoning leverages path information over the graph structure. Approaches to Knowledge Graph Completion
  • 78.
    2. Relation pathreasoning ○ A limitation of the embedding based method is that they do not model complex relation paths. ■ Relation path reasoning leverages path information over the graph structure. ○ Random walk inference has been investigated. ■ E.g., the Path-Ranking Algorithm (PRA) [1] chooses a relational path under a combination of path constraints and conducts maximum-likelihood classification. ○ Neural multi-hop relational path modeling is also studied. ■ Neelakantan et al. [2] models complex relation paths by applying compositionality recursively over the relations in the path as depicted in the figure below. References 1 N. Lao and W. W. Cohen, “Relational retrieval using a combination of path-constrained random walks,” Machine learning, vol. 81, no. 1, pp. 53–67, 2010. 2 A. Neelakantan, B. Roth, and A. McCallum, “Compositional vector space models for knowledge base completion,” in ACL-IJCNLP, vol. 1, 2015, pp. 156–166. Approaches to Knowledge Graph Completion
  • 79.
    2. Relation pathreasoning ○ Chains-of-Reasoning [1], a neural attention mechanism to enable multiple reasons, represents logical composition across all relations, entities, and text. ○ DIVA [2] proposes a unified variational inference framework that takes multi-hop reasoning as two sub-steps of path-finding (a prior distribution for underlying path inference) and path-reasoning (a likelihood for link classification). References 1 R. Das, A. Neelakantan, D. Belanger, and A. McCallum, “Chains of reasoning over entities, relations, and text using recurrent neural networks,” in EACL, vol. 1, 2017, pp. 132–141. 2 W. Chen, W. Xiong, X. Yan, and W. Y. Wang, “Variational knowledge graph reasoning,” in NAACL, 2018, pp. 1823–1832. Approaches to Knowledge Graph Completion
  • 80.
    2. Reinforcement-learning basedpath finding ○ Deep reinforcement learning (RL) is introduced for multi-hop reasoning by formulating path-finding between entity pairs as sequential decision making, specifically a Markov decision process (MDP). The policy-based RL agent learns to find a step of relation to extending the reasoning paths via the interaction between the knowledge graph environment, where the policy gradient is utilized for training RL agents. ■ KGC based on RL concepts of State, Action, Reward, and Policy Network ○ DeepPath [1] firstly applies RL into relational path learning and develops a novel reward function to improve accuracy, path diversity, and path efficiency. It encodes states in the continuous space via a translational embedding method and takes the relation space as its action space. ○ Similarly, MINERVA [2] takes path walking to the correct answer entity as a sequential optimization problem by maximizing the expected reward. It excludes the target answer entity and provides more capable inference. References 1 W. Xiong, T. Hoang, and W. Y. Wang, “DeepPath: A reinforcement learning method for knowledge graph reasoning,” in EMNLP, 2017, pp. 564–573. 2 R. Das, S. Dhuliawala, M. Zaheer, L. Vilnis, I. Durugkar, A. Krishnamurthy, A. Smola, and A. McCallum, “Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning,” in ICLR, 2018, pp. 1–18. Approaches to Knowledge Graph Completion
  • 81.
    2. Reinforcement-learning basedpath finding ○ Instead of using a binary reward function, MultiHop [1] proposes a soft reward mechanism. Action dropout is also adopted to mask some outgoing edges during training to enable more effective path exploration. ○ M-Walk [2] applies an RNN controller to capture the historical trajectory and uses the Monte Carlo Tree Search (MCTS) for effective path generation. ○ Leveraging text corpus with the sentence bag of current entity denoted as bet , CPL [3] proposes collaborative policy learning for pathfinding and fact extraction from text. ○ For the policy networks, DeepPath uses fully-connected network, the extractor of CPL employs CNN, while the rest uses recurrent networks. References 1 X. V. Lin, R. Socher, and C. Xiong, “Multi-hop knowledge graph reasoning with reward shaping,” in EMNLP, 2018, pp. 3243–3253. 2 Y. Shen, J. Chen, P.-S. Huang, Y. Guo, and J. Gao, “M-Walk: Learning to walk over graphs using monte carlo tree search,” in NeurIPS, 2018, pp. 6786–6797. 3 C. Fu, T. Chen, M. Qu, W. Jin, and X. Ren, “Collaborative policy learning for open knowledge graph reasoning,” in EMNLP, 2019, pp. 2672–2681. Approaches to Knowledge Graph Completion
  • 82.
    3. Rule-based Reasoning ○Another direction for Knowledge Graph Completion ■ making use of the symbolic nature of knowledge is logical rule learning ○ E.g., the inference rule: (Y; sonOf; X) <-- (X; hasChild; Y) ^ (Y; gender; Male), where the relation ‘sonOf’ did not exist earlier. ■ Logical rules can been extracted by rule mining tools like AMIE [1] ○ RLvLR [2] proposes a scalable rule mining approach with efficient rule searching and pruning, and uses the extracted rules for relation prediction. References 1 L. A. Gal´arraga, C. Teflioudi, K. Hose, and F. Suchanek, “AMIE: association rule mining under incomplete evidence in ontological knowledge bases,” in WWW, 2013, pp. 413–422. 2 P. G. Omran, K. Wang, and Z. Wang, “An embedding-based approach to rule learning in knowledge graphs,” IEEE TKDE, pp. 1–12, 2019. Approaches to Knowledge Graph Completion
3. Rule-based Reasoning
○ A different line of research on this topic focuses on injecting logical rules into embeddings to improve reasoning, for example via joint learning that incorporates first-order logic rules.
■ E.g., KALE [1] proposes a unified joint model in which t-norm fuzzy logical connectives are defined for embedding compatible triples and logical rules.
■ Specifically, compositions for logical conjunction, disjunction, and negation are defined to compute the truth value of a complex formula (see the sketch below).
References
1 S. Guo, Q. Wang, L. Wang, B. Wang, and L. Guo, "Jointly embedding knowledge graphs and logical rules," in EMNLP, 2016, pp. 192–202.
Approaches to Knowledge Graph Completion
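A minimal sketch of product t-norm fuzzy connectives in the spirit of KALE: each triple is assigned a soft truth value in [0, 1] (KALE derives it from a translational embedding score; the values below are placeholder assumptions), and the truth of a rule is composed from these connectives.

```python
def t_and(a, b):    # fuzzy conjunction (product t-norm)
    return a * b

def t_or(a, b):     # fuzzy disjunction
    return a + b - a * b

def t_not(a):       # fuzzy negation
    return 1.0 - a

# Truth of the rule body (X, hasChild, Y) ^ (Y, gender, Male), and of the implication
# body -> head, modeled as (not body) or head.
truth_hasChild, truth_gender, truth_sonOf = 0.9, 0.8, 0.4
body = t_and(truth_hasChild, truth_gender)
rule_truth = t_or(t_not(body), truth_sonOf)
print(f"body={body:.2f}, rule={rule_truth:.2f}")
# A rule truth below 1 penalizes the embedding during joint training,
# pushing the score of the head triple (Y, sonOf, X) upward.
```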
4. Meta Relational Learning
○ In the real world, knowledge is dynamic: previously unseen relations and triples are continually acquired.
○ This setting is called meta relational learning or few-shot relational learning.
■ It requires models to predict new relational facts from only very few samples.
○ GMatching [1] develops a metric-based few-shot learning method with entity embeddings and local graph structures (a simplified sketch follows below).
■ It encodes one-hop neighbors to capture the structural information with R-GCN and then takes the structural entity embedding for multi-step matching guided by long short-term memory (LSTM) networks to calculate the similarity scores.
References
1 W. Xiong, M. Yu, S. Chang, X. Guo, and W. Y. Wang, "One-shot relational learning for knowledge graphs," in EMNLP, 2018, pp. 1980–1990.
Approaches to Knowledge Graph Completion
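A highly simplified sketch of metric-based few-shot link prediction in the spirit of GMatching: an entity pair is represented by concatenating entity embeddings (GMatching additionally encodes one-hop neighbours and uses LSTM-guided multi-step matching), and query pairs are ranked by cosine similarity to the single support pair of the new relation. All embeddings below are random placeholders, i.e. assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16
entities = ["e1", "e2", "e3", "e4"]
emb = {e: rng.normal(size=dim) for e in entities}   # placeholder entity embeddings

def pair_repr(h, t):
    """Represent a candidate (head, tail) pair as one vector."""
    return np.concatenate([emb[h], emb[t]])

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

support = ("e1", "e2")                       # the single known fact for a new relation
queries = [("e3", "e4"), ("e1", "e3"), ("e2", "e4")]

scores = {q: cosine(pair_repr(*support), pair_repr(*q)) for q in queries}
for q, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(q, round(s, 3))                    # highest score = most plausible new fact
```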
Plan for Part II of II of the Talk ● Approaches ● Datasets and Toolkits
Datasets
Datasets for Tasks on Knowledge Graphs
Dataset    | Original Data | # Rel. | # Ent.    | # Train    | # Valid. | # Test
WN18       | WordNet       | 18     | 40,943    | 141,442    | 5,000    | 5,000
FB15K      | Freebase      | 1,345  | 14,951    | 483,142    | 50,000   | 59,071
WN11       | WordNet       | 11     | 38,696    | 112,581    | 2,609    | 10,544
FB13       | Freebase      | 13     | 75,043    | 316,232    | 5,908    | 23,733
WN18RR     | WordNet       | 11     | 40,943    | 86,835     | 3,034    | 3,134
FB15k-237  | Freebase      | 237    | 14,541    | 272,115    | 17,535   | 20,466
FB5M       | Freebase      | 1,192  | 5,385,322 | 19,193,556 | 50,000   | 59,071
FB40K      | Freebase      | 1,336  | 39,528    | 370,648    | 67,946   | 96,678
A popular way of generating task-specific datasets is to sample subsets from large general datasets (a minimal loading sketch follows below).
Reference: Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P. S. (2020). A survey on knowledge graphs: Representation, acquisition and applications. arXiv preprint arXiv:2002.00388.
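These benchmarks are usually distributed as plain-text splits with one triple per line. The sketch below assumes a tab-separated file in (head, relation, tail) order; some releases use (head, tail, relation) instead, so check the specific download. The file path is hypothetical.

```python
def load_triples(path):
    """Load one split (train/valid/test) of a KG benchmark as a list of triples."""
    triples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            head, relation, tail = line.rstrip("\n").split("\t")
            triples.append((head, relation, tail))
    return triples

# Example (hypothetical local path):
# train = load_triples("WN18RR/train.txt")
# print(len(train), train[0])
```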
Datasets (table as on the previous slide)
● E.g., the WN-prefixed datasets are subsets of the WordNet knowledge base.
Datasets (table as above)
● WordNet is designed to produce an intuitively usable dictionary and thesaurus, and to support automatic text analysis.
● Its entities (termed synsets) correspond to word senses, and relationships define lexical relations between them. Example triples are (score_NN_1, hypernym, evaluation_NN_1) and (score_NN_2, has_part, musical_notation_NN_1).
Datasets (table as above)
● On the other hand, the FB-prefixed datasets are subsets of the Freebase knowledge base.
Datasets (table as above)
● Freebase was a huge KB of general facts; when these benchmarks were built it contained around 1.2 billion triples and more than 80 million entities.
Datasets (table as above)
● Freebase was a huge KB of general facts; when these benchmarks were built it contained around 1.2 billion triples and more than 80 million entities.
● The small dataset (FB15K) was built by selecting the subset of entities that are also present in the Wikilinks database and that have at least 100 mentions in Freebase (for both entities and relationships).
Datasets (table as above)
● The large-scale dataset (FB5M) was created by selecting the 5 million most frequently occurring entities in Freebase.
Datasets (table as above)
● The datasets WN18 and FB15k suffer from test set leakage through inverse relations: a large number of test triples can be obtained simply by inverting triples from the training set.
Datasets (table as above)
● FB15k-237 was then introduced: a subset of FB15k from which inverse relations were removed (WN18RR plays the analogous role for WN18). A simple leakage check is sketched below.
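A minimal sketch for flagging possible inverse-relation leakage: a test triple (h, r, t) is suspicious if the training set contains some triple (t, r', h) over the reversed entity pair. This is only a crude proxy (the filtered benchmarks additionally check how systematically r and r' invert each other); the toy triples are assumptions.

```python
def inverse_leakage(train, test):
    """Return test triples whose reversed entity pair occurs in training, with the relations seen there."""
    reversed_pairs = {}
    for h, r, t in train:
        reversed_pairs.setdefault((t, h), set()).add(r)
    leaks = []
    for h, r, t in test:
        if (h, t) in reversed_pairs:
            leaks.append(((h, r, t), reversed_pairs[(h, t)]))
    return leaks

train = [("a", "hasChild", "b")]
test = [("b", "childOf", "a")]
print(inverse_leakage(train, test))   # flags ('b', 'childOf', 'a'), invertible via 'hasChild'
```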
Toolkits
Table: Summary of Knowledge Graph Building Technology as Open Source Libraries
Task     | Library     | Language   | URL
General  | Grakn       | Python     | github.com/graknlabs/kglib
General  | AmpliGraph  | TensorFlow | github.com/Accenture/AmpliGraph
General  | GraphVite   | Python     | graphvite.io
Database | Akutan      | Go         | github.com/eBay/akutan
KRL      | OpenKE      | PyTorch    | github.com/thunlp/OpenKE
KRL      | Fast-TransX | C++        | github.com/thunlp/Fast-TransX
KRL      | scikit-kge  | Python     | github.com/mnick/scikit-kge
KRL      | LibKGE      | PyTorch    | github.com/uma-pi1/kge
KRL      | PyKEEN      | Python     | github.com/SmartDataAnalytics/PyKEEN
RE       | OpenNRE     | PyTorch    | github.com/thunlp/OpenNRE
Reference: Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P. S. (2020). A survey on knowledge graphs: Representation, acquisition and applications. arXiv preprint arXiv:2002.00388.
Toolkits (table as on the previous slide)
● AmpliGraph for knowledge representation learning
Toolkits (table as above)
● Akutan for knowledge graph storage and querying
Toolkits (table as above)
● Three examples of useful toolkits released by the research community:
○ scikit-kge and OpenKE for knowledge graph embedding
Toolkits (table as above)
● Three examples of useful toolkits released by the research community:
○ OpenNRE for relation extraction
● Getting started with one of these libraries can be quite compact; a minimal sketch follows below.
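A hedged sketch using PyKEEN's pipeline API (following its documented quickstart) to train a TransE model on FB15k-237. Exact argument names and dataset identifiers can differ across PyKEEN versions, so treat this as illustrative rather than a pinned recipe; the output directory is hypothetical.

```python
from pykeen.pipeline import pipeline

# Train a small TransE model; 'nations' is a tiny built-in dataset if fb15k237 is too large to download.
result = pipeline(
    model="TransE",
    dataset="fb15k237",
    training_kwargs=dict(num_epochs=5),   # deliberately short run just to demonstrate the API
)
result.save_to_directory("transe_fb15k237")   # hypothetical output directory with model and metrics
```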
Conclusion: Takeaways
● Entity Linking is a long-researched topic in the NLP community.
● Neural models have enabled systems to cross the 95% performance barrier for the task.
● Knowledge Graph Completion is an active research area and, in its neural formulations, relatively new.
○ It uses machine learning and neural networks to 'vectorize' entities and relationships.
● Implementations can be slow, but recently this has started to change.
Happy to take questions. Thank you for your attention!