Two Approaches to Factor Time into Word and Entity Representations Learned from Text (Federico Bianchi)
Time is a crucial factor when dealing with distributional models of language and knowledge. For example, tracking word meaning shift and entity evolution has several applications, and time may sneak into similarity computations with these models in ways that are difficult to control. In this presentation, we discuss two novel approaches to factor time into word and knowledge representations learned from text: explicit, with representations of temporal references (e.g., years, days, etc.), and implicit, with time-dependent representations of words and entities (e.g., amazon_1975 vs. amazon_2012). Finally, as this is an emerging field of research, we discuss several open topics in this research domain.
FBK, Trento, 10/5/2019
Type Vector Representations from Text. DL4KGS@ESWC 2018 (Federico Bianchi)
Type Vector Representations from Text: An Empirical Analysis. For the Workshop on Deep Learning for Knowledge Graphs and Semantic Technologies (DL4KGS).
Held in conjunction with ESWC 18 in June 2018 in Crete, Greece.
Type Embeddings, Ontology Matching, Type Similarity
Evaluation Initiatives for Entity-oriented Search (krisztianbalog)
This document discusses evaluation initiatives for entity-oriented search tasks. It describes several shared tasks and evaluation campaigns that have been held at conferences like TREC, CLEF, and INEX to provide standardized test collections, gold standard annotations, and evaluation metrics for tasks like ad-hoc entity retrieval from knowledge bases. Examples of test collections created for entity retrieval include datasets using Wikipedia, DBpedia, and a web crawl. The document also discusses related entity finding, entity linking, and understanding keyword queries by associating terms with entities.
The document discusses relation extraction and information extraction tasks. It describes named entity extraction, binary and n-ary relation extraction, and other tasks like event extraction. It also discusses early information extraction evaluations like MUC which focused on filling templates and recognized named entities, relations, and coreference resolution. The document outlines the components of a semantic model used for information extraction, including entities, relations, attributes, and events.
Temporal models for mining, ranking and recommendation in the Web (Tu Nguyen)
This document discusses temporal models for mining, ranking, and recommendation on the web through the lens of time. It outlines research questions about how relevant aspects of entity-centric queries change around associated event times, and how to form ranked lists of documents to maximize coverage for ambiguous queries. The document motivates these questions using examples and discusses approaches that include time and type classification, joint learning in a cascaded manner, and multi-criteria learning models. It describes datasets, comparison methods, experiments analyzing feature performance for different event types and times, and lessons learned.
This document summarizes a study analyzing the content and connections between 1,509 top U.S. political blogs between 2012 and 2016. The study used text mining techniques like removing stop words, tokenizing words, creating n-grams, and applying topic models like LDA to analyze common topics and identify cliques of blogs discussing similar issues. For example, analysis of blogs discussing the Trayvon Martin case found topics related to the court case, political aspects, social functions of blogging, facts of the story, and racism. Topic weights were used to predict link formation and spread of ideas between blogs over time.
A presentation on the value and the risks of identifying, mining, and visualizing data. All this is described in a big-data-aware setting. The presentation is meant for a wide audience, not requiring deep technical background.
The original presentation was done within the KAS Seminar on Data Journalism in Dec 2017.
This document summarizes a presentation on analyzing political communication data from Twitter. It discusses analyzing the structure of Twitter data by examining things like retweet networks and interaction patterns, versus analyzing the content of tweets by looking at topics, sentiments, and word frequencies. It provides examples of studies that take both structural and content-based approaches. Specifically, it examines studies that analyzed how Twitter discussions relate to televised political debates and who engages in uncivil language online. The presentation concludes that the most insightful approach is often to combine structural and content-based analyses.
We present a framework that combines machine-learnt classifiers and taxonomies of topics to enable a more conceptual analysis of a corpus than can be accomplished using Vector Space Models and Latent Dirichlet Allocation-based topic models, which represent documents purely in terms of words. Given a corpus and a taxonomy of topics, we learn a classifier per topic and annotate each document with the topics covered by it. The distribution of topics in the corpus can then be visualized as a function of the attributes of the documents. We apply this framework to the US State of the Union and presidential election speeches to observe how topics such as jobs and employment have evolved from being relatively unimportant to being the most discussed topic. We show that our framework is better than Vector Space Models and a Latent Dirichlet Allocation-based topic model for performing certain kinds of analysis.
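The per-topic annotation idea described above can be sketched in a few lines; the keyword-based "classifiers", topics, and documents below are illustrative stand-ins for the learnt classifiers and taxonomy in the paper:

```python
# Sketch: annotate documents with taxonomy topics via one classifier per topic.
# The topics, keywords, and documents here are invented for illustration.
from collections import Counter

# One trivial, keyword-based "classifier" per topic in the taxonomy.
TOPIC_CLASSIFIERS = {
    "jobs_and_employment": lambda doc: any(w in doc.split() for w in ("jobs", "employment", "wages")),
    "foreign_policy": lambda doc: any(w in doc.split() for w in ("treaty", "allies", "war")),
}

def annotate(documents):
    """Tag each document with every topic whose classifier fires."""
    return [
        {topic for topic, clf in TOPIC_CLASSIFIERS.items() if clf(doc)}
        for doc in documents
    ]

def topic_distribution(annotations):
    """Corpus-level topic counts, e.g. to plot as a function of year."""
    return Counter(t for tags in annotations for t in tags)

docs = ["we must create jobs and raise wages", "a new treaty with our allies"]
tags = annotate(docs)
print(topic_distribution(tags))
```

Plotting these counts against a document attribute such as the year of the speech gives the kind of topic-evolution view the abstract describes.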
Visualising data: Seeing is Believing - CS Forum 2012Richard Ingram
When patterns and connections are revealed between numbers, content and people that might otherwise be too abstract or scattered to be grasped, we’re able to make better sense of where we are, what it might mean and what needs to be done.
Surfacing Real-World Event Content on Twitter (Hila Becker)
The document describes a method for identifying real-world events from Twitter data in real-time using an unsupervised machine learning approach. Tweets are clustered based on textual similarity and metadata like time and location. Cluster-level features are then extracted and used to train a classifier to identify event clusters. Key steps include content representation, incremental clustering of tweets, feature extraction from clusters, building an event classifier, and selecting tweets to display for each identified event. The goal is to surface real-world event content from Twitter in an unsupervised and automated fashion.
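The incremental clustering step described above can be sketched with a toy clusterer; the Jaccard similarity and the threshold are illustrative stand-ins for the content representation and learned features in the actual method:

```python
# Sketch of incremental tweet clustering by textual similarity; the
# similarity measure and threshold are illustrative, not the paper's.

def jaccard(a, b):
    """Token-set overlap between two tweets."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_incrementally(tweets, threshold=0.3):
    """Assign each tweet to the most similar existing cluster, or start a new one."""
    clusters = []  # each cluster is a list of tweets
    for tweet in tweets:
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = max(jaccard(tweet, t) for t in cluster)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best.append(tweet)
        else:
            clusters.append([tweet])
    return clusters

tweets = [
    "earthquake hits the city downtown",
    "huge earthquake in the city downtown",
    "new phone launch event today",
]
print(len(cluster_incrementally(tweets)))  # 2 clusters
```

In the full pipeline, cluster-level features would then be extracted from each resulting cluster and fed to the event classifier.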
Tutorial semantic wikis and applications (Mark Greaves)
This document outlines an agenda for a tutorial on semantic wikis and applications. The tutorial will include introductions to Semantic MediaWiki, diving deeper into its features, applications of semantic wikis, extensions for Semantic MediaWiki developed by various contributors, connecting Semantic MediaWiki with MS Office, augmenting it with a triple store, discussing future development, and concluding with a question and answer session, followed by a 30 minute break.
The document discusses knowledge graphs and their future directions. It summarizes a panel discussion on knowledge graphs at ESWC 2020 and references several papers on industry-scale knowledge graphs, weak supervision for knowledge graph construction, and representing entities and identities in knowledge bases. It concludes that knowledge graph construction involves complex pipelines with many components and calls for an updated theory of knowledge engineering to address the demands of modern knowledge graphs at large scale and with continuous changes.
Mining Social Media with Linked Open Data, Entity Recognition, and Event Extraction (Leon Derczynski)
Presented at the 4th DEOS workshop, http://diadem.cs.ox.ac.uk/deos13/
Social media presents itself as a context-rich source of big data, readily exhibiting volume, velocity and variety. Mining information from microblogs and other social media is a challenging, emerging research area. Unlike carefully authored news text and other longer content, social media text poses a number of new challenges, due to the short, noisy, context-dependent, and dynamic nature.
This talk will first discuss how Linked Open Data (LOD) vocabularies (namely DBpedia and YAGO) have been used to help entity recognition and disambiguation in such content. We will introduce LODIE, the LOD-based extension of the widely used ANNIE open-source entity recognition system. LODIE also includes entity disambiguation (covering products as well as names of persons, locations, and organisations) and has been developed as part of the TrendMiner and uComp projects. Quantitative evaluation results will be shown, including a comparison against other state-of-the-art methods and an analysis of how errors in upstream linguistic pre-processing (i.e. tokenisation and POS tagging) can affect disambiguation performance. Our results demonstrate the importance of adjusting approaches for this genre.
The second half of the talk will focus on fine-grained events in tweets. Awareness of temporal context in social media enables many interesting applications. We identify events using the TimeML schema, focusing on occurrences and actions. Challenges of event annotation will be discussed, as well as the development of a supervised event extractor specifically for social media. We evaluate this against traditional event annotation approaches (e.g. Evita, TIPSem).
The document discusses the evolution of the internet and communication technologies over time. It describes how the cost of information has decreased with new mediums like the internet and how business structures have shifted from hierarchies to networks. Recent developments discussed include the growth of e-commerce, rise of user-generated content, social media platforms, and mobile internet connectivity.
The document discusses key concepts of relational databases including:
1. Relational databases organize data into tables with records and fields and allow for defining relationships between tables.
2. Tables represent relations with rows as tuples and columns as attributes.
3. Common operations on relations include select, project, join, union, intersection and difference which allow querying and manipulating the data.
4. The document provides examples of designing database tables to model real-world entities and relationships. Primary keys are used to uniquely identify rows.
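The relational operations listed above (select, project, join) can be sketched over tables modelled as lists of dict rows; the table contents and column names are illustrative:

```python
# Sketch: relational operations on tables modelled as lists of dict rows.
# The employees/departments data below is invented for illustration.

employees = [
    {"id": 1, "name": "Ada", "dept_id": 10},
    {"id": 2, "name": "Bob", "dept_id": 20},
]
departments = [
    {"dept_id": 10, "dept": "Research"},
    {"dept_id": 20, "dept": "Sales"},
]

def select(rows, predicate):
    """Select: keep rows satisfying a predicate (SQL WHERE)."""
    return [r for r in rows if predicate(r)]

def project(rows, *columns):
    """Project: keep only the named columns (SQL SELECT col, ...)."""
    return [{c: r[c] for c in columns} for r in rows]

def join(left, right, key):
    """Natural join on a shared key column."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

result = project(
    select(join(employees, departments, "dept_id"),
           lambda r: r["dept"] == "Research"),
    "name", "dept",
)
print(result)  # [{'name': 'Ada', 'dept': 'Research'}]
```

Union, intersection and difference follow the same pattern, treating tables of identical schema as sets of rows.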
Invited Keynote by Professor Noah A Smith (University of Washington) for ACL 2017. 1 Aug 2017. Vancouver, Canada.
Also available at https://homes.cs.washington.edu/~nasmith/slides/acl-8-1-17.pdf
License: CC BY 4.0
The document summarizes an entity extraction and typing framework proposed by the author. The framework constructs a heterogeneous graph connecting entity mentions, surface names, and relation phrases extracted from documents. It then performs joint type propagation and relation phrase clustering on the graph to infer types for entity mentions. Evaluation on news, tweets and reviews shows the framework outperforms existing methods in recognizing new types and domains without extensive feature engineering or human supervision. It obtains improvements by modeling each mention individually and addressing data sparsity through relation phrase clustering.
Measuring the Complexity of the Law: The United States Code (Slides by Daniel Katz)
This document presents a framework for measuring the complexity of the United States Code using computational methods. It represents the Code as a mathematical object with dimensions including its hierarchical structure, citation network, and linguistic content. Complexity is measured across individual titles using weighted ranks that composite scores across these dimensions. The results provide insights into the relative complexity of different areas of law and how this may relate to the complexity of the topics regulated. The authors aim to advance empirical study of legal complexity and demonstrate the potential of computational analysis of large legal corpora.
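A weighted rank composite of the kind described above can be sketched as follows; the dimensions, scores, weights and title names are invented for illustration, not taken from the study:

```python
# Sketch: composite complexity scores as a weighted sum of per-dimension
# ranks. All numbers and title names below are illustrative.

def composite_ranks(scores_by_dim, weights):
    """scores_by_dim: {dimension: {title: raw_score}} -> {title: weighted rank}."""
    titles = list(next(iter(scores_by_dim.values())).keys())
    composite = {t: 0.0 for t in titles}
    for dim, scores in scores_by_dim.items():
        # Rank titles within each dimension (1 = least complex).
        ordered = sorted(titles, key=lambda t: scores[t])
        for rank, title in enumerate(ordered, start=1):
            composite[title] += weights[dim] * rank
    return composite

scores = {
    "structure": {"Title 11": 3.0, "Title 26": 9.0},
    "citations": {"Title 11": 5.0, "Title 26": 7.0},
    "language":  {"Title 11": 2.0, "Title 26": 8.0},
}
weights = {"structure": 0.4, "citations": 0.3, "language": 0.3}
print(composite_ranks(scores, weights))
```

Ranking within each dimension before weighting keeps dimensions with very different raw scales comparable.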
1) The document examines differences in how the 2010 BP oil spill crisis was framed in US and UK news media versus BP's press releases over time.
2) It finds BP sought to distance itself from responsibility for the cause of the spill while presenting itself as able to solve the crisis.
3) Political actors and complex connections between issues were more emphasized in news coverage compared to BP's statements, and frames simplified over the course of the crisis.
From text to entities: Information Extraction in the Era of Knowledge Graphs (GraphRM)
Meetup of 23/07/2018
In recent years there has been a proliferation of free and commercial "knowledge graphs" (KGs), which represent real-world entities together with their semantic relationships in graph form. These are becoming a powerful asset both for tech giants, with Google Knowledge Graph, IBM's Watson QA system and Facebook's Open Graph, and for startups developing AI products such as semantic search, data analytics, and recommender systems. While KGs provide structured access to a large amount of knowledge, the vast majority of the information available on the Web is still inaccessible because it is encoded only in natural-language text. The talk will present an overview of publicly available KGs and the main techniques used to bridge unstructured text with them, enabling a wide variety of knowledge-based applications.
Speaker: Matteo Cannaviccio
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You... (Aggregage)
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
Open Source Contributions to Postgres: The Basics. POSETTE 2024 (ElizabethGarrettChri)
Postgres is the most advanced open-source database in the world and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently had a patch submitted and committed and I want to share what I learned in that process. I’ll give you an overview of Postgres versions and how the underlying project codebase functions. I’ll also show you the process for submitting a patch and getting that tested and committed.
End-to-end pipeline agility - Berlin Buzzwords 2024 (Lars Albertsson)
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long does it take for all downstream pipelines to be adapted to an upstream change?", the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
1. Towards Encoding Time in Text-Based Entity Embeddings
Federico Bianchi, Matteo Palmonari and Debora Nozza
University of Milano-Bicocca
INSID&S Lab: Interaction and Semantics for Innovation with Data & Services
MIND Lab: Models in Decision making and data analysis
International Semantic Web Conference, Monterey, California, 2018
2. Knowledge Graphs
● Large knowledge bases
● Entities classified using types
● Types organized in sub-type graphs
● Binary relationships between entities
● Semantics and inference via rules/axioms
● Semantic similarity with lexical, topological and other feature-based approaches
[Figure: example knowledge graph — entities (Kostas Manolas, A.S. Roma, Garry Kasparov, Real Madrid) connected by relations such as "team", with types (Thing, Person, Athlete, Soccer Player, Chess Player, Organisation, Sports Club, Soccer Club)]
3. Knowledge Graph Embeddings
Given a KG in input, an embedding algorithm generates vector representations of entities and relationships.
[Figure: the triple (Kostas Manolas, team, A.S. Roma) mapped by an embedding algorithm to numeric vectors]
Why should we embed?
● Latent components (e.g., → link prediction)
● Features generation (e.g., → entity linking)
● Fast and intuitive way to compute similarity
4. From Word Embeddings to Text-based Entity Embeddings
- Word embeddings (e.g., [Mikolov+, 2013]): similar words correspond to similar vectors
- Text-based Entity Embeddings
  - Text as main source vs. graph as main source [Bordes+, 2013][Trouillon+, 2016]
  - Typed Entity Embeddings (TEE): use word embedding algorithms on documents where entities and types replace words (next slides)
  - Pros: good for similarity evaluation
  - Cons: no embedding of relations, just entities
[Figure: a corpus ("The big black cat eats its food. My little black cat sleeps all day. Sometimes my cat eats too much!") mapped to word vectors for cat, black, eats, dog]
5. TEE: Typed Entity Embeddings from Text
Input: Wikipedia's abstracts, e.g.:
"Rome is the capital of Italy and a special comune (named Comune di Roma Capitale). Rome also serves as the capital of the Lazio region."
[Bianchi+, 2017b] [Bianchi+, 2018a]
6. TEE: Typed Entity Embeddings from Text
Link the abstract to DBpedia entities via named entity linking tools:
"dbr:Rome dbr:Italy dbr:Rome dbr:Lazio …"
[Bianchi+, 2017b] [Bianchi+, 2018a]
7. TEE: Typed Entity Embeddings from Text
Replace entities with their most specific types:
"dbo:City dbo:Country dbo:City dbo:Administrative_Region …"
[Bianchi+, 2017b] [Bianchi+, 2018a]
8. TEE: Typed Entity Embeddings from Text
Generate entity vectors from the entity-linked text ("dbr:Rome dbr:Italy dbr:Rome dbr:Lazio …") and type vectors from the typed text ("dbo:City dbo:Country dbo:City dbo:Administrative_Region …").
[Bianchi+, 2017b] [Bianchi+, 2018a]
9. TEE: Typed Entity Embeddings from Text
Concatenate each entity vector with the vector of its most specific type.
[Bianchi+, 2017b] [Bianchi+, 2018a]
10. TEE: Typed Entity Embeddings from Text
Final representation: the concatenation of entity and type vectors, e.g., v(Rome) ⊕ v(City) = [1 3 6 3 19 5 6].
[Bianchi+, 2017b] [Bianchi+, 2018a]
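The TEE pipeline above can be sketched in a few lines. This is a minimal illustration, not the authors' code: `LINKS` and `TYPES` are hypothetical stand-ins for a named entity linking tool and DBpedia's type assignments, and the vectors are the toy numbers from the slide rather than trained embeddings.

```python
import numpy as np

# Hypothetical surface-form -> DBpedia entity map and entity -> most-specific-type
# map; in the deck these come from a named entity linking tool and DBpedia.
LINKS = {"Rome": "dbr:Rome", "Italy": "dbr:Italy", "Lazio": "dbr:Lazio"}
TYPES = {"dbr:Rome": "dbo:City", "dbr:Italy": "dbo:Country",
         "dbr:Lazio": "dbo:Administrative_Region"}

def to_entity_doc(tokens):
    # Keep only the tokens that link to a DBpedia entity.
    return [LINKS[t] for t in tokens if t in LINKS]

def to_type_doc(entity_doc):
    # Replace each entity with its most specific type.
    return [TYPES[e] for e in entity_doc]

abstract = "Rome is the capital of Italy . Rome is in Lazio .".split()
entity_doc = to_entity_doc(abstract)
type_doc = to_type_doc(entity_doc)

# After training a word-embedding model on each transformed corpus, the final
# TEE representation concatenates the entity vector with its type vector
# (toy vectors below, matching the slide's example dimensions).
entity_vecs = {"dbr:Rome": np.array([1.0, 3.0, 6.0, 3.0])}
type_vecs = {"dbo:City": np.array([19.0, 5.0, 6.0])}

def tee_vector(entity):
    return np.concatenate([entity_vecs[entity], type_vecs[TYPES[entity]]])
```

With these toy inputs, `tee_vector("dbr:Rome")` reproduces the slide's concatenated vector [1 3 6 3 19 5 6].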
11. Why Time?
● To the best of our knowledge, this is the first approach to explicitly encode time periods into entity embeddings
● We expect time to matter when evaluating similarity between entities:
  ○ Entities are similar when they co-occur frequently, and entities that share a time period co-occur; e.g., the most similar entities to "Winston Churchill" are his contemporary politicians, such as Harold Macmillan
● In this paper we provide an approach to explicitly encode time so that those representations can be used to control similarity with respect to time
13. Textual Descriptions of Time Periods via Events
"The succession of events is an inherent property of our time perception. Memory is necessary, and the order of these events is fundamental"
[Snaider et al., 2012, Cognitive Systems Research]
14. Embedding Years from Event Descriptions
A year is represented by the set of entities taking part in the year's events.
The year vector is the average of the entities' vectors found inside the description, e.g.:
Adolf Hitler   [4 3 6 2 3]
Nazi Germany   [5 1 2 9 2]
World War II   [1 2 8 4 1]
1941 (AVG)     [9 2 3 5 5]
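The averaging step is straightforward to sketch. The entity vectors below are the slide's toy numbers (the slide's resulting year vector is illustrative; the code computes the actual component-wise mean):

```python
import numpy as np

# Toy entity vectors, taking the numbers from the slide's example.
entity_vectors = {
    "Adolf Hitler": np.array([4.0, 3.0, 6.0, 2.0, 3.0]),
    "Nazi Germany": np.array([5.0, 1.0, 2.0, 9.0, 2.0]),
    "World War II": np.array([1.0, 2.0, 8.0, 4.0, 1.0]),
}

def year_vector(entities_in_events):
    # A year is the average of the vectors of the entities found
    # in that year's event descriptions.
    return np.mean([entity_vectors[e] for e in entities_in_events], axis=0)

v_1941 = year_vector(["Adolf Hitler", "Nazi Germany", "World War II"])
```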
18. Towards Time-Aware Similarity
Time-flattened similarity: reduce the impact of time on similarity.
E.g., make US presidents similar independently of their temporal context.
Time-boosted similarity: boost the impact of time on similarity.
E.g., make politicians that share temporal contexts more similar.
20. Time-Flattened Similarity
What's the time-flattened similarity between Barack Obama and Bill Clinton?
Extract the embeddings for the two entities.
21. Time-Flattened Similarity
Find the closest year vectors to the two entity embeddings (e.g., the entity vector of Barack Obama is close to the vector of the year 2003, Bill Clinton's to 1999).
23. Time-Flattened Similarity
Start from the cosine similarity η between the two entity vectors:
𝝍(e1, e2) = η(e1, e2)
24. Time-Flattened Similarity
Subtract the normalized cosine similarity ηn between the closest year vectors (1999 and 2003):
𝝍(e1, e2) = η(e1, e2) − ηn(y(e1), y(e2))
25. Time-Flattened Similarity
Weight the two terms with ⍺ to control the weight of the time factor:
𝝍(e1, e2) = ⍺ η(e1, e2) − (1 − ⍺) ηn(y(e1), y(e2))
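The steps above can be sketched as follows. The vectors are made-up toy values (real ones would come from the TEE model and the year embeddings), and the normalized ηn from the slides is approximated here by plain cosine similarity, a simplifying assumption:

```python
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors standing in for TEE entity embeddings and year embeddings.
entity_vecs = {"Barack Obama": np.array([1.0, 2.0, 0.5]),
               "Bill Clinton": np.array([0.9, 1.8, 0.7])}
year_vecs = {1999: np.array([0.8, 1.7, 0.8]),
             2003: np.array([1.1, 2.1, 0.4])}

def closest_year(entity_vec):
    # The year whose vector is most cosine-similar to the entity vector.
    return max(year_vecs, key=lambda y: cos(entity_vec, year_vecs[y]))

def time_flattened_similarity(e1, e2, alpha=0.7):
    # psi(e1, e2) = alpha * eta(e1, e2) - (1 - alpha) * eta_n(y(e1), y(e2));
    # eta_n is approximated by plain cosine similarity in this sketch.
    v1, v2 = entity_vecs[e1], entity_vecs[e2]
    y1, y2 = year_vecs[closest_year(v1)], year_vecs[closest_year(v2)]
    return alpha * cos(v1, v2) - (1 - alpha) * cos(y1, y2)

psi = time_flattened_similarity("Barack Obama", "Bill Clinton", alpha=0.7)
```

Lower ⍺ penalizes entity pairs whose closest years are similar, pushing temporally distant entities up the ranking.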
26. Experiments: Research Questions
1. Quality: properties of the year embeddings
2. Similarity and Time:
a. Time Bias in TEE and EE: Effect of time in entity embeddings from text
i. Adherence to Natural Time Order
ii. Clustering WWI and WWII Battles
iii. Relative Ordering of Entities
b. Controlling Time Bias: handling the effect of time
27. Embedded Representations vs. Natural Time Flow
[Figure: 2D and 1D PCA projections of the year vectors; 191X years and 201X years occupy distinct regions]
PCA in 1D vs. natural order of years: Kendall τ = 0.80 and Spearman rank correlation coefficient = 0.94.
Good resemblance of the natural time flow!
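The measurement behind this slide — project the year vectors to 1D with PCA and correlate the projection with the calendar order — can be reproduced on synthetic data. The year vectors below are fabricated (points drifting along one direction plus noise), purely to show the mechanics:

```python
import numpy as np

def pca_1d(X):
    # Project the rows of X onto the first principal component (via SVD).
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[0]

def kendall_tau(a, b):
    # Plain Kendall tau: (concordant - discordant pairs) / total pairs.
    n = len(a)
    s = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            s += np.sign((a[i] - a[j]) * (b[i] - b[j]))
    return s / (n * (n - 1) / 2)

# Synthetic "year vectors" (1931-1990) that drift along one direction over time.
rng = np.random.default_rng(0)
years = np.arange(1931, 1991)
X = np.stack([years / 2.0, np.sin(years / 7.0)], axis=1)
X = X + rng.normal(0.0, 0.1, X.shape)

proj = pca_1d(X)
tau = kendall_tau(years, proj)  # |tau| close to 1 for a clean temporal drift
```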
28. Time Bias: Adherence to Natural Time Order
Task: count the number of entities shared by sequences of 2-3 contiguous years vs. the number shared by non-contiguous years (randomly sampled), e.g., 1991-1992 vs. 1934-1992.
Dataset: two and three contiguous years and non-contiguous years (1931-1991).
Results: contiguous years share a higher number of entities than non-contiguous years.
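The quantity being counted is a plain set intersection over the entities linked to each year. The year-to-entity sets below are hypothetical toy data, only illustrating why contiguous years overlap more than distant ones:

```python
# Hypothetical toy year -> entities sets; real ones come from entity-linked
# event descriptions of each year.
YEAR_ENTITIES = {
    1991: {"Soviet Union", "George H. W. Bush", "Gulf War"},
    1992: {"Soviet Union", "George H. W. Bush", "Bill Clinton"},
    1934: {"Adolf Hitler", "Paul von Hindenburg"},
}

def shared_entities(y1, y2):
    # Number of entities the two years' event descriptions have in common.
    return len(YEAR_ENTITIES[y1] & YEAR_ENTITIES[y2])

contiguous = shared_entities(1991, 1992)      # contiguous pair
non_contiguous = shared_entities(1934, 1992)  # randomly sampled distant pair
```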
29. Time Bias: Clustering Battles with EE
Task: classify battles as belonging to WWI or WWII.
Dataset: 152 resource identifiers of WWI (63) and WWII (89) battles from Wikipedia.
Method: K-means clustering (K=2) on the vector representations in the entity embedding space.
Results: 95% accuracy. The centroids of the two groups are close to the WWI years and WWII years respectively.
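The clustering setup can be sketched with a minimal K-means (K=2). The "battle" vectors below are synthetic 2D blobs standing in for the real entity embeddings; the farthest-point initialization is a simplification of standard K-means seeding:

```python
import numpy as np

def kmeans2(X, iters=20, seed=0):
    # Minimal Lloyd's K-means with K=2 and farthest-point initialization.
    rng = np.random.default_rng(seed)
    c0 = X[rng.integers(len(X))]
    c1 = X[np.argmax(np.linalg.norm(X - c0, axis=1))]
    centroids = np.stack([c0, c1])
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        centroids = np.array([X[labels == c].mean(axis=0) for c in range(2)])
    return labels, centroids

# Synthetic "battle" vectors: two blobs standing in for WWI- and WWII-era
# entities (63 and 89 points, matching the dataset sizes in the slide).
rng = np.random.default_rng(1)
wwi = rng.normal([0.0, 0.0], 0.3, size=(63, 2))
wwii = rng.normal([3.0, 3.0], 0.3, size=(89, 2))
X = np.vstack([wwi, wwii])
labels, centroids = kmeans2(X)
```

On such well-separated blobs the two recovered clusters match the two eras exactly; on the real embeddings the slide reports 95% accuracy.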
30. Controlling Time Bias: Flattened Similarity
Task: find entities similar to a given input entity but far in time.
31. Controlling Time Bias: Flattened Similarity
Task: find entities similar to a given input entity but far in time, e.g., find past presidents given one.
Input: Barack Obama. Retrieved: Ford, Coolidge, Hoover, T. Kennedy, Truman.
32. Controlling Time Bias: Flattened Similarity
Input: Barack Obama. Four of the five retrieved entities are past US presidents (Ford, Coolidge, Hoover, Truman: correct); T. Kennedy is wrong (never a US president).
33. Controlling Time Bias: Flattened Similarity
Dataset: US president entities and British Prime Minister entities (19 and 19).
Method: start with the 6 most recent presidents in each group. For each entity, compute the number of older presidents that appear in the ranked list produced by each similarity measure. Time-flattened similarity reorders the top-100 results from cosine similarity.
Algorithms:
● Time-Aware Similarity TEE (TATEE), with time-flattened similarity;
● Similarity TEE (STEE), standard neighborhood with cosine;
● Time-Aware Similarity EE (TAEE), with time-flattened similarity;
● Similarity EE (SEE), standard neighborhood with cosine;
● Time-flattened similarity on Wiki2Vec (baseline).
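The evaluation count can be sketched as follows. `START_YEARS` is a hypothetical helper holding first inauguration years (real dates, but not part of the paper's code); the metric simply counts strictly older presidents in the returned ranking:

```python
# Hypothetical helper: first year in office for a few US presidents;
# None marks entities that were never president.
START_YEARS = {"Obama": 2009, "Clinton": 1993, "Ford": 1974,
               "Coolidge": 1923, "Hoover": 1929, "Truman": 1945,
               "T. Kennedy": None}

def count_older_presidents(input_president, ranked):
    # Number of entities in the ranked list that are presidents
    # who took office strictly before the input president.
    ref = START_YEARS[input_president]
    return sum(1 for e in ranked
               if START_YEARS.get(e) is not None and START_YEARS[e] < ref)

hits = count_older_presidents(
    "Obama", ["Ford", "Coolidge", "Hoover", "T. Kennedy", "Truman"])
```

For the toy ranking from slide 31 this yields 4 hits out of 5, matching the qualitative result.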
34. Controlling Time Bias: Flattened Similarity
Results: time-flattened similarity on TATEE obtains the best results. This is also because TATEE considers type representations, so it can more easily retrieve entities that share types.
35. Controlling Time Bias: Qualitative Analysis
The most similar entities to Barack Obama using cosine similarity in TEE:
Clinton, Reagan, G. Bush, Carter, Al Gore, Nixon, J. Kerry, D. Cheney, McCain, Biden
36. Controlling Time Bias: Qualitative Analysis
Time-flattened similarity used to reorder the top-100 most similar (alpha = 0.7):
Clinton, Reagan, G. Bush, Carter, Al Gore, Nixon, Ford (new), Coolidge (new), T. Kennedy (new), Hoover (new)
37. Controlling Time Bias: Qualitative Analysis
Time-flattened similarity used to reorder the top-100 most similar (alpha = 0.1):
Ford, Coolidge, Hoover, Truman, Roosevelt, Wilson, E. Roosevelt, Harding, Cleveland, Eisenhower (all new)
38. Conclusions and Future Work
Conclusions
● Time can be represented in the vector space using event descriptions
● Time sneaks into entity similarity (time bias)
● Time bias can be controlled by considering explicit representations of time periods
Future Work
● Study compositionality of time-period representations
● Comparison with Doc2Vec
● Improve the time-aware similarity measure
● Comparison with other KG embedding models
39. References
Snaider, J., McCall, R., & Franklin, S. (2012). Time production and representation in a conceptual and computational cognitive model. Cognitive Systems Research, 13(1), 59-71.
Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., & Yakhnenko, O. (2013). Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems (pp. 2787-2795).
Trouillon, T., Welbl, J., Riedel, S., Gaussier, É., & Bouchard, G. (2016). Complex embeddings for simple link prediction. In International Conference on Machine Learning (pp. 2071-2080).
Tran, N. K., Tran, T., & Niederée, C. (2017). Beyond time: Dynamic context-aware entity recommendation. In European Semantic Web Conference (pp. 353-368). Springer, Cham.
Bianchi, F., Soto, M., Palmonari, M., & Cutrona, V. (2018). Type vector representations from text: An empirical analysis. In Deep Learning for Knowledge Graphs and Semantic Technologies Workshop, co-located with the Extended Semantic Web Conference, Crete.
Bianchi, F., Palmonari, M., & Nozza, D. (2018). Towards encoding time in text-based entity embeddings. In International Semantic Web Conference (to appear), Monterey, California.
40. References
Bianchi, F., Palmonari, M., Cremaschi, M., & Fersini, E. (2017). Actively learning to rank semantic associations for personalized contextual exploration of knowledge graphs. In European Semantic Web Conference (pp. 120-135). Springer, Cham.
Bianchi, F., & Palmonari, M. (2017). Joint learning of entity and type embeddings for analogical reasoning with entities. In Proceedings of the NL4AI Workshop, co-located with the International Conference of the Italian Association for Artificial Intelligence (AI*IA).
42. Qualitative Evaluation of Time-Flattened Similarity
Input: Winston Churchill. Method: cosine similarity.
Most similar: Harold Macmillan. Tony Blair is 49th and Gordon Brown 41st in the list of most similar entities.
43. Qualitative Evaluation of Time-Flattened Similarity
Input: Winston Churchill. Method: time-flattened similarity.
Most similar: Margaret Thatcher. Tony Blair rises to 16th and Gordon Brown to 14th in the list of most similar entities.