Towards Encoding Time in Text-Based Entity Embeddings
1. Towards Encoding Time in Text-Based Entity Embeddings
Federico Bianchi, Matteo Palmonari and Debora Nozza
University of Milano-Bicocca
INSID&S Lab: Interaction and Semantics for Innovation with Data & Services
MIND Lab: Models in Decision making and data analysis
International Semantic Web Conference, Monterey, California, 2018
2. Knowledge Graphs
Large knowledge bases
Entities classified using types
Types organized in sub-type graphs
Binary relationships between entities
Semantics and inference via rules/axioms
Semantic similarity with lexical, topological and other feature-based approaches
[Figure: example knowledge graph. Kostas Manolas is linked to A.S. Roma by the relation "team". Other labels: Garry Kasparov, Real Madrid, and the types Soccer Player, Chess Player, Athlete, Person, Soccer Club, Sports Club, Organis., Thing.]
3. Knowledge Graphs Embeddings
Generate vector representations of entities and relationships
Given a KG as input, an embedding algorithm generates vector representations
[Figure: the triple (Kostas Manolas, team, A.S. Roma) is passed through an embedding algorithm, which outputs one vector for each entity and one for the relation.]
Why should we embed?
● Latent components (e.g., for link prediction)
● Feature generation (e.g., for entity linking)
● Fast and intuitive way to compute similarity
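The last point can be illustrated with a minimal sketch: once entities are vectors, similarity is a cosine between two lists of numbers and a nearest-neighbour query is a sort. The vectors and the helper names (`cosine`, `most_similar`) are made up for illustration and do not come from a trained model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(query, embeddings, top_k=3):
    """Rank all other entities by cosine similarity to `query`."""
    scores = [
        (name, cosine(embeddings[query], vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]

# Hypothetical 3-dimensional embeddings (illustrative values only).
embeddings = {
    "Kostas Manolas": [0.9, 0.1, 0.2],
    "A.S. Roma": [0.8, 0.3, 0.1],
    "Real Madrid": [0.7, 0.4, 0.1],
    "Garry Kasparov": [0.1, 0.2, 0.9],
}

ranking = most_similar("A.S. Roma", embeddings)
print([name for name, _ in ranking])
```

With these toy values the other soccer club ranks first and the chess player last, which is the kind of neighbourhood one would hope a real embedding space provides.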
4. From Word Embeddings to Text-based Entity Embeddings
- Word embeddings (e.g., [Mikolov+, 2013])
- Text-based Entity Embeddings
- Text as main source vs. Graph as main source [Bordes+,2013][Trouillon+,2016]
- Typed Entity Embeddings (TEE): use word embedding algorithms on documents where entities and types replace words (next slide :) )
- Pros: good for similarity evaluation
- Cons: no embedding of relations, just entities
[Figure: a corpus is mapped to word vectors (labels C and W) for words such as cat, black, eats and dog; similar words correspond to similar vectors.]
Example corpus:
The big black cat eats its food.
My little black cat sleeps all day.
Sometimes my cat eats too much!
5. TEE: Typed Entity Embeddings from Text
Source text: Wikipedia’s abstracts, e.g.
“Rome is the capital of Italy and a special comune (named Comune di Roma Capitale). Rome also serves as the capital of the Lazio region.”
● Link to DBpedia entities via named entity linking tools:
“dbr:Rome dbr:Italy dbr:Rome dbr:Lazio …”
● Replace entities with their most specific types:
“dbo:City dbo:Country dbo:City dbo:Administrative_Region …”
● Generate entity vectors from the entity text and type vectors from the type text
● Concatenate the entity vector with the vector of its type, e.g. v(Rome) v(City)
[Bianchi+, 2017b] [Bianchi+, 2018a]
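The TEE pipeline (link entities, replace them with their most specific types, embed both sequences, concatenate) can be sketched as follows. This is only an illustration under stated assumptions: the linker output, the type lookup table and the tiny vectors are hypothetical, and the embedding training step itself is omitted (the vectors stand in for what a word embedding algorithm would produce).

```python
# Hypothetical output of a named entity linking tool for the Rome abstract.
linked_entities = ["dbr:Rome", "dbr:Italy", "dbr:Rome", "dbr:Lazio"]

# Hypothetical lookup of each entity's most specific type.
most_specific_type = {
    "dbr:Rome": "dbo:City",
    "dbr:Italy": "dbo:Country",
    "dbr:Lazio": "dbo:Administrative_Region",
}

def to_type_sequence(entities, type_of):
    """Replace every entity mention with its most specific type."""
    return [type_of[entity] for entity in entities]

def typed_entity_embedding(entity, entity_vecs, type_vecs, type_of):
    """Concatenate the entity vector with the vector of the entity's type."""
    return entity_vecs[entity] + type_vecs[type_of[entity]]

type_sequence = to_type_sequence(linked_entities, most_specific_type)

# Placeholder vectors standing in for embeddings trained separately on the
# entity sequence and on the type sequence (values are illustrative only).
entity_vecs = {"dbr:Rome": [1.0, 3.0, 6.0]}
type_vecs = {"dbo:City": [5.0, 6.0]}

tee_rome = typed_entity_embedding(
    "dbr:Rome", entity_vecs, type_vecs, most_specific_type
)
print(type_sequence)
print(tee_rome)
```

Because the two spaces are trained separately and only joined at the end, entity and type vectors may have different dimensionalities, as in this toy example.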
11. Why Time?
● To the best of our knowledge, this is the first approach to explicitly encode time periods into entity embeddings
● We expect time to be important when we evaluate similarity between entities:
○ Entities are similar when they co-occur frequently, and entities that share a time period co-occur
○ The most similar entities to “Winston Churchill” are his contemporary politicians
● In this paper we propose an approach to explicitly encode time, so that we can use those representations to control the similarity with respect to time
[Figure: Winston Churchill and Harold Macmillan]
13. Textual Descriptions of Time Periods via Events
“The succession of events is an inherent property of our time perception. Memory is necessary, and the order of these events is fundamental”
Snaider et al., 2012, Cognitive Systems Research
14. Embedding Years from Event Descriptions
A year is represented by the set of entities taking part in the year’s events
The year vector is the average of the entities’ vectors found inside the description
[Figure: example for the year 1941, with illustrative vector values]
Adolf Hitler    4 3 6 2 3
Nazi Germany    5 1 2 9 2
World War II    1 2 8 4 1
1941 (AVG)      9 2 3 5 5
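A minimal sketch of the averaging step, using the three entity vectors shown for Adolf Hitler, Nazi Germany and World War II. The function names are hypothetical. The numbers in the figure are illustrative, so the element-wise mean computed here need not coincide with the 1941 vector drawn on the slide.

```python
def average_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def year_vector(event_entities, entity_vectors):
    """Average the vectors of the entities found in a year's description.

    Entities without an embedding are skipped (an assumption of this sketch).
    """
    found = [entity_vectors[e] for e in event_entities if e in entity_vectors]
    if not found:
        raise ValueError("no embedded entity found for this year")
    return average_vectors(found)

entity_vectors = {
    "Adolf Hitler": [4, 3, 6, 2, 3],
    "Nazi Germany": [5, 1, 2, 9, 2],
    "World War II": [1, 2, 8, 4, 1],
}

year_1941 = year_vector(
    ["Adolf Hitler", "Nazi Germany", "World War II"], entity_vectors
)
print(year_1941)
```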
18. Towards Time Aware Similarity
Time flattened similarity: reduces the impact of time on the similarity.
E.g., make US presidents similar independently of their temporal context.
Time boosted similarity: boosts the impact of time on the similarity.
E.g., make politicians that share temporal contexts more similar.
20. Time Flattened Similarity
Extract the embeddings for the two entities
What’s the time flattened similarity between
Barack Obama and Bill Clinton?
21. Time Flattened Similarity
Find the closest year vectors to the two entity embeddings (e.g., the entity vector of Barack Obama is close to the vector of the year 2003).
[Figure: the closest years to the two entities are 1999 and 2003.]
What’s the time flattened similarity between Barack Obama and Bill Clinton?
23. Time Flattened Similarity
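Closest years: 1999 (Bill Clinton) and 2003 (Barack Obama)
$\psi(\text{Clinton}, \text{Obama}) = \eta(\text{Clinton}, \text{Obama})$
$\eta$: cosine similarity
What’s the time flattened similarity between Barack Obama and Bill Clinton?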
24. Time Flattened Similarity
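Closest years: 1999 (Bill Clinton) and 2003 (Barack Obama)
$\psi(\text{Clinton}, \text{Obama}) = \eta(\text{Clinton}, \text{Obama}) - \eta_n(1999, 2003)$
$\eta_n$: normalized cosine similarity between the two closest year vectors
What’s the time flattened similarity between Barack Obama and Bill Clinton?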
25. Time Flattened Similarity
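Closest years: 1999 (Bill Clinton) and 2003 (Barack Obama)
$\psi(\text{Clinton}, \text{Obama}) = \alpha \, \eta(\text{Clinton}, \text{Obama}) - (1 - \alpha) \, \eta_n(1999, 2003)$
$\alpha$ controls the weight of the time factor
What’s the time flattened similarity between Barack Obama and Bill Clinton?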
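The steps above (extract the embeddings, find the closest years, subtract the weighted year similarity from the weighted entity similarity) can be sketched as below. This is not the authors' implementation: the slides do not define how the year similarity is normalized, so the sketch assumes min-max scaling over all pairs of distinct years, and all vectors and function names are hypothetical.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def closest_year(entity_vec, year_vecs):
    """Year whose vector has the highest cosine similarity to the entity."""
    return max(year_vecs, key=lambda year: cosine(entity_vec, year_vecs[year]))

def normalized_year_similarity(y1, y2, year_vecs):
    """ASSUMPTION: min-max scaling of the cosine over all distinct year pairs."""
    if y1 == y2:
        return 1.0
    years = sorted(year_vecs)
    sims = [
        cosine(year_vecs[a], year_vecs[b])
        for i, a in enumerate(years)
        for b in years[i + 1:]
    ]
    low, high = min(sims), max(sims)
    if high == low:
        return 1.0
    return (cosine(year_vecs[y1], year_vecs[y2]) - low) / (high - low)

def time_flattened_similarity(e1, e2, year_vecs, alpha):
    """psi = alpha * eta(e1, e2) - (1 - alpha) * eta_n(y1, y2)."""
    y1, y2 = closest_year(e1, year_vecs), closest_year(e2, year_vecs)
    return alpha * cosine(e1, e2) - (1 - alpha) * normalized_year_similarity(y1, y2, year_vecs)

# Toy 2-dimensional vectors (illustrative values only).
year_vecs = {1950: [0.0, 1.0], 1999: [1.0, 0.0], 2003: [0.8, 0.6]}
clinton = [1.0, 0.1]
obama = [0.7, 0.6]

print(closest_year(clinton, year_vecs), closest_year(obama, year_vecs))
print(time_flattened_similarity(clinton, obama, year_vecs, alpha=0.5))
```

With alpha = 1 the measure reduces to plain cosine similarity; lowering alpha penalizes pairs whose closest years are similar, which is what pushes contemporaries down the ranking.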
26. Experiments: Research Questions
1. Quality: properties of the year embeddings
2. Similarity and Time:
a. Time Bias in TEE and EE: Effect of time in entity embeddings from text
i. Adherence to Natural Time Order
ii. Clustering WWI and WWII Battles
iii. Relative Ordering of Entities
b. Controlling Time Bias: handling the effect of time
27. Embedded Representations vs. Natural Time Flow
[Figure: 2D projection (PCA) and 1D projection (PCA) of the year vectors, running from the 191X years to the 201X years.]
PCA in 1D vs. natural order of years: Kendall τ = 0.80 and Spearman rank correlation coefficient = 0.94
Good resemblance of natural time flow!
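To make the comparison concrete, here is a small sketch of how the two rank correlations could be computed between the chronological order of the years and their 1D coordinates. It assumes the 1D PCA coordinates are already available and contain no ties; the coordinates below are invented, and since the sign of a principal component is arbitrary, a projection may need to be flipped before comparing.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / number of pairs (no ties)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)

def ranks(values):
    """Rank positions (1 = smallest), assuming all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman rank correlation via the no-ties formula."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

years = [1910, 1911, 1912, 1913, 1914]
pca_1d = [-2.1, -1.3, -0.2, 1.9, 1.1]  # hypothetical 1D PCA coordinates

tau = kendall_tau(years, pca_1d)
rho = spearman_rho(years, pca_1d)
print(tau, rho)
```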
28. Time Bias: Adherence to Natural Time Order
Task: count the number of entities shared by sequences of 2-3 contiguous years vs. the number of entities shared by non-contiguous years (randomly sampled):
● (e.g., 1991-1992 vs. 1934-1992)
Dataset: two and three contiguous years and non-contiguous years (1931-1991).
Results: contiguous years share a higher number of entities than non-contiguous years.
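A rough sketch of the counting task, restricted to pairs of years for brevity. The per-year entity sets are invented toy data, the function names are hypothetical, and the fixed random seed is only there to make the sampling reproducible.

```python
import random
from statistics import mean

def shared_entities(year_entities, a, b):
    """Number of entities that appear in the events of both years."""
    return len(year_entities[a] & year_entities[b])

def contiguous_pairs(years):
    known = set(years)
    return [(y, y + 1) for y in sorted(years) if y + 1 in known]

def non_contiguous_pairs(years):
    ordered = sorted(years)
    return [
        (a, b)
        for i, a in enumerate(ordered)
        for b in ordered[i + 1:]
        if b - a > 1
    ]

# Toy data: entities mentioned in each year's event description.
year_entities = {
    1939: {"World War II", "Adolf Hitler", "Poland"},
    1940: {"World War II", "Adolf Hitler", "Winston Churchill"},
    1941: {"World War II", "Winston Churchill", "Pearl Harbor"},
    1990: {"Gulf War", "Margaret Thatcher"},
    1991: {"Gulf War", "Soviet Union"},
}

rng = random.Random(0)
sampled = rng.sample(non_contiguous_pairs(year_entities), 3)

contiguous_mean = mean(
    shared_entities(year_entities, a, b) for a, b in contiguous_pairs(year_entities)
)
random_mean = mean(shared_entities(year_entities, a, b) for a, b in sampled)
print(contiguous_mean, random_mean)
```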
29. Time Bias: Clustering Battles with EE
Task: classify battles as belonging to WWI or WWII.
Dataset: 152 resource identifiers of WWI (63) and WWII (89) battles from Wikipedia.
Method: K-means clustering (K=2) on the vector representations in the entity embedding space.
Results: 95% accuracy. The centroids of the two groups are close to the WWI years and the WWII years, respectively.
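The clustering experiment can be approximated with a plain K-means written from scratch. This sketch is not the authors' setup: the points are made-up 2D stand-ins for battle embeddings, and accuracy is computed under the better of the two possible cluster-to-label mappings, which is an assumption about how a clustering would be scored against WWI/WWII labels.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20, seed=0):
    """Basic Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def two_cluster_accuracy(labels, gold):
    """Accuracy for K=2 under the better of the two label mappings."""
    agree = sum(1 for l, g in zip(labels, gold) if l == g)
    return max(agree, len(gold) - agree) / len(gold)

# Toy stand-ins for battle embeddings: 0 = WWI, 1 = WWII.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
gold = [0, 0, 0, 1, 1, 1]

labels, centroids = kmeans(points, k=2)
accuracy = two_cluster_accuracy(labels, gold)
print(accuracy)
```

In the setting of the slide, the resulting centroids could then be compared with the year vectors to check which years each cluster sits closest to.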
30. Controlling Time Bias: Flattened Similarity
Task: find entities that are similar to a given input entity but far from it in time. E.g., find past presidents given one.
[Figure: results retrieved for Barack Obama: Ford, Coolidge, Hoover, T. Kennedy, Truman (four marked correct, one marked wrong).]
33. Controlling Time Bias: Flattened Similarity
Dataset: US president entities and British Prime Minister entities (19 and 19).
Method: start with the 6 most recent presidents for each group. For each entity, count the older presidents that appear in the ranked list produced by each similarity measure.
The time-flattened similarity reorders the top-100 results from cosine similarity.
Algorithms:
● Time-aware Similarity TEE (TATEE), with time-flattened similarity;
● Similarity TEE (STEE) (standard neighborhood with cosine);
● Time-Aware Similarity EE (TAEE), with time-flattened similarity;
● Similarity EE (SEE) (standard neighborhood with cosine);
● Time-flattened similarity Wiki2Vec (Baseline).
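The evaluation protocol can be sketched as a rerank-then-count loop. The candidate list, its scores and the helper names are invented for illustration; in the setup described above the candidates would be the top-100 cosine neighbours and the year similarity would come from the closest year vectors.

```python
def time_flattened_score(cosine_sim, year_sim, alpha):
    """alpha * entity similarity - (1 - alpha) * normalized year similarity."""
    return alpha * cosine_sim - (1 - alpha) * year_sim

def rerank(candidates, alpha):
    """Reorder (name, cosine_sim, year_sim) triples by time-flattened score."""
    return sorted(
        candidates,
        key=lambda c: time_flattened_score(c[1], c[2], alpha),
        reverse=True,
    )

def count_older(ranked, older_entities, k):
    """How many of the top-k results belong to the set of older entities."""
    return sum(1 for name, *_ in ranked[:k] if name in older_entities)

# Hypothetical neighbours of Barack Obama: (name, cosine, year similarity).
candidates = [
    ("Clinton", 0.90, 0.95),
    ("McCain", 0.85, 1.00),
    ("Truman", 0.70, 0.10),
    ("Coolidge", 0.65, 0.05),
]
older_presidents = {"Truman", "Coolidge"}

cosine_order = sorted(candidates, key=lambda c: c[1], reverse=True)
flattened_order = rerank(candidates, alpha=0.7)

print(count_older(cosine_order, older_presidents, k=2))
print(count_older(flattened_order, older_presidents, k=2))
```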
34. Controlling Time Bias: Flattened Similarity
Results: time-flattened similarity on TATEE appears to give the best results. This is also because TATEE considers type representations and can therefore easily retrieve entities that share types.
35. Controlling Time Bias: Qualitative Analysis
The most similar entities to Barack Obama using cosine similarity in TEE, compared with time flattened similarity used to reorder the top-100 most similar. “New” marks entities that do not appear in the cosine list.

Cosine similarity   Time flattened, alpha = 0.7   Time flattened, alpha = 0.1
Clinton             Clinton                       Ford (New)
Reagan              Reagan                        Coolidge (New)
G. Bush             G. Bush                       Hoover (New)
Carter              Carter                        Truman (New)
Al Gore             Al Gore                       Roosevelt (New)
Nixon               Nixon                         Wilson (New)
J. Kerry            Ford (New)                    E. Roosevelt (New)
D. Cheney           Coolidge (New)                Harding (New)
McCain              T. Kennedy (New)              Cleveland (New)
Biden               Hoover (New)                  Eisenhower (New)
38. Conclusions and Future Work
Conclusions
● Time can be represented in the vector space using event descriptions
● Time sneaks into entity similarity (time bias)
● Time bias can be controlled by considering explicit representations of
time periods
Future Work
● Study compositionality of time period representations
● Comparison with Doc2Vec
● Improve time-aware similarity measure
● Comparison with other KG embeddings models
39. References
Snaider, J., McCall, R., & Franklin, S. (2012). Time production and representation in a conceptual and computational cognitive
model. Cognitive Systems Research, 13(1), 59-71.
Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., & Yakhnenko, O. (2013). Translating embeddings for modeling
multi-relational data. In Advances in neural information processing systems (pp. 2787-2795).
Trouillon, T., Welbl, J., Riedel, S., Gaussier, É., & Bouchard, G. (2016, June). Complex embeddings for simple link prediction. In
International Conference on Machine Learning (pp. 2071-2080).
Tran, N. K., Tran, T., & Niederée, C. (2017, May). Beyond time: Dynamic context-aware entity recommendation. In European
Semantic Web Conference (pp. 353-368). Springer, Cham.
Bianchi, F., Soto, M., Palmonari, M., & Cutrona, V. (2018). Type vector representations from text: An empirical analysis. In Deep
Learning for Knowledge Graphs and Semantic Technologies Workshop, co-located with the Extended Semantic Web
Conference, Crete.
Bianchi, F., Palmonari, M., & Nozza, D. (2018). Towards encoding time in text-based entity embeddings. In International
Semantic Web Conference (to appear), Monterey, California.
40. References
Bianchi, F., Palmonari, M., Cremaschi, M., & Fersini, E. (2017, May). Actively learning to rank semantic associations for
personalized contextual exploration of knowledge graphs. In European Semantic Web Conference (pp. 120-135). Springer,
Cham.
Bianchi, F., & Palmonari, M. (2017). Joint learning of entity and type embeddings for analogical reasoning with entities. In
Proceedings of the NL4AI Workshop, co-located with the International Conference of the Italian Association for Artificial
Intelligence (AI*IA).
42. Qualitative Evaluation of Time Flattened Similarity
Input: Winston Churchill
Method: Cosine similarity
Harold Macmillan: most similar
Tony Blair: 49th in the list of most similar entities
Gordon Brown: 41st in the list of most similar entities
43. Qualitative Evaluation of Time Flattened Similarity
Input: Winston Churchill
Method: Time-flattened Similarity
Margaret Thatcher: most similar
Tony Blair: 16th in the list of most similar entities
Gordon Brown: 14th in the list of most similar entities