Entity Recommendations Using Hierarchical Knowledge Bases

Entity Recommendations Using
Hierarchical Knowledge Bases
Workshop on Knowledge Discovery and Data Mining Meets Linked Open Data (KNOW@LOD)
at ESWC2015, May 31, 2015
Siva Kumar Cheekula, Pavan Kapanipathi, Derek Doran,
Prateek Jain, Amit Sheth
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton - OH

• Content Based (Users’ Ratings)
– Types
• Term based
• Entity based
• Knowledge based/Graph based
– Provides the most personalized results
• Collaborative (Other users’ ratings)
• Hybrid (Both content and collaborative)
– Works best
Entity Recommendations

• Structured representation of relationships
among concepts
• Allows better understanding of the user
interests
• Cold start problem is addressed well
6
Knowledge Base Enabled
Recommendations

Interstellar
(film)
Gravity
(film)
Relationships
7
Recommendations

Interstellar
(film)
Gravity
(film)
Relationships
8
Recommendations
Too many properties to address

Ball Games
Football
Games
9
Is-a
Focus specifically
on hierarchical
relationships
Hierarchical Relations

• Human memory has been argued to be
structured as a hierarchy of concepts (Semantic
Network)
• Spreading activation theory has been utilized to
simulate search on semantic network
• This theory has not been well explored for user
interest modeling and entity recommendations
10
Hierarchical Relations

• Domain Coverage
• Crowd-sourced knowledge base
–Regular updates
–Driven by strict guidelines
11
Wikipedia

13
Parent Categories
Wikipedia Category Graph

14
Spreading
Activation
Hierarchical Interest
Graphs
1
2 3
3 4
Entity Un-rated by user
2
3
Reverse
Spreading
Wikipedia
Category
Hierarchy
Approach

15
Graphs
1
2 3
3 4
2
3
Reverse
Spreading
Approach
Spreading
Activation
Wikipedia
Category
Hierarchy

Internet
Semantic
Search
Linked Data Metadata
Technology
World Wide Web
Semantic
Web
Structured
Information
0.8 0.2 0.6
Scores for
Interests
16
User Interests
0.7
0.5
0.4
0.3
Activation
Function

• Wikipedia Category Structure is transformed to a
hierarchy (Wikipedia Hierarchy)
– 0.8 Million Categories with approx 2 Million relationships
between them
– Category “Main Topic Classification” is root of the
hierarchy as it subsumes 98% of the categories
– Each Category is assigned a Hierarchy level based on
shortest path from root node
• Spreading Activation Theory on Wikipedia Hierarchy
• Experimented with different Activation Functions
– Bell, Bell Log, Priority Intersect
17
Graphs

Romance
Films
Titanic Anna_Karenina
Forever
Young
Films
Romantic Drama Films
1990s romantic
drama films
Structured
Information
5 4 4
Ratings of
Movies
18
Movies
4
3
3
1

19
Spreading
Activation
Graphs
1
2 3
3 4
2
3
Reverse
Spreading
Wikipedia Category
Hierarchy
Approach

1990s
Romantic
Drama Films
Romantic
Drama Films
4
3
20
Function to rank
the entities based
on category
scores
The
Remains
of the Day
Jude
5 4
Ranking Unrated Entities

21
The
Terminator
1984 Films
1980s Action
Films
Films Directed by
James Cameron
Action
Horror Films
Robot Films
4
3
2
3
4
Reverse Spreading

• Sum the activation values of each parent of an
entity 22
The
Terminator
1984
Films
1980s Action
Films
Films Directed by
James Cameron
Action
Horror Films
Robot
Films
4
3
2
3
4
16
Summing
Sum

• Ranking is biased towards categories having
larger number of entities
• Example:
– Category “English Language Films” has good
interesting score for user
– Results are topped by the entities connected to
“English Language Films” for user
23
Summing – Drawbacks

• Parent category activation value is normalized by it’s out degree
• Bias towards generic categories is addressed
24
The
Terminator
1984
Films
Footloose
3
1 1 1
Ghostbusters
Degree Based
Normalization
Out Degree = 3
Spread Weight =
3/3 = 1

• Priority of a category to an entity as edge weight
26
The
Terminator
1984
Films
Footloose
3
1 2 1
3
1.5 3
Ghostbusters
Category Priority

Ai = Σ(Aj / (Oj * Pij))
27
Reverse Spreading Activation
Function
Activation Value of the Entity i (Movie)
Activation Value of the Categories of “i”
Outdegree of Category j
Priority of Category j to Entity “i”

• Focus on “5” rated movies for test as per Cremonesi et al.
– New approach for evaluation of recommendation systems
• Dataset - Movielens
– Movies -- 3158
– Users -- 1476
– 0.9 m ratings
• Baseline – Ziegler et al.
– Hybrid(Content & Collaborative) technique to recommend
books using Amazon book taxonomy
– We compared with the content based part of Ziegler et al
28
Evaluation

29
Summing Summing with
Degree
Normalization
Summing with
Degree
Normalization
and Priority
Decay BellLog Top 1 -- 0.01
Top 20 -- 0.15
Top 1 -- 0.01
Top 20 -- 0.18
Top 1 -- 0.07
Top 20 -- 0.21
Bell Log Intersect Top 1 -- 0.0
Top 20 -- 0.09
Top 1 -- 0.01
Top 20 -- 0.20
Top 1 -- 0.07
Top 20 -- 0.24
Ziegler et al. Top 1 -- 0.0
Top 20 -- 0.002
Recall @Top 1 – Top 20

• While normalization is necessary, we are
looking for better features for normalization
• Priority works, however we find the category-
entity priority added by users on Wikipedia
needs a quality check
• We are looking into collaborative Hierarchical
Interest Graphs
30
Future Work

Paper at http://knoesis.org/library/resource.php?id=2150
Thank you
Contact
Siva Kumar (siva@knoesis.org)
Pavan Kapanipathi (pavan@knoesis.org) @pavankaps
Amit Sheth (amit@knoesis.org) @amit_p

32
Top N Evaluation
If: Rank(T) <= N
Hits++
Else
Miss++
Recall = Hits/Total
User Ratings
– Test Set
Pick top rated
entity ‘T’
Pick 1000
random entities
Generate Rank
for 1001 entities
Cremonesi et al.
Top N Evaluation

Entity Recommendations Using Hierarchical Knowledge Bases

Recommended

Recommended

More Related Content

Viewers also liked

Viewers also liked (20)

Similar to Entity Recommendations Using Hierarchical Knowledge Bases

Similar to Entity Recommendations Using Hierarchical Knowledge Bases (20)

Recently uploaded

Recently uploaded (20)

Entity Recommendations Using Hierarchical Knowledge Bases