1
Ranking Objects by Following Paths
in Entity-Relationship Graphs
Seoul National University, Seoul, Republic of Korea
4th Workshop for Ph.D. Students in
Information and Knowledge Management (PIKM 2011)
in conjunction with CIKM 2011
School of Computer Science and Engineering
October 28th, 2011
Minsuk Kahng, Sangkeun Lee and Sang-goo Lee
2Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Problem
• Objective
– Given some query objects,
rank objects of a specified type
• Examples
– For a given user and current timestamp,
which songs are most preferable?
– For a given paper,
which conferences are most relevant to be submitted?
Introduction
Brian
28 OCT
11:30 AM
Query Ranked List
3Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Outline
• Introduction
• Data Model
• Method
• Preliminary Experiments
• Related Work
• Challenging Issues
• Conclusion
4Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Heterogeneous Data for Search
• Various types of data for search and recommendation
– Traditionally, only documents and words
– Now, various objects and relationships between them
• Documents, Products, Music, User, Clickthrough Log, Map,
Contextual Data (e.g. time & place)
• Incorporating these heterogeneous data is key!
Introduction
5Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Utilize Graph-based Data Model
• Graph-based data models have gained popularity
for dealing with these heterogeneous data.
– Objects and relationships can be modeled by nodes and edges.
– It is called “Entity-Relationship Graphs”.
• Solve the problem with graph operations
Introduction
Star: stenier…
G. Kasneci
2009
ICDEIEEE
Paper#052
Paper#065
Discover:
keyword…
M. Ramanath
M. Sozio
F. Suchanek
G. Weikum
2008
Paper#053
Naga:
searching…
hasTitle
hasTitle
hasPublisher
User Doc Click Time
#12 cikm10.org 0 11:00
#12 cikm11.org 1 11:01
#500 … 0 13:45
#500 … 0 13:46
#500 … 1 13:46
#904 … 1 13:51
degree
random walk
shortest path
Original database Graph data model Graph problem
6Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Object Ranking with Graphs
• With graph-based data models
– Problem Definition
• Given query objects, rank objects of a specified type
– With graph models, it becomes:
• Which target nodes are most similar to the query nodes?
• Challenges
– How to define similarity between nodes in the graph?
• Not just shortest distances
• Capture semantics of structured data
Introduction
7Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Existing work, but…
• Homogeneous Graph
– focus on structure of graph, not consider ‘type’ of nodes/edges
– PageRank – inlink/outlink of nodes, random walks, … [1]
• Consider the type of edges: Edge-level feature
– Each edge has a ‘type’.
– ObjectRank - Each edge type has different weight [2].
• But still, hard to capture semantics
– Importance of edge type is always same regardless of tasks
Introduction
[1] Brin and Page. … hypertextual web search engine. In WWW, 1998.
[2] Balmin et al. Objectrank: authority-based keyword search …. In VLDB, 2004.
8Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Path-level feature instead of edge
• Recently, the path-level feature has been mentioned
as a replacement for the edge-level feature [3, 4].
– Each path, a sequence of edge types, has weight.
– An edge has a different role depending on the tasks.
Introduction
[3] Sun et al. PathSim: meta path-based top-k similarity …. In VLDB, 2011.
[4] Lao et al. … retrieval … path constrained random walks. In ECML-PKDD, 2010.
hasAuthor
hasTitle
cite
pubWhere
0.90
0.50
0.35
0.15
hasTitle-1
cite pubWhere
hasChair
-1
hasAuthor
-1
hasAuthor pubWhere
hasTitle
-1
pubWhere
0.90
0.50
0.35
0.15
feature weight feature weight
edge-level feature path-level feature
9Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Proposed: Object Ranking with Paths
• We propose an object ranking method.
– Transform original data into graph-based data models
– Use paths in the graphs for ranking
Introduction
Paper#051
1998
L. Page
S. Brin
The anatomy
of a large… Star: stenier…
G. Kasneci
2009
ICDEIEEE
Paper#052WWW
Paper#061
2004
A. Balmin
Objectrank:
authority…
V. Hristidis
VLDB
Paper#065
2002
Discover:
keyword…Y. Papakonstantinou
M. Ramanath
M. Sozio
F. Suchanek
G. Weikum
2008
Paper#053
G. Ifrim
Naga:
searching…
hasTitle
hasTitle
hasTitle
hasTitle
hasPublisher
10Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Outline
• Introduction
• Data Model
• Method
11Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Data Graph & Schema Graph
• Data graph – Instances (types both on nodes & edges)
• Schema graph – Schema of data graph
Data Model
Using DBLP dataset, we build a data graph which has types on both nodes & edges.
Example of data graph
Paper#051
1998
L. Page
S. Brin
The anatomy
of a large…
Star: stenier…
G. Kasneci
2009
ICDEIEEE
Paper#052
WWW
Paper#061
2004
A. Balmin
Objectrank:
authority…
V. Hristidis
VLDB
Paper#065
2002
Discover:
keyword…Y. Papakonstantinou
M. Ramanath
M. Sozio
F. Suchanek
G. Weikum
2008
Paper#053
G. Ifrim
Naga:
searching…
hasTitle
hasTitle
hasTitle
hasTitle
pubWhere
pubWhere
hasPublisher
pubYear
pubWhere
WWW‘98
ICDE’09
VLDB’04
VLDB’02
ICDE’08
Publisher
Proceeding
Title
Author
ConferencePaper
Year
12Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Method Overview
• Problem
– Given query nodes and target type,
which target nodes are most relevant to the query nodes?
• Steps
1. Choose schema paths from the schema graph
2. For each path, look into data graph, and score target nodes
3. Combine those results of each path
Method
13Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
1-1) What is Schema Path?
• A schema path p is a sequence of edge types.
• Represents a workflow of how we traverse the graph.
Method
T
ARTIST
ALBUM
TRACK LISTEN_LOG
USER
GENRE
TRACK_GENRE
USER:
username
USER:
birthday
USER:
register_date USER:
gender
LISTEN_LOG:
listen_dateTRACK:
title
ARTIST:
name
ARTIST:
genderALBUM:
title
ALBUM:
release_date
GENRE:
name
TRACKLLOG
hasTrack
TRACK:title termhasTitle hasTerm
The edge types are omitted.
USER
hasUser-1
term
TRACK:title TRACKhasTerm
-1
hasTitle-1
ALBUM
hasAlbum
14Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
1-2) Choose Schema Paths
• Discover candidate schema paths, and select some of them.
– Simply, find all paths between two nodes on schema graph
• 1) User > Log > Song > Album
• 2) User > Log > Song > Artist > Album
– Symmetric Paths
• a) User > Log > Song > Log > User
• b) Song > SongTitle > Term > SongTitle > Song
– Expanding with Symmetric Paths
• 1+a) User > Log > Song > Log > User > Log > Song -> Album (CF)
• 1+b) User > Log > Song > STitle > Term > STitle > Song > Album
Method
AlbumUser
symmetric path
15Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
2-1) For each path, look into data graph
• For each path, look into data graph, and score target objects.
• From query to target, follow edges in the path.
Method
t1 t2 t3
(Term) (SongTitle) (Song) (Album) (Song)
t4
Schema Path (extracted from schema graph)
from Data Graph
query Q
target type dr
r = M
(m)
×…× M
(1)
×q = A×q
16Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
2-2) How to give scores to nodes?
• How to give scores to the nodes during propagation?
• Define scoring functions
– Number of paths
– Distribute the current state equally (random walk)
– Distribute, but not equally (e.g. for diminishing popularity effect)
Method
?
2
0
1
0
0
0
2
3
2
1
0
0.20
0.39
0.22
0.12
0.07
0.35
0.20
0.20
0.16
0.09
0.60
0.05
0.20
0.05
0.05
0.05
r = M
(m)
×…× M
(1)
×q = A×q Define scoring functions
= Define how to fill value of the matrix M
17Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
3) Combine the results
• Now, we have scores of nodes obtained from each path.
• Weighted combination of each result
– Manually
– End user
– Learning (e.g. learning-to-rank)
Method
∑ ×=
j
ijji qxRwqxR ),(),(
i-th objects of target type
weight of j-th path
score of i-th object
obtained from j-th path
18Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Outline
• Introduction
• Data Model
• Method
• Preliminary Experiments
– Datasets
– Tasks (Scenarios)
19Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Two Real-world Datasets
• Music streaming service (Bugs Music)
– Music Metadata (song, album, artist, genre)
– Users’ Listening Log
• e.g. “’User #12’ listened to ‘Song #59’ at time t.”
– Node types: song, album, artist, genre, user, date, log, etc.
• DBLP publication
– All papers published in 20 major conferences
– Node types: paper, author, conference, etc.
Preliminary Experiments
20Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Task #1: Basic Scenario
• Scenario: Recommend related conferences
– Input: a set of terms
– Output: Conferences
• Path
– (Term) -> (PaperTitle) -> (Paper) -> (Conference)
Preliminary Experiments
1 WWW
2 ICWSM
3 CHI
4 ECIR
5 WSDM
Query #1:
“twitter”
1 KDD
2 CIKM
3 ICDE
4 ICDM
5 SDM
1 ICDE
2 SIGMOD
3 VLDB
4 DASFAA
5 CIKM
1 SIGIR
2 ECIR
3 CIKM
4 AAAI
5 WWW
Query #2:
“graph clustering”
Query #3:
“transaction lock
protocal optimizing”
Query #4:
“language model
smoothing”
21Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Task #2: Different paths
• Research Q: How two different paths produce different results
• Scenario: Find similar papers
– Input: particular Paper(w/ title)
– Output: Papers(w/ title)
Preliminary Experiments
1 DISCOVER: Keyword Search in Relational
Datab.
2 Discover Relaxed Periodicity in Temporal Datab.
3 Discover Relev. Env. Feature Using Concurrent
Reinf. Learning
4 Mining Propositional Knowl. Bases to Discover
Multi-level Rules
5 Mining Multivar. Time-Ser. Sensor Data to
Discover Behav. Envel.
1 DISCOVER: Keyword Search in Relational
Datab.
2 ObjectRank: Authority-Based Keyword Search
in Datab.
3 ObjectRank: A System for Auth.-based Search
on Datab.
4 Keyword Proximity Search on XML Graphs
5 Efficient IR-Style Keyword Search over
Relational Datab.
Path #1: “(Paper) -> (PaperTitle) ->
(Term) -> (PaperTitle) -> (Paper)”
Path #2: “(Paper) -> (PaperAuthorRelation) ->
(Author) -> (PaperAuthorRelation) -> (Paper)”
Query: “DISCOVER: Keyword Search in Relational Databases”
22Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Task #3: Combine Paths
• Research Q: Combine results from two different paths
• Scenario: Given a user and current date, recommend artists
based on user's listening log and daily popularity
– Input: particular User, Current Date
– Output: Artists
Preliminary Experiments
Path #1: “(User) -> (ListenLog) -> (Song) -> (ListenLog) -> (User)
-> (ListenLog) -> (Song) -> (Artist)”
Path #2: “(Date) -> (ListenTimestamp) -> (ListenLog) -> (Song) -> (Artist)”
Artist’s name Path #1 Path #2
1 Black Eyed Peas 1 21
2 8Elight (local) 8 2
3 V.O.S. (local) - 1
4 Eminem 4 37
5 Ciara 2 86
23Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Outline
• Introduction
• Data Model
• Method
• Preliminary Experiments
• Related Work
– Keyword search in DB
– Recommender systems
– Graph-based Ranking
• Challenging Issues
– Learning
– Efficiency
– Flexible Querying Framework
– Result Presentation
24Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Related Work
• Keyword Search in DB
– Also deal with heterogeneous objects
– More concerned with performance, not ranking
• Recommender Systems
– Focus more on semantic similarity
– Only two types of entities: User & Item
• Graph-based Ranking
– Mostly edge-level
– Recently, path-level methods have been introduced.
– Some work with RDF & SPARQL
25Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Challenging Issues (Future Work)
• Learning
– can improve the quality of ranked results
– Weights of paths can be determined (e.g. learning-to-rank) [4]
• Efficiency
– Performance issues when data size go up
– Materializing [3], Filtering
• Flexible Querying Framework
– Expressiveness of query [5]
– Provide operators, conditional clauses, choose score functions
• Result presentation
– Explanation of results [6] (why it is ranked 1st)
– Schema path would be helpful.
[5] Varadarajan et al. Flexible and efficient querying and ranking …. In EDBT, 2009.
[6] Yu et al. Recommendation diversification using explanations. In ICDE, 2009.
26Ranking Objects by Following Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011||
Conclusion
• Propose an object ranking method.
• Heterogeneous Data
– Utilize graph-based data model to capture heterogeneity.
• Paths in Graph
– Use schema path in the graph which implies the semantics.
• Ranking
– By following these paths, objects are ranked.
• Discussion
– Many challenging issues & much room for improvements
27
Thank you
minsuk@europa.snu.ac.kr

Ranking Objects by Following Paths in Entity-Relationship Graphs (PhD Workshop at CIKM)

  • 1.
    1 Ranking Objects byFollowing Paths in Entity-Relationship Graphs Seoul National University, Seoul, Republic of Korea 4th Workshop for Ph.D. Students in Information and Knowledge Management (PIKM 2011) in conjunction with CIKM 2011 School of Computer Science and Engineering October 28th, 2011 Minsuk Kahng, Sangkeun Lee and Sang-goo Lee
  • 2.
    2Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Problem • Objective – Given some query objects, rank objects of a specified type • Examples – For a given user and current timestamp, which songs are most preferable? – For a given paper, which conferences are most relevant to be submitted? Introduction Brian 28 OCT 11:30 AM Query Ranked List
  • 3.
    3Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Outline • Introduction • Data Model • Method • Preliminary Experiments • Related Work • Challenging Issues • Conclusion
  • 4.
    4Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Heterogeneous Data for Search • Various types of data for search and recommendation – Traditionally, only documents and words – Now, various objects and relationships between them • Documents, Products, Music, User, Clickthrough Log, Map, Contextual Data (e.g. time & place) • Incorporating these heterogeneous data is key! Introduction
  • 5.
    5Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Utilize Graph-based Data Model • Graph-based data models have gained popularity for dealing with these heterogeneous data. – Objects and relationships can be modeled by nodes and edges. – It is called “Entity-Relationship Graphs”. • Solve the problem with graph operations Introduction Star: stenier… G. Kasneci 2009 ICDEIEEE Paper#052 Paper#065 Discover: keyword… M. Ramanath M. Sozio F. Suchanek G. Weikum 2008 Paper#053 Naga: searching… hasTitle hasTitle hasPublisher User Doc Click Time #12 cikm10.org 0 11:00 #12 cikm11.org 1 11:01 #500 … 0 13:45 #500 … 0 13:46 #500 … 1 13:46 #904 … 1 13:51 degree random walk shortest path Original database Graph data model Graph problem
  • 6.
    6Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Object Ranking with Graphs • With graph-based data models – Problem Definition • Given query objects, rank objects of a specified type – With graph models, it becomes: • Which target nodes are most similar to the query nodes? • Challenges – How to define similarity between nodes in the graph? • Not just shortest distances • Capture semantics of structured data Introduction
  • 7.
    7Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Existing work, but… • Homogeneous Graph – focus on structure of graph, not consider ‘type’ of nodes/edges – PageRank – inlink/outlink of nodes, random walks, … [1] • Consider the type of edges: Edge-level feature – Each edge has a ‘type’. – ObjectRank - Each edge type has different weight [2]. • But still, hard to capture semantics – Importance of edge type is always same regardless of tasks Introduction [1] Brin and Page. … hypertextual web search engine. In WWW, 1998. [2] Balmin et al. Objectrank: authority-based keyword search …. In VLDB, 2004.
  • 8.
    8Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Path-level feature instead of edge • Recently, the path-level feature has been mentioned as a replacement for the edge-level feature [3, 4]. – Each path, a sequence of edge types, has weight. – An edge has a different role depending on the tasks. Introduction [3] Sun et al. PathSim: meta path-based top-k similarity …. In VLDB, 2011. [4] Lao et al. … retrieval … path constrained random walks. In ECML-PKDD, 2010. hasAuthor hasTitle cite pubWhere 0.90 0.50 0.35 0.15 hasTitle-1 cite pubWhere hasChair -1 hasAuthor -1 hasAuthor pubWhere hasTitle -1 pubWhere 0.90 0.50 0.35 0.15 feature weight feature weight edge-level feature path-level feature
  • 9.
    9Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Proposed: Object Ranking with Paths • We propose an object ranking method. – Transform original data into graph-based data models – Use paths in the graphs for ranking Introduction Paper#051 1998 L. Page S. Brin The anatomy of a large… Star: stenier… G. Kasneci 2009 ICDEIEEE Paper#052WWW Paper#061 2004 A. Balmin Objectrank: authority… V. Hristidis VLDB Paper#065 2002 Discover: keyword…Y. Papakonstantinou M. Ramanath M. Sozio F. Suchanek G. Weikum 2008 Paper#053 G. Ifrim Naga: searching… hasTitle hasTitle hasTitle hasTitle hasPublisher
  • 10.
    10Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Outline • Introduction • Data Model • Method
  • 11.
    11Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Data Graph & Schema Graph • Data graph – Instances (types both on nodes & edges) • Schema graph – Schema of data graph Data Model Using DBLP dataset, we build a data graph which has types on both nodes & edges. Example of data graph Paper#051 1998 L. Page S. Brin The anatomy of a large… Star: stenier… G. Kasneci 2009 ICDEIEEE Paper#052 WWW Paper#061 2004 A. Balmin Objectrank: authority… V. Hristidis VLDB Paper#065 2002 Discover: keyword…Y. Papakonstantinou M. Ramanath M. Sozio F. Suchanek G. Weikum 2008 Paper#053 G. Ifrim Naga: searching… hasTitle hasTitle hasTitle hasTitle pubWhere pubWhere hasPublisher pubYear pubWhere WWW‘98 ICDE’09 VLDB’04 VLDB’02 ICDE’08 Publisher Proceeding Title Author ConferencePaper Year
  • 12.
    12Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Method Overview • Problem – Given query nodes and target type, which target nodes are most relevant to the query nodes? • Steps 1. Choose schema paths from the schema graph 2. For each path, look into data graph, and score target nodes 3. Combine those results of each path Method
  • 13.
    13Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| 1-1) What is Schema Path? • A schema path p is a sequence of edge types. • Represents a workflow of how we traverse the graph. Method T ARTIST ALBUM TRACK LISTEN_LOG USER GENRE TRACK_GENRE USER: username USER: birthday USER: register_date USER: gender LISTEN_LOG: listen_dateTRACK: title ARTIST: name ARTIST: genderALBUM: title ALBUM: release_date GENRE: name TRACKLLOG hasTrack TRACK:title termhasTitle hasTerm The edge types are omitted. USER hasUser-1 term TRACK:title TRACKhasTerm -1 hasTitle-1 ALBUM hasAlbum
  • 14.
    14Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| 1-2) Choose Schema Paths • Discover candidate schema paths, and select some of them. – Simply, find all paths between two nodes on schema graph • 1) User > Log > Song > Album • 2) User > Log > Song > Artist > Album – Symmetric Paths • a) User > Log > Song > Log > User • b) Song > SongTitle > Term > SongTitle > Song – Expanding with Symmetric Paths • 1+a) User > Log > Song > Log > User > Log > Song -> Album (CF) • 1+b) User > Log > Song > STitle > Term > STitle > Song > Album Method AlbumUser symmetric path
  • 15.
    15Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| 2-1) For each path, look into data graph • For each path, look into data graph, and score target objects. • From query to target, follow edges in the path. Method t1 t2 t3 (Term) (SongTitle) (Song) (Album) (Song) t4 Schema Path (extracted from schema graph) from Data Graph query Q target type dr r = M (m) ×…× M (1) ×q = A×q
  • 16.
    16Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| 2-2) How to give scores to nodes? • How to give scores to the nodes during propagation? • Define scoring functions – Number of paths – Distribute the current state equally (random walk) – Distribute, but not equally (e.g. for diminishing popularity effect) Method ? 2 0 1 0 0 0 2 3 2 1 0 0.20 0.39 0.22 0.12 0.07 0.35 0.20 0.20 0.16 0.09 0.60 0.05 0.20 0.05 0.05 0.05 r = M (m) ×…× M (1) ×q = A×q Define scoring functions = Define how to fill value of the matrix M
  • 17.
    17Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| 3) Combine the results • Now, we have scores of nodes obtained from each path. • Weighted combination of each result – Manually – End user – Learning (e.g. learning-to-rank) Method ∑ ×= j ijji qxRwqxR ),(),( i-th objects of target type weight of j-th path score of i-th object obtained from j-th path
  • 18.
    18Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Outline • Introduction • Data Model • Method • Preliminary Experiments – Datasets – Tasks (Scenarios)
  • 19.
    19Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Two Real-world Datasets • Music streaming service (Bugs Music) – Music Metadata (song, album, artist, genre) – Users’ Listening Log • e.g. “’User #12’ listened to ‘Song #59’ at time t.” – Node types: song, album, artist, genre, user, date, log, etc. • DBLP publication – All papers published in 20 major conferences – Node types: paper, author, conference, etc. Preliminary Experiments
  • 20.
    20Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Task #1: Basic Scenario • Scenario: Recommend related conferences – Input: a set of terms – Output: Conferences • Path – (Term) -> (PaperTitle) -> (Paper) -> (Conference) Preliminary Experiments 1 WWW 2 ICWSM 3 CHI 4 ECIR 5 WSDM Query #1: “twitter” 1 KDD 2 CIKM 3 ICDE 4 ICDM 5 SDM 1 ICDE 2 SIGMOD 3 VLDB 4 DASFAA 5 CIKM 1 SIGIR 2 ECIR 3 CIKM 4 AAAI 5 WWW Query #2: “graph clustering” Query #3: “transaction lock protocal optimizing” Query #4: “language model smoothing”
  • 21.
    21Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Task #2: Different paths • Research Q: How two different paths produce different results • Scenario: Find similar papers – Input: particular Paper(w/ title) – Output: Papers(w/ title) Preliminary Experiments 1 DISCOVER: Keyword Search in Relational Datab. 2 Discover Relaxed Periodicity in Temporal Datab. 3 Discover Relev. Env. Feature Using Concurrent Reinf. Learning 4 Mining Propositional Knowl. Bases to Discover Multi-level Rules 5 Mining Multivar. Time-Ser. Sensor Data to Discover Behav. Envel. 1 DISCOVER: Keyword Search in Relational Datab. 2 ObjectRank: Authority-Based Keyword Search in Datab. 3 ObjectRank: A System for Auth.-based Search on Datab. 4 Keyword Proximity Search on XML Graphs 5 Efficient IR-Style Keyword Search over Relational Datab. Path #1: “(Paper) -> (PaperTitle) -> (Term) -> (PaperTitle) -> (Paper)” Path #2: “(Paper) -> (PaperAuthorRelation) -> (Author) -> (PaperAuthorRelation) -> (Paper)” Query: “DISCOVER: Keyword Search in Relational Databases”
  • 22.
    22Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Task #3: Combine Paths • Research Q: Combine results from two different paths • Scenario: Given a user and current date, recommend artists based on user's listening log and daily popularity – Input: particular User, Current Date – Output: Artists Preliminary Experiments Path #1: “(User) -> (ListenLog) -> (Song) -> (ListenLog) -> (User) -> (ListenLog) -> (Song) -> (Artist)” Path #2: “(Date) -> (ListenTimestamp) -> (ListenLog) -> (Song) -> (Artist)” Artist’s name Path #1 Path #2 1 Black Eyed Peas 1 21 2 8Elight (local) 8 2 3 V.O.S. (local) - 1 4 Eminem 4 37 5 Ciara 2 86
  • 23.
    23Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Outline • Introduction • Data Model • Method • Preliminary Experiments • Related Work – Keyword search in DB – Recommender systems – Graph-based Ranking • Challenging Issues – Learning – Efficiency – Flexible Querying Framework – Result Presentation
  • 24.
    24Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Related Work • Keyword Search in DB – Also deal with heterogeneous objects – More concerned with performance, not ranking • Recommender Systems – Focus more on semantic similarity – Only two types of entities: User & Item • Graph-based Ranking – Mostly edge-level – Recently, path-level methods have been introduced. – Some work with RDF & SPARQL
  • 25.
    25Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Challenging Issues (Future Work) • Learning – can improve the quality of ranked results – Weights of paths can be determined (e.g. learning-to-rank) [4] • Efficiency – Performance issues when data size go up – Materializing [3], Filtering • Flexible Querying Framework – Expressiveness of query [5] – Provide operators, conditional clauses, choose score functions • Result presentation – Explanation of results [6] (why it is ranked 1st) – Schema path would be helpful. [5] Varadarajan et al. Flexible and efficient querying and ranking …. In EDBT, 2009. [6] Yu et al. Recommendation diversification using explanations. In ICDE, 2009.
  • 26.
    26Ranking Objects byFollowing Paths in Entity-Relationship Graphs© 2011 Minsuk Kahng PIKM 2011|| Conclusion • Propose an object ranking method. • Heterogeneous Data – Utilize graph-based data model to capture heterogeneity. • Paths in Graph – Use schema path in the graph which implies the semantics. • Ranking – By following these paths, objects are ranked. • Discussion – Many challenging issues & much room for improvements
  • 27.