Mining Interesting Meta-Paths from Complex Heterogeneous Information Networks
Baoxu Shi, Tim Weninger
University of Notre Dame
Homogeneous Network
MoDAT
Heterogeneous Network
Node types: Association, People, University, City, Country, Conference, Workshop
Edge labels: belongs to, speaks at, locates at, locates at the capital of, affiliate, professor of
Higher-level types: People, Association, Meeting, Education, Geography
Path and Meta-Path
Types: People, Meeting, Education, Geography, Association
How are things uniquely connected or separated?
Path: NORTHEASTERN → BARABÁSI → NOTRE DAME → SOUTH BEND
Meta-Path: EDUCATION → PEOPLE → EDUCATION → GEOGRAPHY
An interesting meta-path is the meta-path that best describes how two objects are uniquely related in a complex heterogeneous information network (HIN).
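The relation between a path and its meta-path can be sketched in a few lines of Python. This is only an illustration: the `NODE_TYPE` dictionary and the `to_meta_path` helper are hypothetical names, and the type assigned to each node is an assumption read off the figure labels on this slide.

```python
# Hypothetical node-to-type mapping, assumed from the slide's figure labels.
NODE_TYPE = {
    "NORTHEASTERN": "EDUCATION",
    "BARABÁSI": "PEOPLE",
    "NOTRE DAME": "EDUCATION",
    "SOUTH BEND": "GEOGRAPHY",
}


def to_meta_path(path, node_type):
    """Replace every node on a path with its type label."""
    return [node_type[node] for node in path]


path = ["NORTHEASTERN", "BARABÁSI", "NOTRE DAME", "SOUTH BEND"]
meta_path = to_meta_path(path, NODE_TYPE)
print(" -> ".join(meta_path))
```

In a network where each node carries many types at different levels of a hierarchy, the same path maps to many candidate meta-paths, which is why a selection criterion is needed.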
Meta-paths for the path NORTHEASTERN → BARABÁSI → NOTRE DAME → SOUTH BEND, from more general to more specific:
• Education → Professor → University → Geography
• Education → Network Scientist → Catholic University → Geography
• Education → Network Scientist born in Transylvania, 1967 → Catholic University at South Bend, IN → Geography
Limitations of the State of the Art
Meta-path related research:
• Types of meta-labels are limited
• Meta-types do not have a complex hierarchy
• Meta-paths are pre-defined manually
• No large-scale experiments
Example meta-types: Term, Venue, Paper, Author
This work:
• A framework that can handle millions of meta-types
• Meta-types with a complex hierarchy
• Meta-paths are generated automatically
• Experiments are done on Wikipedia (10 million nodes, 740 million edges)
How to find interesting paths?
• Generate paths
• Rank the top-k interesting paths using meta-data
• Extract the meta-path for searching
Path Generation
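• Generate path set for given points $a_u, a_v$:
  $\{\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_k\} \in X$, where $\vec{x} = \langle a_1, a_2, \ldots, a_{|\vec{x}|} \rangle_i$
  $a_u = a_1,\ a_v = a_{|\vec{x}|},\ 1 \le i \le k$
• Generate sibling path set:
  $\mathrm{sib}(a_i, a_j) \iff a_i \in t \wedge a_j \in t$
  $\forall a_{u'} \in A'_u,\ \mathrm{sib}(a_{u'}, a_u)$
  $\forall a_{v'} \in A'_v,\ \mathrm{sib}(a_{v'}, a_v)$
  $\{\vec{y}_1, \vec{y}_2, \ldots\} \in Y$, where $\vec{y} = \langle a_1, a_2, \ldots, a_{|\vec{y}|} \rangle_i$
  $a_1 \in A'_u,\ a_{|\vec{y}|} \in A'_v,\ 1 \le i \le k$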
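The two generation steps can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the graph is an adjacency dictionary, `types` maps each node to a set of type labels, "short paths" is taken to mean simple paths with at most `max_nodes` nodes, and a node is not counted as its own sibling. The function names and the toy data are hypothetical.

```python
from itertools import product


def siblings(node, types):
    """Nodes that share at least one type with `node` (the sib relation)."""
    return {other for other, ts in types.items()
            if other != node and ts & types[node]}


def short_paths(graph, src, dst, max_nodes):
    """All simple paths from src to dst that contain at most max_nodes nodes."""
    paths, stack = [], [[src]]
    while stack:
        path = stack.pop()
        if path[-1] == dst:
            paths.append(path)
            continue
        if len(path) < max_nodes:
            for nxt in graph.get(path[-1], ()):
                if nxt not in path:
                    stack.append(path + [nxt])
    return paths


def generate_path_sets(graph, types, a_u, a_v, max_nodes=3):
    """Return (X, Y): paths between the endpoints and between their siblings."""
    X = short_paths(graph, a_u, a_v, max_nodes)
    Y = []
    for s, t in product(siblings(a_u, types), siblings(a_v, types)):
        if s != t:
            Y.extend(short_paths(graph, s, t, max_nodes))
    return X, Y


# Toy data (hypothetical): A and C share a type, as do B and D.
graph = {"A": ["M"], "B": ["M"], "M": ["A", "B"],
         "C": ["N"], "D": ["N"], "N": ["C", "D"]}
types = {"A": {"person"}, "C": {"person"},
         "B": {"city"}, "D": {"city"},
         "M": {"school"}, "N": {"school"}}

X, Y = generate_path_sets(graph, types, "A", "B")
print(X, Y)
```

Enumerating every sibling pair is quadratic in the sibling set sizes, so a graph at Wikipedia scale would need sampling or pruning; the sketch ignores that.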
Example: Path generation
[Figure: the endpoints ROCKNE and WAND with sibling nodes grouped under shared types. Types shown: AMERICAN COMPUTER SCIENTISTS, PROGRAMMING LANGUAGE RESEARCHERS, UNIVERSITY OF NOTRE DAME FACULTY, COLLEGE FOOTBALL HALL OF FAME INDUCTEES. Sibling nodes shown: BARBARA LISKOV, ANDERS_HEJLSBERG, HAL ABELSON, VASANT HONAVAR, JOHN HEISMAN, BARRY SANDERS, JULIUS NIEUWLAND, BARABÁSI, and others. Short paths between the node sets form X and Y.]
[Figure: example paths through ROCKNE, NORTHEASTERN, NOTRE DAME, WAND, BARABÁSI, ROSE BOWL, HARVARD, CY YOUNG, CARL HUBBELL, CARNEGIE MELLON UNIVERSITY, TD GARDEN, and LA COLISEUM.]
Which is the most interesting path?
Path Ranking
• Unordered Ranking
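$\vec{x}_i = \langle a_1, a_2, \ldots, a_{|\vec{x}_i|} \rangle$, $\vec{T}_{\vec{x}_i} = \langle T_{a_1}, T_{a_2}, \ldots, T_{a_{|\vec{x}_i|}} \rangle$
$T_{\vec{x}_i} = \bigcup_{n=1}^{|\vec{x}_i|} T_{a_n}$
$T_{\vec{y}_i} = T_{a'_u} \cup \bigcup_{n=2}^{|\vec{y}_i|-1} \{T_{a_n}\} \cup T_{a'_v}$
$T_Y = \bigcup_{i=1}^{|Y|} \{T_{\vec{y}_i}\}$
$r(\vec{x}_i) = \frac{|T_{\vec{x}_i} \cap T_Y|}{|T_Y|}$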
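A small Python sketch of the unordered score may help make the formula concrete. Two assumptions are made here: the operator between the two type sets in the numerator is taken to be set intersection, and T_Y is flattened into a single set of type labels. The function names and toy data are hypothetical.

```python
def path_types(path, types):
    """T_x: union of the type sets of every node on the path."""
    result = set()
    for node in path:
        result |= types[node]
    return result


def unordered_rank(path, sibling_paths, types):
    """|T_x & T_Y| / |T_Y|, with T_Y the union of types over all sibling paths."""
    t_y = set()
    for y in sibling_paths:
        t_y |= path_types(y, types)
    if not t_y:
        return 0.0  # no sibling types to compare against
    return len(path_types(path, types) & t_y) / len(t_y)


# Toy data (hypothetical): one candidate path and one sibling path.
types = {"a": {"person", "scientist"}, "m": {"school"}, "b": {"city"},
         "c": {"person"}, "n": {"school", "catholic"}, "d": {"city"}}
score = unordered_rank(["a", "m", "b"], [["c", "n", "d"]], types)
print(score)
```

Under this reading the score behaves like a similarity: it grows as the candidate path shares more types with its sibling paths, regardless of where on the path those types occur.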
Path Ranking
• Ordered Ranking
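$\vec{x}_i = \langle a_1, a_2, \ldots, a_{|\vec{x}_i|} \rangle$, $\vec{T}_{\vec{x}_i} = \langle T_{a_1}, T_{a_2}, \ldots, T_{a_{|\vec{x}_i|}} \rangle$
$p(a_n, a'_n) = \frac{|T_{a_n} \cap T_{Y_n}|}{|T_{Y_n}|}$
$r(\vec{x}_i) = \operatorname{mean}_{n=1}^{|\vec{x}_i|} \big( p(a_n, a'_n) \big)$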
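The ordered variant compares types position by position. The sketch below assumes that the lost operator in the numerator is set intersection, that T_{Y_n} is the union of the types found at position n of the sibling paths, and that only sibling paths with the same length as the candidate are aligned. None of these details are stated on the slide, and the names and toy data are hypothetical.

```python
from statistics import mean


def ordered_rank(path, sibling_paths, types):
    """Mean over positions n of |T_{a_n} & T_{Y_n}| / |T_{Y_n}|."""
    scores = []
    for n, node in enumerate(path):
        # T_{Y_n}: types seen at position n of equally long sibling paths.
        t_yn = set()
        for y in sibling_paths:
            if len(y) == len(path):
                t_yn |= types[y[n]]
        scores.append(len(types[node] & t_yn) / len(t_yn) if t_yn else 0.0)
    return mean(scores)


# Toy data (hypothetical): one candidate path and one sibling path.
types = {"a": {"person", "scientist"}, "m": {"school"}, "b": {"city"},
         "c": {"person"}, "n": {"school", "catholic"}, "d": {"city"}}
ordered_score = ordered_rank(["a", "m", "b"], [["c", "n", "d"]], types)
print(round(ordered_score, 4))
```

Unlike the unordered score, a type only counts here if it appears at the same step of a sibling path, so two paths with the same types in a different order score lower.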
Result: Path Ranking
Qualitative analysis was done with Amazon Mechanical Turk workers.
[Figure: plot of five data points; x-axis ticks 0, 0.25, 0.5, 0.75, 1; y-axis ticks 0.48, 0.52, 0.56, 0.60.]
Results show that users are more likely to pick the path with the lowest or highest similarity.
People may pick the path with the highest score because they treat "best" as "correct".
Example: Extract Meta-Path
[Figure: the nodes JIAWEI HAN, SIGKDD, and JOHANNES GEHRKE, each with its types arranged from more specific to more general. Types shown: DATA MINERS, STATISTICIANS, MATHEMATICIANS, PEOPLE, SCHOLARS AND ACADEMICS, DATA MINING, SCIENCE, ACM SIGS, COMPUTATIONAL STATISTICS, MATHEMATICAL SCIENCES, STATISTICS SOCIETY, ACM, PROFESSIONAL ORGANIZATIONS, SCIENTIFIC SOCIETIES, DATABASE RESEARCHERS, COMPUTER SCIENTISTS, SCHOLARS, ORGANIZATIONS.]
Result: Meta-Path Constrained RWR
0 0.24 0.41 0.48
Edgar F. Codd 40.5 18.1 9.0
Johannes Gehrke 28.4 29.4 8.4 2.8
Raghu Ramakrishnan 31.1 6.0 3.6
Anita Borg 5.1 0.6 0.2
Shafi Goldwasser 4.9 0.6
Osmar R. Zaiane 4.8 3.6 1.6
Vint Cerf 4.1 2.4 0.2
Allen Newell 2.0 0.6
ACM 5.1
IEEE 4.9
Yahoo! Research 4.8
Microsoft Research 4.4
Database researchers
Computer Scientist
Questions?
