SlideShare a Scribd company logo
1 of 46
Download to read offline
Tackling Usability Challenges in Querying
Massive, Ultra-heterogeneous Graphs
Chengkai Li
Associate Professor, Department of Computer Science and Engineering
Director, Innovative Database and Information Systems Research (IDIR) Laboratory
University of Texas at Arlington
North Carolina State University, Feb. 9th, 2016
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Ultra-heterogeneous Entity Graphs
Large, complex and schema-less graphs capturing millions of entities and
billionsof relationshipsbetween entities.
Linked Open Data : 52 billion RDF triples
Freebase : 1.8 billion triples
DBpedia : 470 million triples
Yago : 120 million triples
entity
relationship
2
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Linked Open Data
http://linkeddata.org/(hundreds of datasets, billions of RDF triples)
3
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Structured Queries are Difficult to Write
SQL QUERY:
SELECT Founder.subj, Founder.obj
FROM Founder,
Nationality,
HeadquarteredIn
WHERE
Founder.subj = Nationality.subj AND
Founder.obj = HeadquarteredIn.subj
SPARQL QUERY:
SELECT ?company ?founder WHERE {
?founder dbo:founded ?company .
?founder dbo:nationality db:United_States .
?company dbprop:headquartered_in db:Silicon_Valley .
}4
- Require knowledge on data
model, query language, and schema.
- Well-known usability challenges [Jagadish+07]
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Simpler Query Paradigms
Keyword Search
o [Kargar+11], BLINKS [He+07]
o Challenging to articulate exact query intent by keywords
Approximate Query Answering
o NESS [Khan+11]: uses neighborhood-based indexes to quickly find
approximate matches to a query graph;
o TALE [Tian+08]: approximate large graph matching
o Users still have to formulate the initial query graph
5
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Visual Query Builders
Relational Databases: CLIDE [Petropoulos+06]
Graph Databases: VOGUE [Bhowmick+13], PRAGUE [Jin+12],
Gblender [Jin+10], GRAPHITE [Chau+08]
Single Large Graphs: QUBLE [Hung+13]
o Require a good knowledge of the underlying schema
o No automatic suggestionsregardingwhat to include in the query graph
6
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
An Example: QUBLE
7
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Orion: Auto-Suggestion Based Visual Interface
for Interactive Query Construction
8
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Orion
9
Introduction Video
http://bit.ly/1O0GnNo
Prototype
http://idir.uta.edu/orion
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Objectives of Orion
o Interactive GUI for building query components
o Iteratively suggest edges based on their relevance to the user’s query
intent, according to the partial query graph so far
10
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Orion GUI
Dynamic list of all
possible user actions
at any given moment
Control panel for
various settings and
tips
11
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Active Mode
Grey edges and nodes automatically
suggested in active mode:
o Accepted by user (blue): positive edges
o Ignored by user: negative edges
12
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Passive Mode
A new node added in passive mode
A new edge added in passive mode
13
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Rank Candidate Edges for Suggestions
Possible Solutions
o Order alphabetically
o Adapt standard machine learning algorithms
o Naïve Bayes classifier
o Random forests
o Class association rules
o Recommendation systems (based on SVD)
Query-specific Random Decision Paths (RDP)
14
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Concepts
Edges in partial query graph (positive edges)
starring, director, music
Edges rejected by users (negative edges)
education, nationality
Candidate edges
producer, writer, editor
Query Session:
<starring, director, music, education, nationality>
15
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Concepts
Query Log (W)
Problem
Given a query log, a query session, and a set of candidate edges, rank the
candidate edges by their relevance to the user’s query intent
16
Positive Edge
Negative Edge
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Random Decision Path (RDP)
o Choose edges from the query session randomly, to form RDPs
o Each decision path selects a subset of the query log, with no more than ‘τ’ rows
o Grow a path incrementally until its support in the query log drops below ‘τ’
<starring, director, music, education, nationality>
17
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Random Decision Path: Scoring
o For each RDP, use its corresponding query log subset to compute the
support of each candidate edge.
o Final score of each candidate is its average score across all RDPs.
o If R is the set of all RDPs:
|},|{|
|}}{,|{|
),,sup(
),,sup(*
||
1
)(
wQWww
weQWww
WQe
WQe
R
escore
i
i
i
RQi
i
⊆∈
⊆∪∈
=
= ∑∈
18
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Query Log
Nonexistent (almost)
Simulate and bootstrap
o Find positive edges
o Wikipedia and data graph
o Data graph only
o SPARQL query log [Morsey+11]
o Inject negative edges
19
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Query Log Simulation: Wikipedia + Data Graph
Use Sentences in Wikipedia Articles to Identify Positive Edges
<degree, almaMater, frat_member>
20
NodesMapped:
Jerry Yang, Electrical Engineering,
StanfordUniversity,PhiKappaPsi
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Query Log Simulation: Data Graph Only
Represent Each Node as an Itemset of Positive Edges
Generate Frequent Itemsets of Varying Sizes
o Each frequent itemset of edges forms positive edges
<degree, almaMater, nationality, frat_member,
founded, places_lived>
<founded, place_founded, headquartered_in>
21
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Query Log Simulation: Injecting Negative Edges
Positive Edges List
(1) writer, starring, producer
(2) starring, editor, education
(3) editor, nationality, music
Inject Negative Edges
writer, starring, producer, editor, education (starring appears in 2)
starring, editor, education, writer, producer, nationality, music (starring
appears in 1, and editor appears in 3)
editor, nationality, music, starring, education (editor appears in 2)
22
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Experiments
System Configurations
o Double quad-core 2.0 GHz Xeon server, 24 GB RAM
o TACC: 5 Dell PowerEdge R910 server nodes, with 4 Intel Xeon
E7540 2.0 GHz 6-core processors, 1 TB RAM
Datasets
o Freebase (33 M edges, 30 M nodes, 5253 edge types)
o DBpedia (12 M edges, 4 M nodes, 647 edge types)
23
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Experiments (cont.)
User Studies
o Orion (RDP) and Naïve
Edge Ranking Algorithms Compared
o Random Decision Paths (RDP)
o Naïve Bayes classifier (NB)
o Random forest classifier (RF)
o Class association rules (CAR)
o SVD based recommendation system (SVD)
24
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Query Logs Compared
o Freebase: Wiki, Data
o DBpedia: Wiki, Data, QLog
25
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
User Studies: Setup
15 Users for Orion, 15 Users for Naïve (A/B testing)
45 Easy, 30 Medium, and 30 Hard Query Tasks Designed
3 Easy, 2 Medium, 2 Hard Queries Assigned per Query Task
105 Query Tasks per System in Total
4 Survey Questions per Query Task
26
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
User Studies: Conversion Rate
Conversion Rate:
o Percentage of query tasks completed successfully
o Successful completion measured using edge isomorphism, and not a
binary notion of matching
Orion has a higher conversion rate
than Naïve for complex queries!
27
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
User Studies: Efficiency by Time
Time required to construct query graphs in Orion is
comparable to Naïve in most cases, despite the steeper
learning curve of Orion due to more features
0
200
400
600
800
1000
1200
1400
Naïve Orion
Timeperquery(inseconds)
Hard Queries
0
200
400
600
800
1000
1200
1400
Naïve Orion
Timeperquery(inseconds)
All Queries
0
200
400
600
800
1000
1200
1400
1600
Naïve Orion
Timeperquery(inseconds)
Easy Queries
0
200
400
600
800
1000
Naïve Orion
Timeperquery(inseconds)
Medium Queries
28
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
User Studies: Efficiency by Number of Iterations
Time required to construct query graphs in Orion is comparable to Naïve in most
cases, despite the steeper learning curve of Orion due to more features
0
10
20
30
40
50
60
70
80
90
All Easy Medium Hard
Numberofiterations
Query Task Difficulty Level
29
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
User Studies: User Experience Results
As the difficulty level of the
query graph being constructed
increases, the usability of
Orion seems significantly
better than Naïve’s
3
3.2
3.4
3.6
3.8
4
4.2
Q1 Q2 Q3 Q4
AverageLikertscalescore
All Queries
Naive Orion
3
3.2
3.4
3.6
3.8
4
4.2
4.4
Q1 Q2 Q3 Q4
AverageLikertscalescore
Easy Queries
Naive Orion
3
3.2
3.4
3.6
3.8
4
4.2
Q1 Q2 Q3 Q4
AverageLikertscalescore
Medium Queries
Naive Orion
3
3.2
3.4
3.6
3.8
4
4.2
Q1 Q2 Q3 Q4
AverageLikertscalescore
Hard Queries
Naive Orion
30
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Edge Ranking Algorithms
o Simulates only Passive Mode
o 43 target query graphs for Freebase
o 6 two-edged, 10 three-edged, 9 four-edged, 17 five-edged, 1 six-
edged (includes medium and hard queries from the user study)
o 167 input instances
o 33 target query graphs for DBpedia
o 2 three-edged, 29 four-edged, 2 five-edged
o 130 input instances31
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Edge Ranking Algorithms: Efficiency by Number
of Suggestions
0
25
50
75
100
125
150
175
RDP RF NB SVD CAR
Average#ofsuggestionsperquery
Freebase
0
25
50
75
100
125
150
175
RDP RF NB SVD CAR
Average#ofsuggestionsperquery
DBpedia
RDP requires fewer suggestions compared
to all other methods
RDP requires only 40 suggestions, 1.5-
4 times fewer than other methods
32
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Edge Ranking Algorithms: Efficiency by Time
0
25
50
75
100
125
150
175
RDP RF NB SVD CAR
Averagetimeperquery(inseconds)
Freebase
0
100
200
300
400
RDP RF NB SVD CAR
AveragetimeperQuery(inseconds)
DBpedia
RDP better than RF and comparable to
NB, despite RF and NB being light models
RDP significantly better than SVD
and CAR, but worse than RF and NB
33
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Query Logs Comparison
Positive edges better captured based
on the context of human usage of
relationships in Wikipedia
DBpedia is created using info-boxes in
Wikipedia, and is thus very clean. Wiki-DB
is highly similar to Data-DB for DBpedia
0
25
50
75
100
125
150
175
200
Wiki-FB Data-FB
Numberofsuggestionsperquery
Freebase
0
25
50
75
100
125
150
175
200
Wiki-DB Data-DB QLog-DB
Average#ofsuggestionsperquery
DBpedia
34
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Tuning RDP Parameters
RDP performs better with more random
decision paths and higher query log
threshold
Considering negative edges in query
session is important, as it results in
better performance of RDP
40
60
80
100
120
(1,1) (2,2) (4,4) (8,8) (10,10) (25,25)
Average#ofsuggestionsperquery
Freebase: Parameters (N, τ)
RDP RDP-noneg
120
130
140
150
160
170
180
(1,1) (2,2) (4,4) (8,8) (10,10) (25,25)
Average#ofsuggestionsperquery
DBpedia: Parameters (N, τ)
RDP RDP-noneg
35
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Orion Contributions
Unique visual query builder with suggestions
Edge ranking algorithm : random decision paths
Query log simulation
Extensive user studies and experiments
Prototype available
36
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
GQBE: Graph Query by Example
37
Introduction Video
http://bit.ly/1PLqLTD
Prototype
http://idir.uta.edu/gqbe
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
GQBE GUI
Ranked similar
answer tuples
Keyword completion
powered query interface
Query graph automatically
discovered by the system An example answer graph
38
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
TableView: Generating Preview Tables for
Entity Graphs
39
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
TableView
40
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
TableView
41
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Publications
Overview
o IntuitiveandInteractiveQueryFormulationtoImprovetheUsabilityof QuerySystemsfor
HeterogeneousGraphs.NandishJayaram.VLDB2015 PhDWorkshop.
Orion
o Auto-SuggestionBasedVisualInterfaceforInteractiveQueryConstructiononUltra-Heterogeneous
Graphs.NandishJayaram,RohitBhooplam,ChengkaiLi,VassilisAthitsos.Underpreparation.
o VIIQ:Auto-SuggestionEnabledVisualInterfaceforInteractiveGraphQueryFormulation.Nandish
Jayaram,SidharthGoyal,ChengkaiLi.PVLDB,8(12):1940-1943, August2015. Demonstration
description.
GQBE
o Queryingknowledgegraphsbyexampleentitytuples(ExtendedAbstract).NandishJayaram,Arijit
Khan,ChengkaiLi,XifengYan,RamezElmasri.ICDE2016 (TKDEPoster)
42
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Publications
GQBE (cont’d)
o Queryingknowledgegraphsbyexampleentitytuples.NandishJayaram,ArijitKhan,ChengkaiLi,
XifengYan,RamezElmasri.TKDE,27(10):2797-2811, October2015.
o GQBE:Queryingknowledgegraphsbyexampleentitytuples.NandishJayaram,MaheshGupta,Arijit
Khan,ChengkaiLi,XifengYan,RamezElmasri.ICDE,pages1250-1253, 2014. Demonstration
description.
o Towardsaquery-by-examplesystemforknowledgegraphs.NandishJayaram,ArijitKhan,Chengkai
Li,XifengYan,RamezElmasri.GRADES2014.
TableView
o GeneratingPreviewTablesforEntityGraphs.NingYan,AbolfazlAsudeh,andChengkaiLi.Technical
Report,arXiv:1403.5006, March2014.
43
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Acknowledgment
Funding sponsors
Disclaimer: This material is based upon work partially supported by the National
Science Foundation Grants 1018865, 1117369, 1408928 and 1565699, 2011 and
2012 HP Labs Innovation Research Awards, the National Natural Science
Foundation of China Grant 61370019, and Army Research Laboratory under
cooperative agreement W911NF-09-2-0053 (NSCTA). Any opinions, findings, and
conclusions or recommendations expressed in this material are those of the
author(s) and do not necessarily reflect the views of the funding agencies.
44
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Acknowledgment
UTA Students
o NandishJayaram
o Sona Hasani
o Rohit Bhoopalam
o SidharthGoyal
o MaheshGupta
Collaborators
o VassilisAthitsos(UTA)
o ArijitKhan(NTU)
o RamezElmasri(UTA)
o NingYan(FutureWei,formerstudent)
o XifengYan(UCSB)
45
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Thank You! Questions?
o http://ranger.uta.edu/~cli
cli@uta.edu
o Prototypes
http://idir.uta.edu/orion
http://idir.uta.edu/gqbe
46

More Related Content

Similar to Tackling Usability Challenges in Querying Massive, Ultra-heterogeneous Graphs

Haystack Live tallison_202010_v2
Haystack Live tallison_202010_v2Haystack Live tallison_202010_v2
Haystack Live tallison_202010_v2Tim Allison
 
Open Analytics Environment
Open Analytics EnvironmentOpen Analytics Environment
Open Analytics EnvironmentIan Foster
 
SkiPHP -- Database Basics for PHP
SkiPHP -- Database Basics for PHP SkiPHP -- Database Basics for PHP
SkiPHP -- Database Basics for PHP Dave Stokes
 
BioIT Europe 2010 - BioCatalogue
BioIT Europe 2010 - BioCatalogueBioIT Europe 2010 - BioCatalogue
BioIT Europe 2010 - BioCatalogueBioCatalogue
 
The data streaming processing paradigm and its use in modern fog architectures
The data streaming processing paradigm and its use in modern fog architecturesThe data streaming processing paradigm and its use in modern fog architectures
The data streaming processing paradigm and its use in modern fog architecturesVincenzo Gulisano
 
Scientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & SociologyScientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & SociologyNeil Chue Hong
 
An introduction to R is a document useful
An introduction to R is a document usefulAn introduction to R is a document useful
An introduction to R is a document usefulssuser3c3f88
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataJames Sirota
 
Introduction to Big Data Analytics: Batch, Real-Time, and the Best of Both Wo...
Introduction to Big Data Analytics: Batch, Real-Time, and the Best of Both Wo...Introduction to Big Data Analytics: Batch, Real-Time, and the Best of Both Wo...
Introduction to Big Data Analytics: Batch, Real-Time, and the Best of Both Wo...WSO2
 
Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016Daniel Jacobson
 
Geometric and Semantic Matching for Cultural Heritage Artefacts
Geometric and Semantic Matching for Cultural Heritage ArtefactsGeometric and Semantic Matching for Cultural Heritage Artefacts
Geometric and Semantic Matching for Cultural Heritage ArtefactsGravitate Project
 
Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Josef Hardi
 
Database Basics with PHP -- Connect JS Conference October 17th, 2015
Database Basics with PHP -- Connect JS Conference October 17th, 2015Database Basics with PHP -- Connect JS Conference October 17th, 2015
Database Basics with PHP -- Connect JS Conference October 17th, 2015Dave Stokes
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for GraphsJean Ihm
 

Similar to Tackling Usability Challenges in Querying Massive, Ultra-heterogeneous Graphs (20)

Haystack Live tallison_202010_v2
Haystack Live tallison_202010_v2Haystack Live tallison_202010_v2
Haystack Live tallison_202010_v2
 
Open Analytics Environment
Open Analytics EnvironmentOpen Analytics Environment
Open Analytics Environment
 
Generating Preview Tables for Entity Graphs
Generating Preview Tables for Entity GraphsGenerating Preview Tables for Entity Graphs
Generating Preview Tables for Entity Graphs
 
SkiPHP -- Database Basics for PHP
SkiPHP -- Database Basics for PHP SkiPHP -- Database Basics for PHP
SkiPHP -- Database Basics for PHP
 
BioIT Europe 2010 - BioCatalogue
BioIT Europe 2010 - BioCatalogueBioIT Europe 2010 - BioCatalogue
BioIT Europe 2010 - BioCatalogue
 
cikm14
cikm14cikm14
cikm14
 
The data streaming processing paradigm and its use in modern fog architectures
The data streaming processing paradigm and its use in modern fog architecturesThe data streaming processing paradigm and its use in modern fog architectures
The data streaming processing paradigm and its use in modern fog architectures
 
Scientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & SociologyScientific Software: Sustainability, Skills & Sociology
Scientific Software: Sustainability, Skills & Sociology
 
VIIQ: Auto-suggestion Enabled Visual Interface for Interactive Graph Query Fo...
VIIQ: Auto-suggestion Enabled Visual Interface for Interactive Graph Query Fo...VIIQ: Auto-suggestion Enabled Visual Interface for Interactive Graph Query Fo...
VIIQ: Auto-suggestion Enabled Visual Interface for Interactive Graph Query Fo...
 
Kaggle kenneth
Kaggle kennethKaggle kenneth
Kaggle kenneth
 
Lichang Wang_CV
Lichang Wang_CVLichang Wang_CV
Lichang Wang_CV
 
An introduction to R is a document useful
An introduction to R is a document usefulAn introduction to R is a document useful
An introduction to R is a document useful
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking Data
 
Introduction to Big Data Analytics: Batch, Real-Time, and the Best of Both Wo...
Introduction to Big Data Analytics: Batch, Real-Time, and the Best of Both Wo...Introduction to Big Data Analytics: Batch, Real-Time, and the Best of Both Wo...
Introduction to Big Data Analytics: Batch, Real-Time, and the Best of Both Wo...
 
Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016
 
Geometric and Semantic Matching for Cultural Heritage Artefacts
Geometric and Semantic Matching for Cultural Heritage ArtefactsGeometric and Semantic Matching for Cultural Heritage Artefacts
Geometric and Semantic Matching for Cultural Heritage Artefacts
 
Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!
 
Database Basics with PHP -- Connect JS Conference October 17th, 2015
Database Basics with PHP -- Connect JS Conference October 17th, 2015Database Basics with PHP -- Connect JS Conference October 17th, 2015
Database Basics with PHP -- Connect JS Conference October 17th, 2015
 
2015 genome-center
2015 genome-center2015 genome-center
2015 genome-center
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for Graphs
 

More from The Innovative Data Intelligence Research (IDIR) Laboratory, University of Texas at Arlington

More from The Innovative Data Intelligence Research (IDIR) Laboratory, University of Texas at Arlington (20)

Enabling Computational Journalism: Automated Fact-Checking and Story-Finding
Enabling Computational Journalism: Automated Fact-Checking and Story-FindingEnabling Computational Journalism: Automated Fact-Checking and Story-Finding
Enabling Computational Journalism: Automated Fact-Checking and Story-Finding
 
Restoring Trust by Computing: Data-driven Fact-checking and Exceptional Fact ...
Restoring Trust by Computing: Data-driven Fact-checking and Exceptional Fact ...Restoring Trust by Computing: Data-driven Fact-checking and Exceptional Fact ...
Restoring Trust by Computing: Data-driven Fact-checking and Exceptional Fact ...
 
Dynamic Symbolic Database Application Testing
Dynamic Symbolic Database Application TestingDynamic Symbolic Database Application Testing
Dynamic Symbolic Database Application Testing
 
Facetedpedia: Dynamic Generation of Query-Dependent Faceted Interfaces for Wi...
Facetedpedia: Dynamic Generation of Query-Dependent Faceted Interfaces for Wi...Facetedpedia: Dynamic Generation of Query-Dependent Faceted Interfaces for Wi...
Facetedpedia: Dynamic Generation of Query-Dependent Faceted Interfaces for Wi...
 
Entity-Relationship Queries over Wikipedia
Entity-Relationship Queries over WikipediaEntity-Relationship Queries over Wikipedia
Entity-Relationship Queries over Wikipedia
 
Prominent Streak Discovery in Sequence Data
Prominent Streak Discovery in Sequence DataProminent Streak Discovery in Sequence Data
Prominent Streak Discovery in Sequence Data
 
On Skyline Groups
On Skyline GroupsOn Skyline Groups
On Skyline Groups
 
Anything You Can Do, I Can Do Better: Finding Expert Teams by CrewScout
Anything You Can Do, I Can Do Better: Finding Expert Teams by CrewScoutAnything You Can Do, I Can Do Better: Finding Expert Teams by CrewScout
Anything You Can Do, I Can Do Better: Finding Expert Teams by CrewScout
 
Incremental Discovery of Prominent Situational Facts
Incremental Discovery of Prominent Situational FactsIncremental Discovery of Prominent Situational Facts
Incremental Discovery of Prominent Situational Facts
 
Towards a Query-by-Example System for Knowledge Graphs
Towards a Query-by-Example System for Knowledge GraphsTowards a Query-by-Example System for Knowledge Graphs
Towards a Query-by-Example System for Knowledge Graphs
 
Data In, Fact Out: Automated Monitoring of Facts by FactWatcher
Data In, Fact Out: Automated Monitoring of Facts by FactWatcherData In, Fact Out: Automated Monitoring of Facts by FactWatcher
Data In, Fact Out: Automated Monitoring of Facts by FactWatcher
 
Anything You Can Do, I Can Do Better: Finding Expert Teams by CrewScoutCrewsc...
Anything You Can Do, I Can Do Better: Finding Expert Teams by CrewScoutCrewsc...Anything You Can Do, I Can Do Better: Finding Expert Teams by CrewScoutCrewsc...
Anything You Can Do, I Can Do Better: Finding Expert Teams by CrewScoutCrewsc...
 
Crowdsourcing Pareto-Optimal Object Finding by Pairwise Comparisons
Crowdsourcing Pareto-Optimal Object Finding by Pairwise ComparisonsCrowdsourcing Pareto-Optimal Object Finding by Pairwise Comparisons
Crowdsourcing Pareto-Optimal Object Finding by Pairwise Comparisons
 
Comparing Automated Factual Claim Detection Against Judgments of Journalism O...
Comparing Automated Factual Claim Detection Against Judgments of Journalism O...Comparing Automated Factual Claim Detection Against Judgments of Journalism O...
Comparing Automated Factual Claim Detection Against Judgments of Journalism O...
 
ClaimBuster: The First-ever End-to-end Fact-checking System
ClaimBuster: The First-ever End-to-end Fact-checking SystemClaimBuster: The First-ever End-to-end Fact-checking System
ClaimBuster: The First-ever End-to-end Fact-checking System
 
Toward Automated Fact-Checking: Detecting Check-worthy Factual Claims by Clai...
Toward Automated Fact-Checking: Detecting Check-worthy Factual Claims by Clai...Toward Automated Fact-Checking: Detecting Check-worthy Factual Claims by Clai...
Toward Automated Fact-Checking: Detecting Check-worthy Factual Claims by Clai...
 
TableView: A Visual Interface for Generating Preview Tables of Entity Graphs
TableView: A Visual Interface for Generating Preview Tables of Entity GraphsTableView: A Visual Interface for Generating Preview Tables of Entity Graphs
TableView: A Visual Interface for Generating Preview Tables of Entity Graphs
 
Maverick: Discovering Exceptional Facts from Knowledge Graphs
Maverick: Discovering Exceptional Facts from Knowledge GraphsMaverick: Discovering Exceptional Facts from Knowledge Graphs
Maverick: Discovering Exceptional Facts from Knowledge Graphs
 
An Empirical Study on Identifying Sentences with Salient Factual Statements
An Empirical Study on Identifying Sentences with Salient Factual StatementsAn Empirical Study on Identifying Sentences with Salient Factual Statements
An Empirical Study on Identifying Sentences with Salient Factual Statements
 
Continuous Monitoring of Pareto Frontiers on Partially Ordered Attributes for...
Continuous Monitoring of Pareto Frontiers on Partially Ordered Attributes for...Continuous Monitoring of Pareto Frontiers on Partially Ordered Attributes for...
Continuous Monitoring of Pareto Frontiers on Partially Ordered Attributes for...
 

Recently uploaded

Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...only4webmaster01
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Pooja Nehwal
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...amitlee9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 

Recently uploaded (20)

Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 

Tackling Usability Challenges in Querying Massive, Ultra-heterogeneous Graphs

  • 1. Tackling Usability Challenges in Querying Massive, Ultra-heterogeneous Graphs Chengkai Li Associate Professor, Department of Computer Science and Engineering Director, Innovative Database and Information Systems Research (IDIR) Laboratory University of Texas at Arlington North Carolina State University, Feb. 9th, 2016
  • 2. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Ultra-heterogeneous Entity Graphs Large, complex and schema-less graphs capturing millions of entities and billionsof relationshipsbetween entities. Linked Open Data : 52 billion RDF triples Freebase : 1.8 billion triples DBpedia : 470 million triples Yago : 120 million triples entity relationship 2
  • 3. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Linked Open Data http://linkeddata.org/(hundreds of datasets, billions of RDF triples) 3
  • 4. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Structured Queries are Difficult to Write SQL QUERY: SELECT Founder.subj, Founder.obj FROM Founder, Nationality, HeadquarteredIn WHERE Founder.subj = Nationality.subj AND Founder.obj = HeadquarteredIn.subj SPARQL QUERY: SELECT ?company ?founder WHERE { ?founder dbo:founded ?company . ?founder dbo:nationality db:United_States . ?company dbprop:headquartered_in db:Silicon_Valley . }4 - Require knowledge on data model, query language, and schema. - Well-known usability challenges [Jagadish+07]
  • 5. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Simpler Query Paradigms Keyword Search o [Kargar+11], BLINKS [He+07] o Challenging to articulate exact query intent by keywords Approximate Query Answering o NESS [Khan+11]: uses neighborhood-based indexes to quickly find approximate matches to a query graph; o TALE [Tian+08]: approximate large graph matching o Users still have to formulate the initial query graph 5
  • 6. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Visual Query Builders Relational Databases: CLIDE [Petropoulos+06] Graph Databases: VOGUE [Bhowmick+13], PRAGUE [Jin+12], Gblender [Jin+10], GRAPHITE [Chau+08] Single Large Graphs: QUBLE [Hung+13] o Require a good knowledge of the underlying schema o No automatic suggestionsregardingwhat to include in the query graph 6
  • 7. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. An Example: QUBLE 7
  • 8. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Orion: Auto-Suggestion Based Visual Interface for Interactive Query Construction 8
  • 9. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Orion 9 Introduction Video http://bit.ly/1O0GnNo Prototype http://idir.uta.edu/orion
  • 10. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Objectives of Orion o Interactive GUI for building query components o Iteratively suggest edges based on their relevance to the user’s query intent, according to the partial query graph so far 10
  • 11. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Orion GUI Dynamic list of all possible user actions at any given moment Control panel for various settings and tips 11
  • 12. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Active Mode Grey edges and nodes automatically suggested in active mode: o Accepted by user (blue): positive edges o Ignored by user: negative edges 12
  • 13. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Passive Mode A new node added in passive mode A new edge added in passive mode 13
  • 14. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Rank Candidate Edges for Suggestions Possible Solutions o Order alphabetically o Adapt standard machine learning algorithms o Naïve Bayes classifier o Random forests o Class association rules o Recommendation systems (based on SVD) Query-specific Random Decision Paths (RDP) 14
  • 15. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Concepts Edges in partial query graph (positive edges) starring, director, music Edges rejected by users (negative edges) education, nationality Candidate edges producer, writer, editor Query Session: <starring, director, music, education, nationality> 15
  • 16. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Concepts Query Log (W) Problem Given a query log, a query session, and a set of candidate edges, rank the candidate edges by their relevance to the user’s query intent 16 Positive Edge Negative Edge
  • 17. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Random Decision Path (RDP) o Choose edges from the query session randomly, to form RDPs o Each decision path selects a subset of the query log, with no more than ‘τ’ rows o Grow a path incrementally until its support in the query log drops below ‘τ’ <starring, director, music, education, nationality> 17
  • 18. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Random Decision Path: Scoring o For each RDP, use its corresponding query log subset to compute the support of each candidate edge. o Final score of each candidate is its average score across all RDPs. o If R is the set of all RDPs: |},|{| |}}{,|{| ),,sup( ),,sup(* || 1 )( wQWww weQWww WQe WQe R escore i i i RQi i ⊆∈ ⊆∪∈ = = ∑∈ 18
  • 19. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Query Log Nonexistent (almost) Simulate and bootstrap o Find positive edges o Wikipedia and data graph o Data graph only o SPARQL query log [Morsey+11] o Inject negative edges 19
  • 20. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Query Log Simulation: Wikipedia + Data Graph Use Sentences in Wikipedia Articles to Identify Positive Edges <degree, almaMater, frat_member> 20 NodesMapped: Jerry Yang, Electrical Engineering, StanfordUniversity,PhiKappaPsi
  • 21. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Query Log Simulation: Data Graph Only Represent Each Node as an Itemset of Positive Edges Generate Frequent Itemsets of Varying Sizes o Each frequent itemset of edges forms positive edges <degree, almaMater, nationality, frat_member, founded, places_lived> <founded, place_founded, headquartered_in> 21
  • 22. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Query Log Simulation: Injecting Negative Edges Positive Edges List (1) writer, starring, producer (2) starring, editor, education (3) editor, nationality, music Inject Negative Edges writer, starring, producer, editor, education (starring appears in 2) starring, editor, education, writer, producer, nationality, music (starring appears in 1, and editor appears in 3) editor, nationality, music, starring, education (editor appears in 2) 22
  • 23. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Experiments System Configurations o Double quad-core 2.0 GHz Xeon server, 24 GB RAM o TACC: 5 Dell PowerEdge R910 server nodes, with 4 Intel Xeon E7540 2.0 GHz 6-core processors, 1 TB RAM Datasets o Freebase (33 M edges, 30 M nodes, 5253 edge types) o DBpedia (12 M edges, 4 M nodes, 647 edge types) 23
  • 24. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Experiments (cont.) User Studies o Orion (RDP) and Naïve Edge Ranking Algorithms Compared o Random Decision Paths (RDP) o Naïve Bayes classifier (NB) o Random forest classifier (RF) o Class association rules (CAR) o SVD based recommendation system (SVD) 24
  • 25. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Query Logs Compared o Freebase: Wiki, Data o DBpedia: Wiki, Data, QLog 25
  • 26. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. User Studies: Setup 15 Users for Orion, 15 Users for Naïve (A/B testing) 45 Easy, 30 Medium, and 30 Hard Query Tasks Designed 3 Easy, 2 Medium, 2 Hard Queries Assigned per Query Task 105 Query Tasks per System in Total 4 Survey Questions per Query Task 26
  • 27. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. User Studies: Conversion Rate Conversion Rate: o Percentage of query tasks completed successfully o Successful completion measured using edge isomorphism, and not a binary notion of matching Orion has a higher conversion rate than Naïve for complex queries! 27
  • 28. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. User Studies: Efficiency by Time Time required to construct query graphs in Orion is comparable to Naïve in most cases, despite the steeper learning curve of Orion due to more features 0 200 400 600 800 1000 1200 1400 Naïve Orion Timeperquery(inseconds) Hard Queries 0 200 400 600 800 1000 1200 1400 Naïve Orion Timeperquery(inseconds) All Queries 0 200 400 600 800 1000 1200 1400 1600 Naïve Orion Timeperquery(inseconds) Easy Queries 0 200 400 600 800 1000 Naïve Orion Timeperquery(inseconds) Medium Queries 28
  • 29. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. User Studies: Efficiency by Number of Iterations Time required to construct query graphs in Orion is comparable to Naïve in most cases, despite the steeper learning curve of Orion due to more features 0 10 20 30 40 50 60 70 80 90 All Easy Medium Hard Numberofiterations Query Task Difficulty Level 29
  • 30. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. User Studies: User Experience Results As the difficulty level of the query graph being constructed increases, the usability of Orion seems significantly better than Naïve’s 3 3.2 3.4 3.6 3.8 4 4.2 Q1 Q2 Q3 Q4 AverageLikertscalescore All Queries Naive Orion 3 3.2 3.4 3.6 3.8 4 4.2 4.4 Q1 Q2 Q3 Q4 AverageLikertscalescore Easy Queries Naive Orion 3 3.2 3.4 3.6 3.8 4 4.2 Q1 Q2 Q3 Q4 AverageLikertscalescore Medium Queries Naive Orion 3 3.2 3.4 3.6 3.8 4 4.2 Q1 Q2 Q3 Q4 AverageLikertscalescore Hard Queries Naive Orion 30
  • 31. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Edge Ranking Algorithms o Simulates only Passive Mode o 43 target query graphs for Freebase o 6 two-edged, 10 three-edged, 9 four-edged, 17 five-edged, 1 six- edged (includes medium and hard queries from the user study) o 167 input instances o 33 target query graphs for DBpedia o 2 three-edged, 29 four-edged, 2 five-edged o 130 input instances31
  • 32. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Edge Ranking Algorithms: Efficiency by Number of Suggestions 0 25 50 75 100 125 150 175 RDP RF NB SVD CAR Average#ofsuggestionsperquery Freebase 0 25 50 75 100 125 150 175 RDP RF NB SVD CAR Average#ofsuggestionsperquery DBpedia RDP requires fewer suggestions compared to all other methods RDP requires only 40 suggestions, 1.5- 4 times fewer than other methods 32
  • 33. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Edge Ranking Algorithms: Efficiency by Time 0 25 50 75 100 125 150 175 RDP RF NB SVD CAR Averagetimeperquery(inseconds) Freebase 0 100 200 300 400 RDP RF NB SVD CAR AveragetimeperQuery(inseconds) DBpedia RDP better than RF and comparable to NB, despite RF and NB being light models RDP significantly better than SVD and CAR, but worse than RF and NB 33
  • 34. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Query Logs Comparison Positive edges better captured based on the context of human usage of relationships in Wikipedia DBpedia is created using info-boxes in Wikipedia, and is thus very clean. Wiki-DB is highly similar to Data-DB for DBpedia 0 25 50 75 100 125 150 175 200 Wiki-FB Data-FB Numberofsuggestionsperquery Freebase 0 25 50 75 100 125 150 175 200 Wiki-DB Data-DB QLog-DB Average#ofsuggestionsperquery DBpedia 34
  • 35. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Tuning RDP Parameters RDP performs better with more random decision paths and higher query log threshold Considering negative edges in query session is important, as it results in better performance of RDP 40 60 80 100 120 (1,1) (2,2) (4,4) (8,8) (10,10) (25,25) Average#ofsuggestionsperquery Freebase: Parameters (N, τ) RDP RDP-noneg 120 130 140 150 160 170 180 (1,1) (2,2) (4,4) (8,8) (10,10) (25,25) Average#ofsuggestionsperquery DBpedia: Parameters (N, τ) RDP RDP-noneg 35
  • 36. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Orion Contributions Unique visual query builder with suggestions Edge ranking algorithm : random decision paths Query log simulation Extensive user studies and experiments Prototype available 36
  • 37. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. GQBE: Graph Query by Example 37 Introduction Video http://bit.ly/1PLqLTD Prototype http://idir.uta.edu/gqbe
  • 38. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. GQBE GUI Ranked similar answer tuples Keyword completion powered query interface Query graph automatically discovered by the system An example answer graph 38
  • 39. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. TableView: Generating Preview Tables for Entity Graphs 39
  • 40. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. TableView 40
  • 41. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. TableView 41
  • 42. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Publications Overview o IntuitiveandInteractiveQueryFormulationtoImprovetheUsabilityof QuerySystemsfor HeterogeneousGraphs.NandishJayaram.VLDB2015 PhDWorkshop. Orion o Auto-SuggestionBasedVisualInterfaceforInteractiveQueryConstructiononUltra-Heterogeneous Graphs.NandishJayaram,RohitBhooplam,ChengkaiLi,VassilisAthitsos.Underpreparation. o VIIQ:Auto-SuggestionEnabledVisualInterfaceforInteractiveGraphQueryFormulation.Nandish Jayaram,SidharthGoyal,ChengkaiLi.PVLDB,8(12):1940-1943, August2015. Demonstration description. GQBE o Queryingknowledgegraphsbyexampleentitytuples(ExtendedAbstract).NandishJayaram,Arijit Khan,ChengkaiLi,XifengYan,RamezElmasri.ICDE2016 (TKDEPoster) 42
  • 43. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Publications GQBE (cont’d) o Queryingknowledgegraphsbyexampleentitytuples.NandishJayaram,ArijitKhan,ChengkaiLi, XifengYan,RamezElmasri.TKDE,27(10):2797-2811, October2015. o GQBE:Queryingknowledgegraphsbyexampleentitytuples.NandishJayaram,MaheshGupta,Arijit Khan,ChengkaiLi,XifengYan,RamezElmasri.ICDE,pages1250-1253, 2014. Demonstration description. o Towardsaquery-by-examplesystemforknowledgegraphs.NandishJayaram,ArijitKhan,Chengkai Li,XifengYan,RamezElmasri.GRADES2014. TableView o GeneratingPreviewTablesforEntityGraphs.NingYan,AbolfazlAsudeh,andChengkaiLi.Technical Report,arXiv:1403.5006, March2014. 43
  • 44. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Acknowledgment Funding sponsors Disclaimer: This material is based upon work partially supported by the National Science Foundation Grants 1018865, 1117369, 1408928 and 1565699, 2011 and 2012 HP Labs Innovation Research Awards, the National Natural Science Foundation of China Grant 61370019, and Army Research Laboratory under cooperative agreement W911NF-09-2-0053 (NSCTA). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies. 44
  • 45. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Acknowledgment UTA Students o NandishJayaram o Sona Hasani o Rohit Bhoopalam o SidharthGoyal o MaheshGupta Collaborators o VassilisAthitsos(UTA) o ArijitKhan(NTU) o RamezElmasri(UTA) o NingYan(FutureWei,formerstudent) o XifengYan(UCSB) 45
  • 46. ©2015-2016 The University of Texas at Arlington. All Rights Reserved. Thank You! Questions? o http://ranger.uta.edu/~cli cli@uta.edu o Prototypes http://idir.uta.edu/orion http://idir.uta.edu/gqbe 46