Presented at: JIST2015, Yichang, China
Prototype: http://rc.lodac.nii.ac.jp/rdf4u/
Video: https://www.youtube.com/watch?v=z3roA9-Cp8g
Abstract: Semantic Web and Linked Open Data (LOD) are powerful technologies for knowledge management, and explicit knowledge is expected to be represented in RDF (Resource Description Framework); however, ordinary users stay away from RDF because of the technical skills it requires. Since a concept map or a node-link diagram can enhance learning from beginner to advanced level, RDF graph visualization can be a suitable tool for familiarizing users with Semantic Web technology. However, an RDF graph generated from a whole query result is not suitable for reading, because it is highly connected, like a hairball, and poorly organized. To make a graph that presents knowledge more readable, this research introduces an approach to sparsify a graph using a combination of three main functions: graph simplification, triple ranking, and property selection. These functions are largely based on interpreting RDF data as units of knowledge, together with statistical analysis, in order to deliver an easily readable graph to users. A prototype is implemented to demonstrate the suitability and feasibility of the approach. It shows that the simple and flexible graph visualization is easy to read and makes a strong impression on users. In addition, the tool helps inspire users to appreciate the advantageous role of linked data in knowledge management.
Semantic Variation Graphs: The Case for RDF & SPARQL (Jerven Bolleman)
Presentation given to the GA4GH data working group. It starts with an introduction to what RDF is, followed by how one can model genomic variation graphs in RDF. Then we show how one can use SPARQL to query this data.
Knowledge Discovery tools using Linked Data techniques - presentation for the Linked Data 4 Knowledge Discovery Workshop at the ECML/PKDD 2015 conference - http://events.kmi.open.ac.uk/ld4kd2015/
Explicit Semantics in Graph DBs: Driving Digital Transformation With Neo4j (Connected Data World)
Dr. Jesús Barrasa's slides from his talk at Connected Data London. Jesús, a senior field engineer at Neo4j, presented how semantic web principles can be used in a graph database.
A Generic Language for Integrated RDF Mappings of Heterogeneous Data (andimou)
Despite the significant number of existing tools, incorporating data from multiple sources and different formats into the Linked Open Data cloud remains complicated. No mapping formalization exists to define how to map such heterogeneous sources into RDF in an integrated and interoperable fashion.
This paper introduces the RML mapping language, a generic language based on an extension over R2RML, the W3C standard for mapping relational databases into RDF. Broadening R2RML's scope, the language becomes source-agnostic and extensible, while facilitating the definition of mappings of multiple heterogeneous sources. This leads to higher integrity within datasets and richer interlinking among resources.
Property Graph vs. RDF Triplestore Comparison in 2020 (Ontotext)
This presentation goes all the way from an introduction to what graph databases are, to a table comparing RDF and property graphs, plus two diagrams presenting the market circa 2020.
This invited keynote at the Social Computing Track at WI-IAT21 gives an introduction to Knowledge Graphs and how they are built collaboratively by us. It also presents a brief analysis of the links in Wikidata.
Knowledge graph embeddings are a mechanism that projects each entity in a knowledge graph to a point in a continuous vector space. It is commonly assumed that those approaches project two entities closely to each other if they are similar and/or related. In this talk, I give a closer look at the roles of similarity and relatedness with respect to knowledge graph embeddings, and discuss how the well-known embedding mechanism RDF2vec can be tailored towards focusing on similarity, relatedness, or both.
Over the last years, the Semantic Web has been growing steadily. Today, we count more than 10,000 datasets made available online following Semantic Web standards. Nevertheless, many applications, such as data integration, search, and interlinking, may not take full advantage of the data without having a priori statistical information about its internal structure and coverage. In fact, there are already a number of tools that offer such statistics, providing basic information about RDF datasets and vocabularies. However, those usually show severe deficiencies in terms of performance once the dataset size grows beyond the capabilities of a single machine. In this paper, we introduce a software component for statistical calculations of large RDF datasets, which scales out to clusters of machines. More specifically, we describe the first distributed in-memory approach for computing 32 different statistical criteria for RDF datasets using Apache Spark. The preliminary results show that our distributed approach improves upon a previous centralized approach we compare against and provides approximately linear horizontal scale-up. The set of criteria is extensible beyond the 32 defaults, and the component is integrated into the larger SANSA framework and employed in at least four major usage scenarios beyond the SANSA community.
Hacktoberfest 2020 'Intro to Knowledge Graph' with Chris Woodward of ArangoDB and reKnowledge. Accompanying video is available here: https://youtu.be/ZZt6xBmltz4
Linked Data Experiences at Springer Nature (Michele Pasin)
An overview of how we're using semantic technologies at Springer Nature, and an introduction to our latest product: www.scigraph.com
(Keynote given at http://2016.semantics.cc/, Leipzig, Sept 2016)
Efficient Practices for Large Scale Text Mining Process (Ontotext)
Text mining is a necessity when managing large-scale textual collections. It facilitates access to otherwise hard-to-organise unstructured and heterogeneous documents, allows for the extraction of hidden knowledge, and opens new dimensions in data exploration.
In this webinar, Ivelina Nikolova, PhD, shares best practices and text analysis examples from successful text mining processes in domains like news, financial and scientific publishing, the pharma industry, and cultural heritage.
The RDF Report Card: Beyond the Triple Count (Leigh Dodds)
My talk from the Semtech Biz conference in London.
I argued that it is time to move beyond discussing the size of datasets and to encourage a more nuanced view of their quality and utility.
The RDF Report Card is offered as one simple, high-level visualization.
Adventures in Linked Data Land (presentation by Richard Light; posted by jottevanger)
"Adventures in Linked Data Land: bringing RDF to the Wordsworth Trust" is a paper given by Richard Light (http://uk.linkedin.com/pub/richard-light/a/221/ba5) to a Linked Data meeting run by the Collections Trust in February 2010. He runs through the basics of LD, how it relates to cultural heritage, and some of his experiments with it, specifically with the data of the Wordsworth Trust, finally listing a series of challenges that museums face in trying to get on board the Linked Data bus.
The Power of Semantic Technologies to Explore Linked Open Data (Ontotext)
The presentation of Atanas Kiryakov, Ontotext’s CEO, at the first edition of Graphorum (http://graphorum2017.dataversity.net/) – a new forum that taps into the growing interest in graph databases and technologies. Graphorum is co-located with the Smart Data Conference, organized by the digital publishing platform Dataversity.
The presentation demonstrates the capabilities of Ontotext’s own approach to contributing to the discipline of more intelligent information gathering and analysis by:
- graphically exploring the connectivity patterns in big datasets;
- building new links between identical entities residing in different data silos;
- getting insights into what types of queries can be run against various linked data sets;
- reliably filtering information based on relationships, e.g., between people and organizations, in the news;
- demonstrating the conversion of tabular data into RDF.
Learn more at http://ontotext.com/.
https://www.eventbrite.com/e/talk-by-paco-nathan-graph-analytics-in-spark-tickets-17173189472
Big Brains meetup hosted by BloomReach, 2015-06-04
Case study / demo of a large-scale graph analytics project, leveraging GraphX in Apache Spark to surface insights about open source developer communities — based on data mining of their email forums. The project works with any Apache email archive, applying NLP and machine learning techniques to analyze message threads, then constructs a large graph. Graph analytics, based on concise Scala coding examples in Spark, surface themes and interactions within the community. Results are used as feedback for respective developer communities, such as leaderboards, etc. As an example, we will examine analysis of the Spark developer community itself.
Microservices, containers, and machine learningPaco Nathan
http://www.oscon.com/open-source-2015/public/schedule/detail/41579
In this presentation, an open source developer community considers itself algorithmically. This shows how to surface data insights from the developer email forums for just about any Apache open source project. It leverages advanced techniques for natural language processing, machine learning, graph algorithms, time series analysis, etc. As an example, we use data from the Apache Spark email list archives to help understand its community better; however, the code can be applied to many other communities.
Exsto is an open source project that demonstrates Apache Spark workflow examples for SQL-based ETL (Spark SQL), machine learning (MLlib), and graph algorithms (GraphX). It surfaces insights about developer communities from their email forums. Natural language processing services in Python (based on NLTK, TextBlob, WordNet, etc.) get containerized and used to crawl and parse email archives. These produce JSON data sets; then we run machine learning on a Spark cluster to find insights such as:
* What are the trending topic summaries?
* Who are the leaders in the community for various topics?
* Who discusses most frequently with whom?
This talk shows how to use cloud-based notebooks for organizing and running the analytics and visualizations. It reviews the background for how and why the graph analytics and machine learning algorithms generalize patterns within the data — based on open source implementations for two advanced approaches, Word2Vec and TextRank. The talk also illustrates best practices for leveraging functional programming for big data.
Re-using Media on the Web: Media fragment re-mixing and playoutMediaMixerCommunity
A number of novel application ideas will be introduced based on the media fragment creation, specification and rights management technologies. Semantic search and retrieval allows us to organize sets of fragments by topical or conceptual relevance. These fragment sets can then be played out in a non-linear fashion to create a new media re-mix. We look at a server-client implementation supporting Media Fragments, before allowing the participants to take the sets of media they have selected and create their own re-mix.
The world has changed, and having one huge server won’t do the job anymore. When you’re talking about vast amounts of data that keep growing all the time, the ability to scale out is your savior. Apache Spark is a fast and general engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
This lecture will be about the basics of Apache Spark and distributed computing and the development tools needed to have a functional environment.
Vocabulary for Linked Data Visualization Model - Dateso 2015 (Jiří Helmich)
There is already a vast amount of Linked Data on the web. What is missing is a convenient way of analyzing and visualizing the data that would benefit from the Linked Data principles. In our previous work we introduced the Linked Data Visualization Model (LDVM). It is a formal base that exploits the principles to ensure interoperability and compatibility of compliant components. In this paper we introduce a vocabulary for description of the components and an analytic and visualization pipeline composed of them. We demonstrate its viability on an example from the Czech Linked Open Data cloud.
Generating Executable Mappings from RDF Data Cube Data Structure Definitions (Christophe Debruyne)
Data processing is increasingly the subject of various internal and external regulations, such as GDPR which has recently come into effect. Instead of assuming that such processes avail of data sources (such as files and relational databases), we approach the problem in a more abstract manner and view these processes as taking datasets as input. These datasets are then created by pulling data from various data sources. Taking a W3C Recommendation for prescribing the structure of and for describing datasets, we investigate an extension of that vocabulary for the generation of executable R2RML mappings. This results in a top-down approach where one prescribes the dataset to be used by a data process and where to find the data, and where that prescription is subsequently used to retrieve the data for the creation of the dataset “just in time”. We argue that this approach to the generation of an R2RML mapping from a dataset description is the first step towards policy-aware mappings, where the generation takes into account regulations to generate mappings that are compliant. In this paper, we describe how one can obtain an R2RML mapping from a data structure definition in a declarative manner using SPARQL CONSTRUCT queries, and demonstrate it using a running example. Some of the more technical aspects are also described.
Reference: Christophe Debruyne, Dave Lewis, Declan O'Sullivan: Generating Executable Mappings from RDF Data Cube Data Structure Definitions. OTM Conferences (2) 2018: 333-350
Towards efficient processing of RDF data streams (Alejandro Llaves)
Presentation of short paper submitted to OrdRing workshop, held at ISWC 2014 - http://streamreasoning.org/events/ordring2014.
In the last years, there has been an increase in the amount of real-time data generated. Sensors attached to things are transforming how we interact with our environment. Extracting meaningful information from these streams of data is essential for some application areas and requires processing systems that scale to varying conditions in data sources, complex queries, and system failures. This paper describes ongoing research on the development of a scalable RDF streaming engine.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
The Building Blocks of QuestDB, a Time Series Database (javier ramirez)
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps just as another data type. However, when performing real-time analytics, timestamps should be first class citizens and we need rich time semantics to get the most out of our data. We also need to deal with ever growing datasets while keeping performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review some of the changes we have made over the past two years to deal with late and unordered data, non-blocking writes, read replicas, and faster batch ingestion.
Adjusting OpenMP PageRank: SHORT REPORT / NOTES (Subhajit Sahu)
For massive graphs that fit in RAM but not in GPU memory, it is possible to take advantage of a shared-memory system with multiple CPUs, each with multiple cores, to accelerate PageRank computation. If the NUMA architecture of the system is properly taken into account with good vertex partitioning, the speedup can be significant. To take steps in this direction, experiments are conducted to implement PageRank in OpenMP using two different approaches, uniform and hybrid. The uniform approach runs all primitives required for PageRank in OpenMP mode (with multiple threads). On the other hand, the hybrid approach runs certain primitives (i.e., sumAt, multiply) in sequential mode.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs. This meetup was formerly the Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
1. RDF GRAPH VISUALIZATION
BY INTERPRETING LINKED DATA AS KNOWLEDGE
Rathachai CHAWUTHAI & Prof. Hideaki TAKEDA
National Institute of Informatics and SOKENDAI
RDF4U
JIST2015 Yichang, China 11-13 Nov 2015
4. THE ROLE OF SEMANTIC WEB IN KNOWLEDGE MANAGEMENT
[Architecture diagram: Data tier → Service tier (SPARQL, JENA, etc.) → Application/Presentation/Visualisation tier]
At Visualisation Tier,
• RDF data are transformed into
Chart, Geographic Map, etc.
and then serve users.
It’s cool, but
• Users are far from RDF data, so
they do not understand the power
of Semantic Web and do not realise
how to contribute RDF data.
For this reason,
• It could be good if users can read
RDF data directly using node-link
diagram or concept-map diagram.
5. READING FROM A QUERY GRAPH
Querying the 2-hop neighbourhood (or more hops) of a given URI
gives wider information on the topic.
[Example graph: Caffe Mocha - contains a shot of Espresso, has a layer of Chocolate, contains Sugar, topped by Milk. Espresso - type: Coffee; color: black; taste: bitter; contains caffeine (430 mg/L). Chocolate - contains cocoa. Sugar - taste: sweet; made from sugarcane. Milk - color: white; produced by a cow.]
6. PROBLEMS
1) A Query Graph is TOO Complicated to Read.
[Example hairball graphs: http://lod.ac/species/Bubo and http://dbpedia.org/resource/Tokyo]
7. PROBLEMS
2) Lack of Reading Flow in RDF Data
All triples are equal, so Background Content and Main Point are not structured in any RDF graph.
8. GOAL
we prefer…
✦ A Simply Readable Graph
✦ A Well-Reading-Flow Graph
[Diagram: a topic node with its Common Information and its Topic-Specific Information]
12. GRAPH SIMPLIFICATION
• Some well-prepared RDF repositories did reasoning on
ontologies in order to support a SPARQL service.
• One impact is that the inferred triples create giant
components in a graph.
• A closer look at the data indicates that the following
situations are commonly found in any complex RDF graph.
• equivalent or same-as instances (owl:sameAs),
• transitive properties (e.g. skos:broaderTransitive), and
• hierarchical classification (rdf:type & rdfs:subClassOf)
• Thus, this method aims to remove some redundant triples
by using the mechanism of Semantic Web rules.
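A minimal sketch of this simplification idea (illustrative only, not the RDF4U implementation; triples are assumed to be plain (subject, predicate, object) string tuples, and subclass_of is an assumed map from a class to its superclasses):

```python
# Illustrative sketch: remove triples that are re-derivable by Semantic Web
# rules. Two of the situations above are covered: transitive-property
# shortcuts and rdf:type assertions to superclasses (owl:sameAs merging
# is omitted for brevity).

TRANSITIVE = {"skos:broaderTransitive"}  # assumed set of transitive properties

def simplify(triples, subclass_of):
    """Drop redundant inferred triples from a set of (s, p, o) tuples."""
    triples = set(triples)
    redundant = set()
    # 1) Transitive shortcuts: if a->b and b->c are present, a->c is inferable.
    for p in TRANSITIVE:
        edges = {(s, o) for s, q, o in triples if q == p}
        nodes = {x for e in edges for x in e}
        for s, o in edges:
            if any((s, m) in edges and (m, o) in edges for m in nodes - {s, o}):
                redundant.add((s, p, o))
    # 2) Hierarchical classification: drop rdf:type to a superclass when a
    #    more specific type of the same subject is already asserted.
    types = {}
    for s, q, o in triples:
        if q == "rdf:type":
            types.setdefault(s, set()).add(o)
    for s, classes in types.items():
        for c in classes:
            if any(c in subclass_of.get(other, set()) for other in classes - {c}):
                redundant.add((s, "rdf:type", c))
    return triples - redundant

data = {
    ("ex:Mocha", "skos:broaderTransitive", "ex:Espresso"),
    ("ex:Espresso", "skos:broaderTransitive", "ex:Coffee"),
    ("ex:Mocha", "skos:broaderTransitive", "ex:Coffee"),   # inferred shortcut
    ("ex:Espresso", "rdf:type", "ex:Coffee"),
    ("ex:Espresso", "rdf:type", "owl:Thing"),              # inferred supertype
}
print(simplify(data, {"ex:Coffee": {"owl:Thing"}}))        # 3 triples remain
```

The shortcut and supertype triples are exactly the ones a reasoner would re-derive, so removing them sparsifies the drawing without losing knowledge.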
15. TRIPLE RANKING
Since users have different background knowledge of a specific topic, beginners may be interested in reading common information before getting to topic-specific information, while experts may prefer to read only topic-specific information.
• Concept Level (resources / properties)
• General Concepts are terms that are commonly known, such as “name”, “address”, and “class”; they are found throughout the corpus.
• Key Concepts are important terms that are frequently found in the query result but not often in the whole dataset.
• Information Level (triples)
• Common Information explains background knowledge that helps readers understand the main content (many general concepts).
• Topic-Specific Information contains specific terms that are highly relevant to the article (many key concepts).
16. TRIPLE RANKING
Step 1: Get an RDF graph.
Step 2: Identify which terms are General Concepts and which are Key Concepts.
17. TRIPLE RANKING
Step 3: Common Information - most of the nodes and links are general concepts.
Step 4: Topic-Specific Information - most of the nodes and links are key concepts.
18. TRIPLE RANKING

Weight of a URI (Concept Level; high: key concept, low: general concept):

    w(uri) = fQ(uri) / log( fD(uri) + 1 )

where fQ(uri) is the number of occurrences of the URI in the query result, and log( fD(uri) + 1 ) is a logarithmic scale of the number of occurrences of the URI in the whole dataset.

Visualization-Weight of a Triple (Information Level; high: topic-specific, low: common info):

    vw(〈s,p,o〉) = ( α⋅w(s) + β⋅w(p) + γ⋅w(o) ) / ( α + β + γ )

The coefficients are 1.0 by default (making the denominator 3), but they can be adjusted for specific purposes.
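The ranking translates directly into code; below is an illustrative Python transcription (not the authors' implementation), where fq and fd are assumed dictionaries of URI occurrence counts in the query result and in the whole dataset:

```python
import math

def w(uri, fq, fd):
    """Concept-level weight of a URI: high = key concept, low = general."""
    # URIs unseen in the dataset are assumed to have count 1.
    return fq.get(uri, 0) / math.log(fd.get(uri, 1) + 1)

def vw(triple, fq, fd, alpha=1.0, beta=1.0, gamma=1.0):
    """Visualization-weight of a triple <s, p, o>: high = topic-specific."""
    s, p, o = triple
    num = alpha * w(s, fq, fd) + beta * w(p, fq, fd) + gamma * w(o, fq, fd)
    return num / (alpha + beta + gamma)

# Hypothetical counts: rdf:type and owl:Thing are ubiquitous in the dataset
# (general concepts), dp:Diatomic_nonmetals is rare (a key concept).
fq = {"dp:Hydrogen": 40, "rdf:type": 30, "owl:Thing": 5,
      "dp:Diatomic_nonmetals": 4}
fd = {"dp:Hydrogen": 500, "rdf:type": 10000000, "owl:Thing": 8000000,
      "dp:Diatomic_nonmetals": 20}
common = vw(("dp:Hydrogen", "rdf:type", "owl:Thing"), fq, fd)
specific = vw(("dp:Hydrogen", "rdf:type", "dp:Diatomic_nonmetals"), fq, fd)
```

With these hypothetical counts the triple pointing at the rare, query-relevant class gets the higher vw, matching the common-to-specific ordering described above.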
20. TRIPLE RANKING
Subject Predicate Object vw
dp:Hydrogen rdf:type owl:Thing 5.62
dp:Hydrogen rdf:type skos:Concept 6.01
dp:Hydrogen dct:subject dp:Chemical_elements 7.31
dp:Hydrogen dct:subject dp:Airship_technology 7.35
dp:Hydrogen rdf:type dp:Diatomic_nonmetals 7.48
For example: http://dbpedia.org/resource/Hydrogen. At the Information Level, triples with a low vw carry Common information, and those with a high vw carry Topic-Specific information.
21. TRIPLE RANKING
In the case of sub-properties (likewise sub-classes): ltk:higherTaxon and ltk:mergedInto are each rdfs:subPropertyOf skos:broader, i.e. more specific than skos:broader. Raw data: (a, ltk:higherTaxon, x) and (a, ltk:mergedInto, y). Inferred data: additionally (a, skos:broader, x) and (a, skos:broader, y).
23. PROTOTYPE
http://rc.lodac.nii.ac.jp/rdf4u/
Features
• To simplify a graph by removing some inferred triples.
• To give ranking scores to triples based on common and topic-specific information.
• To filter a graph by selecting preferred properties.
• To control an interactive graph diagram.
Thanks to - Client: D3js, Bootstrap, jQuery; Server: SimpleRDF, SPARQL for PHP
bit.ly/rdf4u
24. DISCUSSION
Usefulness
• A diagram is sparser and easier for humans to read.
• Beginners can read common information first.
• Experts can read topic-specific information.
Uniqueness
• Some graph visualisation works (Motif, Gephi, RDF Gravity, Fenfire, and IsaViz) do not use the power of the Semantic Web to sparsify a graph, and do not provide different data for different user levels.
Novelty
• TF-IDF is adapted for ordering triples from the common to the topic-specific level of information.
• The degree of commonness versus specificity is calculated by evaluating the nature of the dataset with the algorithm.
Prospect
• The triple ranking can be extended by applying various algorithms in order to satisfy the diverse characteristics of data in other domains, such as Biodiversity Informatics.
• Mashup tools should consider this idea.
25. FUTURE PLAN
• To do a critical evaluation
• Survey
• Number of cut edges
• To find the precise border between common information and topic-specific information
• To find a better way to count the number of URIs (currently it always times out)
• To remove noisy triples
• To improve the triple ranking algorithm for other domains