
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge


Presented at: JIST2015, Yichang, China
Prototype: http://rc.lodac.nii.ac.jp/rdf4u/
Video: https://www.youtube.com/watch?v=z3roA9-Cp8g
Abstract: The Semantic Web and Linked Open Data (LOD) are powerful technologies for knowledge management, and explicit knowledge is expected to be represented in RDF (Resource Description Framework), but ordinary users are kept away from RDF by the technical skills it requires. Because a concept map or node-link diagram can enhance learning for users from beginner to advanced level, RDF graph visualization can be a suitable tool for familiarizing users with Semantic Web technology. However, an RDF graph generated from a whole query result is not suitable for reading, because it is highly connected, like a hairball, and poorly organized. To make a graph that presents knowledge easier to read, this research introduces an approach that sparsifies a graph using a combination of three main functions: graph simplification, triple ranking, and property selection. These functions are mostly based on interpreting RDF data as knowledge units, together with statistical analysis, in order to deliver an easily readable graph to users. A prototype is implemented to demonstrate the suitability and feasibility of the approach. It shows that the simple and flexible graph visualization is easy to read and makes a good impression on users. In addition, the attractive tool helps users realize the advantageous role of linked data in knowledge management.

Published in: Technology

  1. RDF GRAPH VISUALIZATION BY INTERPRETING LINKED DATA AS KNOWLEDGE. Rathachai CHAWUTHAI & Prof. Hideaki TAKEDA, National Institute of Informatics and SOKENDAI. RDF4U. JIST2015, Yichang, China, 11-13 Nov 2015
  2. AGENDA • Motivation • Methods • Graph Simplification • Triple Ranking • Property Selection • Outcome • Future Plan
  3. MOTIVATION
  4. THE ROLE OF SEMANTIC WEB IN KNOWLEDGE MANAGEMENT [Diagram labels: Data tier, Service tier, Visualisation tier; SPARQL, JENA, etc.; Application/Presentation/] At the Visualisation tier, • RDF data are transformed into charts, geographic maps, etc., and then served to users. It’s cool, but • users are far from the RDF data, so they do not understand the power of the Semantic Web and do not realise how to contribute RDF data. For this reason, • it would be good if users could read RDF data directly using a node-link diagram or concept-map diagram.
  5. READING FROM A QUERY GRAPH Querying the 2-hop neighbourhood (or more hops) of a given URI gives wider information on the topic. [Example graph. Node labels: Caffe Mocha, Espresso, Chocolate, Sugar, Milk, Coffee, sweet, sugarcane, cow, white, cocoa, caffeine, 430 mg/L, black, bitter. Link labels: type, taste, made from, produces, color, contains, a shot of, topped by, has layer of, contain]
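The 2-hop neighbourhood mentioned on this slide can be illustrated with a small in-memory sketch. It is not the prototype's query code: the function name `neighbourhood`, the tuple representation of triples, and the choice to follow links in both directions are all assumptions made for illustration.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def neighbourhood(triples: Iterable[Triple], start: str, hops: int = 2) -> Set[Triple]:
    """Return all triples that lie within `hops` links of `start`.

    Assumption: links are followed in both directions
    (subject -> object and object -> subject).
    """
    # Index every triple by the nodes it touches.
    touching: Dict[str, List[Triple]] = {}
    for t in triples:
        s, _, o = t
        touching.setdefault(s, []).append(t)
        touching.setdefault(o, []).append(t)

    distance = {start: 0}
    queue = deque([start])
    selected: Set[Triple] = set()
    while queue:
        node = queue.popleft()
        if distance[node] == hops:
            continue  # do not expand beyond the requested radius
        for t in touching.get(node, []):
            selected.add(t)
            s, _, o = t
            other = o if s == node else s
            if other not in distance:
                distance[other] = distance[node] + 1
                queue.append(other)
    return selected
```

Raising `hops` widens the result quickly on highly connected data, which is the hairball effect the following slides set out to reduce.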
  6. PROBLEMS 1) A query graph is TOO complicated to read. Examples: http://lod.ac/species/Bubo and http://dbpedia.org/resource/Tokyo
  7. PROBLEMS 2) Lack of reading flow in RDF data. All triples are treated equally, so background content and the main point are NOT distinguished in an RDF graph. [Figure label: Topic]
  8. GOAL We prefer: ✦ a simply readable graph ✦ a graph with a good reading flow [Figure labels: Topic; Common Information; Topic-Specific Information]
  9. DEMO Short URLs: bit.ly/youtube_rdf4u, bit.ly/sohu_rdf4u. Full URLs: https://www.youtube.com/watch?v=z3roA9-Cp8g, http://my.tv.sohu.com/us/271745761/81854223.shtml
  10. METHODS
  11. OVERALL [Diagram: the Original Query Graph passes through RDF4U (Graph Simplification, Triple Ranking, Property Selection) and becomes a Human-Readable Graph. The user selects simplification rules, chooses a proper rank, and displays/hides properties.]
  12. GRAPH SIMPLIFICATION • Some well-prepared RDF repositories perform reasoning over their ontologies in order to support a SPARQL service. • One impact is that the inferred triples create giant components in a graph. • A closer look at the data indicates that the following situations are common in complex RDF graphs: • equivalent or same-as instances (owl:sameAs), • transitive properties (e.g. skos:broaderTransitive), and • hierarchical classification (rdf:type & rdfs:subClassOf). • Thus, this method aims to remove some redundant triples by using the mechanism of Semantic Web rules.
  13. GRAPH SIMPLIFICATION: three rules. (1) To merge same-as nodes: given s1 p1 o1, s2 p2 o2, and s1 owl:sameAs s2 with fD(s1) > fD(s2), keep s1 p1 o1 and s1 p2 o2. (2) To remove transitive links: given x P y, y P z, and x P z, where P rdf:type owl:TransitiveProperty, keep x P y and y P z. (3) To remove inferred type hierarchies: given x rdf:type C1, x rdf:type C2, and C1 rdfs:subClassOf C2, keep x rdf:type C1 and C1 rdfs:subClassOf C2.
  14. GRAPH SIMPLIFICATION: Example Result [Figure: the Original Query Graph for http://lod.ac/species/Bubo and the Simplified Graph produced by graph simplification. Node labels: Superorder, Order, Family, Genus, Class, Common Name, Scientific Name, Taxon Name, owls, eagle owls, birds, Strigiformes, Strigidae, Aves, Bubo, Coelurosauria, Neognathae. Link labels: hasSynonym, hasParentTaxon, hasTaxonRank, type]
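The three simplification rules from the slides above (merging same-as nodes, removing transitive links, removing inferred types) can be sketched over plain tuples. This is a minimal illustration, not the prototype's implementation: the function names, the prefixed-name strings, the `f_d` count dictionary, and the alphabetical tie-break when two same-as nodes have equal counts are assumptions. Cyclic transitive or subclass links are not treated specially.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

SAME_AS = "owl:sameAs"
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def merge_same_as(triples: Set[Triple], f_d: Dict[str, int]) -> Set[Triple]:
    """Rule 1: merge same-as nodes, keeping the node with the higher dataset count."""
    def rank(node: str):
        return (f_d.get(node, 0), node)  # assumption: ties are broken by name

    replace: Dict[str, str] = {}
    for s, p, o in triples:
        if p == SAME_AS and s != o:
            keep, drop = (s, o) if rank(s) > rank(o) else (o, s)
            replace[drop] = keep

    def canon(node: str) -> str:
        # Each replacement moves to a strictly higher rank, so this terminates.
        while node in replace:
            node = replace[node]
        return node

    return {(canon(s), p, canon(o)) for s, p, o in triples if p != SAME_AS}


def remove_transitive_links(triples: Set[Triple], transitive: Iterable[str]) -> Set[Triple]:
    """Rule 2: drop x P z when x P y and y P z exist for a transitive property P."""
    result = set(triples)
    for prop in transitive:
        succ: Dict[str, Set[str]] = {}
        for s, p, o in triples:
            if p == prop:
                succ.setdefault(s, set()).add(o)
        for x, targets in succ.items():
            for z in targets:
                if any(y not in (x, z) and z in succ.get(y, ()) for y in targets):
                    result.discard((x, prop, z))
    return result


def remove_inferred_types(triples: Set[Triple]) -> Set[Triple]:
    """Rule 3: drop x rdf:type C2 when x rdf:type C1 and C1 rdfs:subClassOf C2."""
    supers: Dict[str, Set[str]] = {}
    types: Dict[str, Set[str]] = {}
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            supers.setdefault(s, set()).add(o)
        elif p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    result = set(triples)
    for x, classes in types.items():
        for c1 in classes:
            for c2 in supers.get(c1, ()):
                if c2 != c1 and c2 in classes:
                    result.discard((x, RDF_TYPE, c2))
    return result


def simplify(triples: Set[Triple], f_d: Dict[str, int], transitive: Iterable[str]) -> Set[Triple]:
    """Apply the three rules in sequence."""
    merged = merge_same_as(set(triples), f_d)
    return remove_inferred_types(remove_transitive_links(merged, list(transitive)))
```

Keeping each rule as its own function mirrors the slides, where the user selects which simplification rules to apply.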
  15. TRIPLE RANKING Since users have different background knowledge of a specific topic, beginners may be interested in reading common information before getting topic-specific information, while experts may prefer to read only topic-specific information. • Concept Level (resources or properties) • General Concepts are commonly known terms such as “name”, “address”, and “class”, which occur throughout a corpus. • Key Concepts are important terms that occur frequently in the query result but rarely in the whole dataset. • Information Level (triples) • Common Information explains the background knowledge that helps readers understand the main content (many general concepts). • Topic-Specific Information contains specific terms that are highly relevant to the article (many key concepts).
  16. TRIPLE RANKING Step 1: Get an RDF graph. Step 2: Identify general concepts and key concepts. [Figure legend: general concepts; key concepts]
  17. TRIPLE RANKING Step 3: Common Information, where most nodes and links are general concepts. Step 4: Topic-Specific Information, where most nodes and links are key concepts. [Figure legend: general concepts; key concepts]
  18. TRIPLE RANKING Concept level, weight of a URI: $w(uri) = \frac{f_Q(uri)}{\log(f_D(uri) + 1)}$, where $f_Q(uri)$ is the number of occurrences of the URI in the query result and $\log(f_D(uri) + 1)$ is a logarithmic scale of its number of occurrences in the whole dataset (high: key concept; low: general concept). Information level, visualization-weight of a triple: $vw(\langle s,p,o \rangle) = \frac{\alpha \cdot w(s) + \beta \cdot w(p) + \gamma \cdot w(o)}{\alpha + \beta + \gamma}$ (high: topic-specific; low: common information). The coefficients are 1.0 by default, which makes the denominator 3, but they can be adjusted for specific purposes.
  19. TRIPLE RANKING: Concept Level. Query topic: dbpedia:Hydrogen. fQ is the count in the query graph; fD is the count in the whole dataset. (raw: 1,291,986) (raw: 15,195,702)
      Resources
      URI | fQ | fD | fQ / log(fD)
      http://dbpedia.org/resource/Hydrogen | 53 | 1,386 | 16.87
      http://dbpedia.org/resource/Category:Chemical_elements | 14 | 10,880 | 3.47
      http://dbpedia.org/resource/Hydrogen_economy | 13 | 6,489 | 3.41
      http://dbpedia.org/resource/Category:Diatomic_nonmetals | 12 | 103 | 5.96
      http://dbpedia.org/resource/Category:Airship_technology | 8 | 166 | 3.60
      http://www.w3.org/2004/02/skos/core#Concept | 8 | 9,707,808 | 1.14
      http://www.w3.org/2002/07/owl#Thing | 2 | 9,761,514 | 0.29
      http://www.hydrogen.energy.gov/ | 1 | 1 | 0.00
      Properties
      URI | fQ | fD | fQ / log(fD)
      http://www.w3.org/2002/07/owl#sameAs | 72 | timeout | 0.00
      http://www.w3.org/1999/02/22-rdf-syntax-ns#type | 38 | timeout | 0.00
      http://www.w3.org/2000/01/rdf-schema#subClassOf | 24 | timeout | 0.00
      http://www.w3.org/2002/07/owl#equivalentClass | 22 | timeout | 0.00
      http://purl.org/dc/terms/subject | 12 | 30,232,709 | 1.60
      http://www.w3.org/2004/02/skos/core#broader | 12 | 2,485,421 | 1.88
      http://xmlns.com/foaf/0.1/isPrimaryTopicOf | 3 | 34,557,438 | 0.40
      http://purl.org/dc/elements/1.1/rights | 2 | 3,102,660 | 0.31
  20. TRIPLE RANKING: Information Level. Example: http://dbpedia.org/resource/Hydrogen, with triples ordered from common to topic-specific information.
      Subject | Predicate | Object | vw
      dp:Hydrogen | rdf:type | owl:Thing | 5.62
      dp:Hydrogen | rdf:type | skos:Concept | 6.01
      dp:Hydrogen | dct:subject | dp:Chemical_elements | 7.31
      dp:Hydrogen | dct:subject | dp:Airship_technology | 7.35
      dp:Hydrogen | rdf:type | dp:Diatomic_nonmetals | 7.48
  21. TRIPLE RANKING: the case of sub-properties (also sub-classes). Schema: ltk:higherTaxon rdfs:subPropertyOf skos:broader; ltk:mergedInto rdfs:subPropertyOf skos:broader. Raw data: a ltk:higherTaxon x; a ltk:mergedInto y. Inferred data: a skos:broader x; a skos:broader y. The raw data are more specific than the inferred data.
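The two weights from the triple ranking slides can be sketched as follows. This is an illustration under stated assumptions rather than the prototype's code: the logarithm base is not given on the slides, and base 10 is assumed here because it reproduces the value 16.87 listed for dbpedia:Hydrogen; a count that timed out is modelled as `None` and given weight 0.0, mirroring the 0.00 entries in the table; the function names and the `counts` dictionary are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]               # (subject, predicate, object)
Counts = Dict[str, Tuple[int, Optional[int]]]  # uri -> (fQ, fD or None on timeout)


def uri_weight(f_q: int, f_d: Optional[int]) -> float:
    """w(uri) = fQ / log(fD + 1), assuming a base-10 logarithm."""
    if f_d is None or f_d <= 0:
        return 0.0  # assumption: unknown or zero dataset count gives weight 0
    return f_q / math.log10(f_d + 1)


def triple_weight(w_s: float, w_p: float, w_o: float,
                  alpha: float = 1.0, beta: float = 1.0, gamma: float = 1.0) -> float:
    """vw = (alpha*w(s) + beta*w(p) + gamma*w(o)) / (alpha + beta + gamma)."""
    return (alpha * w_s + beta * w_p + gamma * w_o) / (alpha + beta + gamma)


def rank_triples(triples: List[Triple], counts: Counts,
                 alpha: float = 1.0, beta: float = 1.0,
                 gamma: float = 1.0) -> List[Tuple[Triple, float]]:
    """Order triples from common (low vw) to topic-specific (high vw)."""
    def vw(triple: Triple) -> float:
        w = [uri_weight(*counts.get(uri, (0, None))) for uri in triple]
        return triple_weight(w[0], w[1], w[2], alpha, beta, gamma)

    return sorted(((t, vw(t)) for t in triples), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Counts copied from the concept-level table for the topic dbpedia:Hydrogen.
    counts: Counts = {
        "dbpedia:Hydrogen": (53, 1386),
        "dbpedia:Category:Chemical_elements": (14, 10880),
        "owl:Thing": (2, 9761514),
        "rdf:type": (38, None),  # the dataset count timed out
        "dct:subject": (12, 30232709),
    }
    triples = [
        ("dbpedia:Hydrogen", "dct:subject", "dbpedia:Category:Chemical_elements"),
        ("dbpedia:Hydrogen", "rdf:type", "owl:Thing"),
    ]
    for triple, score in rank_triples(triples, counts):
        print(f"{score:.2f}", *triple)
```

Raising `beta` relative to `alpha` and `gamma` makes the predicate dominate the ordering, which is one way the adjustable coefficients could be used for a specific purpose.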
  22. OUTCOME
  23. PROTOTYPE http://rc.lodac.nii.ac.jp/rdf4u/ (bit.ly/rdf4u) Features: • To simplify a graph by removing some inferred triples. • To give ranking scores to triples based on common and topic-specific information. • To filter a graph by selecting preferred properties. • To control an interactive graph diagram. Thanks to: Client: D3js, Bootstrap, jQuery; Server: SimpleRDF, SPARQL for PHP
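The property selection feature (displaying or hiding properties) amounts to filtering triples by predicate. A minimal sketch, assuming triples are plain tuples; the function names are hypothetical and do not come from the prototype.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def property_counts(triples: Iterable[Triple]) -> Counter:
    """Count how often each property occurs, e.g. to list the choices for the user."""
    return Counter(p for _, p, _ in triples)


def select_properties(triples: Iterable[Triple], hidden: Set[str]) -> List[Triple]:
    """Keep only triples whose property is not in the user's hidden set."""
    return [t for t in triples if t[1] not in hidden]
```

For example, calling `select_properties(graph, {"owl:sameAs"})` hides every same-as link while leaving the rest of the graph untouched.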
  24. DISCUSSION Usefulness: • A diagram is sparser and easier for humans to read. • Beginners can read common information first. • Experts can read topic-specific information. Uniqueness: • Some graph visualisation tools (Motif, Gephi, RDF Gravity, Fenfire, and IsaViz) do not use the power of the Semantic Web to sparsify a graph, and do not mention providing different data for different user levels. Novelty: • TF-IDF is adapted for ordering triples from the common to the topic-specific level of information. • The degree of commonness versus specificity is calculated by evaluating the nature of the dataset with the algorithm. Prospect: • The triple ranking can be extended by applying various algorithms in order to suit the diverse characteristics of data in other domains, such as Biodiversity Informatics. • Mashup tools should consider this idea.
  25. FUTURE PLAN • To do a critical evaluation: • survey • number of edges cut • To find the precise border between common information and topic-specific information • To find a better way to count the number of URIs (the count always times out) • To remove noisy triples • To improve the triple ranking algorithm for other domains
  26. [Diagram: Original Query Graph, RDF4U (Graph Simplification, Triple Ranking, Property Selection), Human-Readable Graph] http://rc.lodac.nii.ac.jp/rdf4u Thank you very much
  27. THANKS TO THESE IMAGE SOURCES https://www.pinterest.com/pin/444660163179663554/ http://www.clipartpanda.com/categories/reading-clipart https://en.wikipedia.org/wiki/Facebook_like_button http://www.iconarchive.com/show/misc-icons-by-iconlicious/Monitor-icon.html http://www.w3.org/RDF/icons/ http://designplaygrounds.com/tv/the-power-of-data-visualization-2/ https://conceptdraw.com/a1247c3/preview/256
