Assessing Linked Data Mappings using Network Measures

When generating a lot of WoD links automatically, data quality is a pressing issue. This presentation, and the related paper, introduce LinkQA: a network-based, node-centric framework to analyse the impact of linkage on the network topology and assess the quality of these links.

Published in: Technology

Transcript

  • 1. Assessing Linked Data Mappings using Network Measures. Christophe Guéret, Paul Groth, Claus Stadler, Jens Lehmann. 9th Extended Semantic Web Conference (ESWC), May 29, 2012. http://latc-project.eu http://aksw.org http://www.vu.nl (slide footer: ESWC - May 2012, Assessing Linked Data mappings)
  • 2. The next 25+5 minutes: the impact of links in the Web of Data. Main questions: What is the impact of link creation? Can we detect “bad” links based on their impact? Is adding links always a good thing? Contributions: a framework to assess the impact of links; results for 5 metrics.
  • 3. Is this a good or a bad link?
  • 4. Measuring the Web of Data. Look at the topology using network analysis tools: it is impossible to get the complete graph, so sample the graph, focusing on specific nodes. See the bigger picture through aggregation: build the local network around a resource, and repeat the process a sufficient number of times.
  • 5. Network sampling process. Use a SPARQL endpoint or de-reference the resources to get the descriptions.
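
The sampling step on slide 5 could be sketched roughly as below. This is a minimal illustration, not LinkQA's code: `build_local_network`, `describe`, `max_depth` and the toy data are all hypothetical names. In practice `describe` would wrap a SPARQL query or an HTTP de-reference; here a dictionary stands in for it, and only outgoing links are followed.

```python
from collections import deque


def build_local_network(describe, seed, max_depth=2):
    """Breadth-first expansion around `seed`, up to `max_depth` hops.

    `describe(resource)` is a hypothetical callable returning an iterable of
    (predicate, object) pairs for the given resource.
    """
    edges = set()
    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if depth[node] >= max_depth:
            continue  # nodes on the boundary are kept but not expanded
        for predicate, obj in describe(node):
            edges.add((node, predicate, obj))
            if obj not in depth:
                depth[obj] = depth[node] + 1
                queue.append(obj)
    return edges


# Toy stand-in for an endpoint: resource -> list of (predicate, object).
TOY_DATA = {
    "ex:a": [("ex:knows", "ex:b"), ("owl:sameAs", "ex:c")],
    "ex:b": [("ex:knows", "ex:d")],
    "ex:c": [],
    "ex:d": [("ex:knows", "ex:e")],
}

local = build_local_network(lambda r: TOY_DATA.get(r, []), "ex:a", max_depth=2)
```

Repeating this for many seed resources and aggregating the per-node scores gives the "bigger picture" described on slide 4.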
  • 6. Aggregation of local results (figure labels: Observed, Target, …)
  • 7. Metrics: compute local scores for a resource. Criteria: use only the local network; representative of a global property; not sensitive to a change of observation scale. 5 metrics currently available in LinkQA.
  • 8. What do we want to see? Increase of connectivity within topical groups: increase chances of finding related information. More bridges between topical groups: improve browsing capabilities. More connectivity around hubs: decrease the dependency upon the hubs.
  • 9. Metric 1 – Degree. Metric: number of edges around the target node. Target: power-law distribution of values. Intuition: presence of hubs.
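
As a rough illustration of the degree metric, the sketch below counts the edges touching each node of a small local network and tallies how often each degree occurs. The function names and the toy edge list are assumptions made for the example; how LinkQA compares the resulting distribution with a power law is not shown in the slides and is not reproduced here.

```python
from collections import Counter


def degree(edges, node):
    """Number of edges touching `node`, counting both directions."""
    return sum(1 for source, target in edges if node in (source, target))


def degree_distribution(edges):
    """Map each degree value to the number of nodes having that degree."""
    nodes = {n for edge in edges for n in edge}
    return Counter(degree(edges, n) for n in nodes)


# Toy local network (hypothetical): one hub and three leaves.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]
hub_degree = degree(toy_edges, "hub")
distribution = degree_distribution(toy_edges)
```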
  • 10. Metric 2 – Clustering coefficient. Metric: density of links around the target node. Target: increase clustering around nodes. Intuition: topical clusters.
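
One way to read "density of links around the target node" is the standard local clustering coefficient of an undirected graph: the fraction of pairs of neighbours that are themselves linked. The sketch below assumes that reading; LinkQA's exact definition may differ, and the names and toy edges are illustrative only.

```python
from itertools import combinations


def clustering_coefficient(edges, node):
    """Local clustering coefficient of `node`, ignoring edge direction."""
    neighbours = {}
    for source, target in edges:
        if source == target:
            continue  # self-loops do not contribute
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)
    around = neighbours.get(node, set())
    k = len(around)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pair to connect
    linked_pairs = sum(1 for u, v in combinations(around, 2) if v in neighbours[u])
    return 2.0 * linked_pairs / (k * (k - 1))


# Toy local network (hypothetical) before and after adding one link.
before = [("t", "a"), ("t", "b"), ("t", "c"), ("a", "b")]
after = before + [("b", "c")]
score_before = clustering_coefficient(before, "t")
score_after = clustering_coefficient(after, "t")
```

In this toy case the added link raises the score from 1/3 to 2/3, which is the direction the slide names as the target.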
  • 11. Metric 3 – Centrality. Metric: ratio between outgoing and incoming links. Target: lower the discrepancy between the values. Intuition: hubs are sensitive.
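
The ratio on this slide could be computed along the following lines. The slide does not say which way the ratio is taken or how nodes without incoming links are handled, so both choices below (outgoing over incoming, infinity when nothing comes in) are assumptions, as are the names.

```python
def out_in_ratio(edges, node):
    """Outgoing links divided by incoming links for `node` (directed edges)."""
    outgoing = sum(1 for source, _ in edges if source == node)
    incoming = sum(1 for _, target in edges if target == node)
    if incoming == 0:
        # Assumed convention: infinite ratio if there are only outgoing links.
        return float("inf") if outgoing else 0.0
    return outgoing / incoming


# Toy local network (hypothetical): three nodes point at a hub.
directed = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("hub", "d")]
hub_ratio = out_in_ratio(directed, "hub")
```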
  • 12. Metric 4 – SameAs chains. Metric: number of “open” sameAs chains. Target: no open sameAs. Intuition: peer agreement.
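
The slide does not define what makes a sameAs chain "open". The sketch below assumes one possible reading: a sameAs link leaving the target node is open when no sequence of sameAs links leads back to that node within the local network. Treat both the definition and the names as assumptions.

```python
def open_sameas_chains(edges, node, same_as="owl:sameAs"):
    """Count sameAs links from `node` that never lead back to it (assumed definition)."""
    links = {}
    for source, predicate, target in edges:
        if predicate == same_as:
            links.setdefault(source, set()).add(target)

    def reaches(start, goal):
        # Depth-first search along sameAs links only.
        seen, stack = set(), [start]
        while stack:
            current = stack.pop()
            if current == goal:
                return True
            if current in seen:
                continue
            seen.add(current)
            stack.extend(links.get(current, ()))
        return False

    return sum(1 for target in links.get(node, ()) if not reaches(target, node))


# Toy triples (hypothetical): ex:b agrees with ex:a, ex:c does not link back.
triples = [
    ("ex:a", "owl:sameAs", "ex:b"),
    ("ex:b", "owl:sameAs", "ex:a"),
    ("ex:a", "owl:sameAs", "ex:c"),
]
open_count = open_sameas_chains(triples, "ex:a")
closed_count = open_sameas_chains(triples + [("ex:c", "owl:sameAs", "ex:b")], "ex:a")
```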
  • 13. Metric 5 – Description enrichment. Metric: richness of resource description. Target: increase as much as possible. Intuition: “SameAsed” resources are complementary.
  • 14. Under the hood of LinkQA (image: http://www.flickr.com/photos/cradlehall/5747161514)
  • 15. Workflow of an analysis
  • 16. Output of an analysis. Results on the node and aggregated scales. Per metric: an indication of change with respect to the target, and a list of outlier nodes sorted by their distance to the target. Plus a global ranking of nodes. => Input for manual inspection by an expert.
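
The two rankings on slide 16 could be produced roughly as follows. The slides do not give the distance function or the way per-metric rankings are combined, so this sketch assumes a scalar target per metric, an absolute difference as the distance, and a sum of rank positions for the global ranking. All names and scores are made up for the example.

```python
def rank_outliers(scores, target):
    """Nodes sorted by |score - target|, farthest from the target first."""
    return sorted(scores, key=lambda n: abs(scores[n] - target), reverse=True)


def global_ranking(per_metric_rankings):
    """Combine rankings by summing rank positions (assumed scheme); lowest sum first."""
    totals = {}
    for ranking in per_metric_rankings:
        for position, node in enumerate(ranking):
            totals[node] = totals.get(node, 0) + position
    return sorted(totals, key=lambda n: (totals[n], n))


# Hypothetical per-node scores for two metrics, each with a target of 1.0.
clustering_scores = {"a": 0.9, "b": 0.1, "c": 0.5}
ratio_scores = {"a": 1.0, "b": 2.0, "c": 4.0}

by_clustering = rank_outliers(clustering_scores, target=1.0)
by_ratio = rank_outliers(ratio_scores, target=1.0)
overall = global_ranking([by_clustering, by_ratio])
```

The nodes at the head of `overall` are the ones an expert would inspect first.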
  • 17. Experimental results
  • 18. Global impact of links. Observe the distributions to detect bad links.
  • 19. First evaluation. 160 linking specifications for Silk, developed in the context of LATC. 6 linking specifications with manual verification of results: 50 positive links, 50 negative links. Execute LinkQA with 10 samples of 50 links.
  • 20. Results of the detection. “C” if change detected in > 50% of runs.
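
The labelling rule on slide 20 amounts to a strict majority vote over the runs. A minimal sketch, where `detection_label` is a hypothetical name and the "-" returned for the other case is a placeholder (the slide only specifies the "C" label):

```python
def detection_label(run_results):
    """Return "C" when a change was detected in strictly more than 50% of runs."""
    if not run_results:
        raise ValueError("at least one run is required")
    detected = sum(1 for changed in run_results if changed)
    # "-" is a placeholder for "no label"; the slide only defines "C".
    return "C" if detected / len(run_results) > 0.5 else "-"


# Ten runs, as in the first evaluation; the outcomes are invented.
six_of_ten = detection_label([True] * 6 + [False] * 4)
five_of_ten = detection_label([True] * 5 + [False] * 5)
```

Exactly half of the runs is not enough: the threshold on the slide is strictly greater than 50%.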
  • 21. Some explanations. Low sensitivity of metrics: lack of data; stable change. 50/50 accuracy of detection: targets may not be the right ones; sample may not be big enough; semantics-agnostic measures perform less well.
  • 22. A closer look at the outliers. See if the outliers are necessarily bad links.
  • 23. Second evaluation. Linking specifications for Silk, developed in the context of LATC. All linking specifications sampled to have 45 positive links and 5 negative links. Execute LinkQA five times, on five samples.
  • 24. Rank of positive and negative links
  • 25. Take-home message. LinkQA is a node-centric approach to measure the impact of links in the WoD network; it is scalable and can be distributed. Current results show that: the 5 metrics defined need to be improved; metrics considering semantics perform better; the network sample seems too small; outlier detection improves with the number of metrics.
