DSNotify - Detecting and Fixing Broken Links in Linked Data Sets
Bernhard Haslhofer and Niko Popitsch, University of Vienna
Web Semantic Workshop, DEXA 2009 Linz, 2 September 2009


Transcript

  • 1. DSNotify - Detecting and Fixing Broken Links in Linked Data Sets. WebS ’09 @ DEXA 2009, Linz, 02/09/2009. Bernhard Haslhofer and Niko Popitsch
  • 2. Summary
  • 3.
      <mo:MusicGroup rdf:about="/music/artists/084308bd-1654-436f-ba03-df6697104e19#artist">
        <foaf:name>Green Day</foaf:name>
        <owl:sameAs rdf:resource="http://dbpedia.org/resource/Green_Day" />
        <mo:image rdf:resource="/music/images/artists/7col_in/084308bd-1654-436f-ba03-df6697104e19.jpg" />
        <foaf:page rdf:resource="/music/artists/084308bd-1654-436f-ba03-df6697104e19.html" />
        <mo:musicbrainz rdf:resource="http://musicbrainz.org/artist/084308bd-1654-436f-ba03-df6697104e19.html" />
        <mo:homepage rdf:resource="http://www.greenday.com/" />
        <mo:fanpage rdf:resource="http://www.greendayvideos.com/" />
        <mo:fanpage rdf:resource="http://www.greenday.net" />
        <mo:imdb rdf:resource="http://www.imdb.com/name/nm1554564/" />
        <mo:myspace rdf:resource="http://www.myspace.com/greenday" />
        ...
  • 4.
      ...
      <rdf:Description rdf:about="http://dbpedia.org/resource/Green_Day">
        <dbpprop:abstract xmlns:dbpprop="http://dbpedia.org/property/" xml:lang="en">Green Day is an American rock trio formed in 1987. The band has consisted of Billie Joe Armstrong (vocals, guitar), Mike Dirnt, and Tré Cool for the majority of its existence... </dbpprop:abstract>
      </rdf:Description>
      ...
      <rdf:Description rdf:about="http://dbpedia.org/resource/Green_Day">
        <dbpprop:abstract xmlns:dbpprop="http://dbpedia.org/property/" xml:lang="de">Green Day [gɹiːn deɪ] ist eine US-amerikanische Punk-Rock-Band, mit der Anfang der 1990er das Punk-Revival begann. Die Band wurde 1987 von Billie Joe Armstrong und Mike Dirnt zusammen mit dem Schlagzeuger John Kiffmeyer alias Al Sobrante als The Sweet Children.... </dbpprop:abstract>
      </rdf:Description>
      ...
  • 5. ...but...
  • 6. Some numbers...
      • Events between DBpedia 3.2 (10/2008) and 3.3 (05/2009)
          • # resources created: 29449
          • # resources removed: 4789
          • # resources moved: 729
  • 7. Link Integrity...
      • is a qualitative property that is given when all links within and between a set of data sources are valid and deliver the result intended by the link creator.
      • cf. referential integrity in RDBMS
      • demands a solution that
          • detects broken links between resources
          • provides support for fixing broken links
  • 8. Types of broken links
      • Removed link targets
          • e.g., resource deleted, server not available anymore, etc.
      • Moved link targets
          • available at another Web location
          • e.g., reorganization of Web resources
      • Modified link targets
  • 9. The DSNotify Approach
      • periodically monitor items (resources) in a specific Linked Data source
      • extract a descriptive feature vector for each item
      • store item + feature vector in an index
      • use feature vectors to detect whether items have been removed or moved to another location
      • if moved, add a relationship between the “old” and the “new” item
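The monitoring cycle on this slide could be sketched roughly as follows. This is a minimal illustration, not the DSNotify implementation: the `Item` type, the `extract_features`, `snapshot`, and `diff` functions, the `example.org` URIs, and the choice of features (lower-cased label tokens) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    uri: str
    features: frozenset  # hypothetical feature vector: a set of label tokens


def extract_features(label: str) -> frozenset:
    """Assumed feature extraction: lower-cased tokens of a resource label."""
    return frozenset(label.lower().split())


def snapshot(source: dict) -> dict:
    """Build an index (URI -> Item) from one monitoring pass.

    `source` maps URIs to labels and stands in for a Linked Data source.
    """
    return {uri: Item(uri, extract_features(label)) for uri, label in source.items()}


def diff(old_index: dict, new_index: dict):
    """Compare two monitoring passes: which items disappeared, which appeared."""
    disappeared = [old_index[u] for u in old_index.keys() - new_index.keys()]
    appeared = [new_index[u] for u in new_index.keys() - old_index.keys()]
    return disappeared, appeared


# Made-up data: one resource changes its URI between two passes.
before = snapshot({"http://example.org/a": "Green Day",
                   "http://example.org/b": "Some Band"})
after = snapshot({"http://example.org/band/a": "Green Day",
                  "http://example.org/b": "Some Band"})
disappeared, appeared = diff(before, after)
```

A disappeared item is only a candidate at this point: whether it was removed or moved is decided by comparing its feature vector with those of the newly appeared items.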
  • 10. Architecture
      [Architecture diagram. Labels: LOD sources; LOD “consuming” application; owl:sameAs links; monitor; update; Monitor (feature extraction); Indices (II, RII, AII); Move Detector (heuristic); Decider (decision making); user; Event log; LOD source updater; notifications; querying; DSNotify]
  • 11. Index Interaction
      [Timeline diagram with three columns: Item Index (II), Archived Item Index (AII), Removed Item Index (RII). URIs shown per time step:
      t1: http://dbpedia.org/resource/Green_Day (band)
      t2: http://dbpedia.org/resource/Green_Day (band)
      t3: http://dbpedia.org/resource/band/Green_Day and http://dbpedia.org/resource/Green_Day (band)
      t4: http://dbpedia.org/resource/band/Alternative/Green_Day, http://dbpedia.org/resource/band/Green_Day, and http://dbpedia.org/resource/Green_Day (band)]
  • 12. Move Detection
      • is a semi-automatic process
      • calculate similarity between items based on their feature vectors using domain-specific heuristics
      • probability > given threshold: automatic decision
      • probability < given threshold: ask expert user
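The threshold rule on this slide might look like the following sketch. The slides do not say which similarity measure DSNotify uses, so Jaccard similarity over token sets is swapped in here purely for illustration. The function names, the default threshold of 0.8, and the "removed" outcome when no candidate shares any feature are likewise assumptions.

```python
def jaccard(a: frozenset, b: frozenset) -> float:
    """Similarity of two feature sets; stands in for a domain-specific heuristic."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def detect_move(old_features: frozenset, candidates: dict, threshold: float = 0.8):
    """Decide what happened to one disappeared item.

    `candidates` maps the URIs of newly appeared items to their feature sets.
    Returns a (decision, uri) pair.
    """
    best_uri, best_score = None, 0.0
    for uri, features in candidates.items():
        score = jaccard(old_features, features)
        if score > best_score:
            best_uri, best_score = uri, score
    if best_uri is None:
        # Assumption: no candidate shares any feature, so treat the item as removed.
        return ("removed", None)
    if best_score > threshold:
        return ("moved", best_uri)  # automatic decision
    return ("ask_user", best_uri)   # below (or at) the threshold: defer to an expert


old = frozenset({"green", "day"})
decision = detect_move(old, {
    "http://example.org/band/green_day": frozenset({"green", "day"}),
    "http://example.org/band/other": frozenset({"green", "river"}),
})
```

Raising the threshold sends more cases to the expert user; lowering it automates more decisions at the risk of wrong move detections.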
  • 13. DSNotify HTTP Interface
      • GET http://<server>:<port>/<dsnotify>/item/<uri>
          • find out what happened with an item
      • GET http://<server>:<port>/<dsnotify>/eventChoice
          • retrieve pending event choices (move / remove)
      • ...
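A client could build these request URLs along the following lines. The slide gives only the URL patterns, so several things here are assumptions: that the item URI is percent-encoded into the last path segment, the helper names `item_url` and `event_choice_url`, and the host, port, and context path used in the example.

```python
from urllib.parse import quote


def item_url(server: str, port: int, context: str, item_uri: str) -> str:
    """URL for asking what happened to an item.

    Assumption: the item URI is percent-encoded so it fits into one path segment.
    """
    return f"http://{server}:{port}/{context}/item/{quote(item_uri, safe='')}"


def event_choice_url(server: str, port: int, context: str) -> str:
    """URL for retrieving pending event choices (move / remove)."""
    return f"http://{server}:{port}/{context}/eventChoice"


# Hypothetical deployment on localhost:8080 under the path "dsnotify".
url = item_url("localhost", 8080, "dsnotify", "http://dbpedia.org/resource/Green_Day")
pending = event_choice_url("localhost", 8080, "dsnotify")
```

Either URL could then be fetched with a plain HTTP GET, for example via `urllib.request.urlopen`; the response format is not described in the slides.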
  • 14. Evaluation Plan
      [Timeline diagram: snapshots t-n ... t-2, t-1, t0 labelled DBpedia 2.0, DBpedia 3.0, DBpedia 3.1, DBpedia 3.2; a diff between consecutive releases; each diff is manually classified into moves (mv) and removals (rm)]
  • 15. Status / Future Work
      • 1st prototype (infrastructure) ready
      • annotated test-data set based on DBpedia available
      • Currently working on:
          • system for simulating past modifications in DBpedia
          • the DSNotify evaluation
  • 16. Fixing Your Web since 2009
  • 17. Backup
  • 18. Evaluation Plan
      • Monitor simulated DBpedia evolution (t-n to t0)
      • Precision / recall of automatic move detection
          • with different similarity thresholds
          • with different heuristics and feature vectors
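Precision and recall of move detection against a manually classified reference could be computed as in this sketch. The `precision_recall` helper and the (old, new) pairs are made up for illustration and are not results from the DSNotify evaluation.

```python
def precision_recall(detected: set, actual: set):
    """Precision and recall of detected moves against manually classified moves.

    Both sets contain (old_uri, new_uri) pairs.
    """
    true_positives = len(detected & actual)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Made-up example: four real moves, three detected, two of them correct.
actual_moves = {("a", "a2"), ("b", "b2"), ("c", "c2"), ("d", "d2")}
detected_moves = {("a", "a2"), ("b", "b2"), ("e", "e2")}
precision, recall = precision_recall(detected_moves, actual_moves)
```

Repeating this calculation for several similarity thresholds would show the trade-off between the two measures that the evaluation plan refers to.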
  • 19. Linked Data / Web of Data
      • Data management paradigm on the basis of Web technologies
      • HTTP, URI, and RDF/S are the key technologies
      • Applications (not Web browsers) are data consumers
      • Links between resources play a major role
