Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open Data

The intensive growth of the Linked Open Data (LOD) Cloud has spawned a web of data where a multitude of data sources provides huge amounts of valuable information across different domains. Nowadays, when accessing and using Linked Data, the challenging question is more and more often not so much whether relevant data is available, but rather where it can be found, how it is structured, and how to make best use of it.

In this lecture I will start by giving a brief introduction to the concepts underlying LOD. Then I will focus on three aspects of current research:
(1) Managing Linked Data. Index structures play an important role in making use of the information in the LOD cloud. I will give an overview of indexing approaches, present algorithms and discuss the ideas behind the index structures.
(2) Analysing Linked Data. I will present methods for analysing various aspects of LOD, ranging from an information-theoretic analysis for measuring structural redundancy, through formal concept analysis for identifying alternative declarative descriptions, to a dynamics analysis for capturing the evolution of Linked Data sources.
(3) Making Use of Linked Data. Finally, I will give a brief overview and outlook on where the presented techniques and approaches are of practical relevance in applications.

(Talk at the IRSS summer school 2014 in Athens)


Transcript

  • 1. Institute for Web Science & Technologies – WeST Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open Data Thomas Gottron July 18th, 2014 IRSS
  • 2. Thomas Gottron, IRSS, Athens, 18.7.2014, Leveraging the Web of Data. Linked Open Data – Vision of a Web of Data. "Classic" Web: linked documents. Web of Data: linked data entities.
  • 3. Linked Open Data – Vision of a Web of Data. [Figure: the "classic" Web next to the Web of Data, in which entities carry IDs]
  • 4. LOD – Base technologies. IDs: dereferenceable HTTP URIs. Data format: RDF. No schema. Links to other data sources. 1 statement = 1 triple (subject, predicate, object). Prefixes: rdf:type = http://www.w3.org/1999/02/22-rdf-syntax-ns#type, foaf:Document = http://xmlns.com/foaf/0.1/Document. [Figure: example RDF graph in which http://dblp.l3s.de/.../NesterovAM98 (rdf:type foaf:Document and swrc:InProceedings, dc:title "Extracting schema ...") is connected via dc:creator to http://dblp.l3s.de/.../Serge_Abiteboul (rdf:type fb:Computer_Scientist, foaf:name "Serge Abiteboul", rdfs:seeAlso http://www.bibsonomy.org/.../Serge+Abiteboul)]
  • 5. LOD Cloud. "… the Web of Linked Data consisting of more than 30 Billion RDF triples from hundreds of data sources …" (Gerhard Weikum, "Where's the Data in the Big Data Wave?", SIGMOD Blog, 6.3.2013, http://wp.sigmod.org/)
  • 6. Some "Bubbles" of the LOD Cloud
  • 7. Making Use of the Linked Data Cloud ... LOD: a rich, huge, diverse, public and distributed knowledge base on the Web. Pros / Cons: rich; knowledge base; diverse; public; huge; on the Web; diverse; distributed. Find technical solutions to overcome challenges. [Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/]
  • 8. Overview: Managing, Analysing, Making Use. [Figure: key/data search structure for efficient storage and retrieval; plot of micro-averaged F1 per weekly data snapshot for the RDF Type, TS, PS, IPS, ECS and SchemEX indices; example entity with rdf:type, dc:creator and dc:title statements]
  • 9. Overview: Managing. [Figure: same overview as on slide 8]
  • 10. Data Format. Linked Data as N-Quads (s p o c). Triple (s p o): what is the information? Context URI (c): where does it come from?
  • 11. Index Models
  • 12. (Abstract) Index Models. D: data elements to be retrieved (payload). K: key elements to access the data (index elements). σ: selection function, how to get data for a key: $\sigma : K \to \wp(D)$. [Figure: search data structure for efficient storage and retrieval, mapping keys k1, k2, k3, ..., kn to data items / payload d1,1, d1,2, d1,3, ...; d2,1, d2,2; d3,1, d3,2, d3,3, ...; dn,1, dn,2, dn,3, ...]
  • 13. Choices for Key Elements. Subject (keys s1, s2, ..., sn). Context (keys c1, c2, ..., cm). Literals (keys taken from literal objects). Types (keys t1, t2, ..., tk, taken from rdf:type statements). Alternative: predicates or objects.
  • 14. Choices for the Payload. Full caching (s p o c held locally). Triples (s p o). Entities (s). Data sources (c). ...
  • 15. Concrete Example: Subject Based Index Model. Key west:Gottron: (west:Gottron, rdf:type, foaf:Person), (west:Gottron, foaf:knows, west:Staab), ... Key west:Staab: (west:Staab, swrc:institution, west:WeST), (west:Staab, foaf:name, "Steffen Staab"), ... Key west:Schegi: (west:Schegi, rdf:type, foaf:Person), (west:Schegi, foaf:name, "Stefan Scheglmann"). ... Key tud:CGottron: (tud:CGottron, swrc:institution, tud:KOM), (tud:CGottron, foaf:knows, west:Gottron), ...
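The abstract index model (keys K, payload D, selection function σ) and the subject-based example can be sketched in a few lines of Python. This is only a minimal in-memory illustration, not the implementation from the lod-index-models repository; the function names `build_index` and `select` and the context URIs are assumptions made for the example.

```python
from collections import defaultdict

# Quads are (subject, predicate, object, context). The triples follow the
# slide; the context URIs are made up for illustration.
QUADS = [
    ("west:Gottron", "rdf:type", "foaf:Person", "http://example.org/west"),
    ("west:Gottron", "foaf:knows", "west:Staab", "http://example.org/west"),
    ("west:Staab", "swrc:institution", "west:WeST", "http://example.org/west"),
    ("west:Staab", "foaf:name", "Steffen Staab", "http://example.org/west"),
    ("tud:CGottron", "foaf:knows", "west:Gottron", "http://example.org/tud"),
]


def build_index(quads, key_of, payload_of):
    """Map each key to the set of payload items derived from matching quads."""
    index = defaultdict(set)
    for quad in quads:
        index[key_of(quad)].add(payload_of(quad))
    return index


def select(index, key):
    """Selection function sigma: returns a (possibly empty) set of payload items."""
    return index.get(key, set())


# Key = subject, payload = triples.
subject_index = build_index(QUADS, key_of=lambda q: q[0], payload_of=lambda q: q[:3])
# Key = context, payload = entities (subjects).
context_index = build_index(QUADS, key_of=lambda q: q[3], payload_of=lambda q: q[0])

print(select(subject_index, "west:Gottron"))
```

Swapping `key_of` and `payload_of` is all it takes to move between the key and payload choices listed on slides 13 and 14.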
  • 16. Implemented Index Models. Triple based: keyed by subject (s), predicate (p), object (o) or term. Meta data: keyed by context (c) or PLD. https://github.com/gottron/lod-index-models
  • 17. Schema-level Indices
  • 18. Schema Information on the LOD Cloud. (No) schema? Guidelines / best practices, automatic tools and social effects lead to an emerging schema! Induce from data observations.
  • 19. Examples for Schema Information. Property set: an entity x with properties p1, p2, p3 has the property set {p1, p2, p3}. Type set: an entity y with rdf:type t1 and rdf:type t2 has the type set {t1, t2}.
  • 20. Implemented Index Models. Triple based. Meta data. Schema-level. https://github.com/gottron/lod-index-models [Figure: key icons for the index models: s, p, o, term, c and PLD, followed by the schema-level keys built from types (t), properties (p) and inverse properties (p-1), and SchemEX]
  • 21. SchemEX
  • 22. Schema-based Access to the LOD cloud. Query: SELECT ?x WHERE { ?x rdf:type foaf:Document . ?x rdf:type swrc:InProceedings . ?x dc:creator ?y . ?y rdf:type fb:Computer_Scientist } [Figure: the query as a graph pattern, with x typed foaf:Document and swrc:InProceedings and linked via dc:creator to an entity typed fb:Computer_Scientist]
  • 23. Schema-based Access to the LOD cloud. A schema-level index answers the question "Where?" (here: ACM, DBLP) for the query SELECT ?x WHERE { ?x rdf:type foaf:Document . ?x rdf:type swrc:InProceedings . ?x dc:creator ?y . ?y rdf:type fb:Computer_Scientist }. Which schema information?
  • 24. Typecluster. Entities with the same set of types. [Figure: type cluster TCj over the types t1, t2, ..., tn, pointing to the data sources C1, C2, ..., Cm]
  • 25. Typecluster: Example. tc2309 with the types foaf:Document and swrc:InProceedings points to DBLP and ACM.
  • 26. Property Sets. Entities with the same set of properties. [Figure: property set PSi over the properties p1, p2, ..., pn, pointing to the data sources C1, C2, ..., Cm]
  • 27. Bi-Simulation: Example. ps2608 with the property dc:creator points to BBC and DBLP.
  • 28. SchemEX: Combination TC and PS. Partition of TC based on PS with restrictions on the destination TC (equivalence relation). [Figure: schema level with type clusters TCj and TCk, property set PSi and equivalence classes EQCl; payload level with the data sources C1, C2, ..., Cm and Cx]
  • 29. SchemEX: Example. Query: SELECT ?x WHERE { ?x rdf:type foaf:Document . ?x rdf:type swrc:InProceedings . ?x dc:creator ?y . ?y rdf:type fb:Computer_Scientist } [Figure: index elements tc2309, tc2101, eqc707 and ps2608 for the schema terms foaf:Document, swrc:InProceedings, fb:Computer_Scientist and dc:creator, pointing to the data source DBLP ...]
  • 30. Building Indices
  • 31. Building Indices: Operators. Combination of few simple operations: Aggregate, Join, Invert. Example: Property Set index. Input quads (s p o c): (s1 p1 o1 c1), (s1 p2 o1 c1), (s2 p2 o2 c1), (s3 p1 o3 c1), (s3 p2 o4 c1), (s4 p3 o1 c1). Aggregate: s1 → {p1, p2}, s2 → {p2}, s3 → {p1, p2}, s4 → {p3}. Invert: {p1, p2} → s1, s3; {p2} → s2; {p3} → s4.
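The Aggregate and Invert steps of the property set example can be reproduced with a short Python sketch. The quads are the ones from the slide; the function names `aggregate` and `invert` are chosen for this illustration and are not taken from the original implementation.

```python
from collections import defaultdict

# Input quads (subject, predicate, object, context) from the slide example.
quads = [
    ("s1", "p1", "o1", "c1"),
    ("s1", "p2", "o1", "c1"),
    ("s2", "p2", "o2", "c1"),
    ("s3", "p1", "o3", "c1"),
    ("s3", "p2", "o4", "c1"),
    ("s4", "p3", "o1", "c1"),
]


def aggregate(quads):
    """Aggregate: collect the set of properties used by each subject."""
    props = defaultdict(set)
    for s, p, _o, _c in quads:
        props[s].add(p)
    # frozenset makes the property sets usable as dictionary keys later on.
    return {s: frozenset(ps) for s, ps in props.items()}


def invert(mapping):
    """Invert: turn subject -> property set into property set -> subjects."""
    inverted = defaultdict(set)
    for key, value in mapping.items():
        inverted[value].add(key)
    return dict(inverted)


property_sets = aggregate(quads)
ps_index = invert(property_sets)

for ps, subjects in ps_index.items():
    print(sorted(ps), "->", sorted(subjects))
```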
  • 32. SchemEX: Computation. Precise computation: 3 Aggregates + Join + Invert. Better approach? (Faster, more scalable) [Figure: SchemEX schema and payload structure as on slide 28]
  • 33. Stream-based Computation of SchemEX. LOD Crawler: stream of N-Quads … Q16, Q15, Q14, Q13, Q12, Q11, Q10, Q9, Q8, Q7, Q6, Q5, Q4, Q3, Q2, Q1, processed through a FiFo cache. [Figure: FiFo cache of entities with their types feeding the SchemEX schema and payload structure of slide 28]
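One way to read the stream-based idea is the following sketch: quads are consumed one by one, schema information is accumulated per subject in a bounded FIFO cache, and an evicted subject is written to the index with whatever has been seen so far. This is an assumption-laden illustration restricted to a property set index with data sources as payload; it is not the SchemEX implementation, and `stream_property_set_index` and its parameters are hypothetical names.

```python
from collections import OrderedDict, defaultdict


def stream_property_set_index(quad_stream, cache_size):
    """Approximate property-set index built in one pass (assumes cache_size >= 1)."""
    cache = OrderedDict()      # subject -> (properties, contexts), oldest first
    index = defaultdict(set)   # property set -> contexts (data sources)

    def flush(entry):
        props, contexts = entry
        index[frozenset(props)].update(contexts)

    for s, p, _o, c in quad_stream:
        if s not in cache:
            if len(cache) >= cache_size:
                # FIFO eviction: the subject that entered first leaves first.
                _oldest, entry = cache.popitem(last=False)
                flush(entry)
            cache[s] = (set(), set())
        props, contexts = cache[s]
        props.add(p)
        contexts.add(c)

    # Write out everything still cached at the end of the stream.
    for entry in cache.values():
        flush(entry)
    return dict(index)


stream = [
    ("s1", "p1", "o", "c1"),
    ("s2", "p2", "o", "c1"),
    ("s1", "p2", "o", "c1"),  # arrives after s1 may already have been evicted
]
exact = stream_property_set_index(stream, cache_size=2)
approx = stream_property_set_index(stream, cache_size=1)
print(sorted(map(sorted, exact)), sorted(map(sorted, approx)))
```

With a cache that is too small, the statements about `s1` are split across two cache residencies, so the index records the partial property sets {p1} and {p2} instead of {p1, p2}. This is the kind of deviation that a comparison between approximated and precise index would measure.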
  • 34. Quality of Approximated Index. Stream-based computation vs. precise computation on a data set of 11 million triples.
  • 35. SchemEX: 1st place @ BTC 2011. SchemEX: allows complex queries (star, chain); scalable computation; high quality. Index over BTC 2011 data: 2.17 billion triples; index: 55 million triples. Commodity hardware: VM with 1 core and 4 GB RAM; throughput: 39,500 triples / second; computation of full index: 15 h.
  • 36. Overview: Analysing. [Figure: same overview as on slide 8]
  • 37. Redundancy of Schema Information
  • 38. Explicit and Implicit Schema Information on LOD. Explicit: assigning class types (entity rdf:type class type). Implicit: modelling attributes (entity property entity 2). To which degree? [Figure: example entity E1 with rdf:type foaf:Document and swrc:InProceedings, dc:title "Bad News ..." and dc:creator E2]
  • 39. Probabilistic model of schema information. Joint distribution P(T,R) over type sets TS (e.g. foaf:Document, swrc:InProceedings) and property sets PS (e.g. dc:creator, dc:title). P(T=t,R=r): probability of a resource to have type set t and property set r. The last column and the last row are the marginal distributions:
      P(T,R) |  r1 |  r2 |  r3 |  r4 | P(T)
      t1     | 14% |  2% |  5% |  8% | 29%
      t2     |  5% | 15% |  2% |  3% | 25%
      t3     |  7% |  3% | 30% |  5% | 45%
      P(R)   | 26% | 20% | 37% | 17% |
  • 40. Estimating probabilities. Todo (on large data sets): determine schema use, aggregate counts. "Query" the schema-level index: $p(t,r) = \frac{|\{d \in \sigma(t,r)\}|}{N}$. Data background: segments from BTC'12:
      Data set | Triples | TS        | PS
      Rest     | 22.3M   | 793       | 7,522
      Datahub  | 910.1M  | 28,924    | 14,712
      DBpedia  | 198.1M  | 1,026,272 | 391,170
      Freebase | 101.2M  | 69,732    | 162,023
      Timbl    | 204.8M  | 4,139     | 9,619
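The counting step behind $p(t,r)$ can be illustrated as follows. The sketch assumes that the type set and property set of every entity are already known (for instance from a schema-level index) and takes N as the number of entities; the entities and the function name `estimate_joint` are invented for the example.

```python
from collections import Counter


def estimate_joint(schema_of_entity):
    """Estimate p(t, r) as the fraction of entities with type set t and property set r."""
    n = len(schema_of_entity)
    counts = Counter(schema_of_entity.values())
    return {pair: count / n for pair, count in counts.items()}


# Hypothetical type sets and property sets, frozen so they can serve as keys.
doc = frozenset({"foaf:Document", "swrc:InProceedings"})
person = frozenset({"foaf:Person"})
authored = frozenset({"dc:creator", "dc:title"})
titled = frozenset({"dc:title"})
named = frozenset({"foaf:name"})

entities = {
    "e1": (doc, authored),
    "e2": (doc, authored),
    "e3": (doc, titled),
    "e4": (person, named),
}

p = estimate_joint(entities)
print(p[(doc, authored)])
```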
  • 41. Redundancy. To which degree can one type of information (either types or properties) explain the respective other? Mutual information: $I(T,R) = \sum_{t \in TS} \sum_{r \in PS} p(t,r) \cdot \log\left(\frac{p(t,r)}{P(T=t) \cdot P(R=r)}\right)$. Normalized MI: $I_0(T,R) = \frac{I(T,R)}{\min\left(H(T), H(R)\right)}$, where H(T) and H(R) are the entropies of the marginal distributions.
  • 42. Example: Normalized Mutual Information. First distribution, $I_0(T,R) = 0.239$:
      P(T,R) |  r1 |  r2 |  r3 |  r4 | P(T)
      t1     | 14% |  2% |  5% |  8% | 29%
      t2     |  5% | 15% |  2% |  3% | 25%
      t3     |  7% |  3% | 30% |  5% | 45%
      P(R)   | 26% | 20% | 37% | 17% |
    Second distribution, $I_0(T,R) = 0.766$:
      P(T,R) |  r1 |  r2 |  r3 |  r4 | P(T)
      t1     | 22% |  1% |  0% |  1% | 24%
      t2     |  1% | 23% |  1% |  0% | 26%
      t3     |  1% |  0% | 48% |  1% | 50%
      P(R)   | 24% | 24% | 49% |  2% |
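The normalized mutual information can be computed directly from a joint table. The sketch below uses the two example tables from slide 42. Since the percentages on the slide are rounded, the tables are renormalized first, and the results need not reproduce 0.239 and 0.766 exactly. The function name is chosen for this example, and the function assumes that neither marginal distribution is concentrated on a single value (otherwise the minimum entropy is zero).

```python
import math


def normalized_mutual_information(joint):
    """I(T,R) / min(H(T), H(R)) for a joint table given as rows of weights."""
    total = sum(sum(row) for row in joint)
    p = [[value / total for value in row] for row in joint]
    p_t = [sum(row) for row in p]          # marginal distribution over type sets
    p_r = [sum(col) for col in zip(*p)]    # marginal distribution over property sets

    mutual_information = sum(
        p_tr * math.log2(p_tr / (p_t[i] * p_r[j]))
        for i, row in enumerate(p)
        for j, p_tr in enumerate(row)
        if p_tr > 0
    )

    def entropy(dist):
        return -sum(q * math.log2(q) for q in dist if q > 0)

    return mutual_information / min(entropy(p_t), entropy(p_r))


# Joint tables from the slide, in percent (rows t1..t3, columns r1..r4).
weak = [[14, 2, 5, 8], [5, 15, 2, 3], [7, 3, 30, 5]]
strong = [[22, 1, 0, 1], [1, 23, 1, 0], [1, 0, 48, 1]]

nmi_weak = normalized_mutual_information(weak)
nmi_strong = normalized_mutual_information(strong)
print(round(nmi_weak, 3), round(nmi_strong, 3))
```

The second table concentrates the probability mass on one property set per type set, so knowing the type set says much more about the property set, and the normalized value is correspondingly higher.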
  • 43. Normalized Mutual Information on BTC'12. Tendencies: relatively high redundancy; Freebase: (weakly) pre-defined schema; Timbl: narrow domain (FOAF profiles); DBpedia: de-centralized schema.
      Data set | I0(T,R) | Rank
      Rest     | 0.881   | 1
      Datahub  | 0.747   | 4
      DBpedia  | 0.635   | 5
      Freebase | 0.860   | 2
      Timbl    | 0.850   | 3
  • 44. Finding Alternative Descriptions
  • 45. Searching for a Suitable Description. Declarative descriptions: SELECT ?x WHERE { ?x rdf:type foaf:Document } / SELECT ?x WHERE { ?x rdf:type foaf:Document . ?x rdf:type foaf:PersonalProfileDocument } / SELECT ?x WHERE { ?x rdf:type foaf:Document . ?x rdf:type sioc:Post . }
  • 46. Operations on the Declarative Description. Add, Delete, Refine, Generalize. [Figure: an entity set described by the constraints C1, C2; adding C3, deleting C2, refining C1 to C1sub, generalizing C1 to C1sup]
  • 47. Just (Small) Baby Steps ...
  • 48. Formal Concept Analysis. Constructs concepts and their hierarchy from data objects. Input (formal context): set O (the objects), set A (the attributes), relation I ⊆ O×A (which object has which attributes). Derivation operator for a (sub)set of objects X, giving the attributes common to all objects in X: $X' := \{ y \in A : (x,y) \in I, \forall x \in X \}$. Derivation operator for a (sub)set of attributes Y, giving the objects which have all attributes in Y: $Y' := \{ x \in O : (x,y) \in I, \forall y \in Y \}$.
  • 49. Formal Concept Analysis. (X,Y) is a formal concept, if X'=Y and Y'=X. X is the extent, Y is the intent. Example: ({1,5,10},{a,c}). Partial order for formal concepts: (X1,Y1) ≤ (X2,Y2) if X1 ⊂ X2, which is equivalent to Y1 ⊃ Y2. Example: ({1,10},{a,b,c}) ≤ ({1,5,10},{a,b}). Defines a lattice structure on the concepts. [Table: example formal context with objects 1 to 10 and attributes a, b, c. Figure: lattice over the attribute sets Ø, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}]
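The two derivation operators and the formal concept test translate almost literally into code. The formal context below is a small hypothetical one, not the table from the slide (whose cells did not survive extraction); the function names are likewise chosen for this sketch.

```python
# Hypothetical formal context: object -> set of attributes it has (relation I).
context = {
    1: {"a", "b", "c"},
    2: {"a"},
    3: {"b"},
    4: {"b", "c"},
    5: {"a", "c"},
}
attributes = {"a", "b", "c"}


def common_attributes(objects):
    """X': attributes shared by all objects in X (all attributes if X is empty)."""
    result = set(attributes)
    for x in objects:
        result &= context[x]
    return result


def objects_with(attrs):
    """Y': objects that have every attribute in Y."""
    return {x for x, has in context.items() if attrs <= has}


def is_formal_concept(extent, intent):
    """(X, Y) is a formal concept if X' = Y and Y' = X."""
    return common_attributes(extent) == intent and objects_with(intent) == extent


print(objects_with({"b", "c"}))                 # extent for the intent {b, c}
print(is_formal_concept({1, 4}, {"b", "c"}))
```

Applying both operators in sequence, as in `common_attributes(objects_with(Y))`, closes an arbitrary attribute set to the intent of a formal concept.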
  • 50. Formal Concept Lattice (extent). [Figure: example formal context and its concept lattice, ordered by ≤ on the extents]
  • 51. Formal Concept Lattice (intent). Top-Concept: (O,Ø). Bottom-Concept: (Ø,A). [Figure: formal concept lattices for the example contexts, with concepts labelled by their intent, which provides a better overview]
  • 52. Navigating the Lattice. Remove constraints: extend object set. Add constraints: reduce object set. Nice formalization, but ... [Figure: concept lattices labelled by intent, as on slide 51]
  • 53. … still Baby Steps
  • 54. Parallel Lattices. Availability of several attribute sets: facet dimensions; "natural" subdivision; different descriptions of the same data.
  • 55. Parallel Lattices
  • 56. General Idea for Mapping. [Figure: a description (constraints C1 to C5) derives an entity set; an alternative description derives an approximate entity set and thereby serves as an approximate description]
  • 57. Implementing Mappings. Minimal Extension: top-down. Maximal Reduction: bottom-up. Example: {b,c}' = {1,4,6,9,10}; alternative description for {b,c}? Precision? Recall? [Figure: second lattice with the object sets 1; 1,9; 10; 2,8; 3,5,7]
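Precision and recall of an approximated entity set can be computed with plain set operations. In the sketch below the target set is the extent {b,c}' from the slide, while the two candidate approximations are invented for illustration and do not correspond to the actual output of the minimal extension or maximal reduction strategies.

```python
def precision_recall(target, approximation):
    """Precision and recall of an approximated entity set against the target set."""
    overlap = len(target & approximation)
    precision = overlap / len(approximation) if approximation else 0.0
    recall = overlap / len(target) if target else 0.0
    return precision, recall


target = {1, 4, 6, 9, 10}            # {b,c}' from the slide

too_large = {1, 4, 5, 6, 9, 10}      # hypothetical approximation with an extra object
too_small = {1, 9, 10}               # hypothetical approximation missing objects

print(precision_recall(target, too_large))
print(precision_recall(target, too_small))
```

An approximation that covers the whole target but adds objects loses precision, while one that stays inside the target loses recall.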
  • 58. Observations. On LOD: mapping type sets onto property sets; evaluation on 20 data sets (subset of BTC'12). Quality of approximations: max-red: high recall (mainly > 0.8), better for smaller concepts; min-ext: good precision (mainly > 0.5), better for larger concepts. Example terms: the types rss:Item and sioc:MicroblogPost; the properties foaf:maker, sioc:has_discussion and dcterms:date.
  • 59. Evolution of Linked Data
  • 60. Evolution of LOD. [Figure: LOD cloud diagrams for 2007, 2008, 2009, 2010 and 2011]
  • 61. Evolution of LOD. Triples provided by data sources: insertion, deletion, modification. [Figure: volume over time]
  • 62. Effects on Indices: Decline in accuracy. [Figure: micro-averaged F1 (0.0 to 1.0) over the week of the data snapshot (0 to 80) for the indices RDF Type, TS, PS, IPS, ECS and SchemEX]
  • 63. Updates of Indices and Caches. Which sources to prioritise in an update?
  • 64. Change Metrics. Comparison of two RDF data sets (e.g. from different points in time). $X_i$: set of triple statements. Numeric expression for "distance": $\Delta : (X_1, X_2) \mapsto [0, \infty)$. Example: $\Delta_{Jaccard}(X_1, X_2) = 1 - \frac{|X_1 \cap X_2|}{|X_1 \cup X_2|}$. Suitable to measure dynamics???
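The Jaccard change metric is straightforward to compute when each data set is held as a set of triples. The two tiny data sets below are illustrative only (the membership of Thomas is an assumption); returning 0 for two empty sets is a convention chosen here, since the formula itself is undefined in that case.

```python
def jaccard_distance(x1, x2):
    """Jaccard distance 1 - |X1 ∩ X2| / |X1 ∪ X2| between two sets of triples."""
    union = x1 | x2
    if not union:
        return 0.0  # convention for two empty data sets
    return 1.0 - len(x1 & x2) / len(union)


x1 = {
    ("InstituteWEST", "hasMember", "Gerd"),
    ("InstituteWEST", "hasMember", "Thomas"),
}
x2 = {
    ("InstitutePaluno", "hasMember", "Gerd"),
    ("InstituteWEST", "hasMember", "Thomas"),
}

print(jaccard_distance(x1, x2))  # one shared triple out of three distinct ones
```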
  • 65. Toy example: Changes Analysis of LOD. 1st snapshot: Institute ZBW and Institute WeST with the persons Thomas, Gerd, Ansgar and Renata.
  • 66. Toy example: Changes Analysis of LOD. 2nd snapshot: Institute ZBW, Institute WeST and, newly, Institute Paluno, with the persons Thomas, Gerd, Ansgar and Renata.
  • 67. Toy example: Changes Analysis of LOD. Changes detected between 1st and 2nd snapshot: 1. Deleted: <InstituteWEST hasMember Gerd> 2. New: <InstitutePaluno hasMember Gerd>
  • 68. Toy example: Changes Analysis of LOD. 3rd snapshot: Institute ZBW and Institute WeST with the persons Thomas, Gerd, Ansgar and Renata.
  • 69. Toy example: Changes Analysis of LOD. Changes detected between 2nd and 3rd snapshot: 1. New: <InstituteWEST hasMember Gerd> 2. Deleted: <InstitutePaluno hasMember Gerd>
  • 70. Toy example: Changes Analysis of LOD. Changes detected between 1st and 3rd snapshot: None!?! Change metrics capture differences. We want to measure dynamics!
  • 71. Thomas Gottron IRSS, Athens, 18.7.2014, 71Leveraging the Web of Data Measuring Dynamics: Requirements §  Dynamics function Θ w  quantify the evolution of a dataset X over a period of time Θti tj (X) = Θ(Xtj )−Θ(Xti ) ≥ 0 Θ Dynamics as amount of evolution Timeti tj X
  • 72. Constructing a Dynamics Function
      • Function Θ difficult to define directly
      • Indirect definition over a change rate function c(X_t): $\Theta(X_{t_j}) - \Theta(X_{t_i}) = \int_{t_i}^{t_j} c(X_t)\, dt$
      • [Figure: Θ and c plotted over time between t_i and t_j for dataset X]
  • 73. Change Rate Function
      • Also c(X_t) is not explicitly known!
      • But it can be approximated!
          • Given snapshots of the data in small time intervals: $X_{t_1}, X_{t_2}, X_{t_3}, \ldots, X_{t_n}$
          • The change rate can be approximated via change metrics: $\frac{\Delta(X_{t_i}, X_{t_{i-1}})}{t_i - t_{i-1}} \xrightarrow{\; t_{i-1} \to t_i \;} c(X_{t_i}) = \frac{d}{dt}\Theta(X_{t_i})$
  • 74. Dynamics Framework
      • Approximating c(X_t) as step function: $\Theta_{t_1}^{t_n}(X) = \sum_{i=2}^{n} \Delta(X_{t_i}, X_{t_{i-1}})$
      • Choice of Δ: Flexible use of different notions of change!
      • [Figure: Θ and the step-function approximation of c plotted over time between t_i and t_j for dataset X]
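The step-function approximation on this slide reduces to summing a change metric over consecutive snapshots. Below is a minimal sketch under stated assumptions: snapshots are sets of triples, and Δ is taken to be the size of the symmetric difference, which is only one of the many notions of change the framework allows. The function names and the unchanged triple about Thomas are hypothetical.

```python
def delta(old, new):
    """Assumed change metric: number of triples added or deleted."""
    return len(old ^ new)

def dynamics(snapshots, change=delta):
    """Sum the change metric over all pairs of consecutive snapshots."""
    return sum(change(prev, curr) for prev, curr in zip(snapshots, snapshots[1:]))

# Toy snapshots; the triple about Thomas is an illustrative assumption.
unchanged = ("InstituteWEST", "hasMember", "Thomas")
x1 = {unchanged, ("InstituteWEST", "hasMember", "Gerd")}
x2 = {unchanged, ("InstitutePaluno", "hasMember", "Gerd")}
x3 = {unchanged, ("InstituteWEST", "hasMember", "Gerd")}

theta = dynamics([x1, x2, x3])
direct = delta(x1, x3)
print(theta, direct)
```

With this choice of Δ, the accumulated dynamics over the three toy snapshots is positive although the first and third snapshot are identical. Passing a different callable as `change` swaps in another notion of change without touching the summation.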
  • 75. Introduction of Decay
      • So far:
          • Impact of evolution independent of moment in time
          • Desirable: Focus on certain periods of time, e.g. recent past
      • Solution:
          • Decay function f to assign weights to moments in time
      • [Figure: c, f and f·c plotted over time between t_i and t_j]
  • 76. Implementing a Decay Function
      • Exponential decay function: $f(t) = e^{-\lambda t}$
      • Incorporated in the framework: $\Theta(X_{t_j}) - \Theta(X_{t_i}) = \int_{t_i}^{t_j} e^{-\lambda (t_j - t)} \cdot c(X_t)\, dt$
      • When using the step function approximation of c(X_t): $\Theta_{t_1}^{t_n}(X) = \sum_{i=2}^{n} e^{-\lambda (t_n - t_i)} \cdot \Delta(X_{t_i}, X_{t_{i-1}})$
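The decayed sum can be sketched in a few lines. This is an illustration rather than the authors' implementation: it assumes the change values Δ between consecutive snapshots have already been computed, that time stamps are plain numbers in an arbitrary unit, and the function name `decayed_dynamics` is hypothetical.

```python
import math

def decayed_dynamics(times, deltas, lam):
    """Weight each change by exp(-lam * (t_n - t_i)) and sum.

    times  -- snapshot time stamps t_1 .. t_n (ascending)
    deltas -- change values for i = 2 .. n, so len(deltas) == len(times) - 1
    lam    -- decay rate; lam = 0 gives the undecayed dynamics
    """
    if len(deltas) != len(times) - 1:
        raise ValueError("need exactly one change value per consecutive pair")
    t_n = times[-1]
    return sum(math.exp(-lam * (t_n - t_i)) * d
               for t_i, d in zip(times[1:], deltas))

times = [0.0, 1.0, 2.0]   # assumed time stamps of three snapshots
deltas = [2.0, 2.0]       # assumed change between snapshots 1-2 and 2-3

plain = decayed_dynamics(times, deltas, lam=0.0)
halved = decayed_dynamics(times, deltas, lam=math.log(2))
print(plain, halved)
```

With a decay rate of ln 2 per time unit, a change that lies one time unit before the last snapshot counts half as much as a change at the last snapshot, so recent activity dominates the score.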
  • 77. Change Rate Function of Selected Data Sources
      • [Figure: one panel per data source showing the change rate (axis from 0 to 1) over snapshot dates between 2012-05-06 and 2013-11-17]
      • dbpedia.org: Θ = 55.71, Θdecay = 23.42
      • identi.ca: Θ = 58.45, Θdecay = 18.48
      • linkedct.org: Θ = 51.75, Θdecay = 25.03
      • dbtune.org: Θ = 20.90, Θdecay = 8.33
  • 78. [Overview figure: Managing (search data structure, efficient storage and retrieval); Analysing (example RDF graph; micro-averaged F1 per week of data snapshot for RDF Type, TS, PS, IPS, ECS and SchemEX); Making Use]
  • 79. Schema-level Search on LOD
  • 80. Schema-based Access to the LOD cloud
      • Query: SELECT ?x WHERE { ?x rdf:type foaf:Document . ?x rdf:type swrc:InProceedings . ?x dc:creator ?y . ?y rdf:type fb:Computer_Scientist }
      • Schema-level index answers "Where?": ACM, DBLP
  • 81. LODatio: Schema-level Search of LOD
  • 82. LODatio: Query transformation
  • 83. LODatio: Query transformation
      • [Figure: query graph (x of type foaf:Document and swrc:InProceedings, dc:creator, fb:Computer_Scientist) next to the schema-level index with nodes tc2309, tc2101, eqc707, ps2608 and data source DBLP]
      • Original query: SELECT ?x WHERE { ?x rdf:type foaf:Document . ?x rdf:type swrc:InProceedings . ?x dc:creator ?y . ?y rdf:type fb:Computer_Scientist }
      • Transformed query: SELECT ?c WHERE { ?eqc schemex:hasDataset ?c . ?tc_A schemex:hasSubset ?eqc . ?tc_A schemex:hasClass foaf:Document . ?tc_A schemex:hasClass swrc:InProceedings . ?bs void:subjectsTarget ?eqc . ?bs void:objectsTarget ?tc_B . ?bs void:property dc:creator . ?tc_B schemex:hasClass fb:Computer_Scientist }
  • 84. LODatio: Retrieval Results
  • 85. LODatio: Retrieval Results [Figure labels: C1, EQCl, DS23; entity count; example entities URI 1, URI 2, URI 3]
  • 86. LODatio: User Support
  • 87. LODatio: User Support
      • Currently implemented:
          • Moderate reductions / extensions
      • Next release:
          • Include alternative description based on parallel lattices
      • [Figure: schema-level index (DBLP, tc2309, tc2101, eqc707, ps2608, foaf:Document, swrc:InProceedings, fb:Computer_Scientist, dc:creator, DS23) annotated with "further properties" and "further types"]
  • 88. LODatio: next steps
      • Keyword search
      • Better recommendations
      • Other payload entities
      • Visual exploration
      • Related data sources
      • Coverage
  • 89. Focused Exploration (work in progress)
  • 90. Use Case: Social Media Coverage of Events
  • 91. [Figure labels: LinkedGeoData, OSM, owl:sameAs, ???] Other locations?
  • 92. Extending LinkedGeoData: Seed, Exploration, Overlay
  • 93. Task of Focused Exploration (use case: locations)
      • Prioritise/select object URIs for exploration
      • [Figure: subject s (umbel:Village; wgs84:long -1.404, wgs84:lat 50.897) with predicates dbponto:isPartOf, dbponto:wikiPageExternalLink, dbponto:governmentType, dbpprop:settlementType, dbpprop:subdivisionName, dbpprop:postalCode and dcterms:subject leading to objects o1 to o11; one object carries wgs84:long xxx and wgs84:lat yyy]
  • 94. Exploration based on Schema Semantics
      • Exploit rdfs:range definitions of predicates
      • Follow edges which lead to locations with higher priority
      • [Figure: dbponto:twinCity rdfs:range dbpedia:City; dbpedia:City rdfs:subClassOf dbpedia:Place]
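The schema-based heuristic can be illustrated with a small lookup. This is a minimal sketch, assuming the schema is available as two in-memory dictionaries and that every class has at most one superclass; the dictionary names, function names and the choice of dbpedia:Place as the location class are illustrative assumptions built around the single example on the slide.

```python
# Schema facts from the slide's example, held in plain dictionaries.
RANGE = {"dbponto:twinCity": "dbpedia:City"}        # predicate -> rdfs:range
SUBCLASS_OF = {"dbpedia:City": "dbpedia:Place"}     # class -> rdfs:subClassOf

def is_subclass_of(cls, target, subclass_of):
    """Walk the (single-parent) subclass chain from cls up to target."""
    seen = set()
    while cls is not None and cls not in seen:
        if cls == target:
            return True
        seen.add(cls)                 # guards against cycles in the schema
        cls = subclass_of.get(cls)
    return False

def leads_to_location(predicate, ranges, subclass_of,
                      location_class="dbpedia:Place"):
    """True if the predicate's declared range is a kind of location."""
    rng = ranges.get(predicate)
    return rng is not None and is_subclass_of(rng, location_class, subclass_of)

def prioritise(edges, ranges, subclass_of):
    """Order (predicate, object) edges so that likely locations come first."""
    return sorted(edges,
                  key=lambda e: not leads_to_location(e[0], ranges, subclass_of))

edges = [("dcterms:subject", "o1"), ("dbponto:twinCity", "o2")]
ordered = prioritise(edges, RANGE, SUBCLASS_OF)
print(ordered)
```

Predicates without a declared range are simply not prioritised in this sketch, so the approach can only recognise locations that the schema describes explicitly.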
  • 95. Supervised Machine Learning
      • Use incoming predicates as features
          • Learn predicates typically leading to locations
      • Train a classifier (e.g. Naive Bayes)
      • [Figure: objects o (with wgs84:long xxx and wgs84:lat yyy) and o' reached via incoming predicates p2, p3, p4, p6]
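To make the idea concrete, here is a small self-contained sketch of a Naive Bayes classifier over incoming predicates. It assumes a Bernoulli model with Laplace smoothing, binary presence features and a training set that contains both classes; the slides do not specify the variant used, and the training examples and function names below are invented for illustration.

```python
import math
from collections import Counter

def train_nb(examples):
    """examples: list of (set of incoming predicates, is_location flag)."""
    vocab = set().union(*(preds for preds, _ in examples))
    class_counts = Counter(label for _, label in examples)
    pred_counts = {True: Counter(), False: Counter()}
    for preds, label in examples:
        pred_counts[label].update(preds)
    return vocab, class_counts, pred_counts

def log_posterior(preds, label, model):
    """Unnormalised log probability of the label given the predicates."""
    vocab, class_counts, pred_counts = model
    score = math.log(class_counts[label] / sum(class_counts.values()))
    for p in vocab:
        # Laplace-smoothed probability that predicate p points to this class.
        prob = (pred_counts[label][p] + 1) / (class_counts[label] + 2)
        score += math.log(prob if p in preds else 1.0 - prob)
    return score

def is_location(preds, model):
    return log_posterior(preds, True, model) > log_posterior(preds, False, model)

# Invented training data: incoming predicates of objects with known labels.
training = [
    ({"dbponto:isPartOf", "dbponto:twinCity"}, True),
    ({"dbponto:isPartOf"}, True),
    ({"dcterms:subject"}, False),
    ({"dbponto:wikiPageExternalLink", "dcterms:subject"}, False),
]
model = train_nb(training)

print(is_location({"dbponto:twinCity"}, model))
print(is_location({"dcterms:subject"}, model))
```

Objects whose incoming predicates resemble those of known locations are classified as locations and can be explored first; predicates unseen during training are ignored by this sketch.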
  • 96. IR Inspired Approaches
      • Model discriminativeness of predicates
          • Inspired by tf-idf
      • Property relevance frequency (prf): $\mathrm{prf} = c(p, L)$
          • Normalised version (prr)
      • Inverse property frequency: $\mathrm{ipf} = \log\left(\frac{c(*,*)}{c(p,*)}\right)$
      • Rank by combined measure: prf-ipf
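A short sketch shows how the two counts might be combined. It rests on assumptions the slide does not spell out: c(p, L) is read as the number of triples with predicate p whose object is a known location, c(p, *) as the number of all triples with predicate p, c(*, *) as the total number of triples, and the combined score as the product prf · ipf by analogy with tf-idf. The function name and the sample triples are hypothetical.

```python
import math
from collections import Counter

def prf_ipf(triples, locations):
    """Score every predicate by prf * ipf.

    triples   -- iterable of (subject, predicate, object)
    locations -- set of object URIs known to be locations
    """
    triples = list(triples)
    total = len(triples)                                      # c(*, *)
    c_p = Counter(p for _, p, _ in triples)                   # c(p, *)
    c_p_loc = Counter(p for _, p, o in triples if o in locations)  # c(p, L)
    return {p: c_p_loc[p] * math.log(total / c_p[p]) for p in c_p}

# Invented sample data: o1 and o2 are assumed to be known locations.
triples = [
    ("s1", "dbponto:isPartOf", "o1"),
    ("s2", "dbponto:isPartOf", "o2"),
    ("s1", "dcterms:subject", "o3"),
    ("s2", "dcterms:subject", "o4"),
    ("s3", "dcterms:subject", "o5"),
    ("s3", "dcterms:subject", "o6"),
]
scores = prf_ipf(triples, locations={"o1", "o2"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

A predicate scores high when it often leads to locations and is rare overall, which mirrors how tf-idf favours terms that are frequent in a document but rare in the collection.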
  • 97. Performance [Figure: ROC curves for random, Schema Semantics, NB (all predicates), NB (present predicates), prf-ipf and prr-ipf, both axes from 0 to 1, with a zoomed inset covering 0 to 0.05 and 0.95 to 1]
  • 98. Performance
      Average performance of approaches († indicates significant improvement […] confidence level ρ = 0.01)
      Method                     Recall    Precision  F1        Accuracy  AUC
      Schema Semantics           0.1188    0.8119     0.2073    0.7262    0.5552
      NB (all predicates)        0.9906    0.9491     †0.9694   †0.9812   0.9970
      NB (observed predicates)   0.9943    0.9436     0.9683    0.9804    0.9968
      prf-ipf                    0.8512    †0.9754    0.9091    0.9487    0.9958
      prr-ipf                    †0.9973   0.9240     0.9592    0.9745    0.9769
      […]mance in bold. Furthermore, we marked the results where we had a significant improvement over the second best method at confidence level of ρ = 0.01. The agg[…] basically confirm the observations made above. In general, when conside[…]es F1, Accuracy and AUC, the Naive Bayes classifier making use of al[…]
  • 99. [Overview figure: Analysing (example RDF graph; micro-averaged F1 per week of data snapshot for RDF Type, TS, PS, IPS, ECS and SchemEX); Making Use; Managing (search data structure, efficient storage and retrieval)]
  • 100. Summary. Pros / Cons: rich knowledge base, diverse, public, huge, on the Web, diverse, distributed. Technical solutions to some of the problems. [Figure: search data structure]
  • 101. Summary. Pros / Cons: rich knowledge base, diverse, public, huge, on the Web, diverse, distributed. [Figure: search data structure]
  • 102. Thank you! Contact: Thomas Gottron, WeST – Institute for Web Science and Technologies, Universität Koblenz-Landau, gottron@uni-koblenz.de
  • 103. References
      1. M. Konrath, T. Gottron, and A. Scherp, “Schemex – web-scale indexed schema extraction of linked open data,” in Semantic Web Challenge, Submission to the Billion Triple Track, 2011.
      2. M. Konrath, T. Gottron, S. Staab, and A. Scherp, “Schemex – efficient construction of a data catalogue by stream-based indexing of linked data,” Journal of Web Semantics, 2012.
      3. T. Gottron, M. Knauf, S. Scheglmann, and A. Scherp, “Explicit and implicit schema information on the linked open data cloud: Joined forces or antagonists?,” Tech. Rep. 06/2012, Institut WeST, Universität Koblenz-Landau, 2012.
      4. T. Gottron and R. Pickhardt, “A detailed analysis of the quality of stream-based schema construction on linked open data,” in CSWS’12: Proceedings of the Chinese Semantic Web Symposium, 2012.
      5. T. Gottron, A. Scherp, B. Krayer, and A. Peters, “Get the google feeling: Supporting users in finding relevant sources of linked open data at web-scale,” in Semantic Web Challenge, Submission to the Billion Triple Track, 2012.
      6. T. Gottron, A. Scherp, B. Krayer, and A. Peters, “LODatio: Using a Schema-Based Index to Support Users in Finding Relevant Sources of Linked Data,” in K-CAP’13: Proceedings of the Conference on Knowledge Capture, 2013.
      7. T. Gottron, M. Knauf, S. Scheglmann, and A. Scherp, “A Systematic Investigation of Explicit and Implicit Schema Information on the Linked Open Data Cloud,” in ESWC’13: Proceedings of the 10th Extended Semantic Web Conference, 2013.
      8. J. Schaible, T. Gottron, S. Scheglmann, and A. Scherp, “LOVER: Support for Modeling Data Using Linked Open Vocabularies,” in LWDM’13: 3rd International Workshop on Linked Web Data Management, 2013.
      9. R. Dividino, A. Scherp, G. Gröner, and T. Gottron, “Change-a-LOD: Does the Schema on the Linked Data Cloud Change or Not?,” in COLD’13: International Workshop on Consuming Linked Data, 2013.
  • 104. References
      10. T. Gottron, M. Knauf, and A. Scherp, “Analysis of schema structures in the linked open data graph based on unique subject uris, pay-level domains, and vocabulary usage,” Distributed and Parallel Databases, pp. 1–39, 2014.
      11. T. Gottron and C. Gottron, “Perplexity of index models over evolving linked data,” in ESWC’14: Proceedings of the Extended Semantic Web Conference, 2014.
      12. T. Gottron, A. Scherp, and S. Scheglmann, “Providing alternative declarative descriptions for entity sets using parallel concept lattices,” in ESWC’14: Proceedings of the Extended Semantic Web Conference, 2014.
      13. Carothers, G.: RDF 1.1 N-Quads. W3C Recommendation (Feb 2014), http://www.w3.org/TR/2014/REC-n-quads-20140225/, (accessed 14 March 2014)
      14. Käfer, T., Abdelrahman, A., Umbrich, J., O’Byrne, P., Hogan, A.: Observing linked data dynamics. In: The Semantic Web: Semantics and Big Data, Lecture Notes in Computer Science, vol. 7882, pp. 213–227. Springer Berlin Heidelberg (2013)
      15. T. Gottron, “Of Sampling and Smoothing: Approximating Distributions over Linked Open Data,” in PROFILES’14: Proceedings of the Workshop on Dataset Profiling and Federated Search for Linked Data, 2014.
      16. R. Dividino, T. Gottron, A. Scherp, and G. Gröner, “From Changes to Dynamics: Dynamics Analysis of Linked Open Data Sources,” in PROFILES’14: Proceedings of the Workshop on Dataset Profiling and Federated Search for Linked Data, 2014.
      17. R. Dividino, A. Kramer, and T. Gottron, “An Investigation of HTTP Header Information for Detecting Changes of Linked Open Data Sources,” in ESWC’14: Proceedings of the Extended Semantic Web Conference, 2014.
  • 105. Sources
      • Photograph of three of Nevins Memorial Library's earliest librarians. Wikimedia Commons collection, http://commons.wikimedia.org/wiki/File:Nevins_Library_First_Librarians.jpg
      • Wide-angle view of the ALMA correlator. This Wikipedia and Wikimedia Commons image is from the European Southern Observatory (ESO) and is freely available at http://commons.wikimedia.org/wiki/File:Wide-angle_view_of_the_ALMA_correlator.jpg under Creative Commons Attribution 3.0 Unported license.
      • Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/. This work is available under a CC-BY-SA license.
