SlideShare a Scribd company logo
1 of 63
Validating RDF data:
Challenges and perspectives
Jose Emilio Labr Gayo
WESO Research Group (WEb Semantics Oviedo)
University of Oviedo, Spain
About me…
Founded WESO (Web Semantics Oviedo) research group
Practical applications of semantic technologies since 2004
Several domains: e-Government, e-Health
Some books:
"Web semántica" (in Spanish), 2012
"Validating RDF data", 2017
…and software:
SHaclEX (Scala library, implements ShEx & SHACL)
RDFShape (RDF playground)
HTML version: http://book.validatingrdf.com
Examples: https://github.com/labra/validatingRDFBookExamples
Contents
Short intro to semantic Web & knowledge graphs
Short overview of RDF
Understanding the RDF validation problem
2 technologies: ShEx/SHACL
Challenges & perspectives
Semantic Web
Vision of the Web as web of data
Not only pages, but also data: linked data
Data understandable by machines
Related with:
Big Data
Artificial intelligence
Knowledge representation
Example: Wikipedia vs Wikidata
Tim Berners-Lee
Source: Wikipedia
Source: http://www.internetlivestats.com/total-number-of-websites/
Date: 05/04/2019 (19:05h)
Who consumes Web information?
People
We access through some device
...and Machines
They show us web pages (browsers)
...but also manage information (bots)
Filtering content
Doing suggestions
...
"If Google doesn't understand your Web page, it almost doesn't exist"
Web information
Documentos
Information
retrieval
DocumentosDocumentos
Posts
Tweets
Web pages
...
Publish
Web
documents
Reality
Mental model
Web information must be machine-processable
Semantic web technologies focus on how to achieve this goal
Sensors Generate
User
Content
generation &
filtering
People vs Machines
Programmed for some tasks
Predictable (no mistakes*)
No problem with repetitive tasks
Problems to understand the context
Creativity, imagination
Unpredictable (we do mistakes)
We get tired with repetitive tasks
Understanding based on context
*when well programmed
Information understandable by machines?
Example: "Oviedo has a temperature of 36 degrees"
Problem: Ambiguity and context identification
Oviedo ? It can be... A city in Spain
...or a city in Florida, USA
...or a soccer player http://www.wikidata.org/entity/Q325997
http://wikidata.org/entity/Q14317
http://www.wikidata.org/entity/Q1813449
...has a temperatrure of... https://www.wikidata.org/wiki/Property:P2076
Identifiers (URIs) in Wikidata
Machine representation
http://www.wikidata.org/entity/Q14317
https://www.wikidata.org/prop/direct/P2076
36^^<http://www.w3.org/2001/XMLSchema#integer>
Knowledge representation for machines
Simple statement (triple 3 elements): subject, predicate, object
http://www.wikidata.org/entity/Q14317
https://www.wikidata.org/prop/direct/P2076
subject predicate object
36^^<http://www.w3.org/2001/XMLSchema#integer>
wd:Q14317
wdt:P2076
36^^xsd:integer
wd: <http://www.wikidata.org/entity/>
wdt: <https://www.wikidata.org/prop/direct/>
xsd: <http://www.w3.org/2001/XMLSchema#>
Prefix declarations help simplify long URIs
Knowledge graphs
wd:Q14317 36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
Wikidata statistics (17-07-2019)
57,462,853 entities
728,979,397 triples
Source: https://www.wikidata.org/wiki/Wikidata:Statistics
Wikidata is an example of a knowledge graph
There are a lot more:
Open: DBpedia, BabelNet, etc
Proprietary: Google, facebook, Microsoft, etc. also have knowledge graphs
Web information & knowledge graphs
Documentos
Information
retrieval
Knowledge
graphs
DocumentosDocumentos
Posts
Tweets
Web pages
...
Publish
Web
documents
Reality
Mental model
Knowledge graphs are a key part of the Web nowadays
Sensors Generate
User
Content
generation &
filtering
RDF
RDF (Resource Description Framework)
Allows to represent knowledge graphs
RDF data model = graph as a set of statements
Predicates = URIs
Several syntaxes
Turtle, N-Triples, JSON-LD,...
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture
BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
wd:Q14317
prefix wd: <http://www.wikidata.org/entity/>
prefix wdt: <https://www.wikidata.org/prop/direct/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
wd:Q14317 wdt:P1082 220020 ;
wdt:P2076 36 ;
wdt:P1376 wd:Q3934 .
wd:Q10514 wdt:P19 wd:Q14317 ;
wdt:P569 "1981-07-29"^^xsd:date ;
wdt:P18 <picture.jpg> .
RDF syntaxes: Turtle
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture
BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
wd:Q14317
<http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P1082> "220020"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P1376> <http://www.wikidata.org/entity/Q3934> .
<http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P2076> "36"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P18> <picture.jpg> .
<http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P19> <http://www.wikidata.org/entity/Q14317> .
<http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P569> "1981-07-29"^^<http://www.w3.org/2001/XMLSchema#date> .
RDF syntaxes: N-Triples
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture
BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
wd:Q14317
{
"@context" : "http://...wikidata.json",
"@graph" : [ {
"@id" : "wd:Q10514",
"wdt:P18" : "picture.jpg",
"wdt:P19" : "wd:Q14317",
"P569" : "1981-07-29"
}, {
"@id" : "wd:Q14317",
"wdt:P1082" : 220020,
"wdt:P1376" : "wd:Q3934",
"wdt:P2076" : 36
}
]
}
RDF syntaxes: JSON-LD
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture
BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
wd:Q14317
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:wdt="https://www.wikidata.org/prop/direct/"
xmlns:wd="http://www.wikidata.org/entity/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
<rdf:Description rdf:about="http://www.wikidata.org/entity/Q10514">
<wdt:P18>picture.jpg</wdt:P18>
<wdt:P19>
<rdf:Description rdf:about="http://www.wikidata.org/entity/Q14317">
<wdt:P1082 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">220020</wdt:P1082>
<wdt:P1376 rdf:resource="http://www.wikidata.org/entity/Q3934"/>
<wdt:P2076 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">36</wdt:P2076>
</rdf:Description>
</wdt:P19>
<wdt:P569 rdf:datatype=http://www.w3.org/2001/XMLSchema#date>1981-07-29</wdt:P569>
</rdf:Description>
</rdf:RDF>
RDF syntaxes: RDF/XML
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture
BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
wd:Q14317
RDF graphs are composable
RDF helps information integration
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
wd:Q14317
:alice
:bob
:carol
schema:birthPlace
schema:birthPlace
schema:knows
schema:knows
Alice Cooper
Schema:name Robert Smith
schema:nameschema:knows
Carol King
schema:name
RDF data model
Graph = set of statements (triples)
Predicates (arcs) = URIs
Subjects: URIs*
Objects: URIs or literals* 36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
wd:Q14317
*Exception: blank nodes (next slide)
RDF data model: Blank nodes
A statement about something
Example: "Alice knows someone who knows Bob"
:alice
schema:knows
:bob
schema:knows
:alice schema:knows _:x ;
_:x schema:knows :bob .
Notation (in Turtle)
:alice schema:knows [
schema:knows :bob
] .
 x ( :alice :knows x 
x :knows :bob )
RDF data model: Literals
:bob schema:name "Robert"^^xsd:string ;
schema:age "18"^^xsd:integer ;
schema:birthDate "2010-04-12"^^xsd:date .
:bob schema:name "Robert" ;
schema:age 18 ;
schema:birthDate "2010-04-12"^^xsd:date .
Objects can also be literals
Literals contain a lexical form and a datatype
Common datatypes: XML Schema primitive datatypes
If not specified, a literal has type xsd:string
:bob
schema:name
"2010-04-12"^^xsd:date18"Robert Smith"
schema:age
schema:birthDate
RDF data model: Language tagged strings
String literals can be qualified by a language tag
They have datatype rdfs:langString
:spain rdfs:label "Spain"@en ;
rdfs:label "España"@es .
:Spain
"España"@es"Spain"@en
rdfs:label
rdfs:label
...and that's all
Yes, the RDF Data model is simple
RDF ecosystem
SPARQL
Vocabularies
Ontologies and inference
SELECT ?name ?birthDate ?picture WHERE {
?person wdt:P19 wd:Q14317 ;
rdfs:label ?name ;
wdt:P569 ?birthDate ;
wdt:P18 ?picture .
FILTER (Lang(?name) = 'es')
}
SPARQL
SPARQL = Simple Protocol and RDF Query Language
Similar to SQL, but for RDF
Example: People born in Oviedo
wdt:P19
"1981-07-29"^^xsd:date
wdt:P18
wdt:P569
Oviedo
wd:Q14317
"Fernando Alonso"@en
rdfs:label
?person
?birthPlace?name ?picture
wdt:P19
wdt:P18wdt:P569
wd:Q14317
rdfs:label
wd:Q10514
wdt:P19
...
Shared entities and vocabularies
URIs as global identifiers allow to reuse them
Lots of vocabularies: domain specific or general purpose
Examples:
schema.org: properties that major browsers can recognize
Linked Open Vocabularies
Inference and ontologies
Some vocabularies define properties that can infer new information
RDF Schema: Classes, subclasses, etc.
OWL: Ontologies using description logics
Trade-off between expressiveness and complexity
rdf:type
:Student
:alice
:Person
rdfs:subClassOf
:HumanBeing
owl:sameAs
rdf:type
rdf:type
RDF, the good parts...
RDF as an integration language
RDF as a lingua franca for semantic web and linked data
RDF flexibility & integration
Data can be adapted to multiple environments
Open and reusable data by default
RDF for knowledge representation
RDF data stores & SPARQL
RDF, the other parts
Consuming & producing RDF
Multiple syntaxes: Turtle, RDF/XML, JSON-LD, ...
Embedding RDF in HTML
Describing and validating RDF content
Producer Consumer
Why describe & validate RDF?
For producers
Developers can understand the contents they are going to produce
Ensure they produce the expected structure
Advertise and document the structure
Generate interfaces
For consumers
Understand the contents
Verify the structure before processing it
Query generation & optimization
Producer Consumer
Shapes
Similar technologies
Technology Schema
Relational Databases DDL
XML DTD, XML Schema, RelaxNG, Schematron
Json Json Schema
RDF ?
Fill that gap
Understanding the problem
Identifying the shape of graphs...
Shapes can describe the form of a node (node constraint)
...and the number of possible arcs incoming/outgoing from a node
...and the possible values associated with those arcs
:alice schema:name "Alice";
schema:knows :bob, :carol .
IRI schema:name string 1
schema:knows IRI 0, 1,...
RDF Node
Shape
RDF Node that
represents a User
<UserShape> IRI {
schema:name xsd:string ;
schema:knows IRI *
}
ShEx
Understanding the problem
Repeated properties
The same property can be used for different purposes in the same data
Example: A product must have 2 codes with different structure
:product schema:productID "isbn:123-456-789";
schema:productID "code456" .
A practical example from FHIR
See: http://hl7-fhir.github.io/observation-example-bloodpressure.ttl.html
Understanding the problem
Shapes ≠ types
Nodes in RDF graphs can have 0, 1 or many rdf:type declarations
A type can be used in multiple contexts, e.g. foaf:Person
Nodes are not necessarily annotated with discriminating types
Nodes with type :Person can represent friends, students, patients,...
Different meanings and different structure depending on context
Specific validation constraints for different contexts
Understanding the problem
RDF validation ≠ ontology definition ≠ instance data
Ontologies are usually focused on domain entities
RDF validation is focused on RDF graph features (lower level)
Ontology
Constraints
RDF Validation
Instance data
Different levels
:alice schema:name "Alice";
schema:knows :bob .
<User> IRI {
schema:name xsd:string ;
schema:knows IRI
}
schema:knows a owl:ObjectProperty ;
rdfs:domain schema:Person ;
rdfs:range schema:Person .
A user must have only two properties:
schema:name of value xsd:string
schema:knows with an IRI value
Why not using SPARQL to validate?
Pros:
Expressive
Ubiquitous
Cons
Expressive
Idiomatic
many ways to encode the same constraint
ASK {{ SELECT ?Person {
?Person schema:name ?o .
} GROUP BY ?Person HAVING (COUNT(*)=1)
}
{ SELECT ?Person {
?Person schema:name ?o .
FILTER ( isLiteral(?o) &&
datatype(?o) = xsd:string )
} GROUP BY ?Person HAVING (COUNT(*)=1)
}
{ SELECT ?Person (COUNT(*) AS ?c1) {
?Person schema:gender ?o .
} GROUP BY ?Person HAVING (COUNT(*)=1)}
{ SELECT ?Person (COUNT(*) AS ?c2) {
?S schema:gender ?o .
FILTER ((?o = schema:Female ||
?o = schema:Male))
} GROUP BY ?Person HAVING (COUNT(*)=1)}
FILTER (?c1 = ?c2)
}
Example: Define SPARQL query to check:
There must be one schema:name which
must be a xsd:string, and one
schema:gender which must be
schema:Male or schema:Female
Previous/other approaches
SPIN, by TopQuadrant http://spinrdf.org/
SPARQL templates, Influenced SHACL
Stardog ICV: http://docs.stardog.com/icv/icv-specification.html
OWL with UNA and CWA
OSLC resource shapes: https://www.w3.org/Submission/shapes/
Vocabulary for RDF validation
Dublin Core Application profiles (K. Coyle, T. Baker)
http://dublincore.org/documents/dc-dsp/
RDF Data Descriptions (Fischer et al)
http://ceur-ws.org/Vol-1330/paper-33.pdf
RDFUnit (D. Kontokostas)
http://aksw.org/Projects/RDFUnit.html
...
ShEx and SHACL
2013 RDF Validation Workshop
Conclusions of the workshop:
There is a need of a higher level, concise language for RDF Validation
ShEx initially proposed (v 1.0)
2014 W3c Data Shapes WG chartered
2017 SHACL accepted as W3C recommendation
2017 ShEx 2.0 released as Community group draft
2018 ShEx adopted by Wikidata
Short intro to ShEx
ShEx (Shape Expressions Language)
Concise and human-readable language for RDF validation & description
Syntax similar to SPARQL, Turtle
Semantics inspired by regular expressions & RelaxNG
2 syntaxes: Compact and RDF/JSON-LD
Official info: http://shex.io
Semantics: http://shex.io/shex-semantics/, primer: http://shex.io/shex-primer
ShEx implementations and playgrounds
Libraries:
shex.js: Javascript
SHaclEX: Scala (Jena/RDF4j)
PyShEx: Python
shex-java: Java
Ruby-ShEx: Ruby
Online demos & playgrounds
ShEx-simple
RDFShape
ShEx-Java
ShExValidata
Simple example
Nodes conforming to <User> shape must:
• Be IRIs
• Have exactly one schema:name with a value of type xsd:string
• Have zero or more schema:knows whose values conform to <User>
prefix schema: <http://schema.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
<User> IRI {
schema:name xsd:string ;
schema:knows @<User> *
}
Prefix
declarations
as in
Turtle/SPARQL
RDF Validation using ShEx
:alice schema:name "Alice" ;
schema:knows :alice .
:bob schema:knows :alice ;
schema:name "Robert".
:carol schema:name "Carol", "Carole" .
:dave schema:name 234 .
:emily foaf:name "Emily" .
:frank schema:name "Frank" ;
schema:email <mailto:frank@example.org> ;
schema:knows :alice, :bob .
:grace schema:name "Grace" ;
schema:knows :alice, _:1 .
_:1 schema:name "Unknown" .
Try it (RDFShape): https://goo.gl/97bYdv
Try it (ShExDemo):https://goo.gl/Y8hBsW
Schema
Data
<User> IRI {
schema:name xsd:string ;
schema:knows @<User> *
}
:alice@<User>,
:bob @<User>,
:carol@<User>,
:dave @<User>,
:emily@<User>,
:frank@<User>,
:grace@<User>
Shape map







Validation process
:alice@:User, :bob@:User, :carol@:User
ShEx
Validator
Result shape map
:User {
schema:name xsd:string ;
schema:knows @:User *
}
ShEx Schema
:alice schema:name "Alice" ;
schema:knows :alice .
:bob schema:knows :alice ;
schema:name "Robert".
:carol schema:name "Carol", "Carole" .
RDF data
Shape map :alice@:User,
:bob@:User,
:carol@!:User
Input: RDF data, ShEx schema, Shape map
Output: Result shape map
Example with more ShEx features
:AdultPerson EXTRA rdf:type {
rdf:type [ schema:Person ] ;
:name xsd:string ;
:age MinInclusive 18 ;
:gender [:Male :Female] OR xsd:string ;
:address @:Address ? ;
:worksFor @:Company + ;
}
:Address CLOSED {
:addressLine xsd:string {1,3} ;
:postalCode /[0-9]{5}/ ;
:state @:State ;
:city xsd:string
}
:Company {
:name xsd:string ;
:state @:State ;
:employee @:AdultPerson * ;
}
:State /[A-Z]{2}/
:alice rdf:type :Student, schema:Person ;
:name "Alice" ;
:age 20 ;
:gender :Male ;
:address [
:addressLine "Bancroft Way" ;
:city "Berkeley" ;
:postalCode "55123" ;
:state "CA"
] ;
:worksFor [
:name "Company" ;
:state "CA" ;
:employee :alice
] . Try it: https://tinyurl.com/yd5hp9z4
More info about ShEx
See:
ShEx by Example (slides):
https://figshare.com/articles/ShExByExample_pptx/6291464
ShEx chapter from Validating RDF data book:
http://book.validatingrdf.com/bookHtml010.html
Short intro to SHACL
W3C recommendation:
https://www.w3.org/TR/shacl/ (July 2017)
RDF vocabulary
2 parts: SHACL-Core, SHACL-SPARQL
SHACL implementations
Name Parts Language - Library Comments
Topbraid SHACL API SHACL Core, SPARQL Java (Jena) Used by TopBraid composer
SHACL playground SHACL Core Javascript (rdflib.js) http://shacl.org/playground/
SHACLEX SHACL Core Scala (Jena, RDF4j) http://rdfshape.weso.es
pySHACL SHACL Core, SPARQL Python (rdflib) https://github.com/RDFLib/pySHACL
Corese SHACL SHACL Core, SPARQL Java (STTL) http://wimmics.inria.fr/corese
RDFUnit SHACL Core, SPARQL Java (Jena) https://github.com/AKSW/RDFUnit
Basic example
Try it. RDFShape https://goo.gl/T1uuzv
prefix : <http://example.org/>
prefix sh: <http://www.w3.org/ns/shacl#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix schema: <http://schema.org/>
:UserShape a sh:NodeShape ;
sh:targetNode :alice, :bob, :carol ;
sh:nodeKind sh:IRI ;
sh:property :hasName,
:hasEmail .
:hasName sh:path schema:name ;
sh:minCount 1;
sh:maxCount 1;
sh:datatype xsd:string .
:hasEmail sh:path schema:email ;
sh:minCount 1;
sh:maxCount 1;
sh:nodeKind sh:IRI .
:alice schema:name "Alice Cooper" ;
schema:email <mailto:alice@mail.org> .
:bob schema:firstName "Bob" ;
schema:email <mailto:bob@mail.org> .
:carol schema:name "Carol" ;
schema:email "carol@mail.org" .
Shapes graph
Data graph


Same example with blank nodes
Try it. RDFShape https://goo.gl/ukY5vq
prefix : <http://example.org/>
prefix sh: <http://www.w3.org/ns/shacl#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix schema: <http://schema.org/>
:UserShape a sh:NodeShape ;
sh:targetNode :alice, :bob, :carol ;
sh:nodeKind sh:IRI ;
sh:property [
sh:path schema:name ;
sh:minCount 1; sh:maxCount 1;
sh:datatype xsd:string ;
] ;
sh:property [
sh:path schema:email ;
sh:minCount 1; sh:maxCount 1;
sh:nodeKind sh:IRI ;
] .
:alice schema:name "Alice Cooper" ;
schema:email <mailto:alice@mail.org> .
:bob schema:firstName "Bob" ;
schema:email <mailto:bob@mail.org> .
:carol schema:name "Carol" ;
schema:email "carol@mail.org" .
Shapes graph
Data graph


Some definitions about SHACL
Shape: collection of targets and constraints components
Targets: specify which nodes in the data graph must conform to a shape
Constraint components: Determine how to validate a node
Shape
target declarations
constraint components
:UserShape a sh:NodeShape ;
sh:targetNode :alice, :bob, :carol ;
sh:nodeKind sh:IRI ;
sh:property :hasName,
:hasEmail .
:hasName sh:path schema:name ;
sh:minCount 1;
sh:maxCount 1;
sh:datatype xsd:string .
. . .
Validation Report
The output of the validation process is a list of violation errors
No errors  RDF conforms to shapes graph
[ a sh:ValidationReport ;
sh:conforms false ;
sh:result [
a sh:ValidationResult ;
sh:focusNode :bob ;
sh:message
"MinCount violation. Expected 1, obtained: 0" ;
sh:resultPath schema:name ;
sh:resultSeverity sh:Violation ;
sh:sourceConstraintComponent
sh:MinCountConstraintComponent ;
sh:sourceShape :hasName
] ;
...
[ a sh:ValidationReport ;
sh:conforms true
].
SHACL processor
SHACL
Processor
Validation report
:UserShape a sh:NodeShape ;
sh:targetNode :alice, :bob, :carol ;
sh:nodeKind sh:IRI ;
sh:property :hasName,
:hasEmail .
:hasName sh:path schema:name ;
sh:minCount 1;
sh:maxCount 1;
sh:datatype xsd:string .
. . .
Shapes
graph
:alice schema:name "Alice Cooper" ;
schema:email <mailto:alice@mail.org>.
:bob schema:name "Bob" ;
schema:email <mailto:bob@mail.org> .
:carol schema:name "Carol" ;
schema:email <mailto:carol@mail.org> .
Data
Graph
[ a sh:ValidationReport ;
sh:conforms true
].
SHACL Core built-in constraint components
Type Constraints
Cardinality minCount, maxCount
Types of values class, datatype, nodeKind
Values node, in, hasValue, property
Range of values minInclusive, maxInclusive
minExclusive, maxExclusive
String based minLength, maxLength, pattern
Language based languageIn, uniqueLang
Logical constraints not, and, or, xone
Closed shapes closed, ignoredProperties
Property pair constraints equals, disjoint, lessThan, lessThanOrEquals
Non-validating constraints name, description, order, group
Qualified shapes qualifiedValueShape, qualifiedValueShapesDisjoint
qualifiedMinCount, qualifiedMaxCount
Longer example
:AdultPerson EXTRA a {
a [ schema:Person ] ;
:name xsd:string ;
:age MinInclusive 18 ;
:gender [:Male :Female] OR xsd:string ;
:address @:Address ? ;
:worksFor @:Company + ;
}
:Address CLOSED {
:addressLine xsd:string {1,3} ;
:postalCode /[0-9]{5}/ ;
:state @:State ;
:city xsd:string
}
:Company {
:name xsd:string ;
:state @:State ;
:employee @:AdultPerson * ;
}
:State /[A-Z]{2}/
:AdultPerson a sh:NodeShape ;
sh:property [
sh:path rdf:type ;
sh:qualifiedValueShape [
sh:hasValue schema:Person
];
sh:qualifiedMinCount 1 ;
sh:qualifiedMaxCount 1 ;
] ;
sh:targetNode :alice ;
sh:property [ sh:path :name ;
sh:minCount 1; sh:maxCount 1;
sh:datatype xsd:string ;
] ;
sh:property [ sh:path :gender ;
sh:minCount 1; sh:maxCount 1;
sh:in (:Male :Female);
] ;
sh:property [ sh:path :age ;
sh:maxCount 1;
sh:minInclusive 18
] ;
sh:property [ sh:path :address ;
sh:node :Address ;
sh:minCount 1 ; sh:maxCount 1
] ;
sh:property [ sh:path :worksFor ;
sh:node :Company ;
sh:minCount 1 ; sh:maxCount 1
].
:Address a sh:NodeShape ;
sh:closed true ;
sh:property [ sh:path :addressLine;
sh:datatype xsd:string ;
sh:minCount 1 ; sh:maxCount 3
] ;
sh:property [ sh:path :postalCode ;
sh:pattern "[0-9]{5}" ;
sh:minCount 1 ; sh:maxCount 3
] ;
sh:property [ sh:path :city ;
sh:datatype xsd:string ;
sh:minCount 1 ; sh:maxCount 1
] ;
sh:property [ sh:path :state ;
sh:node :State ;
] .
:Company a sh:NodeShape ;
sh:property [ sh:path :name ;
sh:datatype xsd:string
] ;
sh:property [
sh:path :state ;
sh:node :State
] ;
sh:property [ sh:path :employee ;
sh:node :AdultPerson ;
] ;.
:State a sh:NodeShape ;
sh:pattern "[A-Z]{2}" .
In ShEx
In SHACL
Its recursive!!! (not well defined SHACL)
Implementation dependent feature
Try it: https://tinyurl.com/ycl3mkzr
More info about SHACL
SHACL by example (slides):
https://figshare.com/articles/SHACL_by_example/6449645
SHACL chapter at Validating RDF data book
http://book.validatingrdf.com/bookHtml011.html
Some challenges and perspectives
Theoretical foundations of ShEx/SHACL
Inference shapes from data
Validation Usability
RDF Stream validation
Schema ecosystems
Wikidata
Solid
Theoretical foundations of ShEx/SHACL
Conversion between ShEx and SHACL
SHaclEX library converts subsets of both
Challenges
Recursion and negation
Performance and algorithmic complexity
Detect useful subsets of the languages
Convert to SPARQL
Schema/data mapping Shapes Shapes
ShEx SHACL
Inference of Shapes from Data
Useful use case in practice
Knowledge Graph summarization
Some prototypes:
ShExer, RDFShape, ShapeArchitect
Try it with RDFShape:
https://tinyurl.com/y8pjcbyf
Shape Expression
generated for
wd:Q51613194
Shapes
RDF data
infer
Validation usability
Learning from users
Early adopters: WebIndex, HL7 FHIR, Eclipse Lyo, GenWiki,…
Improve error information/visualization/navigation/repairing
Authoring/visualization tools
Propose annotation sets
UI generation
Error reporting/suggestion (SHOULD/MUST/…)
Shapes
RDF Stream validation
Validation of RDF streams
Challenges:
Incremental validation
Named graphs
Addition/removal of triples
Sensor
Stream
RDF
data
Shapes
Validator
Stream
Validation
results
Schema ecosystems: Wikidata
In May, 2019, Wikidata announced ShEx adoption
New namespace for schemas
Example: https://www.wikidata.org/wiki/EntitySchema:E2
It opens lots of opportunities/challenges
Schema evolution and comparison
Schema ecosystems: Solid project
SOLID (SOcial Linked Data): Promoted by Tim Berners-Lee
Goal: Re-decentralize the Web
Separate data from apps
Give users more control about their data
Internally using linked data & RDF
Shapes needed for interoperability
See:
https://ruben.verborgh.org/blog/2019/06/17/shaping-linked-data-apps/
Data
pod
Shape
A
Shape
C
Shape
B
App
1
App
2
App
3
Conclusions
RDF as a basis for knowledge graphs
Explicit schemas can help improve data quality
2 languages proposed: ShEx/SHACL
Lots of new challenges and opportunities
End of presentation

More Related Content

What's hot

What's hot (20)

Introduction to SPARQL
Introduction to SPARQLIntroduction to SPARQL
Introduction to SPARQL
 
An introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked DataAn introduction to Semantic Web and Linked Data
An introduction to Semantic Web and Linked Data
 
An Introduction to SPARQL
An Introduction to SPARQLAn Introduction to SPARQL
An Introduction to SPARQL
 
SPARQL Tutorial
SPARQL TutorialSPARQL Tutorial
SPARQL Tutorial
 
Linked Data의 RDF 어휘 이해하고 체험하기 - FOAF, SIOC, SKOS를 중심으로 -
Linked Data의 RDF 어휘 이해하고 체험하기 - FOAF, SIOC, SKOS를 중심으로 -Linked Data의 RDF 어휘 이해하고 체험하기 - FOAF, SIOC, SKOS를 중심으로 -
Linked Data의 RDF 어휘 이해하고 체험하기 - FOAF, SIOC, SKOS를 중심으로 -
 
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODOLinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
 
Knowledge Graph Introduction
Knowledge Graph IntroductionKnowledge Graph Introduction
Knowledge Graph Introduction
 
Web ontology language (owl)
Web ontology language (owl)Web ontology language (owl)
Web ontology language (owl)
 
SPARQL in a nutshell
SPARQL in a nutshellSPARQL in a nutshell
SPARQL in a nutshell
 
Jarrar: RDFS ( RDF Schema)
Jarrar: RDFS ( RDF Schema) Jarrar: RDFS ( RDF Schema)
Jarrar: RDFS ( RDF Schema)
 
ESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge GraphsESWC 2017 Tutorial Knowledge Graphs
ESWC 2017 Tutorial Knowledge Graphs
 
Querying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge GraphQuerying the Wikidata Knowledge Graph
Querying the Wikidata Knowledge Graph
 
Debunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative FactsDebunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative Facts
 
RDF data model
RDF data modelRDF data model
RDF data model
 
The Semantic Web #9 - Web Ontology Language (OWL)
The Semantic Web #9 - Web Ontology Language (OWL)The Semantic Web #9 - Web Ontology Language (OWL)
The Semantic Web #9 - Web Ontology Language (OWL)
 
SPARQL Cheat Sheet
SPARQL Cheat SheetSPARQL Cheat Sheet
SPARQL Cheat Sheet
 
Rdf
RdfRdf
Rdf
 
Introduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AIIntroduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AI
 
DBpedia InsideOut
DBpedia InsideOutDBpedia InsideOut
DBpedia InsideOut
 
SHACL by example
SHACL by exampleSHACL by example
SHACL by example
 

Similar to Validating RDF data: Challenges and perspectives

2011 05-01 linked data
2011 05-01 linked data2011 05-01 linked data
2011 05-01 linked data
vafopoulos
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
Josef Petrák
 

Similar to Validating RDF data: Challenges and perspectives (20)

Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphs
 
SMW Introduction Wikimedia Hackathon Krabina 2024.pdf
SMW Introduction Wikimedia Hackathon Krabina 2024.pdfSMW Introduction Wikimedia Hackathon Krabina 2024.pdf
SMW Introduction Wikimedia Hackathon Krabina 2024.pdf
 
Challenges and applications of RDF shapes
Challenges and applications of RDF shapesChallenges and applications of RDF shapes
Challenges and applications of RDF shapes
 
Graph databases & data integration v2
Graph databases & data integration v2Graph databases & data integration v2
Graph databases & data integration v2
 
HyperGraphQL
HyperGraphQLHyperGraphQL
HyperGraphQL
 
A Little SPARQL in your Analytics
A Little SPARQL in your AnalyticsA Little SPARQL in your Analytics
A Little SPARQL in your Analytics
 
Validating and Describing Linked Data Portals using RDF Shape Expressions
Validating and Describing Linked Data Portals using RDF Shape ExpressionsValidating and Describing Linked Data Portals using RDF Shape Expressions
Validating and Describing Linked Data Portals using RDF Shape Expressions
 
Sem webmaubeuge
Sem webmaubeugeSem webmaubeuge
Sem webmaubeuge
 
Rdf data-model-and-storage
Rdf data-model-and-storageRdf data-model-and-storage
Rdf data-model-and-storage
 
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
 
Semantic Web
Semantic WebSemantic Web
Semantic Web
 
Linked open data sandwich
Linked open data sandwichLinked open data sandwich
Linked open data sandwich
 
MWCon KnowledgeGraph Extension Krabina 2024.pdf
MWCon KnowledgeGraph Extension Krabina 2024.pdfMWCon KnowledgeGraph Extension Krabina 2024.pdf
MWCon KnowledgeGraph Extension Krabina 2024.pdf
 
Culture Geeks Feb talk: Adventures in Linked Data Land
Culture Geeks Feb talk: Adventures in Linked Data LandCulture Geeks Feb talk: Adventures in Linked Data Land
Culture Geeks Feb talk: Adventures in Linked Data Land
 
Semantic Web talk TEMPLATE
Semantic Web talk TEMPLATESemantic Web talk TEMPLATE
Semantic Web talk TEMPLATE
 
Linked opendata parisemantique.fr - 24062011
Linked opendata   parisemantique.fr - 24062011Linked opendata   parisemantique.fr - 24062011
Linked opendata parisemantique.fr - 24062011
 
Introducción a la web semántica - Linkatu - irekia 2012
Introducción a la web semántica - Linkatu - irekia 2012Introducción a la web semántica - Linkatu - irekia 2012
Introducción a la web semántica - Linkatu - irekia 2012
 
RDFa Tutorial
RDFa TutorialRDFa Tutorial
RDFa Tutorial
 
2011 05-01 linked data
2011 05-01 linked data2011 05-01 linked data
2011 05-01 linked data
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
 

More from Jose Emilio Labra Gayo

More from Jose Emilio Labra Gayo (20)

Publicaciones de investigación
Publicaciones de investigaciónPublicaciones de investigación
Publicaciones de investigación
 
Introducción a la investigación/doctorado
Introducción a la investigación/doctoradoIntroducción a la investigación/doctorado
Introducción a la investigación/doctorado
 
Legislative data portals and linked data quality
Legislative data portals and linked data qualityLegislative data portals and linked data quality
Legislative data portals and linked data quality
 
Wikidata
WikidataWikidata
Wikidata
 
Legislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologiesLegislative document content extraction based on Semantic Web technologies
Legislative document content extraction based on Semantic Web technologies
 
Introduction to SPARQL
Introduction to SPARQLIntroduction to SPARQL
Introduction to SPARQL
 
Introducción a la Web Semántica
Introducción a la Web SemánticaIntroducción a la Web Semántica
Introducción a la Web Semántica
 
2017 Tendencias en informática
2017 Tendencias en informática2017 Tendencias en informática
2017 Tendencias en informática
 
RDF, linked data and semantic web
RDF, linked data and semantic webRDF, linked data and semantic web
RDF, linked data and semantic web
 
19 javascript servidor
19 javascript servidor19 javascript servidor
19 javascript servidor
 
Como publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazadosComo publicar datos: hacia los datos abiertos enlazados
Como publicar datos: hacia los datos abiertos enlazados
 
16 Alternativas XML
16 Alternativas XML16 Alternativas XML
16 Alternativas XML
 
XSLT
XSLTXSLT
XSLT
 
XPath
XPathXPath
XPath
 
Arquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el ServidorArquitectura de la Web y Computación en el Servidor
Arquitectura de la Web y Computación en el Servidor
 
RDF Validation Future work and applications
RDF Validation Future work and applicationsRDF Validation Future work and applications
RDF Validation Future work and applications
 
ShEx vs SHACL
ShEx vs SHACLShEx vs SHACL
ShEx vs SHACL
 
Máster en Ingeniería Web
Máster en Ingeniería WebMáster en Ingeniería Web
Máster en Ingeniería Web
 
2016 temuco tecnologias_websemantica
2016 temuco tecnologias_websemantica2016 temuco tecnologias_websemantica
2016 temuco tecnologias_websemantica
 
2015 bogota datos_enlazados
2015 bogota datos_enlazados2015 bogota datos_enlazados
2015 bogota datos_enlazados
 

Recently uploaded

原版定制(PSU毕业证书)美国宾州州立大学毕业证原件一模一样
原版定制(PSU毕业证书)美国宾州州立大学毕业证原件一模一样原版定制(PSU毕业证书)美国宾州州立大学毕业证原件一模一样
原版定制(PSU毕业证书)美国宾州州立大学毕业证原件一模一样
rgdasda
 
一比一原版(Soton毕业证书)南安普顿大学毕业证原件一模一样
一比一原版(Soton毕业证书)南安普顿大学毕业证原件一模一样一比一原版(Soton毕业证书)南安普顿大学毕业证原件一模一样
一比一原版(Soton毕业证书)南安普顿大学毕业证原件一模一样
Fi
 
一比一原版布兰迪斯大学毕业证如何办理
一比一原版布兰迪斯大学毕业证如何办理一比一原版布兰迪斯大学毕业证如何办理
一比一原版布兰迪斯大学毕业证如何办理
A
 
一比一原版(Cranfield毕业证书)英国克兰菲尔德大学毕业证如何办理
一比一原版(Cranfield毕业证书)英国克兰菲尔德大学毕业证如何办理一比一原版(Cranfield毕业证书)英国克兰菲尔德大学毕业证如何办理
一比一原版(Cranfield毕业证书)英国克兰菲尔德大学毕业证如何办理
gfhdsfr
 
一比一定制(USC毕业证书)美国南加州大学毕业证学位证书
一比一定制(USC毕业证书)美国南加州大学毕业证学位证书一比一定制(USC毕业证书)美国南加州大学毕业证学位证书
一比一定制(USC毕业证书)美国南加州大学毕业证学位证书
Fir
 
一比一原版(Princeton毕业证书)普林斯顿大学毕业证如何办理
一比一原版(Princeton毕业证书)普林斯顿大学毕业证如何办理一比一原版(Princeton毕业证书)普林斯顿大学毕业证如何办理
一比一原版(Princeton毕业证书)普林斯顿大学毕业证如何办理
C
 
一比一定制波士顿学院毕业证学位证书
一比一定制波士顿学院毕业证学位证书一比一定制波士顿学院毕业证学位证书
一比一定制波士顿学院毕业证学位证书
A
 
一比一定制加州大学欧文分校毕业证学位证书
一比一定制加州大学欧文分校毕业证学位证书一比一定制加州大学欧文分校毕业证学位证书
一比一定制加州大学欧文分校毕业证学位证书
A
 
一比一定制(OSU毕业证书)美国俄亥俄州立大学毕业证学位证书
一比一定制(OSU毕业证书)美国俄亥俄州立大学毕业证学位证书一比一定制(OSU毕业证书)美国俄亥俄州立大学毕业证学位证书
一比一定制(OSU毕业证书)美国俄亥俄州立大学毕业证学位证书
rgdasda
 
一比一定制(Temasek毕业证书)新加坡淡马锡理工学院毕业证学位证书
一比一定制(Temasek毕业证书)新加坡淡马锡理工学院毕业证学位证书一比一定制(Temasek毕业证书)新加坡淡马锡理工学院毕业证学位证书
一比一定制(Temasek毕业证书)新加坡淡马锡理工学院毕业证学位证书
B
 

Recently uploaded (20)

原版定制(PSU毕业证书)美国宾州州立大学毕业证原件一模一样
原版定制(PSU毕业证书)美国宾州州立大学毕业证原件一模一样原版定制(PSU毕业证书)美国宾州州立大学毕业证原件一模一样
原版定制(PSU毕业证书)美国宾州州立大学毕业证原件一模一样
 
一比一原版(Soton毕业证书)南安普顿大学毕业证原件一模一样
一比一原版(Soton毕业证书)南安普顿大学毕业证原件一模一样一比一原版(Soton毕业证书)南安普顿大学毕业证原件一模一样
一比一原版(Soton毕业证书)南安普顿大学毕业证原件一模一样
 
一比一原版布兰迪斯大学毕业证如何办理
一比一原版布兰迪斯大学毕业证如何办理一比一原版布兰迪斯大学毕业证如何办理
一比一原版布兰迪斯大学毕业证如何办理
 
Free on Wednesdays T Shirts Free on Wednesdays Sweatshirts
Free on Wednesdays T Shirts Free on Wednesdays SweatshirtsFree on Wednesdays T Shirts Free on Wednesdays Sweatshirts
Free on Wednesdays T Shirts Free on Wednesdays Sweatshirts
 
一比一原版(Cranfield毕业证书)英国克兰菲尔德大学毕业证如何办理
一比一原版(Cranfield毕业证书)英国克兰菲尔德大学毕业证如何办理一比一原版(Cranfield毕业证书)英国克兰菲尔德大学毕业证如何办理
一比一原版(Cranfield毕业证书)英国克兰菲尔德大学毕业证如何办理
 
一比一定制(USC毕业证书)美国南加州大学毕业证学位证书
一比一定制(USC毕业证书)美国南加州大学毕业证学位证书一比一定制(USC毕业证书)美国南加州大学毕业证学位证书
一比一定制(USC毕业证书)美国南加州大学毕业证学位证书
 
Free scottie t shirts Free scottie t shirts
Free scottie t shirts Free scottie t shirtsFree scottie t shirts Free scottie t shirts
Free scottie t shirts Free scottie t shirts
 
一比一原版(Princeton毕业证书)普林斯顿大学毕业证如何办理
一比一原版(Princeton毕业证书)普林斯顿大学毕业证如何办理一比一原版(Princeton毕业证书)普林斯顿大学毕业证如何办理
一比一原版(Princeton毕业证书)普林斯顿大学毕业证如何办理
 
Registry Data Accuracy Improvements, presented by Chimi Dorji at SANOG 41 / I...
Registry Data Accuracy Improvements, presented by Chimi Dorji at SANOG 41 / I...Registry Data Accuracy Improvements, presented by Chimi Dorji at SANOG 41 / I...
Registry Data Accuracy Improvements, presented by Chimi Dorji at SANOG 41 / I...
 
一比一定制波士顿学院毕业证学位证书
一比一定制波士顿学院毕业证学位证书一比一定制波士顿学院毕业证学位证书
一比一定制波士顿学院毕业证学位证书
 
Development Lifecycle.pptx for the secure development of apps
Development Lifecycle.pptx for the secure development of appsDevelopment Lifecycle.pptx for the secure development of apps
Development Lifecycle.pptx for the secure development of apps
 
一比一定制加州大学欧文分校毕业证学位证书
一比一定制加州大学欧文分校毕业证学位证书一比一定制加州大学欧文分校毕业证学位证书
一比一定制加州大学欧文分校毕业证学位证书
 
💞 Safe And Seℂure ℂall Girls Dehradun ℂall Girls Serviℂe Just ℂall 🍑👄93157910...
💞 Safe And Seℂure ℂall Girls Dehradun ℂall Girls Serviℂe Just ℂall 🍑👄93157910...💞 Safe And Seℂure ℂall Girls Dehradun ℂall Girls Serviℂe Just ℂall 🍑👄93157910...
💞 Safe And Seℂure ℂall Girls Dehradun ℂall Girls Serviℂe Just ℂall 🍑👄93157910...
 
一比一定制(OSU毕业证书)美国俄亥俄州立大学毕业证学位证书
一比一定制(OSU毕业证书)美国俄亥俄州立大学毕业证学位证书一比一定制(OSU毕业证书)美国俄亥俄州立大学毕业证学位证书
一比一定制(OSU毕业证书)美国俄亥俄州立大学毕业证学位证书
 
一比一定制(Temasek毕业证书)新加坡淡马锡理工学院毕业证学位证书
一比一定制(Temasek毕业证书)新加坡淡马锡理工学院毕业证学位证书一比一定制(Temasek毕业证书)新加坡淡马锡理工学院毕业证学位证书
一比一定制(Temasek毕业证书)新加坡淡马锡理工学院毕业证学位证书
 
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.
 
AI Generated 3D Models | AI 3D Model Generator
AI Generated 3D Models | AI 3D Model GeneratorAI Generated 3D Models | AI 3D Model Generator
AI Generated 3D Models | AI 3D Model Generator
 
🍑👄Dehradun Esℂorts Serviℂe☎️9315791090🍑👄 ℂall Girl serviℂe in ☎️Dehradun ℂall...
🍑👄Dehradun Esℂorts Serviℂe☎️9315791090🍑👄 ℂall Girl serviℂe in ☎️Dehradun ℂall...🍑👄Dehradun Esℂorts Serviℂe☎️9315791090🍑👄 ℂall Girl serviℂe in ☎️Dehradun ℂall...
🍑👄Dehradun Esℂorts Serviℂe☎️9315791090🍑👄 ℂall Girl serviℂe in ☎️Dehradun ℂall...
 
The Rise of Subscription-Based Digital Services.pdf
The Rise of Subscription-Based Digital Services.pdfThe Rise of Subscription-Based Digital Services.pdf
The Rise of Subscription-Based Digital Services.pdf
 
iThome_CYBERSEC2024_Drive_Into_the_DarkWeb
iThome_CYBERSEC2024_Drive_Into_the_DarkWebiThome_CYBERSEC2024_Drive_Into_the_DarkWeb
iThome_CYBERSEC2024_Drive_Into_the_DarkWeb
 

Validating RDF data: Challenges and perspectives

  • 1. Validating RDF data: Challenges and perspectives Jose Emilio Labr Gayo WESO Research Group (WEb Semantics Oviedo) University of Oviedo, Spain
  • 2. About me… Founded WESO (Web Semantics Oviedo) research group Practical applications of semantic technologies since 2004 Several domains: e-Government, e-Health Some books: "Web semántica" (in Spanish), 2012 "Validating RDF data", 2017 …and software: SHaclEX (Scala library, implements ShEx & SHACL) RDFShape (RDF playground) HTML version: http://book.validatingrdf.com Examples: https://github.com/labra/validatingRDFBookExamples
  • 3. Contents Short intro to semantic Web & knowledge graphs Short overview of RDF Understanding the RDF validation problem 2 technologies: ShEx/SHACL Challenges & perspectives
  • 4. Semantic Web Vision of the Web as web of data Not only pages, but also data: linked data Data understandable by machines Related with: Big Data Artificial intelligence Knowledge representation Example: Wikipedia vs Wikidata Tim Berners-Lee Source: Wikipedia Source: http://www.internetlivestats.com/total-number-of-websites/ Date: 05/04/2019 (19:05h)
  • 5. Who consumes Web information? People We access through some device ...and Machines They show us web pages (browsers) ...but also manage information (bots) Filtering content Doing suggestions ... "If Google doesn't understand your Web page, it almost doesn't exist"
  • 6. Web information Documentos Information retrieval DocumentosDocumentos Posts Tweets Web pages ... Publish Web documents Reality Mental model Web information must be machine-processable Semantic web technologies focus on how to achieve this goal Sensors Generate User Content generation & filtering
  • 7. People vs Machines Programmed for some tasks Predictable (no mistakes*) No problem with repetitive tasks Problems to understand the context Creativity, imagination Unpredictable (we do mistakes) We get tired with repetitive tasks Understanding based on context *when well programmed
  • 8. Information understandable by machines? Example: "Oviedo has a temperature of 36 degrees" Problem: Ambiguity and context identification Oviedo ? It can be... A city in Spain ...or a city in Florida, USA ...or a soccer player http://www.wikidata.org/entity/Q325997 http://wikidata.org/entity/Q14317 http://www.wikidata.org/entity/Q1813449 ...has a temperatrure of... https://www.wikidata.org/wiki/Property:P2076 Identifiers (URIs) in Wikidata Machine representation http://www.wikidata.org/entity/Q14317 https://www.wikidata.org/prop/direct/P2076 36^^<http://www.w3.org/2001/XMLSchema#integer>
  • 9. Knowledge representation for machines Simple statement (triple 3 elements): subject, predicate, object http://www.wikidata.org/entity/Q14317 https://www.wikidata.org/prop/direct/P2076 subject predicate object 36^^<http://www.w3.org/2001/XMLSchema#integer> wd:Q14317 wdt:P2076 36^^xsd:integer wd: <http://www.wikidata.org/entity/> wdt: <https://www.wikidata.org/prop/direct/> xsd: <http://www.w3.org/2001/XMLSchema#> Prefix declarations help simplify long URIs
  • 10. Knowledge graphs wd:Q14317 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso Wikidata statistics (17-07-2019) 57,462,853 entities 728,979,397 triples Source: https://www.wikidata.org/wiki/Wikidata:Statistics Wikidata is an example of a knowledge graph There are a lot more: Open: DBpedia, BabelNet, etc Proprietary: Google, facebook, Microsoft, etc. also have knowledge graphs
  • 11. Web information & knowledge graphs Documentos Information retrieval Knowledge graphs DocumentosDocumentos Posts Tweets Web pages ... Publish Web documents Reality Mental model Knowledge graphs are a key part of the Web nowadays Sensors Generate User Content generation & filtering
  • 12. RDF RDF (Resource Description Framework) Allows to represent knowledge graphs RDF data model = graph as a set of statements Predicates = URIs Several syntaxes Turtle, N-Triples, JSON-LD,... 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso wd:Q14317
  • 13. prefix wd: <http://www.wikidata.org/entity/> prefix wdt: <https://www.wikidata.org/prop/direct/> prefix xsd: <http://www.w3.org/2001/XMLSchema#> wd:Q14317 wdt:P1082 220020 ; wdt:P2076 36 ; wdt:P1376 wd:Q3934 . wd:Q10514 wdt:P19 wd:Q14317 ; wdt:P569 "1981-07-29"^^xsd:date ; wdt:P18 <picture.jpg> . RDF syntaxes: Turtle 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso wd:Q14317
  • 14. <http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P1082> "220020"^^<http://www.w3.org/2001/XMLSchema#integer> . <http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P1376> <http://www.wikidata.org/entity/Q3934> . <http://www.wikidata.org/entity/Q14317> <https://www.wikidata.org/prop/direct/P2076> "36"^^<http://www.w3.org/2001/XMLSchema#integer> . <http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P18> <picture.jpg> . <http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P19> <http://www.wikidata.org/entity/Q14317> . <http://www.wikidata.org/entity/Q10514> <https://www.wikidata.org/prop/direct/P569> "1981-07-29"^^<http://www.w3.org/2001/XMLSchema#date> . RDF syntaxes: N-Triples 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso wd:Q14317
  • 15. { "@context" : "http://...wikidata.json", "@graph" : [ { "@id" : "wd:Q10514", "wdt:P18" : "picture.jpg", "wdt:P19" : "wd:Q14317", "P569" : "1981-07-29" }, { "@id" : "wd:Q14317", "wdt:P1082" : 220020, "wdt:P1376" : "wd:Q3934", "wdt:P2076" : 36 } ] } RDF syntaxes: JSON-LD 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso wd:Q14317
  • 16. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:wdt="https://www.wikidata.org/prop/direct/" xmlns:wd="http://www.wikidata.org/entity/" xmlns:xsd="http://www.w3.org/2001/XMLSchema#"> <rdf:Description rdf:about="http://www.wikidata.org/entity/Q10514"> <wdt:P18>picture.jpg</wdt:P18> <wdt:P19> <rdf:Description rdf:about="http://www.wikidata.org/entity/Q14317"> <wdt:P1082 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">220020</wdt:P1082> <wdt:P1376 rdf:resource="http://www.wikidata.org/entity/Q3934"/> <wdt:P2076 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">36</wdt:P2076> </rdf:Description> </wdt:P19> <wdt:P569 rdf:datatype=http://www.w3.org/2001/XMLSchema#date>1981-07-29</wdt:P569> </rdf:Description> </rdf:RDF> RDF syntaxes: RDF/XML 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg BirthPlace Picture BirthDate capital temperature population Oviedo Asturias Fernando Alonso wd:Q14317
  • 17. RDF graphs are composable RDF helps information integration 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg wd:Q14317 :alice :bob :carol schema:birthPlace schema:birthPlace schema:knows schema:knows Alice Cooper Schema:name Robert Smith schema:nameschema:knows Carol King schema:name
  • 18. RDF data model Graph = set of statements (triples) Predicates (arcs) = URIs Subjects: URIs* Objects: URIs or literals* 36^^xsd:integer wdt:P2076 wd:Q10514 wd:Q3934 wdt:P19 wdt:P1376 "1981-07-29"^^xsd:date 220020^^xsd:integer wdt:P1082 wdt:P18 wdt:P569 ...jpg wd:Q14317 *Exception: blank nodes (next slide)
  • 19. RDF data model: Blank nodes A statement about something Example: "Alice knows someone who knows Bob" :alice schema:knows :bob schema:knows :alice schema:knows _:x ; _:x schema:knows :bob . Notation (in Turtle) :alice schema:knows [ schema:knows :bob ] .  x ( :alice :knows x  x :knows :bob )
  • 20. RDF data model: Literals :bob schema:name "Robert"^^xsd:string ; schema:age "18"^^xsd:integer ; schema:birthDate "2010-04-12"^^xsd:date . :bob schema:name "Robert" ; schema:age 18 ; schema:birthDate "2010-04-12"^^xsd:date . Objects can also be literals Literals contain a lexical form and a datatype Common datatypes: XML Schema primitive datatypes If not specified, a literal has type xsd:string :bob schema:name "2010-04-12"^^xsd:date18"Robert Smith" schema:age schema:birthDate
  • 21. RDF data model: Language tagged strings String literals can be qualified by a language tag They have datatype rdfs:langString :spain rdfs:label "Spain"@en ; rdfs:label "España"@es . :Spain "España"@es"Spain"@en rdfs:label rdfs:label
  • 22. ...and that's all Yes, the RDF Data model is simple
  • 24. SELECT ?name ?birthDate ?picture WHERE { ?person wdt:P19 wd:Q14317 ; rdfs:label ?name ; wdt:P569 ?birthDate ; wdt:P18 ?picture . FILTER (Lang(?name) = 'es') } SPARQL SPARQL = Simple Protocol and RDF Query Language Similar to SQL, but for RDF Example: People born in Oviedo wdt:P19 "1981-07-29"^^xsd:date wdt:P18 wdt:P569 Oviedo wd:Q14317 "Fernando Alonso"@en rdfs:label ?person ?birthPlace?name ?picture wdt:P19 wdt:P18wdt:P569 wd:Q14317 rdfs:label wd:Q10514 wdt:P19 ...
  • 25. Shared entities and vocabularies URIs as global identifiers allow to reuse them Lots of vocabularies: domain specific or general purpose Examples: schema.org: properties that major browsers can recognize Linked Open Vocabularies
  • 26. Inference and ontologies Some vocabularies define properties that can infer new information RDF Schema: Classes, subclasses, etc. OWL: Ontologies using description logics Trade-off between expressiveness and complexity rdf:type :Student :alice :Person rdfs:subClassOf :HumanBeing owl:sameAs rdf:type rdf:type
  • 27. RDF, the good parts... RDF as an integration language RDF as a lingua franca for semantic web and linked data RDF flexibility & integration Data can be adapted to multiple environments Open and reusable data by default RDF for knowledge representation RDF data stores & SPARQL
  • 28. RDF, the other parts Consuming & producing RDF Multiple syntaxes: Turtle, RDF/XML, JSON-LD, ... Embedding RDF in HTML Describing and validating RDF content Producer Consumer
  • 29. Why describe & validate RDF? For producers Developers can understand the contents they are going to produce Ensure they produce the expected structure Advertise and document the structure Generate interfaces For consumers Understand the contents Verify the structure before processing it Query generation & optimization Producer Consumer Shapes
  • 30. Similar technologies Technology Schema Relational Databases DDL XML DTD, XML Schema, RelaxNG, Schematron Json Json Schema RDF ? Fill that gap
  • 31. Understanding the problem Identifying the shape of graphs... Shapes can describe the form of a node (node constraint) ...and the number of possible arcs incoming/outgoing from a node ...and the possible values associated with those arcs :alice schema:name "Alice"; schema:knows :bob, :carol . IRI schema:name string 1 schema:knows IRI 0, 1,... RDF Node Shape RDF Node that represents a User <UserShape> IRI { schema:name xsd:string ; schema:knows IRI * } ShEx
  • 32. Understanding the problem Repeated properties The same property can be used for different purposes in the same data Example: A product must have 2 codes with different structure :product schema:productID "isbn:123-456-789"; schema:productID "code456" . A practical example from FHIR See: http://hl7-fhir.github.io/observation-example-bloodpressure.ttl.html
  • 33. Understanding the problem Shapes ≠ types Nodes in RDF graphs can have 0, 1 or many rdf:type declarations A type can be used in multiple contexts, e.g. foaf:Person Nodes are not necessarily annotated with discriminating types Nodes with type :Person can represent friends, students, patients,... Different meanings and different structure depending on context Specific validation constraints for different contexts
  • 34. Understanding the problem RDF validation ≠ ontology definition ≠ instance data Ontologies are usually focused on domain entities RDF validation is focused on RDF graph features (lower level) Ontology Constraints RDF Validation Instance data Different levels :alice schema:name "Alice"; schema:knows :bob . <User> IRI { schema:name xsd:string ; schema:knows IRI } schema:knows a owl:ObjectProperty ; rdfs:domain schema:Person ; rdfs:range schema:Person . A user must have only two properties: schema:name of value xsd:string schema:knows with an IRI value
  • 35. Why not using SPARQL to validate? Pros: Expressive Ubiquitous Cons Expressive Idiomatic many ways to encode the same constraint ASK {{ SELECT ?Person { ?Person schema:name ?o . } GROUP BY ?Person HAVING (COUNT(*)=1) } { SELECT ?Person { ?Person schema:name ?o . FILTER ( isLiteral(?o) && datatype(?o) = xsd:string ) } GROUP BY ?Person HAVING (COUNT(*)=1) } { SELECT ?Person (COUNT(*) AS ?c1) { ?Person schema:gender ?o . } GROUP BY ?Person HAVING (COUNT(*)=1)} { SELECT ?Person (COUNT(*) AS ?c2) { ?S schema:gender ?o . FILTER ((?o = schema:Female || ?o = schema:Male)) } GROUP BY ?Person HAVING (COUNT(*)=1)} FILTER (?c1 = ?c2) } Example: Define SPARQL query to check: There must be one schema:name which must be a xsd:string, and one schema:gender which must be schema:Male or schema:Female
  • 36. Previous/other approaches SPIN, by TopQuadrant http://spinrdf.org/ SPARQL templates, Influenced SHACL Stardog ICV: http://docs.stardog.com/icv/icv-specification.html OWL with UNA and CWA OSLC resource shapes: https://www.w3.org/Submission/shapes/ Vocabulary for RDF validation Dublin Core Application profiles (K. Coyle, T. Baker) http://dublincore.org/documents/dc-dsp/ RDF Data Descriptions (Fischer et al) http://ceur-ws.org/Vol-1330/paper-33.pdf RDFUnit (D. Kontokostas) http://aksw.org/Projects/RDFUnit.html ...
  • 37. ShEx and SHACL 2013 RDF Validation Workshop Conclusions of the workshop: There is a need of a higher level, concise language for RDF Validation ShEx initially proposed (v 1.0) 2014 W3c Data Shapes WG chartered 2017 SHACL accepted as W3C recommendation 2017 ShEx 2.0 released as Community group draft 2018 ShEx adopted by Wikidata
  • 38. Short intro to ShEx ShEx (Shape Expressions Language) Concise and human-readable language for RDF validation & description Syntax similar to SPARQL, Turtle Semantics inspired by regular expressions & RelaxNG 2 syntaxes: Compact and RDF/JSON-LD Official info: http://shex.io Semantics: http://shex.io/shex-semantics/, primer: http://shex.io/shex-primer
  • 39. ShEx implementations and playgrounds Libraries: shex.js: Javascript SHaclEX: Scala (Jena/RDF4j) PyShEx: Python shex-java: Java Ruby-ShEx: Ruby Online demos & playgrounds ShEx-simple RDFShape ShEx-Java ShExValidata
  • 40. Simple example Nodes conforming to <User> shape must: • Be IRIs • Have exactly one schema:name with a value of type xsd:string • Have zero or more schema:knows whose values conform to <User> prefix schema: <http://schema.org/> prefix xsd: <http://www.w3.org/2001/XMLSchema#> <User> IRI { schema:name xsd:string ; schema:knows @<User> * } Prefix declarations as in Turtle/SPARQL
  • 41. RDF Validation using ShEx :alice schema:name "Alice" ; schema:knows :alice . :bob schema:knows :alice ; schema:name "Robert". :carol schema:name "Carol", "Carole" . :dave schema:name 234 . :emily foaf:name "Emily" . :frank schema:name "Frank" ; schema:email <mailto:frank@example.org> ; schema:knows :alice, :bob . :grace schema:name "Grace" ; schema:knows :alice, _:1 . _:1 schema:name "Unknown" . Try it (RDFShape): https://goo.gl/97bYdv Try it (ShExDemo):https://goo.gl/Y8hBsW Schema Data <User> IRI { schema:name xsd:string ; schema:knows @<User> * } :alice@<User>, :bob @<User>, :carol@<User>, :dave @<User>, :emily@<User>, :frank@<User>, :grace@<User> Shape map       
  • 42. Validation process :alice@:User, :bob@:User, :carol@:User ShEx Validator Result shape map :User { schema:name xsd:string ; schema:knows @:User * } ShEx Schema :alice schema:name "Alice" ; schema:knows :alice . :bob schema:knows :alice ; schema:name "Robert". :carol schema:name "Carol", "Carole" . RDF data Shape map :alice@:User, :bob@:User, :carol@!:User Input: RDF data, ShEx schema, Shape map Output: Result shape map
  • 43. Example with more ShEx features :AdultPerson EXTRA rdf:type { rdf:type [ schema:Person ] ; :name xsd:string ; :age MinInclusive 18 ; :gender [:Male :Female] OR xsd:string ; :address @:Address ? ; :worksFor @:Company + ; } :Address CLOSED { :addressLine xsd:string {1,3} ; :postalCode /[0-9]{5}/ ; :state @:State ; :city xsd:string } :Company { :name xsd:string ; :state @:State ; :employee @:AdultPerson * ; } :State /[A-Z]{2}/ :alice rdf:type :Student, schema:Person ; :name "Alice" ; :age 20 ; :gender :Male ; :address [ :addressLine "Bancroft Way" ; :city "Berkeley" ; :postalCode "55123" ; :state "CA" ] ; :worksFor [ :name "Company" ; :state "CA" ; :employee :alice ] . Try it: https://tinyurl.com/yd5hp9z4
  • 44. More info about ShEx See: ShEx by Example (slides): https://figshare.com/articles/ShExByExample_pptx/6291464 ShEx chapter from Validating RDF data book: http://book.validatingrdf.com/bookHtml010.html
  • 45. Short intro to SHACL W3C recommendation: https://www.w3.org/TR/shacl/ (July 2017) RDF vocabulary 2 parts: SHACL-Core, SHACL-SPARQL
  • 46. SHACL implementations Name Parts Language - Library Comments Topbraid SHACL API SHACL Core, SPARQL Java (Jena) Used by TopBraid composer SHACL playground SHACL Core Javascript (rdflib.js) http://shacl.org/playground/ SHACLEX SHACL Core Scala (Jena, RDF4j) http://rdfshape.weso.es pySHACL SHACL Core, SPARQL Python (rdflib) https://github.com/RDFLib/pySHACL Corese SHACL SHACL Core, SPARQL Java (STTL) http://wimmics.inria.fr/corese RDFUnit SHACL Core, SPARQL Java (Jena) https://github.com/AKSW/RDFUnit
  • 47. Basic example Try it. RDFShape https://goo.gl/T1uuzv prefix : <http://example.org/> prefix sh: <http://www.w3.org/ns/shacl#> prefix xsd: <http://www.w3.org/2001/XMLSchema#> prefix schema: <http://schema.org/> :UserShape a sh:NodeShape ; sh:targetNode :alice, :bob, :carol ; sh:nodeKind sh:IRI ; sh:property :hasName, :hasEmail . :hasName sh:path schema:name ; sh:minCount 1; sh:maxCount 1; sh:datatype xsd:string . :hasEmail sh:path schema:email ; sh:minCount 1; sh:maxCount 1; sh:nodeKind sh:IRI . :alice schema:name "Alice Cooper" ; schema:email <mailto:alice@mail.org> . :bob schema:firstName "Bob" ; schema:email <mailto:bob@mail.org> . :carol schema:name "Carol" ; schema:email "carol@mail.org" . Shapes graph Data graph  
  • 48. Same example with blank nodes Try it. RDFShape https://goo.gl/ukY5vq prefix : <http://example.org/> prefix sh: <http://www.w3.org/ns/shacl#> prefix xsd: <http://www.w3.org/2001/XMLSchema#> prefix schema: <http://schema.org/> :UserShape a sh:NodeShape ; sh:targetNode :alice, :bob, :carol ; sh:nodeKind sh:IRI ; sh:property [ sh:path schema:name ; sh:minCount 1; sh:maxCount 1; sh:datatype xsd:string ; ] ; sh:property [ sh:path schema:email ; sh:minCount 1; sh:maxCount 1; sh:nodeKind sh:IRI ; ] . :alice schema:name "Alice Cooper" ; schema:email <mailto:alice@mail.org> . :bob schema:firstName "Bob" ; schema:email <mailto:bob@mail.org> . :carol schema:name "Carol" ; schema:email "carol@mail.org" . Shapes graph Data graph  
  • 49. Some definitions about SHACL Shape: collection of targets and constraints components Targets: specify which nodes in the data graph must conform to a shape Constraint components: Determine how to validate a node Shape target declarations constraint components :UserShape a sh:NodeShape ; sh:targetNode :alice, :bob, :carol ; sh:nodeKind sh:IRI ; sh:property :hasName, :hasEmail . :hasName sh:path schema:name ; sh:minCount 1; sh:maxCount 1; sh:datatype xsd:string . . . .
  • 50. Validation Report The output of the validation process is a list of violation errors No errors  RDF conforms to shapes graph [ a sh:ValidationReport ; sh:conforms false ; sh:result [ a sh:ValidationResult ; sh:focusNode :bob ; sh:message "MinCount violation. Expected 1, obtained: 0" ; sh:resultPath schema:name ; sh:resultSeverity sh:Violation ; sh:sourceConstraintComponent sh:MinCountConstraintComponent ; sh:sourceShape :hasName ] ; ... [ a sh:ValidationReport ; sh:conforms true ].
  • 51. SHACL processor SHACL Processor Validation report :UserShape a sh:NodeShape ; sh:targetNode :alice, :bob, :carol ; sh:nodeKind sh:IRI ; sh:property :hasName, :hasEmail . :hasName sh:path schema:name ; sh:minCount 1; sh:maxCount 1; sh:datatype xsd:string . . . . Shapes graph :alice schema:name "Alice Cooper" ; schema:email <mailto:alice@mail.org>. :bob schema:name "Bob" ; schema:email <mailto:bob@mail.org> . :carol schema:name "Carol" ; schema:email <mailto:carol@mail.org> . Data Graph [ a sh:ValidationReport ; sh:conforms true ].
  • 52. SHACL Core built-in constraint components Type Constraints Cardinality minCount, maxCount Types of values class, datatype, nodeKind Values node, in, hasValue, property Range of values minInclusive, maxInclusive minExclusive, maxExclusive String based minLength, maxLength, pattern Language based languageIn, uniqueLang Logical constraints not, and, or, xone Closed shapes closed, ignoredProperties Property pair constraints equals, disjoint, lessThan, lessThanOrEquals Non-validating constraints name, description, order, group Qualified shapes qualifiedValueShape, qualifiedValueShapesDisjoint qualifiedMinCount, qualifiedMaxCount
  • 53. Longer example :AdultPerson EXTRA a { a [ schema:Person ] ; :name xsd:string ; :age MinInclusive 18 ; :gender [:Male :Female] OR xsd:string ; :address @:Address ? ; :worksFor @:Company + ; } :Address CLOSED { :addressLine xsd:string {1,3} ; :postalCode /[0-9]{5}/ ; :state @:State ; :city xsd:string } :Company { :name xsd:string ; :state @:State ; :employee @:AdultPerson * ; } :State /[A-Z]{2}/ :AdultPerson a sh:NodeShape ; sh:property [ sh:path rdf:type ; sh:qualifiedValueShape [ sh:hasValue schema:Person ]; sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1 ; ] ; sh:targetNode :alice ; sh:property [ sh:path :name ; sh:minCount 1; sh:maxCount 1; sh:datatype xsd:string ; ] ; sh:property [ sh:path :gender ; sh:minCount 1; sh:maxCount 1; sh:in (:Male :Female); ] ; sh:property [ sh:path :age ; sh:maxCount 1; sh:minInclusive 18 ] ; sh:property [ sh:path :address ; sh:node :Address ; sh:minCount 1 ; sh:maxCount 1 ] ; sh:property [ sh:path :worksFor ; sh:node :Company ; sh:minCount 1 ; sh:maxCount 1 ]. :Address a sh:NodeShape ; sh:closed true ; sh:property [ sh:path :addressLine; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 3 ] ; sh:property [ sh:path :postalCode ; sh:pattern "[0-9]{5}" ; sh:minCount 1 ; sh:maxCount 3 ] ; sh:property [ sh:path :city ; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 1 ] ; sh:property [ sh:path :state ; sh:node :State ; ] . :Company a sh:NodeShape ; sh:property [ sh:path :name ; sh:datatype xsd:string ] ; sh:property [ sh:path :state ; sh:node :State ] ; sh:property [ sh:path :employee ; sh:node :AdultPerson ; ] ;. :State a sh:NodeShape ; sh:pattern "[A-Z]{2}" . In ShEx In SHACL Its recursive!!! (not well defined SHACL) Implementation dependent feature Try it: https://tinyurl.com/ycl3mkzr
  • 54. More info about SHACL SHACL by example (slides): https://figshare.com/articles/SHACL_by_example/6449645 SHACL chapter at Validating RDF data book http://book.validatingrdf.com/bookHtml011.html
  • 55. Some challenges and perspectives Theoretical foundations of ShEx/SHACL Inference shapes from data Validation Usability RDF Stream validation Schema ecosystems Wikidata Solid
  • 56. Theoretical foundations of ShEx/SHACL Conversion between ShEx and SHACL SHaclEX library converts subsets of both Challenges Recursion and negation Performance and algorithmic complexity Detect useful subsets of the languages Convert to SPARQL Schema/data mapping Shapes Shapes ShEx SHACL
  • 57. Inference of Shapes from Data Useful use case in practice Knowledge Graph summarization Some prototypes: ShExer, RDFShape, ShapeArchitect Try it with RDFShape: https://tinyurl.com/y8pjcbyf Shape Expression generated for wd:Q51613194 Shapes RDF data infer
  • 58. Validation usability Learning from users Early adopters: WebIndex, HL7 FHIR, Eclipse Lyo, GenWiki,… Improve error information/visualization/navigation/repairing Authoring/visualization tools Propose annotation sets UI generation Error reporting/suggestion (SHOULD/MUST/…) Shapes
  • 59. RDF Stream validation Validation of RDF streams Challenges: Incremental validation Named graphs Addition/removal of triples Sensor Stream RDF data Shapes Validator Stream Validation results
  • 60. Schema ecosystems: Wikidata In May, 2019, Wikidata announced ShEx adoption New namespace for schemas Example: https://www.wikidata.org/wiki/EntitySchema:E2 It opens lots of opportunities/challenges Schema evolution and comparison
  • 61. Schema ecosystems: Solid project SOLID (SOcial Linked Data): Promoted by Tim Berners-Lee Goal: Re-decentralize the Web Separate data from apps Give users more control about their data Internally using linked data & RDF Shapes needed for interoperability See: https://ruben.verborgh.org/blog/2019/06/17/shaping-linked-data-apps/ Data pod Shape A Shape C Shape B App 1 App 2 App 3
  • 62. Conclusions RDF as a basis for knowledge graphs Explicit schemas can help improve data quality 2 languages proposed: ShEx/SHACL Lots of new challenges and opportunities

Editor's Notes

  1. ShEx: 20 lines of code SHACL: 60 lines of code