RDF data validation 2017 SHACL

Validation of RDF Data
Jean-Paul Calbimonte
University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis)
Zurich, December 2017
@jpcik

2
HES-SO:
University of Applied Sciences and Arts Western Switzerland
We are here,
In the heart of the Valais!

3
Thanks to José Emilio Labra et al.
RDF and Linked Data Validation (ESWC 2016)
materials adapted from:
http://weso.github.io/RDFValidation_ESWC16/

4
a reminder …

5
RDF
The University of Zurich is located in the city of Zurich.
A triple:
University of Zurich -- is located in --> City of Zurich
An RDF triple:
http://dbpedia.org/resource/University_of_Zurich http://dbpedia.org/ontology/city http://dbpedia.org/resource/Zürich
An RDF triple in N-Triples format:
<http://dbpedia.org/resource/University_of_Zurich> <http://dbpedia.org/ontology/city> <http://dbpedia.org/resource/Zürich> .
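As a rough illustration of the subject/predicate/object structure, the N-Triples statement above can be split with a small Python sketch. The function name parse_iri_triple and the simplified pattern are assumptions for illustration only. Real N-Triples also allows literals and blank nodes, which this sketch ignores.

```python
import re

# Simplified pattern: three IRIs in angle brackets followed by a dot.
# Literals and blank nodes are deliberately not handled here.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.$')

def parse_iri_triple(line):
    """Return (subject, predicate, object) for an IRI-only N-Triples line."""
    match = TRIPLE.match(line.strip())
    if match is None:
        raise ValueError("not an IRI-only N-Triples statement")
    return match.groups()

line = ("<http://dbpedia.org/resource/University_of_Zurich> "
        "<http://dbpedia.org/ontology/city> "
        "<http://dbpedia.org/resource/Zürich> .")
subject, predicate, obj = parse_iri_triple(line)
```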

6
RDF graphs
[Figure: RDF graph over the DBpedia resources University_of_Zurich, ETH_Zurich, Zürich, Doris_Leuthard, Merenschwand and President_of_the_Swiss_Confederation, and the class http://dbpedia.org/ontology/University.
Edge labels: http://dbpedia.org/ontology/city, http://dbpedia.org/property/established, http://dbpedia.org/ontology/facultySize, http://www.w3.org/1999/02/22-rdf-syntax-ns#type, http://dbpedia.org/ontology/almaMater, http://xmlns.com/foaf/0.1/givenName, http://xmlns.com/foaf/0.1/surname, http://dbpedia.org/ontology/birthDate, http://dbpedia.org/ontology/birthPlace, http://dbpedia.org/property/title.
Literal values: 1833 (xsd:integer), 3702 (xsd:integer), 1963-04-10 (xsd:date), "Doris" (xsd:string), "Leuthard" (xsd:string).]

7
RDF Turtle Format
@prefix db: <http://dbpedia.org/resource/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbp: <http://dbpedia.org/property/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
db:University_of_Zurich dbo:city db:Zürich ;
    dbp:established 1833 ;
    dbo:facultySize "3702"^^xsd:integer .
db:Doris_Leuthard dbo:almaMater db:University_of_Zurich ;
    dbo:birthDate "1963-04-10"^^xsd:date ;
    foaf:givenName "Doris" ;
    foaf:surname "Leuthard" ;
    dbo:birthPlace db:Merenschwand ;
    dbp:title db:President_of_the_Swiss_Confederation .
Callouts: URIs (e.g. db:University_of_Zurich), predicate URIs (e.g. dbo:city), literals (e.g. 1833, "Doris")

8
and now validation …

9
Why RDF Validation?
• Understand data contents
• Expected structure
• Describe data requirements
• Data guarantees
• Verifiable structure/contents
• Query processing
• Optimizations
In other technologies:
RDB: DDL
XML: DTD/XML Schema
JSON: JSON Schema

10
Use OWL?
ex:almaMater
a owl:ObjectProperty ;
rdfs:domain ex:Person ;
rdfs:range ex:University .
Different purpose
Different level of abstraction
OWL -> ontology modeling

11
Validation Alternatives
• SPARQL queries
• SPIN
• Stardog ICV (based on OWL)
• OSLC Resource shapes
• RDFUnit
• RDF data descriptors
• ShEx expressions
what we will see today:
SHACL: W3C Recommendation (July 2017)

12
Shapes Constraint Language SHACL
https://www.w3.org/TR/shacl/
• Language for validating RDF graphs
• Conditions represented as shapes
• Shapes expressed in RDF
• SPARQL-based extensions
• W3C Recommendation

13
SHACL Basics
Shape: how to validate a focus node based on:
- values of properties
- other characteristics
Node Shape: shapes about the focus node
Property Shape: shapes about the values of a property/path
Focus Node: an RDF term that is validated against a shape
Target: target declarations can be used to produce focus nodes for a shape
Constraint components: determine how to validate a node

14
SHACL: an example
ex:CityShape
    a sh:NodeShape ;                # it is a node shape
    sh:targetClass ex:City ;        # applies to all cities
    sh:property [
        sh:path ex:population ;     # constrains the values of ex:population
        sh:maxCount 1 ;             # max 1 population
        sh:datatype xsd:integer ;   # of type integer
    ] .
e.g. "all cities have at most one population property of type integer"
ex:London a ex:City ;
    ex:population "two million" .   # violation: not an integer
ex:Paris a ex:City ;
    ex:population 2304 ;
    ex:population 5342 .            # violation: more than one population
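To make the outcome of this example concrete, here is a minimal Python sketch of the two checks. It is not a SHACL engine: the dictionary layout and the function check_city are assumptions for illustration, with Python int standing in for xsd:integer and str for a plain string literal.

```python
# Toy data: focus node -> property -> list of values.
cities = {
    "ex:London": {"ex:population": ["two million"]},
    "ex:Paris": {"ex:population": [2304, 5342]},
}

def check_city(props):
    """Return the names of the violated constraints for one focus node."""
    values = props.get("ex:population", [])
    violations = []
    if len(values) > 1:                      # sh:maxCount 1
        violations.append("maxCount")
    if any(not isinstance(v, int) or isinstance(v, bool) for v in values):
        violations.append("datatype")        # sh:datatype xsd:integer
    return violations

report = {node: check_city(props) for node, props in cities.items()}
```

In this sketch ex:London fails the datatype check and ex:Paris fails the cardinality check.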

15
Targets
Declare the focus nodes for a shape
Node target: directly declare nodes
ex:CityShape
    a sh:NodeShape ;
    sh:targetNode ex:Zurich .
ex:London a ex:City .
ex:Zurich a ex:City .
Class target: nodes with a given type
ex:CityShape
    a sh:NodeShape ;
    sh:targetClass ex:City .
Implicit class target: same, but implicit
ex:City
    a rdfs:Class, sh:NodeShape .
Example data:
ex:Luzern a ex:City .
ex:Olten a ex:City .
ex:Valais a ex:Canton .
ex:SwissCity
    rdfs:subClassOf ex:City .
ex:Basel a ex:SwissCity .
ex:Munich a ex:GermanCity .
ex:Lausanne a ex:City .
ex:Limmat a ex:River .
Other targets (sh:targetSubjectsOf, sh:targetObjectsOf): see docs.
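A class target selects every SHACL instance of the class, which includes instances of its subclasses. The Python sketch below illustrates this on the example data. It assumes one declared type per node and at most one superclass per class, and the helper is_instance_of is hypothetical.

```python
types = {
    "ex:Luzern": "ex:City", "ex:Olten": "ex:City", "ex:Valais": "ex:Canton",
    "ex:Basel": "ex:SwissCity", "ex:Munich": "ex:GermanCity",
    "ex:Lausanne": "ex:City", "ex:Limmat": "ex:River",
}
subclass_of = {"ex:SwissCity": "ex:City"}   # rdfs:subClassOf

def is_instance_of(node, target_class):
    """Follow rdfs:subClassOf links upward from the node's declared type."""
    cls = types.get(node)
    seen = set()
    while cls is not None and cls not in seen:
        if cls == target_class:
            return True
        seen.add(cls)
        cls = subclass_of.get(cls)
    return False

# Focus nodes produced by sh:targetClass ex:City
focus_nodes = sorted(n for n in types if is_instance_of(n, "ex:City"))
```

Here ex:Basel becomes a focus node through ex:SwissCity, while ex:Munich does not because ex:GermanCity is not declared as a subclass of ex:City.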

16
Node Shapes
:University
a sh:NodeShape ;
sh:nodeKind sh:IRI .
:epfl a :University.
<http://example.ch/unifr> a :University .
_:1 a :University .
Constraints about a focus node
Possible values of sh:nodeKind:
sh:BlankNode, sh:IRI, sh:Literal, sh:BlankNodeOrIRI, sh:BlankNodeOrLiteral, sh:IRIOrLiteral

17
Property Shapes
Constraints about a given property and its values for the focus node
- sh:property associates a shape with a property shape
- sh:path identifies the path
:Student a sh:NodeShape ;
sh:property [
sh:path ex:email;
sh:nodeKind sh:IRI
] .
:anna a :Student ;
ex:email <mailto:anna@uzh.ch> .
:max a :Student ;
ex:email <mailto:max@uzh.ch> .
:greta a :Student ;
ex:email "greta@uzh.ch" .

18
Disclaimer:
target declarations sometimes
omitted in the following examples

19
Core constraint components
Value Type: class, datatype, nodeKind
Cardinality: minCount, maxCount
Value Range: minInclusive, maxInclusive, minExclusive, maxExclusive
String-based: minLength, maxLength, pattern, languageIn, uniqueLang
Property Pair: equals, disjoint, lessThan, lessThanOrEquals
Logical: not, and, or, xone
Shape-based: node, property, qualifiedValueShape, qualifiedMinCount,
qualifiedMaxCount
On values: in, hasValue
Closed shapes: closed, ignoredProperties
Non-validating: name, description, order, group, defaultValue
SPARQL: sparql

20
Value Type Constraints: Datatype
sh:datatype: condition to be satisfied for the datatype of each value node.
:University a sh:NodeShape ;
sh:property [
sh:path ex:established;
sh:datatype xsd:date ;
] .
:hes-so ex:established "1997-01-20"^^xsd:date .
:eth ex:established "Unknown"^^xsd:date .
:uzh ex:established 1990 .

21
Value Type Constraints: Class
sh:class condition: each value node is a SHACL instance of a given type.
:Person
a sh:NodeShape, rdfs:Class ;
sh:property [
sh:path ex:almaMater ;
sh:class :University
] .
:unifr a :University .
:eth a :FederalSchool .
:unibe a :CantonalUniversity .
:FederalSchool rdfs:subClassOf :University .
:anna a :Person;
ex:almaMater :unifr .
:max a :Person ;
ex:almaMater :eth .
:greta a :Person;
ex:almaMater :unibe .

22
Value Type Constraints: Kind
sh:nodeKind: condition to be satisfied by the RDF node kind
:Student
a sh:NodeShape, rdfs:Class ;
sh:property [
sh:path ex:name ;
sh:nodeKind sh:Literal ;
];
sh:property [
sh:path ex:friendOf ;
sh:nodeKind sh:BlankNodeOrIRI
];
sh:nodeKind sh:IRI .
:anna a :Student;
ex:name _:1 ;
ex:friendOf :max .
:max a :Student;
ex:name "Max";
ex:friendOf [ex:name "Lucas"] .
:greta a :Student;
ex:name "Greta" ;
ex:friendOf "Lucas" .
_:1 a :Student.
Possible kinds: BlankNode, IRI, Literal, BlankNodeOrIRI, BlankNodeOrLiteral, IRIOrLiteral

23
Cardinality constraints
sh:minCount: minimum number of value nodes that satisfy the condition
sh:maxCount: maximum number of value nodes that satisfy the condition.
sh:property [
sh:path ex:hasCourse ;
sh:minCount 2 ;
sh:maxCount 3 ;
] .
:anna ex:hasCourse
:math, :physics .
:max ex:hasCourse
:chemistry .
:greta ex:hasCourse
:math, :physics,
:chemistry, :history .
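The cardinality check boils down to counting value nodes. A minimal Python sketch, with a hypothetical conforms helper and the course lists copied from the example:

```python
courses = {
    ":anna": [":math", ":physics"],
    ":max": [":chemistry"],
    ":greta": [":math", ":physics", ":chemistry", ":history"],
}

def conforms(values, min_count=2, max_count=3):
    """sh:minCount 2 and sh:maxCount 3 on the number of value nodes."""
    return min_count <= len(values) <= max_count

result = {student: conforms(values) for student, values in courses.items()}
```

Only :anna conforms: :max has too few courses and :greta too many.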

24
Value Range Constraints
Value range conditions for value nodes that are comparable via operators
such as <, <=, > and >=. sh:minInclusive, sh:maxInclusive,
sh:minExclusive, sh:maxExclusive
:Grade a sh:NodeShape ;
sh:property [
sh:path ex:gradeValue ;
sh:minInclusive 1 ;
sh:maxInclusive 5 ;
sh:datatype xsd:integer
] .
:failure ex:gradeValue 1 .
:sufficient ex:gradeValue 3 .
:excellent ex:gradeValue 5 .
:toobad ex:gradeValue 0 .

25
String-based Constraints
Specify conditions on the string representation of value nodes.
sh:minLength: minimum string length of each value node.
sh:maxLength: maximum string length of each value node.
sh:pattern: regular expression that each value node matches.
sh:languageIn: allowed language tags for each value node.
sh:uniqueLang: no pair of value nodes may use the same language tag.

26
minLength/maxLength
sh:property [
sh:path ex:name ;
sh:minLength 4 ;
sh:maxLength 10 ;
] .
:anna ex:name "Anna" .
:max ex:name "Max" .
:greta ex:name :Greta .
:strange ex:name _:strange .
sh:minLength: minimum string length of each value node.
sh:maxLength: maximum string length of each value node.

27
pattern
sh:property [
sh:path ex:studentID ;
sh:pattern "^P\\d{3,4}" ;
sh:flags "i" ;
] .
:anna ex:studentID "P2345" .
:max ex:studentID "p567" .
:greta ex:studentID "P12" .
:lara ex:studentID "B123" .
sh:pattern: regular expression that each value node matches.
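The effect of the pattern and the "i" flag can be tried out in Python. This sketch assumes the intended expression is ^P\d{3,4} (a "P" followed by three or four digits) and uses Python's re module as a stand-in for the SPARQL REGEX function that SHACL relies on, so corner cases may differ.

```python
import re

# re.IGNORECASE plays the role of sh:flags "i".
pattern = re.compile(r"^P\d{3,4}", re.IGNORECASE)

ids = {":anna": "P2345", ":max": "p567", ":greta": "P12", ":lara": "B123"}
matches = {student: bool(pattern.search(value)) for student, value in ids.items()}
```

:anna and :max match (the latter thanks to the case-insensitive flag), while :greta has too few digits and :lara starts with the wrong letter.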

28
languageIn
ex:SwissLangShape a sh:NodeShape ;
sh:targetNode ex:Mountain, ex:Berg ;
sh:property [
sh:path ex:prefLabel ;
sh:languageIn ( "en" "fr" ) ;
] .
ex:Mountain
ex:prefLabel "Mountain"@en ;
ex:prefLabel "Hill"@en-UK ;
ex:prefLabel "Montagne"@fr .
ex:Berg
ex:prefLabel "Berg" ;
ex:prefLabel "Berg"@de ;
ex:prefLabel ex:BergLabel .
sh:languageIn: allowed language tags for each value node.
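sh:languageIn compares language tags with the same basic range matching as SPARQL langMatches, so "en" also accepts "en-UK". A minimal Python sketch of that rule (the helper names lang_matches and allowed are assumptions, and an empty string stands for a literal without a language tag):

```python
def lang_matches(tag, language_range):
    """Basic range matching: exact match or a '-'-separated prefix."""
    tag, language_range = tag.lower(), language_range.lower()
    return tag == language_range or tag.startswith(language_range + "-")

def allowed(tag, ranges=("en", "fr")):
    # A value without a language tag (tag == "") never conforms.
    return tag != "" and any(lang_matches(tag, r) for r in ranges)

tags = {"Mountain": "en", "Hill": "en-UK", "Montagne": "fr",
        "Berg (plain)": "", "Berg": "de"}
result = {label: allowed(tag) for label, tag in tags.items()}
```

All three labels of ex:Mountain pass. For ex:Berg the plain literal and the "de" literal fail, and the IRI ex:BergLabel fails as well because it is not a literal at all.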

29
uniqueLang
:Canton a sh:NodeShape ;
sh:property [
sh:path ex:name ;
sh:uniqueLang true
] .
:valais ex:name
"Valais"@fr, "Wallis"@de .
:fribourg ex:name
"Fribourg"@fr,
"Freiburg"@de,
"Friburgo"@es .
:zurich ex:name
"Zurich"@de, "Zuerich"@de.
sh:uniqueLang: no pair of value nodes may use the same language tag.
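Checking sh:uniqueLang amounts to counting language tags per focus node. A small Python sketch over the tags from the example (the function duplicated_tags is hypothetical):

```python
from collections import Counter

names = {
    ":valais": ["fr", "de"],
    ":fribourg": ["fr", "de", "es"],
    ":zurich": ["de", "de"],
}

def duplicated_tags(tags):
    """Return the language tags used by more than one value node."""
    counts = Counter(tag for tag in tags if tag)   # untagged literals are ignored
    return sorted(tag for tag, n in counts.items() if n > 1)

duplicates = {canton: duplicated_tags(tags) for canton, tags in names.items()}
```

Only :zurich is reported, because two of its names carry the tag "de".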

30
Property Pair Constraints
Specify conditions on the sets of value nodes in relation to other properties.
sh:equals: the set of value nodes is equal to the set of values of the given property at the focus node
sh:disjoint: the set of value nodes is disjoint from the values of the given property at the focus node
sh:lessThan: each value node is smaller than all values of the given property at the focus node
sh:lessThanOrEquals: same, but smaller than or equal

31
equals
sh:property [
sh:path ex:givenName ;
sh:equals ex:firstName
];
:anna ex:givenName "Anna";
ex:lastName "Parker";
ex:firstName "Anna" .
:max ex:givenName "Max";
ex:lastName "Sutter" ;
ex:firstName "Maximilian" .
:greta ex:givenName "Greta";
ex:lastName "Greta" ;
ex:firstName "Greta" .
sh:equals: the set of value nodes is equal to the set of values of the given property at the focus node

32
disjoint
sh:property [
sh:path ex:givenName ;
sh:disjoint ex:lastName
] .
:anna ex:givenName "Anna";
ex:lastName "Parker";
ex:firstName "Anna" .
:max ex:givenName "Max";
ex:lastName "Sutter" ;
ex:firstName "Maximilian" .
:greta ex:givenName "Greta";
ex:lastName "Greta" ;
ex:firstName "Greta" .
sh:disjoint: the set of value nodes is disjoint from the values of the given property at the focus node
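Both sh:equals and sh:disjoint compare sets of values at the same focus node, which maps directly onto Python sets. A minimal sketch over the three people from the examples (the helper names are assumptions):

```python
people = {
    ":anna": {"givenName": {"Anna"}, "lastName": {"Parker"}, "firstName": {"Anna"}},
    ":max": {"givenName": {"Max"}, "lastName": {"Sutter"}, "firstName": {"Maximilian"}},
    ":greta": {"givenName": {"Greta"}, "lastName": {"Greta"}, "firstName": {"Greta"}},
}

def equals_ok(person):
    """sh:path ex:givenName ; sh:equals ex:firstName"""
    return person["givenName"] == person["firstName"]

def disjoint_ok(person):
    """sh:path ex:givenName ; sh:disjoint ex:lastName"""
    return person["givenName"].isdisjoint(person["lastName"])

equals_result = {name: equals_ok(p) for name, p in people.items()}
disjoint_result = {name: disjoint_ok(p) for name, p in people.items()}
```

:max violates sh:equals ("Max" differs from "Maximilian"), and :greta violates sh:disjoint (given name and last name are both "Greta").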

33
lessThan/lessThanOrEquals
ex:LessThanShape a sh:NodeShape ;
sh:property [
sh:path ex:startDate ;
sh:lessThan ex:endDate ;
] .
:project
ex:startDate "2017-01-02"^^xsd:date ;
ex:endDate "2015-01-02"^^xsd:date .
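sh:lessThan requires every value of the path to be smaller than every value of the other property. A short Python sketch with the dates from the example (the function less_than_ok and the dictionary layout are assumptions):

```python
from datetime import date

project = {
    "startDate": [date(2017, 1, 2)],
    "endDate": [date(2015, 1, 2)],
}

def less_than_ok(node, path="startDate", other="endDate"):
    """Every value of `path` must be smaller than every value of `other`."""
    return all(a < b for a in node[path] for b in node[other])

project_conforms = less_than_ok(project)
```

The project does not conform, since its start date lies after its end date.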

34
Logical Constraints
Implement the common logical operators and, or, not, and xone (similar to XOR)
sh:and: Conjunction of a list of shapes
sh:or: Disjunction of a list of shapes
sh:not: Negation of a shape
sh:xone: Exactly one (similar to XOR for two arguments)

35
not
ex:NotShape a sh:NodeShape ;
sh:targetNode :anna ;
sh:not [
a sh:PropertyShape ;
sh:path ex:established ;
sh:minCount 1 ;
] .
:anna ex:established "Some value" .
sh:not: Negation of a shape

36
and
ex:Shape1 a sh:NodeShape ;
sh:property [
sh:path ex:courses ;
sh:minCount 1 ;
] .
ex:Shape2 a sh:NodeShape ;
sh:targetNode :anna, :max ;
sh:and (
ex:Shape1
[ sh:path ex:courses ;
sh:maxCount 1 ; ]
) .
:anna ex:courses "Math" .
:max ex:courses "Math" ;
ex:courses "Chemistry" .
sh:and: Conjunction of a list of shapes

37
or
ex:OrShape a sh:NodeShape ;
sh:targetNode :anna, :max ;
sh:or (
[ sh:path ex:firstName ;
sh:minCount 1 ; ]
[ sh:path ex:givenName ;
sh:minCount 1 ; ]
) .
:anna ex:firstName "Anna" .
:max ex:givenName "Max" .

38
or
ex:AddressShape a sh:NodeShape ;
sh:targetClass ex:Student ;
sh:property [
sh:path ex:address ;
sh:or (
[ sh:datatype xsd:string ; ]
[ sh:class ex:Address ; ]
)
] .
:anna
ex:address "12 Petit Rue, 1220, Geneva" .
:max ex:address :maxAddress .
:maxAddress a ex:Address ;
ex:street "Grand Rue" ;
ex:zip 3960 ;
ex:locality ex:Sierre .

39
xone
ex:XoneShape a sh:NodeShape ;
sh:targetClass ex:Person ;
sh:xone (
[ sh:property
[ sh:path ex:fullName ;
sh:minCount 1 ; ]
]
[ sh:property
[ sh:path ex:firstName ;
sh:minCount 1 ; ] ;
sh:property
[ sh:path ex:lastName ;
sh:minCount 1 ; ]
]
) .
ex:Bob a ex:Person ;
ex:firstName "Robert" ;
ex:lastName "Coin" .
ex:Carla a ex:Person ;
ex:fullName "Carla Miller" .
ex:Dory a ex:Person ;
ex:firstName "Dory" ;
ex:lastName "Dunce" ;
ex:fullName "Dory Dunce" .
sh:xone: conforms to exactly one of the provided shapes.
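The "exactly one" rule can be sketched in Python by counting how many of the listed shapes a node satisfies. Each person is reduced here to the set of properties it has. The helper names are hypothetical.

```python
people = {
    "ex:Bob": {"firstName", "lastName"},
    "ex:Carla": {"fullName"},
    "ex:Dory": {"firstName", "lastName", "fullName"},
}

def has_full_name(props):
    return "fullName" in props

def has_first_and_last(props):
    return {"firstName", "lastName"} <= props

def xone_ok(props):
    # Exactly one of the listed shapes may be satisfied.
    return sum([has_full_name(props), has_first_and_last(props)]) == 1

result = {person: xone_ok(props) for person, props in people.items()}
```

ex:Bob and ex:Carla each satisfy one alternative. ex:Dory satisfies both and therefore does not conform.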

40
Shape-based Constraints
Specify complex conditions by validating the value nodes against certain shapes.
sh:node: each value node conforms to the given node shape.
sh:property: specify that each value node has a given property shape.
sh:qualifiedValueShape: a specified number of value nodes conforms to a given shape.
Each sh:qualifiedValueShape can have:
- one value for sh:qualifiedMinCount
- one value for sh:qualifiedMaxCount
- or one value for each

41
node
ex:AddressShape a sh:NodeShape ;
sh:property [
sh:path ex:postalCode ;
sh:datatype xsd:string ;
sh:maxCount 1 ;
] .
ex:PersonShape a sh:NodeShape ;
sh:targetClass ex:Person ;
sh:property [
sh:path ex:address ;
sh:minCount 1 ;
sh:node ex:AddressShape ;
] .
ex:Bob a ex:Person ;
ex:address ex:BobsAddress .
ex:BobsAddress ex:postalCode "1234" .
ex:Reto a ex:Person ;
ex:address ex:RetosAddress .
ex:RetosAddress ex:postalCode 5678 .
sh:node: each value node conforms to the given node shape.

42
property, qualifiedValueShape
ex:QualifiedShape a sh:NodeShape ;
sh:targetNode ex:anna, ex:max ;
sh:property [
sh:path ex:parent ;
sh:minCount 2 ;
sh:maxCount 2 ;
sh:qualifiedValueShape [
sh:path ex:gender ;
sh:hasValue ex:female ; ] ;
sh:qualifiedMinCount 1 ;
] .
ex:John ex:gender ex:male .
ex:Jane ex:gender ex:female .
ex:Tim ex:gender ex:male .
ex:anna
ex:parent ex:John ;
ex:parent ex:Jane .
ex:max
ex:parent ex:John ;
ex:parent ex:Tim .
sh:property: specify that each value node has a given property shape.
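In this example the plain counts apply to all parents, while the qualified count only applies to parents that match the nested shape (gender ex:female). A minimal Python sketch of that distinction, using a hypothetical conforms helper:

```python
gender = {"ex:John": "ex:male", "ex:Jane": "ex:female", "ex:Tim": "ex:male"}
parents = {"ex:anna": ["ex:John", "ex:Jane"], "ex:max": ["ex:John", "ex:Tim"]}

def conforms(child):
    values = parents[child]
    count_ok = len(values) == 2        # sh:minCount 2 ; sh:maxCount 2
    # Value nodes that conform to the qualified value shape
    female = sum(1 for p in values if gender.get(p) == "ex:female")
    return count_ok and female >= 1    # sh:qualifiedMinCount 1

result = {child: conforms(child) for child in parents}
```

ex:anna conforms, while ex:max has two parents but none with gender ex:female.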

43
Constraints on values: hasValue
sh:hasValue: at least one value node is equal to the given RDF term.
ex:ETHGraduate a sh:NodeShape ;
sh:targetNode :anna ;
sh:property [
sh:path ex:alumniOf ;
sh:hasValue ex:ETH ;
] .
:anna ex:alumniOf ex:EPFL ;
ex:alumniOf ex:ETH .

44
Constraints on values: in
sh:in: each value node is a member of a provided SHACL list.
ex:InShape a sh:NodeShape ;
sh:targetClass ex:SkiSlope ;
sh:property [
sh:path ex:difficulty ;
sh:in ( ex:Black ex:Blue ex:Red ) ;
] .
ex:slope1 a ex:SkiSlope;
ex:difficulty ex:Pink .
ex:slope2 a ex:SkiSlope;
ex:difficulty ex:Red .

45
Closed shapes
sh:closed: set to true to close the shape.
sh:ignoredProperties: optional list of properties that are also permitted in addition to those explicitly enumerated via sh:property.
ex:ClosedShape a sh:NodeShape ;
sh:targetNode ex:Alice, ex:Bob ;
sh:closed true ;
sh:ignoredProperties (rdf:type) ;
sh:property [ sh:path ex:firstName ; ] ;
sh:property [ sh:path ex:lastName ; ] .
ex:Alice ex:firstName "Alice" .
ex:Bob ex:firstName "Bob" ;
ex:middleInitial "J" .
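A closed shape rejects any property that is neither listed via sh:property nor named in sh:ignoredProperties. The Python sketch below mirrors this with a set difference. The function unexpected and the data layout are assumptions.

```python
# Paths declared via sh:property plus the sh:ignoredProperties list
allowed = {"ex:firstName", "ex:lastName", "rdf:type"}

data = {
    "ex:Alice": {"ex:firstName"},
    "ex:Bob": {"ex:firstName", "ex:middleInitial"},
}

def unexpected(props):
    """Properties used by the node that the closed shape does not allow."""
    return sorted(props - allowed)

violations = {node: unexpected(props) for node, props in data.items()}
```

ex:Alice conforms, and ex:Bob is rejected because of ex:middleInitial.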

46
Non-validating constraints
sh:name: provide human-readable labels for the property.
sh:description: provide descriptions of the property in the given context.
sh:order: indicate the relative order of the property shape for purposes
such as form building.
sh:group: indicate that the shape belongs to a group of related property
shapes.
Property shapes may have a single sh:defaultValue. The default value does not have fixed semantics in SHACL.

47
Non-validating constraints
ex:PersonFormShape a sh:NodeShape ;
sh:property [
sh:path ex:firstName ;
sh:name "first name" ;
sh:description "The given name(s)" ;
sh:order 0 ;
sh:group ex:NameGroup ; ] ;
sh:property [
sh:path ex:lastName ;
sh:name "last name" ;
sh:description "The last name" ;
sh:order 1 ;
sh:group ex:NameGroup ; ] ;
sh:property [
sh:path ex:streetAddress ;
sh:name "street address" ;
sh:description "The street address" ;
sh:order 11 ;
sh:group ex:AddressGroup ; ] ;
sh:property [
sh:path ex:locality ;
sh:name "locality" ;
sh:description "The town or city " ;
sh:order 12 ;
sh:group ex:AddressGroup ; ] ;
sh:property [
sh:path ex:postalCode ;
sh:name "postal code" ;
sh:name "zip code"@en-US ;
sh:description "The postal code" ;
sh:order 13 ;
sh:group ex:AddressGroup ; ] .
ex:NameGroup a sh:PropertyGroup ;
sh:order 0 ;
rdfs:label "Name" .
ex:AddressGroup a sh:PropertyGroup ;
sh:order 1 ;
rdfs:label "Address" .

48
SPARQL
PREFIX ex: <http://example.com#>
SELECT ?name ?university
WHERE {
?student ex:lastName ?name ;
ex:attends ?university .
?university ex:name ?uniname .
FILTER (langMatches(lang(?uniname), "fr"))
}

49
SPARQL-based constraints
ex:LanguageShape a sh:NodeShape ;
sh:targetClass ex:Country ;
sh:sparql [
a sh:SPARQLConstraint ;
sh:message "Values are literals with German language tag." ;
sh:prefixes ex: ;
sh:select """ SELECT $this (ex:germanLabel AS ?path) ?value
WHERE {
$this ex:germanLabel ?value .
FILTER (!isLiteral(?value) ||
!langMatches(lang(?value), "de"))
} """ ;
] .
ex:country1 a ex:Country ;
ex:germanLabel "Spanien"@de .
ex:country2 a ex:Country ;
ex:germanLabel "Spain"@en .

50
Shape Messages
ex:MyShape a sh:NodeShape ;
sh:targetNode ex:MyInstance ;
sh:property [
sh:path ex:myProperty ;
sh:minCount 1 ;
sh:severity sh:Warning ; ] ;
sh:property [
sh:path ex:myProperty ;
sh:maxLength 10 ;
sh:message "Too many characters"@en ;
sh:message "Zu viele Zeichen"@de ; ] ;
sh:deactivated true .

51
ShEx
Shape expressions: http://shex.io/
ShEx:
:User IRI {
    schema:name xsd:string
}
SHACL:
:User a sh:NodeShape, rdfs:Class ;
    sh:targetClass :Person ;
    sh:nodeKind sh:IRI ;
    sh:property [
        sh:path schema:name ;
        sh:datatype xsd:string
    ] .

52
ShEx
:User {
    schema:givenName xsd:string ;
    schema:lastName xsd:string
}
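:User a sh:NodeShape ;
    sh:property [
        sh:path schema:givenName ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1
    ] ;
    sh:property [
        sh:path schema:lastName ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1
    ] .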
:alice schema:givenName "Alice" ;
schema:lastName "Cooper" .
:bob schema:givenName "Bob", "Robert" ;
schema:lastName "Smith", "Dylan" .
:carol schema:lastName "King" .
:dave schema:givenName 23;
schema:lastName :Unknown .

53
A lot more info in the SHACL and ShEx docs

Thank you! Any questions?
Jean-Paul Calbimonte
University of Applied Sciences and Arts Western Switzerland
HES-SO Valais-Wallis
@jpcik
