Although RDF is a corner stone of semantic web and knowledge graphs, it has not been embraced by everyday programmers and software architects who need to safely create and access well-structured data. There is a lack of common tools and methodologies that are available in more conventional settings to improve data quality by defining schemas that can later be validated. Two technologies have recently been proposed for RDF validation: Shape Expressions (ShEx) and Shapes Constraint Language (SHACL). In the talk, we will review the history and motivation of both technologies. We will also and enumerate some challenges and future work with regards to RDF validation.
Semantic Web technologies (such as RDF and SPARQL) excel at bringing together diverse data in a world of independent data publishers and consumers. Common ontologies help to arrive at a shared understanding of the intended meaning of data.
However, they don’t address one critically important issue: What does it mean for data to be complete and/or valid? Semantic knowledge graphs without a shared notion of completeness and validity quickly turn into a Big Ball of Data Mud.
The Shapes Constraint Language (SHACL), an upcoming W3C standard, promises to help solve this problem. By keeping semantics separate from validity, SHACL makes it possible to resolve a slew of data quality and data exchange issues.
Presented at the Lotico Berlin Semantic Web Meetup.
Comparison of features between ShEx (Shape Expressions) and SHACL (Shapes Constraint Language)
Changelog:
11/06/17
- Removed slides about compositionality
31/May/2017
- Added slide 30 about validation report
- Added slide 32 about stems
- Changed slides 7 and 8 adapting compact syntax to new operator .
23/05/2017:
Slide 14: Repaired typos in typos in sh:entailment, rdfs:range
21/05/2017:
- Slide 8. Changed the example to be an IRI and a datatype
- Added typically in slide 9
- Slide 10: Removed the phrase: "Target declarations can problematic when reusing/importing shapes"
and created slide 27 to talk about reuability
- Added slide 11 to talk about the differences in triggering validation
- Created slide 14 to talk about inference
- Renamed slide 15 as "Inference and triggering mechanism"
- Added slides 27 and 28 to talk about reuability
- Added slide 29 to talk about annotations
18/05/2017
- Slides 9 now includes an example using ShEx RDF vocabulary
- Slide 10 now says that target declarations are optional
- Slide 13 now says that some RDF Schema terms have special treatment in SHACL
- Example in slide 18 now uses sh:or instead of sh:and
- Added slides 22, 23 and 24 which show some features supported by SHACL but not supported by ShEx (property pair constraints, uniqueLang and owl:imports)
Semantic Web technologies (such as RDF and SPARQL) excel at bringing together diverse data in a world of independent data publishers and consumers. Common ontologies help to arrive at a shared understanding of the intended meaning of data.
However, they don’t address one critically important issue: What does it mean for data to be complete and/or valid? Semantic knowledge graphs without a shared notion of completeness and validity quickly turn into a Big Ball of Data Mud.
The Shapes Constraint Language (SHACL), an upcoming W3C standard, promises to help solve this problem. By keeping semantics separate from validity, SHACL makes it possible to resolve a slew of data quality and data exchange issues.
Presented at the Lotico Berlin Semantic Web Meetup.
Comparison of features between ShEx (Shape Expressions) and SHACL (Shapes Constraint Language)
Changelog:
11/06/17
- Removed slides about compositionality
31/May/2017
- Added slide 30 about validation report
- Added slide 32 about stems
- Changed slides 7 and 8 adapting compact syntax to new operator .
23/05/2017:
Slide 14: Repaired typos in typos in sh:entailment, rdfs:range
21/05/2017:
- Slide 8. Changed the example to be an IRI and a datatype
- Added typically in slide 9
- Slide 10: Removed the phrase: "Target declarations can problematic when reusing/importing shapes"
and created slide 27 to talk about reuability
- Added slide 11 to talk about the differences in triggering validation
- Created slide 14 to talk about inference
- Renamed slide 15 as "Inference and triggering mechanism"
- Added slides 27 and 28 to talk about reuability
- Added slide 29 to talk about annotations
18/05/2017
- Slides 9 now includes an example using ShEx RDF vocabulary
- Slide 10 now says that target declarations are optional
- Slide 13 now says that some RDF Schema terms have special treatment in SHACL
- Example in slide 18 now uses sh:or instead of sh:and
- Added slides 22, 23 and 24 which show some features supported by SHACL but not supported by ShEx (property pair constraints, uniqueLang and owl:imports)
Enterprise systems are increasingly complex, often requiring data and software components to be accessed and maintained by different company departments. This complexity often becomes an organization’s biggest challenge as changing data fields and adding new applications rapidly grow to meet business demands for increased customer insights.
These slides are from a Webinar discussing how using SHACL and JSON-LD with AllegroGraph helps our customers simplify the complexity of enterprise systems through the ability to loosely combine independent elements, while allowing the overall system to function smoothly.
In this Webinar we will demonstrate how AllegroGraph’s SHACL validation engine confirms whether JSON-LD data is conforming to the desired requirements. We will describe how SHACL provides a way for a Data Graph to specify the Shapes Graph that should be used for validation and describes how a given shape is linked to targets in the data.
The recording is at youtube.com/allegrograph
JSON-LD is a set of W3C standards track specifications for representing Linked Data in JSON. It is fully compatible with the RDF data model, but allows developers to work with data entirely within JSON.
More information on JSON-LD can be found at http://json-ld.org/
Understanding RDF: the Resource Description Framework in Context (1999)Dan Brickley
Dan Brickley, 3rd European Commission Metadata Workshop, Luxemburg, April 12th 1999
Understanding RDF: the Resource Description Framework in Context
http://ilrt.org/discovery/2001/01/understanding-rdf/
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODOChris Mungall
NOTE THAT I HAVE MOVED AWAY FROM SLIDESHARE TO ZENODO
The identical presentation is now here:
https://doi.org/10.5281/zenodo.7778641
General introduction to LinkML, The Linked Data Modeling Language.
Adapter from presentation given to NIH May 2022
https://linkml.io/linkml
SPARQL introduction and training (130+ slides with exercices)Thomas Francart
Full SPARQL training
Covers all SPARQL : basic graph patterns, FILTERs, functions, property paths, optional, negation, assignation, aggregation, subqueries, federated queries.
Does not cover except SPARQL updates.
Includes exercices on DBPedia.
CC BY license
Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...Jeff Z. Pan
Tutorial on "Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge Graphs" presented at the 4th Joint International Conference on Semantic Technologies (JIST2014)
Two graph data models : RDF and Property Graphsandyseaborne
Talk given at ApacheConEU Big Data 2015.
This talk describes the two common graph data approaches, RDF and Property Graphs. It concludes with observations about the different emphasis of each and where each is focused.
Enterprise systems are increasingly complex, often requiring data and software components to be accessed and maintained by different company departments. This complexity often becomes an organization’s biggest challenge as changing data fields and adding new applications rapidly grow to meet business demands for increased customer insights.
These slides are from a Webinar discussing how using SHACL and JSON-LD with AllegroGraph helps our customers simplify the complexity of enterprise systems through the ability to loosely combine independent elements, while allowing the overall system to function smoothly.
In this Webinar we will demonstrate how AllegroGraph’s SHACL validation engine confirms whether JSON-LD data is conforming to the desired requirements. We will describe how SHACL provides a way for a Data Graph to specify the Shapes Graph that should be used for validation and describes how a given shape is linked to targets in the data.
The recording is at youtube.com/allegrograph
JSON-LD is a set of W3C standards track specifications for representing Linked Data in JSON. It is fully compatible with the RDF data model, but allows developers to work with data entirely within JSON.
More information on JSON-LD can be found at http://json-ld.org/
Understanding RDF: the Resource Description Framework in Context (1999)Dan Brickley
Dan Brickley, 3rd European Commission Metadata Workshop, Luxemburg, April 12th 1999
Understanding RDF: the Resource Description Framework in Context
http://ilrt.org/discovery/2001/01/understanding-rdf/
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODOChris Mungall
NOTE THAT I HAVE MOVED AWAY FROM SLIDESHARE TO ZENODO
The identical presentation is now here:
https://doi.org/10.5281/zenodo.7778641
General introduction to LinkML, The Linked Data Modeling Language.
Adapter from presentation given to NIH May 2022
https://linkml.io/linkml
SPARQL introduction and training (130+ slides with exercices)Thomas Francart
Full SPARQL training
Covers all SPARQL : basic graph patterns, FILTERs, functions, property paths, optional, negation, assignation, aggregation, subqueries, federated queries.
Does not cover except SPARQL updates.
Includes exercices on DBPedia.
CC BY license
Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...Jeff Z. Pan
Tutorial on "Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge Graphs" presented at the 4th Joint International Conference on Semantic Technologies (JIST2014)
Two graph data models : RDF and Property Graphsandyseaborne
Talk given at ApacheConEU Big Data 2015.
This talk describes the two common graph data approaches, RDF and Property Graphs. It concludes with observations about the different emphasis of each and where each is focused.
Although RDF can be considered the corner stone of semantic web and knowledge graphs, it has not been embraced by everyday programmers and software architects who want to safely create and access well-structured data. There is a lack of common tools and methodologies that are available in more conventional settings to improve data quality by defining schemas that can later be validated. Two technologies have recently been proposed for RDF validation: Shape Expressions (ShEx) and Shapes Constraint Language (SHACL). In the talk, we will briefly introduce both technologies using some examples and compare them. We will also present some challenges and applications related with RDF data shapes.
Talk given at: KTH Royal Institute of Technology, School of Industrial Engineering and Management, Mechatronics Division, 7th February, 2020
Validating and Describing Linked Data Portals using RDF Shape ExpressionsJose Emilio Labra Gayo
Presentation at 1st Linked Data Quality Workshop, Leipzig, 2nd Sept. 2014
Author: Jose Emilio Labra Gayo
Applies Shapes Expressions to validate the WebIndex linked data portal
Culture Geeks Feb talk: Adventures in Linked Data Landval.cartei
Culture Geeks talk: "Adventures in Linked Data Land", by Richard Light.
Feb, 25th 2009 - Regency Town House
Culture Geeks is a Brighton-based community open to everyone who is
interested in using digital technologies in the cultural sector.
This slides I've used on talk about Semantic Web use-case. Not all know what exactly Semantic Web is about. So I've created set of slides showing this in a simple and correct way. Use-case slides are removed on this public available slides. Animated version here goo.gl/qKoF6k . Contact me for sources!
Tutorial on RDFa, to be held at ISWC2010 in Shanghai, China. (I was supposed to hold the tutorial but last minute issues made it impossible for me to travel there...)
Como publicar los datos: datos abiertos y enlazados
Charla impartida en Jornadas Open Data y Transparencia: Ayuntamiento de Oviedo
11 de septiembre de 2017
Charla Linked data - Datos abiertos enlazados impartida en el IV Foro Distrital Buenas Prácticas en Gestión de la Información Geográfica - Bogotá, Colombia, 14 Diciembre 2015
# Internet Security: Safeguarding Your Digital World
In the contemporary digital age, the internet is a cornerstone of our daily lives. It connects us to vast amounts of information, provides platforms for communication, enables commerce, and offers endless entertainment. However, with these conveniences come significant security challenges. Internet security is essential to protect our digital identities, sensitive data, and overall online experience. This comprehensive guide explores the multifaceted world of internet security, providing insights into its importance, common threats, and effective strategies to safeguard your digital world.
## Understanding Internet Security
Internet security encompasses the measures and protocols used to protect information, devices, and networks from unauthorized access, attacks, and damage. It involves a wide range of practices designed to safeguard data confidentiality, integrity, and availability. Effective internet security is crucial for individuals, businesses, and governments alike, as cyber threats continue to evolve in complexity and scale.
### Key Components of Internet Security
1. **Confidentiality**: Ensuring that information is accessible only to those authorized to access it.
2. **Integrity**: Protecting information from being altered or tampered with by unauthorized parties.
3. **Availability**: Ensuring that authorized users have reliable access to information and resources when needed.
## Common Internet Security Threats
Cyber threats are numerous and constantly evolving. Understanding these threats is the first step in protecting against them. Some of the most common internet security threats include:
### Malware
Malware, or malicious software, is designed to harm, exploit, or otherwise compromise a device, network, or service. Common types of malware include:
- **Viruses**: Programs that attach themselves to legitimate software and replicate, spreading to other programs and files.
- **Worms**: Standalone malware that replicates itself to spread to other computers.
- **Trojan Horses**: Malicious software disguised as legitimate software.
- **Ransomware**: Malware that encrypts a user's files and demands a ransom for the decryption key.
- **Spyware**: Software that secretly monitors and collects user information.
### Phishing
Phishing is a social engineering attack that aims to steal sensitive information such as usernames, passwords, and credit card details. Attackers often masquerade as trusted entities in email or other communication channels, tricking victims into providing their information.
### Man-in-the-Middle (MitM) Attacks
MitM attacks occur when an attacker intercepts and potentially alters communication between two parties without their knowledge. This can lead to the unauthorized acquisition of sensitive information.
### Denial-of-Service (DoS) and Distributed Denial-of-Service (DDoS) Attacks
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptxBrad Spiegel Macon GA
Brad Spiegel Macon GA’s journey exemplifies the profound impact that one individual can have on their community. Through his unwavering dedication to digital inclusion, he’s not only bridging the gap in Macon but also setting an example for others to follow.
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesSanjeev Rampal
Talk presented at Kubernetes Community Day, New York, May 2024.
Technical summary of Multi-Cluster Kubernetes Networking architectures with focus on 4 key topics.
1) Key patterns for Multi-cluster architectures
2) Architectural comparison of several OSS/ CNCF projects to address these patterns
3) Evolution trends for the APIs of these projects
4) Some design recommendations & guidelines for adopting/ deploying these solutions.
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC
Ellisha Heppner, Grant Management Lead, presented an update on APNIC Foundation to the PNG DNS Forum held from 6 to 10 May, 2024 in Port Moresby, Papua New Guinea.
This 7-second Brain Wave Ritual Attracts Money To You.!nirahealhty
Discover the power of a simple 7-second brain wave ritual that can attract wealth and abundance into your life. By tapping into specific brain frequencies, this technique helps you manifest financial success effortlessly. Ready to transform your financial future? Try this powerful ritual and start attracting money today!
1. Validating RDF data:
Challenges and perspectives
Jose Emilio Labr Gayo
WESO Research Group (WEb Semantics Oviedo)
University of Oviedo, Spain
2. About me…
Founded WESO (Web Semantics Oviedo) research group
Practical applications of semantic technologies since 2004
Several domains: e-Government, e-Health
Some books:
"Web semántica" (in Spanish), 2012
"Validating RDF data", 2017
…and software:
SHaclEX (Scala library, implements ShEx & SHACL)
RDFShape (RDF playground)
HTML version: http://book.validatingrdf.com
Examples: https://github.com/labra/validatingRDFBookExamples
3. Contents
Short intro to semantic Web & knowledge graphs
Short overview of RDF
Understanding the RDF validation problem
2 technologies: ShEx/SHACL
Challenges & perspectives
4. Semantic Web
Vision of the Web as web of data
Not only pages, but also data: linked data
Data understandable by machines
Related with:
Big Data
Artificial intelligence
Knowledge representation
Example: Wikipedia vs Wikidata
Tim Berners-Lee
Source: Wikipedia
Source: http://www.internetlivestats.com/total-number-of-websites/
Date: 05/04/2019 (19:05h)
5. Who consumes Web information?
People
We access through some device
...and Machines
They show us web pages (browsers)
...but also manage information (bots)
Filtering content
Doing suggestions
...
"If Google doesn't understand your Web page, it almost doesn't exist"
7. People vs Machines
Programmed for some tasks
Predictable (no mistakes*)
No problem with repetitive tasks
Problems to understand the context
Creativity, imagination
Unpredictable (we do mistakes)
We get tired with repetitive tasks
Understanding based on context
*when well programmed
8. Information understandable by machines?
Example: "Oviedo has a temperature of 36 degrees"
Problem: Ambiguity and context identification
Oviedo ? It can be... A city in Spain
...or a city in Florida, USA
...or a soccer player http://www.wikidata.org/entity/Q325997
http://wikidata.org/entity/Q14317
http://www.wikidata.org/entity/Q1813449
...has a temperatrure of... https://www.wikidata.org/wiki/Property:P2076
Identifiers (URIs) in Wikidata
Machine representation
http://www.wikidata.org/entity/Q14317
https://www.wikidata.org/prop/direct/P2076
36^^<http://www.w3.org/2001/XMLSchema#integer>
10. Knowledge graphs
wd:Q14317 36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
Wikidata statistics (17-07-2019)
57,462,853 entities
728,979,397 triples
Source: https://www.wikidata.org/wiki/Wikidata:Statistics
Wikidata is an example of a knowledge graph
There are a lot more:
Open: DBpedia, BabelNet, etc
Proprietary: Google, facebook, Microsoft, etc. also have knowledge graphs
11. Web information & knowledge graphs
Documentos
Information
retrieval
Knowledge
graphs
DocumentosDocumentos
Posts
Tweets
Web pages
...
Publish
Web
documents
Reality
Mental model
Knowledge graphs are a key part of the Web nowadays
Sensors Generate
User
Content
generation &
filtering
12. RDF
RDF (Resource Description Framework)
Allows to represent knowledge graphs
RDF data model = graph as a set of statements
Predicates = URIs
Several syntaxes
Turtle, N-Triples, JSON-LD,...
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
BirthPlace
Picture
BirthDate
capital
temperature
population
Oviedo
Asturias
Fernando
Alonso
wd:Q14317
17. RDF graphs are composable
RDF helps information integration
36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
wd:Q14317
:alice
:bob
:carol
schema:birthPlace
schema:birthPlace
schema:knows
schema:knows
Alice Cooper
Schema:name Robert Smith
schema:nameschema:knows
Carol King
schema:name
18. RDF data model
Graph = set of statements (triples)
Predicates (arcs) = URIs
Subjects: URIs*
Objects: URIs or literals* 36^^xsd:integer
wdt:P2076
wd:Q10514 wd:Q3934
wdt:P19
wdt:P1376
"1981-07-29"^^xsd:date
220020^^xsd:integer
wdt:P1082
wdt:P18 wdt:P569
...jpg
wd:Q14317
*Exception: blank nodes (next slide)
19. RDF data model: Blank nodes
A statement about something
Example: "Alice knows someone who knows Bob"
:alice
schema:knows
:bob
schema:knows
:alice schema:knows _:x ;
_:x schema:knows :bob .
Notation (in Turtle)
:alice schema:knows [
schema:knows :bob
] .
x ( :alice :knows x
x :knows :bob )
20. RDF data model: Literals
:bob schema:name "Robert"^^xsd:string ;
schema:age "18"^^xsd:integer ;
schema:birthDate "2010-04-12"^^xsd:date .
:bob schema:name "Robert" ;
schema:age 18 ;
schema:birthDate "2010-04-12"^^xsd:date .
Objects can also be literals
Literals contain a lexical form and a datatype
Common datatypes: XML Schema primitive datatypes
If not specified, a literal has type xsd:string
:bob
schema:name
"2010-04-12"^^xsd:date18"Robert Smith"
schema:age
schema:birthDate
21. RDF data model: Language tagged strings
String literals can be qualified by a language tag
They have datatype rdfs:langString
:spain rdfs:label "Spain"@en ;
rdfs:label "España"@es .
:Spain
"España"@es"Spain"@en
rdfs:label
rdfs:label
24. SELECT ?name ?birthDate ?picture WHERE {
?person wdt:P19 wd:Q14317 ;
rdfs:label ?name ;
wdt:P569 ?birthDate ;
wdt:P18 ?picture .
FILTER (Lang(?name) = 'es')
}
SPARQL
SPARQL = Simple Protocol and RDF Query Language
Similar to SQL, but for RDF
Example: People born in Oviedo
wdt:P19
"1981-07-29"^^xsd:date
wdt:P18
wdt:P569
Oviedo
wd:Q14317
"Fernando Alonso"@en
rdfs:label
?person
?birthPlace?name ?picture
wdt:P19
wdt:P18wdt:P569
wd:Q14317
rdfs:label
wd:Q10514
wdt:P19
...
25. Shared entities and vocabularies
URIs as global identifiers allow to reuse them
Lots of vocabularies: domain specific or general purpose
Examples:
schema.org: properties that major browsers can recognize
Linked Open Vocabularies
26. Inference and ontologies
Some vocabularies define properties that can infer new information
RDF Schema: Classes, subclasses, etc.
OWL: Ontologies using description logics
Trade-off between expressiveness and complexity
rdf:type
:Student
:alice
:Person
rdfs:subClassOf
:HumanBeing
owl:sameAs
rdf:type
rdf:type
27. RDF, the good parts...
RDF as an integration language
RDF as a lingua franca for semantic web and linked data
RDF flexibility & integration
Data can be adapted to multiple environments
Open and reusable data by default
RDF for knowledge representation
RDF data stores & SPARQL
28. RDF, the other parts
Consuming & producing RDF
Multiple syntaxes: Turtle, RDF/XML, JSON-LD, ...
Embedding RDF in HTML
Describing and validating RDF content
Producer Consumer
29. Why describe & validate RDF?
For producers
Developers can understand the contents they are going to produce
Ensure they produce the expected structure
Advertise and document the structure
Generate interfaces
For consumers
Understand the contents
Verify the structure before processing it
Query generation & optimization
Producer Consumer
Shapes
31. Understanding the problem
Identifying the shape of graphs...
Shapes can describe the form of a node (node constraint)
...and the number of possible arcs incoming/outgoing from a node
...and the possible values associated with those arcs
:alice schema:name "Alice";
schema:knows :bob, :carol .
IRI schema:name string 1
schema:knows IRI 0, 1,...
RDF Node
Shape
RDF Node that
represents a User
<UserShape> IRI {
schema:name xsd:string ;
schema:knows IRI *
}
ShEx
32. Understanding the problem
Repeated properties
The same property can be used for different purposes in the same data
Example: A product must have 2 codes with different structure
:product schema:productID "isbn:123-456-789";
schema:productID "code456" .
A practical example from FHIR
See: http://hl7-fhir.github.io/observation-example-bloodpressure.ttl.html
33. Understanding the problem
Shapes ≠ types
Nodes in RDF graphs can have 0, 1 or many rdf:type declarations
A type can be used in multiple contexts, e.g. foaf:Person
Nodes are not necessarily annotated with discriminating types
Nodes with type :Person can represent friends, students, patients,...
Different meanings and different structure depending on context
Specific validation constraints for different contexts
34. Understanding the problem
RDF validation ≠ ontology definition ≠ instance data
Ontologies are usually focused on domain entities
RDF validation is focused on RDF graph features (lower level)
Ontology
Constraints
RDF Validation
Instance data
Different levels
:alice schema:name "Alice";
schema:knows :bob .
<User> IRI {
schema:name xsd:string ;
schema:knows IRI
}
schema:knows a owl:ObjectProperty ;
rdfs:domain schema:Person ;
rdfs:range schema:Person .
A user must have only two properties:
schema:name of value xsd:string
schema:knows with an IRI value
35. Why not using SPARQL to validate?
Pros:
Expressive
Ubiquitous
Cons
Expressive
Idiomatic
many ways to encode the same constraint
ASK {{ SELECT ?Person {
?Person schema:name ?o .
} GROUP BY ?Person HAVING (COUNT(*)=1)
}
{ SELECT ?Person {
?Person schema:name ?o .
FILTER ( isLiteral(?o) &&
datatype(?o) = xsd:string )
} GROUP BY ?Person HAVING (COUNT(*)=1)
}
{ SELECT ?Person (COUNT(*) AS ?c1) {
?Person schema:gender ?o .
} GROUP BY ?Person HAVING (COUNT(*)=1)}
{ SELECT ?Person (COUNT(*) AS ?c2) {
?S schema:gender ?o .
FILTER ((?o = schema:Female ||
?o = schema:Male))
} GROUP BY ?Person HAVING (COUNT(*)=1)}
FILTER (?c1 = ?c2)
}
Example: Define SPARQL query to check:
There must be one schema:name which
must be a xsd:string, and one
schema:gender which must be
schema:Male or schema:Female
36. Previous/other approaches
SPIN, by TopQuadrant http://spinrdf.org/
SPARQL templates, Influenced SHACL
Stardog ICV: http://docs.stardog.com/icv/icv-specification.html
OWL with UNA and CWA
OSLC resource shapes: https://www.w3.org/Submission/shapes/
Vocabulary for RDF validation
Dublin Core Application profiles (K. Coyle, T. Baker)
http://dublincore.org/documents/dc-dsp/
RDF Data Descriptions (Fischer et al)
http://ceur-ws.org/Vol-1330/paper-33.pdf
RDFUnit (D. Kontokostas)
http://aksw.org/Projects/RDFUnit.html
...
37. ShEx and SHACL
2013 RDF Validation Workshop
Conclusions of the workshop:
There is a need of a higher level, concise language for RDF Validation
ShEx initially proposed (v 1.0)
2014 W3c Data Shapes WG chartered
2017 SHACL accepted as W3C recommendation
2017 ShEx 2.0 released as Community group draft
2018 ShEx adopted by Wikidata
38. Short intro to ShEx
ShEx (Shape Expressions Language)
Concise and human-readable language for RDF validation & description
Syntax similar to SPARQL, Turtle
Semantics inspired by regular expressions & RelaxNG
2 syntaxes: Compact and RDF/JSON-LD
Official info: http://shex.io
Semantics: http://shex.io/shex-semantics/, primer: http://shex.io/shex-primer
40. Simple example
Nodes conforming to <User> shape must:
• Be IRIs
• Have exactly one schema:name with a value of type xsd:string
• Have zero or more schema:knows whose values conform to <User>
prefix schema: <http://schema.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
<User> IRI {
schema:name xsd:string ;
schema:knows @<User> *
}
Prefix
declarations
as in
Turtle/SPARQL
44. More info about ShEx
See:
ShEx by Example (slides):
https://figshare.com/articles/ShExByExample_pptx/6291464
ShEx chapter from Validating RDF data book:
http://book.validatingrdf.com/bookHtml010.html
45. Short intro to SHACL
W3C recommendation:
https://www.w3.org/TR/shacl/ (July 2017)
RDF vocabulary
2 parts: SHACL-Core, SHACL-SPARQL
46. SHACL implementations
Name Parts Language - Library Comments
Topbraid SHACL API SHACL Core, SPARQL Java (Jena) Used by TopBraid composer
SHACL playground SHACL Core Javascript (rdflib.js) http://shacl.org/playground/
SHACLEX SHACL Core Scala (Jena, RDF4j) http://rdfshape.weso.es
pySHACL SHACL Core, SPARQL Python (rdflib) https://github.com/RDFLib/pySHACL
Corese SHACL SHACL Core, SPARQL Java (STTL) http://wimmics.inria.fr/corese
RDFUnit SHACL Core, SPARQL Java (Jena) https://github.com/AKSW/RDFUnit
49. Some definitions about SHACL
Shape: collection of targets and constraints components
Targets: specify which nodes in the data graph must conform to a shape
Constraint components: Determine how to validate a node
Shape
target declarations
constraint components
:UserShape a sh:NodeShape ;
sh:targetNode :alice, :bob, :carol ;
sh:nodeKind sh:IRI ;
sh:property :hasName,
:hasEmail .
:hasName sh:path schema:name ;
sh:minCount 1;
sh:maxCount 1;
sh:datatype xsd:string .
. . .
50. Validation Report
The output of the validation process is a list of violation errors
No errors RDF conforms to shapes graph
[ a sh:ValidationReport ;
sh:conforms false ;
sh:result [
a sh:ValidationResult ;
sh:focusNode :bob ;
sh:message
"MinCount violation. Expected 1, obtained: 0" ;
sh:resultPath schema:name ;
sh:resultSeverity sh:Violation ;
sh:sourceConstraintComponent
sh:MinCountConstraintComponent ;
sh:sourceShape :hasName
] ;
...
[ a sh:ValidationReport ;
sh:conforms true
].
54. More info about SHACL
SHACL by example (slides):
https://figshare.com/articles/SHACL_by_example/6449645
SHACL chapter at Validating RDF data book
http://book.validatingrdf.com/bookHtml011.html
55. Some challenges and perspectives
Theoretical foundations of ShEx/SHACL
Inference shapes from data
Validation Usability
RDF Stream validation
Schema ecosystems
Wikidata
Solid
56. Theoretical foundations of ShEx/SHACL
Conversion between ShEx and SHACL
SHaclEX library converts subsets of both
Challenges
Recursion and negation
Performance and algorithmic complexity
Detect useful subsets of the languages
Convert to SPARQL
Schema/data mapping Shapes Shapes
ShEx SHACL
57. Inference of Shapes from Data
Useful use case in practice
Knowledge Graph summarization
Some prototypes:
ShExer, RDFShape, ShapeArchitect
Try it with RDFShape:
https://tinyurl.com/y8pjcbyf
Shape Expression
generated for
wd:Q51613194
Shapes
RDF data
infer
59. RDF Stream validation
Validation of RDF streams
Challenges:
Incremental validation
Named graphs
Addition/removal of triples
Sensor
Stream
RDF
data
Shapes
Validator
Stream
Validation
results
60. Schema ecosystems: Wikidata
In May, 2019, Wikidata announced ShEx adoption
New namespace for schemas
Example: https://www.wikidata.org/wiki/EntitySchema:E2
It opens lots of opportunities/challenges
Schema evolution and comparison
61. Schema ecosystems: Solid project
SOLID (SOcial Linked Data): Promoted by Tim Berners-Lee
Goal: Re-decentralize the Web
Separate data from apps
Give users more control about their data
Internally using linked data & RDF
Shapes needed for interoperability
See:
https://ruben.verborgh.org/blog/2019/06/17/shaping-linked-data-apps/
Data
pod
Shape
A
Shape
C
Shape
B
App
1
App
2
App
3
62. Conclusions
RDF as a basis for knowledge graphs
Explicit schemas can help improve data quality
2 languages proposed: ShEx/SHACL
Lots of new challenges and opportunities