Validata: A tool for testing
profile conformance
Alasdair J G Gray
Heriot-Watt University
www.macs.hw.ac.uk/~ajg33
A.J.G.Gray@hw.ac.uk
@gray_alasdair
Andrew Beveridge
Jacob Baungard Hansen
Johnny Val
Leif Gehrmann
Roisin Farmer
Sunil Khutan
Tomas Robertson
HCLS Dataset Descriptions
https://www.w3.org/TR/hcls-dataset/
Dumontier M, Gray AJG, Marshall MS, et al. (2016) The health care
and life sciences community profile for dataset descriptions.
PeerJ 4:e2331 https://doi.org/10.7717/peerj.2331
1 December 2016
Requirements
• Online tool
– Deployable on W3C server
– GUI
– API
• Support multiple constraints
– Properties
– Data values
– …
• Requirement levels
– Different levels of user messages: Error, Warning, Information
• Configurable
– HCLS (Required)
– DCAT, Open PHACTS, etc. (Optional)
Example Constraint
• Shape
• A Dataset
– MUST be declared to be of type dctypes:Dataset
– MUST have a dcterms:title as a language-typed string
– MUST NOT have a dcterms:created date
[Figure: the constraint drawn as a graph pattern on <Dataset>, showing an rdf:langString value, a wildcard value (.), and a ✗ marking the disallowed property]
Dates are associated
with versions in HCLS
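The three constraints above can be checked by hand on a toy graph. The following is a minimal Python sketch, not Validata's implementation: the `Literal` type, the `ex:chembl` node, the sample triples, and the `check_dataset` helper are all hypothetical names introduced for illustration, and the title check is looser than a real ShEx validator (it accepts any number of titles as long as one carries a language tag).

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Toy RDF literal; a non-empty `lang` stands in for rdf:langString."""
    value: str
    lang: Optional[str] = None


# Hypothetical example data as (subject, predicate, object) triples.
TRIPLES = [
    ("ex:chembl", "rdf:type", "dctypes:Dataset"),
    ("ex:chembl", "dct:title", Literal("ChEMBL", "en")),
]


def check_dataset(triples, node):
    """Return the violated constraints for `node` (empty list if it conforms)."""
    # Group the objects of every triple about `node` by predicate.
    values = {}
    for s, p, o in triples:
        if s == node:
            values.setdefault(p, []).append(o)

    problems = []
    if "dctypes:Dataset" not in values.get("rdf:type", []):
        problems.append("MUST be of type dctypes:Dataset")
    titles = values.get("dct:title", [])
    if not any(isinstance(t, Literal) and t.lang for t in titles):
        problems.append("MUST have a language-tagged dct:title")
    if "dct:created" in values:
        problems.append("MUST NOT have dct:created")
    return problems
```

Calling `check_dataset(TRIPLES, "ex:chembl")` on the sample data returns an empty list; adding a `dct:created` triple for the same node makes the third constraint fail.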
Example Validation
• Shape
• Data
Shape Expressions (ShEx)
Shape
<Dataset> {
  rdf:type (dctypes:Dataset),
  dct:title rdf:langString,
  dct:alternative rdf:langString+,
  !dct:created .
}
ShEx: Validation
<Dataset> {
  rdf:type (dctypes:Dataset),
  dct:title rdf:langString,
  dct:alternative rdf:langString+,
  !dct:created .
}
Validator can’t warn of
missing property
Example data
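One way to read this slide: the shape is a list of triple constraints, each with a value test and a cardinality, and plain validation only reports which constraints fail. The Python sketch below illustrates that reading under stated assumptions. It is not the ShEx-validator algorithm; `TripleConstraint`, `is_lang_string`, `DATASET_SHAPE`, and `validate` are hypothetical names, `+` is modelled as "one or more", and `!` as "exactly zero".

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TripleConstraint:
    predicate: str
    value_ok: Callable[[object], bool]  # test applied to each object value
    min_count: int = 1
    max_count: float = 1                # math.inf means unbounded


def is_lang_string(obj):
    # Toy encoding: a language-tagged literal is a (text, lang) tuple.
    return isinstance(obj, tuple) and len(obj) == 2 and bool(obj[1])


# The <Dataset> shape from the slide, transcribed by hand (assumed encoding).
DATASET_SHAPE = [
    TripleConstraint("rdf:type", lambda o: o == "dctypes:Dataset"),
    TripleConstraint("dct:title", is_lang_string),
    TripleConstraint("dct:alternative", is_lang_string, 1, math.inf),  # "+"
    TripleConstraint("dct:created", lambda o: True, 0, 0),             # "!"
]


def validate(shape, properties):
    """`properties` maps predicate -> list of objects for one node."""
    failures = []
    for tc in shape:
        objs = properties.get(tc.predicate, [])
        if not all(tc.value_ok(o) for o in objs):
            failures.append(f"{tc.predicate}: value of wrong type")
        elif not tc.min_count <= len(objs) <= tc.max_count:
            failures.append(
                f"{tc.predicate}: expected {tc.min_count}..{tc.max_count}, "
                f"found {len(objs)}"
            )
    return failures
```

In this sketch every unmet constraint is a failure of equal weight: a node without `dct:alternative` simply does not match the shape, and there is no way to downgrade that outcome to a warning.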
Requirement Levels
Shape
<Dataset> {
  `MUST` rdf:type (dctypes:Dataset),
  `MUST` dct:title rdf:langString,
  `MAY` dct:alternative rdf:langString+,
  `MUST` !dct:created .
}
Validator can warn of missing property
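With a requirement level attached to each constraint, a violation can be reported at a matching severity instead of as a flat failure. The Python sketch below shows the idea only. The mapping from MUST/SHOULD/MAY to Error/Warning/Information is an assumption made for illustration (the slides name the message levels but do not spell out the mapping), and `Severity`, `ANNOTATED_SHAPE`, and `report` are hypothetical names, not part of Validata.

```python
from enum import Enum


class Severity(Enum):
    ERROR = "Error"
    WARNING = "Warning"
    INFORMATION = "Information"


# Assumed mapping from requirement level to message severity.
LEVEL_TO_SEVERITY = {
    "MUST": Severity.ERROR,
    "SHOULD": Severity.WARNING,
    "MAY": Severity.INFORMATION,
}

# The annotated shape as (level, predicate, negated) entries.
ANNOTATED_SHAPE = [
    ("MUST", "rdf:type", False),
    ("MUST", "dct:title", False),
    ("MAY", "dct:alternative", False),
    ("MUST", "dct:created", True),  # "!": the property must be absent
]


def report(shape, present_predicates):
    """Return (severity, message) pairs for one node's set of predicates."""
    messages = []
    for level, predicate, negated in shape:
        found = predicate in present_predicates
        if negated and found:
            messages.append((LEVEL_TO_SEVERITY[level], f"{predicate} is not allowed"))
        elif not negated and not found:
            messages.append((LEVEL_TO_SEVERITY[level], f"{predicate} is missing"))
    return messages
```

For a node that has only `rdf:type` and `dct:title`, `report` yields a single Information message about the missing `dct:alternative`, so the description still passes while the user is told what could be added.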
Implementation
Validata
• Web app front end
• JavaScript + HTML
• Relies on ShEx-validator
– Validates documents
– Returns report
https://github.com/HW-SWeL/Validata
ShEx-validator
• Validation system
• Validation API
• JavaScript
– Node.js engine
• Reuses
– n3: RDF library
– ShExParser
https://github.com/HW-SWeL/ShEx-validator
Validata Demo
http://hw-swel.github.io/Validata/
Validata
https://github.com/HW-SWeL/Validata
• RDF constraint validation tool
– Configurable to any profile
• Shape Expression (ShEx) constraints
• Open-source JavaScript implementation
www.macs.hw.ac.uk/~ajg33/
A.J.G.Gray@hw.ac.uk
@gray_alasdair


Editor's Notes

  • #3 Motivation: how do we check that descriptions conform?
      – Summary level: time-unchanging information, e.g. name, description, publisher
      – Version level: version-specific information, e.g. version number, creator, etc.
      – Distribution level: file-specific information, e.g. file location and format, number of triples
      – 18 vocabularies: DCTerms, DCAT, VoID, FOAF, …
      – 61 prescribed properties: MUST, SHOULD, MAY, MUST NOT for each level
  • #4 Link into the data publishing pipeline via the API. Not tied to HCLS; HCLS is only the motivation. No existing tool meets these needs.
  • #5 Constraints form a graph pattern that data must comply with
  • #6, #7, #8 How do we validate that our example data conforms to a certain shape? Express the expected shape as ShEx. Toy example; what about for real?
  • #9 ShEx: concise, regex-based notation. W3C SHACL was not stable when the work was done. ShEx is an implementation of SHACL with extra features.
  • #10 Step through validation process
  • #11 Extended ShEx to allow arbitrary hierarchies. Toy example; what about for real?
  • #12 ShEx-validator has other dependencies too:
      – Minimist: argument parser
      – Promise: callbacks
      – Pegjs: parser generator
      – Mocha: test-driven development