SlideShare a Scribd company logo
1 of 28
Download to read offline
Facade-X
an opinionated approach to SPARQL Anything
Enrico Daga, Luigi Asprino, Paul Mulholland, Aldo Gangemi
Semantics Conference, Amsterdam, 6/9/2021
https://arxiv.org/pdf/2106.02361.pdf
This project has received funding from the European Union’s Horizon 2020 research and
innovation programme under grant agreement GA101004746.
The communication reflects only the author’s view and the Research Executive Agency is not
responsible for any use that may be made of the information it contains.
Playing the soundtrack of our history
Preserving musical heritage

through knowledge graphs
Managing musical heritage collections

through knowledge graphs
Studying musical heritage through

(interlinked) knowledge graphs
https://spice-h2020.eu/ https://polifonia-project.eu/
CSV
XML
JSON
XPath
IRI
JsonPath
HTML
JsonSchema
DOM
Web APIs HTTP
XSLT
XSD
Excel
Background
• A recent survey on KG benchmarking says data Integration is the dominant
use case for KG - [Atkin, 2021, in Lassila et al, 2021]
• Semantic Web always concerned with methods to “lift” legacy content to
RDF:
• Targeting specific types/formats: Direct mapping, Tarql, Any23,
JSON2RDF, CSV2RDF, COW, SPARQL Micro-services [Michel, 2019]
• Mapping languages, several types of (RML…, ShexML): high
learning demands. [Dimou, 2014] [García-González, 2020]
• Extending SPARQL with custom features: SPARQL Generate, high
learning demands, difficult to extend to other formats. [Lefrançois,
2017]
• Cognitive complexity: solutions transfer data source complexity to the user
(e.g. know XPath for XML, JsonPath for JSON, …)
• End-user development [Lieberman, 2006]. Many SPARQL users fall into
the category of end-user developer. In a recent survey [Warren et al, 2018]
42% SPARQL users are from non-IT areas, including social sciences and
the humanities, business and economics, and biomedical, engineering or
physical sciences.
Knowledge Graph Construction from structured resources
Iterative process:
• Observe: the resource (e.g. a CSV file)
• Design mappings to a target ontology
• Transform: execute the mappings
• Observe: compare / evaluate
Trail and error approach, many iterations
Knowledge Graph Construction, revisited
KG construction is a twofold job:
• perform a syntax/meta-model conversion (e.g. CSV to RDF)
• project semantics onto the data (applying a domain ontology)
A better model of the user cognitive process:
• Observe: the resource (e.g. a CSV file)
• Reengineering Design: what syntax/meta-model do I want?
• Remodelling Design: what semantics can we project?
• Transform: execute the mappings
• Observe: compare / evaluate
Trail and error approach, many iterations
Knowledge Graph Construction
an opinionated approach
• Reengineering: what syntax/meta-model do we
want?
• We cannot know what structure our user
wants but we know the meta-model: RDF
• Remodelling: what semantics do we project?
• SPARQL is great for projecting semantics
(change namespaces, create entities from
literals, adding types, sophisticated
relationships, composite structures, …)
• Can we use just SPARQL to do all of it?
@enridaga
Concept
Facade Design Pattern
From Object Oriented Programming
A single abstraction on different, alternative interfaces
https://en.wikipedia.org/wiki/Facade_pattern
An RDF facade?
• A common RDF structure over diverse
formats
• Focusing on the meta-model (data structure)
• Leaving domain semantics as-it-is!
• apply the least possible “ontological
commitment”
• Problem Space: CSV, JSON, HTML, XML,
Binary (JPEG, PNG, …),Text
• Solution space: RDFS
CSV
• Resource
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:root a rdfs:Class .
id,name,gender,dates,yearOfBirth,yearOfDeath,placeOfBirth,placeOfDeath,url
10093,"Abakanowicz, Magdalena",Female,born 1930,1930,,Polska,,http://www.tate.org.uk/art/artists/magdalena-abakanowicz-10093
…
https://github.com/tategallery/collection/blob/master/artist_data.csv
[ a fx:root ;
rdf:_1 [ xyz:dates "born 1930" ;
xyz:gender "Female" ;
xyz:id "10093" ;
xyz:name "Abakanowicz, Magdalena" ;
xyz:placeOfBirth "Polska" ;
xyz:placeOfDeath "" ;
xyz:url "http://www.tate.org.uk/art/artists/magdalena-
abakanowicz-10093" ;
xyz:yearOfBirth "1930" ;
xyz:yearOfDeath ""
] ;
csv.headers=true|false
[ a fx:root ;
rdf:_1 [ rdf:_1 "id" ;
rdf:_2 "name" ;
rdf:_3 "gender" ;
rdf:_4 "dates" ;
rdf:_5 "yearOfBirth" ;
rdf:_6 "yearOfDeath" ;
rdf:_7 "placeOfBirth" ;
rdf:_8 “placeOfDeath" ;
rdf:_9 "url"
] ;
CSV
JSON
HTML
XML
Binary (JPEG, PNG, …)
Text
@enridaga
JSON
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:root a rdfs:Class .
xsd:int a rdfs:Datatype.
xsd:string a rdfs:Datatype.
xsd:boolean a rdfs:Datatype.
xsd:decimal a rdfs:Datatype.
xsd:float a rdfs:Datatype.
xsd:double a rdfs:Datatype.
https://github.com/tategallery/collection/artworks/t/023/t02319-9205.json
[ a fx:root ;
xyz:acno "T02319" ;
xyz:acquisitionYear "1978"^^<http://www.w3.org/2001/XMLSchema#int> ;
xyz:all_artists "Kazimir Malevich" ;
xyz:catalogueGroup […] ;
xyz:classification "painting" ;
xyz:contributorCount "1"^^<http://www.w3.org/2001/XMLSchema#int> ;
…
{
"acno": "T02319",
"acquisitionYear": 1978,
"all_artists": "Kazimir Malevich",
"catalogueGroup": {},
"classification": "painting",
"contributorCount": 1,
"contributors": [
{
CSV
JSON
HTML
XML
Binary (JPEG, PNG, …)
Text
DOM (HTML, XML, …)
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:root a rdfs:Class .
xsd:int a rdfs:Datatype.
xsd:string a rdfs:Datatype.
xsd:boolean a rdfs:Datatype.
xsd:decimal a rdfs:Datatype.
xsd:float a rdfs:Datatype.
xsd:double a rdfs:Datatype.
rdf:type rdf:type rdf:Property
https://imma.ie/artists/
[ a fx:root , xhtml:div ;
xhtml:id “az-group” ;
rdf:_1 [ a xhtml:div ;
rdf:_1 [ a xhtml:h4 ;
rdf:_1 "A" ;
<https://html.spec.whatwg.org/#innerHTML>
"A" ;
<https://html.spec.whatwg.org/#innerText>
"A"
] ;
…
html.selector=#az-group
@prefix xhtml: <http://www.w3.org/1999/xhtml#> .
CSV
JSON
HTML
XML
Binary (JPEG, PNG, …)
Text
Binary and Text
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:root a rdfs:Class .
xsd:int a rdfs:Datatype.
xsd:string a rdfs:Datatype.
xsd:boolean a rdfs:Datatype.
xsd:decimal a rdfs:Datatype.
xsd:float a rdfs:Datatype.
xsd:double a rdfs:Datatype.
xsd:base64Binary a rdfs:Datatype.
rdf:type df:type rdf:Property
[ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> “/9j/
4AAQSkZJRgABAQEASABIAAD/
4QmsRXhpZgAASUkqAAgAAAALAA8BAgAGAAAAkgAAABABAgAOAAAAmAAAABIBAw
ABAAAAAQAAABoBBQABAAAApgAAABsBBQABAAAArgAAACgBAwABAAAAAgAAADEB
AgALAAAAtgAAADIBAgAUAAAAwgAAABMCAwABAAAAAgAAAGmHBAABAAAA1gAAAC
WIBAABAAAA0gMAAOQDAABDYW5vbgBDYW5vbiBFT1MgNDBEAEgAAAABAAAASAAA
AAEAAABHSU1QIDIuNC41AAAyMDA4OjA3OjMxIDEwOjM4OjExAB4Am…”^^<http
://www.w3.org/2001/XMLSchema#base64Binary> ] .
bin.encoding # BASE64
txt.regex # tokenise into a sequence
CSV
JSON
HTML
XML
Binary (JPEG, PNG, …)
Text
https://imma.ie/collection/freeing-the-voice/
Hello World! [ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> "Hello World!" ] .
Facade X
A simplified RDF meta-model, resembling a list-of-lists
Components: Containers (typed), slots (int / string), values (see paper for formal description)
Intuitive, abstract notions: key-value, sequence, type
“String”, 1, true,…
“String”, 1, true
xyz:row1, …
xyz:row_n, xyz:…
fx:root | xyz:*
rdf:type
xyz:*
rdf:_N
PREFIX fx: <http://sparql.xyz/facade-x/ns/>
PREFIX xyz: <http://sparql.xyz/facade-x/data/>
CSV XML JSON
FX
FX
SPARQL
https://sparql-anything.cc/
https://
github.com/
SPARQL-
Anything/
showcase-tate
Assumption: SPARQL
1.1 CONSTRUCT
queries will be enough
to design mappings (the
re-modelling phase)
@enridaga
https://github.com/SPARQL-Anything/showcase-mei
XML->CSV: using SPARQL to extract the note sequence from a XML/MEI file
Current features v0.3.0 (some not in the paper!)
• XML, JSON, CSV, HTML, Excel, Text, Binary, EXIF, File System, Archives
• Query templates / parameter queries (BASIL variables)
• Fully customisable HTTP requests, with authentication
• Support for pagination
• Helper functions for sequences: fx:anySlot, fx:before, fx:after, …
• Mix and nest SERVICE clauses (thanks to SPARQL)
• Use SPARQL Results Sets as input for parametric queries
• Combine multiple SPARQL queries in pipelines
• 100% open source, Apache Licence 2.0
• Implemented on top of Apache Jena ARQ
https://sparql-anything.cc/
https://github.com/
SPARQL-Anything/
showcase-imma
https://imma.ie/collection/freeing-the-memory/
Evaluation
• Qualitative discussion of solutions vs requirements (see the paper for details)
• Quantitative analysis of the cognitive complexity
• Quantitative analysis of performance, to assess practicability
• Experiments with real-world open data in the cultural heritage domain
• RML (Java RML Mapper): composing format specific languages in
declarative mappings
• SPARQL Generate: composing format specific languages extending
SPARQL
• SPARQL Anything: naive, in-memory implementation
• https://github.com/SPARQL-Anything/sparql.anything/tree/main/experiment
Transform several sources having heterogeneous formats
• + Spreadsheet
• + File system
• + Archives
• + EXIF metadata
Embedding binary data in literals with base64 encoding
Full fledged HTTP client to query Web APIs
WARN! No support for RDB yet
Low learning demands:
• No new language has to be learned as data can be queried using SPARQL 1.1
Meaningful abstraction:
• No need to know the technicalities of source formats
• No need to know JsonPath, XPath, …
• Resources can be accessed as-if they are RDF
Explorability:
• With SPARQL Generate and RML, the user needs to commit to a particular mapping or
transformation of the source data into RDF.
• Facade-X enables the user to avoid prematurely committing to a mapping (Observe!).
Low (cognitive) complexity
• One measure of complexity is the number of
(distinct) items or variables (Halford et al. 2004;
Warren et al. 2015). 
• 8 CQ (vs SPARQL Generate)
• What are the titles of the artworks attributed to “ANONIMO”?
• What are the titles of the artworks created in the 1935?
• …
• 4 transformations (vs RML and SPARQL Generate)
• Avg distinct tokens:
• SPARQL Anything: ~18
• SPARQL Generate: ~25 (∼39.72% more)
• RML: ~45 (∼150% more)
Practicable and sustainable
• Quantitative analysis of performance, to assess practicability
• In-Memory implementation (Naive)
• Execution time of q1-q12
• AVG on 10 executions
• Comparable to RML Mapper and SPARQL Generate on files
up to 1M JSON objects (~5M triples)
• In-Memory implementation scales linearly
• The approach is practicable
• Research on performance as future work
• Lines of Java code to maintain: SPARQL Generate 12280
(core module); RML Mapper 7951; SPARQL Anything: 3842
(all transformers) — v0.2.0 (v0.3.0 has ~11k)
Integrates with a typical SW workflow
• While we cannot assume that Semantic Web experts have knowledge of RML,
XPath, and SPARQL Generate, we can definitely expect knowledge of SPARQL
Adaptable and extendable
• Facade-X can be extended, if needed (although we didn’t need to so far)
• SPARQL Anything is easy to extend to more formats:
• New transformers just need to produce the facade
• No major changes to the user experience (new format-specific options)
• Changes to the adapters don’t require changing the user-facing code
• In contrast to RML / SPARQL Generate which need to extend the user-facing
toolkit
Future work
• Study formal properties of Facade-X, e.g. prove generality, semantics of meta-model mappings, …
• User study comparing Mapping Language vs Facade based approach
• Methodology for developers to connect new data sources (e.g. GraphViz DOT, MIDI, …)
• Current approach is limited to serialised resources (aka files)
• In-memory transformation before query execution
• A triple filtering strategy can reduce memory requirements significantly
• Study strategies to cope with very large files (e.g. slicing)
• Study query-rewriting strategies, eventually rewriting mappings into efficient, iterator-based
transformers (mapping translation [Corcho 2020])
• Relational Database, No-SQL (e.g. mongoDB)
• Reuse existing approaches (e.g. Ontop) but hide complexity to the user - avoid the need for
configuration.
Get in touch!
SPARQL Anything is under active development!
https://sparql-anything.cc
GitHub: https://github.com/SPARQL-Anything/sparql.anything
enrico.daga@open.ac.uk
@enridaga
www.enridaga.net
• Daga, E., Asprino, L., Mulholland, P., Gangemi, A.: Facade-x: an opinionated approach to sparql anything. In: SEMANTiCS 2021: 17th
International Conference on Semantic Systems (2021)
• Atkin, M., Deely, T., Scharffe, F.: Knowledge Graph Benchmarking Report 2021 (version 2.0). Zenodo, http://doi.org/10.5281/zenodo.4950097 (June
2021)
• Lassila, O., Michael Schmidt, Brad Bebee, Dave Bechberger, Willem Broekema, Ankesh Khandelwal, Kelvin Lawrence, Ronak Sharda, and Bryan
Thompson: Graph? Yes! Which one? Help!. 1st Squaring the circle on knowledge graphs workshop - Semantics (2021)
• Daga, E., Meroño-Peñuela, A., Motta, E.: Sequential linked data: the state of affairs. Semantic Web (2021)
• Warren, P., Mulholland, P.: Using sparql–the practitioners’ viewpoint. In: European Knowledge Acquisition Workshop. pp. 485–500. Springer (2018)
• Corcho, O., Priyatna, F., Chaves-Fraga, D.: Towards a new generation of ontology based data access. Semantic Web 11(1), 153–160 (2020)
• Michel, F., Faron-Zucker, C., Corby, O., Gandon, F.: Enabling automatic discovery and querying of web apis at web scale using linked data standards.
In: Companion Proceedings of The 2019 World Wide Web Conference. pp. 883–892 (2019)
• Dimou, A., Vander Sande, M., Colpaert, P., Verborgh, R., Mannens, E., Van de Walle, R.: Rml: a generic language for integrated rdf mappings of
heterogeneous data. In: 7th Workshop on Linked Data on the Web (2014)
• García-González, H., Boneva, I., Staworko, S., Labra-Gayo, J.E., Lovelle, J.M.C.: Shexml: improving the usability of heterogeneous data mapping
languages for firsttime users. PeerJ Computer Science 6, e318 (2020)
• Ko, A.J., Abraham, R., Beckwith, L., Blackwell, A., Burnett, M., Erwig, M., Scaffidi, C., Lawrance, J., Lieberman, H., Myers, B., et al.: The state of the
art in enduser software engineering. ACM Computing Surveys (CSUR) 43(3), 1–44 (2011)
• Lefrançois, M., Zimmermann, A., Bakerally, N.: A sparql extension for generating rdf from heterogeneous formats. In: European Semantic Web
Conference. pp. 35– 50. Springer (2017)
• Lieberman, H., Paternò, F., Klann, M., Wulf, V.: End-user development: An emerging paradigm. In: End user development, pp. 1–8. Springer (2006)
• Cyganiak, Richard. Tarql (sparql for tables): Turn csv into rdf using sparql syntax. Technical Report, 2015. http://tarql. github. io, 2015.
References
Facade-X: an opinionated approach to SPARQL anything

More Related Content

More from Enrico Daga

Linked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities researchLinked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities researchEnrico Daga
 
Capturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid ApproachCapturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid ApproachEnrico Daga
 
Challenging knowledge extraction to support
the curation of documentary evide...
Challenging knowledge extraction to support
the curation of documentary evide...Challenging knowledge extraction to support
the curation of documentary evide...
Challenging knowledge extraction to support
the curation of documentary evide...Enrico Daga
 
OU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data ClusterOU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data ClusterEnrico Daga
 
CityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesCityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesEnrico Daga
 
Propagating Data Policies - A User Study
Propagating Data Policies - A User StudyPropagating Data Policies - A User Study
Propagating Data Policies - A User StudyEnrico Daga
 
Linked Data at the OU - the story so far
Linked Data at the OU - the story so farLinked Data at the OU - the story so far
Linked Data at the OU - the story so farEnrico Daga
 
Propagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsPropagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsEnrico Daga
 
A bottom up approach for licences classification and selection
A bottom up approach for licences classification and selectionA bottom up approach for licences classification and selection
A bottom up approach for licences classification and selectionEnrico Daga
 
A BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL EndpointsA BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL EndpointsEnrico Daga
 
Early Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data CubesEarly Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data CubesEnrico Daga
 

More from Enrico Daga (12)

Linked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities researchLinked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities research
 
Capturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid ApproachCapturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid Approach
 
Challenging knowledge extraction to support
the curation of documentary evide...
Challenging knowledge extraction to support
the curation of documentary evide...Challenging knowledge extraction to support
the curation of documentary evide...
Challenging knowledge extraction to support
the curation of documentary evide...
 
Ld4 dh tutorial
Ld4 dh tutorialLd4 dh tutorial
Ld4 dh tutorial
 
OU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data ClusterOU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data Cluster
 
CityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesCityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tables
 
Propagating Data Policies - A User Study
Propagating Data Policies - A User StudyPropagating Data Policies - A User Study
Propagating Data Policies - A User Study
 
Linked Data at the OU - the story so far
Linked Data at the OU - the story so farLinked Data at the OU - the story so far
Linked Data at the OU - the story so far
 
Propagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsPropagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data Flows
 
A bottom up approach for licences classification and selection
A bottom up approach for licences classification and selectionA bottom up approach for licences classification and selection
A bottom up approach for licences classification and selection
 
A BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL EndpointsA BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL Endpoints
 
Early Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data CubesEarly Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data Cubes
 

Recently uploaded

Kings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themKings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themeitharjee
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...gajnagarg
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...gajnagarg
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...gajnagarg
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...Elaine Werffeli
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...HyderabadDolls
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.pptibrahimabdi22
 
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...HyderabadDolls
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...HyderabadDolls
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numberssuginr1
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...kumargunjan9515
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareGraham Ware
 

Recently uploaded (20)

Kings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themKings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about them
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 

Facade-X: an opinionated approach to SPARQL anything

  • 1. Facade-X an opinionated approach to SPARQL Anything Enrico Daga, Luigi Asprino, Paul Mulholland, Aldo Gangemi Semantics Conference, Amsterdam, 6/9/2021 https://arxiv.org/pdf/2106.02361.pdf This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement GA101004746. The communication reflects only the author’s view and the Research Executive Agency is not responsible for any use that may be made of the information it contains.
  • 2. Playing the soundtrack of our history Preserving musical heritage
 through knowledge graphs Managing musical heritage collections
 through knowledge graphs Studying musical heritage through
 (interlinked) knowledge graphs https://spice-h2020.eu/ https://polifonia-project.eu/
  • 4. Background • A recent survey on KG benchmarking says data Integration is the dominant use case for KG - [Atkin, 2021, in Lassila et al, 2021] • Semantic Web always concerned with methods to “lift” legacy content to RDF: • Targeting specific types/formats: Direct mapping, Tarql, Any23, JSON2RDF, CSV2RDF, COW, SPARQL Micro-services [Michel, 2019] • Mapping languages, several types of (RML…, ShexML): high learning demands. [Dimou, 2014] [García-González, 2020] • Extending SPARQL with custom features: SPARQL Generate, high learning demands, difficult to extend to other formats. [Lefrançois, 2017] • Cognitive complexity: solutions transfer data source complexity to the user (e.g. know XPath for XML, JsonPath for JSON, …) • End-user development [Lieberman, 2006]. Many SPARQL users fall into the category of end-user developer. In a recent survey [Warren et al, 2018] 42% SPARQL users are from non-IT areas, including social sciences and the humanities, business and economics, and biomedical, engineering or physical sciences.
  • 5. Knowledge Graph Construction from structured resources Iterative process: • Observe: the resource (e.g. a CSV file) • Design mappings to a target ontology • Transform: execute the mappings • Observe: compare / evaluate Trail and error approach, many iterations
  • 6. Knowledge Graph Construction, revisited KG construction is a twofold job: • perform a syntax/meta-model conversion (e.g. CSV to RDF) • project semantics onto the data (applying a domain ontology) A better model of the user cognitive process: • Observe: the resource (e.g. a CSV file) • Reengineering Design: what syntax/meta-model do I want? • Remodelling Design: what semantics can we project? • Transform: execute the mappings • Observe: compare / evaluate Trail and error approach, many iterations
  • 7. Knowledge Graph Construction an opinionated approach • Reengineering: what syntax/meta-model do we want? • We cannot know what structure our user wants but we know the meta-model: RDF • Remodelling: what semantics do we project? • SPARQL is great for projecting semantics (change namespaces, create entities from literals, adding types, sophisticated relationships, composite structures, …) • Can we use just SPARQL to do all of it? @enridaga
  • 8. Concept Facade Design Pattern From Object Oriented Programming A single abstraction on different, alternative interfaces https://en.wikipedia.org/wiki/Facade_pattern An RDF facade? • A common RDF structure over diverse formats • Focusing on the meta-model (data structure) • Leaving domain semantics as-it-is! • apply the least possible “ontological commitment” • Problem Space: CSV, JSON, HTML, XML, Binary (JPEG, PNG, …),Text • Solution space: RDFS
  • 9. CSV • Resource Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:root a rdfs:Class . id,name,gender,dates,yearOfBirth,yearOfDeath,placeOfBirth,placeOfDeath,url 10093,"Abakanowicz, Magdalena",Female,born 1930,1930,,Polska,,http://www.tate.org.uk/art/artists/magdalena-abakanowicz-10093 … https://github.com/tategallery/collection/blob/master/artist_data.csv [ a fx:root ; rdf:_1 [ xyz:dates "born 1930" ; xyz:gender "Female" ; xyz:id "10093" ; xyz:name "Abakanowicz, Magdalena" ; xyz:placeOfBirth "Polska" ; xyz:placeOfDeath "" ; xyz:url "http://www.tate.org.uk/art/artists/magdalena- abakanowicz-10093" ; xyz:yearOfBirth "1930" ; xyz:yearOfDeath "" ] ; csv.headers=true|false [ a fx:root ; rdf:_1 [ rdf:_1 "id" ; rdf:_2 "name" ; rdf:_3 "gender" ; rdf:_4 "dates" ; rdf:_5 "yearOfBirth" ; rdf:_6 "yearOfDeath" ; rdf:_7 "placeOfBirth" ; rdf:_8 “placeOfDeath" ; rdf:_9 "url" ] ; CSV JSON HTML XML Binary (JPEG, PNG, …) Text @enridaga
  • 10. JSON Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:root a rdfs:Class . xsd:int a rdfs:Datatype. xsd:string a rdfs:Datatype. xsd:boolean a rdfs:Datatype. xsd:decimal a rdfs:Datatype. xsd:float a rdfs:Datatype. xsd:double a rdfs:Datatype. https://github.com/tategallery/collection/artworks/t/023/t02319-9205.json [ a fx:root ; xyz:acno "T02319" ; xyz:acquisitionYear "1978"^^<http://www.w3.org/2001/XMLSchema#int> ; xyz:all_artists "Kazimir Malevich" ; xyz:catalogueGroup […] ; xyz:classification "painting" ; xyz:contributorCount "1"^^<http://www.w3.org/2001/XMLSchema#int> ; … { "acno": "T02319", "acquisitionYear": 1978, "all_artists": "Kazimir Malevich", "catalogueGroup": {}, "classification": "painting", "contributorCount": 1, "contributors": [ { CSV JSON HTML XML Binary (JPEG, PNG, …) Text
  • 11. DOM (HTML, XML, …) Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:root a rdfs:Class . xsd:int a rdfs:Datatype. xsd:string a rdfs:Datatype. xsd:boolean a rdfs:Datatype. xsd:decimal a rdfs:Datatype. xsd:float a rdfs:Datatype. xsd:double a rdfs:Datatype. rdf:type rdf:type rdf:Property https://imma.ie/artists/ [ a fx:root , xhtml:div ; xhtml:id “az-group” ; rdf:_1 [ a xhtml:div ; rdf:_1 [ a xhtml:h4 ; rdf:_1 "A" ; <https://html.spec.whatwg.org/#innerHTML> "A" ; <https://html.spec.whatwg.org/#innerText> "A" ] ; … html.selector=#az-group @prefix xhtml: <http://www.w3.org/1999/xhtml#> . CSV JSON HTML XML Binary (JPEG, PNG, …) Text
  • 12. Binary and Text Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:root a rdfs:Class . xsd:int a rdfs:Datatype. xsd:string a rdfs:Datatype. xsd:boolean a rdfs:Datatype. xsd:decimal a rdfs:Datatype. xsd:float a rdfs:Datatype. xsd:double a rdfs:Datatype. xsd:base64Binary a rdfs:Datatype. rdf:type df:type rdf:Property [ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> “/9j/ 4AAQSkZJRgABAQEASABIAAD/ 4QmsRXhpZgAASUkqAAgAAAALAA8BAgAGAAAAkgAAABABAgAOAAAAmAAAABIBAw ABAAAAAQAAABoBBQABAAAApgAAABsBBQABAAAArgAAACgBAwABAAAAAgAAADEB AgALAAAAtgAAADIBAgAUAAAAwgAAABMCAwABAAAAAgAAAGmHBAABAAAA1gAAAC WIBAABAAAA0gMAAOQDAABDYW5vbgBDYW5vbiBFT1MgNDBEAEgAAAABAAAASAAA AAEAAABHSU1QIDIuNC41AAAyMDA4OjA3OjMxIDEwOjM4OjExAB4Am…”^^<http ://www.w3.org/2001/XMLSchema#base64Binary> ] . bin.encoding # BASE64 txt.regex # tokenise into a sequence CSV JSON HTML XML Binary (JPEG, PNG, …) Text https://imma.ie/collection/freeing-the-voice/ Hello World! [ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> "Hello World!" ] .
  • 13. Facade X A simplified RDF meta-model, resembling a list-of-lists Components: Containers (typed), slots (int / string), values (see paper for formal description) Intuitive, abstract notions: key-value, sequence, type “String”, 1, true,… “String”, 1, true xyz:row1, … xyz:row_n, xyz:… fx:root | xyz:* rdf:type xyz:* rdf:_N PREFIX fx: <http://sparql.xyz/facade-x/ns/> PREFIX xyz: <http://sparql.xyz/facade-x/data/> CSV XML JSON FX FX SPARQL
  • 15. https:// github.com/ SPARQL- Anything/ showcase-tate Assumption: SPARQL 1.1 CONSTRUCT queries will be enough to design mappings (the re-modelling phase) @enridaga
  • 16. https://github.com/SPARQL-Anything/showcase-mei XML->CSV: using SPARQL to extract the note sequence from a XML/MEI file
  • 17. Current features v0.3.0 (some not in the paper!) • XML, JSON, CSV, HTML, Excel, Text, Binary, EXIF, File System, Archives • Query templates / parameter queries (BASIL variables) • Fully customisable HTTP requests, with authentication • Support for pagination • Helper functions for sequences: fx:anySlot, fx:before, fx:after, … • Mix and nest SERVICE clauses (thanks to SPARQL) • Use SPARQL Results Sets as input for parametric queries • Combine multiple SPARQL queries in pipelines • 100% open source, Apache Licence 2.0 • Implemented on top of Apache Jena ARQ https://sparql-anything.cc/
  • 19. Evaluation • Qualitative discussion of solutions vs requirements (see the paper for details) • Quantitative analysis of the cognitive complexity • Quantitative analysis of performance, to assess practicability • Experiments with real-world open data in the cultural heritage domain • RML (Java RML Mapper): composing format specific languages in declarative mappings • SPARQL Generate: composing format specific languages extending SPARQL • SPARQL Anything: naive, in-memory implementation • https://github.com/SPARQL-Anything/sparql.anything/tree/main/experiment
  • 20. Transform several sources having heterogeneous formats • + Spreadsheet • + File system • + Archives • + EXIF metadata Embedding binary data in literals with base64 encoding Full fledged HTTP client to query Web APIs WARN! No support for RDB yet
  • 21. Low learning demands: • No new language has to be learned as data can be queried using SPARQL 1.1 Meaningful abstraction: • No need to know the technicalities of source formats • No need to know JsonPath, XPath, … • Resources can be accessed as-if they are RDF Explorability: • With SPARQL Generate and RML, the user needs to commit to a particular mapping or transformation of the source data into RDF. • Facade-X enables the user to avoid prematurely committing to a mapping (Observe!).
  • 22. Low (cognitive) complexity • One measure of complexity is the number of (distinct) items or variables (Halford et al. 2004; Warren et al. 2015).  • 8 CQ (vs SPARQL Generate) • What are the titles of the artworks attributed to “ANONIMO”? • What are the titles of the artworks created in the 1935? • … • 4 transformations (vs RML and SPARQL Generate) • Avg distinct tokens: • SPARQL Anything: ~18 • SPARQL Generate: ~25 (∼39.72% more) • RML: ~45 (∼150% more)
  • 23. Practicable and sustainable • Quantitative analysis of performance, to assess practicability • In-Memory implementation (Naive) • Execution time of q1-q12 • AVG on 10 executions • Comparable to RML Mapper and SPARQL Generate on files up to 1M JSON objects (~5M triples) • In-Memory implementation scales linearly • The approach is practicable • Research on performance as future work • Lines of Java code to maintain: SPARQL Generate 12280 (core module); RML Mapper 7951; SPARQL Anything: 3842 (all transformers) — v0.2.0 (v0.3.0 has ~11k)
  • 24. Integrates with a typical SW workflow • While we cannot assume that Semantic Web experts have knowledge of RML, XPath, and SPARQL Generate, we can definitely expect knowledge of SPARQL Adaptable and extendable • Facade-X can be extended, if needed (although we didn’t need to so far) • SPARQL Anything is easy to extend to more formats: • New transformers just need to produce the facade • No major changes to the user experience (new format-specific options) • Changes to the adapters don’t require changing the user-facing code • In contrast to RML / SPARQL Generate which need to extend the user-facing toolkit
  • 25. Future work • Study formal properties of Facade-X, e.g. prove generality, semantics of meta-model mappings, … • User study comparing Mapping Language vs Facade based approach • Methodology for developers to connect new data sources (e.g. GraphViz DOT, MIDI, …) • Current approach is limited to serialised resources (aka files) • In-memory transformation before query execution • A triple filtering strategy can reduce memory requirements significantly • Study strategies to cope with very large files (e.g. slicing) • Study query-rewriting strategies, eventually rewriting mappings into efficient, iterator-based transformers (mapping translation [Corcho 2020]) • Relational Database, No-SQL (e.g. mongoDB) • Reuse existing approaches (e.g. Ontop) but hide complexity to the user - avoid the need for configuration.
  • 26. Get in touch! SPARQL Anything is under active development! https://sparql-anything.cc GitHub: https://github.com/SPARQL-Anything/sparql.anything enrico.daga@open.ac.uk @enridaga www.enridaga.net
  • 27. • Daga, E., Asprino, L., Mulholland, P., Gangemi, A.: Facade-x: an opinionated approach to sparql anything. In: SEMANTiCS 2021: 17th International Conference on Semantic Systems (2021) • Atkin, M., Deely, T., Scharffe, F.: Knowledge Graph Benchmarking Report 2021 (version 2.0). Zenodo, http://doi.org/10.5281/zenodo.4950097 (June 2021) • Lassila, O., Michael Schmidt, Brad Bebee, Dave Bechberger, Willem Broekema, Ankesh Khandelwal, Kelvin Lawrence, Ronak Sharda, and Bryan Thompson: Graph? Yes! Which one? Help!. 1st Squaring the circle on knowledge graphs workshop - Semantics (2021) • Daga, E., Meroño-Peñuela, A., Motta, E.: Sequential linked data: the state of affairs. Semantic Web (2021) • Warren, P., Mulholland, P.: Using sparql–the practitioners’ viewpoint. In: European Knowledge Acquisition Workshop. pp. 485–500. Springer (2018) • Corcho, O., Priyatna, F., Chaves-Fraga, D.: Towards a new generation of ontology based data access. Semantic Web 11(1), 153–160 (2020) • Michel, F., Faron-Zucker, C., Corby, O., Gandon, F.: Enabling automatic discovery and querying of web apis at web scale using linked data standards. In: Companion Proceedings of The 2019 World Wide Web Conference. pp. 883–892 (2019) • Dimou, A., Vander Sande, M., Colpaert, P., Verborgh, R., Mannens, E., Van de Walle, R.: Rml: a generic language for integrated rdf mappings of heterogeneous data. In: 7th Workshop on Linked Data on the Web (2014) • García-González, H., Boneva, I., Staworko, S., Labra-Gayo, J.E., Lovelle, J.M.C.: Shexml: improving the usability of heterogeneous data mapping languages for firsttime users. PeerJ Computer Science 6, e318 (2020) • Ko, A.J., Abraham, R., Beckwith, L., Blackwell, A., Burnett, M., Erwig, M., Scaffidi, C., Lawrance, J., Lieberman, H., Myers, B., et al.: The state of the art in enduser software engineering. ACM Computing Surveys (CSUR) 43(3), 1–44 (2011) • Lefrançois, M., Zimmermann, A., Bakerally, N.: A sparql extension for generating rdf from heterogeneous formats. In: European Semantic Web Conference. pp. 35– 50. Springer (2017) • Lieberman, H., Paternò, F., Klann, M., Wulf, V.: End-user development: An emerging paradigm. In: End user development, pp. 1–8. Springer (2006) • Cyganiak, Richard. Tarql (sparql for tables): Turn csv into rdf using sparql syntax. Technical Report, 2015. http://tarql. github. io, 2015. References