SlideShare a Scribd company logo
1 of 18
Download to read offline
The SPARQL Anything project
Enrico Daga and Luigi Asprino
The Web Conference - Developers Track
22/04/2021 - online @enridaga
Background
• Semantic Web developers always concerned with methods to
“lift” legacy content to RDF:
• Targeting specific types/formats: SPARQL Microservices
[Michel, 2019], Tarql, Any23, JSON2RDF, CSV2RDF
• Mapping languages, several types of (e.g. RML,
ShexML): high learning demands. [Dimou, 2014]
[García-González, 2020]
• SPARQL Generate: learning demands, difficult to extend
to other formats. [Lefrançois, 2017]
• Solutions transfer data source complexity to the user (e.g.
know XPath for XML, JsonPath for JSON, …)
• End-user development [Lieberman, 2006]. Many SPARQL
users fall into the category of end-user developer. In a recent
survey, 42% SPARQL users are from non-IT areas,
including social sciences and the humanities, business and
economics, and biomedical, engineering or physical sciences.
SPICE
Social Cohesion, Participation and Inclusion
through Cultural Engagement
Polifonia
Digital Harmoniser of Musical Cultural Heritage
-
Cultural Heritage Knowledge Graphs
-
Sources in different formats
x
Multiple / unknown ontologies
=
Duplication of effort!!!
https://spice.kmi.open.ac.uk/
http://spice-h2020.eu
https://polifonia-project.eu/
This project has received funding from the European
Union’s Horizon 2020 research and innovation
programme
Knowledge Graph Construction
Composite process:
• Observe: the data source (e.g. a CSV file)
• Map: develop mappings to a target ontology
• Triplify: run the mappings and evaluate the result
• (many iterations)
KG construction is a twofold job:
• perform a syntax/structure conversion (e.g. from CSV to RDF)
• project semantics onto the data (applying a domain ontology)
Concept
… twofold job:
• perform a syntax/structure conversion -> Re-engineering
• We want to solve this problem once and for all
• project semantics onto the data (applying a domain ontology) -> Re-modelling
• We leave this to the end user, powered by SPARQL 1.1
• Approach: design a single RDF facade for any data format
• Re-engineering
• Focus on the syntax and the meta-model (data structure)
• Leave data as much as possible as-it-is!
• apply the least possible “ontological commitment”
https://en.wikipedia.org/wiki/Facade_pattern
An RDF Facade?
Problem Space
• CSV
• JSON
• HTML
• XML
• Binary (JPEG, PNG, …)
• Text
Solution Space
• https://www.w3.org/TR/rdf11-concepts/
• https://www.w3.org/TR/rdf-schema/
rdf:type, rdf:Property, rdfs:label,
rdfs:Resource, rdfs:Class, rdf:Bag,
rdfs:Container, rdf:List, RDF Dataset,
Graph, …
Facade-X: (to be filled by picking and mixing from the solution space)
Ups! We are facing the same old problem … only this time we don’t care about the content
(domain) and we only focus on the format and data structure (meta-model)
CSV
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:Root a rdfs:Class .
id,name,gender,dates,yearOfBirth,yearOfDeath,placeOfBirth,placeOfDeath,url
10093,"Abakanowicz, Magdalena",Female,born 1930,1930,,Polska,,http://www.tate.org.uk/art/artists/magdalena-abakanowicz-10093
…
https://github.com/tategallery/collection/blob/master/artist_data.csv
[ a fx:root ;
rdf:_1 [ xyz:dates "born 1930" ;
xyz:gender "Female" ;
xyz:id "10093" ;
xyz:name "Abakanowicz, Magdalena" ;
xyz:placeOfBirth "Polska" ;
xyz:placeOfDeath "" ;
xyz:url "http://www.tate.org.uk/art/artists/magdalena-
abakanowicz-10093" ;
xyz:yearOfBirth "1930" ;
xyz:yearOfDeath ""
] ;
csv.headers=true|false
[ a fx:root ;
rdf:_1 [ rdf:_1 "id" ;
rdf:_2 "name" ;
rdf:_3 "gender" ;
rdf:_4 "dates" ;
rdf:_5 "yearOfBirth" ;
rdf:_6 "yearOfDeath" ;
rdf:_7 "placeOfBirth" ;
rdf:_8 “placeOfDeath" ;
rdf:_9 "url"
] ;
CSV
JSON
HTML
XML
Binary (JPEG, PNG, …)
Text
@enridaga
JSON
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:Root a rdfs:Class .
xsd:int a rdfs:Datatype.
xsd:string a rdfs:Datatype.
xsd:boolean a rdfs:Datatype.
xsd:decimal a rdfs:Datatype.
xsd:float a rdfs:Datatype.
xsd:double a rdfs:Datatype.
https://github.com/tategallery/collection/artworks/t/023/t02319-9205.json
[ a fx:root ;
xyz:acno "T02319" ;
xyz:acquisitionYear "1978"^^<http://www.w3.org/2001/XMLSchema#int> ;
xyz:all_artists "Kazimir Malevich" ;
xyz:catalogueGroup [] ;
xyz:classification "painting" ;
xyz:contributorCount "1"^^<http://www.w3.org/2001/XMLSchema#int> ;
…
{
"acno": "T02319",
"acquisitionYear": 1978,
"all_artists": "Kazimir Malevich",
"catalogueGroup": {},
"classification": "painting",
"contributorCount": 1,
"contributors": [
{
CSV
JSON
HTML
XML
Binary (JPEG, PNG, …)
Text
DOM (HTML, XML, …)
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:Root a rdfs:Class .
xsd:int a rdfs:Datatype.
xsd:string a rdfs:Datatype.
xsd:boolean a rdfs:Datatype.
xsd:decimal a rdfs:Datatype.
xsd:float a rdfs:Datatype.
xsd:double a rdfs:Datatype.
rdf:type rdf:type rdf:Property
https://imma.ie/artists/
[ a fx:root , xhtml:div ;
xhtml:id “az-group” ;
rdf:_1 [ a xhtml:div ;
rdf:_1 [ a xhtml:h4 ;
rdf:_1 "A" ;
<https://html.spec.whatwg.org/#innerHTML>
"A" ;
<https://html.spec.whatwg.org/#innerText>
"A"
] ;
…
html.selector=#az-group
@prefix xhtml: <http://www.w3.org/1999/xhtml#> .
CSV
JSON
HTML
XML
Binary (JPEG, PNG, …)
Text
Binary and Text
Facade: http://sparql.xyz/facade-x/ns/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fx: <http://sparql.xyz/facade-x/ns/>.
@prefix xyz: <http://sparql.xyz/facade-x/data/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
rdf:Property a rdfs:Class .
rdfs:ContainerMembershipProperty
rdfs:subClassOf rdf:Property .
fx:Root a rdfs:Class .
xsd:int a rdfs:Datatype.
xsd:string a rdfs:Datatype.
xsd:boolean a rdfs:Datatype.
xsd:decimal a rdfs:Datatype.
xsd:float a rdfs:Datatype.
xsd:double a rdfs:Datatype.
xsd:base64Binary a rdfs:Datatype.
rdf:type df:type rdf:Property
[ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> “/9j/
4AAQSkZJRgABAQEASABIAAD/
4QmsRXhpZgAASUkqAAgAAAALAA8BAgAGAAAAkgAAABABAgAOAAAAmAAAABIBAw
ABAAAAAQAAABoBBQABAAAApgAAABsBBQABAAAArgAAACgBAwABAAAAAgAAADEB
AgALAAAAtgAAADIBAgAUAAAAwgAAABMCAwABAAAAAgAAAGmHBAABAAAA1gAAAC
WIBAABAAAA0gMAAOQDAABDYW5vbgBDYW5vbiBFT1MgNDBEAEgAAAABAAAASAAA
AAEAAABHSU1QIDIuNC41AAAyMDA4OjA3OjMxIDEwOjM4OjExAB4Am…”^^<http
://www.w3.org/2001/XMLSchema#base64Binary> ] .
bin.encoding # BASE64
txt.regex # tokenise into a sequence
CSV
JSON
HTML
XML
Binary (JPEG, PNG, …)
Text
https://imma.ie/collection/freeing-the-voice/
Hello World! [ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> "Hello World!" ] .
https://sparql-anything.cc/
https://
github.com/
SPARQL-
Anything/
showcase-tate
Assumption: SPARQL
1.1 CONSTRUCT
queries will be enough
to design mappings (the
re-modelling phase)
https://github.com/
SPARQL-Anything/
showcase-imma
https://imma.ie/collection/freeing-the-memory/
Preliminary feedback
• From 27 users, diverse SPARQL expertise
• Essential or very important
• the system should minimise the languages or syntaxes needed
• mappings should be easy to read and interpret
• the system must be easy to learn for a Semantic Web practitioner
• the system is able to support new types of data sources without changes to the mapping language
• How easy is this code to understand
(comparing equivalent mappings)?
• (a) RML
• (b) SPARQL Generate
• (c) SPARQL Anything
Benefits
• Transform / Query resources having heterogeneous formats
• Low learning demands (plain SPARQL 1.1)
• Minimise complexity of the mappings
• A single+consistent abstraction for any data format
• Enable data exploration in the absence of a domain ontology
• Integrate with a typical Semantic Web engineering workflow
• Flexible and adaptable (Facade-X can be extended, if needed)
• Easy to extend:
• new transformers just need to return the facade
• no major changes to the user experience
Challenges
• No commitment on the internal machinery! (It is a gift and a curse …)
• Current version v0.1.1 (we started Nov 2020):
• implemented on top of Apache Jena ARQ
• limited to files
• loads the triples in-memory and then performs the query
• A triple filtering strategy reduces in-memory dataset
• Very large files require very large memory
• Next: to develop strategies to cope with large resources (e.g. slicing)
• Next: to develop query-rewriting strategies, eventually rewriting mappings into efficient,
iterator-based transformers (mapping translation [Corcho 2020])
• Next: Relational Database, No-SQL (e.g. mongoDB)
• Reuse existing approaches (e.g. OBDA) but hide complexity to the user
Get in touch!
SPARQL Anything is under active development
https://github.com/SPARQL-Anything/sparql.anything
enrico.daga@open.ac.uk
@enridaga
www.enridaga.net
References
• Daga, E., Asprino, L., Mulholland, P., Gangemi, A.: Facade-x: an opinionated approach to sparql anything (submitted). In: SEMANTiCS 2021:
17th International Conference on Semantic Systems (2021)
• Daga, E., Meroño-Peñuela, A., Motta, E.: Sequential linked data: the state of affairs. Semantic Web (2021)
• Warren, P., Mulholland, P.: Using sparql–the practitioners’ viewpoint. In: European Knowledge Acquisition Workshop. pp. 485–500. Springer
(2018)
• Corcho, O., Priyatna, F., Chaves-Fraga, D.: Towards a new generation of ontology based data access. Semantic Web 11(1), 153–160 (2020)
• Michel, F., Faron-Zucker, C., Corby, O., Gandon, F.: Enabling automatic discovery and querying of web apis at web scale using linked data
standards. In: Companion Proceedings of The 2019 World Wide Web Conference. pp. 883–892 (2019)
• Dimou, A., Vander Sande, M., Colpaert, P., Verborgh, R., Mannens, E., Van de Walle, R.: Rml: a generic language for integrated rdf mappings
of heterogeneous data. In: 7th Workshop on Linked Data on the Web (2014)
• García-González, H., Boneva, I., Staworko, S., Labra-Gayo, J.E., Lovelle, J.M.C.: Shexml: improving the usability of heterogeneous data
mapping languages for firsttime users. PeerJ Computer Science 6, e318 (2020)
• Ko, A.J., Abraham, R., Beckwith, L., Blackwell, A., Burnett, M., Erwig, M., Scaffidi, C., Lawrance, J., Lieberman, H., Myers, B., et al.: The state
of the art in enduser software engineering. ACM Computing Surveys (CSUR) 43(3), 1–44 (2011)
• Lefrançois, M., Zimmermann, A., Bakerally, N.: A sparql extension for generating rdf from heterogeneous formats. In: European Semantic Web
Conference. pp. 35– 50. Springer (2017)
• Lieberman, H., Paternò, F., Klann, M., Wulf, V.: End-user development: An emerging paradigm. In: End user development, pp. 1–8. Springer
(2006)
• Cyganiak, Richard. Tarql (sparql for tables): Turn csv into rdf using sparql syntax. Technical Report, 2015. http://tarql. github. io, 2015.

More Related Content

What's hot

Linked Open Data Visualization
Linked Open Data VisualizationLinked Open Data Visualization
Linked Open Data VisualizationLaura Po
 
Mapping, Interlinking and Exposing MusicBrainz as Linked Data
Mapping, Interlinking and Exposing MusicBrainz as Linked DataMapping, Interlinking and Exposing MusicBrainz as Linked Data
Mapping, Interlinking and Exposing MusicBrainz as Linked DataPeter Haase
 
Semantic Web introduction
Semantic Web introductionSemantic Web introduction
Semantic Web introductionGraphity
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And VisualizationIvan Ermilov
 
Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphsandyseaborne
 
Building Linked Data Applications
Building Linked Data ApplicationsBuilding Linked Data Applications
Building Linked Data ApplicationsEUCLID project
 
SHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data MudSHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data MudRichard Cyganiak
 
SPARQL in the Semantic Web
SPARQL in the Semantic WebSPARQL in the Semantic Web
SPARQL in the Semantic WebJan Beeck
 
Linked Open Data: A simple how-to
Linked Open Data: A simple how-toLinked Open Data: A simple how-to
Linked Open Data: A simple how-tonvitucci
 

What's hot (12)

Linked Open Data Visualization
Linked Open Data VisualizationLinked Open Data Visualization
Linked Open Data Visualization
 
Triple Stores
Triple StoresTriple Stores
Triple Stores
 
RDF data model
RDF data modelRDF data model
RDF data model
 
RDF, linked data and semantic web
RDF, linked data and semantic webRDF, linked data and semantic web
RDF, linked data and semantic web
 
Mapping, Interlinking and Exposing MusicBrainz as Linked Data
Mapping, Interlinking and Exposing MusicBrainz as Linked DataMapping, Interlinking and Exposing MusicBrainz as Linked Data
Mapping, Interlinking and Exposing MusicBrainz as Linked Data
 
Semantic Web introduction
Semantic Web introductionSemantic Web introduction
Semantic Web introduction
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And Visualization
 
Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphs
 
Building Linked Data Applications
Building Linked Data ApplicationsBuilding Linked Data Applications
Building Linked Data Applications
 
SHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data MudSHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data Mud
 
SPARQL in the Semantic Web
SPARQL in the Semantic WebSPARQL in the Semantic Web
SPARQL in the Semantic Web
 
Linked Open Data: A simple how-to
Linked Open Data: A simple how-toLinked Open Data: A simple how-to
Linked Open Data: A simple how-to
 

Similar to The SPARQL Anything project

Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.Enrico Daga
 
Overview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developmentsOverview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developmentsMaxime Lefrançois
 
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...Jose Quesada (hiring)
 
Introduction to libre « fulltext » technology
Introduction to libre « fulltext » technologyIntroduction to libre « fulltext » technology
Introduction to libre « fulltext » technologyRobert Viseur
 
The nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologiesThe nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologiesTony Hammond
 
Big data distributed processing: Spark introduction
Big data distributed processing: Spark introductionBig data distributed processing: Spark introduction
Big data distributed processing: Spark introductionHektor Jacynycz García
 
Two heads are better than one a report p on the drf technical workshop
Two heads are better than one a report p on the drf technical workshopTwo heads are better than one a report p on the drf technical workshop
Two heads are better than one a report p on the drf technical workshopYuji Nonaka
 
Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFDimitris Kontokostas
 
A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...
A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...
A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...Franck Michel
 
Apache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big DataApache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big DataPaco Nathan
 
JSON-LD update DC 2017
JSON-LD update DC 2017JSON-LD update DC 2017
JSON-LD update DC 2017Gregg Kellogg
 
Specialising the EDM for Digitised Manuscript (SWIB13)
Specialising the EDM for Digitised Manuscript (SWIB13)Specialising the EDM for Digitised Manuscript (SWIB13)
Specialising the EDM for Digitised Manuscript (SWIB13)Kai Eckert
 
Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...
Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...
Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...datascienceiqss
 
Leveraging Model-Driven Technologies for JSON Artefacts: The Shipyard Case Study
Leveraging Model-Driven Technologies for JSON Artefacts: The Shipyard Case StudyLeveraging Model-Driven Technologies for JSON Artefacts: The Shipyard Case Study
Leveraging Model-Driven Technologies for JSON Artefacts: The Shipyard Case StudyLuca Berardinelli
 
Hadoop Tutorial with @techmilind
Hadoop Tutorial with @techmilindHadoop Tutorial with @techmilind
Hadoop Tutorial with @techmilindEMC
 
Putting Historical Data in Context: how to use DSpace-GLAM
Putting Historical Data in Context: how to use DSpace-GLAMPutting Historical Data in Context: how to use DSpace-GLAM
Putting Historical Data in Context: how to use DSpace-GLAM4Science
 

Similar to The SPARQL Anything project (20)

Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.
 
Spark meetup TCHUG
Spark meetup TCHUGSpark meetup TCHUG
Spark meetup TCHUG
 
Overview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developmentsOverview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developments
 
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
 
Introduction to libre « fulltext » technology
Introduction to libre « fulltext » technologyIntroduction to libre « fulltext » technology
Introduction to libre « fulltext » technology
 
The nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologiesThe nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologies
 
Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-
 
Big data distributed processing: Spark introduction
Big data distributed processing: Spark introductionBig data distributed processing: Spark introduction
Big data distributed processing: Spark introduction
 
Two heads are better than one a report p on the drf technical workshop
Two heads are better than one a report p on the drf technical workshopTwo heads are better than one a report p on the drf technical workshop
Two heads are better than one a report p on the drf technical workshop
 
Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDF
 
A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...
A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...
A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...
 
Apache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big DataApache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big Data
 
JSON-LD update DC 2017
JSON-LD update DC 2017JSON-LD update DC 2017
JSON-LD update DC 2017
 
Specialising the EDM for Digitised Manuscript (SWIB13)
Specialising the EDM for Digitised Manuscript (SWIB13)Specialising the EDM for Digitised Manuscript (SWIB13)
Specialising the EDM for Digitised Manuscript (SWIB13)
 
Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...
Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...
Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript...
 
Leveraging Model-Driven Technologies for JSON Artefacts: The Shipyard Case Study
Leveraging Model-Driven Technologies for JSON Artefacts: The Shipyard Case StudyLeveraging Model-Driven Technologies for JSON Artefacts: The Shipyard Case Study
Leveraging Model-Driven Technologies for JSON Artefacts: The Shipyard Case Study
 
Changing Platforms
Changing PlatformsChanging Platforms
Changing Platforms
 
Hadoop Tutorial with @techmilind
Hadoop Tutorial with @techmilindHadoop Tutorial with @techmilind
Hadoop Tutorial with @techmilind
 
Grails goes Graph
Grails goes GraphGrails goes Graph
Grails goes Graph
 
Putting Historical Data in Context: how to use DSpace-GLAM
Putting Historical Data in Context: how to use DSpace-GLAMPutting Historical Data in Context: how to use DSpace-GLAM
Putting Historical Data in Context: how to use DSpace-GLAM
 

More from Enrico Daga

Citizen Experiences in Cultural Heritage Archives: a Data Journey
Citizen Experiences in Cultural Heritage Archives: a Data JourneyCitizen Experiences in Cultural Heritage Archives: a Data Journey
Citizen Experiences in Cultural Heritage Archives: a Data JourneyEnrico Daga
 
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...Enrico Daga
 
Capturing the semantics of documentary evidence for humanities research
Capturing the semantics of documentary evidence for humanities researchCapturing the semantics of documentary evidence for humanities research
Capturing the semantics of documentary evidence for humanities researchEnrico Daga
 
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...Enrico Daga
 
Linked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities researchLinked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities researchEnrico Daga
 
Capturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid ApproachCapturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid ApproachEnrico Daga
 
Challenging knowledge extraction to support
the curation of documentary evide...
Challenging knowledge extraction to support
the curation of documentary evide...Challenging knowledge extraction to support
the curation of documentary evide...
Challenging knowledge extraction to support
the curation of documentary evide...Enrico Daga
 
OU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data ClusterOU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data ClusterEnrico Daga
 
CityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesCityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesEnrico Daga
 
Propagating Data Policies - A User Study
Propagating Data Policies - A User StudyPropagating Data Policies - A User Study
Propagating Data Policies - A User StudyEnrico Daga
 
Linked Data at the OU - the story so far
Linked Data at the OU - the story so farLinked Data at the OU - the story so far
Linked Data at the OU - the story so farEnrico Daga
 
Propagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsPropagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsEnrico Daga
 
A bottom up approach for licences classification and selection
A bottom up approach for licences classification and selectionA bottom up approach for licences classification and selection
A bottom up approach for licences classification and selectionEnrico Daga
 
A BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL EndpointsA BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL EndpointsEnrico Daga
 
Early Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data CubesEarly Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data CubesEnrico Daga
 

More from Enrico Daga (16)

Citizen Experiences in Cultural Heritage Archives: a Data Journey
Citizen Experiences in Cultural Heritage Archives: a Data JourneyCitizen Experiences in Cultural Heritage Archives: a Data Journey
Citizen Experiences in Cultural Heritage Archives: a Data Journey
 
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...Streamlining Knowledge Graph Construction with a façade:  the SPARQL Anything...
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...
 
Capturing the semantics of documentary evidence for humanities research
Capturing the semantics of documentary evidence for humanities researchCapturing the semantics of documentary evidence for humanities research
Capturing the semantics of documentary evidence for humanities research
 
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
 
Linked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities researchLinked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities research
 
Capturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid ApproachCapturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid Approach
 
Challenging knowledge extraction to support
the curation of documentary evide...
Challenging knowledge extraction to support
the curation of documentary evide...Challenging knowledge extraction to support
the curation of documentary evide...
Challenging knowledge extraction to support
the curation of documentary evide...
 
Ld4 dh tutorial
Ld4 dh tutorialLd4 dh tutorial
Ld4 dh tutorial
 
OU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data ClusterOU RSE Tutorial Big Data Cluster
OU RSE Tutorial Big Data Cluster
 
CityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tablesCityLABS Workshop: Working with large tables
CityLABS Workshop: Working with large tables
 
Propagating Data Policies - A User Study
Propagating Data Policies - A User StudyPropagating Data Policies - A User Study
Propagating Data Policies - A User Study
 
Linked Data at the OU - the story so far
Linked Data at the OU - the story so farLinked Data at the OU - the story so far
Linked Data at the OU - the story so far
 
Propagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsPropagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data Flows
 
A bottom up approach for licences classification and selection
A bottom up approach for licences classification and selectionA bottom up approach for licences classification and selection
A bottom up approach for licences classification and selection
 
A BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL EndpointsA BASILar Approach for Building Web APIs on top of SPARQL Endpoints
A BASILar Approach for Building Web APIs on top of SPARQL Endpoints
 
Early Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data CubesEarly Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data Cubes
 

Recently uploaded

NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhYasamin16
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...ttt fff
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一F sss
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 

Recently uploaded (20)

NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 

The SPARQL Anything project

  • 1. The SPARQL Anything project Enrico Daga and Luigi Asprino The Web Conference - Developers Track 22/04/2021 - online @enridaga
  • 2. Background • Semantic Web developers always concerned with methods to “lift” legacy content to RDF: • Targeting specific types/formats: SPARQL Microservices [Michel, 2019], Tarql, Any23, JSON2RDF, CSV2RDF • Mapping languages, several types of (e.g. RML, ShexML): high learning demands. [Dimou, 2014] [García-González, 2020] • SPARQL Generate: learning demands, difficult to extend to other formats. [Lefrançois, 2017] • Solutions transfer data source complexity to the user (e.g. know XPath for XML, JsonPath for JSON, …) • End-user development [Lieberman, 2006]. Many SPARQL users fall into the category of end-user developer. In a recent survey, 42% SPARQL users are from non-IT areas, including social sciences and the humanities, business and economics, and biomedical, engineering or physical sciences.
  • 3. SPICE Social Cohesion, Participation and Inclusion through Cultural Engagement Polifonia Digital Harmoniser of Musical Cultural Heritage - Cultural Heritage Knowledge Graphs - Sources in different formats x Multiple / unknown ontologies = Duplication of effort!!! https://spice.kmi.open.ac.uk/ http://spice-h2020.eu https://polifonia-project.eu/ This project has received funding from the European Union’s Horizon 2020 research and innovation programme
  • 4. Knowledge Graph Construction Composite process: • Observe: the data source (e.g. a CSV file) • Map: develop mappings to a target ontology • Triplify: run the mappings and evaluate the result • (many iterations) KG construction is a twofold job: • perform a syntax/structure conversion (e.g. from CSV to RDF) • project semantics onto the data (applying a domain ontology)
  • 5. Concept … twofold job: • perform a syntax/structure conversion -> Re-engineering • We want to solve this problem once and for all • project semantics onto the data (applying a domain ontology) -> Re-modelling • We leave this to the end user, powered by SPARQL 1.1 • Approach: design a single RDF facade for any data format • Re-engineering • Focus on the syntax and the meta-model (data structure) • Leave data as much as possible as-it-is! • apply the least possible “ontological commitment” https://en.wikipedia.org/wiki/Facade_pattern
  • 6. An RDF Facade? Problem Space • CSV • JSON • HTML • XML • Binary (JPEG, PNG, …) • Text Solution Space • https://www.w3.org/TR/rdf11-concepts/ • https://www.w3.org/TR/rdf-schema/ rdf:type, rdf:Property, rdfs:label, rdfs:Resource, rdfs:Class, rdf:Bag, rdfs:Container, rdf:List, RDF Dataset, Graph, … Facade-X: (to be filled by picking and mixing from the solution space) Ups! We are facing the same old problem … only this time we don’t care about the content (domain) and we only focus on the format and data structure (meta-model)
  • 7. CSV Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:Root a rdfs:Class . id,name,gender,dates,yearOfBirth,yearOfDeath,placeOfBirth,placeOfDeath,url 10093,"Abakanowicz, Magdalena",Female,born 1930,1930,,Polska,,http://www.tate.org.uk/art/artists/magdalena-abakanowicz-10093 … https://github.com/tategallery/collection/blob/master/artist_data.csv [ a fx:root ; rdf:_1 [ xyz:dates "born 1930" ; xyz:gender "Female" ; xyz:id "10093" ; xyz:name "Abakanowicz, Magdalena" ; xyz:placeOfBirth "Polska" ; xyz:placeOfDeath "" ; xyz:url "http://www.tate.org.uk/art/artists/magdalena- abakanowicz-10093" ; xyz:yearOfBirth "1930" ; xyz:yearOfDeath "" ] ; csv.headers=true|false [ a fx:root ; rdf:_1 [ rdf:_1 "id" ; rdf:_2 "name" ; rdf:_3 "gender" ; rdf:_4 "dates" ; rdf:_5 "yearOfBirth" ; rdf:_6 "yearOfDeath" ; rdf:_7 "placeOfBirth" ; rdf:_8 “placeOfDeath" ; rdf:_9 "url" ] ; CSV JSON HTML XML Binary (JPEG, PNG, …) Text @enridaga
  • 8. JSON Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:Root a rdfs:Class . xsd:int a rdfs:Datatype. xsd:string a rdfs:Datatype. xsd:boolean a rdfs:Datatype. xsd:decimal a rdfs:Datatype. xsd:float a rdfs:Datatype. xsd:double a rdfs:Datatype. https://github.com/tategallery/collection/artworks/t/023/t02319-9205.json [ a fx:root ; xyz:acno "T02319" ; xyz:acquisitionYear "1978"^^<http://www.w3.org/2001/XMLSchema#int> ; xyz:all_artists "Kazimir Malevich" ; xyz:catalogueGroup [] ; xyz:classification "painting" ; xyz:contributorCount "1"^^<http://www.w3.org/2001/XMLSchema#int> ; … { "acno": "T02319", "acquisitionYear": 1978, "all_artists": "Kazimir Malevich", "catalogueGroup": {}, "classification": "painting", "contributorCount": 1, "contributors": [ { CSV JSON HTML XML Binary (JPEG, PNG, …) Text
  • 9. DOM (HTML, XML, …) Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:Root a rdfs:Class . xsd:int a rdfs:Datatype. xsd:string a rdfs:Datatype. xsd:boolean a rdfs:Datatype. xsd:decimal a rdfs:Datatype. xsd:float a rdfs:Datatype. xsd:double a rdfs:Datatype. rdf:type rdf:type rdf:Property https://imma.ie/artists/ [ a fx:root , xhtml:div ; xhtml:id “az-group” ; rdf:_1 [ a xhtml:div ; rdf:_1 [ a xhtml:h4 ; rdf:_1 "A" ; <https://html.spec.whatwg.org/#innerHTML> "A" ; <https://html.spec.whatwg.org/#innerText> "A" ] ; … html.selector=#az-group @prefix xhtml: <http://www.w3.org/1999/xhtml#> . CSV JSON HTML XML Binary (JPEG, PNG, …) Text
  • 10. Binary and Text Facade: http://sparql.xyz/facade-x/ns/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix fx: <http://sparql.xyz/facade-x/ns/>. @prefix xyz: <http://sparql.xyz/facade-x/data/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. rdf:Property a rdfs:Class . rdfs:ContainerMembershipProperty rdfs:subClassOf rdf:Property . fx:Root a rdfs:Class . xsd:int a rdfs:Datatype. xsd:string a rdfs:Datatype. xsd:boolean a rdfs:Datatype. xsd:decimal a rdfs:Datatype. xsd:float a rdfs:Datatype. xsd:double a rdfs:Datatype. xsd:base64Binary a rdfs:Datatype. rdf:type df:type rdf:Property [ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> “/9j/ 4AAQSkZJRgABAQEASABIAAD/ 4QmsRXhpZgAASUkqAAgAAAALAA8BAgAGAAAAkgAAABABAgAOAAAAmAAAABIBAw ABAAAAAQAAABoBBQABAAAApgAAABsBBQABAAAArgAAACgBAwABAAAAAgAAADEB AgALAAAAtgAAADIBAgAUAAAAwgAAABMCAwABAAAAAgAAAGmHBAABAAAA1gAAAC WIBAABAAAA0gMAAOQDAABDYW5vbgBDYW5vbiBFT1MgNDBEAEgAAAABAAAASAAA AAEAAABHSU1QIDIuNC41AAAyMDA4OjA3OjMxIDEwOjM4OjExAB4Am…”^^<http ://www.w3.org/2001/XMLSchema#base64Binary> ] . bin.encoding # BASE64 txt.regex # tokenise into a sequence CSV JSON HTML XML Binary (JPEG, PNG, …) Text https://imma.ie/collection/freeing-the-voice/ Hello World! [ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> "Hello World!" ] .
  • 12. https:// github.com/ SPARQL- Anything/ showcase-tate Assumption: SPARQL 1.1 CONSTRUCT queries will be enough to design mappings (the re-modelling phase)
  • 14. Preliminary feedback • From 27 users, diverse SPARQL expertise • Essential or very important • the system should minimise the languages or syntaxes needed • mappings should be easy to read and interpret • the system must be easy to learn for a Semantic Web practitioner • the system is able to support new types of data sources without changes to the mapping language • How easy is this code to understand (comparing equivalent mappings)? • (a) RML • (b) SPARQL Generate • (c) SPARQL Anything
  • 15. Benefits • Transform / Query resources having heterogeneous formats • Low learning demands (plain SPARQL 1.1) • Minimise complexity of the mappings • A single+consistent abstraction for any data format • Enable data exploration in the absence of a domain ontology • Integrate with a typical Semantic Web engineering workflow • Flexible and adaptable (Facade-X can be extended, if needed) • Easy to extend: • new transformers just need to return the facade • no major changes to the user experience
  • 16. Challenges • No commitment on the internal machinery! (It is a gift and a curse …) • Current version v0.1.1 (we started Nov 2020): • implemented on top of Apache Jena ARQ • limited to files • loads the triples in-memory and then performs the query • A triple filtering strategy reduces in-memory dataset • Very large files require very large memory • Next: to develop strategies to cope with large resources (e.g. slicing) • Next: to develop query-rewriting strategies, eventually rewriting mappings into efficient, iterator-based transformers (mapping translation [Corcho 2020]) • Next: Relational Database, No-SQL (e.g. mongoDB) • Reuse existing approaches (e.g. OBDA) but hide complexity to the user
  • 17. Get in touch! SPARQL Anything is under active development https://github.com/SPARQL-Anything/sparql.anything enrico.daga@open.ac.uk @enridaga www.enridaga.net
  • 18. References • Daga, E., Asprino, L., Mulholland, P., Gangemi, A.: Facade-x: an opinionated approach to sparql anything (submitted). In: SEMANTiCS 2021: 17th International Conference on Semantic Systems (2021) • Daga, E., Meroño-Peñuela, A., Motta, E.: Sequential linked data: the state of affairs. Semantic Web (2021) • Warren, P., Mulholland, P.: Using sparql–the practitioners’ viewpoint. In: European Knowledge Acquisition Workshop. pp. 485–500. Springer (2018) • Corcho, O., Priyatna, F., Chaves-Fraga, D.: Towards a new generation of ontology based data access. Semantic Web 11(1), 153–160 (2020) • Michel, F., Faron-Zucker, C., Corby, O., Gandon, F.: Enabling automatic discovery and querying of web apis at web scale using linked data standards. In: Companion Proceedings of The 2019 World Wide Web Conference. pp. 883–892 (2019) • Dimou, A., Vander Sande, M., Colpaert, P., Verborgh, R., Mannens, E., Van de Walle, R.: Rml: a generic language for integrated rdf mappings of heterogeneous data. In: 7th Workshop on Linked Data on the Web (2014) • García-González, H., Boneva, I., Staworko, S., Labra-Gayo, J.E., Lovelle, J.M.C.: Shexml: improving the usability of heterogeneous data mapping languages for firsttime users. PeerJ Computer Science 6, e318 (2020) • Ko, A.J., Abraham, R., Beckwith, L., Blackwell, A., Burnett, M., Erwig, M., Scaffidi, C., Lawrance, J., Lieberman, H., Myers, B., et al.: The state of the art in enduser software engineering. ACM Computing Surveys (CSUR) 43(3), 1–44 (2011) • Lefrançois, M., Zimmermann, A., Bakerally, N.: A sparql extension for generating rdf from heterogeneous formats. In: European Semantic Web Conference. pp. 35– 50. Springer (2017) • Lieberman, H., Paternò, F., Klann, M., Wulf, V.: End-user development: An emerging paradigm. In: End user development, pp. 1–8. Springer (2006) • Cyganiak, Richard. Tarql (sparql for tables): Turn csv into rdf using sparql syntax. Technical Report, 2015. http://tarql. github. io, 2015.