grlc Makes GitHub Taste Like Linked Data APIs

Building Web APIs on top of SPARQL endpoints is becoming common practice. It enables universal access to the integration-favorable data space of Linked Data. In the majority of use cases, users cannot be expected to learn SPARQL to query this data space. Web APIs are the most common way to enable programmatic access to data on the Web. However, implementing Web APIs around Linked Data is often a tedious and repetitive process. Recent work speeds up the construction of Linked Data APIs by wrapping them around SPARQL queries, which carry out the API functionality under the hood. Inspired by this development, in this paper we present grlc, a lightweight server that takes SPARQL queries curated in GitHub repositories and translates them to Linked Data APIs on the fly.

Published in: Software

  1. GRLC MAKES GITHUB TASTE LIKE LINKED DATA APIS
     Chefs: Albert Meroño-Peñuela, Rinke Hoekstra
     Services and Applications over Linked APIs and Data (SALAD), ESWC, 29-05-2016
     Slide template tagline: "Het begint met een idee" (It begins with an idea)
  2. INSTITUTIONAL SLIDE (Vrije Universiteit Amsterdam)
     • VU University Amsterdam – Computer Science (Knowledge Representation & Reasoning group)
     • International Institute of Social History (IISG), Amsterdam
     • CLARIAH – National Infrastructure for Digital Humanities
       > DataLegend: Structured Data Hub
     • Previously incubated by CEDAR – Dutch historical censuses as 5-star LOD
  3. DISCLAIMER: Frustration-driven research
  4. Part 1: LD-CONSUMING APPLICATIONS
  5. THE CEDAR STORY
     • Publishing Dutch historical censuses as 5-star LD
       > Intensive use of RDF Data Cube
       > Harmonization rules
       > Provenance
     • 1st historical census data as Linked Data (1795-1971)
     • 8 million observations (sex, marital status, occupation position, housing type, residence status)
     • External links
       > Geographical: 2.7M
       > Occupations: 350K
       > Belief: 250K
     • High value for social historians
  6. CENSUS DATA QUERYING INTERFACES
     • Historians can’t really write SPARQL
     • Variety of access interfaces needed
  7. SCALING VARIETY
     • CLARIAH-WP4: Structured data hub for social historians
     • IPUMS, NAPP, CEDAR, etc.
       > Macro-, micro-, meso-data
       > Civil registries, occupation, religion, country-level economic indicators
       > National (Netherlands) and international
     • Mostly CSV tables turned into RDF Data Cube and CSVW
     • More than 1B triples already
     • Higher variety of humanities scholars → higher variety of data access requirements
     [Diagram: Structured Data Hub workflow. Recoverable labels: Frequency Table; Variables (existing, or not yet existing); Mappings; Publish; Augment; External (Meta)Data, including both external Linked Data and standard vocabularies, e.g. World Bank; Existing Variables & Codes; Provenance tracking; External Datasets]
  8. (no text transcribed for this slide)
  9. FRUSTRATION 1: This is SPARQL mess!!!1one
  10. (no text transcribed for this slide)
  11. GITHUB AS A HUB OF SPARQL QUERIES
      • One .rq file per SPARQL query
      • Good support of query curation processes
        > Versioning
        > Branching
        > Clone-pull-push
      • Web-friendly features!
        > One URI per query
        > Uniquely identifiable
        > De-referenceable (raw.githubusercontent.com)
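The "one URI per query" idea on slide 11 can be illustrated with a small Python sketch. It assumes the raw-content URL layout quoted on slide 20 (`https://raw.githubusercontent.com/:owner/:repo/master/file.rq`). The function names and the example file path are hypothetical and are not part of grlc.

```python
from urllib.parse import quote
from urllib.request import urlopen

RAW_HOST = "https://raw.githubusercontent.com"


def raw_query_url(owner: str, repo: str, path: str, branch: str = "master") -> str:
    """Build the de-referenceable URL of one .rq file in a GitHub repository.

    Assumption: the URL layout follows the pattern shown on slide 20.
    """
    # quote() percent-encodes unsafe characters but keeps "/" separators.
    return f"{RAW_HOST}/{quote(owner)}/{quote(repo)}/{quote(branch)}/{quote(path)}"


def fetch_query(url: str) -> str:
    """Download the SPARQL text behind a query URL (performs a network call)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # "queries/example.rq" is a made-up path used only for illustration.
    print(raw_query_url("CLARIAH", "grlc", "queries/example.rq"))
```

Because the URL is derived only from owner, repository, branch, and file path, any client that knows the repository layout can locate a query without a separate registry.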
  12. LESSON 1: Query centralization helps maintain distributed applications
  13. Part 2: THE NEED FOR APIS
  14. MEANWHILE IN THE SEMANTIC WEB…
      • Linked Data APIs emerge
      • RESTful entry point to Linked Data hubs for Web applications
      • OpenPHACTS
      • …but the Linked Data API (e.g. Swagger spec, code itself) still needs to be coded and maintained
  15. BASIL
      • Love story – thanks KMi!
      • Automatically builds Swagger specs and API code
      • Takes SPARQL queries as input (1 API operation = 1 SPARQL query)
        > API call functionality limited to SPARQL expressivity
      • Makes SPARQL queries uniquely referenceable by using their equivalent LDA operation
        > Stores SPARQL internally
        > But we already have uniquely referenceable SPARQL…
  16. FRUSTRATION 2: Copy-pasting 200 queries!!! & Organization problem
  17. Cousin of BASIL in a SALAD
      • Same basic principle: 1 SPARQL query = 1 API operation
      • Automatically builds Swagger spec and UI from SPARQL
      But:
      • External query management
      • Organization of SPARQL queries in the GitHub repo matches organization of the API
      • Thin layer – nothing stored server-side
      • Maps
        > GitHub API
        > Swagger spec
  18. MAPPING GITHUB AND SWAGGER
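The mapping diagram on slide 18 is not part of the transcript. As a rough illustration of the principle stated on slide 17 (1 SPARQL query = 1 API operation, repository organization = API organization), the sketch below turns a repository file listing into a minimal Swagger 2.0 skeleton. It assumes each listing entry carries `name` and `type` fields, as in GitHub's repository contents API, and that the operation name is the file name without `.rq`. Both function names are hypothetical, and this is not grlc's actual code.

```python
from pathlib import PurePosixPath


def operations_from_listing(listing):
    """Return one operation name per .rq file found in a repository listing."""
    operations = []
    for entry in listing:
        name = entry.get("name", "")
        # Only regular files ending in .rq count as SPARQL queries.
        if entry.get("type") == "file" and name.endswith(".rq"):
            operations.append(PurePosixPath(name).stem)
    return sorted(operations)


def swagger_skeleton(owner, repo, listing):
    """Build a minimal Swagger 2.0 document with one GET path per query file."""
    paths = {
        f"/{op}": {
            "get": {
                "summary": f"Runs the query stored in {op}.rq",
                "responses": {"200": {"description": "Query results"}},
            }
        }
        for op in operations_from_listing(listing)
    }
    return {
        "swagger": "2.0",
        "info": {"title": f"{owner}/{repo}", "version": "1.0"},
        "paths": paths,
    }
```

Nothing is stored in this sketch: the spec is recomputed from the listing each time, which matches the "thin layer" description on slide 17.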
  19. SPARQL DECORATOR SYNTAX
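The example shown on slide 19 did not survive in the transcript. The sketch below only illustrates the general idea of decorators, that is, API metadata carried in comments at the top of a query file. It assumes decorators are comment lines that start with `#+` and hold a single `key: value` pair. The keys in the usage example (`summary`, `endpoint`) and the function name are illustrative assumptions, and richer values such as lists are not handled.

```python
def split_decorators(query_text):
    """Separate '#+ key: value' decorator lines from the SPARQL body.

    Returns a (metadata dict, SPARQL string) pair. Assumption: one
    key/value pair per decorator line.
    """
    metadata = {}
    body_lines = []
    for line in query_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#+"):
            # Split on the first colon only, so URLs in values stay intact.
            key, separator, value = stripped[2:].partition(":")
            if separator:
                metadata[key.strip()] = value.strip()
        else:
            body_lines.append(line)
    return metadata, "\n".join(body_lines).strip()


if __name__ == "__main__":
    example = (
        "#+ summary: List a few triples\n"
        "#+ endpoint: http://example.org/sparql\n"
        "SELECT ?s WHERE { ?s ?p ?o } LIMIT 10\n"
    )
    print(split_decorators(example))
```

Keeping the metadata inside comments means the file remains a valid SPARQL query for clients that ignore the decorators.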
  20. THE GRLC SERVICE
      • Assuming your repo is at https://github.com/:owner/:repo and your grlc instance is at :host,
        > http://:host/:owner/:repo/spec returns the JSON Swagger spec
        > http://:host/:owner/:repo/api-docs returns the Swagger UI
        > http://:host/:owner/:repo/:operation?p_1=v_1...p_n=v_n calls the operation with the specified parameter values
        > Uses BASIL’s SPARQL variable name convention for query parameters
      • Sends requests to
        > https://api.github.com/repos/:owner/:repo to look for SPARQL queries and their decorators
        > https://raw.githubusercontent.com/:owner/:repo/master/file.rq to dereference queries, get the SPARQL, and parse it
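The three routes listed on slide 20 can be assembled on the client side as follows. This is a minimal sketch that assumes the URL patterns exactly as given on the slide. The host, operation name, and parameter name in the usage example are hypothetical placeholders.

```python
from urllib.parse import urlencode


def grlc_urls(host, owner, repo):
    """Return the spec and documentation URLs for a repository served by grlc."""
    base = f"http://{host}/{owner}/{repo}"
    return {"spec": f"{base}/spec", "api_docs": f"{base}/api-docs"}


def operation_url(host, owner, repo, operation, params=None):
    """Return the URL that calls one operation with optional query parameters."""
    url = f"http://{host}/{owner}/{repo}/{operation}"
    if params:
        # urlencode() escapes values and joins the pairs with "&".
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    # "localhost:8088", "houses" and "year" are made-up example values.
    print(grlc_urls("localhost:8088", "CLARIAH", "grlc"))
    print(operation_url("localhost:8088", "CLARIAH", "grlc", "houses", {"year": "1899"}))
```

A client written this way needs no SPARQL library: it issues plain HTTP requests and leaves query execution to the server.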
  21. SPICED-UP SWAGGER UI
  22. EVALUATION – USE CASES
      • CEDAR: Access to census data for historians
        > Hides SPARQL
        > Allows them to fill in query parameters through forms
        > Co-existence of SPARQL and non-SPARQL clients
      • CLARIAH – Born Under a Bad Sign: Do prenatal and early-life conditions have an impact on socioeconomic and health outcomes later in life? (uses 1891 Canada and Sweden Linked Census Data)
        > Reduction of coupling between SPARQL libs and R
        > Shorter R code – input stream as CSV
  23. CONCLUSIONS
      The spectrum of Linked Data clients: SPARQL-intensive applications vs RESTful API applications
      grlc uses decoupling of SPARQL from all client applications (including LDA) as a powerful practice
      • Separates query curation workflows from everything else
      • Allows at the same time
        > Web-friendly SPARQL queries
        > Web-friendly RESTful APIs
      • Helps you to easily organise your LDA – just organise your SPARQL repository and you’re set
      • Try it out!
        > http://grlc.clariah-sdh.eculture.labs.vu.nl
        > https://github.com/CLARIAH/grlc
  24. THANK YOU! @ALBERTMERONYO, DATALEGEND.NET, CLARIAH.NL
