
Building materialised views for linked data systems using microservices

In the BBC’s Content Distribution Services division, we build and maintain systems that provide content metadata to a wide range of audience-facing products.

Our current architecture for distributing tagging metadata consists mainly of two RDF-based read and write APIs feeding off a central triplestore. This single storage setup for all operations imposes restrictions on performance and scalability.

I will talk about our work creating an event-driven distribution pipeline that generates materialised views of tagging metadata.

The new microservices architecture comprises small, single-purpose services, lambda functions, event stores, queues and streams. The views are built on data stores optimised to serve specific query profiles, improving the overall performance and scalability of the system.

Published in: Technology


  1. Building materialised views of linked data systems using microservices Augustine Kwanashie
  2. Outline o  Introduction o  Current architecture and challenges o  Building materialised views o  Other things to consider
  3. Publish and distribute content metadata [diagram: Publish → Distribute]
  4. Metadata on Tagging [diagram: an article (“I’ll try to create like Beckham”, urn:cps:1289394, 01-07-2018:19:01:01) linked “about” to tags such as “English Football Team” and “Kieran Trippier”; each tag has a label and a locator, e.g. http://wikidata.org/123]
  5. Top 10 Articles About “English Football Team” Ordered by date published
  6. Simplified Architecture [diagram: Editorial Systems → Write API → Triplestore → Read API → Distribution Systems]
  7. SPARQL endpoints Performance and Data Integrity Flexibility and Pace of Innovation Custom APIs
  8. Projected performance by 2019 [chart: 60% increase in 99 percentile response time at 100% data volume]
  9. So what do we know about the API requests?
  10. We can group API requests by their query profiles
  11. Query by identifier CONSTRUCT { . . . } WHERE { <urn:01> a core:Article . . . . }
  12. Query with filters CONSTRUCT { . . . } WHERE { ?id property1 <urn:01> . ?id property2 "value2" . . . . }
  13. Multi-hop query CONSTRUCT { . . . } WHERE { ?id1 <urn:property1> ?id2 . ?id2 <urn:property2> ?id3 . ?id3 <urn:property3> "value3" . }
  14. We can group API requests by their volume and performance requirements
  15. Low volume and performance requirements: more complex queries. High volume and performance requirements: mostly simple queries.
  16. Build views that map closely to query profiles
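The idea on this slide can be sketched in a few lines: instead of answering the query-by-identifier profile with a SPARQL CONSTRUCT against the triplestore, the publish pipeline denormalises each entity into a document keyed by its ID, so a read becomes a single key lookup. This is a minimal in-memory sketch; the class, field and tag names are illustrative, not the BBC's actual schema.

```python
# In-memory sketch of a view optimised for the query-by-identifier
# profile: the publish pipeline folds an entity's triples into one
# document keyed by its ID, so a read is a single lookup rather than
# a SPARQL query.

class QueryByIdView:
    def __init__(self):
        self._docs = {}  # stands in for a key-value store

    def ingest(self, entity_id, triples):
        """Denormalise (subject, predicate, object) triples into a document."""
        doc = {"@id": entity_id}
        for _subject, predicate, obj in triples:
            doc.setdefault(predicate, []).append(obj)
        self._docs[entity_id] = doc

    def get(self, entity_id):
        """Serve the query-by-ID profile with one lookup."""
        return self._docs.get(entity_id)

view = QueryByIdView()
view.ingest("urn:cps:1289394", [
    ("urn:cps:1289394", "label", "I'll try to create like Beckham"),
    ("urn:cps:1289394", "about", "urn:tag:eft"),  # illustrative tag ID
])
```

The trade-off is the usual one for materialised views: writes do more work up front so that the high-volume, simple-query profile is cheap to serve.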
  17. Target architecture [diagram: Write API → Event Store → multiple Publish APIs building views (Query by ID, Multi-hop, …) → Read API → Distribute]
  18. The publish pipeline [diagram: Write API / Read API → Data Input Queue → Ingest λ → View DB → View API; example queue message: ID: 838394, Operation: Create, Timestamp: 1540906781999]
  19. Send to DLQ if errors persist [diagram: Write API / Read API → Input Queue → Ingest λ → View DB; failing messages go to a Dead Letter Queue]
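A minimal sketch of the retry/dead-letter behaviour described on this slide: the ingest step retries a message a few times and, if errors persist, parks it on a dead-letter queue instead of blocking the input queue. The retry limit and the plain-list DLQ are assumptions; in a real AWS pipeline this would typically be the queue's redrive policy.

```python
# Sketch of the ingest step's error handling: retry a few times, and
# if errors persist, send the message to the dead-letter queue for
# later inspection or replay. MAX_ATTEMPTS is an assumed limit.

MAX_ATTEMPTS = 3

def process_with_dlq(message, ingest, dead_letter_queue):
    """Return True if ingest succeeded, False if sent to the DLQ."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            ingest(message)
            return True
        except Exception:
            if attempt == MAX_ATTEMPTS:
                # errors persisted: keep the message rather than lose it
                dead_letter_queue.append(message)
                return False
```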
  20. Notify clients of a new ingest [diagram: Ingest λ → View DB → View API; an SNS Notifier λ announces the new ingest]
  21. Verify ingest is successful [diagram: Ingest λ → View DB → View API; a Verifier λ checks the Read API and sends failures to the Dead Letter Queue]
  22. The distribution pipeline [diagram: a Router in front of the Read APIs: the Triplestore-backed Read API, View API + View DB 1, View API + View DB 2]
  23. Route traffic based on profile and format If request matches { format: "ld+json" query: "?id=<GUID>" } Then route to View 1
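The matching rule on this slide can be sketched as a small routing table: a request whose format and query shape match a registered view profile is routed to that view, and everything else falls back to the triplestore-backed Read API. The table entries and the pattern below are assumptions for illustration.

```python
# Sketch of the router: match (format, query shape) against known
# view profiles; unmatched requests fall back to the triplestore.
import re

ROUTES = [
    # (format, query-string pattern, target back end) - illustrative
    ("ld+json", re.compile(r"^\?id=[\w:-]+$"), "view-1"),
]
FALLBACK = "triplestore"

def route(request_format, query_string):
    for fmt, pattern, target in ROUTES:
        if request_format == fmt and pattern.match(query_string):
            return target
    return FALLBACK
```

Because the rule is data, the traffic split and failover behaviour on the following slides amount to editing this table rather than redeploying the APIs.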
  24. Failover to the Triplestore [diagram: Router falls back from View API + View DB 1 to the Triplestore-backed Read API]
  25. Split traffic between Views [diagram: Router splits traffic between back ends, 60% of traffic / 40% of traffic]
  26. What about JOINS?
  27. { "@id": "urn:article:01", "about": [ "urn:tag:01", "urn:tag:02", … ] } { "@id": "urn:tag:01", "label": "Nigeria", "@type": "Place" }
  28. { "@id": "urn:article:01", "about": [ { "@id": "urn:tag:01", "label": "Nigeria", "@type": "Place" }, … ] }
  29. Previously… [diagram: Write APIs PUT <urn:article:01> and PUT <urn:tag:01> into the Triplestore; Read APIs serve the combined data]
  30. Join on Writes [diagram: the Publish API receives PUT <urn:article:01> and PUT <urn:tag:01> and builds a custom view for the combined data; the Read API distributes it]
  31. Join on Reads [diagram: separate Publish APIs handle PUT <urn:article:01> and PUT <urn:tag:01>; the Read API combines the data at distribution time]
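A sketch of the join-on-reads option, assuming two simple document stores: the article view holds only tag identifiers, and the Read API resolves and embeds each tag per request, producing the combined document shown on slide 28. The store contents are illustrative (urn:tag:02's label is invented here).

```python
# Sketch of join-on-reads: the article view stores only tag IDs; the
# Read API resolves each ID against the tag view at request time, so
# tag updates show up immediately in the combined document.

articles = {
    "urn:article:01": {"@id": "urn:article:01",
                       "about": ["urn:tag:01", "urn:tag:02"]},
}
tags = {
    "urn:tag:01": {"@id": "urn:tag:01", "label": "Nigeria", "@type": "Place"},
    "urn:tag:02": {"@id": "urn:tag:02", "label": "Lagos", "@type": "Place"},
}

def read_article(article_id):
    doc = dict(articles[article_id])  # shallow copy of the stored view
    doc["about"] = [tags[t] for t in doc["about"]]  # the join, per read
    return doc
```

Join-on-writes would instead perform this embedding in the publish pipeline, trading read-time work for the cost of re-publishing articles whenever a tag changes.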
  32. Other things to consider
  33. Tracking Ontology Changes biz:Company rdf:type owl:Class ; rdfs:comment "A company featured in BBC news"^^xsd:string ; rdfs:isDefinedBy <http://www.bbc.co.uk/ontologies/…> ; rdfs:label "A company featured in BBC news"^^xsd:string ; rdfs:subClassOf core:Organisation .
  34. Tracking Ontology Changes <http://www.bbc.co.uk/things/01#id> a biz:Company ; core:label "Amazon Inc." . <http://www.bbc.co.uk/things/01#id> a core:Organisation . Generated implicit triples
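The implicit triple on this slide follows from the ontology's rdfs:subClassOf statement on slide 33: anything typed biz:Company is also a core:Organisation. A minimal sketch of that inference step (`a` is the usual Turtle shorthand for rdf:type; the subclass table here is reduced to the one axiom shown):

```python
# Minimal subclass-closure sketch: for each explicit rdf:type triple,
# emit implicit type triples by walking up the rdfs:subClassOf chain.
# Only the axiom from slide 33 is included; real ontologies need the
# full hierarchy.

SUBCLASS_OF = {"biz:Company": "core:Organisation"}  # child -> parent

def implicit_type_triples(explicit_triples):
    inferred = []
    for subject, predicate, obj in explicit_triples:
        if predicate == "a":  # Turtle shorthand for rdf:type
            parent = SUBCLASS_OF.get(obj)
            while parent:  # follow the chain for deeper hierarchies
                inferred.append((subject, "a", parent))
                parent = SUBCLASS_OF.get(parent)
    return inferred
```

This is why ontology changes have to be tracked: editing a subClassOf axiom changes which implicit triples every materialised view should contain.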
  35. Single source of truth [diagram: an Ingest Script queries all IDs from the Triplestore and emits messages to the Publish API, e.g. ID: 838394, Operation: Create, Timestamp: 1540906781999]
  36. Summary: Using multiple data sources that match specific query types is feasible and beneficial
  37. Thank You
