Presentation at DeSemWeb@ISWC2017
Original slides: http://goo.gl/T9AxPd
Workshop: http://iswc2017.desemweb.org/
Paper: https://openreview.net/forum?id=ByFHXFy8W&noteId=ByFHXFy8W
1. Semantic Web in the Fog of Browsers
Pascal Molli & Hala Skaf-Molli
University of Nantes - LS2N - GDD Team
22 October 2017 - DeSemWeb@ISWC2017
2. Decentralized Semantic Web Applications (D-SWAP)?
● Decentralized on which infrastructure?
● What do you expect?
● How to program it?
● Research challenges?
3. Infrastructure ? Fog of Browsers (FoB)
● FoB: web browsers interconnected by a WebRTC overlay network [1], collaborating with cloud services...
○ Ex: Skype-for-web, Peer5 (P2P CDN), CRATE (real-time editor) [2]
● Why?
○ Installation-free!
○ Can be deployed now on billions of devices!
○ Can interact with end users and their data in browsers.
[1] Nédelec, B., Molli, P., et al. J. World Wide Web (2017). https://doi.org/10.1007/s11280-017-0478-5
[2] Nédelec B, Molli P, Mostéfaoui A. A scalable sequence encoding for collaborative editing, J. Concurrency
Computat: Pract Exper. 2017;e4108. https://doi.org/10.1002/cpe.4108
4. Why Write Decentralized Semantic Web Applications?
● Significantly reduce the cost of deploying large-scale semantic web applications
○ Users' devices run all or a significant part of the application.
● Extend the web to access data stored in browsers:
○ Personal data, web of things data.
● Improve performance, availability and scalability on the client side
5. How to Write and Deploy D-SWAP?
● Simple use-case:
○ People visiting a city have access to points of interest around them, can rate these points, and can list the top-ranked points of interest…
● Straightforward to write this web application in the cloud, but it can be costly to host...
● What can be decentralized? For what benefit?
7. Should be nearly the same in the FoB...
● Who are your neighbors?
● Public data? Do it alone or with my neighbors?
● Update! Put that on the cloud? On the FoB? Or just local?
● Query in the FoB between public data and application data. Best way to query it?
8. The D-SWAP creates its network...
● Static code of one semantic fog application is here
● One semantic fog application deployed and running
● Data here
● And data there
10. Collaborative Caching for Queries
● Collaborative caching (1)
○ Neighborhoods based on query similarity
[Figure: clients c1 to c9, an HTTP cache, and an LDF server (DrugBank, DBpedia)]
1. P. Folz, H. Skaf-Molli and P. Molli (2016). CyClaDEs: A Decentralized Cache for Triple Pattern Fragments. 13th Extended Semantic Web Conference, ESWC 2016.
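A minimal sketch of how neighborhoods based on query similarity might be computed. It assumes each client is profiled by the set of predicates in its recent triple-pattern queries and that profiles are compared with Jaccard similarity. The profile contents, the `top_k_neighbors` helper, and the choice of metric are illustrative assumptions, not the CyClaDEs implementation.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two predicate sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k_neighbors(me: str, profiles: Dict[str, Set[str]], k: int) -> List[str]:
    """Return the k clients whose query profiles are most similar to `me`."""
    others = [c for c in profiles if c != me]
    # Sort by descending similarity; break ties by client name for determinism.
    others.sort(key=lambda c: (-jaccard(profiles[me], profiles[c]), c))
    return others[:k]


# Hypothetical profiles: predicates seen in each client's recent queries.
profiles = {
    "c1": {"dbo:birthPlace", "dbo:author"},
    "c2": {"dbo:birthPlace", "dbo:author", "rdfs:label"},
    "c3": {"drugbank:target"},
    "c4": {"dbo:author"},
}
neighbors_of_c1 = top_k_neighbors("c1", profiles, k=2)
```

In this toy setup, c1 ends up next to the clients that query the same predicates, which is what makes a neighbor's cache likely to hold useful fragments.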
11. Collaborative Caching for Queries
[Figure: clients c1 to c9, an HTTP cache, and an LDF server (DrugBank, DBpedia), annotated with lookup steps 1 to 4]
During query execution, each call is looked up in order:
1) Local cache
2) Neighborhood cache
3) HTTP cache
4) LDF server
Reduces the load on the server and improves data availability.
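The lookup order above can be sketched as a chain of tiers tried one after another. This is only an illustration: the `resolve` function, the tier names, and the cache contents are hypothetical, and a real client would also populate the faster tiers after a miss.

```python
from typing import Callable, Dict, List, Optional, Tuple

Lookup = Callable[[str], Optional[str]]


def make_cache(store: Dict[str, str]) -> Lookup:
    """Wrap a dict as a lookup function that returns None on a miss."""
    return store.get


def resolve(fragment: str, tiers: List[Tuple[str, Lookup]]) -> Tuple[str, str]:
    """Try each tier in order and return (tier_name, result) for the first hit."""
    for name, lookup in tiers:
        result = lookup(fragment)
        if result is not None:
            return name, result
    raise LookupError(f"no tier could answer {fragment}")


# Hypothetical contents; the server tier always answers.
local = make_cache({"?s dbo:author ?o": "page-A"})
neighborhood = make_cache({"?s dbo:birthPlace ?o": "page-B"})
http_cache = make_cache({})
ldf_server: Lookup = lambda fragment: f"fresh:{fragment}"

tiers = [
    ("local", local),
    ("neighborhood", neighborhood),
    ("http", http_cache),
    ("server", ldf_server),
]
```

Only calls that miss in all three caches reach the server tier, which is where the reduction in server load would come from.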
12. ~20% neighbor cache hit rate, regardless of the number of clients (BSBM, 1 million triples)
[Figure: two panels, "Without collaborative cache" and "With collaborative cache"]
13. Queries Load Balancing in FoB
A client C1 has a workload W1 of queries:
● Is ET(W1@C1) > ET(W1@C1..Cn)?
● Does balancing the workload among clients improve performance?
[Figure: clients C1 to C5 querying DrugBank; C1 holds workload W1(Q1,Q2,Q3)]
A. Grall, P. Folz, G. Montoya, H. Skaf-Molli, P. Molli, M. V. Sande, and R. Verborgh. Ladda: SPARQL
Queries in the Fog of Browsers. Demo at 14th Extended Semantic Web Conference, 2017. Springer
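A rough sketch of the comparison asked on this slide, reading ET as execution time (an assumption). It compares running W1 on one client with a greedy longest-first distribution over several clients. The per-query costs are made up, delegation and network overhead are ignored, and this is not the Ladda algorithm.

```python
import heapq
from typing import Dict, List, Tuple


def et_single(costs: Dict[str, float]) -> float:
    """ET(W1@C1): one client runs every query sequentially."""
    return sum(costs.values())


def et_balanced(costs: Dict[str, float], n_clients: int) -> Tuple[float, List[List[str]]]:
    """ET(W1@C1..Cn) under greedy longest-first assignment, ignoring network delays."""
    heap = [(0.0, i) for i in range(n_clients)]  # (current load, client index)
    plan: List[List[str]] = [[] for _ in range(n_clients)]
    for query, cost in sorted(costs.items(), key=lambda kv: -kv[1]):
        load, i = heapq.heappop(heap)  # least-loaded client so far
        plan[i].append(query)
        heapq.heappush(heap, (load + cost, i))
    # The workload finishes when the most loaded client finishes.
    return max(load for load, _ in heap), plan


# Hypothetical per-query execution times in seconds.
w1 = {"Q1": 4.0, "Q2": 3.0, "Q3": 2.0}
alone = et_single(w1)
shared, plan = et_balanced(w1, n_clients=2)
```

With these made-up costs, two clients finish in 5 s instead of 9 s. Whether that gain survives in a real FoB depends on the cost of delegating queries and returning results.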
17. Why is it Hard?
● Distributed programming!
○ Fault tolerance, consistency, availability, scalability, security. How to make it simple for developers?
● Distributed programming with web browsers
○ WebRTC: no routing, no addresses; resource limitations; web standards and browser APIs, JavaScript, privacy!
● Which services in the FoB, which services in the cloud?
○ Collaboration between FoB and cloud (query, cache, load-
18. Customized Overlay Networks on FoB
● What is the best topology for a decentralized semantic web application?
○ Social, unstructured, DHT, PHT, several overlays? Which similarity metrics?
○ Depends on the application and queries…
● Developers need a way to declare it.
19. Client-Side Data...
● Data placement, caching, replication, materialization on the client side…
● ...but what? Where? When?
○ Depends on the chosen topology, the queries, and the behavior of participants at run-time...
○ Adaptive strategies are required, based on the application and its monitoring
20. Decentralized Query Engines
● How to decentralize and optimize query processing?
○ Decomposition and subquery allocation within the fog of browsers
○ Depends on topology and data placement...
○ Should adapt to runtime conditions...
21. Crowdsourcing with a Fog of Browsers
● D-SWAP applications are in touch with end users.
○ Users can bring a lot of data
○ But data quality and certainty are problems
● How to decentralize the handling of quality issues?
○ Decentralized curation, aggregation, data collection? Decentralized inference?
22. Security
● Malicious applications
○ DDoS attacks, attacks on personal information
● Malicious users of one application
○ Attack the application and semantic data…
○ Transform loaded JavaScript code...
● How can the browser security model be adapted?
23. Incentives
● Why should people participate?
○ No choice -> the application is built like that ;)
○ Mutual profit -> queries run faster, better resilience...
○ Privacy protection (no mediation)
○ Pay them -> integrate with distributed ledgers (running in browsers, such as Nimiq), but proof of
24. Conclusions
● Fog of browsers: one way to write decentralized semantic web applications.
● Reduce cost, improve performance, gather users, access data in browsers.
● Research challenges: many trade-offs between FoB and cloud services: topologies, data placement, decentralized query processing, decentralized crowdsourcing, security, incentives.
26.
Data consumer | Data providers | Scope | + | -
Link Traversal (⋈,σ,π) | URI -> RDF | whole web, reachable web | freshness, available | performance
Local SPARQL server (⋈,σ,π) | DUMP | dump imported in local server | available, performance | freshness, web
/ | 1 SPARQL Endpoint (⋈,σ,π) | Data in the server | freshness | availability, performance
Federated query engine (⋈,σ,π) | SPARQL Endpoint (⋈,σ,π) | Data in servers | freshness | availability, performance
Smart client (⋈,π) | Light Server (σ) | Data in servers | freshness, available, server-side scalable | performance
Semantic FoB applications (⋈,π) | Light Server (σ) | Data in servers + data in smart clients | freshness, available, performance, client-side scalable | security, incentives
Semantic FoB applications (⋈,π,σ) | URI -> RDF | whole web + data in smart clients | freshness, available, performance | security, incentives
Editor's Notes
Two overlay networks of browsers are used to build neighborhoods and handle dynamicity.
First, we wanted to see the impact of the number of clients on the hit rate.
On the left is the experiment without CyClaDEs; on the right, the experiment with CyClaDEs.
The x-axis is the number of clients; the y-axis is the percentage of calls.
Green represents calls answered locally, orange calls answered in the neighborhood, and yellow calls answered by the server.
Without CyClaDEs, the number of calls answered locally is not affected by the number of clients.
With CyClaDEs, almost half of the calls previously handled by the server are now handled by the neighborhood.