Why I don't use Semantic Web technologies anymore, even if they still influence me
1. Why I don’t use Semantic Web technologies anymore,
even if they still influence me
12th December 2019
Linked Pasts, Bordeaux
Gautier Poupeau
gautier.poupeau@gmail.com
@lespetitescases
http://www.lespetitescases.net
10. Producer
User
The system strictly follows the principles of the OAIS model (Open Archival
Information System), including in its architecture.
SPAR Architecture
11. How to store and query metadata ?
A powerful query language, accessible to non-IT staff
Flexibility to describe all the data and to query them without any preconceived idea
A standard, independent of any software implementation
RDF model and SPARQL Query Language
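To make "querying without any preconceived idea" concrete, here is a minimal sketch of triple-pattern matching, the basic operation behind SPARQL. It is plain Python, not SPAR's actual implementation; the identifiers (`ex:item1`, `dc:format`, and so on) and the `match` helper are assumptions made for illustration.

```python
# Hypothetical metadata expressed as (subject, predicate, object) triples.
triples = [
    ("ex:item1", "dc:format", "image/tiff"),
    ("ex:item1", "dc:title", "Page 1"),
    ("ex:item2", "dc:format", "application/pdf"),
]


def match(triples, pattern):
    """Return the triples matching a pattern; None plays the role of a variable."""
    return [
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


# The same data answers questions nobody planned for when it was stored.
by_predicate = match(triples, (None, "dc:format", None))  # every format statement
by_subject = match(triples, ("ex:item1", None, None))     # everything about one item
print(by_predicate)
print(by_subject)
```

No schema has to be declared up front: a new predicate is just a new triple, and any position of the pattern can be left open.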
12. How is metadata handled within SPAR?
Step 1
Ingest of digital item
Update manager
Type detection of update
and automatic merge
Control and audit Enrichment
Customizable for the different types
of digital item
Vocabularies
Formats Agents
Service Level
Agreement
Result
A set of files compliant
with SLA
All metadata useful for
managing files over the long term
Step 2
Inventory
Storage and indexing of the digital item
Repository
14. Metadata repositories in SPAR
• All master data
• all metadata from METS
manifest
• Rules to store in Selective
repository
• All master data
• a selection of metadata from the
METS manifest
• All master data
Complete
repository
Selective
repository
Master data
repository
To fix performance issues, we had to adapt our architecture…
15. Outcome of this project
Performance issues
Flexibility
System still in place
BnF remains convinced
of this choice
17. What is Isidore ?
http://isidore.science
• Managed by TGIR Huma-NUM
• 6 445 data sources
• 6 million resources indexed in French,
English and Spanish
• Use of vocabularies
• Enrichment of resources : automatic
annotation, classification, attribution of
normalized identifiers
21. Make Isidore data available
Enrichment
by Isidore
Data publication
by Isidore
Retrieving by
producers
Processing
by
producers
Data
publication
by producers
Harvesting
by Isidore
to allow a positive feedback
22. Outcome of this project
Complexity issues
Knowledge issues
Appropriation by the
community
Project is an example
"We mostly get in touch with the researchers when things go wrong with the data. And it
often goes wrong for several reasons. But, indeed, there was the question of these standards
giving the researchers a hard time [...] they tell us: but why don’t you just use csv rather than
bother with your semantic web business? " Raphaëlle Lapotre, product manager data.bnf.fr
23. FROM MASHUPS TO LINKED
ENTERPRISE DATA
Breaking silos / linking and bringing consistency to
heterogeneous data
24. Data mashup
Tim Berners-Lee, James Hendler, Ora Lassila,
"The Semantic Web", Scientific American, 2001
"The real power of the Semantic
Web will be realized when
people create many programs
that collect Web content from
diverse sources, process the
information and exchange the
results with other programs"
26. Architecture of historical
monuments mashup
Main source
Additional sources
Geolocation web service
AIF
normalization and
enrichment
AFS
search engine
AFS
Historical Monuments
application
28. Architecture before LED project
SQL Server
DBMS
Structured Data
• Best sales
• Buzz
• Awards
• Reserved Titles
• Events
Professional Directory
• Publishers
• Distributors
• Managers
Quark XPRESS
CMS
File Maker
DBMS
Editorial content
• Articles
• Visuals
Livres Hebdo.fr Web site
Electre.com Web site
• Books
• Authors
• Publishers
• Articles (Reviews)
• Best Sales
• Media relays
• Events
• Articles (web)
• Blogs posts
• Visuals
• Documents
• Events
• Articles (Print)
• Authors
• Books
• Best sales
• Media relays
• Awards
• Reserved Titles
• Events
• Directory
Books
Awards
Articles (Reviews)
Best Sales
Media relays
29. Architecture with LED
SQL Server
DBMS
Structured Data
• Best sales
• Buzz
• Awards
• Reserved Titles
• Events
Professional Directory
• Publishers
• Distributors
• Managers
Quark XPRESS
CMS
File Maker
DBMS
Editorial content
• Articles
• Visuals
Livres Hebdo.fr Web site
Electre.com Web site
• Books
• Authors
• Publishers
• Articles (Reviews)
• Best Sales
• Media relays
• Events
• Articles (web)
• Blogs posts
• Visuals
• Documents
• Events
• Articles (Print)
• Authors
• Books
• Best sales
• Media relays
• Awards
• Reserved Titles
• Events
• Directory
Other internal sources
(works)
Other external sources
free or paid model
New services
New customers
RDF DW
Transform
Aggregate
Link
Annotate
30. Outcome of this project
Scalability issues
Complexity/update issues
Skills issues
Maintainability issues
Cost issues
All data are linked and
consistent
Flexibility to manipulate
RDF data
32. The flexibility of the graph model
Benefits and limits of Semantic Web technologies
RDF Graph = absolute freedom
compared with the rigidity of
relational databases
Linking of heterogeneous entities
easily
Graph can evolve over time and its
growth is potentially infinite
Maintainability issues
Model issues
33. The flexibility of the graph model
RDF vs property graph
RDF Property graph
The RDF model is based on triples:
subject-predicate-object
Property graphs are based on nodes, edges,
and properties attached to nodes or edges.
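The difference can be sketched with plain Python data structures. This is only an illustration under assumed names (`ex:bob`, `ex:since`, the `reify` helper are hypothetical); the `rdf:Statement`, `rdf:subject`, `rdf:predicate` and `rdf:object` terms come from the standard RDF reification vocabulary.

```python
# RDF: a flat set of (subject, predicate, object) triples.
triples = {
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:bob", "foaf:knows", "ex:alice"),
}

# Property graph: nodes and edges each carry their own key/value properties.
nodes = {
    "bob": {"name": "Bob"},
    "alice": {"name": "Alice"},
}
edges = [
    {"source": "bob", "target": "alice", "label": "knows", "since": 2015},
]


def reify(statement_id, triple):
    """Describe a triple as a resource, using standard RDF reification."""
    subject, predicate, obj = triple
    return {
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    }


# A plain triple has no slot for properties, so saying "since 2015" about
# the "knows" relation itself requires extra triples.
reified = reify("ex:stmt1", ("ex:bob", "foaf:knows", "ex:alice"))
reified.add(("ex:stmt1", "ex:since", 2015))

print(len(edges), "edge in the property graph")
print(len(reified), "triples to say the same thing in plain RDF")
```

The property graph stores the date directly on the edge, while plain RDF needs five triples to attach it to the statement.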
34. The flexibility of the graph model
Beyond the limits
Reconciliation between
RDF and property graph ?
Example of RDF*
<<:bob foaf:age 23>> ex:certainty 0.9 .
Example of SPARQL*
SELECT ?p ?a ?c WHERE {
<<?p foaf:age ?a>> ex:certainty ?c .
}
RDF* / SPARQL*
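As a rough mental model of the two examples above, a quoted triple can be thought of as a triple sitting in the subject position of another triple. The sketch below mimics that with nested Python tuples; it is not an RDF* engine, and the `ages_with_certainty` function is a hypothetical stand-in for the SPARQL* query.

```python
# The quoted triple <<:bob foaf:age 23>> modelled as a tuple.
quoted = (":bob", "foaf:age", 23)

data = [
    (quoted, "ex:certainty", 0.9),   # a statement about the statement
    (":bob", "foaf:name", "Bob"),    # an ordinary triple
]


def ages_with_certainty(triples):
    """Mimic: SELECT ?p ?a ?c WHERE { <<?p foaf:age ?a>> ex:certainty ?c . }"""
    rows = []
    for subject, predicate, obj in triples:
        # Only triples whose subject is itself a (quoted) triple can match.
        if predicate != "ex:certainty" or not isinstance(subject, tuple):
            continue
        person, inner_predicate, age = subject
        if inner_predicate == "foaf:age":
            rows.append((person, age, obj))
    return rows


result = ages_with_certainty(data)
print(result)  # [(':bob', 23, 0.9)]
```

Seen this way, the annotation behaves much like an edge property in a property graph, which is why RDF* is presented as a possible bridge between the two models.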
Do you really need the RDF model to store data?
35. Data dissemination / Interoperability / Decentralisation
Contributions and limits of semantic Web technologies
Best solution to achieve
interoperability of data
Linking heterogeneous data
Create bridges between worlds
impossible to reconcile
SPARQL as powerful tool for
querying data
Asynchronous data retrieval
Maintainability costs
Knowledge issues
Full text search not possible
Structural interoperability
impossible data mappings
36. Data dissemination / Interoperability / Decentralisation
Overcoming the limits
Easy-to-use ontologies
Simple CSV
or JSON/XML dumps
Simple API
What are the possible
uses? Who are the users?
Do we need this level of interoperability?
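A simple CSV dump can be derived from graph data with very little code. The sketch below assumes hypothetical identifiers and a hypothetical `triples_to_csv` helper, and it deliberately keeps only one value per predicate, which is exactly the kind of simplification a dissemination format accepts.

```python
import csv
import io

# Hypothetical triples standing in for data held in a graph store.
triples = [
    ("ex:doc1", "dc:title", "Linked Pasts"),
    ("ex:doc1", "dc:date", "2019"),
    ("ex:doc2", "dc:title", "Isidore"),
]


def triples_to_csv(triples, columns):
    """Pivot triples into one row per subject and one column per predicate."""
    rows = {}
    for subject, predicate, obj in triples:
        # If a predicate repeats for a subject, the last value wins.
        rows.setdefault(subject, {"id": subject})[predicate] = obj
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["id", *columns],
        restval="",             # empty cell when a subject lacks a predicate
        extrasaction="ignore",  # drop predicates that are not exported
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows.values())
    return buffer.getvalue()


dump = triples_to_csv(triples, ["dc:title", "dc:date"])
print(dump)
```

The result opens in any spreadsheet, at the price of losing links and multi-valued properties.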
38. Functionally separate data from their use
• To rethink data models in relation to their
own logic and not their use
• To acknowledge that some data models are
dedicated to production and storage while
several other models are designed
specifically for data dissemination
39. Technically separate data from their use
• The information system is
organized in layers and
no longer in silos
• The storage and processing
of data are separated
from business
applications
40. An infrastructure to store and process data
4 types of database system to
store all types of data and to
address all types of usage
A process module to interact with
the data and synchronize data
between the different databases
A management module to
abstract the technical
infrastructure and expose logical
data to business applications
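The division of roles between the three components can be sketched as follows. Everything here is an assumption made for illustration: the slide does not name the database systems or any API, so the `Store`, `ProcessModule` and `ManagementModule` classes and the dataset names are hypothetical, and in-memory dictionaries stand in for real databases.

```python
class Store:
    """Stand-in for one database system; a real one would wrap a DB client."""

    def __init__(self, name):
        self.name = name
        self.records = {}

    def put(self, key, record):
        self.records[key] = dict(record)

    def get(self, key):
        return self.records.get(key)


class ProcessModule:
    """Writes to a primary store, then synchronizes the other stores."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas

    def save(self, key, record):
        self.primary.put(key, record)
        for store in self.replicas:
            store.put(key, record)


class ManagementModule:
    """Exposes logical datasets so that applications never name a store."""

    def __init__(self, routes):
        self.routes = routes  # logical dataset name -> Store

    def read(self, dataset, key):
        return self.routes[dataset].get(key)


primary = Store("primary")
search = Store("search")
process = ProcessModule(primary, [search])
management = ManagementModule({"books": primary, "book_search": search})

process.save("b1", {"title": "Example"})
print(management.read("book_search", "b1"))
```

A business application only knows the logical names `books` and `book_search`; which database answers, and how the copies are kept in sync, can change without touching the application.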
41. Thank you for your attention !
Do you have any questions?
And sorry for this…
I would like to thank very much Emmanuelle Bermès (@figoblog) for the translation of
this keynote !