Why I don't use Semantic Web technologies anymore, even if they still influence me?

Slides for the keynote at the Linked Pasts conference in Bordeaux, December 12th, 2019

Published in: Data & Analytics

  1. 1. Why I don't use Semantic Web technologies anymore, even if they still influence me? 12th December 2019, Linked Pasts, Bordeaux. Gautier Poupeau, gautier.poupeau@gmail.com, @lespetitescases, http://www.lespetitescases.net
  2. 2. Plan: a quick history of the (Semantic) Web; feedback; conclusions and perspectives
  3. 3. A QUICK HISTORY OF THE (SEMANTIC) WEB
  4. 4. Initial purpose of the Web
  5. 5. Web of documents. Principle: hypertext. Document encoding language: HTML. Communication protocol: HTTP. Identification mechanism: URL.
  6. 6. Success factors of the Web of documents: Web standards are open and free; Web standards are robust; Web standards are easy to implement.
  7. 7. Different names, same technologies: 1994-2004, Semantic Web era; 2006-2014, Linked Open Data era; 2014-????, Knowledge graph era.
  8. 8. SEMANTIC WEB TECHNOLOGIES: FEEDBACK
  9. 9. SPAR PROJECT (BnF): flexibility and linking of heterogeneous data
  10. 10. SPAR Architecture (diagram; labels: Producer, User). The system strictly follows the principles of the OAIS model (Open Archival Information System), including in its architecture.
  11. 11. How to store and query metadata? A powerful query language, accessible to non-IT staff; flexibility to describe all the data and to query them without any preconceived idea; a standard, independent of any software implementation. RDF model and SPARQL query language.
  12. 12. How is metadata handled within SPAR? Step 1: ingest of digital item. Update manager: type detection of update and automatic merge; control and audit; enrichment. Customizable for the different types of digital item: vocabularies; formats; agents; Service Level Agreement. Result: a set of files compliant with the SLA; all metadata useful to manage files for the long term. Step 2: inventory. Storage and indexing of digital item. Repository.
  13. 13. SPAR Macro Model (diagram; flattened labels regrouped by namespace)
      Classes, properties and datatypes shown:
      sparstructure: group, set, object, file, fileGroup, structuralMap
      sparprovenance: event, hasEvent, hasAuthorizer, hasImplementer, hasIssuer, hasPerformer, hasProduct, eventDetail, eventOutcome, eventOutcomeDetailNote, outcomeInformation
      sparcontext: channel, isMemberOf, hasLastVersion, hasLastRelease
      sparagent: agent, outcome, hasOutcome, hasOutcomeProcessing, hasOutcomeFormat, contains, entryPoint
      sparrepresentation: format, property, hasProperty, propertyXpath, hasMimetype, characterizationFormat
      oai-ore: isAggregatedBy, aggregates
      dc: title, description, date, format, source, publisher
      doap: category, Version, release
      foaf: name
      rdfs: label
      rdf: value
      owl: Thing
      xsd: string, dateTime
      List of namespaces used:
      PREFIX oai-ore: <http://www.openarchives.org/ore/terms/>
      PREFIX dc: <http://purl.org/dc/elements/1.1/>
      PREFIX doap: <http://usefulinc.com/ns/doap#>
      PREFIX sparstructure: <info:bnf/spar/structure#>
      PREFIX sparprovenance: <info:bnf/spar/provenance#>
      PREFIX sparrepresentation: <info:bnf/spar/representation#>
      PREFIX sparcontext: <info:bnf/spar/context#>
      PREFIX sparagent: <info:bnf/spar/agent#>
  14. 14. Metadata repositories in SPAR. Complete repository: all master data; all metadata from METS manifest; rules to store in Selective repository. Selective repository: all master data; a choice of metadata from METS manifest. Master data repository: all master data. To fix performance issues, we had to adapt our architecture…
  15. 15. Outcome of this project: performance issues; flexibility; system still in place; BnF remains convinced of this choice.
  16. 16. ISIDORE PROJECT: data retrieval and dissemination
  17. 17. What is Isidore? http://isidore.science • Managed by TGIR Huma-NUM • 6,445 data sources • 6 million resources indexed in French, English, Spanish • Use of vocabularies • Enrichment of resources: automatic annotation, classification, attribution of normalized identifiers
  18. 18. Isidore macro architecture
  19. 19. Data dissemination with RDFa http://blog.stephanepouyllau.org/624 VS
  20. 20. Linked vocabularies in RDF. ISIDORE vocabularies: Disciplines HAL-SHS; Authors HAL-SHS; Organisations HAL-SHS; Categories Calenda; Pactols; Geonames; Rameau; Lexvo; Thésaurus W SIAF.
  21. 21. Make Isidore data available to allow positive feedback: enrichment by Isidore; data publication by Isidore; retrieval by producers; processing by producers; data publication by producers; harvesting by Isidore.
  22. 22. Outcome of this project: complexity issues; knowledge issues; appropriation by the community; project is an example. "We mostly get in touch with the researchers when things go wrong with the data. And it often goes wrong for several reasons. But, indeed, there was the question of these standards giving the researchers a hard time [...] they tell us: but why don't you just use CSV rather than bother with your semantic web business?" Raphaëlle Lapotre, product manager, data.bnf.fr
  23. 23. FROM MASHUPS TO LINKED ENTERPRISE DATA: breaking silos, linking and bringing consistency to heterogeneous data
  24. 24. Data mashup. « The real power of the Semantic Web will be realized when people create many programs that collect Web content from diverse sources, process the information and exchange the results with other programs » Tim Berners-Lee, Ora Lassila, James Hendler, « The Semantic Web », Scientific American, 2001
  25. 25. Data model for the historical monuments mashup
  26. 26. Architecture of the historical monuments mashup: main source; complementary sources; geolocation Web service; AIF (normalization and enrichment); AFS (search engine); AFS Historical Monuments application
  27. 27. Linked Enterprise Data: a data mashup of « legacy » information systems, to separate data from use
  28. 28. Architecture before the LED project SQL Server DBMS Structured Data • Best sales • Buzz • Awards • Reserved Titles • Events Professional Directory • Publishers • Distributors • Managers Quark XPRESS CMS File Maker DBMS Editorial content • Articles • Visuals Livres Hebdo.fr Web site Electre.com Web site • Books • Authors • Publishers • Articles (Reviews) • Best Sales • Media relays • Events • Articles (web) • Blog posts • Visuals • Documents • Events • Articles (Print) • Authors • Books • Best sales • Media relays • Awards • Reserved Titles • Events • Directory Books Awards Articles (Reviews) Best Sales Media relays
  29. 29. Architecture with LED SQL Server DBMS Structured Data • Best sales • Buzz • Awards • Reserved Titles • Events Professional Directory • Publishers • Distributors • Managers Quark XPRESS CMS File Maker DBMS Editorial content • Articles • Visuals Livres Hebdo.fr Web site Electre.com Web site • Books • Authors • Publishers • Articles (Reviews) • Best Sales • Media relays • Events • Articles (web) • Blog posts • Visuals • Documents • Events • Articles (Print) • Authors • Books • Best sales • Media relays • Awards • Reserved Titles • Events • Directory • Other internal sources (works) • Other external sources free or paid model • New services • New customers RDF DW • Transform • Aggregate • Link • Annotate
  30. 30. Outcome of this project: scalability issues; complexity/update issues; skills issues; maintainability issues; cost issues; all data are linked and consistent; flexibility to manipulate RDF data.
  31. 31. CONCLUSIONS AND PERSPECTIVES
  32. 32. The flexibility of the graph model: benefits and limits of Semantic Web technologies. RDF graph = absolute freedom compared with the rigidity of relational databases; easy linking of heterogeneous entities; the graph can evolve over time and its growth is potentially infinite; maintainability issues; model issues.
  33. 33. The flexibility of the graph model: RDF vs property graph. RDF: the RDF model is based on triples (subject-predicate-object). Property graph: property graphs are based on nodes, edges, and properties of nodes or edges.
  34. 34. The flexibility of the graph model: beyond the limits. Reconciliation between RDF and property graph? RDF* / SPARQL*. Example of RDF*: <<:bob foaf:age 23>> ex:certainty 0.9 . Example of SPARQL*: SELECT ?p ?a ?c WHERE { <<?p foaf:age ?a>> ex:certainty ?c . } Do you really need the RDF model to store data?
  35. 35. Data dissemination / Interoperability / Decentralisation: contributions and limits of Semantic Web technologies. Best solution to achieve interoperability of data; linking heterogeneous data; creating bridges between worlds impossible to reconcile; SPARQL as a powerful tool for querying data; asynchronous data retrieval; costs of maintainability; knowledge issues; full-text search not possible; structural interoperability impossible → data mappings.
  36. 36. Data dissemination / Interoperability / Decentralisation: overcoming the limits. Easy-to-use ontologies; simple CSV or JSON/XML dumps; simple API. What are the possible uses? Who are the users? Do we need this level of interoperability?
  37. 37. DATA MANAGEMENT AT THE FRENCH NATIONAL AUDIOVISUAL INSTITUTE
  38. 38. Functionally separate data from their use • To rethink data models in relation to their own logic and not their use • To acknowledge that some data models are dedicated to production and storage while several other models are designed specifically for data dissemination
  39. 39. Technically separate data from their use • The information system is organized in layers and no longer in silos • Data storage and processing are separated from business applications
  40. 40. An infrastructure to store and process data: 4 types of database system to store all types of data and to address all types of usage; a processing module to interact with the data and synchronize data between the different databases; a management module to abstract the technical infrastructure and expose logical data to business applications.
  41. 41. Thank you for your attention! Do you have any questions? And sorry for this… I would like to thank Emmanuelle Bermès (@figoblog) very much for the translation of this keynote!
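
Slides 33 and 34 contrast RDF triples, property graphs, and RDF* statements about statements. The sketch below is only an illustration in plain Python, not a real triple store or SPARQL engine: the `knows` edge, the node names, and the function `ages_with_certainty` are hypothetical, and only the `:bob foaf:age 23` statement with `ex:certainty 0.9` comes from the slides.

```python
from typing import Any, Dict, List, Tuple

Triple = Tuple[Any, str, Any]

# RDF-style data: every fact is one (subject, predicate, object) triple.
# The foaf:knows triple is a hypothetical addition for illustration.
rdf_triples: List[Triple] = [
    (":bob", "foaf:age", 23),
    (":bob", "foaf:knows", ":alice"),
]

# RDF*-style data: a quoted triple is itself the subject of a statement,
# mirroring  <<:bob foaf:age 23>> ex:certainty 0.9 .
rdf_star_triples: List[Triple] = [
    ((":bob", "foaf:age", 23), "ex:certainty", 0.9),
]

# Property-graph-style data: nodes and edges each carry key/value properties.
# The "knows" edge and its certainty value are hypothetical.
nodes: Dict[str, Dict[str, Any]] = {
    "bob": {"age": 23},
    "alice": {},
}
edges: List[Dict[str, Any]] = [
    {"source": "bob", "label": "knows", "target": "alice",
     "properties": {"certainty": 0.9}},
]


def ages_with_certainty(star: List[Triple]) -> List[Tuple[Any, Any, Any]]:
    """Emulate: SELECT ?p ?a ?c WHERE { <<?p foaf:age ?a>> ex:certainty ?c . }"""
    rows = []
    for subject, predicate, obj in star:
        # Keep only statements whose subject is a quoted foaf:age triple.
        if (
            predicate == "ex:certainty"
            and isinstance(subject, tuple)
            and len(subject) == 3
            and subject[1] == "foaf:age"
        ):
            person, _, age = subject
            rows.append((person, age, obj))
    return rows


if __name__ == "__main__":
    print(ages_with_certainty(rdf_star_triples))
```

In this toy form the RDF* statement and the property-graph edge both attach a certainty to a relationship rather than to a node, which is the kind of reconciliation slide 34 asks about.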

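
Slide 36 suggests simple CSV or JSON dumps as one way of overcoming the limits of RDF-based dissemination. As a rough sketch of what that could look like, assuming the data already exists as triples, the code below groups triples by subject and writes one flat record per resource. The sample identifiers and values, and the helper names `to_records`, `to_json` and `to_csv`, are invented for illustration.

```python
import csv
import io
import json
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical sample data; identifiers and values are invented.
TRIPLES: List[Tuple[str, str, str]] = [
    ("ex:doc1", "dc:title", "Example article"),
    ("ex:doc1", "dc:date", "2019-12-12"),
    ("ex:doc2", "dc:title", "Another article"),
]


def to_records(triples: List[Tuple[str, str, str]]) -> List[Dict[str, str]]:
    """Group triples by subject into one flat record per resource.

    Simplification: if a predicate repeats for a subject, the last value wins.
    """
    grouped: Dict[str, Dict[str, str]] = defaultdict(dict)
    for subject, predicate, obj in triples:
        # Use the local name after the prefix as the column name.
        column = predicate.split(":", 1)[-1]
        grouped[subject][column] = obj
    return [{"id": subject, **fields} for subject, fields in grouped.items()]


def to_json(records: List[Dict[str, str]]) -> str:
    """Serialize the records as a JSON array."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_csv(records: List[Dict[str, str]]) -> str:
    """Serialize the records as CSV, leaving missing fields empty."""
    columns = ["id"] + sorted({key for r in records for key in r if key != "id"})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    records = to_records(TRIPLES)
    print(to_json(records))
    print(to_csv(records))
```

Flattening loses the graph's links and multi-valued properties, which is the trade-off behind the slide's question about whether that level of interoperability is actually needed.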