Linked data at the Science Museum
Slides to accompany my talk at the Museum Computer Network conference 2012.

I discuss how we extract information from various repositories at the museum, and convert it to RDF format, as well as some of the challenges along the way.

The video of the talk can be seen at http://www.youtube.com/watch?v=NZZhkyEnxhk and the description of the conference session is at http://www.mcn.edu/2012/linked-open-data-science-museum

Published in: Technology
  • Daniel can’t be here
  • In common with other institutions, SM has disparate data silos
  • Not consolidation. Instead: scheduled, repeatable extraction
  • This is called RDF. Everything is a triple. Unique URIs. Triples interconnect to form a graph.
  • Individual triples with the same subject or object interconnect to form a graph of data
  • Inferencing: e.g. hasMother and hasFather both imply hasParent
  • Storage: no schema changes. Querying: no inherent hierarchy. Evolution: easy to change or layer or merge. Standard: works regardless of underlying systems. Linking: LD->LOD. Unambiguously connect.
  • http://richard.cyganiak.de/2007/10/lod/
  • Overall scheme:
        Collections Management System (MultiMimsy XG) -> RDBMS -> COBOAT
        Digital Asset Management System (iBase) -> RDBMS -> COBOAT
        Archives Management System (AdLib) -> REST API -> custom Python scripts (rdflib)
        Web Content Management System (Sitecore) -> .NET -> custom workflow hook (dotNetRDF)
        Legacy Web content (XML files) -> custom script for one-off import
    Some channels still pending.
  • Example query
  • Results shown as a graph
  • Explain structure of e.g. http://data.sciencemuseum.org.uk/id/agents/sm-12345; mention Cool URIs and UK government guidelines; possible example of opaque vs non-opaque
  • http://www.w3.org/TR/cooluris/ http://data.gov.uk/resources/uris http://www.cabinetoffice.gov.uk/resource-library/designing-uri-sets-uk-public-sector
  • Mention that we talked to the BL, the BM, archives hub, Kasabi
  • Discussion about design principle: minimise number of different classes; reuse popular ontologies; model our domain; limit repetition;
  • Popular ontologies: FOAF
  • Useful ontology: SKOS. Mention unresolved questions, e.g. how much duplication of predicates; how to map to others (inferencing?)
  • Brand new in terms of adoption. Tools: subtle differences between systems; inferencing not standard. Experts: compare finding info on Jena versus finding info on Apache web server. Mindset change: lots of relational database developers, comparatively few for LD.
  • Filtering: only public data should be exposed (at both item and field level). Reconciliation: making sure identifiers sync; making sure data links up (i.e. no orphaned content). Inconsistencies: example of many source predicates for ‘maker’ collapsed to a few in final structure; example of different uses of similar systems such as iBase BtL vs iBase Ingenious. Copyright: issue of text or images not owned by institution: triple store would need to only have references if made available as CC0.
  • UGC example: crowdsourced Babbage transcriptions. External example: Wikipedia bios for people, Geonames data for places.
  • Verbal summary: the move to linked data can be challenging, but is exciting and liberating
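
The notes on triples, graphs and inferencing above can be made concrete with a small sketch. It uses only the Python standard library rather than a real triple store, and every URI under `http://example.org/` (including `hasMother`, `hasFather` and `hasParent`) is a hypothetical identifier chosen for illustration; only the FOAF `firstName` URI comes from the slides. The rule table is a simplified stand-in for what an inferencing engine would do, not a description of the museum's setup.

```python
from collections import namedtuple

# A triple is just (subject, predicate, object).
Triple = namedtuple("Triple", "subject predicate object")

# Hypothetical namespace and predicates, for illustration only.
EX = "http://example.org/id/"
HAS_MOTHER = EX + "hasMother"
HAS_FATHER = EX + "hasFather"
HAS_PARENT = EX + "hasParent"
FOAF_FIRST_NAME = "http://xmlns.com/foaf/0.1/firstName"

# Triples that share a subject or object interconnect to form a graph.
graph = {
    Triple(EX + "alice", FOAF_FIRST_NAME, "Alice"),
    Triple(EX + "alice", HAS_MOTHER, EX + "carol"),
    Triple(EX + "alice", HAS_FATHER, EX + "bob"),
    Triple(EX + "carol", FOAF_FIRST_NAME, "Carol"),
}

# Rule table: each specific predicate implies a more general one.
SUB_PROPERTY_OF = {HAS_MOTHER: HAS_PARENT, HAS_FATHER: HAS_PARENT}


def infer(triples):
    """Return the input triples plus those implied by SUB_PROPERTY_OF."""
    inferred = set(triples)
    for t in triples:
        general = SUB_PROPERTY_OF.get(t.predicate)
        if general is not None:
            inferred.add(Triple(t.subject, general, t.object))
    return inferred


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    )


# hasParent was never stored; it is derived from hasMother and hasFather.
parents = [t.object for t in match(infer(graph), EX + "alice", HAS_PARENT)]
print(parents)
```

Because the query pattern treats any position as a wildcard, the same data can be asked for "everything about alice" or "everyone whose parent is carol" without a schema change, which is the flexibility the notes describe.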
  • Transcript

    • 1. Linked Data at The Science Museum. Tristan Roddis, Cogapp; Daniel Evans, Science Museum. MCN 2012, Seattle
    • 2. Agenda: The context; The big idea; Why linked data?; The approach; The challenges; Where next?; Questions
    • 3. The context
    • 4. Specifically: Collections Management System (MultiMimsy XG); Digital Asset Management System (iBase); Archives Management System (AdLib); Web Content Management System (Sitecore); Legacy Web content (XML files); Etc.
    • 5. [Diagram: separate silos: Sitecore (CMS), Mimsy (CMS), iBase (DAM), AdLib (AMS), legacy docs]
    • 6. The big idea
    • 7. Extract data from all silos and connect. Use Linked Data for extensibility.
    • 8. [Diagram: Sitecore (CMS), Mimsy (CMS), iBase (DAM), AdLib (AMS) and legacy docs all feeding a central triple store]
    • 9. Why linked data?
    • 10. A brief history of data: Relational; Hierarchical; Graph
    • 11. Linked Data is easy! cog:tristan (subject) foaf:firstName (predicate) “Tristan” (object). In full: <http://data.cogapp.com/id/tristan> <http://xmlns.com/foaf/0.1/firstName> "Tristan" .
    • 12. Linked Data ingredients: RDF triples; Triple-store; SPARQL endpoint; (Inferencing engine)
    • 13. Benefits of Linked Data: Flexible storage; Flexible querying; Evolution of data; Standard format and interface; Linking to the web of data
    • 14. The approach
    • 15. [Diagram: Mimsy (CMS) and iBase (DAM) reach the triple store via COBOAT; Sitecore (CMS) via a workflow hook; AdLib (AMS) via its API and rdflib; legacy docs via a one-off import]
    • 16. The challenges
    • 17. Identifiers: Stability; 303 redirects; Opaque versus human-readable; http://data.sciencemuseum.org.uk/id/objects/smxg-12345
    • 18-25. Ontologies (list built up one entry per slide): Dublin Core (DC); Dublin Core Terms (DCT); Friend Of A Friend (FOAF); Simple Knowledge Organization System (SKOS); CIDOC Conceptual Reference Model (CIDOC CRM); Europeana Data Model (EDM); schema.org; Etc.
    • 26. Linked Data is still young: Immature tools and technology; Small pool of experts; Mindset change
    • 27. Linked Data doesn’t solve everything: Filtering tasks; Reconciliation tasks; Exposes inconsistencies; Exposes copyright issues
    • 28. Where next?
    • 29. Where next? Data opportunities: more sources (Sitecore, legacy sites, new content); more data from existing sources (reconciliation between systems, turning literal strings into nodes); from Linked Data to Linked Open Data: link to DBpedia, Geonames, VIAF, BNB, etc.; inferencing to expose data via different ontologies
    • 30. Where next? Publishing opportunities: public SPARQL endpoint; REST API; website; pull in UGC; pull in external data
    • 31. [Diagram: Sitecore (CMS), Mimsy (CMS), iBase (DAM), AdLib (AMS) and legacy docs, plus UGC, Geonames and DBpedia, feed the triple store, which is exposed through SPARQL, a REST API and web pages]
    • 32. Questions?
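
Slide 17 (identifiers, stability, 303 redirects) can be illustrated with a short sketch of how such URIs might be minted and resolved. The `/id/objects/smxg-12345` shape is taken from the slide; the function names, the validation rules and the `/doc/` path used as the redirect target are assumptions made for this example, following the general pattern in the Cool URIs note rather than anything confirmed about the museum's service.

```python
import re

BASE = "http://data.sciencemuseum.org.uk"


def mint_uri(kind, source_prefix, record_id):
    """Build an identifier URI such as .../id/objects/smxg-12345.

    kind is the entity type ('objects', 'agents'), source_prefix names the
    originating system, and record_id is that system's numeric key.
    """
    for part in (kind, source_prefix):
        if not re.fullmatch(r"[a-z]+", part):
            raise ValueError("kind and source_prefix must be lowercase letters")
    return f"{BASE}/id/{kind}/{source_prefix}-{int(record_id)}"


def redirect_for(path):
    """Return (status, location) for a request path.

    Assumption: identifier URIs under /id/ name real-world things, so they
    answer 303 See Other and point at a document about the thing. The /doc/
    path is hypothetical.
    """
    if path.startswith("/id/"):
        return 303, BASE + "/doc/" + path[len("/id/"):]
    return 404, None


uri = mint_uri("objects", "smxg", 12345)
print(uri)
print(redirect_for("/id/objects/smxg-12345"))
```

Keeping the identifier opaque (a system prefix plus a record key, with no title or maker in the path) is one way to keep it stable when the descriptive data changes.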

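The filtering and inconsistency points from the notes on slide 27 (expose only public data at item and field level; collapse many source predicates for ‘maker’ into a few) could look roughly like this at the extraction stage. The record layout, the `public` flag, the source field names and the choice of Dublin Core Terms predicates are all assumptions for illustration; the talk does not specify the actual mapping.

```python
# Hypothetical mapping from source field names to target predicates.
# Several source spellings of 'maker' collapse to a single predicate.
# Fields absent from this table are treated as non-public and dropped.
PREDICATE_MAP = {
    "maker": "http://purl.org/dc/terms/creator",
    "made_by": "http://purl.org/dc/terms/creator",
    "manufacturer": "http://purl.org/dc/terms/creator",
    "title": "http://purl.org/dc/terms/title",
}


def to_triples(record):
    """Convert one source record to (subject, predicate, object) tuples.

    Item-level filter: records not flagged public yield nothing.
    Field-level filter: only fields listed in PREDICATE_MAP are emitted.
    """
    if not record.get("public", False):
        return []
    subject = f"http://example.org/id/objects/{record['id']}"
    return [
        (subject, PREDICATE_MAP[field], value)
        for field, value in sorted(record["fields"].items())
        if field in PREDICATE_MAP
    ]


# Made-up records standing in for rows extracted from a source system.
records = [
    {"id": "1", "public": True,
     "fields": {"made_by": "Example Maker", "title": "Example engine",
                "valuation": "1000"}},
    {"id": "2", "public": False,
     "fields": {"maker": "Hidden Maker", "title": "Hidden item"}},
]

triples = [t for record in records for t in to_triples(record)]
for t in triples:
    print(t)
```

Using an allow-list of fields means a new column in a source system stays private until someone deliberately maps it.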