Marc and beyond: 3 Linked Data Choices

  1. MARC and Beyond: Our Three Linked Data Choices. Richard Wallis, Evangelist and Founder, Data Liberate. richard.wallis@dataliberate.com, @rjw. IFLA WLIC 2018, Kuala Lumpur, August 26th 2018
  2. Independent Consultant, Evangelist & Founder. Worked With: • Google – Schema.org vocabulary, site, extensions, documentation and community • OCLC – Global library cooperative • FIBO – Financial Industry Business Ontology Group • Various Clients – Implementing/understanding Schema.org • British Library, Stanford University, Europeana, NLB Singapore. W3C Community Groups: • Schema Bib Extend (Chair) – Bibliographic data • Schema Architypes (Chair) – Archives • Financial Industry Business Ontology – fibo.schema.org • Tourism Structured Web Data (Co-Chair) • Schema Course Extension • Schema IoT Community • Educational & Occupational Credentials in Schema.org. richard.wallis@dataliberate.com, @rjw. 40+ Years – Computing • 28+ Years – Cultural Heritage technology • 13+ Years – Semantic Web & Linked Data
  3. A bit of Linked Data History • 1989 The Web Conceived • 1994 1st Web-based library discovery interface • 2001 Semantic Web Introduced • 2006 Linked Data • 2011 British Library Data Model • 2011 Schema.org Launched • 2012 OCLC add Schema.org to WorldCat.org • 2012 Library of Congress launch BIBFRAME Project. Linked Data – Tim Berners-Lee
  4. A bit of Linked Data History • 1989 The Web Conceived • 1994 1st Web-based library discovery interface • 2001 Semantic Web Introduced • 2006 Linked Data • 2011 British Library Data Model • 2011 Schema.org Launched • 2012 OCLC add Schema.org to WorldCat.org • 2012 Library of Congress launch BIBFRAME Project
  5. Library Linked Data … where are we now? • Lots of great individual linked data initiatives (interoperability between them a question) • Bibframe 2.0 – ready for wider standardized adoption • Schema.org – the de facto structured data standard for the web (30% of domains using it) • Potential for adding web links to MARC (Linky MARC)
  6. Linked Data: Why?
  7. Linked Data: The Challenge of Entity Reconciliation
  8. Prototyping a Linked Data Platform for Production Cataloging Workflows. A project vision statement: Work with our members through a foundational shift in the collaborative work of libraries, communities of practice, and end-users – dramatically improving efficiency, embracing the inclusive, diverse, and earnest OCLC membership, and empowering a new and trusted knowledge network enabled by the web. Andrew K. Pace, Executive Director, Technical Research
  9. #ALAAC18 Who • Phase I Partners (Dec ’17 – Apr ’18): Cornell University; University of California, Davis • Phase II Partners (!!!!) (May ’18 – Sep ’18): American University; Brigham Young University; Cleveland Public Library; Harvard University; Michigan State University; National Library of Medicine; North Carolina State University; Northwestern University; Princeton University; Smithsonian Library; Temple University; University of Minnesota; University of New Hampshire; Yale University • OCLC Global Technologies; OCLC Research; OCLC Global Product Management
  10. #ALAAC18 What • Develop an Entity Ecosystem that facilitates: o Creation and editing of new entities o Connecting entities to the Web • Build a community of users who can: o Create/Curate data in the ecosystem o Imagine/propose workflow uses o Communicate easily with each other and with OCLC to iteratively improve the prototype • Provide services to: o Reconcile data o Explore the data
  11. [Architecture diagram; component labels:] Reconciler: Index, Reconciliation API, Batch, Local Bibliographic and Authority Data, Ranking by Editor, UI, Duplicate Detection, WorldCat Creative Work Association. Entity Ecosystem: UI, Minting / Editing API, Authentication & Authorization, Entity to Entity Relator. External Client Applications
  12. Entity Ecosystem • Entity Creation / Edit / Reconciliation • Research Prototype – but working with OCLC Production and OCLC Members • Building on Open Source tools • Learning and applying lessons • Vocabulary independent • Potential to become a useful tool/service plus an authoritative knowledge graph
  13. Who Wants to be a Linked Data Library: What are my Library Linked Data Options? • BIBFRAME 2.0 • Schema.org • Linky MARC • Do nothing
  14. BIBFRAME 2.0 • Matured from 1.0 (and variations) • Not perfect but ‘good enough’ – continuing to develop • MARC Conversion specifications • Open source MARC Conversion software • Good basis for standardized Linked Data (RDF) interchange • Backed by Library of Congress – Supported by others • Introduces potential for entity-based cataloguing
  15. BIBFRAME 2.0 • A Library vocabulary/ontology/standard • Not recognized outside of library and associated organisations • No real use for increasing visibility & discoverability on the general web
  16. Schema.org • De facto structured web data standard vocabulary • Has bibliographic extension • Shared by embedding in normal page HTML (No special data endpoint required) • Already found in 30% of web • Requested by Google and others • Used to populate Search Engine Knowledge Graphs • Acknowledged by Google to influence indexing • Driving Semantic Search, Voice Search, Local Search, etc. • Not detailed/specific enough for library cataloguing, etc.
  17. Linky MARC • Adding http URIs to $0 & $1 MARC subfields • Recommendation of the PCC Task Group on URIs in MARC • Consistent approach to include links to entities such as people, organisations, etc. (authorities) • Not Linked Data – but a way to preserve identified entity URIs for the future and possibly use them in user interfaces etc.
  18. Do nothing • Wait for system suppliers to catch up • Need to keep aware of developments
  19. Who Wants to be a Linked Data Library: What are my Library Linked Data Options? • BIBFRAME 2.0 • Schema.org • Linky MARC • Do nothing. Chosen: BIBFRAME 2.0, Schema.org
  20. MARC and Beyond: Our Three Linked Data Choices. Richard Wallis, Evangelist and Founder, Data Liberate. richard.wallis@dataliberate.com, @rjw. IFLA WLIC 2018, Kuala Lumpur, August 26th 2018

Editor's Notes

  • My topic for today – Linked Data
    Let’s start with a brief bit of history
    1989 The Web Conceived
    Tim Berners-Lee introduces “Information Management: A Proposal” to his boss – the eventual response was “Vague but interesting”
    1994 1st Web-based library discovery interface
    Cooperation between Loughborough University & BLCMP (UK Library Cooperative) – which I had the pleasure of being part of
    2001 Semantic Web Introduced
    By Tim Berners-Lee & others in a now-famous Scientific American article
    2006 Linked Data
    Sir Tim introduces a pragmatic application of Semantic Web standards
    2011 British Library Data Model
    A seminal project that delivered a native Linked Data model for library metadata
    2011 Schema.org Launched
    Google, Bing, Yahoo introduce a standard vocabulary for web data markup
    2012 OCLC add Schema.org to WorldCat.org
    Schema.org structured data embedded in all 300 million+ WorldCat pages – still live today
    2012 Library of Congress launch BIBFRAME Project
    A Linked Data replacement/future for MARC
  • As you can see it has been a long and lively history for linked data.
    I have been promoting the benefits for well over a decade now.
    So for libraries…
  • Library Linked Data – where are we now?
    Over the years there have been many great Linked Data projects – from British, French, German, Spanish and other national libraries, open source systems, large public libraries, Europeana etc. – all great projects but one has to question how data interchange will scale between them.
    Bibframe, after a bit of an uneven start, has evolved into Bibframe 2.0 – ready for wider adoption
    Schema.org – grown in capability – has a bibliographic extension (which I helped introduce) – suitable for sharing library data with the world – widely adopted across many domains (on approx. 30% of them) – starting to be adopted by the SEO community
    Linky MARC - my term for proposals for adding URLs to MARC records describing major entities such as people, places, organisations, Works, subjects, etc.
  • So the obvious question is – why do we want linked data?
    Here are a few reasons I hear
  • Picking out three of these highlights what I like to call The Challenge of Linked Data Entity Reconciliation
    When working with entities it becomes obvious to users when we match or duplicate entity descriptions wrongly.
    A problem with naming that we have been grappling with for years, exacerbated by entity properties and a wide range of matching authorities.
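The matching problem described above can be illustrated with a minimal sketch. This is not the OCLC approach or any real reconciliation service: the `Entity` class, the `reconcile` function, the 0.85 threshold and the example URIs are all hypothetical, chosen only to show why a name alone is often not enough and how one extra property (here a birth year) can rule out a wrong match.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    """A deliberately tiny, hypothetical entity description."""
    uri: str
    name: str
    birth_year: Optional[int] = None


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def reconcile(candidate: Entity, authorities: List[Entity],
              threshold: float = 0.85) -> List[Entity]:
    """Return authority entities that plausibly describe the candidate.

    A name must be similar enough, and a known birth year on both
    sides must agree; a conflicting year vetoes the match.
    """
    scored = []
    for authority in authorities:
        score = name_similarity(candidate.name, authority.name)
        if score < threshold:
            continue
        both_known = (candidate.birth_year is not None
                      and authority.birth_year is not None)
        if both_known and candidate.birth_year != authority.birth_year:
            continue
        scored.append((score, authority))
    # Best name match first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [authority for _, authority in scored]


# Two different people sharing one name (illustrative data only).
AUTHORITIES = [
    Entity("http://authority.example.org/person/1", "John Smith", 1920),
    Entity("http://authority.example.org/person/2", "John Smith", 1964),
]
```

With only a name, `reconcile(Entity("local:1", "John Smith"), AUTHORITIES)` returns both authority entries, which is exactly the ambiguity a cataloguer would then have to resolve; supplying `birth_year=1964` narrows it to one.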
  • I have identified a project from OCLC Research that serves to address the workflows around such issues.
    They are working with OCLC Members to prototype workflows that will empower a new and trusted knowledge network enabled by the web.
    To help me describe this, my former colleague Andrew Pace has kindly lent me some slides from his recent presentation at ALA in New Orleans.
  • The initial phase was in cooperation with two OCLC members and, critically, two front line areas of OCLC – this is a project heading toward productized delivery.
    The second phase has added 14 other significant US members
  • They are building an Entity Ecosystem to enable the creation, relating, management and reconciliation of entities of all relevant types – people, organisations, places, concepts, events, etc.
    It is being built using open source technologies such as MediaWiki, Wikibase, and OpenRefine.
    It addresses issues such as relating entities, as against traditional practices of copy cataloguing; and the linking of entities to external authorities such as Wikidata, GeoNames, etc.
  • I’m not going to go into detail, but it is fairly clear from this diagram that two main functional areas, focused on editing and reconciliation, are built around a shared entity ecosystem – each with its own API and user interfaces.
  • In summary OCLC are working on:
    Entity Creation / Edit / Reconciliation
    Research Prototype – but working with OCLC Production and OCLC Members
    Building on Open Source tools
    Learning and applying lessons
    Vocabulary independent
    Potential to become a useful tool/service plus an authoritative knowledge graph
    Such a service can be vocabulary independent, but individual libraries and system suppliers need to decide what linked data vocabularies they are going to support in their systems….
  • For libraries with Linked Data ambitions the $64,000 question is which option do I take?
    BIBFRAME 2.0
    Schema.org
    Linky MARC
    Do nothing
    Let’s review the options
  • Bibframe 2.0:
    Matured from 1.0 (and variations)
    Not perfect but ‘good enough’ – continuing to develop
    MARC Conversion specifications
    Open source MARC Conversion software
    Good basis for standardized Linked Data (RDF) interchange
    Backed by Library of Congress - Supported by others
    Introduces potential for entity-based cataloguing
  • However….
    It is a library-specific standard – not recognized by organisations outside of the library world, such as the search engines.
    If you are looking to increase the visibility & discoverability of your resources across the web then, as my mother would have put it, Bibframe is about as much use as a chocolate teapot!
  • Schema.org:
    De facto structured web data standard vocabulary
    Has bibliographic extension
    Shared by embedding in normal page HTML (No special data endpoint required)
    Already found in 30% of web
    Requested by Google and others
    Used to populate Search Engine Knowledge Graphs
    Acknowledged by Google to influence indexing
    Driving Semantic Search, Voice Search, Local Search, etc.
    Not detailed/specific enough for library cataloguing, etc.
    So, you would not use Schema.org to build a comprehensive library system around.
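To make "embedding in normal page HTML" concrete, here is a minimal sketch that builds a Schema.org `Book` description as JSON-LD and wraps it in a script tag for a catalogue page. The helper names, the example URL and the ISBN are placeholders invented for illustration; `Book`, `name`, `author`, `isbn` and `url` are Schema.org terms, but a real record would carry more properties than shown here.

```python
import json


def book_jsonld(name: str, author: str, isbn: str, url: str) -> dict:
    """Describe one book using Schema.org terms (a minimal subset)."""
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": name,
        "author": {"@type": "Person", "name": author},
        "isbn": isbn,
        "url": url,
    }


def jsonld_script_tag(data: dict) -> str:
    """Serialise the description as a JSON-LD script element.

    "</" is escaped so that a value containing "</script>" cannot
    close the element early; "\\/" is still valid JSON for "/".
    """
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


if __name__ == "__main__":
    # Placeholder values: the URL and ISBN are not real identifiers.
    record = book_jsonld(
        name="Pride and Prejudice",
        author="Jane Austen",
        isbn="9780000000002",
        url="https://catalogue.example.org/item/123",
    )
    print(jsonld_script_tag(record))
```

The resulting element can be placed in the head or body of the existing item page, so no separate data endpoint is involved.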
  • As a library, you could wait for the system suppliers to catch up and then implement what they deliver.
    I would however suggest that you do not ignore developments in this area, engaging staff in understanding the basic principles and watching what the community and system suppliers are doing.
  • So what options should we take?
    Schema.org – for discovery on and from the web
    BIBFRAME 2.0 – for entity-based cataloguing – library linked data interchange – enhanced local user experience
    You really need to do both, but as libraries are primarily there to assist their users, most of whom do not start discovery in a library interface, Schema.org should provide the most initial benefit.
    Linky MARC – if your systems do not yet support these Linked Data options, look into adding this to your cataloguing practices to stop throwing away those links you increasingly discover.
    Doing nothing is not an ideal option for libraries – if you are a system supplier it is definitely not an option.
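As a rough picture of what Linky MARC means in a record, the sketch below models a single MARC field as a plain Python structure and stores entity URIs in its $0 and $1 subfields. As I understand MARC 21, $0 carries an authority record control number or standard number and $1 a Real World Object URI. The `MarcField` class, the helper functions and the example.org URIs are hypothetical; a production workflow would use a MARC library and real authority URIs.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

URI_SUBFIELDS = ("0", "1")  # $0 and $1, the subfields used for entity links


@dataclass
class MarcField:
    """A simplified MARC variable field: a tag plus ordered subfields."""
    tag: str
    subfields: List[Tuple[str, str]] = field(default_factory=list)


def add_entity_uri(marc_field: MarcField, code: str, uri: str) -> None:
    """Append an http(s) URI to $0 or $1, keeping existing subfields."""
    if code not in URI_SUBFIELDS:
        raise ValueError("entity URIs belong in $0 or $1")
    if not uri.startswith(("http://", "https://")):
        raise ValueError("expected an http(s) URI")
    marc_field.subfields.append((code, uri))


def entity_uris(marc_field: MarcField) -> List[str]:
    """Return the URIs held in $0/$1, skipping non-URI control numbers."""
    return [
        value
        for code, value in marc_field.subfields
        if code in URI_SUBFIELDS and value.startswith(("http://", "https://"))
    ]


# A personal-name heading with placeholder identifiers (not real ones).
author = MarcField("100", [("a", "Austen, Jane,"), ("d", "1775-1817")])
add_entity_uri(author, "0", "http://authority.example.org/names/n0000001")
add_entity_uri(author, "1", "http://entity.example.org/Q0000001")
```

Calling `entity_uris(author)` then returns the two stored links, ready to be used in a user interface or carried forward when the record is later converted to Linked Data.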
  • Thank you.
