Bib Data Experiment -- The GW Journey

ALA Annual Conference 2013
Chicago
Linked Data IG (LITA/ALCTS)

  • You are here because, I think, you are as interested as I am in how we can integrate our library data differently on the Web: how we can tease apart years of tradition in a way that continues our mission as librarians in the Web of data. The trustworthiness of our work carries on in information navigation and the pursuit of knowledge. How are we to move beyond years of training and challenge our ingrained mindset and work habits, so as to increase the usability and extensibility of our existing data? Then we can begin to expose, share, and connect pieces of data, information, and knowledge on the Web by using URIs and RDF. As the deconstruction of metadata begins, what does this upward curve represent from your own perspective and experience: stress, excitement, uncertainty, heightened expectation?
  • Tim Berners-Lee's 5-star rating model, which he presented at the Gov 2.0 Expo in 2010: https://www.youtube.com/watch_popup?v=ga1aSJXCFe0
  • ENVIRONMENTAL SCAN. Back in the early 2000s, LC, under the leadership of Deanna Marcum, had already begun thinking about our future in the world of Web data. Here is a quick rundown of the timeline up to the public unveiling of the BIBFRAME web site in Jan 2013:
    · 2006: Bib Control Working Group (BCWG)
    · 2007-2009: BCWG reports, responses to the final reports, MARC Marketplace Survey, four-year recommendations
    · 2009- : RDA
    · May 13, 2011: Deanna B. Marcum issued a statement on transforming the bibliographic framework
    With the implementation of RDA imminent after the national test and the analyses of participants' experiences, the rethinking of bibliographic control, including the MARC communication format, took the front seat. The review and evaluation must extend to a wider bibliographic framework. The BIBFRAME Initiative is ultimately meant to replace MARC: a content-standard-agnostic data framework that supports traditional bib, authority, and holdings data and moves us toward information services on the Web, beyond the library.
    In the early 2000s, GW Libraries, like many of its peers, also began looking at its library services and staff. R2's recommendations resulted in a major technical services reorganization. When I arrived in 2008, the reorg was in its last phase. The library took part in the US national RDA test and became an early experimenter in the BIBFRAME initiative.
  • BF purports to tease apart the existing bibliographic data. BF data is organized in 4 classes: Work, Instance, Authority, and Annotation. RELATIONSHIP is the key that holds the THINGs together: person, family, corporate body; event, concept, place, topic. The RDF vocabulary models data via HTTP URIs: "string" data turns into an entity that connects to its source. Data associated with ownership, with the description of a particular portion of a physical item, or with the location where the physical piece can be found (such as holdings) is at this time part of the Annotation entity. So are Cover Art and Reviews, which are viewed as being as important as holdings data.
  • BF data are in RDF. Here are a few examples of how the data can be serialized: RDF/XML, Turtle, and N-Triples, a line-based RDF serialization.
  • Here we see the form of a bibliographic record represented as a triple, in 3 parts.
  • In this slide, identifiers stand in place of Jane Austen; the title, Pride and prejudice; her birthplace and country; the topic of her work; and the type of material. RDF provides the simplest form for HTTP URI links. The end of each triple can begin another set of triples. The perpetual linking of HTTP URIs in triples makes the identifier an extremely important piece, and with it, implicitly, the authority data that undergirds each entity: person, corporate body, place, event, concept, object.
  • In the last quarter of 2012 (October-December):
    · 1.8 M bib records; 18% have at least one invalid or obsolete field.
    · Inconsistency of data recording.
    · Identified a smaller test set in Oct to test Zepheira's initial mapping algorithm.
    · Older and newer records: English, German, French, Arabic.
    · Criteria: 130 and 240; 6XX with subfield t; 7XX with subfield t.
    · GW local practice: single record for multiple formats.
  • MARC BIB, date range (mono, sound recordings, DVD).
    LC experimentation:
    · Working primarily with MARC21 data
    · Base BIBFRAME Works generated from Authorities: Name/Title Authorities, Title Authorities
    · MARC21 to BIBFRAME:
      · Raw transform (Splitting): essentially splitting a MARC21 Bib into Work, Instance(s), Annotation(s)
      · Consolidation (Folding): look for a pre-existing BIBFRAME Work, based on a [Name.] Title[. Language] match; if not found, create a new BIBFRAME Work
  • Here we began with monographs and with the data that belong to the Instance class, then worked our way to CreativeWork, considering the elements, properties, and classes that are part of an Instance in the selected dataset. The straightforward metadata:
    1) Instance: author, title, publication, collation, identifier (ISBN), edition
    2) CreativeWork: author, title, language, subject, class number
    As we saw earlier, a resource has an ID. The Work entity will have one, and so will the topic, the place of publication, the corporate body entity, etc. A URI is applied to most if not all of the entities/elements of a THING. For us, the hardest portion of the framework was wrapping the Holdings and Item records in the Annotation class. We will hear more about it at the Bib Updates at 10:30 AM. RDA; RIMMF.
  • The EE (Early Experimenters) worked on 15 point papers; 3 are now linked in the Documents section of the BIBFRAME web site: Resource Type, Authority, and Annotation. On Thursday, Schema.org was published via the BF listserv.
    · The Vocabulary Navigator helps to apply Work, Instance, and Authorities in relation to MARC21.
    · Tools support the transformation from MARC 21 to the BIBFRAME model.
    · Linked Data browsers of BIBFRAME data help demonstrate the benefit.
    Although the discussions among the EE have a library focus, members often have roles that extend to the Web. Discussion topics and participants:
    · Relationship to schema.org development: Jean Godby
    · Relationships in the model: Eric Miller, Kevin Ford, Sally McCallum; RDA relationships in particular: Alan Danskin (volunteered post meeting)
    · CCO, mapping to BIBFRAME: Jon Stroop, Rebecca Guenther
    · Resource types: Rebecca Guenther, Reinhold Heuvelmann, Jean Godby, Sally McCallum
    · Serials/Collections (whole-part relationships): Ted Fons, Nancy Fallgren, Jackie Shieh
    · Labels/non-sorting characters/punctuation: Reinhold Heuvelmann
    · Holdings: Jan Ashton, Corine Deliot, Nancy Fallgren, Reinhold Heuvelmann, Jackie Shieh
    · Authority in BIBFRAME: Ted Fons, Kevin Ford
    · Summary of MARC splitting used: Nate Trail, Kevin Ford, Sally McCallum
    · Modelling of serials: Zepheira (Eric Miller), Nancy Fallgren, Jackie Shieh
    · DACS in the model: Sally will check on someone from LC's Manuscripts, where they use DACS
    · Annotations: Ray Denenberg
    · Music issues: Jon Stroop
    · Practical cataloging scenarios: Zepheira (Eric Miller)
    · Mapping MODS to BIBFRAME: Rebecca Guenther
    · Labels/non-sorting characters/punctuation/sorting: Reinhold Heuvelmann
  • Look beyond MARC:
    · Continuing refinements of properties and classes
    · Competing standards (reconciling)
    · Best practices (community-based)
    · Serialization proposals and processes
    Best practices: some semantic issues, e.g. owl:sameAs. This built-in property links an individual to an individual; the two URIs it links have the same identity. Expanded to concepts and resources: Soccer == Football.
    GW:
    · What is a valid trigger for a different Instance (Manifestation) or Work, and how can that be expressed? Freebase: adaptedFrom // adaptedWork. Will the value 2 of the 2nd indicator and/or a note field suffice?
    · Language (008/35-37, 041 and 546; 040 $b): Resource (Work and/or Instance); Record description
    · What is the role of a description standard in the bib record (040 $e)? How about 042, an agency-assigned element?
    · Mul-ver record: print bib attached with multiple holdings for different physical representations in formats; multiple Bib-Hol to a single Item
    · Multiple scripts and vernacular scripts: separate set of W/I/A?
    · Will unstructured note fields with predictable punctuation suffice to generate a secondary Work/Instance link record?
    QUESTIONS:
    · Work/Instance/Annotation
    · Holdings/Annotation
    · Authorities
    · Data serialization (RDF/XML, Turtle, N-Triples, JSON)
  • The EE worked on 15 point papers; 3 are now linked in the Documents section of the BIBFRAME web site.
    · The Vocabulary Navigator helps to apply Work, Instance, and Authorities in relation to MARC21.
    · Tools support the transformation from MARC 21 to the BIBFRAME model.
    · Linked Data browsers of BIBFRAME data help demonstrate the benefit.
    Additional community tests:
    · VTLS: BIBFRAME experimentation, linked data navigation, and future OPACs
    · OCLC: Jean Godby, BIBFRAME and schema.org
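The Consolidation (Folding) step described in the notes on LC experimentation (look for a pre-existing Work by a [Name.] Title[. Language] match, otherwise create a new one) might be sketched roughly as follows. This is only an illustration, not the LC or Zepheira pipeline: the `WorkIndex` and `Work` classes, the `match_key` function, and the normalization rules (trimming trailing punctuation, case-folding) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str, str]


def match_key(title: str, name: Optional[str] = None,
              language: Optional[str] = None) -> Key:
    """Build a normalized [Name.] Title[. Language] key; absent parts become ''."""
    def norm(value: Optional[str]) -> str:
        # Hypothetical normalization: trim, drop trailing punctuation,
        # collapse internal whitespace, and case-fold.
        if not value:
            return ""
        return " ".join(value.strip().rstrip(".,;:/ ").split()).casefold()
    return (norm(name), norm(title), norm(language))


@dataclass
class Work:
    key: Key
    instance_ids: List[str] = field(default_factory=list)


class WorkIndex:
    """In-memory stand-in for a store of already generated Works."""

    def __init__(self) -> None:
        self._works: Dict[Key, Work] = {}

    def fold(self, instance_id: str, title: str,
             name: Optional[str] = None,
             language: Optional[str] = None) -> Work:
        key = match_key(title, name, language)
        work = self._works.get(key)       # look for a pre-existing Work
        if work is None:                  # if not found, create a new Work
            work = Work(key=key)
            self._works[key] = work
        work.instance_ids.append(instance_id)
        return work

    def __len__(self) -> int:
        return len(self._works)
```

Under these assumptions, two records for "Pride and prejudice" by "Austen, Jane, 1775-1817." in the same language fold into one Work with two Instances, while a record for a different title starts a new Work. How aggressive the normalization should be is exactly the kind of question the local-practice notes raise.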

Transcript

  • 1. Clip courtesy of Europeana’s Linked Data
  • 2. BIBFRAME
  • 3. RDF Serialization
    RDF/XML:
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
        <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Huckleberry_Finn">
          <dc:title>Adventures of Huckleberry Finn</dc:title>
          <dc:creator>Mark Twain</dc:creator>
        </rdf:Description>
      </rdf:RDF>
    Turtle:
      @prefix dc: <http://purl.org/dc/elements/1.1/> .
      <http://en.wikipedia.org/wiki/Huckleberry_Finn>
          dc:creator "Mark Twain" ;
          dc:title "Adventures of Huckleberry Finn" .
    N-Triples:
      <http://en.wikipedia.org/wiki/Huckleberry_Finn> <http://purl.org/dc/elements/1.1/title> "Adventures of Huckleberry Finn" .
      <http://en.wikipedia.org/wiki/Huckleberry_Finn> <http://purl.org/dc/elements/1.1/creator> "Mark Twain" .
  • 4. Database Scenario (diagram, based on Gordon Dunsire's slide)
    · Bib record (flat-file): Author: Austen, Jane, 1775-1817. Title: Pride and prejudice. Content type: Spoken word. Carrier type: Audio disc. Subject: Courtship--Fiction. Provenance: Donated by John Smith.
    · Bib record (description), with Name authority record (Name, Identifier) and Subject authority record (Label, Identifier)
    · FRBR/RDA record: Work information (Work title), Expression information, Manifestation information, Item information
    · Future record: RDA element registry, RDA content type registry (Label: Spoken word, Identifier), RDA carrier type registry, FRBR registry (IFLA), ONIX
    · Identifiers shown: http://id.loc.gov/authorities/names/n79032879, http://id.loc.gov/authorities/subjects/sh2008100298, http://id.loc.gov/authorities/names/n2002041181
  • 5. RDF Triple (diagram)
    · Subject: Jane Austen; Predicate: authored; Object: Pride and Prejudice
    · Other nodes, with their edge labels: 1775-1817 (Born on); Person, Female (Type); Book (Type); Courtship (Topic); Hampshire, England (Born in); County (Type); Great Britain (IsIn); United Kingdom (sameAs)
  • 6. RDF Triple (diagram, with identifiers)
    · Subject (Identifier): http://viaf.org/viaf/102333412; Predicate (Verb/Relationship): authored; Object (Identifier/Value): http://id.loc.gov/authorities/names/n2002041181
    · Other nodes, with their edge labels: 1775-1817 (Born on); Person, Female (Type); http://schema.org/Book (Type); http://id.loc.gov/authorities/subjects/sh85033596 (Topic); http://en.wikipedia.org/wiki/Hampshire,_England (Born in); County (Type); http://viaf.org/viaf/127756949 (IsIn); http://en.wikipedia.org/wiki/United_kingdom (sameAs)
  • 7. GW MARC Data
  • 8. GW Work-Arounds
  • 9. Work from Instance to CreativeWork
    · Instance (Embodiment of CreativeWork)
    · CreativeWork (Conception Essence)
    · Annotation (Holdings/Item) (Reviews, CoverArt)
    · Authority (Subject, Creator) (Publisher, Place, Event)
  • 10. GW BIBFRAMEMARC
  • 11. EE Activities
    · Prepared a test set to put through the LC and Zepheira pipeline (Oct 2012)
    · Adapted the draft BIBFRAME structure for local process (Nov-Dec 2012)
    · Feedback on data modeling (Jan 2013 - )
    · Initiated 15+ focused discussion papers; initial draft: late Feb. 2013
    · First point papers posted on BIBFRAME (May 2013)
  • 12. Thank You!
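As a companion to slides 3, 5, and 6, here is a minimal sketch of holding triples as plain Python tuples, writing them out as N-Triples, and following URI objects from one triple into the next. The `Triple` type, the helper functions, and the `http://example.org/vocab/` predicate URIs are hypothetical stand-ins (the slides label their predicates only as "authored" and "Topic"), and attaching the topic to the work is one reading of the slide 6 diagram rather than something the slides state.

```python
from typing import Iterable, List, NamedTuple

DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/vocab/"  # hypothetical predicate namespace, not from the slides


class Triple(NamedTuple):
    subject: str              # URI of the thing described
    predicate: str            # URI of the relationship
    obj: str                  # URI or literal text
    obj_is_uri: bool = False  # True when obj is itself an identifier


def escape_literal(text: str) -> str:
    """Escape backslash, double quote, and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialize each triple as one '<s> <p> o .' line."""
    lines = []
    for t in triples:
        obj = f"<{t.obj}>" if t.obj_is_uri else f'"{escape_literal(t.obj)}"'
        lines.append(f"<{t.subject}> <{t.predicate}> {obj} .")
    return "\n".join(lines) + "\n"


def reachable(triples: List[Triple], start: str) -> List[str]:
    """Follow URI objects breadth-first: the end of one triple begins the next."""
    seen = [start]
    queue = [start]
    while queue:
        current = queue.pop(0)
        for t in triples:
            if t.subject == current and t.obj_is_uri and t.obj not in seen:
                seen.append(t.obj)
                queue.append(t.obj)
    return seen


HUCK = "http://en.wikipedia.org/wiki/Huckleberry_Finn"
AUSTEN = "http://viaf.org/viaf/102333412"
WORK = "http://id.loc.gov/authorities/names/n2002041181"
TOPIC = "http://id.loc.gov/authorities/subjects/sh85033596"

GRAPH = [
    Triple(HUCK, DC + "title", "Adventures of Huckleberry Finn"),
    Triple(AUSTEN, EX + "authored", WORK, obj_is_uri=True),
    Triple(WORK, EX + "topic", TOPIC, obj_is_uri=True),  # assumed linkage
]
```

Calling `to_ntriples(GRAPH)` yields one line per triple in the same shape as the N-Triples example on slide 3, and `reachable(GRAPH, AUSTEN)` walks from the person identifier to the work and on to its topic, which is why stable identifiers matter so much in this model.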