BEA 2014--Understanding New Developments in Metadata
 

Originally presented at the BEA Conference session in 2014, this presentation covers ISNI, ONIX 3, and Thema.

Usage Rights

© All Rights Reserved

BEA 2014--Understanding New Developments in Metadata Presentation Transcript

  • 1. Understanding New Developments in Metadata BEA Conference 2014
  • 2. Richard Stark, Moderator: Director of Product Data, Barnes & Noble • Laura Dawson, Speaker: Product Manager, Identifier Services, Bowker • Chris Saynor, Speaker: Metadata Manager and Project Manager, GiantChair • Kempton Mooney, Speaker: Research and Analytics Director, Nielsen Book
  • 3. ISNI Disambiguating Public Identities
  • 4. What Is ISNI • ISO Standard, published in 2012 • International Standard Name Identifier • Numerical representation of a name – 16 digits – Assigned to public figures, contributors of content – researchers, authors, musicians, actors, publishers, research institutions – and subjects of that content (if they are people or institutions). – Example: 0000 0004 1029 5439
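The 16-character format on this slide can be checked mechanically. The sketch below assumes, beyond what the slide states, that the final character is an ISO 7064 MOD 11-2 check character (which may be "X"), as I understand the ISNI standard to specify. The function names are hypothetical.

```python
import re

def isni_check_character(first15: str) -> str:
    """Compute the MOD 11-2 check character for the first 15 digits (assumed scheme)."""
    total = 0
    for ch in first15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)

def is_valid_isni(text: str) -> bool:
    """Accept an ISNI written with or without spaces or hyphens."""
    compact = re.sub(r"[\s-]", "", text).upper()
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        return False
    return isni_check_character(compact[:15]) == compact[15]

def format_isni(text: str) -> str:
    """Display an ISNI in four groups of four, as on the slide."""
    compact = re.sub(r"[\s-]", "", text).upper()
    return " ".join(compact[i:i + 4] for i in range(0, 16, 4))

print(is_valid_isni("0000 0004 1029 5439"))  # the slide's example
print(format_isni("0000000410295439"))
```

A check like this only catches typing errors; it says nothing about whether the number has actually been assigned to anyone.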
  • 5. Who is ISNI • Founding members – IFRRO (International Federation of Reproduction Rights Organizations) – CISAC (International Confederation of Societies of Authors and Composers) – SCAPR (Societies’ Council for the Collective Management of Performers’ Rights) – OCLC – CENL (Conference of European National Librarians), represented by the British Library and the National Library of France – ProQuest, represented by Bowker
  • 6. ISNI Organizational Structure • Members • Quality Team • Board of Directors • Registration Agencies • Ongoing assignments/general public
  • 7. How Does ISNI Registration Work • Publisher submits names for assignment through a Registration Agency (RA) • RA works with the publisher to ensure the data feed is well-formatted, and sends that feed to the Assignment Agency (AA) • AA assigns as many ISNIs to the names in the feed as it can, using complex algorithms and business rules that evolve with each feed • AA returns a file of names with ISNIs attached to them – This may not be the full file of names – Ambiguous names are held for review by the Quality Team (QT) – QT assignments and other exceptions (assignments as a result of improvements to the algorithm) are returned to RA quarterly – Process is not instant. Assignment may be immediate if the name and other information are unique, but frequently assignments take a week or two.
  • 8. Stage One Customer submits data to Registration Agency Registration Agency sends file to Assignment Agency Assignment Agency assigns as many ISNIs to the names as it can
  • 9. Stage Two Assignment Agency sends assigned file to Registration Agency Registration Agency sends assigned file to Customer Customer reviews, QAs, ingests
  • 10. Stage Three Assignment Agency sends updates on a monthly basis Registration Agency disperses files to appropriate Customers Customers ingest updates
  • 11. Display • Only minimal metadata is displayed • Not meant as a comprehensive profile • ISNI is a tool for linking data sets, collocation, and disambiguation • Enhancements to the record can be made but not required
  • 12. Sample Public ISNI Record
  • 13. ISNI links
  • 14. Who is using ISNIs? • Wikipedia/Wikidata • VIAF • Access Copyright • Scholar Universe and Pivot • British Library • JISC • MusicBrainz • Macmillan (Digital Science) • BookNet Canada (piloting) • Authors Guild (piloting)
  • 15. Einstein’s Wikipedia Page
  • 16. How many names in the ISNI database? • Over 8,300,000 assigned • 10,112,931 provisional (awaiting a match from another data set for corroboration) • Your author names may well already have ISNIs. http://www.isni.org/search.
  • 17. Use Case: Publisher
  • 18. Use Case: Cross-Domain Linking
  • 19. Use Case: Cross-Domain Linking
  • 20. Data Quality • Based on matching names to existing records in database (over 18 million names) • Strict criteria for assigning ISNIs to names • Quality team oversight (manual edits) – British Library – National Library of France – OCLC
  • 21. Assignment Criteria • If on the common surname list: – Birth date – Death date – ISBN(s) – Title(s) – Co-authors or institutional affiliation • If not on the common surname list: – Title(s) – Birth date – Death date – Any other distinguishing factors (“is not”) • If unique: – Immediate assignment
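As a rough illustration of the criteria above, the sketch below picks the list of distinguishing fields that applies to a name and reports which of them a submitted record actually supplies. The field names, the sample surname list, and the record layout are all hypothetical; the slide does not say how many criteria must match, so no assignment decision is modelled here.

```python
# Field lists mirror the two bullet groups on the slide; the keys are invented.
CRITERIA_COMMON = ["birth_date", "death_date", "isbns", "titles",
                   "coauthors_or_affiliation"]
CRITERIA_OTHER = ["titles", "birth_date", "death_date", "other_factors"]

def applicable_criteria(surname, common_surnames):
    """Return the criteria list that applies to this surname."""
    if surname.lower() in common_surnames:
        return CRITERIA_COMMON
    return CRITERIA_OTHER

def supplied_evidence(record, common_surnames):
    """List the applicable criteria for which the record carries a value."""
    criteria = applicable_criteria(record["surname"], common_surnames)
    return [name for name in criteria if record.get(name)]

common = {"smith", "jones"}  # stand-in for the real common surname list
record = {"surname": "Smith", "birth_date": "1950", "titles": ["An Example Title"]}
print(supplied_evidence(record, common))
```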
  • 22. ISNI and ORCID • ORCID numbers are a subset of ISNI’s database • Working towards alignment, with ultimate goal of single assignment • There is ISNI representation on the ORCID Technical Steering Group, and ORCID representation on the ISNI Technical Committee • A researcher may have both an ORCID and an ISNI
  • 23. Do You Have An ISNI?
  • 24. Laura.Dawson@bowker.com
  • 25. Understanding New Developments in Metadata
  • 26. ???
  • 27. What is ONIX?
  • 34. • ONIX stands for ONline Information eXchange. • There are over 200 data elements. • ONIX is an international metadata standard for communicating book product information. • This electronic information is distributed between publishers, distributors, wholesalers, bookstores, online retailers, libraries, book data aggregators and anyone else involved in the supply chain. • ONIX allows global communication regardless of language. • Book information can be communicated between organizations with different technical infrastructures. • ONIX is not a database, but uses XML to organize data storage.
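To make "a message in XML" concrete, here is a minimal sketch that builds a one-product ONIX 3.0 message with Python's standard library. The sender name, record reference, ISBN, and date are placeholders, the XML namespace is omitted for brevity, and the result has not been validated against the ONIX schema; a real feed would carry far more than these few elements.

```python
import xml.etree.ElementTree as ET

def add(parent, tag, text=None):
    """Append a child element, optionally with text, and return it."""
    child = ET.SubElement(parent, tag)
    if text is not None:
        child.text = text
    return child

def build_onix_message(sender_name, isbn13, title):
    message = ET.Element("ONIXMessage", {"release": "3.0"})
    header = add(message, "Header")
    add(add(header, "Sender"), "SenderName", sender_name)
    add(header, "SentDateTime", "20140529")            # placeholder date
    product = add(message, "Product")
    add(product, "RecordReference", "example.org." + isbn13)  # placeholder
    add(product, "NotificationType", "03")             # confirmed on publication
    ident = add(product, "ProductIdentifier")
    add(ident, "ProductIDType", "15")                  # ISBN-13
    add(ident, "IDValue", isbn13)
    detail = add(product, "DescriptiveDetail")
    add(detail, "ProductComposition", "00")            # single-item product
    add(detail, "ProductForm", "BB")                   # hardback
    title_detail = add(detail, "TitleDetail")
    add(title_detail, "TitleType", "01")               # distinctive title
    element = add(title_detail, "TitleElement")
    add(element, "TitleElementLevel", "01")            # product level
    add(element, "TitleText", title)
    return message

message = build_onix_message("Example Publisher", "9780000000002", "Swann's Way")
print(ET.tostring(message, encoding="unicode"))
```

Almost every value is a code from a published code list rather than free text, which is what lets the same record be read across languages and systems.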
  • 35. ONIX. A history. With the growth of the internet and e-commerce in the 1990s there was a compelling need to create a standard digital format to communicate book information. The goal was to create a universal, international format with which publishers large and small could exchange information about their books.
  • 41. • ONIX was developed jointly in the late 1990s by EDItEUR with the Book Industry Study Group (BISG) in the US and Book Industry Communication (BIC) in the UK. • ONIX for Books 1.0 was published in January 2000. • ONIX for Books 2.1 (revision 02) was published in 2004. • ONIX for Books 3.0 was released in April 2009. • ONIX is governed by an International Steering Committee with local committees providing information, support and feedback internationally. • There are national ONIX groups in Australia, Belgium, Canada, China, Egypt, Finland, France, Germany, Italy, Japan, Korea, The Netherlands, Norway, Russia, Spain, Sweden, the UK and the USA. It is also used in many other countries.
  • 42. Why use ONIX?
  • 45. O N I X
  • 47. • ONIX is a message – not a database. • ONIX is a standard – a common language. • ONIX is international. • ONIX can communicate your title information with everyone.
  • 48. ???
  • 49. Why ONIX 3.0?
  • 51. • With the growth of new digital formats, ONIX needed revision. • ONIX 2.1 had a lot of deprecated elements left over from earlier versions of ONIX 2.
  • 52. What is different about ONIX 3.0?
  • 55. • ONIX 3.0 reflects the changed global book market. • ONIX 2.1 and 3.0 share many common traits. About 66% of a typical ONIX 2.1 message needs no significant changes to be valid ONIX 3.0. • Outdated and deprecated elements have been removed.
  • 56. Product supply information now better reflects the global nature of market
  • 59. • ONIX 3.0 pushes you to express all market data even if it is to say “Not known for these countries”. • Can express much more detailed pricing information on a global scale. • Can express dates and availability by market.
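A sketch of how per-market supply data might look: in ONIX 3.0 the ProductSupply block can repeat, once per market, each with its own territory, availability, and price. The supplier name, prices, and the choice of codes below are illustrative assumptions, and several elements a real record may need (for example market publishing status and dates) are left out.

```python
import xml.etree.ElementTree as ET

def add(parent, tag, text=None):
    """Append a child element, optionally with text, and return it."""
    child = ET.SubElement(parent, tag)
    if text is not None:
        child.text = text
    return child

def product_supply(countries, availability, amount, currency):
    """Build one ProductSupply block for a single market."""
    supply = ET.Element("ProductSupply")
    territory = add(add(supply, "Market"), "Territory")
    add(territory, "CountriesIncluded", " ".join(countries))  # ISO 3166-1 codes
    detail = add(supply, "SupplyDetail")
    supplier = add(detail, "Supplier")
    add(supplier, "SupplierRole", "01")               # publisher to retailers
    add(supplier, "SupplierName", "Example Distributor")  # placeholder
    add(detail, "ProductAvailability", availability)
    price = add(detail, "Price")
    add(price, "PriceType", "01")                     # RRP excluding tax
    add(price, "PriceAmount", amount)
    add(price, "CurrencyCode", currency)
    return supply

# One block per market: in stock in North America, not yet available in the UK.
north_america = product_supply(["US", "CA"], "21", "25.00", "USD")
uk = product_supply(["GB"], "10", "18.99", "GBP")
print(ET.tostring(uk, encoding="unicode"))
```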
  • 60. Percentage of population who speak English (Source: Wikipedia)
  • 61. ? ? ? ? ? ? ? ? ?? ? ?
  • 62. Digital products can now be described more completely
  • 66. • Formats changed to express method of delivery. • Information on DRM and usage constraints. • Accessibility information. • Rental information and conditions.
  • 67. “Set” and “Series” replaced by a more general notion of “Collections” • It is easier to express a shared identity.
  • 68. Title information can be expressed and defined more clearly
  • 69. In Search of Lost Time Volume 1 Swann’s Way
  • 70. A Storm of Swords A Song of Ice and Fire Book 3 Game of Thrones
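The title examples on the last two slides can be sketched as ONIX 3.0 markup: the series becomes a Collection with its own title, separate from the title of the book itself. Where exactly the part number ("Book 3") belongs is my reading of common practice rather than something the slides state, so treat that placement as an assumption to check against the ONIX 3.0 specification.

```python
import xml.etree.ElementTree as ET

def add(parent, tag, text=None):
    """Append a child element, optionally with text, and return it."""
    child = ET.SubElement(parent, tag)
    if text is not None:
        child.text = text
    return child

def titles_with_collection(collection_title, part_number, product_title):
    detail = ET.Element("DescriptiveDetail")
    collection = add(detail, "Collection")
    add(collection, "CollectionType", "10")           # publisher collection
    coll_titles = add(collection, "TitleDetail")
    add(coll_titles, "TitleType", "01")
    part = add(coll_titles, "TitleElement")
    add(part, "TitleElementLevel", "01")              # this product's place in the collection
    add(part, "PartNumber", part_number)              # assumed placement
    whole = add(coll_titles, "TitleElement")
    add(whole, "TitleElementLevel", "02")             # collection level
    add(whole, "TitleText", collection_title)
    own_titles = add(detail, "TitleDetail")
    add(own_titles, "TitleType", "01")
    own = add(own_titles, "TitleElement")
    add(own, "TitleElementLevel", "01")               # product level
    add(own, "TitleText", product_title)
    return detail

detail = titles_with_collection("A Song of Ice and Fire", "Book 3", "A Storm of Swords")
print(ET.tostring(detail, encoding="unicode"))
```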
  • 71. Better expression of data for marketing material
  • 74. • Text content – any text included in your metadata. • Cited content – any third party content you make reference to that could improve sales. • Supporting resources – any material a publisher wishes to make available in their metadata to support the sale of the title.
  • 75. “I always wanted to be a writer.”
  • 76. Multilingual data
  • 78. • Can repeat and send textual information in different languages and different scripts. • Add a note about a product in English, French and Spanish.
  • 79. • Not suitable for children under 36 months, due to small parts • No apto para niños menores de 36 meses, debido a las piezas pequeñas • Ne convient pas aux enfants de moins de 36 mois, en raison de petites pièces • Nicht geeignet für Kinder unter 36 Monaten, wegen verschluckbarer Kleinteile • Не подходит для детей в возрасте до 36 месяцев из-за мелких деталей
  • 80. • Can repeat and send textual information in different languages and different scripts. • Add a note about a product in English, French, Spanish etcetera... • Send your author’s biography in English and Spanish.
  • 81. • Miguel de Cervantes Saavedra (29 September 1547 (assumed) – 22 April 1616) was a Spanish novelist, poet, and playwright. His magnum opus, Don Quixote, considered to be the first modern European novel, is a classic of Western literature, and is regarded amongst the best works of fiction ever written. His influence on the Spanish language has been so great that the language is often called la lengua de Cervantes ("the language of Cervantes"). He was dubbed El Príncipe de los Ingenios ("The Prince of Wits"). • Miguel de Cervantes Saavedra (Alcalá de Henares, 29 de septiembre de 1547 – Madrid, 22 de abril de 1616) fue un soldado, novelista, poeta y dramaturgo español. Es considerado una de las máximas figuras de la literatura española y universalmente conocido por haber escrito Don Quijote de la Mancha, que muchos críticos han descrito como la primera novela moderna y una de las mejores obras de la literatura universal, además de ser el libro más editado y traducido de la historia, sólo superado por la Biblia. Se le ha dado el sobrenombre de «Príncipe de los Ingenios».
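The parallel biographies above could travel in one record by repeating the text element with a different language attribute each time. This sketch assumes the ONIX 3.0 BiographicalNote element inside Contributor and three-letter ISO 639-2/B language codes; the shortened texts are stand-ins for the full biographies.

```python
import xml.etree.ElementTree as ET

def add(parent, tag, text=None):
    """Append a child element, optionally with text, and return it."""
    child = ET.SubElement(parent, tag)
    if text is not None:
        child.text = text
    return child

def contributor_with_bios(name, bios):
    """bios maps a language code (e.g. 'eng', 'spa') to biography text."""
    contributor = ET.Element("Contributor")
    add(contributor, "SequenceNumber", "1")
    add(contributor, "ContributorRole", "A01")        # "by (author)"
    add(contributor, "PersonName", name)
    for language, text in bios.items():
        note = add(contributor, "BiographicalNote", text)
        note.set("language", language)                # one note per language
    return contributor

cervantes = contributor_with_bios(
    "Miguel de Cervantes Saavedra",
    {
        "eng": "Spanish novelist, poet, and playwright.",
        "spa": "Soldado, novelista, poeta y dramaturgo español.",
    },
)
print(ET.tostring(cervantes, encoding="unicode"))
```

A recipient can then pick the note matching its storefront language and ignore the rest.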
  • 82. Block updates
  • 84. • Send updates for part of the product instead of sending the whole product file. • So updates can be sent as smaller files.
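A block update might be assembled as follows: the record keeps its reference and identifier, is flagged as a partial update, and carries only the block that changed. The notification code "04" (partial update) is from the ONIX notification code list as I understand it; the record reference, ISBN, and the empty ProductSupply block are placeholders.

```python
import xml.etree.ElementTree as ET

def add(parent, tag, text=None):
    """Append a child element, optionally with text, and return it."""
    child = ET.SubElement(parent, tag)
    if text is not None:
        child.text = text
    return child

def partial_update(record_reference, isbn13, changed_block):
    """Wrap one changed block in a Product record flagged as a partial update."""
    product = ET.Element("Product")
    add(product, "RecordReference", record_reference)  # same reference as the original record
    add(product, "NotificationType", "04")             # update (partial)
    ident = add(product, "ProductIdentifier")
    add(ident, "ProductIDType", "15")                  # ISBN-13
    add(ident, "IDValue", isbn13)
    product.append(changed_block)                      # only the block that changed
    return product

new_supply = ET.Element("ProductSupply")               # would hold the new price or availability
update = partial_update("example.org.9780000000002", "9780000000002", new_supply)
print([child.tag for child in update])
```

Blocks that are absent from the update are left as the recipient already holds them, which is why the file can be so much smaller.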
  • 85. Even better resources • Very comprehensive ONIX 3.0 Global Best Practice and implementation documents available. • For developers, ONIX 3.0 has XSD and RNG schemas.
  • 86. More about best practices • BISG – Best Practices for Product Metadata: Guide for North American Senders and Receivers. • BISG – Best Practices for Keywords in Metadata: Guide for North American Senders and Receivers. • EDItEUR – Implementation and Best Practice Guide
  • 87. To find out more about ONIX www.editeur.org www.bisg.org http://www.booknetcanada.ca/
  • 88. Thema: The First Global Subject Category Codes, May 2014 (*Contains information from Howard Willows’ LBF 2014 presentation)
  • 89. Thema… What is it? How will it help? What are its implications? What does it look like?
  • 90. Thema: What is it? • Thema is a subject category classification system. • Thema is made for all members of the supply chain to use. • Thema is meant for use with physical and digital products. • Thema is an international standard for the global book trade.
  • 91. Thema: How will it help? • Book trade subject schemes tend to be national, not international • We can now clearly communicate all product data – except subject classification • Thema can replace the need for endless mappings & conversions • It is live! Version 1.0 was released in November 2013; the sunrise date was December 2013
  • 92. Thema: How will it help? • Facilitate international transactions • Increase understanding in international markets • Reduce subject code confusion • Increase discoverability
  • 93. Thema Committee Structure (maintained by EDItEUR) • Technical Committee • Review Committee • National Work Group (four shown) • Diagram label: “a subcommittee of”
  • 94. Thema countries: LBF 2014
  • 95. Current Participants (as of London Book Fair 2014) • AIE • Amazon • Australian PA • Baker & Taylor • Barnes & Noble • BIC • BISG • Bokrondellen • Booknet Canada • Bowker • BTLF • CB • Danish PA • Dilve • Editis • Electre • Elkotob.com • Elsevier • GiantChair • Guild of Book Dealers (Russia) • Hachette • HarperCollins • Informazioni Editoriali • Ingram • Japan Publishers Organisation • Kobo • Kogan Page • Libri • MVB • Nielsen Book • Norske Bokdatabasen • NTCPDS China • Penguin Random House • Springer • Waterstones
  • 96. Implications for BISAC Subject Heading Users • Thema will reduce mappings to BIC, BISAC, CLIL, etc. • Thema and BISAC will operate in parallel. • No timeline for BISAC being deprecated. • There is a BISAC-to-Thema mapping. (Can use BISAC to select a Thema code.)
  • 97. What does Thema look like? Subject Headings (Code, Heading, Notes) • F: Fiction & related • FJ: Speculative fiction • FJB: Dystopian fiction (Notes: Use for any fiction set in dysfunctional or degraded society; use with FL or FB codes if appropriate) • HIERARCHY: Because of hierarchy, F is implied in FJB.
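The hierarchy rule on this slide amounts to prefix matching: every leading substring of a subject code is itself a broader code. A minimal sketch, with hypothetical function names and intended for subject category codes only (not hyphenated qualifiers):

```python
from typing import List

def implied_codes(code: str) -> List[str]:
    """Return the code together with every broader code it implies."""
    return [code[:length] for length in range(1, len(code) + 1)]

def falls_under(code: str, broader: str) -> bool:
    """True when `code` sits at or below `broader` in the hierarchy."""
    return code.startswith(broader)

print(implied_codes("FJB"))
print(falls_under("FJB", "F"))
```

This is why a retailer browsing by "F" can surface a title that was only ever tagged "FJB".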
  • 98. What does Thema look like? Subject Headings – More Examples (Code, Heading) • AGA: History of art • FRX: Erotic romance • XAMC: Manga: Kodomo • NHW: Military history • QRRF: Zoroastrianism • KJMP: Project management • LWKF: Shariah law: family relations • MKE: Dentistry • UGB: Web graphics & design • WBB: TV / celebrity chef cookbooks • YBC: Children's picture books
  • 99. What does Thema look like? Qualifiers (Code, Heading) • Geographic: 1K The Americas; 1KBB United States of America, USA; 1KBB-US-NAKC New York City • Time Period: 3M c 1500 onwards to present day; 3MPQS c 1960 to c 1969; 3MPQS-US-P USA: Civil Rights Movement
  • 101. What does Thema look like? Qualifiers – More Examples (Type, Code, Heading) • Geographic: 1HFGU Uganda • Language (about, not in): 2ACSC Icelandic • Time Period: 3MD 16th century, c 1500 to c 1599 • Education: 4GH For International GCSE (IGCSE) • Interest: 5AG Interest age: from c 6 years • Artistic Style: 6BA Baroque
  • 102. Diving Deeper: Technical specs • Summary of Elements (Element: code begins; may contain; length; mandatory/optional) • Categories: begins A-Y; may contain A-Z, 1-9; length 1-8; mandatory • Geographical Qualifiers: begins 1; may contain 1, A-Z, hyphen; length 2-19; optional • Language Qualifiers: begins 2; may contain 2, A-Z, hyphen; length 2-19; optional • Time Period Qualifiers: begins 3; may contain 3, A-Z, hyphen; length 2-19; optional • Educ Purpose Qualifiers: begins 4; may contain 4, A-Z, hyphen; length 2-19; optional • Interest Qualifiers: begins 5; may contain 5, A-Z, hyphen; length 2-19; optional • Artistic Style Qualifiers: begins 6; may contain 6, A-Z, hyphen; length 2-19; optional • Thema in ONIX: use the following values from code lists 26 & 27 • 93 Thema subject category • 94 Thema geographical qualifier • 95 Thema language qualifier • 96 Thema time period qualifier • 97 Thema educational purpose qualifier • 98 Thema interest age / special interest qualifier • 99 Thema style qualifier
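The summary of elements translates fairly directly into a validator. The sketch below is one reading of that table: it checks the first character, the permitted characters, and the length, then returns the element type with its ONIX scheme identifier. It does not check that a code exists in the published Thema lists, and the function name is hypothetical.

```python
import re

# First character -> (element type, ONIX scheme identifier from lists 26 & 27).
QUALIFIER_TYPES = {
    "1": ("geographical qualifier", "94"),
    "2": ("language qualifier", "95"),
    "3": ("time period qualifier", "96"),
    "4": ("educational purpose qualifier", "97"),
    "5": ("interest age / special interest qualifier", "98"),
    "6": ("style qualifier", "99"),
}

def classify_thema_code(code):
    """Return (element type, scheme identifier), or raise ValueError if malformed."""
    # Categories: begin A-Y, contain A-Z or 1-9, total length 1-8.
    if re.fullmatch(r"[A-Y][A-Z1-9]{0,7}", code):
        return ("subject category", "93")
    # Qualifiers: begin with their digit, then that digit, A-Z or hyphen; length 2-19.
    first = code[:1]
    if first in QUALIFIER_TYPES and re.fullmatch(rf"{first}[{first}A-Z-]{{1,18}}", code):
        return QUALIFIER_TYPES[first]
    raise ValueError(f"not a well-formed Thema code: {code!r}")

for example in ["FJB", "1KBB-US-NAKC", "3MPQS-US-P", "6BA"]:
    print(example, classify_thema_code(example))
```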
  • 103. Diving Deeper: Technical specs • Only a Subject Category is mandatory; Qualifiers are optional. • The first Subject Category entered is the primary subject. • Thema is recognized in ONIX, and can be sent as part of any ONIX 2.* and ONIX 3.* messages, using standard ONIX practice for subject classification metadata. • In product records and message formats (such as ONIX), only the code is required to be communicated. • There is no defined upper limit of the number of Subject Category values or Qualifier values that may be assigned. • It is expected that a maximum of 10 of each type would sufficiently cover all reasonable circumstances. • Systems designers working with systems which require limits to be placed on data element lengths and/or number of occurrences are advised to provide for the full length of codes and recommended maximum number of occurrences.
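Putting these points together, Thema codes might be carried in ONIX 3.0 Subject composites like this. The sketch assumes the ONIX 3.0 form, where an empty MainSubject element flags the primary subject; ONIX 2.1 handles main subjects differently. The version string and the function name are illustrative, and only codes (not headings) are sent, as the slide notes.

```python
import xml.etree.ElementTree as ET

def add(parent, tag, text=None):
    """Append a child element, optionally with text, and return it."""
    child = ET.SubElement(parent, tag)
    if text is not None:
        child.text = text
    return child

def thema_subjects(categories, qualifiers=()):
    """categories: Thema subject codes, primary first.
    qualifiers: (scheme identifier, code) pairs, e.g. ("94", "1KBB-US-NAKC")."""
    subjects = []
    for position, code in enumerate(categories):
        subject = ET.Element("Subject")
        if position == 0:
            add(subject, "MainSubject")               # empty flag: primary subject
        add(subject, "SubjectSchemeIdentifier", "93") # Thema subject category
        add(subject, "SubjectSchemeVersion", "1.0")   # illustrative version
        add(subject, "SubjectCode", code)
        subjects.append(subject)
    for scheme, code in qualifiers:
        subject = ET.Element("Subject")
        add(subject, "SubjectSchemeIdentifier", scheme)
        add(subject, "SubjectSchemeVersion", "1.0")
        add(subject, "SubjectCode", code)
        subjects.append(subject)
    return subjects

subjects = thema_subjects(["FJB", "NHW"], [("94", "1KBB-US-NAKC")])
for subject in subjects:
    print(ET.tostring(subject, encoding="unicode"))
```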
  • 104. Notes on Implementation • The schema is now available via the EDItEUR website. • Documentation on structure definitions is available. • A document of basic user instructions is available. • A BISAC-to-Thema mapping is available. • It is live! Version 1.0 was released in November 2013 • Mappings from BIC & BISAC schemes completed • Full translations into French, German and Norwegian • Workshops & presentations for publishers in Germany • Other groups working on translations into Italian, Spanish, Swedish, etc. • In the US, various supply chain partners have said they are working towards transmitting and receiving Thema
  • 105. More on Thema • Official Thema Documentation: http://www.editeur.org/151/Thema/ • The US Thema Working Group: www.bisg.org • BISAC to Thema Translator: http://bisactothema.biblioshare.org/ • Kempton Mooney, Research and Analytics Director, Nielsen