BISG WEBCAST -- Identification & Digital Publications


The book industry has had the ISBN for nearly 40 years, and in that time there has been little cause for excitement. Now, suddenly, the whole subject of "identifiers" has become a hot topic, particularly when it comes to digital books and other online resources.

Published in: Education, Technology, Business

  1. 1. This BISG Webcast took place Tuesday, September 15, 2009 at 11:00 a.m. EDT. To register for future BISG Webcasts, please visit: Special thanks to our Webcast Sponsor: The U.S. ISBN Agency
  2. 2. “Working to create a more informed, empowered and efficient book industry supply chain for both physical and digital products.”
  3. 3. Andy Weissberg, VP of Identifier Services & Corporate Marketing, R.R. Bowker; Board of Directors, International ISBN Agency
  4. 4. a quick audience poll...
  5. 5. Do you believe that digital manifestations (specific formats, e.g., PDF, .Mobi) of books and/or other content should be identified with separate ISBNs?
  6. 6. Do you believe that digital manifestations (specific formats, e.g., PDF, .Mobi) of books and/or other content should be identified with separate ISBNs?
        • Yes: 52.1% (76 votes)
        • No: 22.6% (33 votes)
        • Not sure: 25.3% (37 votes)
        Results gathered during a live participant poll.
  7. 7. Mark Bide, Executive Director, EDItEUR. Mark Bide is Executive Director of EDItEUR, the global trade standards organization for the book and serial publishing industries. He is also the Project Director for the ACAP Project, and a Director of Rightscom, the specialist media consultancy. Since the early 1990s, Mark has been closely involved in media standardization strategies and in the design and management of standards for identification and metadata in the media. He has worked in and around the publishing industry for nearly 40 years, having been a Director of the European subsidiaries of both CBS Publishing and John Wiley & Sons. He is a Visiting Professor of the University of the Arts London.
  8. 8. A brief introduction...
  9. 9. • Identifiers are “just” a special class of name
          ◦ Unique within a given context
        • Why do we assign identifiers?
          ◦ Collocation – to bring together instances of the same thing
          ◦ Disambiguation – to distinguish things that are not the same
        • What does “the same” mean?
          ◦ Whether things are or are not the same is always contextual
          ◦ For example, an ISBN identifies instances as being “the same” for particular purposes – the meaning is not universal
        • Why does this matter?
          ◦ Unambiguous communication…
          ◦ …particularly from machine to machine (people don’t often use unique identifiers in discourse – “that one over there” is usually enough)
  10. 10. • When there is a need to communicate across organizational boundaries – within a supply chain…
          • …particularly where anyone in the supply chain needs to manage and aggregate information from multiple sources
            ◦ That means nearly everyone, particularly in a digital supply chain
          • What matters about standard identifiers?
            ◦ That their semantic should be clear to everyone…
            ◦ …in other words, everyone in the chain knows what type of thing they are identifying
          • So, an ISBN identifies a book, right?
            ◦ Well, no
            ◦ It identifies a product in the book supply chain…but we have tried to make the standard do so much more
  11. 11. Some quotes from the PersonaNonData blog (Michael Cairns), August 4th, “ISBN is Dead”:
          “I am increasingly concerned about the future health of the ISBN. In its current form the ISBN is not yet dead but therein lies the problem: ‘in its current form.’… As a community, we need to recognize that the ISBN may not be meeting its intended market need and that the future may make this deficiency even more stark. … Into this mix I would also add that ISBN can no longer stand generally independent of other identifiers, such as a work ID or party ID.”
  12. 12. • Identity is the critical item of metadata for interoperability…
            ◦ …“Are we talking about the same thing”?
          • The book industry recognised this very early – using the ISBN to identify products…
            ◦ …but then unfortunately went on to use the ISBN to identify everything else…
            ◦ …and built systems that were entirely ISBN-centric (a folly to which we will return)
  13. 13. • The “Master ISBN” is commonly used as a proxy “work” identifier in publishers’ systems
            ◦ The ISBN is used to identify both a Work and a Product
            ◦ Not a huge problem in the world of physical products…
            ◦ …but a growing one in the world of digital ones…
            ◦ …where we can add the problem of using the same identifier to identify two (or more) different products
          • Greater granularity substantially adds to the challenge
            ◦ Granularity of digital use…many different products (different ebook formats, different channels, different devices)
            ◦ Granularity of digital content…many different items of content used in different contexts (eg the same content used in many different learning objects)
  14. 14. Beginning at the beginning...
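  15. 15. [Diagram]
          • Manifestation – perceivable; a fixation made of atoms/bits; “I made it”
          • Expression – spatio-temporal; actions; “I did it”
          • Abstraction, aka “Work” – conceptual; thoughts; “I conceived it”
          A Manifestation is abstracted to an Expression, and an Expression is fixed in a Manifestation. An Expression is abstracted to an Abstraction, and an Abstraction is expressed in an Expression.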
  16. 16. Digital Asset Management as an example of the requirement for work identification in the publishing process.
  17. 17. [Diagram: three columns]
          • Abstractions (Works) – abstract assets or “content”
            ◦ Content: abstract, distinct from any specific perceivable manifestation
          • Manifestations – classes of perceivable assets
            ◦ Physical: fixed, made of atoms
            ◦ Digital: fixed, made of bits
          • Items – individual perceivable assets (physical or digital)
          Most recent standard metadata and identifier systems (eg ONIX, FRBR, DOI, RDA, DDEX, ISTC) recognize some variation of this model, though the exact terminology may vary.
          Manifestations are classes of physical or digital assets with common identifiers and identical attributes. For example: the class of all books with the same ISBN is viewed as a single Manifestation. An individual copy of the book is an Item.
  18. 18. [Diagram: examples for each column]
          • Abstractions (abstract assets or “content”)
            ◦ eg the words of a book or article, the image in a photograph, the figures and layout of a table, a logo, the design of a chair, a graph
            ◦ or the complete contents of a volume of a journal, a magazine, a book of photographs and maps, a series of books, a website, an academic course pack, a tv soap opera
          • Manifestations (classes of perceivable assets)
            ◦ Physical: eg a class of printed book, music CD, leaflet, sculpture
            ◦ Digital: eg a class of .pdf of an article, .mp3 of a sound recording, .exe of a program, .gif of a photograph
          • Items (individual perceivable assets)
            ◦ Physical: eg an individual copy of a printed book, music CD, leaflet, sculpture
            ◦ Digital: eg an individual copy of a .pdf of an article, .mp3 of a sound recording, .exe of a program, .gif of a photograph
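The Work / Manifestation / Item model on this slide can be sketched as a small data structure. This is an illustration only and not part of the webcast: the class names follow the slide's terms, while the field names, the helper `items_of` and all sample values (including the ISBNs, which are invented placeholders) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """An individual perceivable asset, e.g. one copy of a book."""
    item_id: str  # hypothetical local identifier, such as a shelf mark


@dataclass
class Manifestation:
    """A class of assets sharing one identifier and identical attributes."""
    isbn: str
    form: str  # "physical" or "digital"
    items: list[Item] = field(default_factory=list)


@dataclass
class Work:
    """Abstract content, distinct from any specific manifestation."""
    title: str
    manifestations: list[Manifestation] = field(default_factory=list)


def items_of(work: Work) -> list[Item]:
    """Collect every individual copy across all manifestations of a work."""
    return [item for m in work.manifestations for item in m.items]


# Sample values are invented for illustration only.
hardback = Manifestation(isbn="978-0-00-000000-2", form="physical",
                         items=[Item("shelf-001"), Item("shelf-002")])
ebook = Manifestation(isbn="978-0-00-000001-9", form="digital",
                      items=[Item("file-on-laptop")])
book = Work(title="Example Title", manifestations=[hardback, ebook])
```

Here one Work has two Manifestations, each with its own ISBN, and `items_of(book)` returns the three individual copies. Keeping the levels as separate types makes it harder to use one identifier for two different kinds of thing.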
  19. 19. This simplified example shows just one asset and relationship of each type. In practice there may be multiple assets of all types with relationships at all levels, including between assets of the same type…
          [Diagram across the Abstractions / Manifestations / Items columns. Labels: Content of book; Hardback print edition; a copy of; Digital download]
  20. 20. This is also simplified, but starts to show the complexity of identification and relationship that exists...
          [Diagram across the Abstractions / Manifestations / Items columns. Labels: Content of book; Chapter; Foreword; Photograph; Illustration; Hardback print edition; Print on Demand; TIFF; PDF; XML; Adobe Ebook; MS Ebook; Individual files in DAM System; Customer copy; Archive copy]
  21. 21. ...identifying creations.
  22. 22. [Diagram: identifiers applied to one book]
          • Abstraction – Content of book: ISTC
          • Manifestation, physical – Hardback print edition: ISBN, DOI
          • Manifestation, digital – ebook: ISBN, DOI (any)
          • Item, physical – Individual copy: Library shelf mark, GS1 SGTIN (RFID)
          • Item, digital – File on individual computer: Filename
  23. 23. Some standard identifiers suitable for identifying assets of different types
          • Abstractions (content): ISTC (words), PII (articles), ISSN (serials), ISAN (audiovisual), ISWC (music), DOI (any)
          • Manifestations, physical: ISBN (books), ISMN (sheet music), UPC (products), EAN13 (products), DOI (any)
          • Manifestations, digital: ISBN (products), DOI (any)
          • Items, physical – Individual copy: DOI, GS1 SGTIN (RFID)
          • Items, digital – File on individual computer: DOI (any)
  24. 24. • When there is a need to communicate across organizational boundaries – within a supply chain
          • Internal requirements for interoperability between systems are not the same as external communication requirements
          • Never allow your operational flexibility to be limited by the limitations or requirements of existing standards
            ◦ You must be able to identify what you need to be able to identify within your own systems how, when and where you need to identify it!
            ◦ …and to be able to identify the same entity with standard identifiers when this is appropriate
          • Prefer standards for external communications
            ◦ Minimise confusion and complexity in the supply chain
  25. 25. ...the solution to textual work identification?
  26. 26. • The purpose of the International Standard Text Code (ISTC) is to enable the efficient identification of textual works. The ISTC provides a means of uniquely and persistently identifying textual works in information systems and of facilitating the exchange of information about those works between authors, agents, publishers, retailers, libraries, rights administrators and other interested parties, on an international level.
          • The ISTC may be applied to any textual work, whenever there is an intention to produce such a work in the form of one or more manifestations. It provides an identification data element for applications that record and exchange information about textual works and related manifestations. For example, the ISTC may be used for the purposes of collocating subsequent manifestations of the same work or derivations of the same work in applications involving electronic rights administration or information retrieval.
  27. 27. • Textual works that are eligible for an ISTC include any distinct abstract entity, predominantly composed of a combination of words, that can be described to satisfy the ISTC metadata requirements. In order to be assigned an ISTC, the declared metadata for any textual work at the time of registration shall contain at least one element pertaining to the work itself that distinguishes it from every other textual work to which an ISTC has already been assigned.
          • If two entities share identical ISTC metadata, they shall be treated as the same textual work and shall have the same ISTC.
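The registration rule quoted on this slide (identical metadata means the same work and the same identifier; at least one distinguishing element means a new one) can be sketched as a toy registry. Everything here is hypothetical: the class `WorkRegistry`, the metadata field names, the case and whitespace normalisation, and the `WORK-…` identifier format, which is a placeholder and not real ISTC syntax.

```python
import itertools


class WorkRegistry:
    """Toy registry: identical work metadata always yields the same identifier."""

    def __init__(self) -> None:
        self._by_metadata: dict[tuple, str] = {}
        self._counter = itertools.count(1)

    @staticmethod
    def _key(metadata: dict[str, str]) -> tuple:
        # Assumption: case and repeated whitespace are not distinguishing elements.
        return tuple(sorted(
            (name, " ".join(value.split()).lower())
            for name, value in metadata.items()
        ))

    def register(self, metadata: dict[str, str]) -> str:
        key = self._key(metadata)
        if key not in self._by_metadata:
            # Placeholder format, NOT real ISTC syntax.
            self._by_metadata[key] = f"WORK-{next(self._counter):06d}"
        return self._by_metadata[key]


registry = WorkRegistry()
first = registry.register({"title": "Example Title", "author": "A. Writer"})
again = registry.register({"title": "example  title", "author": "A. Writer"})
abridged = registry.register({"title": "Example Title", "author": "A. Writer",
                              "derivation": "abridged"})
```

In this sketch `first` and `again` receive the same identifier, while `abridged` carries one extra distinguishing element and so receives a new one. A real registration agency would need far richer matching than this.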
  28. 28. Different users may have different functional requirements – so may have a different view of the necessary granularity:
            ◦ Publishers may need to collocate all the different manifestations of the same edition of a book they publish
            ◦ Retailers may need to do the same…or may need to collocate all the different editions of “the same work” from many different publishers
            ◦ Librarians may need to collocate the same work, but distinguish between what FRBR calls different “expressions” of “the same work”
            ◦ Rights management organisations may need to distinguish between different versions of “the same work” because of differences in rights ownership
  29. 29. • Yes… but at the expense of rather more sophisticated metadata management than we are used to
            ◦ Many relationships have to be created and managed
          • And what about the problems of “fragments” of text…
            ◦ …or photographs or
          • Serious system implications for everyone who needs to manage the ISTC
          • ISTC will need to find a sustainable economic model
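The point about granularity can be illustrated by grouping the same product records at two different levels. This is a sketch under stated assumptions: the record layout, the field names and the helper `collocate` are hypothetical, and every value is invented.

```python
from collections import defaultdict

# Hypothetical product records; every value here is invented for illustration.
records = [
    {"product": "P1", "work": "W1", "edition": "E1", "publisher": "Alpha", "format": "hardback"},
    {"product": "P2", "work": "W1", "edition": "E1", "publisher": "Alpha", "format": "epub"},
    {"product": "P3", "work": "W1", "edition": "E2", "publisher": "Beta", "format": "paperback"},
    {"product": "P4", "work": "W2", "edition": "E3", "publisher": "Beta", "format": "pdf"},
]


def collocate(rows, *fields):
    """Group product ids by the chosen fields; the fields set the granularity."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[f] for f in fields)].append(row["product"])
    return dict(groups)


# A publisher's view: manifestations of the same edition belong together.
by_edition = collocate(records, "work", "edition")

# A retailer's view: all editions of "the same work", across publishers.
by_work = collocate(records, "work")
```

The same four products fall into three groups at edition level and two at work level. Neither grouping is possible unless the records carry an identifier at the level being collocated.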
  30. 30. ...why have we run into problems & how can we escape?
  31. 31. • ISBN system devised in late 1960s
            ◦ Initially implemented in the UK as the 9-digit SBN
          • ISO ISBN standard (ISO 2108) first published in 1970
            ◦ UPC introduced in 1973, EAN-13 in 1977
          • Universally adopted as the key identifier for books in the supply chain (agencies in 170 countries)
          • 4th Edition of standard published May 2005
            ◦ 13-digit ISBN – 1 January 2007
            ◦ Explicit guidelines for e-books
            ◦ Assignment to chapters/fragments
  32. 32. • “A separate ISBN shall be assigned to each separate monographic publication, or separate edition of a monographic publication issued by a publisher. A separate ISBN shall be assigned to each different language edition of a monographic publication.”
          • “Different product forms (e.g. hardcover, paperback, Braille, audio-book, video, online electronic publication) shall be assigned separate ISBNs. Each different format of an electronic publication (e.g. “.lit”, “.pdf”, “.html”, “.pdb”) that is published and made separately available shall be given a separate ISBN.”
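As background to the move to the 13-digit ISBN, the sketch below shows how a 10-digit ISBN maps onto the 13-digit form: the 978 prefix is added, the old check digit is dropped, and a new check digit is computed with the EAN-13 modulus-10 rule (weights alternating 1 and 3). The slide does not give this procedure; the function names are hypothetical and the code does not validate the original ISBN-10 check digit.

```python
def isbn13_check_digit(first12: str) -> str:
    """Check digit for a 12-digit stem: weights alternate 1, 3 from the left."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Prefix 978 to the first nine digits and recompute the check digit."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10 or not digits[:9].isdigit():
        raise ValueError("expected a 10-character ISBN")
    stem = "978" + digits[:9]
    return stem + isbn13_check_digit(stem)


converted = isbn10_to_isbn13("0-306-40615-2")  # "9780306406157"
```

The conversion changes the check digit (2 becomes 7 in this example), which is one reason systems that stored ISBNs as fixed 10-character keys needed rework for the 2007 transition.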
  33. 33. • Ease of trading
            ◦ Most book trade e-commerce systems require ISBNs
            ◦ Certainty of identification is critical for effective e-commerce
          • Ease of discovery of the different formats available
            ◦ Bibliographic databases require ISBNs and users do not want to be tied to one channel
          • Collecting detailed sales/usage data
            ◦ If separate formats are not identified in a standard way, sales and usage data by format cannot be easily collected
  34. 34. • “We only “publish” one generic format (e.g. .epub) and assign an ISBN to that”
          • “We are not responsible for formats provided by third party intermediaries”
          • “We don’t care whether or not different product formats are listed in bibliographic databases.”
          • “Our hardware-led channels do not require standard identifiers and customers will find our books through their preferred platform.”
          • “Our system requires us to manually create and manage separate ONIX records for each ISBN we assign.”
  35. 35. • It avoids an explosion of identifiers
            ◦ Think of all the numbers you might need when you multiply the different potential permutations of content by the number of different formats
          • But will it work in the supply chain? Not everyone thinks so
          “Each e-book title should have a unique ISBN for its format and for its vendor. This is necessary to allow librarians to easily discover who is supplying e-books, in what format they are available and through which vendors they can acquire them.”
          JISC Collections (UK); Consortium for Common Information Infrastructure (the Netherlands)
  36. 36. 1. Use proprietary product identifiers in the channel
          2. Have someone else apply ISBNs in the channel
          3. Introduce yet another new identifier… like the music industry has
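The “explosion of identifiers” is simple multiplication, which the sketch below makes concrete. The list size, format names and vendor names are invented for illustration, and `identifiers_needed` is a hypothetical helper.

```python
from math import prod


def identifiers_needed(titles: int, *dimensions: list[str]) -> int:
    """One identifier per title for every combination of the given dimensions."""
    return titles * prod(len(d) for d in dimensions)


# Invented figures: a 1,000-title list, three formats, two vendors.
formats = ["epub", "pdf", "mobi"]
vendors = ["vendor-a", "vendor-b"]

one_per_title = identifiers_needed(1000)                          # 1000
one_per_format = identifiers_needed(1000, formats)                # 3000
one_per_format_and_vendor = identifiers_needed(1000, formats, vendors)  # 6000
```

Identifying by format and by vendor, as the quoted position asks, multiplies the count again, so each added dimension scales the number of identifiers and the metadata records behind them.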
  37. 37. • Advantages
            ◦ Some vendors already apply proprietary identifiers at the level of individual SKUs, so no additional work
            ◦ Publishers don’t need to bother with proliferation of new identifiers, and can simply issue an “ebook ISBN” (against which vendors report)
          • Disadvantages
            ◦ Further along the chain (eg in libraries) the identifiers will have no meaning (and may be impossible to manage)
            ◦ The information available to publishers collecting data simply against a single ISBN may be inadequate
          • The worst of all possible worlds?
            ◦ Identifiers which look like ISBNs but are not
          • A systems driven solution – semantic and technical chaos
  38. 38. • Advantages
            ◦ Identifier familiar throughout the chain
            ◦ Publishers don’t need to bother with proliferation of new identifiers, and can simply issue an “ebook ISBN” (against which vendors report)
          • Disadvantages
            ◦ Potentially, considerable confusion – is the channel the correct point of granularity?
            ◦ Publishers deeply dislike the idea of someone else being allowed to identify “their books”
          • Nevertheless, some wholesalers are now moving towards getting an ISBN prefix and assigning their own
            ◦ “We will always prefer the publisher’s format specific ISBN and also will link the parent ISBN [???] to our own prefixed number”
  39. 39. • The Global Release Identifier [GRid]
            ◦ Identifies… “bundles of one or more Digital Resources compiled for the purpose of electronic distribution. It is not used to identify any specific Product which contains such a Release, or individual instances of the Release.”
          • Purpose: to manage the proliferation of products and the lack of an appropriate standard product identifier
            ◦ The music industry has never had its own standard product identifier
            ◦ Has used UPC/EAN
  40. 40. [Diagram] ISWC identifies the Song; ISRC identifies each Recording. Many recordings of the same song
  41. 41. [Diagram] Song (ISWC) → Recording or other content (ISRC) → Release (GRid) → Product (Prop., UPC, ISRC!). Many products with the same content, but different technical characteristics, permissions etc
  42. 42. • Advantages
            ◦ Clarity of identification
          • Disadvantages
            ◦ Implementation costs and comprehension problems
          • The reality in the music industry
            ◦ GRid adoption has been slow
            ◦ Different labels are applying in different ways
              • The labels have never been very disciplined in applying ISRC
            ◦ There is a huge amount of metadata required in reporting
              • Not simply “10 copies of GRid 123 sold this month”
              • Standard product identifiers also required?
  43. 43. • Millions of books without ISBNs
            ◦ Should they be applied retrospectively?
            ◦ By whom?
          • Digitisations of those millions of books (with and without ISBNs)
            ◦ Should they be given (different) ISBNs?
            ◦ By whom?
          • Critical decisions to be made by ISBN community as well as by those undertaking digitisations
            ◦ But note these two – completely different – potential applications of ISBN should not be conflated
  44. 44. ...the next one to watch?
  45. 45. From a draft of the “Draft International Standard” (DIS)
          • This International Standard specifies the International Standard name identifier (ISNI) for the identification of public identities of parties; that is, the identities used publicly by parties involved throughout the media content industries in the creation, production, management, and content distribution chains.
          • The ISNI system uniquely identifies public identities across multiple fields of creative activity and provides a tool for disambiguating public identities that might otherwise be confused.
          • The ISNI is not intended to provide direct access to comprehensive information about a public identity but can provide links to other systems where such information is held.
  46. 46. • Library name authority projects [VIAF]
          • Rights management
            ◦ Across the media
            ◦ Has implications (eg) for the Book Rights Registry
          • Requires the development of unique identities for publishers and imprints
          • Has potential retail application but not the main driver
            ◦ “Other books by this author”
  47. 47. • Probably simply “watch this space”…
            ◦ …and think about the implications for you from a systems perspective if the system is widely implemented
          • If you want to influence, you need to engage through your national body (NISO in the US)
            ◦ Moving towards the end of the process
  48. 48. Some conclusions...
  49. 49. • Identifiers support interoperability between systems; if all those systems are within the same organisation, then identifiers can (and should) be proprietary
            ◦ Nobody (other than you!) cares much about how your DAM communicates with your distribution system or your royalty system
          • Don’t allow external constraints to dictate internal system requirements
            ◦ Standards are for external not internal interoperability
          • Short term system constraints are a poor basis for determining a strategy
            ◦ However persuasive those arguments are at a time of financial constraint
          • You can identify things with the same identifier as long as you are happy always to treat them as being “the same thing” – but when you need to distinguish between them, using the same identifier will cause you problems
          • Whatever you decide will be extremely difficult to undo
            ◦ Lumping is easier than splitting “after the event”…
  50. 50. • As you need to communicate with other people’s systems, standard identifiers become increasingly helpful – particularly in supporting unambiguous many to many communication.
            ◦ Common syntax
            ◦ Common “identity model”
              • What is being identified
              • What are the granularity rules
          • No single constituency in the supply chain makes decisions about the identifiers that are going to be applied; it requires consensus
            ◦ It is only supply chain pressure that will be effective in “enforcing” the consensus…
            ◦ …and this may not be easy if not everyone wants standards (for their own reasons)…
            ◦ …and remember that some constituencies will be more powerful than others
  51. 51. • ISTC has to demonstrate that it has sufficient value to overcome inertia (and particularly the cost of appropriate metadata management)
            ◦ …and to demonstrate that it can properly fulfil the different requirements of its very different constituencies
            ◦ Book Rights Registry may be the tipping point
              • Work collocation and disambiguation lie at the heart of rights management
          • ISBN has to resolve some significant challenges if it is to continue to be an effective identifier for the next 40 years
            ◦ It certainly isn’t dead…but we could kill it (and will almost certainly regret it if we do)
          • ISNI, like ISTC, will need to demonstrate value
            ◦ …but don’t forget that the value to others may be considerable
  52. 52. • Do we need other identifiers?
            ◦ Gain consensus around business requirements
            ◦ Gain consensus around technical requirements
            ◦ Then, and only then, specify the solution
          • Don’t take an existing identifier and try to use it for something for which it was not designed
            ◦ It won’t work
            ◦ You risk creating complete chaos…
  53. 53. A quick audience poll...
  54. 54. Now that you’ve taken part in this BISG Webcast, what do you believe is the biggest barrier to assigning ISBNs to digital products?
  55. 55. Now that you’ve taken part in this BISG Webcast, what do you believe is the biggest barrier to assigning ISBNs to digital products?
          • There are no barriers: 11.3% (16 votes)
          • Price of ISBNs: 8.5% (12 votes)
          • Perceived value (or lack thereof) for my business: 6.3% (9 votes)
          • Information / metadata "bloat": 33.1% (47 votes)
          • Current workflows make it difficult to assign them: 19.0% (27 votes)
          • Current digital business model(s) don't necessarily require them: 16.2% (23 votes)
          • Other: 5.6% (8 votes)
          Results gathered during a live participant poll.
  56. 56. Mark Bide: Website: Andy Weissberg: andy weissberg@bowker com Website: Angela Bole: Website: