eBooks: Why they break ISBNs
It's different from print publishing
Who we are
● Unit of the Victoria University library
● Digital (re)publisher of documents used in
teaching, learning and research
● TEI/XML, Tomcat/Cocoon/XSLT
● Out-sourced digitisation
● In-house authority control
● Open standard for eBooks
● A zip file of all the same stuff you can put on a
● DAISY metadata for navigation
● XHTML, CSS, etc
● We create ePubs by crawling our website
● The device, not the page, does navigation
● grep dimensioned measurements from CSS
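The "grep dimensioned measurements" step can be sketched roughly as below. This is a minimal illustration, not the tool used in practice: it assumes "dimensioned" means absolute units (px, pt, pc, cm, mm, in) that do not adapt to a reading device, and the function name and sample stylesheet are hypothetical.

```python
import re

# Assumption: "dimensioned" means absolute units that ignore the device's
# screen size; relative units such as em and % are deliberately not matched.
DIMENSION = re.compile(
    r"(?<![\w.#-])(\d*\.?\d+)(px|pt|pc|cm|mm|in)\b", re.IGNORECASE
)


def find_dimensions(css_text):
    """Return (line_number, measurement) pairs for absolute-unit values."""
    hits = []
    for number, line in enumerate(css_text.splitlines(), start=1):
        for match in DIMENSION.finditer(line):
            hits.append((number, match.group(0)))
    return hits


# Hypothetical stylesheet, used only to exercise the function.
sample_css = """\
body { margin: 0; font-size: 1em; }
.page { width: 600px; margin-left: 2.5cm; }
h1 { font-size: 18pt; line-height: 1.2; }
"""

for line_number, measurement in find_dimensions(sample_css):
    print(f"line {line_number}: {measurement}")
```

Each hit is a candidate for replacement with a relative unit, since the reading device rather than the page decides the layout.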
● Widely used in the print world to track editions
● Issued to publishers by a bureaucracy
● Used end-to-end in supply chain
● Printing, warehousing, distribution,
wholesaling, retailing, purchase, cataloging,
● 99% of the time in traditional publishing, ISBNs
are print run identifiers
● Print runs are extraordinarily expensive
● Print runs are a speculative gamble on the part
● Print runs have no direct analogue in the pure-
What's an edition?
● Correcting a single-character OCR error?
● Authority control change in body?
● Authority control change in metadata?
● Decreasing image quality?
● Increasing image quality?
● Factual corrections?
What's an edition?
● It doesn't matter because all non-commercial
ePubs are “digital photocopies” and don't
qualify for ISBNs anyway.
Free of bureaucracy
● Arguments about what a “book” / “eBook” is
● Arguments about what an “edition” is
● Arguments about jurisdiction (cloud, ISO, etc)
● Baked-in assumptions about who produces
what, why and for whom
● $$$ to support
● Many more things appear to qualify as eBooks
● ISBNs are being reused
● Versions / updates
● NZETC: 1300 works x regenerated monthly
Naïve hashes insufficient
● “Use a hash of the ePub as the identifier”
● Needs to be an identifier, not the identifier
● The identifier can't be used within the ePub
● Many tools in the tool chain alter the ePub
● Does a bookseller's sticker on a book make it a
● Does an author's signature?
● Does the intended market?
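The problems with naïve hashes can be illustrated with a short sketch. It builds small zip archives in memory as stand-ins for ePubs (they are not valid ePubs), and every function name and file name is hypothetical. The differing timestamp stands in for any tool in the chain that rewrites the archive without touching the text. The content hash is shown only for contrast; it is not a scheme proposed here.

```python
import hashlib
import io
import zipfile


def build_epub(files, timestamp):
    """Build a tiny in-memory zip archive standing in for an ePub."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in files.items():
            info = zipfile.ZipInfo(name, date_time=timestamp)
            archive.writestr(info, data)
    return buffer.getvalue()


def naive_hash(epub_bytes):
    """Hash the archive bytes exactly as they sit on disk."""
    return hashlib.sha256(epub_bytes).hexdigest()


def content_hash(epub_bytes):
    """Hash member names and contents only, ignoring zip bookkeeping."""
    digest = hashlib.sha256()
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        for name in sorted(archive.namelist()):
            digest.update(name.encode("utf-8") + b"\0")
            digest.update(archive.read(name))
    return digest.hexdigest()


files = {
    "mimetype": "application/epub+zip",
    "chapter1.xhtml": "<html><body><p>Same text.</p></body></html>",
}

# Same content, repackaged a month later by some tool in the chain.
first = build_epub(files, (2011, 1, 1, 0, 0, 0))
second = build_epub(files, (2011, 2, 1, 0, 0, 0))

# Embedding the identifier inside the ePub changes what is being hashed.
stamped_files = dict(files)
stamped_files["identifier.txt"] = naive_hash(first)
stamped = build_epub(stamped_files, (2011, 1, 1, 0, 0, 0))

print(naive_hash(first) == naive_hash(second))      # False
print(content_hash(first) == content_hash(second))  # True
print(naive_hash(stamped) == naive_hash(first))     # False
```

Repackaging alone changes the naïve hash, and writing the hash into the ePub invalidates it. Hashing only the content survives repackaging, but it still cannot be stored inside the ePub it identifies, and it does not settle whether a sticker-like change counts as a new edition.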