Description of the origins and development of the BookServer architecture and the Open Publication Distribution System (OPDS). Why OPDS Catalogs can help build a web of books. Discussion of the challenges ahead.
Entering the digital fold,
a tangled landscape:
1. finding the book
2. format of the book
3. acquiring the book
Finding the book
Web search? (Google, Bing, etc.)
Publisher website? (Tor.com ... )
The local library? (borrowing/lending)
Online bookstore? (Amazon, Indigo, B&N)
Indie bookstore? (Vroman’s, Powell’s)
Alt. vendor? (Smashwords, Kobo)
Format of the book
Highly structured display (pdf)
Downloadable book package (epub, mobi)
Web- or “cloud”-based (Google Editions)
Non-standard enhanced book (Blio)
Not really available at all (ILL)
Acquiring the book
Reading systems –
Amazon Kindles, Sony Readers, B&N nook
IBIS Reader, Aldiko, Stanza, Kobo
Standard desktops and laptops
Game consoles (Wii)
What readers want
What readers want to have ...
Be able to find the books they want,
in the formats that they can use,
for the device that they have,
and not have it be painful.
What publishers, libraries, bookstores want -
Make books available for discovery,
with accurate descriptive information,
at as many different places as possible,
under the sales / use terms permitted.
For the United States
Even the U.S. Dept of Justice is an advocate:
“[book] data provided should be available in
multiple, standard, open formats supported by
a wide variety of different applications, devices,
BookServer: A future for books
Creating a new architecture using common,
open standards that permits people to find,
buy, acquire, and read books from any source,
on any device, using many different ebook
Relation: Library catalogs
Library 2.0 Gang (02/09):
Google books and libraries
“Open Catalogue Crawling Protocol”
Google, DLF, Talis, and others
Atom vs Sitemap discussions
Stages of support
Tools of Change (NYC, Feb 2009)
Web 2.0 Expo (SF, Apr 2009)
OPDS “Catalog” launch
“The Open Publication Distribution System
(OPDS) is a generalization of the Atom [XML]
approach used by Stanza's online catalog.
I believe this effort has the potential to be a
critical enabler to the growth in access to, and
adoption of, digital books.”
- Bill McCoy, Adobe, 04.09
Getting the terms right
1. “BookServer” is the architecture.
2. “OPDS” is the technical specification.
3. “Catalogs” are made using OPDS.
4. “Atom” is the XML format underlying OPDS.
Based on Atom
Because OPDS is based on a commonly
used XML standard, called Atom,
OPDS Catalogs can be read by –
news readers (RSS)
Because Catalogs are easy to make –
Any web site can run a bookstore/library.
Libraries, bookstores, publishers can play.
Search engines can serve as book gateways.
Aggregators can harvest multiple catalogs.
Because Catalogs contain simple data
describing books and their availability –
Catalogs can also be used for B2B, to distribute
data to partners for “harvest” instead of using
(Future: “real time web” notifications.)
What’s in this thing?
Catalogs provide manifests –
List of the titles available
Information about each title
Formats the title is available in
Ways the title can be acquired
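
As a minimal sketch of such a manifest, the Python below builds a small Atom feed with one entry per title. The link relation `http://opds-spec.org/acquisition` and the `dcterms` namespace follow the published OPDS and Dublin Core conventions rather than anything stated on these slides, and the `build_catalog` function, its dictionary keys, and the example.org URLs are hypothetical. The sketch also omits elements a valid Atom feed requires, such as `updated`.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
DC = "http://purl.org/dc/terms/"
# Assumption: acquisition link relation as used by the published OPDS spec.
ACQUISITION = "http://opds-spec.org/acquisition"

ET.register_namespace("", ATOM)
ET.register_namespace("dcterms", DC)


def build_catalog(title, books):
    """Return an Atom feed (as a string) listing the given books."""
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}title").text = title
    for book in books:
        # One entry per title: the list of titles available.
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        # Information about the title.
        ET.SubElement(entry, f"{{{ATOM}}}title").text = book["title"]
        ET.SubElement(entry, f"{{{ATOM}}}id").text = book["id"]
        author = ET.SubElement(entry, f"{{{ATOM}}}author")
        ET.SubElement(author, f"{{{ATOM}}}name").text = book["author"]
        ET.SubElement(entry, f"{{{DC}}}language").text = book["language"]
        # One link per format: the media type says which format,
        # the rel and href say how the title can be acquired.
        for media_type, href in book["formats"].items():
            ET.SubElement(
                entry,
                f"{{{ATOM}}}link",
                {"rel": ACQUISITION, "type": media_type, "href": href},
            )
    return ET.tostring(feed, encoding="unicode")


example = build_catalog(
    "Example Catalog",
    [
        {
            "title": "Moby-Dick",
            "id": "urn:example:moby-dick",
            "author": "Herman Melville",
            "language": "en",
            "formats": {
                "application/epub+zip": "https://example.org/moby.epub",
                "application/pdf": "https://example.org/moby.pdf",
            },
        }
    ],
)
```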
How it works
A reader ...
1. Browses a Catalog of titles -
2. selects a title for more information -
3. makes a purchase/borrow decision -
4. obtains book (PayPal, Amazon, Google) -
5. installs and reads the book.
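
Steps 1, 2, and 4 can be sketched from the reading system's side. This is an illustration only: the sample feed, the `browse` and `pick_download` helpers, and the example.org URLs are hypothetical, and the acquisition link relation is taken from the published OPDS spec rather than from these slides. Payment (step 3) is left out.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}
# Assumption: acquisition link relation as used by the published OPDS spec.
ACQUISITION = "http://opds-spec.org/acquisition"

SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Catalog</title>
  <entry>
    <title>Moby-Dick</title>
    <link rel="http://opds-spec.org/acquisition"
          type="application/pdf" href="https://example.org/moby.pdf"/>
    <link rel="http://opds-spec.org/acquisition"
          type="application/epub+zip" href="https://example.org/moby.epub"/>
  </entry>
</feed>"""


def browse(feed_xml):
    """Step 1: map each title in the Catalog to its Atom entry."""
    root = ET.fromstring(feed_xml)
    return {
        entry.findtext("atom:title", namespaces=NS): entry
        for entry in root.findall("atom:entry", NS)
    }


def pick_download(entry, supported_types):
    """Step 4: return the first acquisition URL the device can use.

    supported_types is ordered by the device's preference.
    """
    links = {
        link.get("type"): link.get("href")
        for link in entry.findall("atom:link", NS)
        if (link.get("rel") or "").startswith(ACQUISITION)
    }
    for media_type in supported_types:
        if media_type in links:
            return links[media_type]
    return None


titles = browse(SAMPLE)
entry = titles["Moby-Dick"]  # Step 2: the reader selects a title.
url = pick_download(entry, ["application/epub+zip", "application/pdf"])
```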
A good catalog ...
For best user experience:
Catalogs can be derived from basic
bibliographic metadata. Such as:
ONIX, MARC, (ahem) spreadsheets
(Internally OPDS Catalogs use
simple Dublin Core metadata.)
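
For the spreadsheet case, a rough sketch of the conversion might look like the following. The column names (`Title`, `ISBN`, `Publisher`, `Language`) and the mapping to Dublin Core terms are assumptions made for illustration; a real export would need its own mapping.

```python
import csv
import io
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
DC = "http://purl.org/dc/terms/"

# Hypothetical spreadsheet columns mapped to Atom / Dublin Core elements.
COLUMN_TO_ELEMENT = {
    "Title": f"{{{ATOM}}}title",
    "ISBN": f"{{{DC}}}identifier",
    "Publisher": f"{{{DC}}}publisher",
    "Language": f"{{{DC}}}language",
}


def rows_to_entries(csv_text):
    """Turn each spreadsheet row into an Atom entry element."""
    entries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = ET.Element(f"{{{ATOM}}}entry")
        for column, tag in COLUMN_TO_ELEMENT.items():
            value = (row.get(column) or "").strip()
            if value:  # Skip empty cells rather than emit empty elements.
                ET.SubElement(entry, tag).text = value
        entries.append(entry)
    return entries


sheet = "Title,ISBN,Publisher,Language\nMoby-Dick,urn:isbn:0000000000,,en\n"
entries = rows_to_entries(sheet)
```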
Why not ONIX?
ONIX (and BISG “BookDROP”) are:
Designed for different use cases
Complex standard with many options
Not widely used beyond publishing
Not understood by web browsers
Established; change is difficult
Catalogs are emergent
Because we use open standards for describing
data, it is possible to link bibliographic book
data more easily.
Catalogs can tie together –
§ Book reviews
§ Reading lists
§ Fan fiction
Make Books Apparent
A workshop sponsored by the Internet Archive
October 19-20, Fort Mason, San Francisco, CA
With the assistance (among many others):
O’Reilly Media http://oreilly.com/
Book Oven http://bookoven.com/
Building the ecosystem
For this to work, we need:
1. Good (independent!) reading systems
2. Books, journals, magazines, and more
3. $ Publishers must contribute frontlist
4. Mobile reading systems
5. Aggregators (incl. search)
Issues – I
Two roles for OPDS:
1. simple publication
2. catalog aggregation
Aggregating resembles metasearch:
out of many sources must come order.
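
A toy sketch of the aggregation role follows. It assumes entries from different Catalogs share an identical identifier when they describe the same title, which is optimistic. The record shape, the `aggregate` function, and the source names are hypothetical.

```python
def aggregate(catalogs):
    """Merge entries from several Catalogs into one ordered list.

    catalogs maps a source name to a list of entries, each a dict with
    "id", "title", and "links" (a list of acquisition URLs).
    Assumption: entries for the same title carry the same "id".
    """
    merged = {}
    for source, entries in catalogs.items():
        for entry in entries:
            record = merged.setdefault(
                entry["id"],
                {"id": entry["id"], "title": entry["title"], "offers": []},
            )
            # Keep every source's offer so the reader can choose among them.
            for href in entry["links"]:
                record["offers"].append({"source": source, "href": href})
    # Impose a simple order on the combined result.
    return sorted(merged.values(), key=lambda r: r["title"].casefold())


combined = aggregate(
    {
        "library": [
            {"id": "urn:example:1", "title": "Moby-Dick",
             "links": ["https://library.example.org/moby.epub"]},
        ],
        "bookstore": [
            {"id": "urn:example:1", "title": "Moby-Dick",
             "links": ["https://store.example.org/moby.epub"]},
            {"id": "urn:example:2", "title": "Emma",
             "links": ["https://store.example.org/emma.epub"]},
        ],
    }
)
```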
Issues – II
Matching title <> reader is not trivial.
FRBR, recommending, clustering
- and then there is plain old GIGO
Issues – III
OMG. Where does one start?
- Author, work, and subjects.
Data from publishers (book and journal);
libraries, trade organizations and assns.
Issues – IV.a
Publishers carve up markets into territories,
geographic and language-based.
One publisher might have UK, AU, NZ rights,
whilst another might possess U.S. rights.
Spanish publishers typically retain worldwide
Issues – IV.b
Territorial rights make zero sense for
digital editions (n.b. language might).
Publishers must obtain non-geographic
rights for electronic text versions.
(Regional DVD codes are a sad analogy.)
Issues – V
OPDS defines search via OpenSearch.
The OpenSearch spec's status is “under development”
and not really “owned” by anyone (origin: A9).
Could benefit from support and enhancement.
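
For context, OpenSearch describes a search endpoint with a URL template containing placeholders such as `{searchTerms}`, where a trailing `?` marks a parameter as optional. Below is a minimal sketch of how a reading system might fill such a template. The `fill_template` helper and the example.org URL are hypothetical, and this does not cover the full template syntax.

```python
import re
from urllib.parse import quote

# Matches {name} and {name?}; the optional "?" marks an optional parameter.
_PARAM = re.compile(r"\{([A-Za-z][\w:]*)(\?)?\}")


def fill_template(template, **values):
    """Substitute URL-encoded values into an OpenSearch-style URL template."""

    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""  # Optional parameters may be left empty.
        raise KeyError(f"missing required parameter: {name}")

    return _PARAM.sub(substitute, template)


search_url = fill_template(
    "https://example.org/search?q={searchTerms}&page={startPage?}",
    searchTerms="moby dick",
)
```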
Issues – VI
On a small screen device, faceting must be
a normative discovery user interface form.
What is baked in? – Top-20. Classics. New.
What is algorithmically derived, on the fly?
How can one do this against aggregations?
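
One way to picture the "on the fly" option is to count facet values over whatever entries are in hand. This is only a sketch: the entry shape and the `facet_counts` function are hypothetical, and it sidesteps the harder question of doing this across aggregated Catalogs.

```python
from collections import Counter


def facet_counts(entries, field, limit=5):
    """Count how many entries carry each value of a multi-valued field."""
    counts = Counter()
    for entry in entries:
        # Use a set so a value repeated within one entry counts once.
        counts.update(set(entry.get(field, [])))
    return counts.most_common(limit)


books = [
    {"title": "Moby-Dick", "subjects": ["Fiction", "Sea stories"]},
    {"title": "Emma", "subjects": ["Fiction"]},
    {"title": "The Histories", "subjects": ["History"]},
]
top_subject = facet_counts(books, "subjects", limit=1)
```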
Issues – VII
Users should be able to define and maintain
their own book lists in OPDS format.
Ideally, these should be portable across book
Issues – VIII
DRM: a bad word, but many publishers are still reliant on it.
Best market solution: Adobe ACS4
Pay per transaction model.
Desperate need for open source solution.
(Perhaps premised on “social-DRM” spec.)
Issues – IX
Not a trivial problem.
Need an abstracted selling API.
Application elicits essential purchaser data,
then handles transaction “under the covers”
PayPal, Google Checkout, Amazon Checkout
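
To make the idea of an abstracted selling API concrete, here is a hypothetical sketch. None of these names correspond to a real PayPal, Google Checkout, or Amazon interface. The point is only that the reading application talks to one interface and each payment service sits behind its own adapter.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Order:
    """The essential purchaser data the application collects (hypothetical)."""
    book_id: str
    price_cents: int
    currency: str
    buyer_email: str


class PaymentProvider(ABC):
    """What every payment back end must offer; one adapter per service."""

    @abstractmethod
    def charge(self, order: Order) -> str:
        """Charge the buyer and return a receipt identifier."""


class FakeProvider(PaymentProvider):
    """Stand-in used here instead of a real payment service."""

    def __init__(self):
        self.charged = []

    def charge(self, order: Order) -> str:
        self.charged.append(order)
        return f"fake-{len(self.charged)}"


def purchase(provider: PaymentProvider, order: Order) -> dict:
    """Handle the transaction without the caller knowing which service runs it."""
    if order.price_cents < 0:
        raise ValueError("price must not be negative")
    receipt = provider.charge(order)
    return {"book_id": order.book_id, "receipt": receipt}


provider = FakeProvider()
result = purchase(
    provider,
    Order("urn:example:1", 499, "USD", "reader@example.org"),
)
```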
Issues – X
Internet Archive would like to lend books
(directly, not via a third party).
Is every lending a renting? (no ... !)
Is there digital first-sale? (yes ... !)
Options: ACS4, streaming (cloud)
Issues – XI
Currently no way for new OPDS Catalogs to
announce themselves to the world.
We have discussed a “ping server” to aid the
auto-aggregation of Catalogs. For now, notification
remains a manual process.
OpenPub on Google Code: