EPUB for Website Producers


My presentation from the Internet Archive's Books in Browsers conference.

Published in: Technology

  • * Waldo Jaquith
    * website developer since 1993
    * worked for UVA’s Virginia Quarterly Review for five years
    * left a month ago, now at the Miller Center
  • Here to talk about EPUB. We’ll take a quick look at the format, look at the parallels to website development, talk about how website developers are particularly well-equipped to develop EPUBs, and look at what that means for the development process.
  • The two best-known hardware EPUB readers are the iPad (and iPhone, and iPod Touch), and...
  • ...the Sony Reader
  • Let’s look through the structure of an EPUB. This is a free, out-of-copyright text I downloaded: “Dracula’s Guest.” Note that very little about how the files are structured is actually required.
  • This is important, but it just has to exist—we don’t need to worry about it.
  • This is just standard XHTML. Nothing fancy required.
  • Ditto for the CSS. You call it from your HTML, you can load multiple stylesheets, etc. Again, just like a website.
  • Just what it says on the tin.
  • The NCX and OPF files. These two XML files are the bits that make an EPUB an EPUB. Let’s look at each of these more closely.
  • This is the Navigation Control file for XML, or NCX file. You can’t read this, but it’s pretty straightforward XML. Let’s look at the two interesting bits.
  • This small portion tells us a bit about the EPUB generally.
  • We’ve got the title and the author here.
  • The important part of this file is the bottom two thirds—the navigation control that’s the point of the NCX file—one entry for each chapter.
  • So here’s one navigation point—that is, one “chapter.”
  • We specify that this chapter is titled “Cover Page,” that its contents consist of cover.html, and that it is the first chapter in this EPUB.
  • And that’s it: those are the meaty bits of the NCX file.
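To make that navigation point concrete, here is a rough sketch in Python of how one might generate it with the standard library. The title, file name, and position come from the slide; the `nav_point` helper and the `navpoint-1` id are made up for illustration, and a real NCX file wraps these entries in a `navMap` with a header above it.

```python
import xml.etree.ElementTree as ET

# Namespace used by NCX documents.
NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"
ET.register_namespace("", NCX_NS)


def nav_point(title, src, play_order):
    """Build one <navPoint>: a label, a target file, and a reading position."""
    point = ET.Element(
        f"{{{NCX_NS}}}navPoint",
        # The id scheme is an assumption; any unique id would do.
        {"id": f"navpoint-{play_order}", "playOrder": str(play_order)},
    )
    label = ET.SubElement(point, f"{{{NCX_NS}}}navLabel")
    ET.SubElement(label, f"{{{NCX_NS}}}text").text = title
    ET.SubElement(point, f"{{{NCX_NS}}}content", {"src": src})
    return point


# The "Cover Page" entry described above: first chapter, stored in cover.html.
cover = nav_point("Cover Page", "cover.html", 1)
print(ET.tostring(cover, encoding="unicode"))
```

One call per chapter, with `play_order` counting up from 1, would give the bottom two thirds of the file.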
  • This is the OPF. It holds most of the book metadata, the file manifest, and a listing of the chapters (much like the NCX file). Let’s just look broadly at the two most important sections of this.
  • The book metadata. [explain]
  • The manifest—the listing of every file that’s to be found in the EPUB. [explain]
  • As with the NCX file, you can see this is pretty straightforward.
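A minimal sketch of those OPF sections, again in Python. Only `cover.html` and the book title come from the slides; the other file names, the item ids, the placeholder identifier, and the `build_opf` helper are assumptions for illustration, not the contents of the actual “Dracula’s Guest” package.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core, used for the metadata
ET.register_namespace("", OPF_NS)
ET.register_namespace("dc", DC_NS)


def build_opf(title, author, book_id, files, reading_order):
    """Build an OPF <package>: metadata, file manifest, and spine (chapter order)."""
    package = ET.Element(
        f"{{{OPF_NS}}}package",
        {"version": "2.0", "unique-identifier": "bookid"},
    )
    metadata = ET.SubElement(package, f"{{{OPF_NS}}}metadata")
    ET.SubElement(metadata, f"{{{DC_NS}}}title").text = title
    ET.SubElement(metadata, f"{{{DC_NS}}}creator").text = author
    ET.SubElement(metadata, f"{{{DC_NS}}}language").text = "en"
    ET.SubElement(metadata, f"{{{DC_NS}}}identifier", {"id": "bookid"}).text = book_id

    # The manifest lists every file in the EPUB, including the NCX and CSS.
    manifest = ET.SubElement(package, f"{{{OPF_NS}}}manifest")
    for item_id, (href, media_type) in files.items():
        ET.SubElement(
            manifest,
            f"{{{OPF_NS}}}item",
            {"id": item_id, "href": href, "media-type": media_type},
        )

    # The spine is the chapter listing; it refers to manifest items by id.
    spine = ET.SubElement(package, f"{{{OPF_NS}}}spine", {"toc": "ncx"})
    for item_id in reading_order:
        ET.SubElement(spine, f"{{{OPF_NS}}}itemref", {"idref": item_id})
    return package


# Hypothetical file list; only cover.html appears in the slides.
files = {
    "ncx": ("toc.ncx", "application/x-dtbncx+xml"),
    "css": ("style.css", "text/css"),
    "cover": ("cover.html", "application/xhtml+xml"),
    "chapter1": ("chapter1.html", "application/xhtml+xml"),
}
opf = build_opf(
    "Dracula's Guest",
    "Bram Stoker",
    "urn:uuid:00000000-0000-0000-0000-000000000000",  # placeholder identifier
    files,
    ["cover", "chapter1"],
)
print(ET.tostring(opf, encoding="unicode"))
```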
  • So those two files make up the XML. As you can see, it’s simple and straightforward.
  • An EPUB is just a website with some XML to describe it—kind of like meta tags mashed up with sitemap.xml. It’s a book in a browser. Perfect for website developers.
  • This implies these three important things: website developers are good to go with EPUBs, EPUBs can—when appropriate—be an end-of-pipe solution, and CMSs can be used to produce EPUBs automagically. Let’s look at each of these.
  • Website developers can pick this stuff up in a few hours. [explain how I did] They can use CVS, SVN, BitKeeper, Git, or whatever for revision control.
  • [explain VQR vs. Miller Center workflow, InDesign] If your HTML is well-formed and semantically rich, there’s no reason why it can’t go directly into an EPUB.
  • If an EPUB is just a website, it follows that CMSs can produce them. There’s no reason why not. There are already two plugins for WordPress that do just that.
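As a rough illustration of the CMS idea, the sketch below takes a list of posts (title plus XHTML body, the sort of thing a CMS already holds) and zips them into an EPUB with nothing but the Python standard library. This is not how the WordPress plugins work; `build_epub`, the `postN.html` naming, `content.opf`, `toc.ncx`, and the demo data are all assumptions. The two fixed pieces are the `mimetype` entry, which has to be first in the archive and uncompressed, and `META-INF/container.xml`, which points at the OPF.

```python
import zipfile
from xml.sax.saxutils import escape, quoteattr

# Fixed boilerplate: tells the reader where the OPF lives.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Minimal XHTML wrapper for one post (a real CMS template would be richer).
PAGE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{title}</title></head>
<body><h1>{title}</h1>{body}</body>
</html>
"""


def build_epub(path, title, author, book_id, posts):
    """Write an EPUB 2 file; posts is a list of (title, xhtml_body) in reading order."""
    ids = [f"post{n}" for n in range(1, len(posts) + 1)]
    items = "".join(
        f'<item id="{pid}" href="{pid}.html" media-type="application/xhtml+xml"/>'
        for pid in ids
    )
    refs = "".join(f'<itemref idref="{pid}"/>' for pid in ids)
    nav = "".join(
        f'<navPoint id="nav-{pid}" playOrder="{n}">'
        f"<navLabel><text>{escape(post_title)}</text></navLabel>"
        f'<content src="{pid}.html"/></navPoint>'
        for n, (pid, (post_title, _)) in enumerate(zip(ids, posts), 1)
    )
    opf = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<package xmlns="http://www.idpf.org/2007/opf" version="2.0"'
        ' unique-identifier="bookid">'
        '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
        f"<dc:title>{escape(title)}</dc:title>"
        f"<dc:creator>{escape(author)}</dc:creator>"
        "<dc:language>en</dc:language>"
        f'<dc:identifier id="bookid">{escape(book_id)}</dc:identifier>'
        "</metadata>"
        '<manifest><item id="ncx" href="toc.ncx"'
        ' media-type="application/x-dtbncx+xml"/>'
        f"{items}</manifest>"
        f'<spine toc="ncx">{refs}</spine></package>'
    )
    ncx = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">'
        f'<head><meta name="dtb:uid" content={quoteattr(book_id)}/></head>'
        f"<docTitle><text>{escape(title)}</text></docTitle>"
        f"<navMap>{nav}</navMap></ncx>"
    )
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as epub:
        # The mimetype entry goes first and is stored without compression.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        epub.writestr("META-INF/container.xml", CONTAINER_XML)
        epub.writestr("content.opf", opf)
        epub.writestr("toc.ncx", ncx)
        for pid, (post_title, body) in zip(ids, posts):
            epub.writestr(f"{pid}.html",
                          PAGE.format(title=escape(post_title), body=body))


# Made-up posts standing in for CMS content.
build_epub(
    "demo.epub",
    "Collected Posts",
    "A. Blogger",
    "urn:uuid:00000000-0000-0000-0000-000000000000",  # placeholder identifier
    [("Hello", "<p>First post.</p>"), ("Again", "<p>Second post.</p>")],
)
```

The post bodies are dropped in as-is, which is the point of the well-formed, semantic markup argument: if what the CMS emits is already valid XHTML, there is nothing to translate.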
  • A few catches. 1. There’s nothing about being a website developer that prepares me to design books. 2. Print layouts can’t be handed off to programmers to be “translated.”
  • 1. Establish standards for your HTML and use a base stylesheet, which gets you 90% of the way there. Let $$ developers finalize it nicely. 2. Keith @ Threepress’s idea for filesystem mounting.
  • EPUB for Website Producers

    1. EPUB for Website Producers. Waldo Jaquith, Miller Center of Public Affairs, University of Virginia
    2. Hi!
    3. EPUB File Structure
    4. Uninteresting Metadata
    5. XHTML 1.0 Files
    6. CSS
    7. Images
    8. XML
    9. Navigation Control
    10. Navigation Control
    11. Navigation Control
    12. Navigation Control
    13. Navigation Control
    14. Navigation Control
    15. Navigation Control
    16. Open Packaging Format
    17. Open Packaging Format
    18. Open Packaging Format
    19. Open Packaging Format
    20. Open Packaging Format
    21. XML
    22. EPUB == Website • HTML + CSS + images + XML = website • an EPUB is a book in a browser • we can reuse existing design patterns • we can reuse existing tools
    23. What this Implies • website developers make the best EPUB developers • EPUBs can flow from web content • CMSs can produce EPUBs
    24. Website Developers • already have 95% of the skills • can use standard development tools (their text editor, their source repository, etc.) • accustomed to widely disparate platforms and adding not-yet-supported features
    25. Flow from Web Content • if your workflow allows it • print begets web begets EPUB • just use structured, semantic markup
    26. CMSs Begetting EPUBs • CMSs generate HTML, and EPUBs are HTML • e.g., WordPress • wp2epub • GMU’s Anthologize
    27. Caveats • books need designers, whether print or digital • designers need to collaborate with developers
    28. Tips • create a standard, baseline stylesheet • for iOS, mount the device’s filesystem
    29. Questions?