FOR EPUB ACCESSIBILITY
2017.03.02 Hyun-Young Kim, Sookmyung Women’s University
WHAT IS EPUB
• An e-book file format
• De facto standard published by the International Digital Publishing Forum (IDPF) since 2007
• De jure standard: published by the International Organization for Standardization (ISO) as ISO/IEC TS 30135 (parts 1-7) in 2014
• EPUB 2.0 in October 2007
Maintenance update (2.0.1) in September 2010
EPUB 3.0 in October 2011
Maintenance update (3.0.1) in June 2014
EPUB 3.1, the current version, in January 2017
EPUB & WEB RELATION
• EPUB production needs web technologies
• W3C's Web Accessibility Initiative
Web Content Accessibility Guidelines (WCAG) 2.0
Accessible Rich Internet Applications (WAI-ARIA) 1.0
• EPUB also needs the book metaphor and structural information
Semantic Markup Features
EXISTING ACCESSIBILITY DOCUMENTS
• IDPF EPUB3 Accessibility Guidelines
• Semantics, Navigation, Metadata
• XHTML Content Documents, MathML, SVG, EPUB Style Sheets, Media Overlays
• IDPF EPUB Accessibility 1.0
• Developed as part of EPUB 3.1 to provide guidance on making EPUB publications accessible
• BISG (Book Industry Study Group) Quick Start Guide to Accessible Publishing
• Essential Check Points from EPUB3 Accessibility Guidelines
• DAISY member, DIAGRAM Image Description Guidelines
• Description guidelines that apply to any type of image.
• Guidelines for describing images within specific types of categories, such as maps.
EPUB PRODUCTION STATUS IN KOREA
• Only Conversion, No Accessibility
• The National Library has to reproduce e-books as DAISY or accessible EPUB
• The library defined e-book accessibility certification criteria
and designated them as an industry standard in Korea
• The proposed accessibility checker is based on the e-book accessibility certification criteria
• 156 Check Points from Previous Guidelines
• Some check points can be decided automatically
• Language definition, existence of LOI and LOT, existence of LOA and LOV, etc.
• Others must be decided manually
• Whether the epub:type attribute is meaningful enough
• Whether each page number matches the page number in the printed book, etc.
• 2-tier Checker
• Automatic Check for 39 Points, PC Standalone version
• Semi-Automatic Check for 117 Points, Web version linked with editor
• Web Checker indicates points where problems may occur
• HTML editor that opens the XHTML and CSS documents after unpacking the EPUB
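As an illustration of one automatically decidable check point (existence of LOI, LOT, LOA and LOV), the following minimal Python sketch looks for navigation lists in an EPUB 3 navigation document by their epub:type values. It is not the proposed checker's code. The function names, the EXPECTED_NAVS list and the sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"

# Navigation lists this sketch looks for (an assumed selection).
EXPECTED_NAVS = ("toc", "loi", "lot", "loa", "lov")


def find_nav_types(nav_xhtml: str) -> set:
    """Return the epub:type values found on <nav> elements."""
    root = ET.fromstring(nav_xhtml)
    found = set()
    for nav in root.iter(f"{{{XHTML_NS}}}nav"):
        # epub:type may hold several space-separated values.
        found.update(nav.get(f"{{{EPUB_NS}}}type", "").split())
    return found


def missing_navs(nav_xhtml: str) -> list:
    """List the expected navigation types that the document lacks."""
    found = find_nav_types(nav_xhtml)
    return [name for name in EXPECTED_NAVS if name not in found]


# Hypothetical navigation document with a TOC and a list of illustrations.
SAMPLE_NAV = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops" lang="ko" xml:lang="ko">
  <body>
    <nav epub:type="toc"><ol><li><a href="ch1.xhtml">1</a></li></ol></nav>
    <nav epub:type="loi"><ol><li><a href="ch1.xhtml#f1">Fig. 1</a></li></ol></nav>
  </body>
</html>"""

print(missing_navs(SAMPLE_NAV))  # ['lot', 'loa', 'lov']
```

Whether a missing list is really a defect (a book without tables needs no LOT) is the kind of judgment the semi-automatic tier would leave to a person.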
PROPOSED CHECKER VS. EPUBCHECK
• EpubCheck: a tool to validate EPUB files, developed by IDPF and DAISY
• Detecting many types of errors in EPUB structure such as OCF container structure, OPF and OPS mark-up, internal
• Does not check any accessibility issues
• Proposed Checker
• Tool to investigate the accessibility of EPUB
• Some modules are the same as those of EpubCheck:
parsing the EPUB package and checking the OCF-related content
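The shared step (opening the OCF container and locating the package document) can be sketched with the Python standard library. This is a minimal sketch under stated assumptions, not EpubCheck's or the proposed checker's implementation. check_container, build_sample_epub and the problem messages are hypothetical names chosen for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_PATH = "META-INF/container.xml"
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"


def check_container(epub_file) -> dict:
    """Check basic OCF structure and locate the package document (OPF)."""
    problems = []
    opf_path = None
    with zipfile.ZipFile(epub_file) as zf:
        names = zf.namelist()
        # OCF requires "mimetype" as the first entry, with this exact content.
        if not names or names[0] != "mimetype":
            problems.append("mimetype is not the first entry")
        elif zf.read("mimetype") != b"application/epub+zip":
            problems.append("unexpected mimetype content")
        if CONTAINER_PATH not in names:
            problems.append("missing META-INF/container.xml")
        else:
            root = ET.fromstring(zf.read(CONTAINER_PATH))
            rootfile = root.find(f".//{{{CONTAINER_NS}}}rootfile")
            if rootfile is None or not rootfile.get("full-path"):
                problems.append("container.xml declares no rootfile")
            else:
                opf_path = rootfile.get("full-path")
                if opf_path not in names:
                    problems.append(f"package document not found: {opf_path}")
    return {"opf_path": opf_path, "problems": problems}


def build_sample_epub() -> io.BytesIO:
    """Build a tiny in-memory container for the demo (not a complete EPUB)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr(
            CONTAINER_PATH,
            f'<container version="1.0" xmlns="{CONTAINER_NS}"><rootfiles>'
            '<rootfile full-path="OEBPS/content.opf" '
            'media-type="application/oebps-package+xml"/>'
            "</rootfiles></container>",
        )
        zf.writestr("OEBPS/content.opf", '<package xmlns="http://www.idpf.org/2007/opf"/>')
    buf.seek(0)
    return buf


print(check_container(build_sample_epub()))
```

The returned opf_path is where an accessibility checker would continue, reading the OPF metadata and manifest to find the content documents to inspect.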
WORKFLOW OF PROPOSED CHECKER
Lang / Audio Clip / Video Clip / Alt Text …
CSS separation / em / strong / Formatting / justified …
SVG lang / description
media-type / list
TOC / LOI / LOV / LOT…
OPF Metadata / lang …
VERIFICATION OF CHECKER
• 50 EPUB files that have been deposited in the National Library of Korea
• 148 accessibility defects per file on average
• Accessibility errors are concentrated in 8 check points
• The Korean e-book market is 90% EPUB 2.x and 10% EPUB 3.x
• Very few e-books use multimedia, MathML, or Media Overlays
• The 8 error points occur in parts that are not specific to the EPUB 3 specifications
• To define the default language for an XHTML document, the lang and xml:lang language attributes need to be attached to the root html element. This accounts for 41% of all defects.
• In the case of multilingual publications, best practice is to always specify the language in each content document to ensure proper rendering. This accounts for 21% of all defects.
• When using the epub:type attribute in a content document, the epub namespace must be declared on the element containing the attribute, or on one of its ancestors. This accounts for 13% of all defects.
• Images that are central to the understanding of a publication must always include a text alternative in their alt attribute. This accounts for 7% of all defects.
• When creating hyperlinks, the text inside the link can provide the full context of what is being linked to, or the link can have alternate text. This accounts for 7% of all defects.
• Separating style from markup is not just about keeping CSS in a separate file from the markup, but about recognizing that markup must convey meaning to be useful to all readers. This accounts for 7% of all defects.
• When using bold and italics, EPUB follows the rules of the HTML5 and CSS standards. This accounts for 2% of all defects.
• Avoid justifying text, as the uneven spacing that occurs between words can reduce readability for some people. This accounts for 1% of all defects.
• The 1st-tier automatic system can pick up problematic items covered by its 39 check points
• It is responsible for 25% of all 156 check points
• The 2nd-tier semi-automatic system handles the remaining 75% of the check points
• This tier should be converted to automatic detection using machine learning algorithms
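Three of the frequent defects listed above (missing lang / xml:lang on the root html element, images without an alt attribute, justified text) lend themselves to simple automatic detection. The following Python sketch shows one possible way to do it. It is an illustration only, not the checker's actual code. The function names, defect messages and sample inputs are assumptions.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def check_content_document(xhtml: str) -> list:
    """Report missing language attributes and images without an alt attribute."""
    defects = []
    root = ET.fromstring(xhtml)
    if not root.get("lang"):
        defects.append("html: missing lang")
    if not root.get(XML_LANG):
        defects.append("html: missing xml:lang")
    for img in root.iter(f"{{{XHTML_NS}}}img"):
        # Only a missing attribute is flagged; whether the alt text is adequate
        # (or an empty alt is justified) still needs human review.
        if img.get("alt") is None:
            defects.append(f"img {img.get('src', '?')}: missing alt")
    return defects


def has_justified_text(css: str) -> bool:
    """Return True if the style sheet sets text-align: justify anywhere."""
    return re.search(r"text-align\s*:\s*justify", css, re.IGNORECASE) is not None


# Hypothetical content document: lang is set, xml:lang is not, one image lacks alt.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml" lang="ko">
  <body>
    <p><img src="map.png"/></p>
    <p><img src="logo.png" alt="Publisher logo"/></p>
  </body>
</html>"""

print(check_content_document(SAMPLE_XHTML))
# ['html: missing xml:lang', 'img map.png: missing alt']
print(has_justified_text("p { text-align: justify; }"))  # True
```

An undeclared epub namespace prefix would make the XML parser raise an error in this sketch, so a real checker would need to catch that case and report it as its own defect.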