Csun presentation-170302-hykim


Check System for EPUB Accessibility

Published in: Technology

  2. WHAT IS EPUB
     • An e-book file format
     • De facto standard published by the International Digital Publishing Forum (IDPF) since 2007
     • De jure standard: published by ISO/IEC as ISO/IEC TS 30135 (parts 1-7) in 2014
     • EPUB 2.0 in October 2007; maintenance update (2.0.1) in September 2010
     • EPUB 3.0 in October 2011; maintenance update (3.0.1) in June 2014
     • Current version: EPUB 3.1, January 2017
  3. EPUB & WEB RELATION
     • EPUB production relies on web technologies
     • W3C's Web Accessibility Initiative (WAI)
       • Web Content Accessibility Guidelines (WCAG) 2.0
       • Accessible Rich Internet Applications (WAI-ARIA) 1.0
     • EPUB also needs the book metaphor and structural information
       • Semantic markup features
       • Navigation features
  4. EXISTING ACCESSIBILITY DOCUMENTS
     • IDPF EPUB 3 Accessibility Guidelines
       • Semantics, navigation, metadata
       • XHTML Content Documents, MathML, SVG, EPUB Style Sheets, Media Overlays
     • IDPF EPUB Accessibility 1.0
       • Developed as part of EPUB 3.1 to provide guidance on making EPUB publications accessible
     • BISG (Book Industry Study Group) Quick Start Guide to Accessible Publishing
       • Essential check points from the EPUB 3 Accessibility Guidelines
     • DIAGRAM (a DAISY member) Image Description Guidelines
       • Description guidelines that apply to any type of image
       • Guidelines for describing images within specific categories, such as maps
  5. EPUB PRODUCTION STATUS IN KOREA
     • EPUB production consists of conversion only, with no attention to accessibility
     • The National Library has to reproduce e-books as DAISY or accessible EPUB
     • The library defined e-book accessibility certification criteria and designated them as an industry standard in Korea
     • The proposed accessibility checker is based on these certification criteria
  6. PROPOSED CHECKER
     • 156 check points drawn from the previous guidelines
     • Some check points can be decided automatically
       • Language definition, existence of LOI (list of illustrations) and LOT (list of tables), existence of LOA (list of audio clips) and LOV (list of video clips), etc.
     • Others must be decided manually
       • Whether the epub:type attribute is meaningful enough
       • Whether each page number matches the page number in the paper book, etc.
     • 2-tier checker
       • Automatic check for 39 points: standalone PC version
       • Semi-automatic check for 117 points: web version linked with an editor
     • The web checker indicates points where problems may occur
     • The HTML editor opens the XHTML and CSS documents after decomposing the EPUB
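One of the automatic check points named on this slide, the existence of LOI and LOT, could be sketched roughly as follows. This is an illustration only, not the checker's actual code: the helper names `nav_types` and `missing_lists` are hypothetical, and the choice of which lists count as required is an assumption, since the slides do not enumerate the 156 check points.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"


def nav_types(nav_xhtml):
    """Collect the epub:type values found on nav elements (hypothetical helper)."""
    root = ET.fromstring(nav_xhtml)
    found = set()
    for nav in root.iter(f"{{{XHTML_NS}}}nav"):
        # epub:type may hold several space-separated values.
        found.update(nav.get(f"{{{EPUB_NS}}}type", "").split())
    return found


def missing_lists(nav_xhtml, required=("toc", "loi", "lot")):
    """Return the required navigation lists that are absent.

    The default `required` tuple is an assumption made for this sketch.
    """
    present = nav_types(nav_xhtml)
    return [name for name in required if name not in present]
```

A checker along these lines would report `["lot"]` for a navigation document that declares a table of contents and a list of illustrations but no list of tables.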
  9. PROPOSED CHECKER VS. EPUBCHECK
     • EpubCheck
       • Tool to validate EPUB files, developed by IDPF and DAISY
       • Detects many types of errors in EPUB structure, such as OCF container structure, OPF and OPS markup, and internal reference consistency
       • Does not check for accessibility issues
     • Proposed checker
       • Tool to investigate the accessibility of EPUB
       • Some modules are the same as those of EpubCheck: parsing the EPUB package and checking the OCF-related content
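To make "checking the OCF-related content" concrete, here is a minimal sketch of the kind of container check both tools would need. It is neither EpubCheck's code nor the proposed checker's: `check_ocf_container` is a hypothetical helper, and it covers only three OCF rules (the `mimetype` entry comes first, is stored uncompressed, and contains `application/epub+zip`) plus the presence of `META-INF/container.xml`.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def check_ocf_container(data):
    """Return a list of OCF container problems found in raw EPUB bytes."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if not infos or infos[0].filename != "mimetype":
            problems.append("mimetype must be the first entry in the archive")
        elif infos[0].compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype must be stored uncompressed")
        elif zf.read("mimetype") != EPUB_MIMETYPE:
            problems.append("mimetype content must be application/epub+zip")
        if "META-INF/container.xml" not in zf.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems
```

An empty list means the container passed these few checks; it says nothing about accessibility, which is the gap the proposed checker targets.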
  10. WORKFLOW OF PROPOSED CHECKER
     • Decomposition: EPUB → XHTML, CSS, SVG, SMIL, Navigation, OPF
     • Inspection
       • XHTML: lang / audio clip / video clip / alt text …
       • CSS: separation / em / strong / formatting / justified …
       • SVG: lang / description
       • SMIL: media-type / list
       • Navigation: TOC / LOI / LOV / LOT …
       • OPF: metadata / lang …
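The decomposition step in this workflow can be approximated with the Python standard library. The sketch below is an assumption about how such a step might look, not the authors' implementation: `decompose_epub` is a hypothetical helper that locates the OPF package file through `META-INF/container.xml` and groups the manifest items by media type, so that each group (XHTML, CSS, SVG, SMIL) can be passed to its own inspection.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {
    "ocf": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def decompose_epub(data):
    """Group the manifest items of an EPUB (raw bytes) by media type."""
    groups = defaultdict(list)
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # container.xml points at the OPF package document.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("ocf:rootfiles/ocf:rootfile", NS)
        opf_path = rootfile.get("full-path")
        package = ET.fromstring(zf.read(opf_path))
        # Manifest hrefs are relative to the OPF file's directory.
        base = posixpath.dirname(opf_path)
        for item in package.findall("opf:manifest/opf:item", NS):
            path = posixpath.normpath(posixpath.join(base, item.get("href")))
            groups[item.get("media-type")].append(path)
    return dict(groups)
```

Grouping by the manifest's declared media type, rather than by file extension, keeps the sketch aligned with what the OPF itself claims about each resource.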
  11. VERIFICATION OF CHECKER
     • Tested on 50 EPUB files deposited with the National Library of Korea
     • 148 accessibility defects per file on average
     • Accessibility errors are concentrated in 8 check points
     • The Korean e-book market is 90% EPUB 2.x and 10% EPUB 3.x
     • Few e-books use multimedia, MathML, or Media Overlays
     • The 8 error points occur in parts that are not specific to the EPUB 3 specifications
  12. MAJOR DEFECTS
     • To define the default language of an XHTML document, both the lang and xml:lang attributes need to be set on the root html element. (41% of all defects)
     • In multilingual publications, best practice is to always specify the language in each content document to ensure proper rendering. (21% of all defects)
     • When the epub:type attribute is used in a content document, the epub namespace must be declared on the element carrying the attribute or on one of its ancestors. (13% of all defects)
     • Images that are central to understanding a publication must always include a text alternative in their alt attribute. (7% of all defects)
     • When creating hyperlinks, either the link text should give the full context of what is being linked to, or the link should carry alternate text. (7% of all defects)
     • Separating style from markup is not just about keeping CSS in a separate file; the markup itself must convey meaning to be useful to all readers. (7% of all defects)
     • Bold and italic text must follow the rules of the HTML5 and CSS standards. (2% of all defects)
     • Avoid justifying text, as the uneven spacing between words can reduce readability for some people. (1% of all defects)
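Three of the defects above lend themselves to a simple automatic test: the missing lang and xml:lang attributes, the undeclared epub namespace, and the missing alt attribute. The following is a minimal sketch under stated assumptions, not the checker's real logic: `check_content_document` is a hypothetical helper, and it relies on the fact that an undeclared `epub:` prefix makes the document ill-formed XML, so the parser itself reports it.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def check_content_document(xhtml):
    """Return a list of defects found in one XHTML content document."""
    try:
        root = ET.fromstring(xhtml)
    except ET.ParseError as exc:
        # An undeclared epub: prefix surfaces here as an "unbound prefix" error.
        return [f"not well-formed XML: {exc}"]
    problems = []
    if not root.get("lang"):
        problems.append("html element lacks lang")
    if not root.get(XML_LANG):
        problems.append("html element lacks xml:lang")
    for img in root.iter(f"{{{XHTML_NS}}}img"):
        # Only a missing attribute is flagged; alt="" is valid for decorative images.
        if img.get("alt") is None:
            problems.append(f"img without alt attribute: {img.get('src')}")
    return problems
```

Whether an existing alt text is actually meaningful cannot be decided this way, which is the kind of judgment the slides assign to the semi-automatic tier.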
  13. FUTURE WORK
     • The 1st-tier automatic system picks up problematic items for the 39 check points it covers
       • This is 25% of all 156 check points
     • The 2nd-tier semi-automatic system handles the remaining 75% of check points
       • These should be moved to automatic detection using machine learning algorithms
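The 25% and 75% figures follow directly from the check-point counts given earlier in the deck (39 automatic and 117 semi-automatic out of 156). A short calculation confirms the split; the constant and function names are chosen here for illustration only.

```python
TOTAL_CHECK_POINTS = 156
AUTOMATIC = 39        # 1st tier, standalone PC version
SEMI_AUTOMATIC = 117  # 2nd tier, web version linked with an editor


def share(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


print(share(AUTOMATIC, TOTAL_CHECK_POINTS))       # 25.0
print(share(SEMI_AUTOMATIC, TOTAL_CHECK_POINTS))  # 75.0
```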