Animated GIFs and videos can be found on my site at http://adrianroselli.com/2017/03/slides-from-roledrinks-at-csun.html

  1. 1. Mind Your lang Presented by Adrian Roselli (@aardrian) For role=drinks San Diego Slides from this talk will be available at rosel.li/roledrinks17 “Calm San Diego Night” by Justin Brown, CC BY-NC-SA 2.0
  2. 2. What Is lang?
  3. 3. What Is lang? • Examples: <html lang="en"> <html lang="en-ca"> <html lang="en-us"> <html lang="en-GB-x-hixie"> • Source: BCP47: Tags for Identifying Languages, https://tools.ietf.org/html/bcp47 We’ll come back to that last one.
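To make the anatomy of those example values concrete, here is a minimal Python sketch that splits a tag into its language, region, and private-use parts. The function name `split_language_tag` is hypothetical, and the logic is deliberately simplified: it ignores the script, variant, extension, and grandfathered forms that BCP 47 also defines.

```python
import re


def split_language_tag(tag: str) -> dict:
    """Split a BCP 47-style tag into a few of its parts (simplified sketch)."""
    subtags = tag.split("-")
    result = {"language": subtags[0].lower(), "region": None, "private_use": []}
    rest = subtags[1:]
    for index, subtag in enumerate(rest):
        # Everything after the singleton "x" is private use.
        if subtag.lower() == "x":
            result["private_use"] = [s.lower() for s in rest[index + 1:]]
            break
        # A region is two letters or three digits.
        if result["region"] is None and re.fullmatch(r"[A-Za-z]{2}|\d{3}", subtag):
            result["region"] = subtag.upper()
    return result


for example in ("en", "en-ca", "en-us", "en-GB-x-hixie"):
    print(example, split_language_tag(example))
```

Under these assumptions, the last example separates into the language `en`, the region `GB`, and the private-use subtag `hixie`.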
  4. 4. Who Uses lang?
  5. 5. Who Uses lang? • WHATWG Bug: “why do these examples of <html> lack the lang attribute?” This is where my research started. “Why not? Realistically, few people include it. It just means the language is unknown.”
  6. 6. Who Uses lang? • “why do these examples of <html> lack the lang attribute?” • WHATWG HTML bug (26942) • Reported: 2014-09-30 • Resolved: 2016-04-18 • Git merge: • Editorial: Add lang to most examples #1061 Spoiled the surprise, I know, but we aren’t here for a bug.
  7. 7. Who Uses lang? • Pulled January 2015 archive from WebDevData.org (a W3C Community Group), • Parsed 84,054 pages, • Found that 39,433 pages use the lang attribute on the <html> element, • 47% use <html lang="…">. 12,762 use xml:lang, which is wrong.
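A count like the one above could be produced along these lines. This is only a sketch using Python's standard `html.parser`, not the script used for the talk; the class name `HtmlLangFinder`, the function `tally_lang_usage`, and the three sample pages are made up for illustration.

```python
from html.parser import HTMLParser


class HtmlLangFinder(HTMLParser):
    """Record the lang and xml:lang attributes of the first <html> tag."""

    def __init__(self):
        super().__init__()
        self.seen_html = False
        self.lang = None
        self.xml_lang = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "html" and not self.seen_html:
            self.seen_html = True
            attr_map = dict(attrs)
            self.lang = attr_map.get("lang")
            self.xml_lang = attr_map.get("xml:lang")


def tally_lang_usage(pages):
    """Count how many pages set lang or xml:lang on <html>."""
    counts = {"pages": 0, "lang": 0, "xml_lang": 0}
    for markup in pages:
        finder = HtmlLangFinder()
        finder.feed(markup)
        counts["pages"] += 1
        if finder.lang:
            counts["lang"] += 1
        if finder.xml_lang:
            counts["xml_lang"] += 1
    return counts


# Hypothetical sample pages, not taken from the WebDevData.org archive.
sample_pages = [
    '<html lang="en"><body></body></html>',
    '<html xml:lang="en" lang="en"><body></body></html>',
    "<html><head></head></html>",
]
print(tally_lang_usage(sample_pages))

# The share reported on the slide: 39,433 of 84,054 pages.
print(round(100 * 39433 / 84054))
```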
  8. 8. Why Would You Use lang?
  9. 9. Why Would You Use lang? • HTML 5 Specification • HTML Validation • Internationalization (i18n) • WCAG 2.0 A, AA • Numbers • Dates • Hyphens • Quotes • Screen Readers
  10. 10. HTML 5 Specification
  11. 11. HTML 5 Specification • The spec now provides a warning, • Notes that it must match the detected language of the page, • Identifies ways in which it is used, • Added in April 2016 • add warning/advice about lang attribute use #218 https://github.com/w3c/html/issues/218
  12. 12. HTML 5 Specification http://w3c.github.io/html/dom.html#lang-warning
  13. 13. HTML Validation
  14. 14. HTML Validation • The W3C HTML validator compares the following attributes on the page with the detected page language: • dir • lang • If there is a mismatch, the validator will provide a warning, • If there is no dir or lang, the validator will provide a warning. It will know if you lie.
  15. 15. HTML Validation https://www.w3.org/blog/International/2016/07/13/w3c-html5-validator-enhanced-with-language-detection-functionality/
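The comparison described on the previous slides can be sketched roughly as follows. This is not the validator's actual code: the function `language_warnings` and its message wording are hypothetical, the detected language is assumed to come from some external detector, and only `lang` is handled (not `dir`).

```python
def primary_subtag(tag: str) -> str:
    """Return the lowercase primary language subtag, e.g. 'en' for 'en-US'."""
    return tag.split("-")[0].lower()


def language_warnings(declared_lang, detected_lang):
    """Return warnings for a missing or mismatched lang attribute.

    detected_lang is assumed to be supplied by a separate language detector.
    """
    if not declared_lang:
        return [f'No lang attribute found; consider adding lang="{detected_lang}".']
    if primary_subtag(declared_lang) != primary_subtag(detected_lang):
        return [f'lang="{declared_lang}" does not match detected language "{detected_lang}".']
    return []


print(language_warnings("en-US", "en"))  # region differs, primary language agrees
print(language_warnings(None, "fr"))     # missing attribute
print(language_warnings("en", "de"))     # mismatch
```

Comparing only the primary subtag is a design choice of this sketch: it treats `en-US` and `en` as agreeing, since a detector is unlikely to tell regional variants apart.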
  16. 16. Internationalization (i18n)
  17. 17. Internationalization (i18n) • Spelling and grammar checkers: • spellcheck attribute (at caniuse.com) • CSS: • ::first-letter (at caniuse.com) • Hanging punctuation • Translation tools (particularly when looking at parts of a page). https://www.w3.org/International/questions/qa-lang-why
  18. 18. Internationalization (i18n) • Font selection for CJK (for political reasons). https://medium.com/behancetech/localization-gotchas-for-asian-languages-cjk-e52a57c0fde1
  19. 19. WCAG 2.0 A, AA
  20. 20. WCAG 2.0 A, AA • Guideline 3.1 Readable: Make text content readable and understandable. • 3.1.1 Language of Page (Level A) • H57: Using language attributes on the html element • 3.1.2 Language of Parts (Level AA) • H58: Using language attributes to identify changes in the human language https://www.w3.org/TR/2008/REC-WCAG20-20081211/#meaning-doc-lang-id
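As a rough illustration of the two success criteria, the Python sketch below reads the page language from `<html>` (3.1.1) and collects every other element that declares its own `lang` (3.1.2). The class name `LangCollector` and the sample markup are hypothetical, and this is not a conformance check: it cannot tell whether foreign-language text is missing a `lang` attribute.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Collect the page language and any elements that declare their own lang."""

    def __init__(self):
        super().__init__()
        self.page_lang = None
        self.parts = []

    def handle_starttag(self, tag, attrs):
        lang = dict(attrs).get("lang")
        if tag == "html":
            self.page_lang = lang            # 3.1.1 Language of Page
        elif lang:
            self.parts.append((tag, lang))   # 3.1.2 Language of Parts


# Hypothetical sample markup.
sample = (
    '<html lang="en"><body>'
    '<p>She said <q lang="fr">bonjour</q> and left.</p>'
    "</body></html>"
)
collector = LangCollector()
collector.feed(sample)
print(collector.page_lang, collector.parts)
```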
  21. 21. Numbers
  22. 22. Numbers • A browser can adjust decimal characters in number fields, • Some use comma, some use period, • Yes, this is for Latin scripts. • Do not worry about browser support unless you are mixing within a page. • In that case, Firefox is the way to go. If left blank, the browser should go with locale settings.
  23. 23. Numbers http://codepen.io/aardrian/pen/rOGYNL
  24. 24. Numbers http://codepen.io/aardrian/pen/rOGYNL
  25. 25. Dates
  26. 26. Dates • Not so much. http://s.codepen.io/aardrian/debug/ZpgNWJ
  27. 27. Hyphens
  28. 28. Hyphens • For browsers that support hyphens, you will enjoy the benefit just by using the lang attribute. • This assumes you use the following CSS: • hyphens: auto; • -ms-hyphens: auto; (ugh) • -webkit-hyphens: auto; (also ugh) • Browser support: • http://caniuse.com/#search=hyphens If left blank, the browser should go with locale settings.
  29. 29. Hyphens http://codepen.io/aardrian/pen/zKVLvO
  30. 30. Hyphens http://codepen.io/aardrian/pen/zKVLvO
  31. 31. Quotes
  32. 32. Quotes • Let the browser choose the quote marks based on the language. • This assumes you use the following HTML: • <q>…</q> Obviously you can override this with CSS, but that would be silly.
  33. 33. Quotes http://s.codepen.io/aardrian/debug/zKgbVv
  34. 34. Quotes http://s.codepen.io/aardrian/debug/zKgbVv
  35. 35. Screen Readers
  36. 36. Screen Readers • VoiceOver uses it to auto-switch voices. • VoiceOver can speak using a different accent. • JAWS uses it to load the correct phonetic engine / phonologic dictionary. • NVDA uses it in the same way as VoiceOver and JAWS. • For HTML in an ePub or Apple iBooks document, it affects how VoiceOver will read the book. • Leaving out the lang attribute may require the user to manually switch to the correct language for proper pronunciation. The gist is that things can sound funny if done wrong.
  37. 37. Screen Readers http://s.codepen.io/aardrian/debug/eBOrZY NVDA:
  38. 38. Screen Readers http://s.codepen.io/aardrian/debug/eBOrZY JAWS:
  39. 39. Fun Facts
  40. 40. Fun Facts • WHATWG HTML 5 <html class=split lang=en-US-x-hixie> • W3C HTML 5.0 <html lang="en-US-x-Hixie"> • W3C HTML 5.1 <html lang="en"> You can confirm this by viewing the source of each.
  41. 41. Fun Facts “Private-use subtags do not appear in the subtag registry, and are chosen and maintained by private agreement amongst parties.” “Because these subtags are only meaningful within private agreements and cannot be used interoperably across the Web, they should be used with great care, and avoided whenever possible.” http://www.w3.org/International/articles/language-tags/Overview.en.php#extension
  42. 42. Fun Facts • There is a normative spec: • Hixie English • Version: 1.0-pre43 • Language Tag: en-GB-x-Hixie • “This is a normative reference to Hixie English. Hixie English is a variant of the language spoken by the majority of the residents of the United Kingdom (England) and the United States of America.” http://ian.hixie.ch/bible/english
  43. 43. Drink!
