Mind Your lang
Presented by Adrian Roselli for
Accessibility Camp Toronto 2016
Slides from this talk will be available at
rosel.li/a11yTO
“Toronto Skyline” by Ronan Jouve, CC BY-NC 2.0
About Adrian Roselli
• I’ve written some stuff,
• Member of W3C,
• Building for the web
since 1993,
• Learn more at
AdrianRoselli.com,
• Avoid on Twitter
@aardrian.
Great bedtime reading!
What Is lang?
• Examples:
<html lang="en">
<html lang="en-ca">
<html lang="en-us">
<html lang="en-GB-x-hixie">
• Source:
BCP47: Tags for Identifying Languages,
https://tools.ietf.org/html/bcp47
We’ll come back to that last one.
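To make the structure of those example tags concrete, here is a minimal Python sketch that splits a tag into its subtags. The function name `split_language_tag` is hypothetical, and the parsing is deliberately simplified: it assumes a well-formed tag, ignores extended-language, variant, and extension subtags, and checks nothing against the subtag registry.

```python
def split_language_tag(tag):
    """Split a language tag into language, script, region and private-use parts.

    Simplified on purpose (an illustration, not a BCP 47 validator):
    extended-language, variant and extension subtags are ignored.
    """
    subtags = tag.split("-")
    parts = {
        "language": subtags[0].lower(),
        "script": None,
        "region": None,
        "private_use": [],
    }
    rest = subtags[1:]
    for index, subtag in enumerate(rest):
        if subtag.lower() == "x":
            # Everything after the "x" singleton is private use.
            parts["private_use"] = [s.lower() for s in rest[index + 1:]]
            break
        if len(subtag) == 4 and subtag.isalpha():
            # Four letters: a script subtag such as "Latn".
            parts["script"] = subtag.title()
        elif len(subtag) == 2 and subtag.isalpha():
            # Two letters: a region subtag such as "CA".
            parts["region"] = subtag.upper()
        elif len(subtag) == 3 and subtag.isdigit():
            # Three digits: a numeric region subtag such as "419".
            parts["region"] = subtag
    return parts


for example in ("en", "en-ca", "en-us", "en-GB-x-hixie"):
    print(example, split_language_tag(example))
```

Tags are case-insensitive, so `en-ca` and `en-CA` split the same way; the sketch only normalizes case for display.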
Who Uses lang?
• WHATWG Bug: “why do these examples of <html>
lack the lang attribute?”
This is where my research started.
“Why not? Realistically,
few people include it. It
just means the language
is unknown.”
Who Uses lang?
• “why do these examples of <html> lack the
lang attribute?”
• WHATWG HTML bug (26942)
• Reported: 2014-09-30
• Resolved: 2016-04-18
• Git merge:
• Editorial: Add lang to most examples #1061
Spoiled the surprise, I know, but we aren’t here for a bug.
Who Uses lang?
• Pulled January 2015 archive from
WebDevData.org (a W3C Community Group),
• Parsed 84,054 pages,
• Found that 39,433 pages use the lang
attribute on the <html> element,
• 47% use <html lang="…">.
12,762 use xml:lang, which is wrong.
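A count like this can be approximated with Python's standard `html.parser`. This is a rough sketch, not the script used for the talk: the names `HtmlAttributeFinder`, `html_attributes`, and `count_lang_usage` are hypothetical, and `sample_pages` is made-up data standing in for the archive.

```python
from html.parser import HTMLParser


class HtmlAttributeFinder(HTMLParser):
    """Remember the attributes of the first <html> start tag."""

    def __init__(self):
        super().__init__()
        self.html_attrs = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "html" and self.html_attrs is None:
            self.html_attrs = dict(attrs)


def html_attributes(markup):
    """Return the attributes of the <html> element, or {} if there is none."""
    finder = HtmlAttributeFinder()
    finder.feed(markup)
    finder.close()
    return finder.html_attrs or {}


def count_lang_usage(pages):
    """Count pages with lang and with xml:lang on the <html> element."""
    counts = {"pages": 0, "lang": 0, "xml:lang": 0}
    for markup in pages:
        attrs = html_attributes(markup)
        counts["pages"] += 1
        counts["lang"] += "lang" in attrs
        counts["xml:lang"] += "xml:lang" in attrs
    return counts


# Made-up pages, only to exercise the functions.
sample_pages = ['<html lang="en">', '<html xml:lang="en">', "<html>"]
print(count_lang_usage(sample_pages))

# The share quoted on the slide, computed from the slide's own totals.
print(f"{39433 / 84054:.1%}")
```

Note that the sketch counts an empty `lang` attribute as present, which may or may not match how the original numbers were gathered.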
Why Would You Use lang?
• HTML 5 Specification
• HTML Validation
• Internationalization (i18n)
• WCAG 2.0 A, AA
• Numbers
• Dates
• Hyphens
• Quotes
• Screen Readers
https://github.com/w3c/html/issues/218
HTML 5 Specification
• The spec now provides a warning,
• Notes that lang must match the detected language of
the page,
• Identifies ways in which it is used,
• Added in April 2016:
• add warning/advice about lang attribute use #218
https://github.com/w3c/html/issues/218
HTML 5 Specification
http://w3c.github.io/html/dom.html#lang-warning
HTML Validation
• The W3C HTML validator compares the
following attributes on the page with the
detected page language:
• dir
• lang
• If there is a mismatch, the validator will
provide a warning,
• If there is no dir or lang, the validator will
provide a warning.
It will know if you lie.
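The two warning cases can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the validator's actual logic: the function `lang_warnings` is hypothetical, the detected language is assumed to come from some separate language detector, and comparing only the primary language subtag is an assumption.

```python
def lang_warnings(declared_lang, detected_lang):
    """Return warnings for a missing or mismatched lang attribute.

    declared_lang: value of lang on the <html> element, or None if absent.
    detected_lang: language code reported by a (hypothetical) detector.
    """
    warnings = []
    if not declared_lang:
        warnings.append("no lang attribute on the html element")
        return warnings
    # Assumption: only the primary language subtag is compared,
    # so "en-CA" and "en" count as a match.
    declared_primary = declared_lang.split("-")[0].lower()
    detected_primary = detected_lang.split("-")[0].lower()
    if declared_primary != detected_primary:
        warnings.append(
            f'lang="{declared_lang}" does not match detected language "{detected_lang}"'
        )
    return warnings


print(lang_warnings("en-CA", "en"))  # matching primary subtag
print(lang_warnings("en", "fr"))     # mismatch
print(lang_warnings(None, "fr"))     # missing attribute
```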
HTML Validation
https://www.w3.org/blog/International/2016/07/13/w3c-html5-validator-enhanced-with-language-detection-functionality/
Internationalization (i18n)
• Spelling and grammar checkers:
• spellcheck attribute (at caniuse.com)
• CSS:
• ::first-letter (at caniuse.com)
• Hanging punctuation
• Translation tools (particularly when looking at
parts of a page).
https://www.w3.org/International/questions/qa-lang-why
Internationalization (i18n)
• Font selection for CJK (for political reasons).
https://medium.com/behancetech/localization-gotchas-for-asian-languages-cjk-e52a57c0fde1#.vkrhr614s
WCAG 2.0 A, AA
• Guideline 3.1 Readable: Make text content
readable and understandable.
• 3.1.1 Language of Page (Level A)
• H57: Using language attributes on the html
element
• 3.1.2 Language of Parts (Level AA)
• H58: Using language attributes to identify changes
in the human language
https://www.w3.org/TR/2008/REC-WCAG20-20081211/#meaning-doc-lang-id
Numbers
• A browser can adjust decimal characters in
number fields,
• Some use comma, some use period,
• Yes, this is for Latin scripts.
• Do not worry about browser support unless
you are mixing within a page.
• In that case, Firefox is the way to go.
If left blank, the browser may go with local settings.
Numbers
http://codepen.io/aardrian/pen/rOGYNL
Dates
• Not so much.
http://s.codepen.io/aardrian/debug/ZpgNWJ
Hyphens
• For browsers that support hyphens, you will
enjoy the benefit just by using the attribute.
• This assumes you use the following CSS:
• hyphens: auto;
• -ms-hyphens: auto; (ugh)
• -webkit-hyphens: auto; (also ugh)
• Browser support:
• http://caniuse.com/#search=hyphens
If left blank, the browser may go with local settings.
Hyphens
http://codepen.io/aardrian/pen/zKVLvO
Quotes
• Let the browser choose the quote marks
based on the language.
• This assumes you use the following HTML:
• <q>…</q>
Obviously you can override this with CSS, but that would be silly.
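As a rough model of what "choose the quote marks based on the language" means, the Python sketch below looks up quote marks for a tag by dropping subtags from the right until something matches. The table `QUOTE_MARKS` and the functions `quote_marks_for` and `render_q` are hypothetical; the three entries are common conventions, not data taken from any browser.

```python
# Hypothetical table: opening and closing marks for a few languages.
QUOTE_MARKS = {
    "en": ("\u201c", "\u201d"),  # “ ”
    "fr": ("\u00ab", "\u00bb"),  # « »
    "de": ("\u201e", "\u201c"),  # „ “
}


def quote_marks_for(lang, default=('"', '"')):
    """Find quote marks for a language tag, falling back to shorter tags."""
    subtags = lang.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in QUOTE_MARKS:
            return QUOTE_MARKS[candidate]
        # Drop the last subtag and try again, e.g. "en-ca" then "en".
        subtags.pop()
    return default


def render_q(text, lang):
    """Wrap text the way a <q> element might be rendered for lang."""
    open_mark, close_mark = quote_marks_for(lang)
    return f"{open_mark}{text}{close_mark}"


for lang in ("en-ca", "fr", "de", "xx"):
    print(lang, render_q("lang", lang))
```

An unknown or empty tag falls through to the default straight quotes, which loosely mirrors the browser falling back when lang is left out.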
Quotes
http://s.codepen.io/aardrian/debug/zKgbVv
Screen Readers
• VoiceOver uses it to auto-switch voices.
• VoiceOver can speak using a different accent.
• JAWS uses it to load the correct phonetic engine /
phonologic dictionary.
• NVDA uses it in the same way as VoiceOver and JAWS.
• For HTML in an ePub or Apple iBooks document, it affects
how VoiceOver will read the book.
• Leaving out the lang attribute may require the user to
manually switch to the correct language for proper
pronunciation.
The gist is that things can sound funny if done wrong.
Screen Readers
http://s.codepen.io/aardrian/debug/eBOrZY
NVDA:
Screen Readers
http://s.codepen.io/aardrian/debug/eBOrZY
JAWS:
Fun Facts
• WHATWG HTML 5
<html class=split lang=en-US-x-hixie>
• W3C HTML 5.0
<html lang="en-US-x-Hixie">
• W3C HTML 5.1
<html lang="en">
You can confirm this by viewing the source of each.
Fun Facts
“Private-use subtags do not appear in the
subtag registry, and are chosen and maintained
by private agreement amongst parties.”
“Because these subtags are only meaningful
within private agreements and cannot be used
interoperably across the Web, they should be
used with great care, and avoided whenever
possible.”
http://www.w3.org/International/articles/language-tags/Overview.en.php#extension
Fun Facts
• There is a normative spec:
• Hixie English
• Version: 1.0-pre43
• Language Tag: en-GB-x-Hixie
• “This is a normative reference to Hixie English.
Hixie English is a variant of the language
spoken by the majority of the residents of the
United Kingdom (England) and the United
States of America.”
http://ian.hixie.ch/bible/english
Questions?
