A11Y? I18N? L10N? UTF8? WTF? Understanding the connections between: accessibility, internationalization, localization, and character sets (long version)

Web accessibility (A11Y) is about making the web usable for people with disabilities, and it also benefits others with changing abilities, such as older people. Internationalization (I18N) and localization (L10N) are about translating web sites into other languages. UTF-8 is an encoding of the Unicode character set; it is now the dominant encoding used on the web, and Unicode is designed to include characters from just about every written language. Each of these topics is typically discussed in isolation from the others, but in this talk – after a gentle introduction to each of them – we’ll explore their interconnections. We’ll also take a look at what WordPress provides for supporting them in your work creating sites, themes, or plugins.

  1. 1. A11Y? I18N? L10N? UTF8? WTF? Understanding the connections between: accessibility, internationalization, localization, and character sets Michael Toppa @mtoppa WordCamp Nashville May 3, 2014
  2. 2. About me… * I’ve been developing for the web since the days of HTML 1.0, when web pages were first painted on cave walls. * This is my 7th WordCamp presentation, and I have 7 plugins at wordpress.org, dating back to 2006. * I was previously the Director of Development for WebDevStudios. One of my assignments while there was managing the WordPress VIP project for NBC Latino. * I’ve also managed the 16-person web application team at the U Penn School of Medicine, and I previously worked at Stanford, Georgetown, Ask Jeeves, and E-Trade.
  3. 3. I’m now working at PromptWorks, a small consultancy in Philadelphia. We do a lot of work in Ruby, Rails, JavaScript, and infrastructure automation. In addition to building products for our clients, we work closely with them, and pair program with them when possible, so that when we leave, they can continue on the path using TDD and good object-oriented design.
  4. 4. Accessibility, internationalization, and character sets are normally presented as separate, distinct topics. But I see them as strongly interconnected, and so in this talk I’m going to discuss all of them, with a focus on how they relate to each other. This talk is by no means comprehensive, as they are each big topics. My goal is to start you thinking about how to make your web content more accessible to people who have varying levels of ability using the web, and who speak different languages.
  5. 5. Accessibility (A11Y) Red, yellow, and green all look yellowish to many color-blind people. So how do they understand street signal lights? They pay attention to the order instead. The signal lights communicate in two ways.
  6. 6. Accessibility (A11Y) Like the signal lights, this wheelchair ramp is a good example of incorporating accessibility into a design without making things ugly or seeming like an afterthought.
  7. 7. Why bother? Before going any further, why should you spend time worrying about any of this? If you’re just making a web site for a small business here in Nashville, why spend time coding for accessibility, or worrying about languages other than English?
  8. 8. Reason #1 Accessibility ≠ Disability Accessibility is important for a wide variety of people: * older people, who often have impaired hearing, difficulty clicking on small targets, etc * people with low literacy or not fluent in the language * people with low bandwidth connections or using older technologies * new and infrequent users * …and persons with disabilities
  9. 9. Reason #2 More people need help than you think * More than half of Americans over 65 are now online, and they spend a lot of time online * About 9% of men suffer from a type of color blindness * The number of Americans who speak a language other than English at home has tripled since 1980, to 1 in 5 Americans. That’s about 60 million people. * About 5% of Americans don’t speak English fluently; that’s over 15 million people. * Another 5% live in places where the only way to get online is through slow dial-up connections. * And that’s just the US…
  10. 10. Reason #3 The cost is low As we go along in this talk, you’ll see that meeting basic accessibility needs is not that hard. It’s also not hard to set up your content to be translation ready, even if you don’t need to support other languages right now.
  11. 11. Reason #4 It’s the right thing to do When you don’t give any thought to how people with varying abilities can use your site, the result can be a miserable experience for them.
  12. 12. Things I learned by pretending to be blind for a week Some well-known sites, such as Facebook and Amazon, are almost unusable by blind people. The Amazon home page has over 1,000 links, few alt attributes for images, and few ARIA landmarks (“role” attributes), which help screen readers identify different regions of a page.
  13. 13. WCAG Accessibility Guidelines 1. Perceivable <img src="smiley.gif" alt="Smiley face"> 2. Operable <input accesskey="S" type="submit" value="Submit"> 3. Understandable and Predictable <a href="news.html" target="_blank">latest news (opens new window)</a> 4. Robust and Compatible <label for="first_name">First Name</label> The World Wide Web Consortium (W3C) put together version 2 of their Web Content Accessibility Guidelines in 2008, and it has 4 key principles: Perceivable - e.g. provide text alternatives for non-textual content; Operable - e.g. make all functionality available from the keyboard, provide good site navigation; Understandable - e.g. help users avoid and correct mistakes, such as clearly indicating errors in a form submission; Robust - e.g. use valid, well-structured HTML to maximize compatibility with user agents such as screen readers.
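The "Perceivable" example lends itself to an automated check. Here is a minimal sketch in Python (not WordPress code) that uses the standard library's HTML parser to list images with no alt attribute. The names `MissingAltFinder` and `images_missing_alt` are made up for illustration, and a real audit tool would check far more than this.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # alt="" is valid for decorative images, so only a missing attribute is flagged.
        if "alt" not in attr_map:
            self.missing.append(attr_map.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of images that lack an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


sample = '<img src="smiley.gif" alt="Smiley face"><img src="logo.png">'
print(images_missing_alt(sample))
```

Only the second image is reported, because the first one carries the text alternative a screen reader needs.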
  14. 14. WCAG Accessibility Guidelines 1. Perceivable 2. Operable 3. Understandable and Predictable ❖ Success Criterion 3.1.1 Language of Page: ❖ The default human language of each Web page can be programmatically determined. 4. Robust and Compatible There are 17 success criteria to meet for making a web page understandable. The first one is that it should be possible to programmatically determine the language of a web page.
  15. 15. The lang attribute ❖ Declare the language of a WordPress theme in header.php: <html <?php language_attributes(); ?>> For a US English site, this renders as: <html lang="en-US"> ❖ In HTML 5, declare the language of part of a document: <div lang="fr"> WordPress itself has been translated into over 70 languages, and if you are developing a theme or plugin, you need to make sure you are using the lang attribute appropriately. The language_attributes function will set a lang attribute based on the language specified in your wp-config.php file.
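To see what "programmatically determined" means in practice, here is a rough Python sketch of how a tool such as a screen reader or crawler might read the page language from the markup. `LangFinder` and `page_language` are hypothetical names, and the sketch only looks at the root html element.

```python
from html.parser import HTMLParser


class LangFinder(HTMLParser):
    """Records the lang attribute of the root <html> element, if any."""

    def __init__(self):
        super().__init__()
        self.page_lang = None

    def handle_starttag(self, tag, attrs):
        # Only the first <html> tag counts; lang on inner elements is ignored here.
        if tag == "html" and self.page_lang is None:
            self.page_lang = dict(attrs).get("lang")


def page_language(html):
    """Return the declared page language, or None if it is not declared."""
    finder = LangFinder()
    finder.feed(html)
    return finder.page_lang


page = '<html lang="en-US"><body><div lang="fr">Bonjour</div></body></html>'
print(page_language(page))
```

A page with no lang attribute returns None, which is exactly the case where a speech synthesizer has to guess.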
  16. 16. Uses of the lang attribute ❖ Supports speech synthesizers and automated translators ❖ Supports spelling and grammar checkers ❖ Improves search engine results ❖ Helps support server content negotiation ❖ Allows user-agents to select language appropriate fonts Content negotiation lets the browser tell the server what media types and languages it prefers, and the server will do its best to comply. There is a plugin to support this in WordPress.
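Content negotiation for languages can be sketched in a few lines. The Python below parses a browser's Accept-Language header and picks the best language a site can serve. The function names and the fallback to "en" are assumptions for illustration. This is not how any particular WordPress plugin does it, and it ignores finer points of the HTTP specification such as the `*` wildcard.

```python
def parse_accept_language(header):
    """Return (language_tag, quality) pairs sorted by descending quality."""
    choices = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # A missing q parameter means full preference.
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        choices.append((tag.strip().lower(), quality))
    # sorted() is stable, so ties keep the order the browser sent.
    return sorted(choices, key=lambda pair: pair[1], reverse=True)


def pick_language(header, available, default="en"):
    """Choose the first preferred language the site can serve."""
    offered = {lang.lower() for lang in available}
    for tag, quality in parse_accept_language(header):
        if quality <= 0:
            continue
        if tag in offered:
            return tag
        primary = tag.split("-")[0]  # "fr-ch" falls back to "fr"
        if primary in offered:
            return primary
    return default


print(pick_language("fr-CH, fr;q=0.9, en;q=0.8", ["en", "fr"]))
```

Here the browser prefers Swiss French; the site has no exact match, so it falls back to plain French before English.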
  17. 17. Language appropriate fonts This ideographic character has the same Unicode value and meaning in Chinese, Japanese, and Korean. The character means “snow.” But it is rendered differently, depending on whether the lang attribute of the page is set to Simplified Chinese, Traditional Chinese, Japanese, or Korean.
  18. 18. Unicode? Unicode is a single character set designed to include characters from just about every writing system on the planet. This is a small section of the Unicode character map, showing characters used in languages spoken in Myanmar.
  19. 19. Klingon for Unicode It supports languages from off the planet as well. Although the Klingon application for incorporation into Unicode was rejected in 2001, an encoding for it was created in what’s called the “private use” range of code points in Unicode. So there are web sites out there written in Klingon, and you can download Klingon fonts so you can read them.
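Private use code points can be recognized programmatically, since Unicode gives them the general category "Co". The Python sketch below checks this with the standard `unicodedata` module. The specific code point U+F8D0 is an assumption: it is where the unofficial ConScript Unicode Registry places Klingon, not something the Unicode standard defines.

```python
import unicodedata


def is_private_use(char):
    """True if the character sits in a Unicode private use area (category Co)."""
    return unicodedata.category(char) == "Co"


# Assumption: U+F8D0 is the start of the unofficial ConScript range for Klingon.
klingon_candidate = "\uf8d0"

print(is_private_use(klingon_candidate))
print(is_private_use("A"))
```

Because Unicode assigns no meaning to these code points, the text is only readable with a font that agrees on the same private mapping.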
  20. 20. Solving the Unicode Puzzle: PHP Architect, May 2005 In 2005 I wrote an article on configuring Apache, Oracle, and PHP for Unicode, published in PHP Architect. At that time Unicode was just emerging as the new standard for character encoding, and configuring end-to-end support for using it in web applications was a significant undertaking. These days, Unicode support comes out of the box for the most part.
  21. 21. Before there was Unicode… Lower ASCII Unicode has been prevalent on the web for about 10 years now. In the 1960s, unaccented English characters, as well as various control characters for carriage returns, page feeds, etc., were each assigned a number from 0 to 127; there was general agreement on these number assignments, and so ASCII was born (American Standard Code for Information Interchange).
  22. 22. Before there was Unicode… Upper ASCII: ISO 8859-1 (aka Latin 1) The ASCII characters could fit in 7 bits, and computers used 8-bit bytes, which left an extra bit of space. This led to the proliferation of many different character sets, with each one using this extra space in a different way. Here’s Latin 1, which contains special symbols and accented characters for Western languages.
  23. 23. Before there was Unicode… Upper ASCII: ISO 8859-2 Here’s ISO 8859-2 (Latin 2), the version of Upper ASCII that supports Central and Eastern European languages written in the Latin alphabet, including several Slavic languages. There are 15 variations on this ISO standard, and Cyrillic has its own. This means that text generated on, say, a computer in Russia would turn into gibberish if you tried to read it on a computer in the US. This happened because the number codes representing the Cyrillic characters were assigned to totally different characters on the US computer. This became a bit of a problem when everyone started using the internet.
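The gibberish effect is easy to reproduce. This Python sketch assumes the Russian text was stored in ISO 8859-5, a Cyrillic member of the same ISO family (the slide does not say which Cyrillic character set was in use), and that the US computer assumes Latin 1.

```python
original = "Привет"  # "Hello" in Russian

# The sender stores the text with a Cyrillic single-byte character set.
raw_bytes = original.encode("iso8859_5")

# The receiver assumes Latin 1, so the same numbers map to different characters.
garbled = raw_bytes.decode("latin_1")

# Decoding with the character set the sender actually used recovers the text.
recovered = raw_bytes.decode("iso8859_5")

print(len(raw_bytes))         # one byte per character in a single-byte set
print(garbled)                # accented Western characters, not Cyrillic
print(recovered == original)
```

Nothing is lost in transit: the bytes are identical on both machines, and only the interpretation differs.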
  24. 24. The Unicode slogan “Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.” Unicode represents an effort to clean up this mess. Unicode can do this because it allows characters to occupy more than one byte, so it has enough room to store characters from languages around the world—even Asian languages that have thousands of characters. It’s a character set able to support over 1 million characters.
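The "unique number for every character" idea can be seen directly in Python, where `ord` returns a character's Unicode code point. The sample characters are my own picks for illustration.

```python
import sys

# One sample character from each of several writing systems.
samples = {"Latin": "A", "Cyrillic": "Я", "Myanmar": "က", "CJK": "雪"}

for script, char in samples.items():
    # Each character has exactly one code point, whatever the platform.
    print(f"{script}: {char} is U+{ord(char):04X}")

# Code points run from 0 to sys.maxunicode, which is where "over 1 million" comes from.
total_code_points = sys.maxunicode + 1
print(total_code_points)
```

The total is the size of the code space, not the number of characters assigned so far, which is much smaller.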
  25. 25. So what is UTF-8? Unicode is a character set, and there are 3 different ways to encode it: UTF-8, UTF-16, and UTF-32. UTF-8 is the Unicode encoding standard for the web because, like ASCII, it’s built on 8-bit bytes, and it’s compatible with the lower ASCII character set: the first 128 characters are encoded as the same single bytes. This makes it backwards compatible with previously created plain ASCII documents, while accented Latin 1 characters take two bytes each.
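A short Python sketch shows how UTF-8 spends a variable number of bytes per character, and where its backwards compatibility starts and stops. One caveat worth checking for yourself: byte-for-byte compatibility holds for the 128 lower ASCII characters, while a Latin 1 byte for an accented character is not valid UTF-8 on its own. The sample characters are my own.

```python
# Characters needing 1, 2, 3, and 4 bytes in UTF-8.
for char in ["A", "é", "雪", "😀"]:
    encoded = char.encode("utf-8")
    print(f"U+{ord(char):04X} -> {len(encoded)} byte(s): {encoded.hex()}")

# Plain ASCII text produces identical bytes in both encodings.
ascii_text = "Howdy"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# The Latin 1 encoding of "é" is the single byte 0xE9, which UTF-8 rejects.
latin1_bytes = "é".encode("latin_1")
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

print(same_bytes, latin1_is_valid_utf8)
```

This is why old unaccented English pages display fine when served as UTF-8, but old Latin 1 pages with accents need to be converted or labeled with their real encoding.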
  26. 26. Learning everyday Japanese with Mangajin UTF-8 is the standard character encoding in WordPress, since version 2.2. Here’s an example from my blog, showing a multi-lingual post.
  27. 27. WordPress supports UTF-8
  28. 28. Localization (L10N) and Internationalization (I18N) A multi-lingual page like that is fairly uncommon. More commonly, content is created in one language, but we want a standardized way to enable the creation of translations into other languages. This is where localization and internationalization come in.
  29. 29. Localization “Localization refers to the adaptation of a product, application or document content to meet the language, cultural and other requirements of a specific target market (a locale).” This often involves more than just translation In addition to translation, this can also involve dealing with variations in numeric, date, currency, and time formats, varying legal requirements, and awareness of things that may be misunderstood or be offensive in other cultures.
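To make the numeric and date variations concrete, here is a deliberately tiny Python sketch. The `LOCALE_FORMATS` table and both function names are hypothetical, the table covers only two locales, and a real project should rely on a proper locale library or its platform's formatting functions instead of a hand-written table.

```python
from datetime import date

# Illustrative conventions only; not a complete or authoritative locale database.
LOCALE_FORMATS = {
    "en-US": {"thousands": ",", "decimal": ".", "date": "%m/%d/%Y"},
    "de-DE": {"thousands": ".", "decimal": ",", "date": "%d.%m.%Y"},
}


def format_number(value, locale_code):
    """Format a number with two decimals using the locale's separators."""
    fmt = LOCALE_FORMATS[locale_code]
    # Format with US conventions first, then swap in the locale's separators.
    us_style = f"{value:,.2f}"
    whole, _, fraction = us_style.partition(".")
    return whole.replace(",", fmt["thousands"]) + fmt["decimal"] + fraction


def format_date(day, locale_code):
    """Format a date using the locale's field order and separators."""
    return day.strftime(LOCALE_FORMATS[locale_code]["date"])


talk_day = date(2014, 5, 3)
for code in LOCALE_FORMATS:
    print(code, format_number(1234567.891, code), format_date(talk_day, code))
```

The same date reads as 05/03/2014 in the US and 03.05.2014 in Germany, which is the kind of ambiguity that translation alone does not fix.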
  30. 30. Internationalization “Internationalization is the design and development of a product, application or document content that enables easy localization for target audiences that vary in culture, region, or language.”
  31. 31. WordPress provides internationalization features so you can localize your themes and plugins
  32. 32. Step 1: use WordPress’ I18N functions ❖ Wrap all your text in WordPress’ I18N functions, using a custom “text domain”. This is for my “shashin” plugin: ❖ $greeting = __( 'Howdy', 'shashin' ); ❖ <li><?php _e( 'Howdy', 'shashin' ); ?></li> ❖ $string = _x( 'Buffalo', 'an animal', 'shashin' ); ❖ $string = _x( 'Buffalo', 'a city in New York', 'shashin' ); ❖ And others…
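Why is there a separate function that takes a context? A conceptual Python sketch may help. It is not WordPress's implementation: the `CATALOG` dictionary, the `translate` function, and the Spanish strings are all invented for illustration. It only mimics the lookup behavior, where the text domain, the optional context, and the original string together identify a translation, and the original string is returned when no translation exists.

```python
# Hypothetical in-memory catalog: (text domain, context, original) -> translation.
CATALOG = {
    ("shashin", None, "Howdy"): "Hola",
    ("shashin", "an animal", "Buffalo"): "Búfalo",
    ("shashin", "a city in New York", "Buffalo"): "Buffalo",
}


def translate(text, domain, context=None):
    """Look up a translation, falling back to the original text."""
    return CATALOG.get((domain, context, text), text)


print(translate("Howdy", "shashin"))
print(translate("Buffalo", "shashin", "an animal"))
print(translate("Buffalo", "shashin", "a city in New York"))
print(translate("Goodbye", "shashin"))  # no entry, so the original comes back
```

Without the context, the two "Buffalo" strings would share one entry, and a translator could only get one of them right.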
  33. 33. Step 2: load your text domain ❖ For plugins: load_plugin_textdomain( 'shashin', false, dirname(plugin_basename(__FILE__)) . '/languages/' ); Give it the path to translation files, which we will create in the next steps
  34. 34. Step 2: load your text domain ❖ For themes: function custom_theme_setup() { load_theme_textdomain( 'my_theme', get_template_directory() . '/languages' ); } add_action( 'after_setup_theme', 'custom_theme_setup' );
  35. 35. Step 3: generate a POT file The POT file serves as a template for translating your theme or plugin into other languages. It extracts all the text you wrapped in WordPress’ I18N functions and puts it in a single file. If you have a plugin in the wordpress.org repository, it can generate a POT file for you. There are other tools available for this as well. See the references at the end of this talk for other ways to generate a POT file for themes and plugins.
  36. 36. Step 4: create translation files This is a screenshot from POEdit. With POEdit, a translator can take your POT file and create a translation to another language. This translation creates a textual .po file, and then a binary, compiled version of it, in a .mo file. If you include a .mo file translation that matches the language configuration of a WordPress site, your theme or plugin can be shown in that language. If you include the .pot file with your theme or plugin, and it becomes popular, you’ll probably start receiving unsolicited translations from people who have translated it for use in their language, and want to share the translation for others to use.
  37. 37. Step 4: create translation files ❖ Other translation options: ❖ The Codestyling Localization plugin ❖ For themes, the ThemeZee translation site The Codestyling Localization plugin creates files compatible with POEdit, and works directly with the Google Translate API and Microsoft Translator API to help you translate. It has not been updated in over a year though. ThemeZee has a collaborative online theme translation community, which you can join for free.
  38. 38. Step 5: include translation files This shows all the different language translations available for the popular plugin, Contact Form 7. Maintaining translations can be difficult, as you will usually need to get an updated translation for each new release of your plugin or theme. Even just changes in line numbers can throw off the translation.
  39. 39. Questions?
  40. 40. Further reading ❖ W3C ❖ How to meet WCAG 2.0: quick reference ❖ Why use the language attribute? ❖ Localization vs. Internationalization ❖ WordPress ❖ How To Localize WordPress Themes and Plugins ❖ I18n for WordPress Developers ❖ Internationalization: You’re probably doing it wrong ❖ Solving the Unicode Puzzle
