
Exploring language technologies to provide support to WCAG 2.0 and E2R guidelines. Lourdes Moreno, Paloma Martínez, Isabel Segura-Bedmar, and Ricardo Revert. 2015. Universidad Carlos III de Madrid

Lourdes Moreno, Paloma Martínez, Isabel Segura-Bedmar, and Ricardo Revert. 2015. Exploring language technologies to provide support to WCAG 2.0 and E2R guidelines. In Proceedings of the XVI International Conference on Human Computer Interaction (Interacción '15). ACM, New York, NY, USA, Article 57, 8 pages. DOI=http://dx.doi.org/10.1145/2829875.2829927

URL ACM Digital Library: http://dl.acm.org/citation.cfm?id=2829927

Published in: Technology

  1. Exploring language technologies to provide support to WCAG 2.0 and E2R guidelines
     Lourdes Moreno (*), Paloma Martínez, Isabel Segura-Bedmar and Ricardo Revert
     Grupo LaBDA, Departamento de Informática, Universidad Carlos III de Madrid
     (*) lmoreno@inf.uc3m.es
     Vilanova i la Geltrú (Universitat Politècnica de Catalunya), September 2015
     Reference, ACM Digital Library: http://dl.acm.org/citation.cfm?id=2829927&CFID=573822944&CFTOKEN=54544041
  2. Contents
     • Motivation and introduction
     • Easy-to-Read (E2R) guidelines
     • WCAG 2.0: readability and understandability
     • Natural language processing (NLP) approaches for text simplification
     • Proof of concept: lexical simplification of drug package leaflets
     • Conclusions
     LaBDA, Universidad Carlos III de Madrid
  3. MOTIVATION
     • Some citizens face accessibility barriers when texts contain:
        - long sentences
        - unusual words
        - complex linguistic structures
        - …
     • Environment: web content
     • Readability and understandability should be considered when texts are created
  4. INTRODUCTION: Target groups
     • People with cognitive or learning disabilities
     • Also:
        - Prelingually deaf people
        - Older people (individual cognitive abilities such as attention span and memory)
        - People who cannot read or write
        - Immigrants (different native language)
        - People with aphasia, dyslexia or autism
  5. INTRODUCTION: Initiatives
     • Easy-to-Read (E2R)
        - Inclusion Europe 2009
        - Guidelines of IFLA 2010
     • Web Content Accessibility Guidelines (WCAG) 2.0
        - Regulatory framework
        - Hard success criteria
        - Conformance level AA
  6. EASY-TO-READ (E2R) Guidelines
     • In general terms, these guidelines are:
        - Use the simplest and most common words
        - Avoid long words
        - Avoid abbreviations
        - Use the same term to refer to the same concept
        - Use short sentences
        - Avoid complex sentences with dependent clauses
        - Use active language and avoid the passive voice
  7. EASY-TO-READ (E2R) Guidelines: What can be done?
     • Goal: make online texts more accessible and readable
     • Complex words or phrases are replaced with more commonly used ones
     • These adaptations are carried out with text simplification techniques. Examples:
        - www.noticiasfacil.es
        - www.e-include.info/
        - simple.wikipedia.org/
        - www.simplext.es/
     • A manual process? In some cases it is unfeasible
     • Supporting technology
  8. EASY-TO-READ (E2R) Guidelines
     • The E2R guidelines address only text content.
     • Page structure, presentation, … also matter
     => For this reason, the accessibility requirements of WCAG 2.0 must be taken into account
  9. WCAG 2.0: READABILITY AND UNDERSTANDABILITY
     Understandability vs. readability
     “a text could be highly readable, since the syntax is extremely simple, but extremely hard to understand because of the lexicon used”
        - Readability evaluates the structure of sentences: it concerns syntax and consequently requires syntactic simplification approaches
        - Understandability captures the lexical aspects and requires lexical simplification approaches
  10. WCAG success criteria concerning text
     • 3.1 (Readable: Make text content readable and understandable)
        - Readability: 3.1.5 (Reading Level)
        - Understandability: 3.1.3 (Unusual Words) and 3.1.4 (Abbreviations)
     Code, name (conformance level) and description:
     1.1.1 Non-text Content (Level A). All non-text content that is presented to the user has a text alternative that serves the equivalent purpose.
     2.4.2 Page Titled (Level A). Web pages have titles that describe topic or purpose.
     2.4.4 Link Purpose (In Context) (text type). The purpose of each link can be determined from the link text alone or from the link text together with its programmatically determined link context.
     2.4.6 Headings and Labels (Level AA). Headings and labels describe topic or purpose.
     2.4.9 Link Purpose (Link Only) (Level AAA) (text type). A mechanism is available to allow the purpose of each link to be identified from link text alone, except where the purpose of the link would be ambiguous to users in general.
     2.4.10 Section Headings (Level AAA). Section headings are used to organize the content.
     3.1.1 Language of Page (Level A). The default human language of each Web page can be programmatically determined.
     3.1.2 Language of Parts (Level AA). The human language of each passage or phrase in the content can be programmatically determined.
     3.1.3 Unusual Words (Level AAA). A mechanism is available for identifying specific definitions of words or phrases used in an unusual or restricted way, including idioms and jargon.
     3.1.4 Abbreviations (Level AAA). A mechanism for identifying the expanded form or meaning of abbreviations is available.
     3.1.5 Reading Level (Level AAA). When text requires reading ability more advanced than the lower secondary education level after removal of proper names and titles, supplemental content, or a version that does not require reading ability more advanced than the lower secondary education level, is available.
  11. WCAG 2.0: READABILITY AND UNDERSTANDABILITY: Additional accessibility requirements
     • WCAG 2.0 does not specify guidelines for these matters as it does for visual or auditory accessibility
     • A set of additional WCAG 2.0 success criteria has been identified regarding presentation, navigation, structure, cognitive aspects of user tasks, …
     • Some of these additional success criteria are:
        - 1.4.8 (Visual Presentation)
        - 2.2.3 (No Timing)
        - 2.4.5 (Multiple Ways)
        - 3.2.3 (Consistent Navigation)
        - 3.2.4 (Consistent Identification)
        - 3.3.1 (Error Identification)
        - 3.3.2 (Labels or Instructions)
        - 3.3.5 (Help)
  12. WCAG 2.0: READABILITY AND UNDERSTANDABILITY: Discussion and conclusions
     • There is no correspondence between the concepts in the E2R guidelines and the success criteria of WCAG 2.0
       => Professionals who work on WCAG accessibility conformance do not know how to meet the E2R requirements
     • Beyond the WCAG 2.0 criteria regarding text, further accessibility features should be considered
     • WCAG 2.0 support is not enough
     • Technology that supports the authoring of texts is required
  13. WCAG 2.0: READABILITY AND UNDERSTANDABILITY: Discussion and conclusions
     • Proposal:
        - NLP approaches that use E2R and WCAG 2.0 resources provide semi-automatic support
        - Different NLP strategies are used to simplify texts depending on whether understandability or readability is being addressed
  14. Natural language processing (NLP)
     • The discipline devoted to developing technology that understands natural language
     • Applications:
        - Machine translation
        - Information retrieval
        - Information extraction from unstructured data
        - Summarization
        - Question answering
        - …
  15. NLP APPROACHES FOR TEXT SIMPLIFICATION: Support to accessibility
     • NLP processes are applied with the objective of transforming a text into an equivalent one that is more accessible to people with any kind of cognitive disability
     • Three NLP processes that could be applied to text simplification tasks are described:
        - Language detection
        - Abbreviation detection
        - Topic detection
  16. NLP APPROACHES FOR TEXT SIMPLIFICATION: Language detection
     • Language detection consists of identifying the language of a text
     • It is helpful, for example, when screen readers are used
     • Approaches:
        - Check for language-specific strings or characters (e.g. Dutch if the string “ik” appears, German if “ich” or “ß” is used, Polish if “czy” or “ń”, “Ł”, “ź” occur in words)
        - Use n-gram frequency distributions. All languages have words that occur more frequently than others (Zipf's law), so two texts in the same language should have similar n-gram frequency distributions
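The n-gram approach on the language detection slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' system: the two one-sentence training samples, the trigram size and the use of cosine similarity to compare distributions are all assumptions made for the example. A real detector would build its profiles from much larger corpora.

```python
import math
from collections import Counter


def ngram_counts(text, n=3):
    """Count character n-grams of a lower-cased, whitespace-normalised text."""
    padded = " " + " ".join(text.lower().split()) + " "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_language(text, samples):
    """Return the language whose sample has the most similar n-gram distribution."""
    profiles = {lang: ngram_counts(sample) for lang, sample in samples.items()}
    doc = ngram_counts(text)
    return max(profiles, key=lambda lang: cosine_similarity(doc, profiles[lang]))


# Hypothetical, deliberately tiny training samples (assumption for illustration).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "es": "el rápido zorro marrón salta sobre el perro perezoso y el gato",
}

print(detect_language("the dog and the fox", SAMPLES))
print(detect_language("el perro y el gato", SAMPLES))
```

The detected language could then be written into the `lang` attribute that success criteria 3.1.1 and 3.1.2 rely on, after review by an editor.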
  17. NLP APPROACHES FOR TEXT SIMPLIFICATION: Abbreviations
     • Approaches to recognize abbreviations and their expansions:
        - Pattern-matching methods based on rules and heuristics that detect uppercase alphanumeric strings, in order to identify the patterns “long form (short form)” or “short form (long form)”
        - Co-occurrence: if a sequence of words co-occurs frequently with an abbreviation and does not occur with other nearby words, it is an “abbreviation-definition” relationship
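A minimal sketch of the pattern-matching approach for the “long form (short form)” case follows. The heuristic used here (an uppercase alphanumeric string in parentheses whose letters match the initials of the immediately preceding words) is an assumption chosen for brevity. It is far stricter than the rule sets used in practice, and it misses forms such as E2R whose characters are not all word initials.

```python
import re

# An uppercase alphanumeric short form of 2 to 10 characters inside parentheses.
PAREN_SHORT_FORM = re.compile(r"\(([A-Z][A-Z0-9]{1,9})\)")
# A word made of letters, optionally hyphenated.
WORD = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def find_abbreviations(text):
    """Map each short form to its long form when the initials line up."""
    pairs = {}
    for match in PAREN_SHORT_FORM.finditer(text):
        short = match.group(1)
        preceding_words = WORD.findall(text[:match.start()])
        candidate = preceding_words[-len(short):]
        initials = "".join(word[0] for word in candidate).upper()
        if len(candidate) == len(short) and initials == short:
            pairs[short] = " ".join(candidate)
    return pairs


text = ("The Web Content Accessibility Guidelines (WCAG) and "
        "natural language processing (NLP) are discussed here.")
print(find_abbreviations(text))
```

The resulting pairs are the kind of data that a mechanism for success criterion 3.1.4 (Abbreviations) needs, for example to offer the expanded form next to the abbreviation.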
  18. NLP APPROACHES FOR TEXT SIMPLIFICATION: Text summarization or topic detection
     • Goal: to obtain a set of sentences that reflects the content
     • This technique offers accessibility support to editors of web content when they create:
        - Titles of paragraphs
        - Sections that faithfully represent the content
     • Approach:
        - Automatic text extraction: a sentence is considered relevant if it contains a large number of important words
        - The importance of a word is calculated with a measure that relies on how frequent the word is in a document and on how many documents of a collection it appears in
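The word-importance measure described on the summarization slide matches what is commonly called TF-IDF. The sketch below scores each sentence by the summed TF-IDF weight of its words and keeps the top ones. The smoothed IDF formula, the sentence splitter and the sample collection are assumptions made for this illustration. Note that summing weights favours long sentences, so a real system would likely normalise by sentence length.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-cased alphabetic tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def tf_idf_weights(document, collection):
    """Weight = term frequency in the document * smoothed inverse document frequency."""
    term_freq = Counter(tokenize(document))
    doc_vocabularies = [set(tokenize(d)) for d in collection]
    n_docs = len(collection)
    weights = {}
    for word, freq in term_freq.items():
        doc_freq = sum(1 for vocab in doc_vocabularies if word in vocab)
        weights[word] = freq * math.log((1 + n_docs) / (1 + doc_freq))
    return weights


def summarize(document, collection, k=1):
    """Return the k highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    weights = tf_idf_weights(document, collection)
    ranked = sorted(
        sentences,
        key=lambda s: sum(weights.get(w, 0.0) for w in tokenize(s)),
        reverse=True,
    )
    chosen = set(ranked[:k])
    return [s for s in sentences if s in chosen]


# Hypothetical three-document collection (assumption for illustration).
doc = ("Ibuprofen can cause diarrhea. "
       "Ibuprofen can cause indigestion and diarrhea. "
       "Read the leaflet.")
collection = [doc, "Read the leaflet before use.", "The leaflet can cause confusion."]
print(summarize(doc, collection, k=1))
```

Words such as "the" and "leaflet" appear in every document of the collection and therefore receive zero weight, which is why the boilerplate sentence ranks last.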
  19. NLP APPROACHES FOR TEXT SIMPLIFICATION: Text simplification
     • It is essential in several types of texts: news, government and administrative information, laws and rights, etc.
     • There are three subtasks of text simplification:
        1. Syntactic simplification, which divides complex sentences into simpler ones
        2. Lexical simplification, whose objective is to replace complex vocabulary with common vocabulary
        3. Clarification, which provides definitions and explanations
     These tasks are not completely automatic; the output has to be manually reviewed in some cases.
  20. NLP APPROACHES FOR TEXT SIMPLIFICATION: Text simplification: Lexical simplification
     • Replacing complex words and utterances with easier words or phrases, taking the context into account
     • Heuristic: complex words have a low frequency
     • Proposals based on frequency give better results than other, more sophisticated systems [SemEval 2012]
     • Resources: lexical resources such as WordNet are used to extract synonyms as candidates to replace a complex or difficult word
  21. NLP APPROACHES FOR TEXT SIMPLIFICATION: Text simplification: Lexical simplification
     • Complexity measures: frequency of words in texts as well as the length of phrases
        - Fog index
        - Flesch-Kincaid
     These indexes have to be validated by end users
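The two indexes named on the complexity measures slide can be computed as follows, assuming they refer to the Gunning fog index and the Flesch-Kincaid grade level. The coefficients are the standard published ones for English. The syllable counter is a rough vowel-group heuristic introduced only for this sketch, and both formulas are calibrated for English, so they would need adaptation for Spanish texts such as the leaflets in the proof of concept.

```python
import re


def count_syllables(word):
    """Rough heuristic: one syllable per group of consecutive vowels (at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_stats(text):
    """Return (sentence count, word count, list of syllables per word)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    return len(sentences), len(words), [count_syllables(w) for w in words]


def flesch_kincaid_grade(text):
    """0.39 * words/sentence + 11.8 * syllables/word - 15.59."""
    n_sentences, n_words, syllables = text_stats(text)
    return 0.39 * n_words / n_sentences + 11.8 * sum(syllables) / n_words - 15.59


def gunning_fog(text):
    """0.4 * (words/sentence + 100 * share of words with three or more syllables)."""
    n_sentences, n_words, syllables = text_stats(text)
    complex_words = sum(1 for s in syllables if s >= 3)
    return 0.4 * (n_words / n_sentences + 100 * complex_words / n_words)


simple = "The cat sat. The dog ran."
technical = "Gastrointestinal perforation occasionally accompanies anticoagulant administration."
print(flesch_kincaid_grade(simple), flesch_kincaid_grade(technical))
print(gunning_fog(simple), gunning_fog(technical))
```

Both scores approximate a school grade, which is how they relate to the "lower secondary education level" threshold of success criterion 3.1.5 (Reading Level).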
  22. NLP APPROACHES FOR TEXT SIMPLIFICATION
     WCAG 2.0 success criterion      NLP approach
     2.4.2 (Page Titled)             Text summarization
     2.4.6 (Headings and Labels)     Text summarization
     2.4.10 (Section Headings)       Text summarization
     3.1.4 (Abbreviations)           Abbreviations
     3.1.3 (Unusual Words)           Dictionaries with definitions
     3.1.5 (Reading Level)           Syntactic simplification
  23. PROOF OF CONCEPT: Lexical Simplification of Drug Package Leaflets
     • The leaflet is the principal textual source of information for patients
     • It provides information about a drug: its appearance, actions, side effects, drug interactions, contraindications and special warnings
     • It is difficult for patients to understand:
        - The vocabulary is specific and technical
        - Paragraphs are long, especially those containing lists of side effects
        - The font size is small (9 points)
     • Problem: patient misunderstanding could be a source of medication errors and adverse drug reactions
  24. PROOF OF CONCEPT: Lexical Simplification of Drug Package Leaflets
     • Goal of the system:
        - Provide information in a way that is easy and clear to read
     • Medical terms (in particular, drug effects) are translated into lay terms that patients can understand
  25. PROOF OF CONCEPT: Lexical Simplification of Drug Package Leaflets
     FIRST module: Named Entity Recognition (NER)
     • Detects the mentions of drug effects
     • Uses MedDRA (a multilingual medical terminology dictionary of events associated with drugs)
     • MeaningCloud integrates MedDRA into GATE
  26. PROOF OF CONCEPT: Lexical Simplification of Drug Package Leaflets
     SECOND module: Lexical Simplifier
     • Identifies the effects whose names are considered complex, with the objective of replacing them with a simpler synonym
     • Two different strategies: preferred term substitution and most frequent term substitution
  27. PROOF OF CONCEPT: Lexical Simplification of Drug Package Leaflets
     SECOND module: Lexical Simplifier
     • Preferred term substitution
        - MedDRA defines sets of synonyms and provides a preferred term for each set
        - Example: cefalalgia (cephalalgia) would be replaced by cefalea (headache)
  28. PROOF OF CONCEPT: Lexical Simplification of Drug Package Leaflets
     SECOND module: Lexical Simplifier
     • Most frequent term substitution
        - Corpus of MedlinePlus website documents (1,536 documents): 939 drug package leaflets and 597 general health-related articles about diseases, effects and diagnoses
        - Elasticsearch is used to index the MedlinePlus documents
        - Hypothesis: complex terms should be less frequent in the corpus than simpler terms
        1) The frequency of each effect in the corpus is calculated
        2) An effect is replaced by its synonym with the highest frequency in the corpus, unless the effect itself is the most frequent
  29. PROOF OF CONCEPT: Lexical Simplification of Drug Package Leaflets
     SECOND module: Lexical Simplifier
     Synonyms from MedDRA that appear in the MedlinePlus corpus, with their frequencies:
        - catarro (nasopharyngitis): 12
        - resfriado (cold): 48
        - resfriado común (common cold): 7
        - síntomas de resfriado (cold symptoms): 6
     The complex term is replaced by resfriado (cold)
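The most frequent term strategy can be sketched with the counts from the slide above. The function name and the data structures are hypothetical. In the described system the frequencies come from an Elasticsearch index of the MedlinePlus corpus and the synonym sets from MedDRA, whereas this sketch uses a plain dictionary and a plain list as stand-ins.

```python
def most_frequent_synonym(term, synonym_sets, corpus_frequency):
    """Return the synonym of `term` with the highest corpus frequency.

    The term is returned unchanged when it belongs to no synonym set
    or when it is already the most frequent member of its set.
    """
    for synonyms in synonym_sets:
        if term in synonyms:
            return max(synonyms, key=lambda s: corpus_frequency.get(s, 0))
    return term


# Synonym set and corpus counts taken from the slide example.
SYNONYM_SETS = [
    ["catarro", "resfriado", "resfriado común", "síntomas de resfriado"],
]
CORPUS_FREQUENCY = {
    "catarro": 12,
    "resfriado": 48,
    "resfriado común": 7,
    "síntomas de resfriado": 6,
}

print(most_frequent_synonym("catarro", SYNONYM_SETS, CORPUS_FREQUENCY))
```

The preferred term strategy would differ only in the selection step: instead of taking the maximum by frequency, it would return the term that the terminology marks as preferred for the set.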
  30. PROOF OF CONCEPT: Lexical Simplification of Drug Package Leaflets
     SECOND module: Lexical Simplifier
     Original: Muy frecuentes: diarrea e indigestión. Frecuentes: náuseas, vómitos, dolor abdominal. Poco frecuentes: hemorragia. Raros: perforación gástrica, flatulencia, estreñimiento
     PT (preferred term substitution): Muy frecuentes: diarrea e dispepsia. Frecuentes: náuseas, vómitos, dolor abdominal. Poco frecuentes: hemorragia. Raros: perforación gástrica, flatulencia, estreñimiento
     Freq (most frequent term substitution): Muy frecuentes: diarrea e pirosis. Frecuentes: náuseas, vómitos, dolor abdominal. Poco frecuentes: sangrado. Raros: perforación gástrica, gases, estreñimiento
  31. CONCLUSIONS
     • For some people, it is difficult to infer the meaning of an unusual word or phrase from context
     • Long sentences and complex linguistic structures can create barriers to accessing text content, as indicated in the WCAG and E2R guidelines. However, these guidelines do not provide precise methods or (semi-)automatic support for addressing these accessibility issues of text readability and understandability
     • NLP approaches that use E2R and WCAG 2.0 resources provide semi-automatic support
        - Proof of concept: a prototype to simplify drug package leaflets that implements a component for lexical simplification
  32. CONCLUSIONS: Work in progress
     • New approaches to offer support: abbreviations, summaries, definitions of unusual words, etc.
     • Evaluations by users (and, in addition, by experts)
     • Taking into account other important issues such as:
        - Presentation elements
        - Page structure
        - Navigation structures
  33. REFERENCE
     Lourdes Moreno, Paloma Martínez, Isabel Segura-Bedmar, and Ricardo Revert. 2015. Exploring language technologies to provide support to WCAG 2.0 and E2R guidelines. In Proceedings of the XVI International Conference on Human Computer Interaction (Interacción '15). ACM, New York, NY, USA, Article 57, 8 pages. DOI=http://dx.doi.org/10.1145/2829875.2829927
