Luz Rello - Ph.D. Thesis presentation - DysWebxia: A Text Accessibility Model for People with Dyslexia

Ph.D. Presentation
Title: DysWebxia: A Text Accessibility Model for People with Dyslexia
Author: Luz Rello
Advisors: Ricardo Baeza-Yates and Horacio Saggion

Abstract: Worldwide, 10% of the population has dyslexia, a cognitive disability that reduces readability and comprehension of written information. The goal of this thesis is to make text more accessible for people with dyslexia by combining human-computer interaction validation methods and natural language processing techniques. In the initial phase of this study we examined how people with dyslexia identify errors in written text. Their written errors were analyzed and used to estimate the presence of text written by individuals with dyslexia on the Web. After concluding that dyslexic errors relate to presentation and content features of text, we carried out a set of experiments using eye tracking to determine the conditions that led to improved readability and comprehension. After finding the relevant parameters for text presentation and content modification, we implemented a lexical simplification system. Finally, the results of the investigation and the resources created led to a model, DysWebxia, that proposes a set of recommendations that have been successfully integrated in four applications.

Published in: Science, Education

  1. 1. DysWebxia: A Text Accessibility Model for People with Dyslexia. Luz Rello. Advisors: Ricardo Baeza-Yates (Web Research Group, Universitat Pompeu Fabra & Yahoo Labs Barcelona) and Horacio Saggion (Natural Language Processing Group, Universitat Pompeu Fabra, Barcelona). PhD Thesis Defense, 27th June 2014, Universitat Pompeu Fabra, Barcelona.
  2. 2. Outline. What? (Goal). Why? (Motivation). How? (Methodology, Understanding, Text Presentation, Text Content, Integration, Applications).
  3. 3. Main Goal: improve digital accessibility for people with dyslexia.
  4. 4. Secondary Goals. (1) To have a deeper understanding of dyslexia by analyzing how people with dyslexia read and write, using their misspelling errors as a starting point. (2) To find out the best text presentation parameters which benefit the reading performance (readability and comprehension) of people with dyslexia. (3) To find out the text content modifications that benefit the reading performance of people with dyslexia. (4) To propose a set of recommendations combining the positive results, and integrate them in reading applications for people with dyslexia.
  5. 5. Why? Dyslexia is a learning disability characterized by difficulties with accurate word recognition and by poor spelling and decoding abilities. As a side effect, this impedes the growth of vocabulary and background knowledge. Children with dyslexia tend to show signs of depression and low self-esteem. [Vellutino et al., 2004] [International Dyslexia Association, 2011] [Shaywitz, 2008]
  6. 6. Dyslexia: frequent and universal. Neurological origin; language-specific manifestations; 8.6% in Spanish (Canary Islands); 11.8% in Spanish (Murcia); 10 - 17.5% of the USA population; 10.8% of English-speaking children. School failure: most frequent signal; 15.2% in Europe; 25% in Spain; 4 of 6 cases are related to dyslexia. [International Dyslexia Association, 2011] [European Commission, 2011] [Eurostat, 2011] [Spanish Federation of Dyslexia, 2008] [Vellutino et al., 2004] [Brunswick, 2010] [Jiménez et al. 2009] [Carrillo et al. 2011] [National Academy of Sciences, 1987] [Shaywitz et al. 1992] How people with dyslexia read and what can HCI and NLP do about it? Keynote at DSAI 2013.
  7. 7. Dyslexia. Human right: information access; information democratization. Good for dyslexia, useful for all: benefits people without dyslexia; benefits other users, e.g. low vision. Right moment: digital format; eBook sales increased by 115.8% (January 2011). [Dixon, 2007] [McCarthy & Swierenga, 2010] [Evett & Brown, 2005] [United Nations Committee of the General Assembly, 2006] [Association of American Publishers, 2011]
  8. 8. How? A Multidisciplinary Challenge. Linguistics and Cognitive Neuroscience: Which problems do dyslexic people experience? Are there linguistic foundations? Natural Language Processing: How could NLP help dyslexic people? Human Computer Interaction: How could text presentation help people with dyslexia? Eye-tracking: How can we measure reading performance?
  9. 9. How? A Multidisciplinary Challenge. Eye-tracking: How can we measure reading performance?
  10. 10. How Do We Read? Eye Tracking! Every dot is a fixation point. See video here: https://www.youtube.com/watch?v=P1dRqpRi4cs
  11. 11. Methodology: Participants, Equipment. Participants with dyslexia (control group in parentheses): from 23 to 56 participants (same number); native Spanish speakers (idem); confirmed diagnosis of dyslexia; ages ranging from 11 to 56, average around 20 - 21 years depending on the experiment (mapped); participants with attention deficit disorder; frequent users of Internet and frequent readers (similar); education (similar). Equipment: Tobii T50 eye tracker (17-inch TFT monitor).
  12. 12. Methodology: Materials. Base texts: same genre; similar topics; same number of sentences; same number of words; similar average word length; same number of unique named entities and foreign words, and same number/type of numerical expressions; plus text modifications (independent variables). Controlled comprehension questionnaires: multiple choice tests; literal and inferential questions; correct, partially correct and wrong answers. Subjective ratings: five-point scales, for example Facilidad de comprensión ('ease of understanding'), from 1, muy fácil ('very easy'), to 5, muy difícil ('very difficult').
  13. 13. Methodology: Design. Design: within-subjects design; between-subject design (conditions in counterbalanced order). Dependent variables: qualitative data and quantitative data, collected with Likert scales, eye tracking and questionnaires. Statistical tests.
  14. 14. Outline. What? (Goal). Why? (Motivation). How? (Methodology, Understanding, Text Presentation, Text Content, Applications).
  15. 15. Understanding
  17. 17. How? A Multidisciplinary Challenge. Linguistics and Cognitive Neuroscience: Which problems do dyslexic people experience? Are there linguistic foundations?
  18. 18. Why Errors? Errors are a source of knowledge. Dyslexia: studying dyslexia; diagnosing dyslexia; accessibility tools. The Web: detecting spam; measuring quality. [Treiman, 1997] [Lindgrén & Laine, 2011] [Schulte-Körne et al. 1996] [Pedler, 2007] [Piskorski et al. 2008] [Gelman & Barletta, 2008] (Sections: Understanding, Text Presentation, Text Content, Integration.)
  19. 19. Dyslexia in the Web: English, Spanish. [Rello & Baeza-Yates, New Review of Hypermedia and Multimedia, 2012]
  20. 20. Are there Linguistic Foundations? Written errors by people with dyslexia. Analysis: visual and phonetic. [Rello & Llisterri, LDW 2012] [Rello, Baeza-Yates & Llisterri, LREC 2014]
  21. 21. How Do We Process Text? Test: "Please read this text. It is just an example but helps to underztand how we read text. A text can be legivle but this does not mean that it will be compreensible. Hence, we habe to take care about the presantation of a text as well as the lexical, syntactic, and semmantical levels of its content."
  22. 22. Does Lexical Quality Matter? Procedure: demographic questionnaire; writing/memory test; then Variant A or Variant B, each with comprehension tests on Text 1 and Text 2 (with or without 16% errors, depending on the variant) and an error perception test. Errors: 0 or 12/75 words (16% errors); dyslexic; unique; examples: priosridad, presupuetsos, indutricas, implse. Dependent measure: error awareness. [Rello & Baeza-Yates, WWW 2012 (poster)]
  23. 23. Results: Lexical Quality. Figure annotations: ρ = 0.799 (p < 0.001); Group D; no effects!; Group N; (p = 0.08). [Rello & Baeza-Yates, WWW 2012 (poster)]
  24. 24. How Fast Can You Read This? Olny srmat poelpe can raed tihs! "I cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg. Due to the phaonmneal pweor of the hmuan mnid, aoccdrnig to a raerscheer at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer are in the rghit pclae. The ruslet can be a taotl mses but you can sitll raed it wouthit a porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Amzanig huh? Yaeh and I awlyas tghuhot taht slpeling was ipmorantt!"
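The jumbled passage on this slide keeps the first and last letter of each word and permutes the letters in between. A minimal sketch of that transformation follows; the function names and the seeded generator are illustrative assumptions, and nothing here is claimed to be how the experimental texts were actually produced.

```python
import random
import re


def jumble_word(word, rng):
    """Shuffle the interior letters of a word, keeping first and last in place."""
    if len(word) <= 3:
        return word  # fewer than two interior letters: nothing to permute
    interior = list(word[1:-1])
    rng.shuffle(interior)
    return word[0] + "".join(interior) + word[-1]


def jumble_text(text, seed=0):
    """Jumble every run of ASCII letters; spaces and punctuation pass through."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    return re.sub(r"[A-Za-z]+", lambda m: jumble_word(m.group(0), rng), text)


if __name__ == "__main__":
    print(jumble_text("Only smart people can read this"))
```

A shuffle can occasionally return the original order, so short words in particular may come out unchanged.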
  25. 25. How Well Do We Process Text? How important is the order in our internal representation of words? [Figure: comprehension score (%), axis from 50 to 100, against words with errors (no errors, 8% errors, 16% errors, 50% errors), for readers with and without dyslexia.] Reading time also increases with words with errors. [Baeza-Yates & Rello, to be submitted, 2014]
  26. 26. Do They See the Errors? See video here: https://www.youtube.com/watch?v=P1dRqpRi4cs
  27. 27. Contributions. The presence of errors written by people with dyslexia in a text does not affect the reading performance of people with dyslexia, while it does affect people without dyslexia. Normal (correctly written) texts present more difficulties for people with dyslexia than for people without dyslexia; in contrast, texts with jumbled letters present similar difficulties for people with and without dyslexia. Lexical quality is a good indicator of text readability and comprehensibility, except for people with dyslexia. Written errors by people with dyslexia are phonetically and visually motivated: the most frequent errors involve letters without a one-to-one correspondence between grapheme and phone, most substitution errors share phonetic features, and the letters involved tend to have certain visual features, such as mirror and rotation features. The rate of dyslexic errors is independent of the rate of spelling errors in web pages: around 0.67% and 0.43% of the errors in the Web are dyslexic errors for English and Spanish, respectively. These rates are smaller than expected, probably due to spelling correction aids. References: Rello L., Baeza-Yates R., and Llisterri J. DysList: An Annotated Resource of Dyslexic Errors. In: Proc. LREC'14. Reykjavik, Iceland; 2014. p. 26–31. Rello L. and Llisterri J. There are Phonetic Patterns in Vowel Substitution Errors in Texts Written by Persons with Dyslexia. In: 21st Annual World Congress on Learning Disabilities (LDW 2012). Oviedo, Spain; 2012. p. 327–338. Rello L. and Baeza-Yates R. The Presence of English and Spanish Dyslexia in the Web. New Review of Hypermedia and Multimedia. 2012;8. p. 131–158.
  28. 28. Text Presentation
  29. 29. How? A Multidisciplinary Challenge. Human Computer Interaction: How could text presentation help people with dyslexia? Linguistics and Cognitive Neuroscience: Which problems do dyslexic people experience? Are there linguistic foundations? Natural Language Processing: How could NLP help dyslexic people? Eye-tracking: How can we measure reading performance?
  30. 30. How? A Multidisciplinary Challenge. Human Computer Interaction: How could text presentation help people with dyslexia?
  31. 31. Conditions Studied (Text Presentation): font type; font size; font grey scale and background grey scale; color pairs; character spacing; line spacing; paragraph spacing; column width.
  32. 32. Why Fonts? What has been done so far? Recommendations: the British Dyslexia Association recommends sans-serif fonts, Arial, no italics, no fancy fonts. Fonts designed for dyslexia: Sylexiad, OpenDyslexic, Dyslexie & Read Regular. User studies: Arial and Dyslexie; word-reading test; 21 students [De Leeuw, 2010]. What is missing? Evidence via quantitative data; participants; more fonts; most frequent fonts [Rello & Baeza-Yates, ASSETS 2013].
  33. 33. Methodology: Design. Independent variables: Italics (roman vs. italic); Serif (sans serif vs. serif); Spacing (monospace vs. proportional); Dyslexic (specially designed vs. not specially designed). [Rello & Baeza-Yates, ASSETS 2013]
  34. 34. Methodology: Design [Rello & Baeza-Yates, ASSETS 2013]. Table 9.2: Methodological summary for the Font Experiment. Design: within-subjects. Independent variables: Font Type (Arial, Arial Italic, Computer Modern Unicode (CMU), Courier, Garamond, Helvetica, Myriad, OpenDyslexic, OpenDyslexic Italic, Times, Times Italic, Verdana), grouped by [±Italic], [±Serif], [±Monospace], [±Dyslexic] and [±Dyslexic It.], each with a [−] and a [+] level. Dependent variables: Reading Time (objective readability); Fixation Duration; Preference Rating (subjective preferences). Control variable: Comprehension Score (objective comprehensibility). Participants: Group D (48 participants): 22 female, 26 male; age range from 11 to 50 (x̄ = 20.96, s = 9.98); education: high school (26), university (19), no higher education (3). Group N (49 participants): 28 female, 21 male; age range from 11 to 54 (x̄ = 29.20, s = 9.03); education: high school (17), university (27), no higher education (5). Materials: texts: 12 story beginnings; comprehension questionnaire: 12 literal items (1 item/text); preferences questionnaire: 12 items (1 item/condition). Equipment: eye tracker Tobii 1750. Procedure: instructions, demographic questionnaire, reading task (× 12), comprehension questionnaire (× 12), preferences questionnaire (× 12). Base texts (comparable): same genre; same discourse structure; same number of sentences: 11; same number of words: 60; similar word length (from 4.92 to 5.87 letters); no acronyms, foreign words, or numerical expressions. 12 different texts, 12 different fonts (counter-balanced).
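The slide says only that the 12 fonts were presented in counter-balanced order. One common way to do that is a cyclic Latin square, in which each font appears once in every presentation position across 12 consecutive participants. The sketch below assumes that scheme; the slide does not state which one was used. The font list comes from the table above, and the function names are hypothetical.

```python
FONTS = [
    "Arial", "Arial Italic", "CMU", "Courier", "Garamond", "Helvetica",
    "Myriad", "OpenDyslexic", "OpenDyslexic Italic", "Times",
    "Times Italic", "Verdana",
]


def latin_square_orders(conditions):
    """Return n presentation orders; row r is the list rotated left by r."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


def order_for_participant(participant_index, conditions=FONTS):
    """Pick the presentation order for a participant (assumed assignment rule)."""
    orders = latin_square_orders(conditions)
    return orders[participant_index % len(conditions)]


if __name__ == "__main__":
    for p in range(3):
        print(p, order_for_participant(p)[:4], "...")
```

A cyclic square balances position but not which font follows which; a design that also balances carry-over effects would need a different construction.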
  35. 35. Results: Fixation Duration. D group. Fixation Duration: χ²(11) = 93.63, p < 0.001.
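The 11 degrees of freedom match a comparison across the 12 font conditions. Assuming the statistic comes from a Friedman test on repeated measures (the slide does not name the test), it can be computed from within-participant ranks as sketched below. The sketch uses average ranks for ties but omits the tie correction factor, and the toy numbers in the usage example are invented, not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..k, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """rows[p][c] is the measurement of participant p under condition c.

    Returns (chi_square, degrees_of_freedom) without tie correction.
    """
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for c, rank in enumerate(average_ranks(row)):
            rank_sums[c] += rank
    chi_square = (12.0 / (n * k * (k + 1))) * sum(s * s for s in rank_sums)
    chi_square -= 3.0 * n * (k + 1)
    return chi_square, k - 1


if __name__ == "__main__":
    # Invented fixation durations (seconds) for 4 participants x 3 conditions.
    toy = [[0.21, 0.24, 0.30], [0.19, 0.22, 0.27],
           [0.23, 0.25, 0.31], [0.20, 0.26, 0.29]]
    print(friedman_statistic(toy))
```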
  39. 39. Results. Partial order obtained from Reading Time and Preference Ratings, D group. [Rello & Baeza-Yates, ASSETS 2013]
  40. 40. Results [Rello & Baeza-Yates, ASSETS 2013]. Font types have an impact on readability of people (with/out dyslexia). OpenDys and OpenDys It. did not lead to a better or worse read. Values with positive effects, by condition and measure (with dyslexia / without dyslexia). Font Type, Obj. Readability: Arial, Courier, CMU, Helvetica / Arial, Courier, CMU, Verdana. Font Type, Preferences: Verdana, Helvetica, Arial / Verdana, Helvetica, Arial. Recommendation: Arial, Courier, CMU, Helvetica, and Verdana. Font Face, Obj. Readability: roman, sans serif, monospaced / roman, sans serif, monospaced. Font Face, Preferences: roman, sans serif, no effects / roman, no effects, proportional. Recommendation: roman, sans serif and monospaced. Font Size, Obj. Readability: 18, 22 and 26 points / 18, 22 and 26 points. Font Size, Obj. Comprehensibility: 18, 22 and … / 14, 18, 22 and …
  41. 41. Text Presentation: Conditions [Rello, Kanvinde & Baeza-Yates, W4A 2012]
      - Font type
      - Font size: 14 p., 18 p., 22 p., 24 p.
      - Font grey scale & background grey scale: 0%, 25%, 50%, 75%
      - Color pairs: black/white, off-black/off-white, black/yellow, blue/white, black/creme, dark brown/light mucky green, brown/mucky green, blue/yellow
      - Character spacing: +14%, +7%, 0%, −7%
      - Line spacing
      - Paragraph spacing
      - Column width
  42. 42. Text Presentation: Web [Rello, Pielot, Marcos & Carlini, W4A 2013]
  43. 43. Contributions
      - Larger font sizes improve readability, especially for people with dyslexia.
      - Larger character spacing improves readability for people with and without dyslexia.
      - For reading web text, a font size of 18 points ensures good subjective and objective readability and comprehensibility.
      - Sans serif, monospaced, and roman font types increase readability for people with and without dyslexia, while italic fonts decrease it.
      - Good fonts for people with dyslexia are Helvetica, Courier, Arial, Verdana and CMU, taking into consideration both reading performance and subjective preferences.
      References:
      - Rello, L. and Baeza-Yates, R. Good Fonts for Dyslexia. In: Proc. ASSETS '13. Bellevue, Washington, USA: ACM Press; 2013.
      - Rello, L. and Baeza-Yates, R. How to Present more Readable Text for People with Dyslexia. An eye-tracking study on text colors, size and spacings. To appear in Universal Access in the Information Society (UAIS).
      - Rello, L., Kanvinde, G. and Baeza-Yates, R. Layout guidelines for web text and a web service to improve accessibility for dyslexics. In: Proc. W4A 2012. Lyon, France: ACM Press; 2012.
      - Rello, L., Pielot, M., Marcos, MC. and Carlini, R. Size Matters (Spacing not): 18 Points for a Dyslexic-friendly Wikipedia. In: Proc. W4A '13. Rio de Janeiro, Brazil: ACM Press; 2013.
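The presentation findings on this slide could be turned into a style rule along the following lines. This is only a minimal sketch, not part of the thesis tools: the function name and parameters are hypothetical, the mapping of "+7% character spacing" to a CSS `letter-spacing` of 0.07em is an assumption, and CMU is left out of the font stack because its installed family name varies between systems.

```python
def dyslexia_friendly_css(selector="body", font_size_pt=18, char_spacing_pct=7):
    """Build a CSS rule from the slide's recommendations (illustrative only)."""
    # Fonts reported as good for readers with dyslexia; CMU omitted (assumption above).
    fonts = ["Helvetica", "Arial", "Verdana", "Courier"]
    family = ", ".join(fonts) + ", sans-serif"
    # Assumption: express the percentage spacing increase as a fraction of an em.
    spacing_em = char_spacing_pct / 100
    return (
        f"{selector} {{\n"
        f"  font-family: {family};\n"
        f"  font-size: {font_size_pt}pt;\n"
        f"  font-style: normal;\n"  # roman rather than italic
        f"  letter-spacing: {spacing_em:.2f}em;\n"
        f"}}"
    )


if __name__ == "__main__":
    print(dyslexia_friendly_css())
```

Calling `dyslexia_friendly_css("article", 22, 14)` would give the larger size and spacing values that were also reported as beneficial.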
  44. 44. Text Content
  45. 45. How? A Multidisciplinary Challenge
      - Natural Language Processing: How could NLP help dyslexic people?
      - Cognitive Neuroscience: Which problems do dyslexic people experience?
      - Linguistics: Are there linguistic foundations?
      - Human Computer Interaction: How could text presentation help people with dyslexia?
      - Eye tracking: How can we measure reading performance?
  46. 46. How? A Multidisciplinary Challenge
      - Natural Language Processing: How could NLP help dyslexic people?
  47. 47. Problems of Dyslexia (Cognitive Neuroscience: surface dyslexia and phonological dyslexia)
      Lexicon & Syntax
      - Less frequent words: prístino
      - Long words: colecciones
      - Substitutions of functional words: para, por
      - Confusions of small words: en, el, es
      - New words: chocaviar
      - Pseudo-words and non-words: maledo
      Morphology
      - Derivational errors: *inmacularidad
      Orthography
      - Orthographically similar words: homo, horno
      - Alternation of different typographical cases: ElefANte
      Phonology
      - Irregular words: vase
      - Homophonic words or pseudo-homophonic words
      - Foreign words
      Discourse
      - Long sentences
      - Long paragraphs
  48. 48. How Can NLP Help?
      Difficulties
      - Orthography & Phonology: orthographically similar words, misspellings, irregular words, homophonic words, pseudo-homophonic words, foreign words
      - Morphology, Lexicon & Syntax: derivational errors, new words, pseudo-words, less frequent words, long words, functional words, small words
      - Discourse: long sentences, long paragraphs
      Strengths
      - Strong visual thinkers: pattern recognition, visual thinking
      NLP
      - Orthographic and phonetic similarity measures
      - Corpus analyses
      - Lexical simplification
      - Syntactic simplification
      - Discourse simplification
      Content conditions
      - Errors
      - Word frequency
      - Word length
      - Numerical representation
      - Paraphrases
      - Graphical schemes
      - Keywords
  49. 49. Methodology: Design (Word Frequency and Word Length Experiments)
      Examples
      - [+LONG] prestidigitador vs. [−LONG] mago (3.75 times shorter)
      - [+FREQUENT] ataques (474 times more frequent) vs. [−FREQUENT] refriegas
      Design: within-subjects
      Independent variables
      - Word Frequency Experiment: [±Frequent], with levels [+Frequent] and [−Frequent]
      - Word Length Experiment: [±Long], with levels [+Long] and [−Long]
      Dependent variables (Sec. 3.1.1)
      - Reading Time (objective readability)
      - Fixation Duration
      - Comprehension Score (objective comprehensibility)
      Participants
      - Group D (23 participants): 12 female, 11 male. Age: range from 13 to 37 (x̄ = 20.74, s = 8.18). Education: high school (11), university (10), no higher education (2). Reading: more than 8 hours/day (13.0%), 4-8 hours/day (39.1%), less than 4 hours/day (47.8%).
      - Group N (23 participants): 13 female, 10 male. Age: range from 13 to 35 (x̄ = 20.91, s = 7.33). Education: high school (6), university (16), no higher education (1). Reading: more than 8 hours/day (4.3%), 4-8 hours/day (52.2%), less than 4 hours/day (43.5%).
      Materials
      - Texts: 4 texts (2 texts/experiment)
      - Synonym pairs: 15 in the Word Frequency Experiment, 6 in the Word Length Experiment
      - Comprehension questionnaires: 8 inferential items (2 items/text)
      Equipment: Tobii 1750 eye tracker
      Procedure (per experiment): instructions, demographic questionnaire, reading task (× 2), comprehension questionnaire (× 2), and preferences questionnaire (× 2)
      Criteria
      - Target words: common nouns, non-ambiguous nouns, no compound nouns, no foreign words, no homophonic words
      - Base texts: comparable
      - Frequency: relative frequencies (one order of magnitude), no short words
      - Length: at least double the length, longest words
      - Comprehension questionnaires: inferential questions
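The synonym-pair criteria on this slide (length at least double, frequency differing by one order of magnitude) can be sketched as two small checks. This is an illustrative sketch only: the function names are hypothetical, length is counted in characters, and "one order of magnitude" is read as a ratio of at least 10; neither reading is stated on the slide.

```python
def length_ratio(long_word: str, short_word: str) -> float:
    """How many times longer the long word is, counted in characters (assumption)."""
    return len(long_word) / len(short_word)


def meets_length_criterion(long_word: str, short_word: str) -> bool:
    """Slide criterion: the long word has at least double the length."""
    return length_ratio(long_word, short_word) >= 2.0


def meets_frequency_criterion(freq_frequent: float, freq_rare: float) -> bool:
    """Slide criterion, read here as a frequency ratio of at least 10 (assumption)."""
    return freq_frequent / freq_rare >= 10.0


if __name__ == "__main__":
    # Word pair taken from the slide; 15 letters versus 4 letters.
    print(length_ratio("prestidigitador", "mago"))
    # The slide reports "ataques" as 474 times more frequent than "refriegas".
    print(meets_frequency_criterion(474, 1))
```

With the slide's own example pair, the length ratio works out to 15 / 4 = 3.75, which matches the "3.75 shorter" annotation.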
  50. 50. Results: Word Frequency [Rello, Baeza-Yates, Dempere & Saggion, INTERACT 2013]
      [Figure: mean fixation duration (s) against reading time (s) along a readability axis, for Group N and Group D under the [+Frequent] and [−Frequent] conditions]
      - A larger number of high-frequency words increases readability for people with dyslexia.
        Reading Time: t(33.488) = −2.120, p = 0.035
        Fixation Duration: t(35.741) = −2.150, p = 0.038
      - No effects for Group N
  51. 51. Results: Word Length [Rello, Baeza-Yates, Dempere & Saggion, INTERACT 2013]
      - The presence of short words compared to long words increases comprehensibility for people with dyslexia.
        Comprehension Score: t(38.636) = −2.396, p = 0.022
      - No effects for Group N
      Limitations
      - A total dissociation of frequency and length is not possible
      - Word frequency and word length are naturally related in language [Jurafsky et al., 2001]
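The fractional degrees of freedom in the word-frequency and word-length results (for example t(38.636)) are consistent with Welch's unequal-variance t-test, although the slides do not name the test. Under that assumption, the statistic and its Welch-Satterthwaite degrees of freedom can be sketched as follows; the function name and the sample data are hypothetical and are not the study's measurements.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test on sequences a and b."""
    # Squared standard error contributed by each sample.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation; usually not an integer.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


if __name__ == "__main__":
    # Made-up numbers purely to exercise the function.
    t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
    print(f"t({df:.3f}) = {t:.3f}")
```

The p-value would then come from the t distribution with `df` degrees of freedom, which needs a statistics library and is left out of this sketch.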
  52. 52. Next Steps? Lexical Simplification
      - Find out how to make lexical simplification useful
      - Implement and evaluate a lexical simplification algorithm
  53. 53. Lexical Simplification [Rello, Baeza-Yates, Bott & Saggion, W4A 2013 (best paper award)]
      What has been done so far?
      - Experimental psychology and word processing
      - Accessibility studies about people with dyslexia
      - Natural language processing and lexical simplification: automatic; detect complex words (frequency); substitute using dictionaries, WordNet, ontologies
      What is missing?
      - Spanish
      - Word length
      - Interaction strategies
      Content: frequent & long words
      Design
  54. 54. Evaluation of Simplification Strategies [Rello, Baeza-Yates, Bott & Saggion, W4A 2013 (best paper award)]
      - Independent variable (counter-balanced order): lexical simplification strategy, with conditions ORIGINAL, SUBSBEST, SHOWSYNS, GOLD
      - Devices: laptop, iPad, Android device
  55. 55. Methodology: Design (Lexical Simplification Experiment) [Rello, Baeza-Yates, Bott & Saggion, W4A 2013 (best paper award)]
      Base texts
      - Same genre: Scientific American
      - Similar topics: reports from Nature
      - Same discourse structure: paragraphs 1 & 2, intro; paragraph 3, background; paragraph 4, details
      - Same number of sentences: 11
      - Same number of words: 302
      - No acronyms nor numbers
      Engagement: choose the text you like!
      Design: within-subjects
      Independent variable: lexical simplification strategy, with levels [Orig], [SubsBest], [ShowSyns], [Gold]
      Dependent variables
      - Reading Time (objective readability)
      - Fixation Duration
      - Comprehension Score (objective comprehensibility)
      - Subjective Readability Rating (subjective readability)
      - Subjective Comprehension Rating (subjective comprehensibility)
      - Subjective Memorability Rating (subjective memorability)
      Participants
      - Group D (47 participants): 28 female, 19 male. Age: range from 13 to 50 (x̄ = 24.36, s = 10.19). Education: high school (18), university (26), no higher education (3).
      - Group N (49 participants): 29 female, 20 male. Age: range from 13 to 40 (x̄ = 28.24, s = 7.24). Education: high school (16), university (31), no higher education (2).
      Materials
      - Base texts: 2 texts
      - Word substitutions: 34 per text (in [SubsBest]), and 40/44 per text (in [Gold])
      - Synonyms on demand: 100/110 synonyms for 50/55 words per text (in [ShowSyns])
      - Comprehension questionnaire: 6 inferential items (3 per text)
      - Subjective readability questionnaire: 2 Likert scales (1/condition level)
      - Subjective comprehension questionnaire: 2 Likert scales (1/condition level)
      - Subjective memorability questionnaire: 2 Likert scales (1/condition level)
      Equipment: Tobii 1750 eye tracker, Samsung Galaxy Ace S5830, iPad 2, and MacBook Air
      Procedure: instructions, demographic questionnaire, text choosing, reading task, comprehension questionnaires, subjective readability questionnaire, subjective comprehension questionnaire, and subjective memorability questionnaire
  56. 56. Results: Objective Measures [Rello, Baeza-Yates, Bott & Saggion, W4A 2013 (best paper award)]
      - Group D and Group N: no effects!
      - Correlations shown: r = 0.625, r = 0.994, r = 0.429
  57. 57. Results: Subjective Measures [Rello, Baeza-Yates, Bott & Saggion, W4A 2013 (best paper award)]
      [Figure: average ratings from 1 (very bad) to 5 (very good) for Readability, Understandability and Memorability, Group D and Group N, under the conditions [Original], [SubsBest], [ShowSyns] and [Gold]]
      Subjective Readability, Subjective Comprehension
      - H(3) = 9.595, p = 0.022: [SubsBest] more difficult than [Original] (p = 0.003) and [ShowSyns] (p = 0.047)
      - H(3) = 9.020, p = 0.029: [SubsBest] significantly more difficult than [Gold] (p = 0.003)
      Subjective Comprehension, Subjective Memorability
      - H(3) = 8.275, p = 0.041: [ShowSyns] easier than [Gold] (p = 0.034) and [Original] (p = 0.034)
      - H(3) = 12.197, p = 0.007: [ShowSyns] easier than [SubsBest] (p = 0.013) and [Original] (p = 0.001)
  58. 58. Results: Lexical Simplification [Rello, Baeza-Yates, Bott & Saggion, W4A 2013 (best paper award)]
      - Substitution does not help objective readability or comprehension, and negatively affects the reading experience (subjective measures).
      - Interaction matters: showing synonyms on demand makes texts more comprehensible and more readable.
      - Help to get out of the vicious circle.
  59. 59. Next Steps? Lexical Simplification
      - Lexical simplification via synonyms on demand is helpful
      - Implement and evaluate a lexical simplification algorithm
      - Language resource of synonyms available to be used in tools
  60. 60. Resources for Lexical Simplification in Spanish [Baeza-Yates, Rello & Dembowski, to be submitted]
      What has been done so far?
      - No Simple Wikipedia in Spanish
      - Simplext Corpus (200 news articles): 6,595 words original and 3,912 words simplified
      - Spanish OpenThesaurus (SpOT): 21,378 target words (lemmas), 44,348 different word senses
      - EuroWordNet: 50,526 word meanings, 23,370 synsets
      What is missing?
      - A resource containing lists of synonyms ranked by their complexity
  61. 61. Context Aware Synonym Simplification Algorithm (CASSA) [Baeza-Yates, Rello & Dembowski, to be submitted]
      Input: word candidates
      - Filters: valid words, proper names, stop words; plus lemmatization
      - Complexity detection: list of senses (from Spanish OpenThesaurus), web frequencies (relative web frequency)
      - Word sense disambiguation: list of senses, Google Books n-gram corpus, context frequencies (context frequency)
      - Simpler synonyms ranking: relative web frequency
      Output: CASSA resource
      - Dyslexia features: analysis of corpus of dyslexic errors + CASSA
      Data: Google Books N-gram Corpus (5-grams) in Spanish (8,116,746 books, over 6% of all books, 83,967,471,303 tokens)
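The pipeline on this slide (disambiguate the sense from context frequencies, then rank synonyms by relative frequency) might be sketched roughly as below. This is not the authors' implementation: the function, its parameters, the scoring rule and all the toy data are hypothetical, and the real algorithm draws on Spanish OpenThesaurus senses, web frequencies and Google Books 5-gram counts.

```python
def pick_simpler_synonym(target, context, senses, web_freq, context_freq):
    """Pick a more frequent synonym of `target` from its best-matching sense.

    senses:       list of synonym lists, each containing the target (toy thesaurus)
    web_freq:     word -> relative frequency (toy values)
    context_freq: (word, context_word) -> co-occurrence count (toy n-gram statistics)
    """
    def sense_score(sense):
        # Assumed disambiguation rule: total co-occurrence of the sense's words
        # with the words surrounding the target.
        return sum(context_freq.get((w, c), 0) for w in sense for c in context)

    best_sense = max(senses, key=sense_score)
    # Keep only synonyms that are more frequent (treated as simpler) than the target.
    candidates = [
        w for w in best_sense
        if w != target and web_freq.get(w, 0) > web_freq.get(target, 0)
    ]
    if not candidates:
        return None  # the target is already the most frequent word in its sense
    return max(candidates, key=lambda w: web_freq[w])


if __name__ == "__main__":
    # Invented senses and counts for illustration only.
    senses = [["refriegas", "ataques"], ["refriegas", "reprimendas"]]
    web_freq = {"refriegas": 1, "ataques": 474, "reprimendas": 20}
    context_freq = {("ataques", "militares"): 50, ("reprimendas", "militares"): 1}
    print(pick_simpler_synonym("refriegas", ["militares"], senses, web_freq, context_freq))
```

Dropping the `sense_score` step and ranking every synonym by frequency alone would correspond to a frequency-only baseline, again under these toy assumptions.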
  62. 62. CASSA Synonyms Resource for Spanish [Baeza-Yates, Rello & Dembowski, to be submitted]
      - CASSA disambiguated
      - CASSA baseline (Frequency)
  63. 63. Methodology: Design [Rello & Baeza-Yates, W4A 2014 (best paper award runner-up)]
      Evaluation dataset
      - 80 target words (HIGH freq., LOW freq.), vs. 130 [Biran et al. 2011] and 200 [Yatskar et al. 2010]
      - Contexts and sentences (20th, 21st century books)
  64. 64. Results: Synonymy & Simplicity [Rello & Baeza-Yates, W4A 2014 (best paper award runner-up)]
      - Ratings of Group N significantly higher than Group D for all the conditions
      - Low frequency: better results for all ratings and conditions
      - CASSA: more accurate and simpler synonyms
        Synonymy Rating (groups D & N): H(1) = 110.36, p < 0.001; H(1) = 198.72, p < 0.001
        Simplicity Rating (groups D & N): H(1) = 131.76, p < 0.001; H(1) = 179.82, p < 0.001
      - Test well calibrated:
        expected low value answers: 1.41 (s = 0.98) for Group D, 1.47 (s = 0.51) for Group N
        expected high value answers: 8.77 (s = 0.93) for Group D, 9.16 (s = 0.69) for Group N
      - New algorithm, CASSA, outperforms the hard-to-beat Frequency Baseline [Specia et al. 2012]
  65. 65. Text Content: Conditions Studied
      - Word frequency
      - Word length
      - Numerical representation
      - Paraphrases
      - Graphical schemes
      - Keywords
  66. 66. Contributions
      - Frequent words improve readability while shorter words may improve comprehensibility, especially in people with dyslexia.
      - Numbers represented as digits instead of words, as well as percentages instead of fractions, improve readability for people with dyslexia.
      - Graphical schemes improve the subjective readability and comprehensibility of texts for people with dyslexia.
      - Highlighted keywords increase objective comprehension by people with dyslexia, but not readability.
      - Lexical simplification via automatic substitution of complex words by simpler synonyms is not helpful. However, showing synonyms on demand improves subjective readability and comprehensibility for people with dyslexia.
      References:
      - Rello, L., Baeza-Yates, R., Dempere, L. and Saggion, H. Frequent Words Improve Readability and Short Words Improve Understandability for People with Dyslexia. In: Proc. INTERACT '13. Cape Town, South Africa: IFIP Press; 2013, p. 203-219.
      - Rello, L., Bautista, S., Baeza-Yates, R., Gervás, P., Hervás, R. and Saggion, H. One Half or 50%? An Eye-Tracking Study of Number Representation Readability. In: Proc. INTERACT '13. Cape Town, South Africa: IFIP Press; 2013, p. 229-245.
      - Rello, L., Baeza-Yates, R., Bott, S. and Saggion, H. Simplify or Help? Text Simplification Strategies for People with Dyslexia. In: Proc. W4A '13. Rio de Janeiro, Brazil: ACM Press; 2013 (best paper award).
      - Rello, L. and Baeza-Yates, R. Evaluation of DysWebxia: A Reading App Designed for People with Dyslexia. In: Proc. W4A '14. Seoul, South Korea: ACM Press; 2014 (Chapter 15 [319], best paper nominee).
  67. 67. Integrating Form and Content
  68. 68. Text Presentation Recommendations [Rello & Baeza-Yates, to appear in Universal Access in the Information Society (UAIS)]
      Values with positive effects, by condition and measure:
      Font Type
        Obj. Readability. With dyslexia: Arial, Courier, CMU, Helvetica. Without dyslexia: Arial, Courier, CMU, Verdana.
        Preferences. With dyslexia: Verdana, Helvetica, Arial. Without dyslexia: Verdana, Helvetica, Arial.
        Recommendation: Arial, Courier, CMU, Helvetica, and Verdana.
      Font Face
        Obj. Readability. With dyslexia: roman, sans serif, monospaced. Without dyslexia: roman, sans serif, monospaced.
        Preferences. With dyslexia: roman, sans serif, no effects. Without dyslexia: roman, no effects, proportional.
        Recommendation: roman, sans serif and monospaced.
      Font Size
        Obj. Readability. With dyslexia: 18, 22 and 26 points. Without dyslexia: 18, 22 and 26 points.
        Obj. Comprehensibility. With dyslexia: 18, 22 and 26 points. Without dyslexia: 14, 18, 22 and 26 points.
        Subj. Readability. With dyslexia: 18 and 22 points. Without dyslexia: 18 and 22 points.
        Subj. Comprehensibility. With dyslexia: 18, 22 and 26 points. Without dyslexia: 14, 18, 22 and 26 points.
        Recommendation: 18 and 22 points.
      Character Spacing
        Obj. Readability. With dyslexia: +7%, +14%. Without dyslexia: +7%, +14%.
        Preferences. With dyslexia: no effects. Without dyslexia: 0%.
  69. 69. Text Presentation Recommendations [Rello & Baeza-Yates, to appear in Universal Access in the Information Society (UAIS)]
  70. 70. Text Content Recommendations
      [Rello, Baeza-Yates, Dempere & Saggion, INTERACT 2013]
      [Rello, Bautista, Baeza-Yates, Gervás, Hervás & Saggion, INTERACT 2013]
  71. 71. Text Content Recommendations
      [Rello, Baeza-Yates & Saggion, CICLing 2013]
      [Rello, Saggion & Baeza-Yates, PITR 2014]
      [Rello, Baeza-Yates, Saggion & Graells, PITR 2012]
      [Rello, Baeza-Yates, Bott & Saggion, W4A 2013]
      [Rello & Baeza-Yates, W4A 2014]
  72. 72. How? Applications: IDEAL e-Book reader
  73. 73. IDEAL eBook Reader [Kanvinde, Rello & Baeza-Yates, ASSETS 2012 (demo)]
      Accessible Systems, Mumbai, India
      - 35,000 downloads
      - Finalist, Vodafone Foundation Smart Accessibility Awards 2012
      - Usability evaluation: 14 participants
      Features
      - Table of contents
      - Supports text-to-speech technology
      - Spells word-by-word or letter-by-letter
      - Write a comment
      Google Play: https://play.google.com/store/apps/details?id=org.easyaccess.epubreader
  74. 74. iOS Reader [Rello, Baeza-Yates, Saggion, Bayarri & Barbosa, ASSETS 2013 (demo)]
      - Soon in the App Store
      - Usability evaluation with 12 participants
      - Screen labels: 'Simpler', Ideal Configuration, Font (Helvetica), Synonyms, Color
  75. 75. Text4all DysWebxia [Rello, Baeza-Yates, Bott, Saggion, Carlini, Bayarri, Gorriz, Kanvinde, Gupta, Topac 2013 (challenge)] [Topac 2014 (PhD thesis)]
      - By Vasile Topac, Polytechnic University of Timisoara, Romania
      - Finalist in The Paciello Group Web Accessibility Challenge
      http://www.text4all.net/dyswebxia.html
  76. 76. Tools Overview
  77. 77. Ongoing Work
      - Departament d'Ensenyament (Àrea de Tecnologies per a l'Aprenentatge i el Coneixement): Department of Education (Technologies for Learning)
      - Cloud4All Project with Technosite
      - Web standards
  78. 78. Main Contributions
      - Dyslexic errors: written errors; processed differently (reading) by people with and without dyslexia; phonetically and visually motivated.
      - A new model called DysWebxia, which combines all our results and has been integrated so far in four reading tools: Text Content Recommendations and Text Presentation Recommendations.
      - Two new available language resources (http://www.luzrello.com/Resources):
        DysList, a list of dyslexic errors annotated with linguistic, phonetic and visual features.
        CASSA List, a new resource for Spanish lexical simplification composed of a list of disambiguated complex words, their context, and their corresponding simpler synonyms, ranked by complexity.
  79. 79. Acknowledgments
      Ricardo Baeza-Yates, Horacio Saggion, Gaurang Kanvinde, Vasile Topac, Joaquim Llisterri, Mari-Carmen Marcos, Laura Dempere, Simone Barbosa, Clara Bayarri, Stefan Bott, Roberto Carlini, Yolanda Otal de la Torre, María Sanz-Pastor Moreno de Alborán, Luis Miret, Martin Pielot, Julia Dembowski, Eduardo Graells, Diego Saez-Trumper, Azuki Gorriz, Verónica Moreno
      Families with children with dyslexia
      People with dyslexia
  80. 80. Thank you. luzrello@acm.org
