Towards A Real-Time System for Finding and Reading Signs for Visually Impaired Users

  1. Towards A Real-Time System for Finding and Reading Signs for Visually Impaired Users
     James Coughlan, Ph.D.
  2. Informational signs
     Signs are ubiquitous indoors and outdoors
     Useful for wayfinding, finding shops and businesses, and accessing a variety of services
     But nearly all are inaccessible to blind and visually impaired persons!
  3. OCR (Optical Character Recognition)
     Originally developed for clear images of text documents, acquired by a flatbed scanner
     Not equipped to find text in an image with lots of non-text clutter (buildings, trees, etc.)
  4. Portable OCR for visually impaired users
     Smartphone (Nokia N82) implementation: kReader Mobile, knfbReader Mobile (K-NFB Reading Technology, Inc.)
  5. kReader Mobile limitation
     Assumes text comprises all (or most) of the image:
     "Get as close to the text as you can without cutting off any text, as it is displayed on the screen"
     "Distance from the target can greatly affect the text recognition quality. Most, but not all, documents should be approximately 10 inches from the Reader." (KNFB Mobile Reader User Guide)
  6. Related work
     Much research on computer vision algorithms for finding text in cluttered images
     A very challenging problem
     Even if text is correctly located in an image, there are many problems with OCR:
     • non-standard fonts
     • poor illumination
     • curved surfaces, perspective distortion
     • other forms of noise in images
  7. Related work (continued)
     Some smartphone apps find text, read it, and translate it in real time
  8. Related work (continued)
     A small amount of work targeted specifically at finding and reading text for blind and visually impaired persons:
     • C. Yi & Y. Tian, 2011
     • "Smart Telescope" project from Blindsight Corporation (www.blindsight.com): find text regions and present enlarged text to a low vision user
  9. Our approach
     • Design algorithm to rapidly find text on Android smartphone running in video mode (640 x 480 pixels)
     • Perform on-board OCR (Tesseract)
     • Read aloud (text-to-speech) immediately
     • For speed, all processing is done on-board (no need for internet connection). Read aloud up to 1-2 frames per second.
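Below is a minimal sketch of the kind of per-frame loop this approach implies, assuming the tess-two Android wrapper for Tesseract and a hypothetical TextDetector interface standing in for the paper's text-finding algorithm; it is an illustration, not the authors' implementation.

```java
import java.util.List;

import android.graphics.Bitmap;
import android.graphics.Rect;
import android.speech.tts.TextToSpeech;

import com.googlecode.tesseract.android.TessBaseAPI; // tess-two wrapper (assumed dependency)

public class SignReaderPipeline {

    /** Hypothetical stand-in for the paper's text detector. */
    public interface TextDetector {
        List<Rect> findTextRegions(Bitmap frame);
    }

    private final TextDetector detector;
    private final TessBaseAPI ocr;   // caller has already run ocr.init(dataPath, "eng")
    private final TextToSpeech tts;  // caller has already waited for TTS initialization

    public SignReaderPipeline(TextDetector detector, TessBaseAPI ocr, TextToSpeech tts) {
        this.detector = detector;
        this.ocr = ocr;
        this.tts = tts;
    }

    /** Called for each 640x480 camera frame (roughly 1-2 frames per second). */
    public void processFrame(Bitmap frame) {
        for (Rect region : detector.findTextRegions(frame)) {
            // Run OCR only inside the detected region rather than the whole cluttered frame.
            ocr.setImage(frame);
            ocr.setRectangle(region);
            String text = ocr.getUTF8Text();
            if (text != null && text.trim().length() > 0) {
                tts.speak(text.trim(), TextToSpeech.QUEUE_ADD, null);
            }
        }
    }
}
```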
  10. System UI (user interface)
      • Philosophy: text detection/reading errors are inevitable. To overcome them, have the user obtain multiple readings of each text sign over time. Ignore spurious (unreproducible) readings, and come to a consensus about the true contents of each sign.
      • If there are multiple text strings in one image, read them aloud in "raster" order (from top to bottom, and along a line from left to right)
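One way to realize this consensus-over-time policy is sketched below: raster-order sorting of detections plus a simple vote count across frames, so a string is spoken only after it has been seen a minimum number of times. The class name, vote threshold, and the simple top-then-left comparator are illustrative assumptions, not the authors' design.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import android.graphics.Rect;

public class SignConsensus {

    /** A detected string with its bounding box in the current frame. */
    public static class Detection {
        public final String text;
        public final Rect box;
        public Detection(String text, Rect box) { this.text = text; this.box = box; }
    }

    private final Map<String, Integer> votes = new HashMap<String, Integer>();
    private final int minVotes;  // e.g. 2-3 readings before a string is trusted (assumed value)

    public SignConsensus(int minVotes) { this.minVotes = minVotes; }

    /** Sort detections in "raster" order: top to bottom, then left to right. */
    public static List<Detection> rasterOrder(List<Detection> detections) {
        List<Detection> sorted = new ArrayList<Detection>(detections);
        Collections.sort(sorted, new Comparator<Detection>() {
            public int compare(Detection a, Detection b) {
                if (a.box.top != b.box.top) return a.box.top - b.box.top;
                return a.box.left - b.box.left;
            }
        });
        return sorted;
    }

    /**
     * Accumulate votes across frames; return only strings seen often enough,
     * so spurious (unreproducible) readings are ignored.
     */
    public List<String> update(List<Detection> frameDetections) {
        List<String> confirmed = new ArrayList<String>();
        for (Detection d : rasterOrder(frameDetections)) {
            String key = d.text.trim().toLowerCase();
            int count = votes.containsKey(key) ? votes.get(key) + 1 : 1;
            votes.put(key, count);
            if (count == minVotes) {
                confirmed.add(d.text.trim());  // speak once, when consensus is first reached
            }
        }
        return confirmed;
    }
}
```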
  11. Overview of algorithm (figure in original slide)
  12. Big challenge: how to aim the smartphone camera?
      If you are blind, you may have little idea where to aim the camera! (The kReader Mobile User Guide has an entire section on "Learning to Aim Your Reader")
      Also, text is best read when it is horizontal, but many blind users have trouble holding the camera horizontal
  13. Help with aiming: UI features
      1) Tilt detection function: allows the user to vary pitch and yaw but forces roll to be zero. Issue a vibration any time roll is far enough from zero. Allows the user to point in any compass direction, and to aim high or low depending on whether the text is above or below shoulder height. Increases the chances that text appears horizontal in the image.
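A rough sketch of how such a roll-only warning could be wired to the accelerometer and vibrator is shown below; the 10-degree tolerance and the roll approximation from gravity components are illustrative assumptions, not details taken from the slides.

```java
import android.content.Context;
import android.hardware.Sensor;
import android.hardware.SensorEvent;
import android.hardware.SensorEventListener;
import android.hardware.SensorManager;
import android.os.Vibrator;

public class RollMonitor implements SensorEventListener {

    private static final float MAX_ROLL_DEGREES = 10f;  // assumed tolerance, not from the slides

    private final Vibrator vibrator;

    public RollMonitor(Context context) {
        vibrator = (Vibrator) context.getSystemService(Context.VIBRATOR_SERVICE);
        SensorManager sm = (SensorManager) context.getSystemService(Context.SENSOR_SERVICE);
        sm.registerListener(this, sm.getDefaultSensor(Sensor.TYPE_ACCELEROMETER),
                SensorManager.SENSOR_DELAY_UI);
    }

    @Override
    public void onSensorChanged(SensorEvent event) {
        float ax = event.values[0];  // gravity component along the device's x axis
        float ay = event.values[1];  // gravity component along the device's y axis
        // With the phone held upright in portrait and the camera facing forward,
        // roll about the camera axis is roughly atan2(ax, ay); pitch and yaw stay free.
        double rollDegrees = Math.toDegrees(Math.atan2(ax, ay));
        if (Math.abs(rollDegrees) > MAX_ROLL_DEGREES) {
            vibrator.vibrate(100);  // brief buzz whenever roll drifts too far from zero
        }
    }

    @Override
    public void onAccuracyChanged(Sensor sensor, int accuracy) { }
}
```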
  14. Help with aiming: UI features
      2) Warning whenever text is close to being cut off: read the detected text aloud in a low pitch.
      (Figure: red box = camera field of view → "Smoking" read in a low pitch)
  15. Help with aiming: UI features
      (Figure: red box = camera field of view → "No smoking" read in a normal pitch)
  16. Help with aiming: UI features
      3) Warning whenever text is small: read the text in a high pitch → signal the user to approach the text for a clearer view
      (Figure: red box = camera field of view, sign reads NO SMOKING → "No smoking" read in a high pitch)
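The three pitch cues on slides 14-16 amount to choosing a text-to-speech pitch from the detected bounding box. A sketch is given below; the edge margin, minimum text height, and pitch values are illustrative guesses, not the actual system's parameters.

```java
import android.graphics.Rect;
import android.speech.tts.TextToSpeech;

public class PitchFeedback {

    // Illustrative thresholds, not taken from the slides.
    private static final int EDGE_MARGIN_PX = 10;       // "close to being cut off"
    private static final int MIN_TEXT_HEIGHT_PX = 24;   // "text is small"

    private final TextToSpeech tts;
    private final int frameWidth, frameHeight;          // e.g. 640 x 480

    public PitchFeedback(TextToSpeech tts, int frameWidth, int frameHeight) {
        this.tts = tts;
        this.frameWidth = frameWidth;
        this.frameHeight = frameHeight;
    }

    /** Speak detected text, encoding aiming feedback in the voice pitch. */
    public void speak(String text, Rect box) {
        float pitch = 1.0f;  // normal pitch: text is fully in view and large enough
        boolean nearEdge = box.left < EDGE_MARGIN_PX || box.top < EDGE_MARGIN_PX
                || box.right > frameWidth - EDGE_MARGIN_PX
                || box.bottom > frameHeight - EDGE_MARGIN_PX;
        if (nearEdge) {
            pitch = 0.6f;    // low pitch: text may be partially cut off
        } else if (box.height() < MIN_TEXT_HEIGHT_PX) {
            pitch = 1.6f;    // high pitch: text is small, prompt the user to move closer
        }
        tts.setPitch(pitch);
        tts.speak(text, TextToSpeech.QUEUE_ADD, null);
    }
}
```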
  17. Experiments
      Ten signs printed out and placed on two adjoining walls of a conference room
      Two blind volunteer subjects, out of reach of the walls
      Brief training session: purpose of the experiment, how to hold and move the camera
  18. Subjects were told to search for an unknown number of signs on the two walls, and to tell the experimenter the content of each sign detected
  19. Experimental results
      Subject 1:
      • 6 signs reported perfectly correctly
      • 2 signs completely missed
      • 2 other signs reported with some errors:
        - "Dr. Samuels" was detected as "Samuels" (audible to the experimenter but not the subject)
        - The "Meeting in Session" sign gave rise to the words "Meeting" and "section" (though they were not uttered together)
  20. Experimental results
      Subject 2:
      • 3 signs reported perfectly correctly
      • Typical errors:
        - "Exam Room 150" was detected and read aloud correctly, but the subject was unable to understand the word "exam"
        - Reported "D L Samuels meeting in session" as a sign, an incorrect combination of two signs: "Dr. Samuels" (which the system misread as "D L Samuels") and "Meeting in Session"
  21. Discussion
      System still very difficult to use!
      False positives and false negatives (i.e., missed text) are still a big problem → we are improving our text detection algorithm
      Even when text is correctly detected, OCR still causes many errors
      Slow processing speeds (plus camera motion blur) force the user to pan the camera very slowly
  22. Discussion (continued)
      UI planned for the future:
      • Have the user scan the environment, and sound an audio tone whenever text is detected
      • Compute an image mosaic (panorama) of the entire scene, to seamlessly read text strings that don't fit inside a single image frame
      • Cluster multiple text strings into distinct sign regions (see the sketch after this slide)
      • The user will be able to hear the text-to-speech repeated for any sign region
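For the planned clustering of text strings into distinct sign regions, a single-pass proximity merge over bounding boxes might look like the sketch below; the gap threshold and the greedy merge strategy are assumptions for illustration only, not the authors' planned method.

```java
import java.util.ArrayList;
import java.util.List;

import android.graphics.Rect;

public class SignRegionClusterer {

    // Boxes whose expanded bounds overlap are assumed to belong to the same sign.
    private static final int GAP_PX = 20;  // illustrative proximity threshold

    /** Greedily merge nearby text bounding boxes into larger sign regions. */
    public static List<Rect> cluster(List<Rect> textBoxes) {
        List<Rect> regions = new ArrayList<Rect>();
        for (Rect box : textBoxes) {
            Rect expanded = new Rect(box.left - GAP_PX, box.top - GAP_PX,
                    box.right + GAP_PX, box.bottom + GAP_PX);
            Rect merged = new Rect(box);
            // Absorb every existing region that the expanded box touches.
            for (int i = regions.size() - 1; i >= 0; i--) {
                if (Rect.intersects(expanded, regions.get(i))) {
                    merged.union(regions.get(i));
                    regions.remove(i);
                }
            }
            regions.add(merged);
        }
        return regions;
    }
}
```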
  23. Discussion (continued)
      Further in the future:
      • "Visual spam" is a big problem → task-driven search ("find me Dr. Smith's office")
      • Finding signs will always be difficult at times (even for people with normal vision) → integration with "indoor GPS" (i.e., localization indoors) to provide useful, location-specific information
  24. Thanks to…
      First author: Dr. Huiying Shen (Smith-Kettlewell)
      Collaborators: Dr. Roberto Manduchi (UC Santa Cruz), Dr. Vidya Murali and Dr. Ender Tekin (Smith-Kettlewell)
      Funding from NIH and NIDRR